Test-first principle: run the tests first, otherwise we waste time on unnecessary debugging tasks
Install Ruby via emerge normally
$ emerge ruby
Install Ruby 1.9 via emerge
$ vim /etc/portage/package.unmask
>=dev-lang/ruby-1.8.9
Select Ruby 1.8 with eselect
$ sudo eselect ruby set ruby18
Ruby source: http://ftp.ruby-lang.org/pub/ruby/1.8/ruby-1.8.6-p369.tar.gz
Install Ruby 1.8.6 with oniguruma from source code
masa@masa ~/work $ tar zxvf ruby-1.8.6-p369.tar.gz
masa@masa ~/work $ tar zxvf oniguruma_patch_for_ruby-1.8.6_p369.20100615_from_Hannes-san.tar.gz
masa@masa ~/work $ cd oniguruma
masa@masa ~/work/oniguruma $ ./configure --with-rubydir=/home/masa/work/ruby-1.8.6-p369
masa@masa ~/work/oniguruma $ make 186
masa@masa ~/work/oniguruma $ cd /home/masa/work/ruby-1.8.6-p369
masa@masa ~/work/ruby-1.8.6-p369 $ ./configure --prefix=/home/masa/bin/ruby186
masa@masa ~/work/ruby-1.8.6-p369 $ make
masa@masa ~/work/ruby-1.8.6-p369 $ make install
Set PATH
$ vim ~/.bashrc
export PATH=/home/masa/bin/ruby186/bin:$PATH:
Confirm
masa@masa ~/ywesee/de.oddb.org $ ruby -v
ruby 1.8.6 (2009-06-08 patchlevel 369) [x86_64-linux]
masa@masa ~/ywesee/de.oddb.org $ gem list
*** LOCAL GEMS ***
activesupport (2.3.8)
archive-tar-minitar (0.5.2)
archive-tarsimple (1.1.1)
character-encodings (0.4.1)
color (1.4.0)
columnize (0.3.1)
csvparser (0.1.1)
facets (1.8.54)
fastercsv (1.5.3)
flexmock (0.8.6)
gd2 (1.1.1)
gruff (0.3.6)
hoe (2.6.1)
hpricot (0.8.2, 0.6.164)
htmlentities (4.2.1)
json (1.4.3)
json_pure (1.4.3)
linecache (0.43)
mechanize (1.0.0)
money (3.0.2)
needle (1.3.0)
nokogiri (1.4.2)
oniguruma (1.1.0)
parseexcel (0.5.2)
paypal (2.0.0)
pdf-writer (1.1.8)
pg (0.9.0)
postgres (0.7.9.2008.01.28)
rake (0.8.7)
rcov (0.9.8)
rmagick (2.9.0)
rmail (1.0.0)
rockit (0.7.2)
ruby-debug (0.10.3)
ruby-debug-base (0.10.3)
ruby-ole (1.2.10.1)
ruby-termios (0.9.6)
rubyforge (2.0.4)
rubyzip (0.9.4)
setup (5.0.1)
spreadsheet (0.6.4.1)
swissmedic-diff (0.1.3)
text-hyphen (1.0.0)
tmail (1.2.7.1)
transaction-simple (1.4.0)
turing (0.0.11)
version (0.9.2)
vim-ruby (2007.05.07)
Test run de.oddb.org bin/oddbd
masa@masa ~/ywesee/de.oddb.org $ bin/oddbd
ruby: no such file to load -- auto_gem (LoadError)
FAIL
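A likely cause, stated here as an assumption and not something this log confirms: Gentoo normally exports RUBYOPT="-rauto_gem", and auto_gem.rb belongs to the system RubyGems package, so the hand-built Ruby under /home/masa/bin/ruby186 cannot find it. The sketch below checks which libraries requested via -r are missing from a load path. The helper name missing_preloads is hypothetical, and it only looks for .rb files, not compiled extensions.

```ruby
# Hypothetical helper: list the libraries requested with -r in a RUBYOPT
# string that cannot be found as <name>.rb in any load path directory.
def missing_preloads(rubyopt, load_path)
  libs = rubyopt.to_s.scan(/(?:\A|\s)-r\s*(\S+)/).flatten
  libs.reject do |lib|
    load_path.any? { |dir| File.exist?(File.join(dir, "#{lib}.rb")) }
  end
end

if __FILE__ == $0
  # The RUBYOPT value is passed as an argument, because a broken RUBYOPT
  # in the environment would stop Ruby before this script starts.
  missing = missing_preloads(ARGV[0], $LOAD_PATH)
  if missing.empty?
    puts "all preloads found"
  else
    puts "not found in $LOAD_PATH: #{missing.join(', ')}"
  end
end
```

Assuming the file is saved as check_preloads.rb (a made-up name), it could be run as: env -u RUBYOPT ruby check_preloads.rb "$RUBYOPT"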
Check source code
$ locate ruby-1.8.6
/usr/portage/distfiles/ruby-1.8.6-p369.tar.bz2
Download ruby-1.8.6_p369.ebuild
masa@masa /usr/portage/dev-lang/ruby $ sudo ebuild /usr/portage/dev-lang/ruby/ruby-1.8.6_p369.ebuild digest
>>> Creating Manifest for /usr/portage/dev-lang/ruby
masa@masa /usr/portage/dev-lang/ruby $ sudo ebuild /usr/portage/dev-lang/ruby/ruby-1.8.6_p369.ebuild compile
>>> Existing ${T}/environment for 'ruby-1.8.6_p369' will be sourced. Run
>>> 'clean' to start with a fresh environment.
* ruby-1.8.6-p369.tar.bz2 RMD160 SHA1 SHA256 size ;-) ... [ ok ]
* checking ebuild checksums ;-) ... [ ok ]
* checking auxfile checksums ;-) ... [ ok ]
* checking miscfile checksums ;-) ... [ ok ]
* checking ruby-1.8.6-p369.tar.bz2 ;-) ... [ ok ]
* CPV: dev-lang/ruby-1.8.6_p369
* REPO: funtoo
* USE: amd64 berkdb elibc_glibc gdbm ipv6 kernel_linux multilib ssl userland_GNU
>>> Checking ruby-1.8.6-p369.tar.bz2's mtime...
>>> WORKDIR is up-to-date, keeping...
>>> It appears that 'ruby-1.8.6_p369' is already compiled; skipping.
>>> Remove '/var/tmp/portage/dev-lang/ruby-1.8.6_p369/.compiled' to force compilation.
masa@masa /usr/portage/dev-lang/ruby $ sudo ebuild /usr/portage/dev-lang/ruby/ruby-1.8.6_p369.ebuild install
masa@masa /usr/portage/dev-lang/ruby $ sudo ebuild /usr/portage/dev-lang/ruby/ruby-1.8.6_p369.ebuild qmerge
Confirm
masa@masa /usr/portage/dev-lang/ruby $ ruby -v
ruby 1.8.6 (2009-06-08 patchlevel 369) [x86_64-linux]
masa@masa ~/ywesee/de.oddb.org $ bin/oddbd
/home/masa/ywesee/de.oddb.org/lib/oddb/html/view/drugs/package.rb:373: warning: parenthesize argument(s) for future version
I, [2010-09-07T09:57:47.614035 #2821] INFO -- start: starting oddb-server on druby://localhost:11000
Correct YAML.rb bug
$ sudo vim /usr/lib64/ruby/1.8/yaml/rubytypes.rb
def is_binary_data?
  #( self.count( "^ -~", "^\r\n" ) / self.size > 0.3 || self.count( "\x00" ) > 0 ) unless empty?
  ( self.count( "^ -~", "^\r\n" ) / self.size > 0.3 || self.index( "\x00" ) > 0 ) unless empty?
end
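NOTE
This is wrong! String#index returns nil when the string contains no NUL byte, so the comparison with > 0 raises NoMethodError, and a NUL byte at position 0 would give 0 > 0, which is false. It should be as follows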
$ sudo vim /usr/lib64/ruby/1.8/yaml/rubytypes.rb
def is_binary_data?
  #( self.count( "^ -~", "^\r\n" ) / self.size > 0.3 || self.count( "\x00" ) > 0 ) unless empty?
  ( self.count( "^ -~", "^\r\n" ) / self.size > 0.3 || self.index( "\x00" ) ) unless empty?
end
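The difference between the two patches can be reproduced without touching rubytypes.rb. This is a minimal sketch: the two helper methods are hypothetical and only isolate the NUL-byte part of the condition.

```ruby
# Hypothetical helpers that isolate the NUL-byte check of each patch.
def nul_check_first_patch(str)
  str.index("\x00") > 0      # index may be nil, and nil has no > method
end

def nul_check_second_patch(str)
  !!str.index("\x00")        # only nil and false are falsy, so index 0 counts as true
end

plain  = "plain text"        # contains no NUL byte
binary = "\x00\x01\x02"      # NUL byte at position 0

begin
  nul_check_first_patch(plain)
rescue NoMethodError => e
  puts "first patch raises #{e.class} on text without a NUL byte"
end

p nul_check_first_patch(binary)   # 0 > 0, so the leading NUL byte is missed
p nul_check_second_patch(plain)
p nul_check_second_patch(binary)
```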
Summary
Comment: usually emerge command does
If I had not already had the source code, I might have done the following
Link
Confirm failure again
masa@masa ~/ywesee/de.oddb.org $ ruby test/import/test_gkv.rb
Loaded suite test/import/test_gkv
Started
.FF.....
Finished in 0.088421 seconds.
1) Failure:
test_import(ODDB::Import::TestGkv) [test/import/test_gkv.rb:106]:
<2> expected but was
<0>.
2) Failure:
test_import__ml(ODDB::Import::TestGkv) [test/import/test_gkv.rb:178]:
<1> expected but was
<3>.
8 tests, 14 assertions, 2 failures, 0 errors
Reading source code
Comment out the other assertions
def test_import
  existing = Drugs::Package.new
  existing.add_code(Util::Code.new(:cid, '4000741', 'DE'))
  existing.add_part(Drugs::Part.new)
  existing.save
  sequence = Drugs::Sequence.new
  product = Drugs::Product.new
  product.name.de = 'A product'
  existing.sequence = sequence
  sequence.product = product
  assert_nil(existing.code(:zuzahlungsbefreit))
  ## simulate a call to @import.import
  report = simulate_import
  assert_equal 2, @invalidator.invalidated.size
  # the remaining assertions are commented out
end
Confirm
masa@masa ~/ywesee/de.oddb.org $ ruby test/import/test_gkv.rb
Loaded suite test/import/test_gkv
Started
F
Finished in 0.019398 seconds.
1) Failure:
test_import(ODDB::Import::TestGkv) [test/import/test_gkv.rb:108]:
<2> expected but was
<0>.
1 tests, 2 assertions, 1 failures, 0 errors
lib/oddb/util/updater.rb
def Updater.import_gkv(opts = {})
  importer = Import::Gkv.new
  if url = importer.latest_url(Mechanize.new, opts)
    importer.download_latest url, opts do |fh|
      reported_import(importer, fh,
                      :subject => 'Zubef', :filetype => 'PDF')
    end
  end
end
Consideration
def import fh, opts={}
  parser = Rpdf2txt::Parser.new(fh.read, 'utf8')
  handler = GkvHandler.new method(:process_page)
  parser.extract_text handler
  postprocess
  report
end
Comments
def postprocess
  #
  # 1. search for drug packages by code (type, value, country)
  #
  Drugs::Package.search_by_code(:type => 'zuzahlungsbefreit',
                                :value => 'true',
                                :country => 'DE').each { |package|
    # what is pzn? cid code?
    # what is @confirmed_pzns?
    pzn = package.code(:cid).value
    # save the package and count it as deleted?
    unless(@confirmed_pzns.include?(pzn))
      @deleted += 1
      package.code(:zuzahlungsbefreit).value = false
      save package
    end
  } unless(@confirmed_pzns.empty?)
  #
  # 2. parsing drug products data
  #
  Drugs::Product.all { |product|
    unless(product.company)
      keys = product.name.de.split
      key = keys.pop
      if(key == 'Comp')
        key = keys.pop
      end
      company = Business::Company.find_by_name(key)
      if(company.nil?)
        companies = Business::Company.search_by_name(key)
        if(companies.size == 1)
          company = companies.pop
        end
      end
      if(company)
        @assigned_companies += 1 # this is probably the main point
        product.company = company
        save product # save product information
      end # what is the save method?
    end
  }
  #
  # 3. parsing drug composition data
  #
  Drugs::Composition.all { |composition|
    next if(composition.active_agents.size < 2)
    composition.active_agents.dup.each { |agent|
      next unless composition.active_agents.include?(agent)
      name = agent.substance.name.de
      if(other = composition.active_agents.find { |candidate|
           candidate != agent \
             && candidate.substance.name.de[0,name.length] == name })
        qty = other.dose.qty
        if(qty > 0 && qty == qty.to_i && !other.chemical_equivalence)
          agent, other = other, agent # swapping!?
        end
        if(agent.chemical_equivalence) # raise an error
          raise "multiple chemical equivalences in #{composition.parts.first.package.code(:cid)}"
        end
        @assigned_equivalences += 1 # the lines below are probably the main point
        composition.remove_active_agent(other)
        agent.chemical_equivalence = other
        save agent
        save other
        save composition
      end
    }
  }
end
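To answer some of the questions in the comments above, steps 2 and 3 can be tried out in isolation. This is only a sketch under assumptions: Agent is a hypothetical Struct standing in for the real active agent objects, the method names are made up, and the product and substance names are sample data.

```ruby
# Hypothetical stand-in for an active agent (substance name, dose quantity,
# and an optional chemical equivalence).
Agent = Struct.new(:substance_name, :qty, :chemical_equivalence)

# Step 2: the company lookup key is the last word of the product name,
# or the word before it when the name ends in 'Comp'.
def company_key(product_name)
  words = product_name.split
  key = words.pop
  key = words.pop if key == 'Comp'
  key
end

# Step 3a: find another agent whose substance name starts with this
# agent's substance name.
def prefix_partner(agents, agent)
  name = agent.substance_name
  agents.find do |candidate|
    !candidate.equal?(agent) && candidate.substance_name[0, name.length] == name
  end
end

# Step 3b: return [kept, equivalence]. The pair is swapped when the other
# agent has a positive whole-number quantity and no equivalence of its own.
def order_pair(agent, other)
  qty = other.qty
  if qty > 0 && qty == qty.to_i && !other.chemical_equivalence
    [other, agent]
  else
    [agent, other]
  end
end

base = Agent.new('Metoprolol', 50, nil)
salt = Agent.new('Metoprolol succinat', 47.5, nil)

puts company_key('Examplol 100 Acme Comp')
partner = prefix_partner([base, salt], base)
kept, equivalence = order_pair(base, partner)
puts "#{kept.substance_name} keeps its place, #{equivalence.substance_name} becomes its equivalence"
```

With these sample values the pair is not swapped, because 47.5 is not a whole number. Giving the second agent a quantity of 50 would swap the two, which is what the "swapping!?" line in postprocess does.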
Summary
# this method definitely reports the results collected by the other methods
def report
  doubtfuls = @doubtful_pzns.collect do |pzn|
    "http://de.oddb.org/de/drugs/package/pzn/#{pzn}"
  end
  [
    sprintf("Imported %5i Zubef-Entries on %s:",
            @count, Date.today.strftime("%d.%m.%Y")),
    sprintf("Visited %5i existing Zubef-Entries", @existing),
    sprintf("Visited %5i existing Companies",
            @existing_companies),
    sprintf("Visited %5i existing Substances",
            @existing_substances),
    sprintf("Created %5i new Zubef-Entries", @created),
    sprintf("Created %5i new Products", @created_products),
    sprintf("Created %5i new Sequences", @created_sequences),
    sprintf("Created %5i new Companies", @created_companies),
    sprintf("Created %5i new Substances", @created_substances),
    sprintf("Assigned %5i Chemical Equivalences",
            @assigned_equivalences),
    sprintf("Assigned %5i Companies", @assigned_companies),
    sprintf("Created %5i Incomplete Packages:", doubtfuls.size),
  ].concat doubtfuls
end
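What the report array looks like can be checked with a cut-down version. This is a sketch only: build_report is a hypothetical name, the counters are passed as arguments instead of instance variables, and the PZN is made up.

```ruby
require 'date'

# Hypothetical cut-down version of the report method above.
def build_report(count, created, doubtful_pzns, date = Date.today)
  doubtfuls = doubtful_pzns.collect do |pzn|
    "http://de.oddb.org/de/drugs/package/pzn/#{pzn}"
  end
  [
    # %5i right-aligns the integer in a field five characters wide
    sprintf("Imported %5i Zubef-Entries on %s:",
            count, date.strftime("%d.%m.%Y")),
    sprintf("Created %5i new Zubef-Entries", created),
    sprintf("Created %5i Incomplete Packages:", doubtfuls.size),
  ].concat(doubtfuls)   # one URL line per doubtful PZN is appended
end

puts build_report(3, 1, ['1234567'], Date.new(2010, 9, 7))
```

The result is an array of lines: the fixed-width counter lines first, then the package URLs, so the caller can join it into a mail body.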
Summary
Run the test with the first version of test_gkv.rb, then with later commits
$ git checkout 8da2441beead2c66bf2d8f887ce0100a00494c18
$ ruby test/import/test_gkv.rb
Loaded suite test/import/test_gkv
Started
...!!!!! DEPRECATION NOTICE !!!!!
The WWW constant is deprecated, please switch to the new top-level Mechanize
constant. WWW will be removed in Mechanize version 2.0
You've referenced the WWW constant from test/import/test_gkv.rb:71:in `test_latest_url', please
switch the "WWW" to "Mechanize". Thanks!
Sincerely,
Pew Pew Pew
!!!!! DEPRECATION NOTICE !!!!!
The WWW constant is deprecated, please switch to the new top-level Mechanize
constant. WWW will be removed in Mechanize version 2.0
You've referenced the WWW constant from test/import/test_gkv.rb:58:in `setup_page', please
switch the "WWW" to "Mechanize". Thanks!
Sincerely,
Pew Pew Pew
E....
Finished in 0.015076 seconds.
1) Error:
test_latest_url(ODDB::Import::TestGkv):
NoMethodError: undefined method `html_parser' for nil:NilClass
/usr/lib64/ruby/gems/1.8/gems/mechanize-1.0.0/lib/mechanize/page.rb:83:in `parser'
/home/masa/ywesee/de.oddb.org/lib/oddb/import/gkv.rb:87:in `latest_url'
test/import/test_gkv.rb:77:in `test_latest_url'
8 tests, 46 assertions, 0 failures, 1 errors
masa@masa ~/ywesee/de.oddb.org $ git checkout 4adb046616d3cf37ac9644ad2cb26d6c29801a63
Previous HEAD position was fc894bb... FB -> Zubef
HEAD is now at 4adb046... Import unknown packages, even if data quality cannot be guaranteed.
masa@masa ~/ywesee/de.oddb.org $ ruby test/import/test_gkv.rb
Loaded suite test/import/test_gkv
Started
.FF!!!!! DEPRECATION NOTICE !!!!!
The WWW constant is deprecated, please switch to the new top-level Mechanize
constant. WWW will be removed in Mechanize version 2.0
You've referenced the WWW constant from test/import/test_gkv.rb:71:in `test_latest_url', please
switch the "WWW" to "Mechanize". Thanks!
Sincerely,
Pew Pew Pew
!!!!! DEPRECATION NOTICE !!!!!
The WWW constant is deprecated, please switch to the new top-level Mechanize
constant. WWW will be removed in Mechanize version 2.0
You've referenced the WWW constant from test/import/test_gkv.rb:58:in `setup_page', please
switch the "WWW" to "Mechanize". Thanks!
Sincerely,
Pew Pew Pew
E....
Finished in 0.074977 seconds.
1) Failure:
test_import(ODDB::Import::TestGkv) [test/import/test_gkv.rb:106]:
<2> expected but was
<79>.
2) Failure:
test_import__ml(ODDB::Import::TestGkv) [test/import/test_gkv.rb:178]:
<1> expected but was
<3>.
3) Error:
test_latest_url(ODDB::Import::TestGkv):
NoMethodError: undefined method `html_parser' for nil:NilClass
/usr/lib64/ruby/gems/1.8/gems/mechanize-1.0.0/lib/mechanize/page.rb:83:in `parser'
/home/masa/ywesee/de.oddb.org/lib/oddb/import/gkv.rb:89:in `latest_url'
test/import/test_gkv.rb:77:in `test_latest_url'
8 tests, 13 assertions, 2 failures, 1 errors
masa@masa ~/ywesee/de.oddb.org $ git checkout 1b0624b8fc4aa8985b5013c3d2e942bd98b3426f
Previous HEAD position was 4adb046... Import unknown packages, even if data quality cannot be guaranteed.
HEAD is now at 1b0624b... Peer ODBA caches before starting GKV-Import.
masa@masa ~/ywesee/de.oddb.org $ ruby test/import/test_gkv.rb
Loaded suite test/import/test_gkv
Started
.FF!!!!! DEPRECATION NOTICE !!!!!
The WWW constant is deprecated, please switch to the new top-level Mechanize
constant. WWW will be removed in Mechanize version 2.0
You've referenced the WWW constant from test/import/test_gkv.rb:71:in `test_latest_url', please
switch the "WWW" to "Mechanize". Thanks!
Sincerely,
Pew Pew Pew
!!!!! DEPRECATION NOTICE !!!!!
The WWW constant is deprecated, please switch to the new top-level Mechanize
constant. WWW will be removed in Mechanize version 2.0
You've referenced the WWW constant from test/import/test_gkv.rb:58:in `setup_page', please
switch the "WWW" to "Mechanize". Thanks!
Sincerely,
Pew Pew Pew
E....
Finished in 0.045382 seconds.
1) Failure:
test_import(ODDB::Import::TestGkv) [test/import/test_gkv.rb:106]:
<2> expected but was
<0>.
2) Failure:
test_import__ml(ODDB::Import::TestGkv) [test/import/test_gkv.rb:178]:
<1> expected but was
<3>.
3) Error:
test_latest_url(ODDB::Import::TestGkv):
NoMethodError: undefined method `html_parser' for nil:NilClass
/usr/lib64/ruby/gems/1.8/gems/mechanize-1.0.0/lib/mechanize/page.rb:83:in `parser'
/home/masa/ywesee/de.oddb.org/lib/oddb/import/gkv.rb:88:in `latest_url'
test/import/test_gkv.rb:77:in `test_latest_url'
8 tests, 13 assertions, 2 failures, 1 errors
masa@masa ~/ywesee/de.oddb.org $
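The three runs above are easier to compare when the summary lines are put side by side. A small sketch, assuming the Test::Unit summary format shown in the output. The helper name parse_summary is hypothetical, and the summary lines are copied from the runs above.

```ruby
# Hypothetical helper: turn a Test::Unit summary line into a hash of counts.
def parse_summary(output)
  m = /(\d+) tests, (\d+) assertions, (\d+) failures, (\d+) errors/.match(output)
  return nil unless m
  { :tests      => m[1].to_i,
    :assertions => m[2].to_i,
    :failures   => m[3].to_i,
    :errors     => m[4].to_i }
end

# Summary lines of the three checkouts, keyed by abbreviated commit id.
runs = [
  ['8da2441', "8 tests, 46 assertions, 0 failures, 1 errors"],
  ['4adb046', "8 tests, 13 assertions, 2 failures, 1 errors"],
  ['1b0624b', "8 tests, 13 assertions, 2 failures, 1 errors"],
]

runs.each do |commit, line|
  s = parse_summary(line)
  puts "#{commit}: #{s[:assertions]} assertions, #{s[:failures]} failures, #{s[:errors]} errors"
end
```

Read this way, the first version of the test has no failures and only the test_latest_url error, while both later commits show the same two failures in test_import and test_import__ml.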