Test First
Follow the test-first principle; otherwise we waste time on unnecessary debugging.
Install Ruby via emerge normally
$ emerge ruby
Install Ruby 1.9 via emerge
$ vim /etc/portage/package.unmask
>=dev-lang/ruby-1.8.9
Select Ruby 1.8 with eselect
$ sudo eselect ruby set ruby18
Ruby source: http://ftp.ruby-lang.org/pub/ruby/1.8/ruby-1.8.6-p369.tar.gz
Install Ruby 1.8.6 with oniguruma from source code
masa@masa ~/work $ tar zxvf ruby-1.8.6-p369.tar.gz
masa@masa ~/work $ tar zxvf oniguruma_patch_for_ruby-1.8.6_p369.20100615_from_Hannes-san.tar.gz
masa@masa ~/work $ cd oniguruma
masa@masa ~/work/oniguruma $ ./configure --with-rubydir=/home/masa/work/ruby-1.8.6-p369
masa@masa ~/work/oniguruma $ make 186
masa@masa ~/work/oniguruma $ cd /home/masa/work/ruby-1.8.6-p369
masa@masa ~/work/ruby-1.8.6-p369 $ ./configure --prefix=/home/masa/bin/ruby186
masa@masa ~/work/ruby-1.8.6-p369 $ make
masa@masa ~/work/ruby-1.8.6-p369 $ make install
Set PATH
$ vim ~/.bashrc
export PATH=/home/masa/bin/ruby186/bin:$PATH:
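Because the new directory is placed at the front of PATH, the self-built Ruby should be found before the system one. The following is only a minimal sketch of that lookup order, not part of the original setup; the helper name `first_on_path` is hypothetical.

```ruby
# Return the full path of the first executable named `cmd` found in the
# PATH-style string `path`, or nil if none is found.
# (`first_on_path` is a hypothetical helper for illustration only.)
def first_on_path(cmd, path = ENV["PATH"])
  path.to_s.split(File::PATH_SEPARATOR).each do |dir|
    dir = "." if dir.empty? # an empty PATH entry means the current directory
    candidate = File.join(dir, cmd)
    return candidate if File.file?(candidate) && File.executable?(candidate)
  end
  nil
end

# Show which ruby binary the shell would pick with the current PATH.
puts first_on_path("ruby").inspect
```

With the export above in effect, the printed path would be expected to point into /home/masa/bin/ruby186/bin.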
Confirm
masa@masa ~/ywesee/de.oddb.org $ ruby -v
ruby 1.8.6 (2009-06-08 patchlevel 369) [x86_64-linux]
masa@masa ~/ywesee/de.oddb.org $ gem list

*** LOCAL GEMS ***

activesupport (2.3.8)
archive-tar-minitar (0.5.2)
archive-tarsimple (1.1.1)
character-encodings (0.4.1)
color (1.4.0)
columnize (0.3.1)
csvparser (0.1.1)
facets (1.8.54)
fastercsv (1.5.3)
flexmock (0.8.6)
gd2 (1.1.1)
gruff (0.3.6)
hoe (2.6.1)
hpricot (0.8.2, 0.6.164)
htmlentities (4.2.1)
json (1.4.3)
json_pure (1.4.3)
linecache (0.43)
mechanize (1.0.0)
money (3.0.2)
needle (1.3.0)
nokogiri (1.4.2)
oniguruma (1.1.0)
parseexcel (0.5.2)
paypal (2.0.0)
pdf-writer (1.1.8)
pg (0.9.0)
postgres (0.7.9.2008.01.28)
rake (0.8.7)
rcov (0.9.8)
rmagick (2.9.0)
rmail (1.0.0)
rockit (0.7.2)
ruby-debug (0.10.3)
ruby-debug-base (0.10.3)
ruby-ole (1.2.10.1)
ruby-termios (0.9.6)
rubyforge (2.0.4)
rubyzip (0.9.4)
setup (5.0.1)
spreadsheet (0.6.4.1)
swissmedic-diff (0.1.3)
text-hyphen (1.0.0)
tmail (1.2.7.1)
transaction-simple (1.4.0)
turing (0.0.11)
version (0.9.2)
vim-ruby (2007.05.07)
Test run de.oddb.org bin/oddbd
masa@masa ~/ywesee/de.oddb.org $ bin/oddbd
ruby: no such file to load -- auto_gem (LoadError)
FAIL
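A possible explanation, stated here as an assumption rather than a confirmed diagnosis: Gentoo usually sets RUBYOPT to -rauto_gem, and auto_gem.rb is installed only for the Portage-managed Ruby, so a Ruby built from source under a private prefix cannot load it. The sketch below checks whether every library requested through RUBYOPT can be found on the load path; the helper names `required_libs` and `loadable?` are hypothetical, and the parsing only handles the compact `-rname` form.

```ruby
# Return the libraries requested via compact -r flags in a RUBYOPT-style
# string, e.g. "-rauto_gem -w" gives ["auto_gem"].
# (hypothetical helper; "-r name" with a space is not handled)
def required_libs(rubyopt)
  rubyopt.to_s.split.select { |flag| flag[0, 2] == "-r" }.map { |flag| flag[2..-1] }
end

# Return true if `lib`.rb exists in any directory of `load_path`.
# (hypothetical helper; ignores compiled extensions such as .so files)
def loadable?(lib, load_path = $LOAD_PATH)
  load_path.any? { |dir| File.exist?(File.join(dir.to_s, "#{lib}.rb")) }
end

# Report each library that the current RUBYOPT asks Ruby to preload.
required_libs(ENV["RUBYOPT"]).each do |lib|
  puts "#{lib}: #{loadable?(lib) ? 'found' : 'missing'}"
end
```

If this prints "auto_gem: missing" for the self-built interpreter, the LoadError above would be explained by the environment rather than by de.oddb.org itself.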
Check source code
$ locate ruby-1.8.6
/usr/portage/distfiles/ruby-1.8.6-p369.tar.bz2
Download ruby-1.8.6_p369.ebuild
masa@masa /usr/portage/dev-lang/ruby $ sudo ebuild /usr/portage/dev-lang/ruby/ruby-1.8.6_p369.ebuild digest
>>> Creating Manifest for /usr/portage/dev-lang/ruby
masa@masa /usr/portage/dev-lang/ruby $ sudo ebuild /usr/portage/dev-lang/ruby/ruby-1.8.6_p369.ebuild compile
>>> Existing ${T}/environment for 'ruby-1.8.6_p369' will be sourced. Run
>>> 'clean' to start with a fresh environment.
 * ruby-1.8.6-p369.tar.bz2 RMD160 SHA1 SHA256 size ;-) ... [ ok ]
 * checking ebuild checksums ;-) ... [ ok ]
 * checking auxfile checksums ;-) ... [ ok ]
 * checking miscfile checksums ;-) ... [ ok ]
 * checking ruby-1.8.6-p369.tar.bz2 ;-) ... [ ok ]
 * CPV: dev-lang/ruby-1.8.6_p369
 * REPO: funtoo
 * USE: amd64 berkdb elibc_glibc gdbm ipv6 kernel_linux multilib ssl userland_GNU
>>> Checking ruby-1.8.6-p369.tar.bz2's mtime...
>>> WORKDIR is up-to-date, keeping...
>>> It appears that 'ruby-1.8.6_p369' is already compiled; skipping.
>>> Remove '/var/tmp/portage/dev-lang/ruby-1.8.6_p369/.compiled' to force compilation.
masa@masa /usr/portage/dev-lang/ruby $ sudo ebuild /usr/portage/dev-lang/ruby/ruby-1.8.6_p369.ebuild install
masa@masa /usr/portage/dev-lang/ruby $ sudo ebuild /usr/portage/dev-lang/ruby/ruby-1.8.6_p369.ebuild qmerge
Confirm
masa@masa /usr/portage/dev-lang/ruby $ ruby -v
ruby 1.8.6 (2009-06-08 patchlevel 369) [x86_64-linux]
masa@masa ~/ywesee/de.oddb.org $ bin/oddbd
/home/masa/ywesee/de.oddb.org/lib/oddb/html/view/drugs/package.rb:373: warning: parenthesize argument(s) for future version
I, [2010-09-07T09:57:47.614035 #2821]  INFO -- start: starting oddb-server on druby://localhost:11000
Correct YAML.rb bug
$ sudo vim /usr/lib64/ruby/1.8/yaml/rubytypes.rb
  def is_binary_data?
    #( self.count( "^ -~", "^\r\n" ) / self.size > 0.3 || self.count( "\x00" ) > 0 ) unless empty?
    ( self.count( "^ -~", "^\r\n" ) / self.size > 0.3 || self.index( "\x00" ) > 0 ) unless empty?
  end
NOTE
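The version above is wrong: String#index returns nil when the string contains no NUL byte, so the comparison "> 0" raises a NoMethodError, and a NUL byte at position 0 would not be detected. It should be as follows:
$ sudo vim /usr/lib64/ruby/1.8/yaml/rubytypes.rb
  def is_binary_data?
    #( self.count( "^ -~", "^\r\n" ) / self.size > 0.3 || self.count( "\x00" ) > 0 ) unless empty?
    ( self.count( "^ -~", "^\r\n" ) / self.size > 0.3 || self.index( "\x00" ) ) unless empty?
  end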
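To see why dropping the "> 0" comparison matters, here is a small sketch that wraps the same expression in a standalone function instead of patching String. The name `binary_data?` is hypothetical, and the explicit true/false conversion is added only to make the results easy to compare; the patched method itself returns whatever the expression yields.

```ruby
# Standalone variant of the corrected check (hypothetical helper name).
# Returns nil for an empty string, like the "unless empty?" modifier does.
def binary_data?(str)
  return nil if str.empty?
  # Integer division is kept exactly as in the listing above.
  too_many_unprintables = str.count("^ -~", "^\r\n") / str.size > 0.3
  # String#index returns an Integer (truthy, even 0) or nil (falsy).
  contains_nul = str.index("\x00")
  (too_many_unprintables || contains_nul) ? true : false
end

p binary_data?("plain text")  # false: no NUL byte, nothing unprintable
p binary_data?("ab\x00cd")    # true: NUL byte in the middle
p binary_data?("\x00abcd")    # true: NUL byte at index 0 still counts
p "\x00abcd".index("\x00")    # 0, so a "> 0" comparison would miss this case
p "plain text".index("\x00")  # nil, so a "> 0" comparison would raise NoMethodError
```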
Summary
Comment: usually emerge command does
If I had not already had the source code, I would have done the following
Link
Confirm failure again
masa@masa ~/ywesee/de.oddb.org $ ruby test/import/test_gkv.rb
Loaded suite test/import/test_gkv
Started
.FF.....
Finished in 0.088421 seconds.

  1) Failure:
test_import(ODDB::Import::TestGkv) [test/import/test_gkv.rb:106]:
<2> expected but was <0>.

  2) Failure:
test_import__ml(ODDB::Import::TestGkv) [test/import/test_gkv.rb:178]:
<1> expected but was <3>.

8 tests, 14 assertions, 2 failures, 0 errors
Reading source code
Comment out the other assertions
    def test_import
      existing = Drugs::Package.new
      existing.add_code(Util::Code.new(:cid, '4000741', 'DE'))
      existing.add_part(Drugs::Part.new)
      existing.save
      sequence = Drugs::Sequence.new
      product = Drugs::Product.new
      product.name.de = 'A product'
      existing.sequence = sequence
      sequence.product = product

      assert_nil(existing.code(:zuzahlungsbefreit))

      ## simulate a call to @import.import
      report = simulate_import
      assert_equal 2, @invalidator.invalidated.size
      # (the remaining assertions are commented out)
    end
Confirm
masa@masa ~/ywesee/de.oddb.org $ ruby test/import/test_gkv.rb
Loaded suite test/import/test_gkv
Started
F
Finished in 0.019398 seconds.

  1) Failure:
test_import(ODDB::Import::TestGkv) [test/import/test_gkv.rb:108]:
<2> expected but was <0>.

1 tests, 2 assertions, 1 failures, 0 errors
lib/oddb/util/updater.rb
    def Updater.import_gkv(opts = {})
      importer = Import::Gkv.new
      if url = importer.latest_url(Mechanize.new, opts)
        importer.download_latest url, opts do |fh|
          reported_import(importer, fh,
                          :subject => 'Zubef', :filetype => 'PDF')
        end
      end
    end
Consideration
    def import fh, opts={}
      parser = Rpdf2txt::Parser.new(fh.read, 'utf8')
      handler = GkvHandler.new method(:process_page)
      parser.extract_text handler
      postprocess
      report
    end
Comments
    def postprocess
      #
      # 1. searching drug packages anyway with some specific keys
      #
      Drugs::Package.search_by_code(:type    => 'zuzahlungsbefreit',
                                    :value   => 'true',
                                    :country => 'DE').each { |package|
        # what is pzn? cid code?
        # what is @confirmed_pzns?
        pzn = package.code(:cid).value

        # save package information and the number of deleting package?
        unless(@confirmed_pzns.include?(pzn))
          @deleted += 1
          package.code(:zuzahlungsbefreit).value = false
          save package
        end
      } unless(@confirmed_pzns.empty?)

      #
      # 2. parsing drug products data
      #
      Drugs::Product.all { |product|
        unless(product.company)
          keys = product.name.de.split
          key = keys.pop
          if(key == 'Comp')
            key = keys.pop
          end
          company = Business::Company.find_by_name(key)
          if(company.nil?)
            companies = Business::Company.search_by_name(key)
            if(companies.size == 1)
              company = companies.pop
            end
          end
          if(company)
            @assigned_companies += 1
            # this is probably the main point
            product.company = company
            save product    # save product information
          end
          # what is the save method?
        end
      }

      #
      # 3. parsing drug composition data
      #
      Drugs::Composition.all { |composition|
        next if(composition.active_agents.size < 2)
        composition.active_agents.dup.each { |agent|
          next unless composition.active_agents.include?(agent)
          name = agent.substance.name.de
          if(other = composition.active_agents.find { |candidate|
            candidate != agent \
              && candidate.substance.name.de[0,name.length] == name })
            qty = other.dose.qty
            if(qty > 0 && qty == qty.to_i && !other.chemical_equivalence)
              agent, other = other, agent   # swapping!?
            end
            if(agent.chemical_equivalence)
              # raise an error
              raise "multiple chemical equivalences in #{composition.parts.first.package.code(:cid)}"
            end
            @assigned_equivalences += 1
            # these below are probably the main points
            composition.remove_active_agent(other)
            agent.chemical_equivalence = other
            save agent
            save other
            save composition
          end
        }
      }
    end
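Two details of postprocess are easier to follow in isolation. The sketch below restates them on plain strings: how step 2 derives the company search key from a product name, and how step 3 finds a second active agent whose substance name starts with the first one's name. The helper names `company_key` and `prefix_partner` and all sample names are hypothetical, and the real code compares agent objects rather than bare strings.

```ruby
# Step 2: the search key is the last word of the product name,
# or the word before it when the name ends in "Comp".
# (hypothetical helper; the original works on product.name.de)
def company_key(product_name)
  keys = product_name.split
  key = keys.pop
  key = keys.pop if key == 'Comp'
  key
end

# Step 3: find another name in `names` that begins with `name`.
# (hypothetical helper; the original compares agents, not strings)
def prefix_partner(name, names)
  names.find { |candidate| candidate != name && candidate[0, name.length] == name }
end

p company_key('Amoxicillin Ratiopharm')  # "Ratiopharm"
p company_key('Enalapril Hexal Comp')    # "Hexal"
p prefix_partner('Amoxicillin',
                 ['Amoxicillin', 'Amoxicillin-Trihydrat', 'Kaliumclavulanat'])
# "Amoxicillin-Trihydrat"
```

In the original, a pair found this way is treated as one substance plus its chemical equivalence, which is why the second agent is removed from the composition.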
Summary
    # this method evidently reports the counters collected by the other methods
    def report
      doubtfuls = @doubtful_pzns.collect do |pzn|
        "http://de.oddb.org/de/drugs/package/pzn/#{pzn}"
      end
      [
        sprintf("Imported %5i Zubef-Entries on %s:",
                @count, Date.today.strftime("%d.%m.%Y")),
        sprintf("Visited %5i existing Zubef-Entries", @existing),
        sprintf("Visited %5i existing Companies", @existing_companies),
        sprintf("Visited %5i existing Substances", @existing_substances),
        sprintf("Created %5i new Zubef-Entries", @created),
        sprintf("Created %5i new Products", @created_products),
        sprintf("Created %5i new Sequences", @created_sequences),
        sprintf("Created %5i new Companies", @created_companies),
        sprintf("Created %5i new Substances", @created_substances),
        sprintf("Assigned %5i Chemical Equivalences", @assigned_equivalences),
        sprintf("Assigned %5i Companies", @assigned_companies),
        sprintf("Created %5i Incomplete Packages:", doubtfuls.size),
      ].concat doubtfuls
    end
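As a rough illustration of what report returns, the sketch below rebuilds two of its lines outside the importer. The function name `report_lines` and its arguments are hypothetical; only the format strings and the URL pattern are taken from the listing above, and the date is passed in so that the output is reproducible.

```ruby
require 'date'

# Build a shortened version of the report as an array of lines.
# (hypothetical helper; the original reads instance variables instead)
def report_lines(count, doubtful_pzns, date = Date.today)
  doubtfuls = doubtful_pzns.collect do |pzn|
    "http://de.oddb.org/de/drugs/package/pzn/#{pzn}"
  end
  [
    # %5i right-aligns the integer in a field five characters wide
    sprintf("Imported %5i Zubef-Entries on %s:", count, date.strftime("%d.%m.%Y")),
    sprintf("Created %5i Incomplete Packages:", doubtfuls.size),
  ].concat doubtfuls
end

puts report_lines(12, ['4000741'], Date.new(2010, 9, 7))
```

The doubtful package URLs are appended after the counter lines, so the last counter line acts as a heading for them.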
Summary
Run the first version of test_gkv.rb
$ git checkout 8da2441beead2c66bf2d8f887ce0100a00494c18
$ ruby test/import/test_gkv.rb
Loaded suite test/import/test_gkv
Started
...!!!!! DEPRECATION NOTICE !!!!!
The WWW constant is deprecated, please switch to the new top-level Mechanize
constant. WWW will be removed in Mechanize version 2.0

You've referenced the WWW constant from test/import/test_gkv.rb:71:in `test_latest_url', please
switch the "WWW" to "Mechanize". Thanks!

Sincerely,

  Pew Pew Pew

!!!!! DEPRECATION NOTICE !!!!!
The WWW constant is deprecated, please switch to the new top-level Mechanize
constant. WWW will be removed in Mechanize version 2.0

You've referenced the WWW constant from test/import/test_gkv.rb:58:in `setup_page', please
switch the "WWW" to "Mechanize". Thanks!

Sincerely,

  Pew Pew Pew

E....
Finished in 0.015076 seconds.

  1) Error:
test_latest_url(ODDB::Import::TestGkv):
NoMethodError: undefined method `html_parser' for nil:NilClass
    /usr/lib64/ruby/gems/1.8/gems/mechanize-1.0.0/lib/mechanize/page.rb:83:in `parser'
    /home/masa/ywesee/de.oddb.org/lib/oddb/import/gkv.rb:87:in `latest_url'
    test/import/test_gkv.rb:77:in `test_latest_url'

8 tests, 46 assertions, 0 failures, 1 errors
masa@masa ~/ywesee/de.oddb.org $ git checkout 4adb046616d3cf37ac9644ad2cb26d6c29801a63
Previous HEAD position was fc894bb... FB -> Zubef
HEAD is now at 4adb046... Import unknown packages, even if data quality cannot be guaranteed.
masa@masa ~/ywesee/de.oddb.org $ ruby test/import/test_gkv.rb
Loaded suite test/import/test_gkv
Started
.FF!!!!! DEPRECATION NOTICE !!!!!
The WWW constant is deprecated, please switch to the new top-level Mechanize
constant. WWW will be removed in Mechanize version 2.0

You've referenced the WWW constant from test/import/test_gkv.rb:71:in `test_latest_url', please
switch the "WWW" to "Mechanize". Thanks!

Sincerely,

  Pew Pew Pew

!!!!! DEPRECATION NOTICE !!!!!
The WWW constant is deprecated, please switch to the new top-level Mechanize
constant. WWW will be removed in Mechanize version 2.0

You've referenced the WWW constant from test/import/test_gkv.rb:58:in `setup_page', please
switch the "WWW" to "Mechanize". Thanks!

Sincerely,

  Pew Pew Pew

E....
Finished in 0.074977 seconds.

  1) Failure:
test_import(ODDB::Import::TestGkv) [test/import/test_gkv.rb:106]:
<2> expected but was <79>.

  2) Failure:
test_import__ml(ODDB::Import::TestGkv) [test/import/test_gkv.rb:178]:
<1> expected but was <3>.

  3) Error:
test_latest_url(ODDB::Import::TestGkv):
NoMethodError: undefined method `html_parser' for nil:NilClass
    /usr/lib64/ruby/gems/1.8/gems/mechanize-1.0.0/lib/mechanize/page.rb:83:in `parser'
    /home/masa/ywesee/de.oddb.org/lib/oddb/import/gkv.rb:89:in `latest_url'
    test/import/test_gkv.rb:77:in `test_latest_url'

8 tests, 13 assertions, 2 failures, 1 errors
masa@masa ~/ywesee/de.oddb.org $ git checkout 1b0624b8fc4aa8985b5013c3d2e942bd98b3426f
Previous HEAD position was 4adb046... Import unknown packages, even if data quality cannot be guaranteed.
HEAD is now at 1b0624b... Peer ODBA caches before starting GKV-Import.
masa@masa ~/ywesee/de.oddb.org $ ruby test/import/test_gkv.rb
Loaded suite test/import/test_gkv
Started
.FF!!!!! DEPRECATION NOTICE !!!!!
The WWW constant is deprecated, please switch to the new top-level Mechanize
constant. WWW will be removed in Mechanize version 2.0

You've referenced the WWW constant from test/import/test_gkv.rb:71:in `test_latest_url', please
switch the "WWW" to "Mechanize". Thanks!

Sincerely,

  Pew Pew Pew

!!!!! DEPRECATION NOTICE !!!!!
The WWW constant is deprecated, please switch to the new top-level Mechanize
constant. WWW will be removed in Mechanize version 2.0

You've referenced the WWW constant from test/import/test_gkv.rb:58:in `setup_page', please
switch the "WWW" to "Mechanize". Thanks!

Sincerely,

  Pew Pew Pew

E....
Finished in 0.045382 seconds.

  1) Failure:
test_import(ODDB::Import::TestGkv) [test/import/test_gkv.rb:106]:
<2> expected but was <0>.

  2) Failure:
test_import__ml(ODDB::Import::TestGkv) [test/import/test_gkv.rb:178]:
<1> expected but was <3>.

  3) Error:
test_latest_url(ODDB::Import::TestGkv):
NoMethodError: undefined method `html_parser' for nil:NilClass
    /usr/lib64/ruby/gems/1.8/gems/mechanize-1.0.0/lib/mechanize/page.rb:83:in `parser'
    /home/masa/ywesee/de.oddb.org/lib/oddb/import/gkv.rb:88:in `latest_url'
    test/import/test_gkv.rb:77:in `test_latest_url'

8 tests, 13 assertions, 2 failures, 1 errors
masa@masa ~/ywesee/de.oddb.org $