
20100907 install ruby



  1. Install Ruby
  2. test_import
  3. Reading Gkv#import
  4. Search the past commit that passes all the tests of test_gkv.rb
  5. Next: searching the commit that makes the current failures

Goal
  • Install Ruby / 100%
Milestones
  1. Install Ruby 10:30
  2. Debug test_gkv.rb
    1. test_import(ODDB::Import::TestGkv) [test/import/test_gkv.rb:106]
    2. test_import__ml(ODDB::Import::TestGkv) [test/import/test_gkv.rb:178]
    3. read import method one by one
    4. check which commits pass and which fail
Summary
Commits
ToDo Tomorrow
  • Continue correcting test cases in test_gkv.rb
Keep in Mind
Attached Files
  1. Attach:oniguruma_patch_for_ruby-1.8.6_p369.20100615_from_Hannes-san.tar.gz

Install Ruby

Install Ruby via emerge normally

$ emerge ruby

Install Ruby 1.9 via emerge

$ vim /etc/portage/package.unmask
>=dev-lang/ruby-1.8.9

Select Ruby 1.8 with eselect

$ sudo eselect ruby set ruby18

Ruby source: http://ftp.ruby-lang.org/pub/ruby/1.8/ruby-1.8.6-p369.tar.gz

Install Ruby 1.8.6 with oniguruma from source code

masa@masa ~/work $ tar zxvf ruby-1.8.6-p369.tar.gz
masa@masa ~/work $ tar zxvf oniguruma_patch_for_ruby-1.8.6_p369.20100615_from_Hannes-san.tar.gz
masa@masa ~/work $ cd oniguruma
masa@masa ~/work/oniguruma $ ./configure --with-rubydir=/home/masa/work/ruby-1.8.6-p369
masa@masa ~/work/oniguruma $ make 186
masa@masa ~/work/oniguruma $ cd /home/masa/work/ruby-1.8.6-p369
masa@masa ~/work/ruby-1.8.6-p369 $ ./configure --prefix=/home/masa/bin/ruby186
masa@masa ~/work/ruby-1.8.6-p369 $ make
masa@masa ~/work/ruby-1.8.6-p369 $ make install

Set PATH

$ vim ~/.bashrc
export PATH=/home/masa/bin/ruby186/bin:$PATH

Confirm

masa@masa ~/ywesee/de.oddb.org $ ruby -v
ruby 1.8.6 (2009-06-08 patchlevel 369) [x86_64-linux]

masa@masa ~/ywesee/de.oddb.org $ gem list

*** LOCAL GEMS ***

activesupport (2.3.8)
archive-tar-minitar (0.5.2)
archive-tarsimple (1.1.1)
character-encodings (0.4.1)
color (1.4.0)
columnize (0.3.1)
csvparser (0.1.1)
facets (1.8.54)
fastercsv (1.5.3)
flexmock (0.8.6)
gd2 (1.1.1)
gruff (0.3.6)
hoe (2.6.1)
hpricot (0.8.2, 0.6.164)
htmlentities (4.2.1)
json (1.4.3)
json_pure (1.4.3)
linecache (0.43)
mechanize (1.0.0)
money (3.0.2)
needle (1.3.0)
nokogiri (1.4.2)
oniguruma (1.1.0)
parseexcel (0.5.2)
paypal (2.0.0)
pdf-writer (1.1.8)
pg (0.9.0)
postgres (0.7.9.2008.01.28)
rake (0.8.7)
rcov (0.9.8)
rmagick (2.9.0)
rmail (1.0.0)
rockit (0.7.2)
ruby-debug (0.10.3)
ruby-debug-base (0.10.3)
ruby-ole (1.2.10.1)
ruby-termios (0.9.6)
rubyforge (2.0.4)
rubyzip (0.9.4)
setup (5.0.1)
spreadsheet (0.6.4.1)
swissmedic-diff (0.1.3)
text-hyphen (1.0.0)
tmail (1.2.7.1)
transaction-simple (1.4.0)
turing (0.0.11)
version (0.9.2)
vim-ruby (2007.05.07)

Test run de.oddb.org bin/oddbd

masa@masa ~/ywesee/de.oddb.org $ bin/oddbd 
ruby: no such file to load -- auto_gem (LoadError)

FAIL


Check source code

$ locate ruby-1.8.6
/usr/portage/distfiles/ruby-1.8.6-p369.tar.bz2

Download ruby-1.8.6_p369.ebuild

masa@masa /usr/portage/dev-lang/ruby $ sudo ebuild /usr/portage/dev-lang/ruby/ruby-1.8.6_p369.ebuild digest
>>> Creating Manifest for /usr/portage/dev-lang/ruby
masa@masa /usr/portage/dev-lang/ruby $ sudo ebuild /usr/portage/dev-lang/ruby/ruby-1.8.6_p369.ebuild compile
>>> Existing ${T}/environment for 'ruby-1.8.6_p369' will be sourced. Run
>>> 'clean' to start with a fresh environment.
 * ruby-1.8.6-p369.tar.bz2 RMD160 SHA1 SHA256 size ;-) ... [ ok ]
 * checking ebuild checksums ;-) ... [ ok ]
 * checking auxfile checksums ;-) ... [ ok ]
 * checking miscfile checksums ;-) ... [ ok ]
 * checking ruby-1.8.6-p369.tar.bz2 ;-) ... [ ok ]
 * CPV:  dev-lang/ruby-1.8.6_p369
 * REPO: funtoo
 * USE:  amd64 berkdb elibc_glibc gdbm ipv6 kernel_linux multilib ssl userland_GNU
>>> Checking ruby-1.8.6-p369.tar.bz2's mtime...
>>> WORKDIR is up-to-date, keeping...
>>> It appears that 'ruby-1.8.6_p369' is already compiled; skipping.
>>> Remove '/var/tmp/portage/dev-lang/ruby-1.8.6_p369/.compiled' to force compilation.

masa@masa /usr/portage/dev-lang/ruby $ sudo ebuild /usr/portage/dev-lang/ruby/ruby-1.8.6_p369.ebuild install
masa@masa /usr/portage/dev-lang/ruby $ sudo ebuild /usr/portage/dev-lang/ruby/ruby-1.8.6_p369.ebuild qmerge

Confirm

masa@masa /usr/portage/dev-lang/ruby $ ruby -v
ruby 1.8.6 (2009-06-08 patchlevel 369) [x86_64-linux]

masa@masa ~/ywesee/de.oddb.org $ bin/oddbd
/home/masa/ywesee/de.oddb.org/lib/oddb/html/view/drugs/package.rb:373: warning: parenthesize argument(s) for future version
I, [2010-09-07T09:57:47.614035 #2821]  INFO -- start: starting oddb-server on druby://localhost:11000

Correct YAML.rb bug

$ sudo vim /usr/lib64/ruby/1.8/yaml/rubytypes.rb
    def is_binary_data?
        #( self.count( "^ -~", "^\r\n" ) / self.size > 0.3 || self.count( "\x00" ) > 0 ) unless empty?
        ( self.count( "^ -~", "^\r\n" ) / self.size > 0.3 || self.index( "\x00" ) > 0 ) unless empty?
    end

NOTE

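  • This is wrong! String#index returns nil when the string contains no "\x00", so nil > 0 raises NoMethodError, and a match at position 0 fails the > 0 test. It should be as follows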
$ sudo vim /usr/lib64/ruby/1.8/yaml/rubytypes.rb
    def is_binary_data?
        #( self.count( "^ -~", "^\r\n" ) / self.size > 0.3 || self.count( "\x00" ) > 0 ) unless empty?
        ( self.count( "^ -~", "^\r\n" ) / self.size > 0.3 || self.index( "\x00" ) ) unless empty?
    end
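
The difference between the two versions can be checked in isolation. The following is a minimal sketch, assuming two hypothetical helper names (binary_first_attempt? and binary_corrected?) that take the string as an argument instead of patching String; it is not the code shipped in yaml/rubytypes.rb.

```ruby
# Hypothetical stand-alone helpers; the real method is defined on String.

# First attempt: String#index returns nil when no NUL byte exists,
# so the comparison with 0 raises NoMethodError for ordinary text.
def binary_first_attempt?(str)
  return nil if str.empty?
  str.count("^ -~", "^\r\n") / str.size > 0.3 || str.index("\x00") > 0
end

# Corrected version: the index itself (an Integer, or nil) is the result.
# In Ruby every Integer, including 0, is truthy.
def binary_corrected?(str)
  return nil if str.empty?
  str.count("^ -~", "^\r\n") / str.size > 0.3 || str.index("\x00")
end

begin
  binary_first_attempt?("plain text")
rescue NoMethodError => e
  puts "first attempt fails: #{e.class}"
end

puts binary_corrected?("plain text").inspect
puts binary_corrected?("\x00abc").inspect
```

The corrected method returns a truthy or falsy value rather than a strict boolean, which is sufficient for a predicate used in a condition.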

Summary

  1. Download ruby-1.8.6_p369.ebuild http://sources.gentoo.org/cgi-bin/viewvc.cgi/gentoo-x86/dev-lang/ruby/ruby-1.8.6_p369.ebuild?hideattic=0&view=log
  2. Save it in /usr/portage/dev-lang/ruby/
  3. $ sudo ebuild /usr/portage/dev-lang/ruby/ruby-1.8.6_p369.ebuild digest
  4. $ sudo ebuild /usr/portage/dev-lang/ruby/ruby-1.8.6_p369.ebuild install
  5. $ sudo ebuild /usr/portage/dev-lang/ruby/ruby-1.8.6_p369.ebuild qmerge
  6. Correct YAML.rb bug

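Comment: the emerge command usually runs the following ebuild phases

  1. ebuild ebuild.package fetch / download the archive into /usr/portage/distfiles/
  2. ebuild ebuild.package unpack / unpack the source code into /var/tmp/portage/
  3. ebuild ebuild.package compile
  4. ebuild ebuild.package install / install into a temporary image directory
  5. ebuild ebuild.package qmerge / merge the installed image into the live filesystem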

If I had not already had the source code, I would have proceeded as follows

  1. Download ruby-1.8.6_p369.ebuild
  2. $ sudo ebuild /usr/portage/dev-lang/ruby/ruby-1.8.6_p369.ebuild digest
  3. $ sudo ebuild /usr/portage/dev-lang/ruby/ruby-1.8.6_p369.ebuild fetch
  4. $ sudo ebuild /usr/portage/dev-lang/ruby/ruby-1.8.6_p369.ebuild unpack
  5. Download oniguruma patch (Attach:oniguruma_patch_for_ruby-1.8.6_p369.20100615_from_Hannes-san.tar.gz)
  6. $ tar zxvf oniguruma_patch_for_ruby-1.8.6_p369.20100615_from_Hannes-san.tar.gz
  7. $ cd oniguruma
  8. $ ./configure --with-rubydir=/var/tmp/portage/dev-lang/ruby-1.8.6_p369/work/ruby-1.8.6-p369
  9. $ make 186
  10. $ sudo ebuild /usr/portage/dev-lang/ruby/ruby-1.8.6_p369.ebuild compile
  11. $ sudo ebuild /usr/portage/dev-lang/ruby/ruby-1.8.6_p369.ebuild install
  12. $ sudo ebuild /usr/portage/dev-lang/ruby/ruby-1.8.6_p369.ebuild qmerge

Link

  1. ebuild http://transit-serv.sakura.ne.jp/wiki/wiki.cgi?page=Gentoo+Linux%2FPortage%B4%D8%CF%A2#p8 (Japanese)

test_import

Confirm failure again

masa@masa ~/ywesee/de.oddb.org $ ruby test/import/test_gkv.rb 
Loaded suite test/import/test_gkv
Started
.FF.....
Finished in 0.088421 seconds.

  1) Failure:
test_import(ODDB::Import::TestGkv) [test/import/test_gkv.rb:106]:
<2> expected but was
<0>.

  2) Failure:
test_import__ml(ODDB::Import::TestGkv) [test/import/test_gkv.rb:178]:
<1> expected but was
<3>.

8 tests, 14 assertions, 2 failures, 0 errors

Reading source code

  1. test/import/test_gkv.rb
  2. lib/oddb/import/gkv.rb

assert_equal 2, @invalidator.invalidated.size

Comment out the other assertions

  def test_import
    existing = Drugs::Package.new
    existing.add_code(Util::Code.new(:cid, '4000741', 'DE'))
    existing.add_part(Drugs::Part.new)
    existing.save
    sequence = Drugs::Sequence.new
    product = Drugs::Product.new
    product.name.de = 'A product'
    existing.sequence = sequence
    sequence.product = product
    assert_nil(existing.code(:zuzahlungsbefreit))

    ## simulate a call to @import.import
    report = simulate_import

    assert_equal 2, @invalidator.invalidated.size

Confirm

masa@masa ~/ywesee/de.oddb.org $ ruby test/import/test_gkv.rb 
Loaded suite test/import/test_gkv
Started
F
Finished in 0.019398 seconds.

  1) Failure:
test_import(ODDB::Import::TestGkv) [test/import/test_gkv.rb:108]:
<2> expected but was
<0>.

1 tests, 2 assertions, 1 failures, 0 errors

Stop

BraSt

  1. @invalidator == Invalidator instance only in test_gkv.rb
  2. @import == Gkv instance
  3. what is Gkv?
  4. class Gkv < Importer
  5. Where is the caller of this class?
  6. jobs/import_gkv
    1. Updater.import_gkv
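
The failing assertion counts the entries of @invalidator.invalidated, so the invalidator in the test is presumably a test double that only records what it is asked to invalidate. A minimal sketch of such a recorder follows; the class name RecordingInvalidator and the method name invalidate are assumptions for illustration, not taken from test_gkv.rb.

```ruby
# Hypothetical test double: it only remembers what it was asked to invalidate.
class RecordingInvalidator
  attr_reader :invalidated

  def initialize
    @invalidated = []
  end

  # Called by the code under test; the argument is stored and nothing else happens.
  def invalidate(object)
    @invalidated << object
    nil
  end
end

invalidator = RecordingInvalidator.new
invalidator.invalidate(:package_a)
invalidator.invalidate(:package_b)
puts invalidator.invalidated.size
```

Under this reading, "<2> expected but was <0>" would mean that the import never reached the call that notifies the invalidator.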

lib/oddb/util/updater.rb

      def Updater.import_gkv(opts = {})
        importer = Import::Gkv.new
        if url = importer.latest_url(Mechanize.new, opts)
          importer.download_latest url, opts do |fh|
            reported_import(importer, fh,
                            :subject => 'Zubef', :filetype => 'PDF')
          end
        end
      end

Consideration

  1. Gkv#import does not appear to be called directly from the import_gkv script
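
Updater.import_gkv passes the file handle to reported_import inside a block, so the call to import is probably made there. The sketch below models only that control flow, assuming reported_import simply calls importer.import(fh); SketchImporter, reported_import_sketch and the example URL are hypothetical names, not the real de.oddb.org code.

```ruby
require 'stringio'

# Hypothetical stand-in for Import::Gkv; only the control flow is modelled.
class SketchImporter
  attr_reader :imported

  # Yields an IO-like object to the caller's block, as download_latest does with |fh|.
  def download_latest(url)
    yield StringIO.new("contents of #{url}")
  end

  # Consumes the handle and returns the report lines.
  def import(fh)
    @imported = fh.read
    ["Imported #{@imported.size} bytes"]
  end
end

# Assumed behaviour of reported_import: call importer.import and hand back the report.
def reported_import_sketch(importer, fh)
  importer.import(fh)
end

importer = SketchImporter.new
report = nil
importer.download_latest('http://example.invalid/zubef.pdf') do |fh|
  report = reported_import_sketch(importer, fh)
end
puts report
```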

Reading Gkv#import

lib/oddb/import/gkv.rb

  def import fh, opts={}
    parser = Rpdf2txt::Parser.new(fh.read, 'utf8')
    handler = GkvHandler.new method(:process_page)
    parser.extract_text handler
    postprocess
    report
  end

Comments

  1. parser = Rpdf2txt::Parser.new(fh.read, 'utf8')
  2. handler = GkvHandler.new method(:process_page)
    • class GkvHandler < Rpdf2txt::SimpleHandler (lib/oddb/import/gkv.rb)
  3. parser.extract_text handler
    • I guess this parses the PDF file and extracts its text
  4. postprocess
    • calls the postprocess method
  5. report
    • calls the report method, probably for sending an email
    • the return value of this method becomes the return value of the import method
    • the return value is an Array
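
The expression method(:process_page) creates a Method object that the handler can call later, once per page. The following sketch shows that callback pattern with hypothetical classes (PageHandlerSketch, ImporterSketch); how Rpdf2txt actually drives its handler is not shown in these notes and is assumed here.

```ruby
# Hypothetical handler: receives a callable and invokes it once per page.
class PageHandlerSketch
  def initialize(callback)
    @callback = callback
  end

  def send_page(text)
    @callback.call(text)
  end
end

class ImporterSketch
  attr_reader :pages

  def initialize
    @pages = []
  end

  # The callback target: keeps each page in an instance variable.
  def process_page(text)
    @pages << text.strip
  end

  # Wires the handler to process_page, then feeds it the raw pages.
  def import(raw_pages)
    handler = PageHandlerSketch.new(method(:process_page))
    raw_pages.each { |page| handler.send_page(page) }
    @pages.size
  end
end

importer = ImporterSketch.new
count = importer.import([" page one \n", " page two \n"])
puts count
```

If Gkv works the same way, the parsed data would be collected in instance variables of the importer while extract_text runs.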

postprocess method

http://scm.ywesee.com/?p=de.oddb.org/.git;a=blob;f=lib/oddb/import/gkv.rb;h=7bbd7bbe2e8d8b90be58d81617f4d19dc98dfd28;hb=HEAD#l350

  def postprocess
    #
    # 1. searching drug packages anyway with some specific keys
    #
    Drugs::Package.search_by_code(:type => 'zuzahlungsbefreit',
                                  :value => 'true',
                                  :country => 'DE').each { |package|
      # what is pzn? cid code?
      # what is @confirmed_pzns?
      pzn = package.code(:cid).value
      # count the package as deleted and save it?
      unless(@confirmed_pzns.include?(pzn))
        @deleted += 1
        package.code(:zuzahlungsbefreit).value = false
        save package
      end
    } unless(@confirmed_pzns.empty?)


    #
    # 2. parsing drug products data
    #
    Drugs::Product.all { |product|
      unless(product.company)
        keys = product.name.de.split
        key = keys.pop
        if(key == 'Comp')
          key = keys.pop
        end
        company = Business::Company.find_by_name(key)
        if(company.nil?)
          companies = Business::Company.search_by_name(key)
          if(companies.size == 1)
            company = companies.pop
          end
        end
        if(company)
          @assigned_companies += 1    # this is probably the main point
          product.company = company
          save product                # save product information
        end                           # what is the save method?
      end
    }

    #
    # 3. parsing drug composition data
    #
    Drugs::Composition.all { |composition|
      next if(composition.active_agents.size < 2)
      composition.active_agents.dup.each { |agent|
        next unless composition.active_agents.include?(agent)
        name = agent.substance.name.de
        if(other = composition.active_agents.find { |candidate|
          candidate != agent \
            && candidate.substance.name.de[0,name.length] == name })
          qty = other.dose.qty
          if(qty > 0 && qty == qty.to_i && !other.chemical_equivalence)
            agent, other = other, agent    # swapping!?
          end
          if(agent.chemical_equivalence)   # raise an error
            raise "multiple chemical equivalences in #{composition.parts.first.package.code(:cid)}"
          end
          @assigned_equivalences += 1      # these below are probably the main points
          composition.remove_active_agent(other)
          agent.chemical_equivalence = other
          save agent
          save other
          save composition
        end
      }
    }
  end
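
Step 2 of postprocess derives a company search key from the product name: the last word, or the word before it when the name ends in 'Comp'. A small sketch of just that rule, with a hypothetical helper name (company_key) and made-up product names:

```ruby
# Returns the word used to look up a company: the last word of the name,
# or the one before it when the name ends in 'Comp'.
def company_key(product_name)
  keys = product_name.split
  key = keys.pop
  key = keys.pop if key == 'Comp'
  key
end

puts company_key('Example Product Acme')
puts company_key('Example Acme Comp')
```

The helper returns nil for an empty name or for a name that consists only of 'Comp', which the surrounding code would then pass to the company lookup.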

Summary

  1. postprocess goes through the drug data (packages, products and compositions) and saves the records that meet particular conditions
  2. I do not understand what those conditions mean.
  3. I guess they are related to the meaning of 'import gkv'.
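
One of the conditions in step 3 can be isolated: an active agent is paired with another agent whose substance name starts with the same text. The sketch below applies that prefix test to plain strings; the helper name prefixed_partner and the substance names are hypothetical.

```ruby
# Finds the first other name that begins with `name`,
# mirroring candidate.substance.name.de[0, name.length] == name.
def prefixed_partner(name, candidates)
  candidates.find do |candidate|
    candidate != name && candidate[0, name.length] == name
  end
end

names = ['Metoprolol', 'Metoprolol succinat']
puts prefixed_partner('Metoprolol', names).inspect
puts prefixed_partner('Metoprolol', ['Metoprolol', 'Hydrochlorothiazid']).inspect
```

In the real method the matching pair is then linked through chemical_equivalence and one of the two agents is removed from the composition.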

report method

http://scm.ywesee.com/?p=de.oddb.org/.git;a=blob;f=lib/oddb/import/gkv.rb;h=7bbd7bbe2e8d8b90be58d81617f4d19dc98dfd28;hb=HEAD#l412

  # this method clearly reports the results of the import
  def report
    doubtfuls = @doubtful_pzns.collect do |pzn|
      "http://de.oddb.org/de/drugs/package/pzn/#{pzn}"
    end
    [
      sprintf("Imported %5i Zubef-Entries on %s:",
              @count, Date.today.strftime("%d.%m.%Y")),
      sprintf("Visited  %5i existing Zubef-Entries", @existing),
      sprintf("Visited  %5i existing Companies",
              @existing_companies),
      sprintf("Visited  %5i existing Substances",
              @existing_substances),
      sprintf("Created  %5i new Zubef-Entries", @created),
      sprintf("Created  %5i new Products", @created_products),
      sprintf("Created  %5i new Sequences", @created_sequences),
      sprintf("Created  %5i new Companies", @created_companies),
      sprintf("Created  %5i new Substances", @created_substances),
      sprintf("Assigned %5i Chemical Equivalences",
              @assigned_equivalences),
      sprintf("Assigned %5i Companies", @assigned_companies),
      sprintf("Created  %5i Incomplete Packages:", doubtfuls.size),
    ].concat doubtfuls
  end

Summary

  • I guess this is an Array of information for the report of 'import gkv'
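
How the Array is built can be tried with two of the lines from the report method. This is a reduced sketch with hypothetical counter values and a made-up PZN in place of the importer's instance variables:

```ruby
# Hypothetical values standing in for @created and @doubtful_pzns.
created = 3
doubtful_pzns = ['1234567']

doubtfuls = doubtful_pzns.collect do |pzn|
  "http://de.oddb.org/de/drugs/package/pzn/#{pzn}"
end

# %5i right-aligns the number in a field of five characters.
lines = [
  sprintf("Created  %5i new Zubef-Entries", created),
  sprintf("Created  %5i Incomplete Packages:", doubtfuls.size),
].concat doubtfuls

puts lines
```

The result is a flat Array of strings: the fixed summary lines first, followed by one URL per doubtful package.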

Search the past commit that passes all the tests of test_gkv.rb

Test the first test_gkv.rb

$ git checkout 8da2441beead2c66bf2d8f887ce0100a00494c18

$ ruby test/import/test_gkv.rb 
Loaded suite test/import/test_gkv
Started
...!!!!! DEPRECATION NOTICE !!!!!
The WWW constant is deprecated, please switch to the new top-level Mechanize
constant.  WWW will be removed in Mechanize version 2.0

You've referenced the WWW constant from test/import/test_gkv.rb:71:in `test_latest_url', please
switch the "WWW" to "Mechanize".  Thanks!

Sincerely,

  Pew Pew Pew

!!!!! DEPRECATION NOTICE !!!!!
The WWW constant is deprecated, please switch to the new top-level Mechanize
constant.  WWW will be removed in Mechanize version 2.0

You've referenced the WWW constant from test/import/test_gkv.rb:58:in `setup_page', please
switch the "WWW" to "Mechanize".  Thanks!

Sincerely,

  Pew Pew Pew

E....
Finished in 0.015076 seconds.

  1) Error:
test_latest_url(ODDB::Import::TestGkv):
NoMethodError: undefined method `html_parser' for nil:NilClass
    /usr/lib64/ruby/gems/1.8/gems/mechanize-1.0.0/lib/mechanize/page.rb:83:in `parser'
    /home/masa/ywesee/de.oddb.org/lib/oddb/import/gkv.rb:87:in `latest_url'
    test/import/test_gkv.rb:77:in `test_latest_url'

8 tests, 46 assertions, 0 failures, 1 errors
  1. Pew Pew Pew?!
  2. The tests appear to pass
  3. The error is probably the one that I have already resolved.

Next: searching the commit that makes the current failures

  1. 2 failures, but not the same as the current ones: 2009-09-07 Hannes Wyss Import unknown packages, even if data quality cannot... 4adb046616d3cf37ac9644ad2cb26d6c29801a63
  2. The same 2 failures: 2009-11-10 Hannes Wyss Peer ODBA caches before starting GKV-Import. 1b0624b8fc4aa8985b5013c3d2e942bd98b3426f
masa@masa ~/ywesee/de.oddb.org $ git checkout 4adb046616d3cf37ac9644ad2cb26d6c29801a63
Previous HEAD position was fc894bb... FB -> Zubef
HEAD is now at 4adb046... Import unknown packages, even if data quality cannot be guaranteed.
masa@masa ~/ywesee/de.oddb.org $ ruby test/import/test_gkv.rb 
Loaded suite test/import/test_gkv
Started
.FF!!!!! DEPRECATION NOTICE !!!!!
The WWW constant is deprecated, please switch to the new top-level Mechanize
constant.  WWW will be removed in Mechanize version 2.0

You've referenced the WWW constant from test/import/test_gkv.rb:71:in `test_latest_url', please
switch the "WWW" to "Mechanize".  Thanks!

Sincerely,

  Pew Pew Pew

!!!!! DEPRECATION NOTICE !!!!!
The WWW constant is deprecated, please switch to the new top-level Mechanize
constant.  WWW will be removed in Mechanize version 2.0

You've referenced the WWW constant from test/import/test_gkv.rb:58:in `setup_page', please
switch the "WWW" to "Mechanize".  Thanks!

Sincerely,

  Pew Pew Pew

E....
Finished in 0.074977 seconds.

  1) Failure:
test_import(ODDB::Import::TestGkv) [test/import/test_gkv.rb:106]:
<2> expected but was
<79>.

  2) Failure:
test_import__ml(ODDB::Import::TestGkv) [test/import/test_gkv.rb:178]:
<1> expected but was
<3>.

  3) Error:
test_latest_url(ODDB::Import::TestGkv):
NoMethodError: undefined method `html_parser' for nil:NilClass
    /usr/lib64/ruby/gems/1.8/gems/mechanize-1.0.0/lib/mechanize/page.rb:83:in `parser'
    /home/masa/ywesee/de.oddb.org/lib/oddb/import/gkv.rb:89:in `latest_url'
    test/import/test_gkv.rb:77:in `test_latest_url'

8 tests, 13 assertions, 2 failures, 1 errors

masa@masa ~/ywesee/de.oddb.org $ git checkout 1b0624b8fc4aa8985b5013c3d2e942bd98b3426f
Previous HEAD position was 4adb046... Import unknown packages, even if data quality cannot be guaranteed.
HEAD is now at 1b0624b... Peer ODBA caches before starting GKV-Import.
masa@masa ~/ywesee/de.oddb.org $ ruby test/import/test_gkv.rb 
Loaded suite test/import/test_gkv
Started
.FF!!!!! DEPRECATION NOTICE !!!!!
The WWW constant is deprecated, please switch to the new top-level Mechanize
constant.  WWW will be removed in Mechanize version 2.0

You've referenced the WWW constant from test/import/test_gkv.rb:71:in `test_latest_url', please
switch the "WWW" to "Mechanize".  Thanks!

Sincerely,

  Pew Pew Pew

!!!!! DEPRECATION NOTICE !!!!!
The WWW constant is deprecated, please switch to the new top-level Mechanize
constant.  WWW will be removed in Mechanize version 2.0

You've referenced the WWW constant from test/import/test_gkv.rb:58:in `setup_page', please
switch the "WWW" to "Mechanize".  Thanks!

Sincerely,

  Pew Pew Pew

E....
Finished in 0.045382 seconds.

  1) Failure:
test_import(ODDB::Import::TestGkv) [test/import/test_gkv.rb:106]:
<2> expected but was
<0>.

  2) Failure:
test_import__ml(ODDB::Import::TestGkv) [test/import/test_gkv.rb:178]:
<1> expected but was
<3>.

  3) Error:
test_latest_url(ODDB::Import::TestGkv):
NoMethodError: undefined method `html_parser' for nil:NilClass
    /usr/lib64/ruby/gems/1.8/gems/mechanize-1.0.0/lib/mechanize/page.rb:83:in `parser'
    /home/masa/ywesee/de.oddb.org/lib/oddb/import/gkv.rb:88:in `latest_url'
    test/import/test_gkv.rb:77:in `test_latest_url'

8 tests, 13 assertions, 2 failures, 1 errors
masa@masa ~/ywesee/de.oddb.org $ 
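
The commit search in this section was done by checking out individual commits by hand. When there is a known good commit and a known bad one, the search can be narrowed by halving the range each time, which is what git bisect automates. The sketch below shows only the halving logic on a hypothetical list of commit labels; it assumes that every commit before the culprit passes and every commit from the culprit onwards fails.

```ruby
# Returns the first commit for which the block reports a failure, assuming
# all earlier commits pass and the last commit in the list fails.
def first_failing_commit(commits)
  low = 0
  high = commits.size - 1
  while low < high
    mid = (low + high) / 2
    if yield(commits[mid])
      high = mid        # failure: the culprit is at mid or earlier
    else
      low = mid + 1     # pass: the culprit is later
    end
  end
  commits[low]
end

# Hypothetical history, oldest first. In practice the block would check out
# the commit and run test/import/test_gkv.rb.
history = %w[c0 c1 c2 c3 c4 c5 c6 c7]
broken_since = 'c5'
culprit = first_failing_commit(history) { |c| c >= broken_since }
puts culprit
```

The test run here changes in more than one way over time (different failure counts at 4adb046 and 1b0624b), so the block would have to test for one specific symptom, for example the "<2> expected but was <0>" failure.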

BraSt

  1. The behaviour of Gkv#import appears to have changed in the commit of 2009-11-10
  2. Search for the place where Gkv#import is called
  3. Delete unnecessary assertions from the test_import method in test_gkv.rb
  4. Why did Hannes-san leave many assertions commented out in test_gkv.rb?
  5. If I were Hannes-san, I would write an import method that does two things
    1. get some information from somewhere
    2. save it in the database
  6. As currently written, the import method parses a PDF file
  7. The save method probably stores the imported data in the database, probably via odba
  8. I do not understand where the parsed data is kept temporarily
    1. parser.extract_text handler
  9. Study Rpdf2txt more
  10. The simulate_import method probably still works
  11. I should study the meaning and function of the import method from the test code
Page last modified on June 13, 2013, at 02:48 AM