
20110114-debug-import_gkv-de_oddb_org-update-rpdf2txt



  1. warning when emerge --sync
  2. Restart import_bsv (oddb.org)
  3. Debug import_gkv (de.oddb.org)
  4. Check 'name' value of import_active_agent method
  5. Check Rpdf2txt by Ruby1.9
  6. Set up de.oddb.org on Ubuntu suspend
  7. Update rpdf2txt for Ruby1.9

Goal
  • Set up de.oddb.org on Windows7 / 80%
Milestones
  1. Debug import_gkv 13:30
  2. Check name of import_active_agent 14:20
  3. Check Rpdf2txt by Ruby1.9 14:50
  4. Install de.oddb.org on Ubuntu through gem suspend
  5. Update rpdf2txt for Ruby1.9
Summary
Commits
ToDo Tomorrow
Keep in Mind
  1. emerge portage warning
  2. On Ice
  3. emerge --sync

warning when emerge --sync

Running 'emerge --sync' prints the following warning

Performing Global Updates: /usr/portage/profiles/updates/1Q-2011
(Could take a couple of minutes if you have a lot of binary packages.)
  .='update pass'  *='binary update'  #='/var/db update'  @='/var/db move'
  s='/var/db SLOT move'  %='binary move'  S='binary SLOT move'
  p='update /etc/portage/package.*'
....



 * An update to portage is available. It is _highly_ recommended
 * that you update portage now, before any other packages are updated.

 * To update portage, run 'emerge portage' now.

masa@masa ~ $ sudo emerge portage
Passwort: 
Calculating dependencies... done!

>>> Verifying ebuild manifests

!!! A file listed in the Manifest could not be found: /usr/portage/sys-apps/portage/portage-9999.ebuild

Note

  • 'emerge portage' fails: portage-9999.ebuild is listed in the Manifest but cannot be found

Restart import_bsv (oddb.org)

Plugin: ODDB::BsvXmlPlugin
Error: RuntimeError
Message: could not connect to www.medwin.ch: #<Net::HTTPInternalServerError:0x7f811a679cb8>
@report: {:name_descr=>"Tabl 100 mg ", :pharmacode_bag=>"2159644", :atc_class=>"B01AC06", :deductible=>:deductible_g, :swissmedic_no5_bag=>"55197", :generic_type=>:unknown, :name_base=>"Thrombace Neo 100"}
Backtrace:
(druby://localhost:10006) /var/www/oddb.org/src/util/http.rb:82:in `post'
(druby://localhost:10006) /var/www/oddb.org/ext/meddata/src/session.rb:97:in `get_result_list'
(druby://localhost:10006) /var/www/oddb.org/ext/meddata/src/drbsession.rb:31:in `search'
(druby://thinpower:47886) /var/www/oddb.org/src/plugin/bsv_xml.rb:244:in `load_ikskey'
(druby://thinpower:47886) /usr/lib64/ruby/1.8/drb/drb.rb:1555:in `call'
(druby://localhost:10006) /usr/lib64/ruby/1.8/drb/invokemethod.rb:10:in `block_yield'
(druby://localhost:10006) /usr/lib64/ruby/1.8/drb/invokemethod.rb:17:in `perform_with_block'
(druby://localhost:10006) /var/www/oddb.org/ext/meddata/src/meddata.rb:15:in `session'
/var/www/oddb.org/src/plugin/bsv_xml.rb:243:in `load_ikskey'
/var/www/oddb.org/src/plugin/bsv_xml.rb:286:in `tag_start'
/usr/lib64/ruby/1.8/rexml/parsers/streamparser.rb:24:in `parse'
/usr/lib64/ruby/1.8/rexml/document.rb:200:in `parse_stream'
/var/www/oddb.org/src/plugin/bsv_xml.rb:978:in `update_preparations'
/var/www/oddb.org/src/plugin/bsv_xml.rb:659:in `send'
/var/www/oddb.org/src/plugin/bsv_xml.rb:659:in `_update'
/usr/lib64/ruby/gems/1.8/gems/rubyzip-0.9.1/lib/zip/zip.rb:761:in `get_input_stream'
/var/www/oddb.org/src/plugin/bsv_xml.rb:659:in `_update'
/usr/lib64/ruby/gems/1.8/gems/rubyzip-0.9.1/lib/zip/zip.rb:1123:in `each'
/usr/lib64/ruby/gems/1.8/gems/rubyzip-0.9.1/lib/zip/zip.rb:1123:in `each'
/usr/lib64/ruby/gems/1.8/gems/rubyzip-0.9.1/lib/zip/zip.rb:1266:in `each'
/usr/lib64/ruby/gems/1.8/gems/rubyzip-0.9.1/lib/zip/zip.rb:1403:in `foreach'
/usr/lib64/ruby/gems/1.8/gems/rubyzip-0.9.1/lib/zip/zip.rb:1382:in `open'
/usr/lib64/ruby/gems/1.8/gems/rubyzip-0.9.1/lib/zip/zip.rb:1401:in `foreach'
/var/www/oddb.org/src/plugin/bsv_xml.rb:655:in `_update'
/var/www/oddb.org/src/plugin/bsv_xml.rb:650:in `update'
/var/www/oddb.org/src/util/updater.rb:247:in `update_bsv'
/var/www/oddb.org/src/util/updater.rb:456:in `call'
/var/www/oddb.org/src/util/updater.rb:456:in `wrap_update'
/var/www/oddb.org/src/util/updater.rb:245:in `update_bsv'
/var/www/oddb.org/src/util/updater.rb:209:in `run'
/var/www/oddb.org/jobs/import_daily:13
/var/www/oddb.org/src/util/job.rb:17:in `call'
/var/www/oddb.org/src/util/job.rb:17:in `run'
/var/www/oddb.org/jobs/import_daily:12

Note

  • This is a connection error (www.medwin.ch answered with an HTTP Internal Server Error)
  • We should try to run import_bsv again
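
Since the failure looks transient, one way to make such a step more robust is to retry it a few times before giving up. This is a minimal sketch only: `with_retries` is a hypothetical helper, not part of oddb.org, and the failing call is simulated.

```ruby
# Hypothetical helper (not part of oddb.org): retry a block a few times
# when it raises RuntimeError, the error class seen in the report above.
def with_retries(attempts, pause = 0)
  tries = 0
  begin
    tries += 1
    yield tries
  rescue RuntimeError
    raise if tries >= attempts   # give up after the last attempt
    sleep pause
    retry
  end
end

# Simulated flaky call: fails twice, then succeeds.
result = with_retries(3) do |n|
  raise "could not connect" if n < 3
  "connected on attempt #{n}"
end
puts result
```

For a real import, a pause of several seconds between attempts would probably be more polite to the remote server than the default of 0 used here.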

Run import_bsv again online

$ cd /var/www/oddb.org
$ sudo -u apache jobs/import_bsv

Result

Debug import_gkv (de.oddb.org)

Fri Jan 14 02:00:06 2011: de.oddb.org ODDB::Import::Gkv#import
NoMethodError
private method `gsub' called for nil:NilClass
/var/www/de.oddb.org/lib/oddb/util/multilingual.rb:60:in `all'
/var/www/de.oddb.org/lib/oddb/util/multilingual.rb:60:in `collect'
/var/www/de.oddb.org/lib/oddb/util/multilingual.rb:60:in `all'
/var/www/de.oddb.org/lib/oddb/import/gkv.rb:175:in `import_active_agent'
/var/www/de.oddb.org/lib/oddb/util/multilingual.rb:60:in `find'
/var/www/de.oddb.org/lib/oddb/import/gkv.rb:174:in `each'
/var/www/de.oddb.org/lib/oddb/import/gkv.rb:174:in `find'
/var/www/de.oddb.org/lib/oddb/import/gkv.rb:174:in `import_active_agent'
/var/www/de.oddb.org/lib/oddb/import/gkv.rb:160:in `import_row'
/var/www/de.oddb.org/lib/oddb/import/gkv.rb:458:in `process_page'
/var/www/de.oddb.org/lib/oddb/import/gkv.rb:457:in `each'
/var/www/de.oddb.org/lib/oddb/import/gkv.rb:457:in `process_page'
/var/www/de.oddb.org/lib/oddb/import/gkv.rb:40:in `call'
/var/www/de.oddb.org/lib/oddb/import/gkv.rb:40:in `send_page'
/usr/lib64/ruby/site_ruby/1.8/rpdf2txt/parser.rb:42:in `extract_text'
/usr/lib64/ruby/site_ruby/1.8/rpdf2txt/object.rb:479:in `each'
/usr/lib64/ruby/site_ruby/1.8/rpdf2txt/object.rb:442:in `each'
/usr/lib64/ruby/site_ruby/1.8/rpdf2txt/object.rb:479:in `each'
/usr/lib64/ruby/site_ruby/1.8/rpdf2txt/object.rb:478:in `each'
/usr/lib64/ruby/site_ruby/1.8/rpdf2txt/object.rb:463:in `each'
/usr/lib64/ruby/site_ruby/1.8/rpdf2txt/parser.rb:40:in `extract_text'
/var/www/de.oddb.org/lib/oddb/import/gkv.rb:100:in `import'
/var/www/de.oddb.org/lib/oddb/util/updater.rb:106:in `reported_import'
/var/www/de.oddb.org/lib/oddb/util/updater.rb:113:in `call'
/var/www/de.oddb.org/lib/oddb/util/updater.rb:113:in `_reported_import'
/var/www/de.oddb.org/lib/oddb/util/updater.rb:106:in `reported_import'
/var/www/de.oddb.org/lib/oddb/util/updater.rb:58:in `import_gkv'
/usr/lib64/ruby/1.8/open-uri.rb:32:in `open_uri_original_open'
/usr/lib64/ruby/1.8/open-uri.rb:32:in `open'
/var/www/de.oddb.org/lib/oddb/import/gkv.rb:77:in `download_latest'
/var/www/de.oddb.org/lib/oddb/util/updater.rb:57:in `import_gkv'
/var/www/de.oddb.org/jobs/import_gkv:16
/var/www/de.oddb.org/lib/oddb/util/job.rb:16:in `call'
/var/www/de.oddb.org/lib/oddb/util/job.rb:16:in `run'
/var/www/de.oddb.org/jobs/import_gkv:15
Imported   191 Zubef-Entries on 14.01.2011:
Visited    191 existing Zubef-Entries
Visited     13 existing Companies
Visited      8 existing Substances
Created      0 new Zubef-Entries
Created      0 new Products
Created      0 new Sequences
Created    178 new Companies
Created    179 new Substances
Assigned     0 Chemical Equivalences
Assigned     0 Companies
Created      0 Incomplete Packages:
Created      0 Product(s) without a name (missing product name):

Test run locally

Run

  • de.oddb.org/bin/oddbd
  • jobs/import_gkv

Result

Fri Jan 14 08:32:12 2011: de.oddb.org ODDB::Import::Gkv#import
NoMethodError
private method `gsub' called for nil:NilClass
/home/masa/ywesee/de.oddb.org/lib/oddb/util/multilingual.rb:60:in `all'
/home/masa/ywesee/de.oddb.org/lib/oddb/util/multilingual.rb:60:in `collect'
/home/masa/ywesee/de.oddb.org/lib/oddb/util/multilingual.rb:60:in `all'
/home/masa/ywesee/de.oddb.org/lib/oddb/import/gkv.rb:175:in `import_active_agent'
/home/masa/ywesee/de.oddb.org/lib/oddb/util/multilingual.rb:60:in `find'
/home/masa/ywesee/de.oddb.org/lib/oddb/import/gkv.rb:174:in `each'
/home/masa/ywesee/de.oddb.org/lib/oddb/import/gkv.rb:174:in `find'
/home/masa/ywesee/de.oddb.org/lib/oddb/import/gkv.rb:174:in `import_active_agent'
/home/masa/ywesee/de.oddb.org/lib/oddb/import/gkv.rb:160:in `import_row'
/home/masa/ywesee/de.oddb.org/lib/oddb/import/gkv.rb:458:in `process_page'
/home/masa/ywesee/de.oddb.org/lib/oddb/import/gkv.rb:457:in `each'
/home/masa/ywesee/de.oddb.org/lib/oddb/import/gkv.rb:457:in `process_page'
/home/masa/ywesee/de.oddb.org/lib/oddb/import/gkv.rb:40:in `call'
/home/masa/ywesee/de.oddb.org/lib/oddb/import/gkv.rb:40:in `send_page'
/usr/lib64/ruby/site_ruby/1.8/rpdf2txt/parser.rb:42:in `extract_text'
/usr/lib64/ruby/site_ruby/1.8/rpdf2txt/object.rb:479:in `each'
/usr/lib64/ruby/site_ruby/1.8/rpdf2txt/object.rb:442:in `each'
/usr/lib64/ruby/site_ruby/1.8/rpdf2txt/object.rb:479:in `each'
/usr/lib64/ruby/site_ruby/1.8/rpdf2txt/object.rb:478:in `each'
/usr/lib64/ruby/site_ruby/1.8/rpdf2txt/object.rb:463:in `each'
/usr/lib64/ruby/site_ruby/1.8/rpdf2txt/parser.rb:40:in `extract_text'
/home/masa/ywesee/de.oddb.org/lib/oddb/import/gkv.rb:100:in `import'
/home/masa/ywesee/de.oddb.org/lib/oddb/util/updater.rb:106:in `reported_import'
/home/masa/ywesee/de.oddb.org/lib/oddb/util/updater.rb:113:in `call'
/home/masa/ywesee/de.oddb.org/lib/oddb/util/updater.rb:113:in `_reported_import'
/home/masa/ywesee/de.oddb.org/lib/oddb/util/updater.rb:106:in `reported_import'
/home/masa/ywesee/de.oddb.org/lib/oddb/util/updater.rb:58:in `import_gkv'
/usr/lib64/ruby/1.8/open-uri.rb:32:in `open_uri_original_open'
/usr/lib64/ruby/1.8/open-uri.rb:32:in `open'
/home/masa/ywesee/de.oddb.org/lib/oddb/import/gkv.rb:77:in `download_latest'
/home/masa/ywesee/de.oddb.org/lib/oddb/util/updater.rb:57:in `import_gkv'
jobs/import_gkv:16
/home/masa/ywesee/de.oddb.org/lib/oddb/util/job.rb:16:in `call'
/home/masa/ywesee/de.oddb.org/lib/oddb/util/job.rb:16:in `run'
jobs/import_gkv:15
Imported   191 Zubef-Entries on 14.01.2011:
Visited    191 existing Zubef-Entries
Visited     13 existing Companies
Visited      8 existing Substances
Created      0 new Zubef-Entries
Created      0 new Products
Created      0 new Sequences
Created    178 new Companies
Created    179 new Substances
Assigned     0 Chemical Equivalences
Assigned     0 Companies
Created      0 Incomplete Packages:
Created      0 Product(s) without a name (missing product name):

Note

  • No matter how many times the job runs, the same error occurs
  • The PDF files are downloaded

Check files

masa@masa ~/ywesee/de.oddb.org $ ls var/pdf/gkv/ -al
insgesamt 2528
drwxr-xr-x 2 masa masa      16 14. Jan 08:43 .
drwxr-xr-x 3 masa masa       8 14. Jan 08:32 ..
-rw-r--r-- 1 masa masa 1291122 14. Jan 08:42 2011.01.14-Zuzahlungsbefreit_sort_Name_110101_15372.pdf
-rw-r--r-- 1 masa masa 1291122 14. Jan 08:43 Zuzahlungsbefreit_sort_Name_110101_15372.pdf

Hypothesis

  • The problem is in the pdf file (probably)

Trace back the error message

lib/oddb/util/multilingual.rb:60

      def all
        terms = super.concat(@synonyms)
        terms.concat(terms.collect do |term| term.gsub(/[^\w]/, '') end)   #<= here
        terms.uniq
      end

Note

  • 'term' is nil
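
The failure can be reproduced outside the application. The sketch below is a reduced stand-in for the `all` method, with sample values chosen for illustration; dropping nil entries with `compact` is an assumption about what the importer would accept, not the fix chosen in this log.

```ruby
# Reduced stand-in for the `all` method above; sample values only.
terms = ["30,13", nil]

failure = nil
begin
  terms.collect { |term| term.gsub(/[^\w]/, '') }
rescue NoMethodError => e
  failure = e.class   # nil does not respond to gsub
end
p failure

# Assumption: dropping nil entries first is acceptable for the importer.
safe = terms.compact
safe.concat(safe.collect { |term| term.gsub(/[^\w]/, '') })
result = safe.uniq
p result
```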

Experiment lib/oddb/util/multilingual.rb:60

      def all
        terms = super.concat(@synonyms)
print "@synonyms="
p @synonyms
print "terms="
p terms
        terms.concat(terms.collect do |term| term.gsub(/[^\w]/, '') end)   #<= here
        terms.uniq
      end

Result

...
@synonyms=[nil]
terms=["30,13", nil]
@synonyms=[]
terms=[u"Eaxlhag"]
@synonyms=[]
terms=[u"Eaxlhag"]
@synonyms=[]
terms=[u"Xagaehl"]
@synonyms=[]
terms=[u"Xagaehl"]
@synonyms=[]
terms=["Amoxihexal"]
@synonyms=[]
terms=["Amoxihexal"]
@synonyms=[]
terms=["Amoxihexal"]
@synonyms=[]
terms=["Amoxicillin"]
@synonyms=[]
terms=["-mAWassoxicillin-3er"]
@synonyms=[nil]
terms=["30,13", nil]

Note

  • @synonyms includes nil

Question

  • When does nil get into @synonyms?

Check @synonyms

lib/oddb/util/multilingual.rb

      include Comparable
      attr_reader :canonical
      attr_reader :synonyms
 ...
    class Multilingual
      include M10lMethods
      def initialize(canonical={})
        super
        @synonyms = []
      end

Note

  • 'synonyms' is only an attr_reader, but it returns the Array itself, so its contents can still be changed from outside the class
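
A short sketch of that effect, using a simplified stand-in class (`Name` is hypothetical and keeps only the parts of Multilingual needed to show it):

```ruby
# Simplified, hypothetical stand-in for ODDB::Util::Multilingual.
class Name
  attr_reader :synonyms
  def initialize
    @synonyms = []
  end
end

name = Name.new
# No writer method is defined, so the attribute cannot be reassigned ...
assignable = name.respond_to?(:synonyms=)
# ... but the reader hands out the Array itself, which can be mutated.
name.synonyms << nil
p assignable
p name.synonyms
```

So a nil can reach @synonyms either through the class's own methods or through any caller that holds the returned Array.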

Check accessor methods for @synonyms

lib/oddb/util/multilingual.rb

      def add_synonym(synonym)
        @synonyms.push(synonym).uniq! && synonym
      end
...
      def merge(other)
        @synonyms.concat(other.all).uniq!
      end

grep search

masa@masa ~/ywesee/de.oddb.org $ grep -r add_synonym lib
lib/oddb/import/dimdi.rb:        ass.name.add_synonym('ASS')
lib/oddb/import/gkv.rb:        substance.name.add_synonym(name) && save(substance)
lib/oddb/import/gkv.rb:        product.name.add_synonym(name) && save(product)
lib/oddb/import/pharmnet.rb:      company.name.add_synonym name
lib/oddb/import/pharmnet.rb:        galform.description.add_synonym description
lib/oddb/import/pharmnet.rb:        unit.name.add_synonym name
lib/oddb/util/multilingual.rb:      def add_synonym(synonym)

masa@masa ~/ywesee/de.oddb.org $ grep -r merge lib
lib/oddb/drugs/substance.rb:      def merge(other)
lib/oddb/drugs/substance.rb:        name.merge other.name
lib/oddb/html/util/lookandfeel.rb:      :chapter_emergency          => 'Verhalten im Notfall', 
lib/oddb/import/gkv.rb:    opts = {:date => Date.today}.merge(opts)
lib/oddb/import/pharmnet.rb:               'emergency'
lib/oddb/import/pharmnet.rb:The two Registrations should probably be merged manually.
lib/oddb/import/pharmnet.rb:             :retry_unit => 60 }.merge opts
lib/oddb/import/pharmnet.rb:    opts = { :skip_totals => true }.merge opts
lib/oddb/import/pharmnet.rb:          detail = data.merge extract_details(dpg)
lib/oddb/util/multilingual.rb:      def merge(other)

Check lib/oddb/import/gkv.rb

Experiment

lib/oddb/import/gkv.rb#import_active_agent

  def import_active_agent(sequence, row, offset)
    name = row.at(offset)
    sane = sanitize_substance_name(name)
    composition = sequence.compositions.first
    dose = Drugs::Dose.new(row.at(offset + 1),
                           row.at(offset + 2))
    ## check for slightly different names
    if(composition \
      && (agent = composition.active_agents.find { |act|
        act.substance.name.all.any? { |sub|
          sane == sanitize_substance_name(sub)
        }
      }))
      agent.dose = dose
      save agent
      substance = agent.substance
      if(substance.name != name)
print "1. name="
p name
        substance.name.add_synonym(name) && save(substance)
      end

lib/oddb/import/gkv.rb#import_product

  def import_product(package, row)
    name = product_name(row)
    search = name.dup
    product = nil
    candidates = []
    until(product || search.empty? || candidates.size > 1)
      candidates = Drugs::Product.search_by_name(search)
      if(candidates.size == 1)
        product = candidates.first
print "2. name="
p name
        product.name.add_synonym(name) && save(product)
      end
      search.sub!(/(\s|^)\S*$/, '')
    end
    if(product.nil?)
      @created_products += 1
      product = Drugs::Product.new
      product.name.de = name
      save product
    end
    product
  end

Result

1. name=nil
1. name=nil
1. name=nil
1. name=nil
1. name=nil

Note

  • The nil value is added in the import_active_agent method

Experiment

lib/oddb/import/gkv.rb#import_active_agent

  def import_active_agent(sequence, row, offset)
    name = row.at(offset)
print "name="
p name
    sane = sanitize_substance_name(name)
print "sane="
p sane
    composition = sequence.compositions.first
    dose = Drugs::Dose.new(row.at(offset + 1),
                           row.at(offset + 2))
    ## check for slightly different names
    if(composition \
      && (agent = composition.active_agents.find { |act|
        act.substance.name.all.any? { |sub|
          sane == sanitize_substance_name(sub)
        }
      }))
      agent.dose = dose
      save agent
      substance = agent.substance
print "substance.name="
p substance.name
      if(substance.name != name)
print "1. name="
p name
        substance.name.add_synonym(name) && save(substance)
      end

Result

name="teinAscytylce"
sane="teinascytylce"
substance.name=#<ODDB::Util::Multilingual:0x7f0d5de20740 @synonyms=[], @canonical={:de=>"teinAscytylce"}>
name=nil
sane=""
name=nil
sane=""
name="cetsyineyAtcl"
sane="cetsyineyatcl"
substance.name=#<ODDB::Util::Multilingual:0x7f0d5de0de10 @synonyms=[], @canonical={:de=>"cetsyineyAtcl"}>
name=nil
sane=""
name=nil
sane=""
name="tsyceintyAcel"
sane="tsyceintyacel"
substance.name=#<ODDB::Util::Multilingual:0x7f0d5dbbe628 @synonyms=[], @canonical={:de=>"tsyceintyAcel"}>
...
name=nil
sane=""
substance.name=#<ODDB::Util::Multilingual:0x7f0d5d99a658 @synonyms=[], @canonical={:de=>"39,13"}>
1. name=nil
...
name=nil
sane=""
substance.name=#<ODDB::Util::Multilingual:0x7f0d6368f778 @synonyms=[], @canonical={:de=>"1353,"}>
1. name=nil
...

Note

  • In most cases two real names are compared
  • But sometimes 'name' is nil, which is sanitized to "" (empty string)
  • nil is passed to 'add_synonym' because 'name' itself is nil, not ""
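
The chain of events can be sketched in isolation. Here `sanitize` is a hypothetical stand-in for sanitize_substance_name, inferred only from the debug output (nil becomes "", letters are downcased, everything else is dropped); the real method may differ.

```ruby
# Hypothetical stand-in for sanitize_substance_name, inferred from the
# debug output only (nil -> "", "teinAscytylce" -> "teinascytylce").
def sanitize(name)
  name.to_s.downcase.gsub(/[^a-z]/, '')
end

canonical = "39,13"   # a price-like value stored as the substance name
name = nil            # what row.at(offset) returned

# Both sides sanitize to "", so the "slightly different names" check matches.
matched = sanitize(name) == sanitize(canonical)

# Unguarded: nil differs from the canonical name, so it is pushed.
synonyms = []
synonyms.push(name) if matched && canonical != name
p synonyms

# Guarded variant: skip rows without a name.
guarded = []
guarded.push(name) if matched && name && canonical != name
p guarded
```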

Experiment

lib/oddb/import/gkv.rb#import_active_agent

      #if(substance.name != name)
      if(name && substance.name != name)

Result

Fri Jan 14 10:14:08 2011: de.oddb.org ODDB::Import::Gkv#import
Imported  6314 Zubef-Entries on 14.01.2011:
Visited   6244 existing Zubef-Entries
Visited   1533 existing Companies
Visited    971 existing Substances
Created     70 new Zubef-Entries
Created     37 new Products
Created     41 new Sequences
Created   4781 new Companies
Created   4914 new Substances
Assigned     0 Chemical Equivalences
Assigned     8 Companies
Created     29 Incomplete Packages:
http://de.oddb.org/de/drugs/package/pzn/6944624
http://de.oddb.org/de/drugs/package/pzn/7580489
http://de.oddb.org/de/drugs/package/pzn/7580503
http://de.oddb.org/de/drugs/package/pzn/7503974
http://de.oddb.org/de/drugs/package/pzn/7558188
http://de.oddb.org/de/drugs/package/pzn/7782000
http://de.oddb.org/de/drugs/package/pzn/7781992
http://de.oddb.org/de/drugs/package/pzn/7781986
http://de.oddb.org/de/drugs/package/pzn/7745855
http://de.oddb.org/de/drugs/package/pzn/7745849
http://de.oddb.org/de/drugs/package/pzn/7745832
http://de.oddb.org/de/drugs/package/pzn/7745826
http://de.oddb.org/de/drugs/package/pzn/7745803
http://de.oddb.org/de/drugs/package/pzn/7745795
http://de.oddb.org/de/drugs/package/pzn/7745772
http://de.oddb.org/de/drugs/package/pzn/7745766
http://de.oddb.org/de/drugs/package/pzn/7745921
http://de.oddb.org/de/drugs/package/pzn/7745915
http://de.oddb.org/de/drugs/package/pzn/7745909
http://de.oddb.org/de/drugs/package/pzn/7745890
http://de.oddb.org/de/drugs/package/pzn/7745884
http://de.oddb.org/de/drugs/package/pzn/7745878
http://de.oddb.org/de/drugs/package/pzn/7745861
http://de.oddb.org/de/drugs/package/pzn/7533283
http://de.oddb.org/de/drugs/package/pzn/6938919
http://de.oddb.org/de/drugs/package/pzn/6938902
http://de.oddb.org/de/drugs/package/pzn/6834829
http://de.oddb.org/de/drugs/package/pzn/6834812
http://de.oddb.org/de/drugs/package/pzn/7533308
Created      1 Product(s) without a name (missing product name):
http://de.oddb.org/de/drugs/product/uid/3480899

Note

  • Looks good

Log

WARNING:  nonstandard use of \' in a string literal
LINE 3:         AND search_term = 'pfizer ltd. \''
                                  ^
HINT:  Use '' to write quotes in strings, or use the escape string syntax (E'...').
....

Note

  • The log contains many of these warnings
  • As the HINT says, they come from the backslash-escaped quote (\') in the SQL string literal, which PostgreSQL treats as nonstandard
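
What the HINT suggests can be sketched as follows. `sql_literal` is a hypothetical helper for illustration, not a function of de.oddb.org or of the database driver; where the driver supports bound parameters, those are usually preferable to building literals by hand.

```ruby
# Hypothetical helper: quote a value as a standard SQL string literal by
# doubling single quotes instead of escaping them with a backslash.
def sql_literal(value)
  "'" + value.gsub("'", "''") + "'"
end

literal = sql_literal("pfizer ltd. '")
puts literal
```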
...
name="ispeRidonr"
sane="isperidonr"
substance.name=#<ODDB::Util::Multilingual:0x7fa09152abc8 @synonyms=[], @canonical={:de=>"isperidonR"}>
1. name="ispeRidonr"
...
name=",8717"
sane=""
substance.name=#<ODDB::Util::Multilingual:0x7fa09145ee88 @synonyms=[], @canonical={:de=>"25,72"}>
1. name=",8717"
...
name="26,12"
sane=""
substance.name=#<ODDB::Util::Multilingual:0x7fa0920f9f08 @synonyms=[], @canonical={:de=>"14,76"}>
1. name="26,12"
...

Note

  • It is no problem that the 'name' is 'ispeRidonr', but
  • ',8717' or '26,12' does not look like a valid synonym name
  • The importer may be reading the wrong column

Commit

Check 'name' value of import_active_agent method

Experiment

lib/oddb/import/gkv.rb#import_active_agent

  def import_active_agent(sequence, row, offset)
    name = row.at(offset)
    sane = sanitize_substance_name(name)
    composition = sequence.compositions.first
    dose = Drugs::Dose.new(row.at(offset + 1),
                           row.at(offset + 2))
    ## check for slightly different names
    if(composition \
      && (agent = composition.active_agents.find { |act|
        act.substance.name.all.any? { |sub|
          sane == sanitize_substance_name(sub)
        }
      }))
      agent.dose = dose
      save agent
      substance = agent.substance
      #if(name && substance.name != name)
      if(substance.name != name)
print "row="
p row
print "offset=", offset, "\n"
print "name="
p name
print "substance.name=", substance.name, "\n"
        substance.name.add_synonym(name) && save(substance)
      end

Result

row=["A/CGM25ARIOMLOHRIVICLSP", "2087531", "HOSPIRABHMG. LCHSTUDE", "iolcicrnatriumAv", ",4274", "mg", "105X", "l", "m", "fnIsons-iu", "39,13"]
offset=13
name=nil
substance.name=39,13
row=["AML5/GM500Z ONDSAI XOM", "0774121", "NDOSAZ", "aillin-3-WcsserxmAio", "95,573", "mg", "100", "l", "m", "Perulv", "1353,"]
offset=13
name=nil
substance.name=1353,
row=["AML5/GM250Z ONDSAI XOM", "0771252", "SANDZ O", "aillin-3-WcsserxomAi", "98,286", "mg", "100", "l", "m", "Perulv", "1142,"]
offset=13
name=nil
substance.name=1142,
row=["SAMOXI 250TMA1APHAR", "0658834", "h1APrabHmGam", "W-Aassermox-illinci3", ",28698", "mg", "100", "l", "m", "ftcorTkenas", "1138,"]
offset=13
name=nil
substance.name=1138,
row=["XEALHIXOMA", "4568855", "EAXLHAG", "-mAWassoxicillin-3er", "286,98", "mg", "1002X", "l", "m", "ftrTaencoks", "30,13"]
offset=13
name=nil
substance.name=30,13

Note

  • Each row has only 11 elements but offset is 13, so row.at(offset) returns nil
  • I do not know whether this is fine or not
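
The nil can be checked directly with the first row from the output above. This is a sketch; whether an out-of-range offset should be skipped or reported is left open here.

```ruby
# First row from the debug output above.
row = ["A/CGM25ARIOMLOHRIVICLSP", "2087531", "HOSPIRABHMG. LCHSTUDE",
       "iolcicrnatriumAv", ",4274", "mg", "105X", "l", "m",
       "fnIsons-iu", "39,13"]
offset = 13

size = row.size            # number of columns in this row
value = row.at(offset)     # Array#at returns nil for an index past the end
in_range = offset < size   # a possible guard before reading the name

p size
p value
p in_range
```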

Run import_gkv online

$ screen
$ cd /var/www/de.oddb.org
$ git pull
$ su
# svc -x /service/de.oddb
# svstat /service/de.oddb
# rm var/pdf/gkv/2011.01.14-Zuzahlungsbefreit_sort_Name_110101_15372.pdf  
# rm var/pdf/gkv/Zuzahlungsbefreit_sort_Name_110101_15372.pdf
# exit
$ sudo -u apache jobs/import_gkv
$ ctrl+a, ctrl+d (detach)

Check Rpdf2txt by Ruby1.9

masa@masa ~/ywesee/rpdf2txt $ rpdf2txt test/data/test.pdf 

untitled text                                                                        Page 1 of 1
Printed: Donnerstag, 14. November 2002 14:04:29 Uhr
testpdf

masa@masa ~/ywesee/rpdf2txt $ ruby1.9 -I lib bin/rpdf2txt test/data/test.pdf 
/home/masa/ywesee/rpdf2txt/lib/rpdf2txt-rockit/grammar.rb:1:in `require': /home/masa/ywesee/rpdf2txt/lib/rpdf2txt-rockit/token.rb:138: invalid multibyte char (US-ASCII) (SyntaxError)
/home/masa/ywesee/rpdf2txt/lib/rpdf2txt-rockit/token.rb:138: syntax error, unexpected '~', expecting ')'
    super("EOF", "&#65533;~~&#65533;&#65533;~^^~" + rand(1e10).inspect)
                    ^
/home/masa/ywesee/rpdf2txt/lib/rpdf2txt-rockit/token.rb:138: invalid multibyte char (US-ASCII)
        from /home/masa/ywesee/rpdf2txt/lib/rpdf2txt-rockit/grammar.rb:1:in `<top (required)>'
        from /home/masa/ywesee/rpdf2txt/lib/rpdf2txt-rockit/lalr_parsetable_generator.rb:1:in `require'
        from /home/masa/ywesee/rpdf2txt/lib/rpdf2txt-rockit/lalr_parsetable_generator.rb:1:in `<top (required)>'
        from /home/masa/ywesee/rpdf2txt/lib/rpdf2txt-rockit/rockit.rb:2:in `require'
        from /home/masa/ywesee/rpdf2txt/lib/rpdf2txt-rockit/rockit.rb:2:in `<top (required)>'
        from /home/masa/ywesee/rpdf2txt/lib/rpdf2txt/textparser.rb:25:in `require'
        from /home/masa/ywesee/rpdf2txt/lib/rpdf2txt/textparser.rb:25:in `<top (required)>'
        from /home/masa/ywesee/rpdf2txt/lib/rpdf2txt/text.rb:26:in `require'
        from /home/masa/ywesee/rpdf2txt/lib/rpdf2txt/text.rb:26:in `<top (required)>'
        from /home/masa/ywesee/rpdf2txt/lib/rpdf2txt/object.rb:26:in `require'
        from /home/masa/ywesee/rpdf2txt/lib/rpdf2txt/object.rb:26:in `<top (required)>'
        from /home/masa/ywesee/rpdf2txt/lib/rpdf2txt/parser.rb:26:in `require'
        from /home/masa/ywesee/rpdf2txt/lib/rpdf2txt/parser.rb:26:in `<top (required)>'
        from bin/rpdf2txt:25:in `require'
        from bin/rpdf2txt:25:in `<main>'

Note

  • Ruby 1.9 cannot parse lib/rpdf2txt-rockit/token.rb:138 ('invalid multibyte char (US-ASCII)')
  • So Rpdf2txt does not run on Ruby 1.9 at the moment
  • It is probably due to the character encoding of the source file

Set up de.oddb.org on Ubuntu

Check gem list

 masa@masa-VirtualBox:~$ gem list

 *** LOCAL GEMS ***

 de.oddb (2.0.0)
 rclconf (1.0.0)

Copy git master file

 masa@masa-VirtualBox:~$ sudo mkdir /var/lib/gems/1.8/gems/de.oddb-2.0.0/.git/
 masa@masa-VirtualBox:~$ sudo mkdir /var/lib/gems/1.8/gems/de.oddb-2.0.0/.git/refs/
 masa@masa-VirtualBox:~$ sudo mkdir /var/lib/gems/1.8/gems/de.oddb-2.0.0/.git/refs/heads/
 masa@masa-VirtualBox:~$ sudo vim /var/lib/gems/1.8/gems/de.oddb-2.0.0/.git/refs/heads/master
 4c244cee62c31edf569c6c4dd4b0af25db35fff6

Install libraries

 masa@masa-VirtualBox:~$ sudo gem install facets -v=1.8.54
 masa@masa-VirtualBox:~$ sudo gem install odba

 # these two are needed to build pg
 masa@masa-VirtualBox:~$ sudo apt-get install ruby1.8-dev
 masa@masa-VirtualBox:~$ sudo apt-get install libpq-dev

 masa@masa-VirtualBox:~$ sudo gem install pg -v=0.9.0

 masa@masa-VirtualBox:~$ sudo gem install dbd-pg
 Successfully installed deprecated-2.0.1
 Successfully installed dbi-0.4.5
 Successfully installed dbd-pg-0.3.9

load_driver error

 masa@masa-VirtualBox:~$ oddbd
 /var/lib/gems/1.8/gems/de.oddb-2.0.0/lib/oddb.rb:4: warning: already initialized constant VERSION
 /var/lib/gems/1.8/gems/dbi-0.4.5/lib/dbi.rb:300:in `load_driver': Unable to load driver 'pg' (underlying error: wrong constant name pg) (DBI::InterfaceError)
	from /home/masa/bin/lib/ruby/1.8/monitor.rb:242:in `synchronize'
	from /var/lib/gems/1.8/gems/dbi-0.4.5/lib/dbi.rb:242:in `load_driver'
	from /var/lib/gems/1.8/gems/dbi-0.4.5/lib/dbi.rb:160:in `_get_full_driver'
	from /var/lib/gems/1.8/gems/dbi-0.4.5/lib/dbi.rb:145:in `connect'
	from /var/lib/gems/1.8/gems/odba-1.0.0/lib/odba/connection_pool.rb:60:in `_connect'
	from /var/lib/gems/1.8/gems/odba-1.0.0/lib/odba/connection_pool.rb:59:in `times'
	from /var/lib/gems/1.8/gems/odba-1.0.0/lib/odba/connection_pool.rb:59:in `_connect'
	from /var/lib/gems/1.8/gems/odba-1.0.0/lib/odba/connection_pool.rb:56:in `connect'
	from /var/lib/gems/1.8/gems/odba-1.0.0/lib/odba/connection_pool.rb:56:in `synchronize'
	from /var/lib/gems/1.8/gems/odba-1.0.0/lib/odba/connection_pool.rb:56:in `connect'
	from /var/lib/gems/1.8/gems/odba-1.0.0/lib/odba/connection_pool.rb:19:in `initialize'
	from /var/lib/gems/1.8/gems/de.oddb-2.0.0/lib/oddb/persistence/odba.rb:29:in `new'
	from /var/lib/gems/1.8/gems/de.oddb-2.0.0/lib/oddb/persistence/odba.rb:29
	from /usr/lib/ruby/1.8/rubygems/custom_require.rb:31:in `gem_original_require'
	from /usr/lib/ruby/1.8/rubygems/custom_require.rb:31:in `require'
	from /var/lib/gems/1.8/gems/de.oddb-2.0.0/lib/oddb/persistence.rb:4
	from /usr/lib/ruby/1.8/rubygems/custom_require.rb:31:in `gem_original_require'
	from /usr/lib/ruby/1.8/rubygems/custom_require.rb:31:in `require'
	from /var/lib/gems/1.8/gems/de.oddb-2.0.0/bin/oddbd:10
	from /var/lib/gems/1.8/bin/oddbd:19:in `load'
	from /var/lib/gems/1.8/bin/oddbd:19

See also a similar error on Windows

 http://dev.ywesee.com/wiki.php/Gem/DeoddbWindows#pg_error

suspend

Update rpdf2txt for Ruby1.9

Memo

  • All the regular expressions of rpdf2txt have the '/u' option
  • This option indicates that the regular expression is interpreted as UTF-8
  • So the source files should be written in UTF-8, but declaring that encoding does not fix the error

for example

lib/rpdf2txt-rockit/token.rb

 # encoding: utf-8
 require 'rpdf2txt-rockit/syntax_tree'
 require 'rpdf2txt-rockit/sourcecode_dumpable'
 require 'rpdf2txt-rockit/bounded_lru_cache'

Result

masa@masa ~/ywesee/rpdf2txt $ ruby1.9 -I lib bin/rpdf2txt test/data/test.pdf 
/home/masa/ywesee/rpdf2txt/lib/rpdf2txt-rockit/grammar.rb:1:in `require': /home/masa/ywesee/rpdf2txt/lib/rpdf2txt-rockit/token.rb:139: invalid multibyte char (UTF-8) (SyntaxError)
/home/masa/ywesee/rpdf2txt/lib/rpdf2txt-rockit/token.rb:139: syntax error, unexpected '~', expecting ')'
    super("EOF", "&#65533;~~&#65533;&#65533;~^^~" + rand(1e10).inspect)
                    ^
/home/masa/ywesee/rpdf2txt/lib/rpdf2txt-rockit/token.rb:139: invalid multibyte char (UTF-8)
        from /home/masa/ywesee/rpdf2txt/lib/rpdf2txt-rockit/grammar.rb:1:in `<top (required)>'
        from /home/masa/ywesee/rpdf2txt/lib/rpdf2txt-rockit/lalr_parsetable_generator.rb:1:in `require'
        from /home/masa/ywesee/rpdf2txt/lib/rpdf2txt-rockit/lalr_parsetable_generator.rb:1:in `<top (required)>'
        from /home/masa/ywesee/rpdf2txt/lib/rpdf2txt-rockit/rockit.rb:2:in `require'
        from /home/masa/ywesee/rpdf2txt/lib/rpdf2txt-rockit/rockit.rb:2:in `<top (required)>'
        from /home/masa/ywesee/rpdf2txt/lib/rpdf2txt/textparser.rb:25:in `require'
        from /home/masa/ywesee/rpdf2txt/lib/rpdf2txt/textparser.rb:25:in `<top (required)>'
        from /home/masa/ywesee/rpdf2txt/lib/rpdf2txt/text.rb:26:in `require'
        from /home/masa/ywesee/rpdf2txt/lib/rpdf2txt/text.rb:26:in `<top (required)>'
        from /home/masa/ywesee/rpdf2txt/lib/rpdf2txt/object.rb:26:in `require'
        from /home/masa/ywesee/rpdf2txt/lib/rpdf2txt/object.rb:26:in `<top (required)>'
        from /home/masa/ywesee/rpdf2txt/lib/rpdf2txt/parser.rb:26:in `require'
        from /home/masa/ywesee/rpdf2txt/lib/rpdf2txt/parser.rb:26:in `<top (required)>'
        from bin/rpdf2txt:25:in `require'
        from bin/rpdf2txt:25:in `<main>'

Note

  • The character at token.rb:139 is not valid UTF-8

It is difficult to find out the actual character encoding of the files
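
One way to narrow this down is to list the lines whose bytes are not valid UTF-8. The sketch below assumes Ruby 1.9 or later (String#force_encoding and String#valid_encoding?); `invalid_utf8_lines` is a hypothetical helper and the sample data is made up, not taken from token.rb.

```ruby
# Return the 1-based numbers of lines whose bytes are not valid UTF-8.
def invalid_utf8_lines(bytes)
  numbers = []
  bytes.each_line.with_index(1) do |line, number|
    copy = line.dup.force_encoding(Encoding::UTF_8)
    numbers << number unless copy.valid_encoding?
  end
  numbers
end

# Made-up sample: line 2 contains the single byte 0xFC, which is not a
# valid UTF-8 sequence on its own.
sample = "super(\"EOF\")\nx = \"\xFC~~\"\n".dup.force_encoding(Encoding::BINARY)
lines = invalid_utf8_lines(sample)
p lines
```

For a real file the bytes could be read with File.binread(path) and passed to the helper. This only shows where the file is not UTF-8; it does not tell which encoding the file was originally written in.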

Page last modified on June 14, 2013, at 01:32 AM