
20111018-migrate_to_utf8-ruby193-oddb_org



  1. Debug Hash iteration error
  2. Debug encoding error regular expression
  3. Find ODBA::Stub purpose (usage) (suspended)

Goal/Estimate/Evaluation
  • migrate_to_utf8 bin/admin oddb.org / 70% / 60%
Milestones
  1. migrate oddb.org to Ruby 1.9.3
    1. debug Hash iteration bug in ODBA
    2. debug encoding error regular expression
    3. Check Drug search function
    4. debug ODBA::Stub error

Change log


Debug Hash iteration error

Experiment

  • src/util/oddbapp.rb
    def migrate_to_utf8
      ODBA.cache.retire_age = 5
      ODBA.cache.cleaner_step = 100000

      child = @system.instance_variable_get('@doctors')
      p child.class
      exit

Run

 ruby193 -I ../oddb/lib bin/oddbd
  • bin/admin
ch.oddb> migrate_to_utf8

Result

failsafe rescued RuntimeError < StandardError
can't add a new key into hash during iteration
/home/masa/ywesee/oddb.org.ruby193/src/util/oddbapp.rb:1474:in `instance_eval'
/home/masa/bin/ruby193rc1/lib/ruby/gems/1.9.1/gems/odba-1.0.0/lib/odba/cache.rb:325:in `block (3 levels) in fetch_or_restore'
/home/masa/bin/ruby193rc1/lib/ruby/gems/1.9.1/gems/odba-1.0.0/lib/odba/cache.rb:313:in `call'
/home/masa/bin/ruby193rc1/lib/ruby/gems/1.9.1/gems/odba-1.0.0/lib/odba/cache.rb:313:in `fetch_or_do'
/home/masa/bin/ruby193rc1/lib/ruby/gems/1.9.1/gems/odba-1.0.0/lib/odba/cache.rb:320:in `block (2 levels) in fetch_or_restore'
<internal:prelude>:10:in `synchronize'
/home/masa/bin/ruby193rc1/lib/ruby/gems/1.9.1/gems/odba-1.0.0/lib/odba/cache.rb:319:in `block in fetch_or_restore'
/home/masa/bin/ruby193rc1/lib/ruby/gems/1.9.1/gems/odba-1.0.0/lib/odba/cache.rb:313:in `call'
/home/masa/bin/ruby193rc1/lib/ruby/gems/1.9.1/gems/odba-1.0.0/lib/odba/cache.rb:313:in `fetch_or_do'
/home/masa/bin/ruby193rc1/lib/ruby/gems/1.9.1/gems/odba-1.0.0/lib/odba/cache.rb:317:in `fetch_or_restore'
/home/masa/bin/ruby193rc1/lib/ruby/gems/1.9.1/gems/odba-1.0.0/lib/odba/cache.rb:64:in `block in bulk_restore'
/home/masa/bin/ruby193rc1/lib/ruby/gems/1.9.1/gems/odba-1.0.0/lib/odba/cache.rb:61:in `each'
/home/masa/bin/ruby193rc1/lib/ruby/gems/1.9.1/gems/odba-1.0.0/lib/odba/cache.rb:61:in `bulk_restore'
/home/masa/bin/ruby193rc1/lib/ruby/gems/1.9.1/gems/odba-1.0.0/lib/odba/cache.rb:55:in `bulk_fetch'
/home/masa/bin/ruby193rc1/lib/ruby/gems/1.9.1/gems/odba-1.0.0/lib/odba/cache.rb:260:in `fetch_collection'
/home/masa/bin/ruby193rc1/lib/ruby/gems/1.9.1/gems/odba-1.0.0/lib/odba/cache.rb:597:in `restore'
/home/masa/bin/ruby193rc1/lib/ruby/gems/1.9.1/gems/odba-1.0.0/lib/odba/cache.rb:318:in `block in fetch_or_restore'
/home/masa/bin/ruby193rc1/lib/ruby/gems/1.9.1/gems/odba-1.0.0/lib/odba/cache.rb:313:in `call'
/home/masa/bin/ruby193rc1/lib/ruby/gems/1.9.1/gems/odba-1.0.0/lib/odba/cache.rb:313:in `fetch_or_do'
/home/masa/bin/ruby193rc1/lib/ruby/gems/1.9.1/gems/odba-1.0.0/lib/odba/cache.rb:317:in `fetch_or_restore'
/home/masa/bin/ruby193rc1/lib/ruby/gems/1.9.1/gems/odba-1.0.0/lib/odba/cache.rb:605:in `restore_object'
/home/masa/bin/ruby193rc1/lib/ruby/gems/1.9.1/gems/odba-1.0.0/lib/odba/cache.rb:570:in `load_object'
/home/masa/bin/ruby193rc1/lib/ruby/gems/1.9.1/gems/odba-1.0.0/lib/odba/cache.rb:226:in `block in fetch'
/home/masa/bin/ruby193rc1/lib/ruby/gems/1.9.1/gems/odba-1.0.0/lib/odba/cache.rb:313:in `call'
/home/masa/bin/ruby193rc1/lib/ruby/gems/1.9.1/gems/odba-1.0.0/lib/odba/cache.rb:313:in `fetch_or_do'
/home/masa/bin/ruby193rc1/lib/ruby/gems/1.9.1/gems/odba-1.0.0/lib/odba/cache.rb:225:in `fetch'
/home/masa/bin/ruby193rc1/lib/ruby/gems/1.9.1/gems/odba-1.0.0/lib/odba/stub.rb:49:in `odba_receiver'
/home/masa/bin/ruby193rc1/lib/ruby/gems/1.9.1/gems/odba-1.0.0/lib/odba/stub.rb:17:in `class'
/home/masa/ywesee/oddb.org.ruby193/src/util/oddbapp.rb:1788:in `migrate_to_utf8'
(eval):1:in `block (2 levels) in _admin'
/home/masa/ywesee/oddb.org.ruby193/src/util/oddbapp.rb:1474:in `instance_eval'
/home/masa/ywesee/oddb.org.ruby193/src/util/oddbapp.rb:1474:in `block (2 levels) in _admin'
/home/masa/ywesee/oddb.org.ruby193/src/util/failsafe.rb:9:in `call'
/home/masa/ywesee/oddb.org.ruby193/src/util/failsafe.rb:9:in `failsafe'
/home/masa/ywesee/oddb.org.ruby193/src/util/oddbapp.rb:1473:in `block in _admin'
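
The RuntimeError at the top of this trace is Ruby's own guard against inserting a key while a Hash is being iterated. It is raised by the interpreter, not by ODBA. A minimal sketch that reproduces the message with a plain Hash (the name `cache` is made up for illustration and this is not the ODBA code path):

```ruby
# Plain-Ruby reproduction; `cache` is a hypothetical stand-in, not ODBA's cache.
cache = { 1 => 'a' }

message = begin
  # Adding a key while #each is running raises RuntimeError.
  cache.each { |key, _value| cache[key + 1] = 'b' }
  nil
rescue RuntimeError => e
  e.message
end

# Iterating over a snapshot of the keys avoids the error.
cache.keys.each { |key| cache[key + 100] = 'c' }
```

Whether ODBA should iterate over a snapshot in the same way is an open question here; the sketch only shows where the message comes from.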

Experiment

  • src/util/oddbapp.rb
    def migrate_to_utf8
      ODBA.cache.retire_age = 5
      child = @system.instance_variable_get('@doctors')
      p child.class
      exit
      ODBA.cache.cleaner_step = 100000

Result

Hash

Note

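  • ODBA.cache.cleaner_step seems to be involved: when it is set to 100000 before @doctors is read, the Hash iteration error occurs; when the assignment comes after the exit (and so never runs), child.class returns Hash
  • Its default value is 500 (see odba/lib/cache.rb)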

Experiment

  • src/util/oddbapp.rb
    def migrate_to_utf8
      ...
      #ODBA.cache.cleaner_step = 100000


    def _migrate_child_to_utf8 child, queue, table, iconv, opts={}
      #child = iconv.iconv(child)
      child.force_encoding('utf-8')
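
Note on the change above: force_encoding only relabels the bytes, whereas iconv (or String#encode) converts them. A minimal sketch of the difference, assuming for illustration that some stored strings are ISO-8859-1 (these notes do not say which encoding the old data actually uses):

```ruby
# Assumption: the byte 0xFC is "ü" in ISO-8859-1.
latin1 = "Z\xFCrich".dup.force_encoding('ISO-8859-1')

# Relabelling keeps the bytes, so 0xFC becomes an invalid UTF-8 sequence.
relabelled = latin1.dup.force_encoding('UTF-8')
relabelled_ok = relabelled.valid_encoding?

# Converting rewrites the bytes into real UTF-8.
converted = latin1.encode('UTF-8')
converted_ok = converted.valid_encoding?
```

If the stored bytes are already UTF-8 and only carry the wrong label, force_encoding is enough. Checking valid_encoding? after the relabel is one way to tell the two cases apart.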

Run

 ruby193 -I ../oddb/lib bin/oddbd
  • bin/admin
ch.oddb> migrate_to_utf8

Note

  • The Hash error no longer occurs
  • There are many encoding errors
Encoding::CompatibilityError: incompatible encoding regexp match (UTF-8 regexp with ASCII-8BIT string) when updating index 'fachinfo_index_de' with a ODDB::Fachinfo
["/home/masa/ywesee/oddb.org.ruby193/src/util/searchterms.rb:79:in `gsub'", "/home/masa/ywesee/oddb.org.ruby193/src/util/searchterms.rb:79:in `search_term'", "/home/masa/ywesee/oddb.org.ruby193/src/model/fachinfo.rb:69:in `search_text'", "(eval):3:in `block in proc_resolve_search_term'"]
[...]

Encoding::CompatibilityError: incompatible encoding regexp match (UTF-8 regexp with ASCII-8BIT string) when updating index 'fachinfo_index_fr' with a ODDB::Fachinfo
["/home/masa/ywesee/oddb.org.ruby193/src/util/searchterms.rb:79:in `gsub'", "/home/masa/ywesee/oddb.org.ruby193/src/util/searchterms.rb:79:in `search_term'", "/home/masa/ywesee/oddb.org.ruby193/src/model/fachinfo.rb:69:in `search_text'", "(eval):3:in `block in proc_resolve_search_term'"]
[...]
  • Most of the encoding errors are related to regular expressions
  • There are also many ODBA::Stub errors
ODBA::Stub was unable to replace ODDB::SimpleLanguage::Descriptions#27473952 from ODDB::AtcClass:#13744
ODBA::Stub was unable to replace Hash#27501234 from ODDB::CommercialForm:#1067691
ODBA::Stub was unable to replace ODDB::SimpleLanguage::Descriptions#27501236 from ODDB::CommercialForm:#1067691
ODBA::Stub was unable to replace Array#27501235 from ODDB::CommercialForm:#1067691
saved: 295664
failsafe rescued NoMethodError < StandardError
undefined method `odba_store' for nil:NilClass
/home/masa/ywesee/oddb.org.ruby193/src/util/oddbapp.rb:1474:in `instance_eval'
/home/masa/ywesee/oddb.org.ruby193/src/util/oddbapp.rb:1821:in `_migrate_to_utf8'
/home/masa/ywesee/oddb.org.ruby193/src/util/oddbapp.rb:1802:in `migrate_to_utf8'
(eval):1:in `block (2 levels) in _admin'
/home/masa/ywesee/oddb.org.ruby193/src/util/oddbapp.rb:1474:in `instance_eval'
/home/masa/ywesee/oddb.org.ruby193/src/util/oddbapp.rb:1474:in `block (2 levels) in _admin'
/home/masa/ywesee/oddb.org.ruby193/src/util/failsafe.rb:9:in `call'
/home/masa/ywesee/oddb.org.ruby193/src/util/failsafe.rb:9:in `failsafe'
/home/masa/ywesee/oddb.org.ruby193/src/util/oddbapp.rb:1473:in `block in _admin'
  • This time 295664 objects were saved in total
  • This error is strange because src/util/oddbapp.rb:1821 (#_migrate_to_utf8) reads as follows
if obj
  @count ||= 0
  obj.odba_store unless obj.odba_unsaved?
  print "saved: ", @count+=1, "\n"
end
  • 'obj' cannot be nil inside this block, but the error actually comes from 'obj.odba_store'
  • A temporary workaround: define a no-op odba_store on nil
class NilClass
  def odba_store
  end
end
  • and, just in case, serialize the store with a Mutex
    def migrate_to_utf8
      @migrate_mutex = Mutex.new
    ...

    def _migrate_to_utf8 queue, table, iconv, opts={}
    ...
    @migrate_mutex.synchronize {
      if obj
        @count ||= 0
        obj.odba_store unless obj.odba_unsaved?
        print "saved: ", @count+=1, "\n"
      end
    }
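
One possible explanation for the nil error inside the 'if obj' guard, not verified in these notes: if 'obj' is a proxy that forwards calls to a lazily loaded target, the proxy itself is truthy even when the target turns out to be nil. The first trace on this page shows ODBA::Stub forwarding 'class' through odba_receiver, which is why this seems plausible. The LazyProxy class below is hypothetical and only illustrates the mechanism; it is not ODBA::Stub.

```ruby
# Hypothetical proxy, NOT ODBA::Stub: forwards every unknown call to a
# target that is resolved lazily by the given block.
class LazyProxy
  def initialize(&loader)
    @loader = loader
  end

  def method_missing(name, *args, &block)
    @loader.call.__send__(name, *args, &block)
  end

  def respond_to_missing?(name, include_private = false)
    @loader.call.respond_to?(name, include_private)
  end
end

obj = LazyProxy.new { nil } # the target cannot be loaded

guard_passed = obj ? true : false # the proxy object itself is truthy

error = begin
  obj.odba_store # forwarded to nil
rescue NoMethodError => e
  e
end
```

Under this assumption the NilClass patch hides the symptom but leaves the unresolved target in place, so listing the objects that hit the no-op would help find them.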

Result

....
saved: 2484074

Note

  • The run takes 5 hours
  • About 2.5 million objects were saved in total
  • Drug search does not work
  • 'FI' results in an encoding error
  • 'PI' results in an error
  • 'WHODDD' results in an error
  • Sequence data is not shown and results in NilClass error

Debug encoding error regular expression

  1. src/util/searchterms.rb:79:in `gsub'
  2. src/util/searchterms.rb:90:in `split'
  3. src/model/address.rb:64:in `match'
  • src/util/searchterms.rb
 def ODDB.search_term(term)
    term.force_encoding('UTF-8')

 def ODDB.search_terms(words, opts={})
    terms = []
    words.flatten.compact.uniq.inject(terms) { |terms, term|
      if(opts[:downcase])
        term = term.downcase
      end
      term.force_encoding('UTF-8')
      parts = term.split(/[\/-]/u)
  • src/model/address.rb
 #!/usr/bin/env ruby
 # encoding: utf-8
 ...
    def city
      @location.force_encoding('utf-8')

Find ODBA::Stub purpose (usage)

Question

  • Why is ODBA::Stub used?
  • Why does ODBA::Stub need to replace Persistable objects?

Refer to testcases

Experiment (Check directly the instance that causes Stub error)

ch.oddb> ODBA.cache.fetch('13744').description.encoding
-> US-ASCII
ch.oddb> ODBA.cache.fetch('13744').description.force_encoding('utf-8')
-> 
ch.oddb> ODBA.cache.fetch('13744').description.encoding
-> US-ASCII
ch.oddb> ODBA.cache.fetch('13744').description.length
-> 0
  • src/util/language.rb
# encoding: utf-8

Reboot oddbd

bin/admin

ch.oddb> ODBA.cache.fetch('13744').description.encoding
-> UTF-8

Note

  • 'description' is defined in src/util/language.rb as follows
    class Descriptions < Hash
      def first
        if empty?
          ''
        else
          sort.first.last
        end
      end
    end
    def description(key=nil)
      descriptions[key.to_s] || descriptions.first
    end
    def descriptions
      @descriptions ||= Descriptions.new
    end
  • In this case, force_encoding has no lasting effect on the value returned by 'description'
  • because the Descriptions hash is empty here, so 'Descriptions#first' creates a new empty string with the default (source file) encoding every time 'description' is called
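
A minimal sketch of this effect, reusing the Descriptions class from the note. On Ruby 2.0 and later the default source encoding is already UTF-8, so the sketch relabels to ISO-8859-1 to keep the effect visible. The idea is the same as the US-ASCII to UTF-8 case above.

```ruby
class Descriptions < Hash
  def first
    if empty?
      '' # a fresh literal on every call, tagged with the source encoding
    else
      sort.first.last
    end
  end
end

descriptions = Descriptions.new

first_call = descriptions.first
first_call.force_encoding('ISO-8859-1') # relabels only this one object

second_call = descriptions.first # a different object, default encoding again
same_object = first_call.equal?(second_call)
```

This is why changing the encoding of the source file, which is what the magic comment in src/util/language.rb does, changes the result while force_encoding in the admin console does not.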

Next

  • First, put the magic comment (# encoding: utf-8) in every source file
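
A minimal sketch of how that step could be automated. The helper name with_magic_comment is hypothetical and not part of oddb.org. It works on the file contents as a string and relies on Ruby honouring the magic comment only on the first line, or on the second line when the first is a shebang.

```ruby
MAGIC_COMMENT = "# encoding: utf-8\n"

# Hypothetical helper: returns the source with a magic comment added
# unless one is already present in the position Ruby looks at.
def with_magic_comment(source)
  lines = source.lines
  index = (lines.first && lines.first.start_with?('#!')) ? 1 : 0
  # Make sure a shebang without trailing newline stays on its own line.
  lines[0] = lines[0] + "\n" if index == 1 && !lines[0].end_with?("\n")
  candidate = lines[index]
  return source if candidate && candidate =~ /\A#.*coding[:=]\s*[\w.-]+/
  lines.insert(index, MAGIC_COMMENT).join
end

# Possible usage, assuming the sources live under src/ (left commented out
# so that the sketch does not touch any files):
# Dir.glob('src/**/*.rb') do |path|
#   File.write(path, with_magic_comment(File.read(path)))
# end
```

Reading the files in binary mode may be safer if some of them contain non-UTF-8 bytes; that detail is left out here.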
Page last modified on October 19, 2011, at 07:11 AM