
20110902-refactoring-testcases-search-migel



  1. Replace Migel::Model::Migelid.all
  2. Product article_name fulltext search
  3. Refactoring importer suspend
  4. Move workspace of migel
  5. Fix price sort order
  6. Debug gsub nil error
  7. Behaviours (Testcases) and refactoring migel

Goal/Estimate/Evaluation:

  • Update migel search function / 100% /
Milestones
  1. replace Migel.all method
  2. create fulltext index table for product article name and company name
  3. price sort
  4. Testcases
ToDo
  • refactor rebuild_fulltext_index_tables
  • delete_migelids in lib/migel/util/server.rb
  • 'could not sort' warning -> this is probably caused by the to_s method
  • RangeError 0x3fd07d5c4ee0 is recycled object
  • Refactor Group, Subgroup, Migelid because some of the code in these classes is identical
  • Testcases oddb.org, migel
  • Importer takes longer
  • Migel::Model::Migelid.all problem
    • This method takes a very long time and results in a memory leak

Replace Migel::Model::Migelid.all

Problem

  • The Migel::Model::Migelid.all method causes a memory leak

Experiment

  • lib/migel/util/server.rb
      def all_migelids
         @migelids ||= ODBA.cache.fetch_named('all_migelids', self){
           []
         }
      end
      def add_migelid(migelid)
        all_migelids << migelid
        all_migelids.odba_store
      end
      def init_migelids
        ODBA.cache.index_keys('migel_model_migelid_migel_code').each do |migel_code|
          add_migelid Migel::Model::Migelid.find_by_migel_code(migel_code)
        end
      end
      def delete_migelid
        # TODO
      end
      def rebuild_fulltext_index_tables
 index_def = YAML.load <<-EOD
 --- !ruby/object:ODBA::IndexDefinition 
 index_name: 'migel_fulltext_index_de'
 origin_klass: 'Migel::Model::Migelid'
 target_klass: 'Migel::Model::Migelid'
 resolve_search_term: 'full_description(:de)'
 resolve_target: ''
 resolve_origin: ''
 fulltext: true
 init_source: 'all_migelids'
 dictionary: 'german'
 EOD

Restore

masa@masa ~/ywesee/migel $ sudo -u postgres dropdb migel
masa@masa ~/ywesee/migel $ sudo -u postgres createdb -E UTF8 -T template0 migel
masa@masa ~/ywesee/migel $ bin/migeld
migel> Migel::Importer.new.update('data/csv/migel_de_test.csv', 'de')
-> Array
migel> Migel::Importer.new.update('data/csv/migel_fr_test.csv', 'fr')
-> Array
migel> Migel::Importer.new.update_products_by_migel_code('15.30.50.00.1', 'de')
-> Array
migel> Migel::Importer.new.update_products_by_migel_code('15.30.50.00.1', 'fr')
-> Array
migel> init_migelids
-> Array
migel> rebuild_fulltext_index_tables
-> Array

Search (e.g. search 'wochen')

Attach:wochen_search.20110902.jpg

Product article_name fulltext search

  • lib/migel/model/product.rb
      def full_description(lang)
        [(article_name.send(lang) or ''), (company_name and company_name.send(lang) or '')].join(' ')
      end
  • lib/migel/util/server.rb
      def products
        @products ||= ODBA.cache.fetch_named('all_products', self){
          {}
         }
      end
      def add_product(product)
        products.store(product.pharmacode, product)
        products.odba_store
      end
      def init_products
        ODBA.cache.index_keys('migel_model_product_pharmacode').each do |pharmacode|
          # look up the product by the pharmacode yielded from the index
          add_product Migel::Model::Product.find_by_pharmacode(pharmacode)
        end
      end
      alias :all_products :products
      def delete_product
        # TODO
      end
      def rebuild_fulltext_index_tables
...
index_def = YAML.load <<-EOD
--- !ruby/object:ODBA::IndexDefinition 
index_name: 'migel_fulltext_index_fr'
origin_klass: 'Migel::Model::Migelid'
target_klass: 'Migel::Model::Migelid'
resolve_search_term: 'full_description(:fr)'
resolve_target: ''
resolve_origin: ''
fulltext: true
init_source: 'all_migelids.values'
dictionary: 'french'
EOD
#init_source: 'Migel::Model::Migelid.all'
        begin
        ODBA.cache.drop_index('migel_fulltext_index_fr')
        rescue
        end
        ODBA.cache.create_index(index_def, Migel)

        puts "filling: #{index_def.index_name}"
        puts index_def.init_source
        source = instance_eval(index_def.init_source)
        puts "source.size: #{source.size}"
        ODBA.cache.fill_index(index_def.index_name, source)
...
  • src/util/oddbapp.rb
    def search_migel_items(query, lang)
      if query =~ /^\d{13}$/
        MIGEL_SERVER.product.search_by_ean_code(query)
      elsif query =~ /^\d{6,}$/
        MIGEL_SERVER.product.search_by_pharmacode(query)
      else
        MIGEL_SERVER.search_migel_product(query, lang)
      end
    end
  • lib/migel/util/server.rb
      def search_migel_product(query, lang)
        # search product by fulltext search
        index_table_name = 'migel_product_fulltext_index_' + lang
        result = ODBA.cache.retrieve_from_index(index_table_name, query)
        unless result.empty?
          ODBA::DRbWrapper.new(result)
        else
        # search product by name (prefix search)
          search_method_article_name = 'search_by_article_name_' + lang.downcase.to_s
          search_method_company_name = 'search_by_company_name_' + lang.downcase.to_s
          result = Migel::Model::Product.send(search_method_article_name, query) + Migel::Model::Product.send(search_method_company_name, query)
          ODBA::DRbWrapper.new(result)
        end
      end
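
The dispatch in search_migel_items and the "fulltext first, prefix search as fallback" flow in search_migel_product can be tried out without a running server. The following is only a minimal sketch: classify_query and search_with_fallback are hypothetical helpers, and the Hash and Array stand in for the ODBA index and the product names. Only the two regular expressions are copied from the code above.

```ruby
# Hypothetical helper: names the search path a query would take.
# The two patterns are the ones used in search_migel_items.
def classify_query(query)
  if query =~ /^\d{13}$/
    :ean_code     # exactly 13 digits
  elsif query =~ /^\d{6,}$/
    :pharmacode   # 6 or more digits, but not exactly 13
  else
    :fulltext     # everything else goes to the text search
  end
end

# Hypothetical stand-in for the fallback logic: try the fulltext index
# first, and only if it returns nothing do a prefix match on the names.
def search_with_fallback(query, fulltext_index, names)
  hits = fulltext_index.fetch(query.downcase, [])
  return hits unless hits.empty?
  names.select { |name| name.downcase.start_with?(query.downcase) }
end

%w[7680123456789 2126975 wochen 12345].each do |q|
  puts "#{q} -> #{classify_query(q)}"
end
```

Note that the order of the two branches matters: a 13-digit query also matches the second pattern, so the EAN check has to come first.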

Restore

masa@masa ~/ywesee/migel $ sudo -u postgres dropdb migel
masa@masa ~/ywesee/migel $ sudo -u postgres createdb -E UTF8 -T template0 migel
masa@masa ~/ywesee/migel $ bin/migeld
migel> Migel::Importer.new.update('data/csv/migel_de_test.csv', 'de')
-> Array
migel> Migel::Importer.new.update('data/csv/migel_fr_test.csv', 'fr')
-> Array
migel> Migel::Importer.new.update_products_by_migel_code('15.30.50.00.1', 'de')
-> Array
migel> Migel::Importer.new.update_products_by_migel_code('15.30.50.00.1', 'fr')
-> Array
migel> init_migelids
-> Array
migel> init_products
-> 21269172126923212697521269812760086276009227601002998899
migel> rebuild_fulltext_index_tables
-> Array

Search

Refactoring importer

Problem

  • The first CSV file takes only 5 minutes to import, but the second CSV file takes much longer (about 40 minutes)

Note

  • This is not a file format or data problem; it is a problem with the import algorithm.

Install Profiling

  • sudo gem install ruby-prof (failed)
  • sudo emerge dev-ruby/ruby-prof (ok)

Reference

Run

masa@masa ~/ywesee/migel $ ruby-prof bin/migeld -f profile.html -p call_stack

1st profile

migel> Migel::Importer.new.update('data/csv/migel_de_test.csv', 'de')

2nd profile

migel> Migel::Importer.new.update('data/csv/migel_de_test.csv', 'de')

Result

Note

  • Most of the time is spent in the ODBA#save method
  • The 2nd profile does not include update_subgroup and update_migelid, why?

Note

  • The on_save(:cascade) method may be related to the save time
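
To check a suspicion like this without a full ruby-prof run, the calls can also be timed by hand with Benchmark from the standard library. The following is only a sketch under assumptions: FakeRecord and CallTimer are hypothetical names, and the sleep merely stands in for a slow save; nothing here calls ODBA.

```ruby
require 'benchmark'

# Hypothetical stand-in for a persistable object (not an ODBA class).
class FakeRecord
  def save
    sleep 0.001 # pretend to write to the database
  end
end

# Accumulates the call count and wall-clock time per label.
class CallTimer
  attr_reader :stats

  def initialize
    @stats = Hash.new { |h, k| h[k] = { calls: 0, seconds: 0.0 } }
  end

  # Runs the block, records how long it took, and returns its result.
  def measure(label)
    result = nil
    elapsed = Benchmark.realtime { result = yield }
    @stats[label][:calls] += 1
    @stats[label][:seconds] += elapsed
    result
  end
end

timer = CallTimer.new
records = Array.new(20) { FakeRecord.new }
records.each { |r| timer.measure(:save) { r.save } }

timer.stats.each do |label, s|
  puts format('%-6s calls=%d total=%.4fs', label, s[:calls], s[:seconds])
end
```

Comparing the call count and the total time between the first and the second import would show whether save is called more often or whether each call gets slower.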

Move workspace of migel

Task

  • Move workspace of migel in order to develop migel code while the importer is running

Idea

  • Copy code somewhere
  • Change database name
  • Change DRb URL

Experiment

  • create database 'migel_test'
 masa@masa ~/ywesee/migel_dev $ sudo -u postgres createdb -E UTF8 -T template0 migel_test

Run

  • bin/migel
 masa@masa ~/ywesee/migel_dev $ bin/migeld server_url='druby://localhost:77777' db_name='migel_test'
  • src/util/oddbconfig.rb
module ODDB
  #MIGEL_URI = 'druby://localhost:33000'
  MIGEL_URI = 'druby://localhost:77777'
  • bin/oddbd

Restore

 masa@masa ~/ywesee/migel_dev $ ruby bin/admin server_url='druby://localhost:77777' db_name='migel_test'
 migel> Migel::Importer.new.update('data/csv/migel_de_test.csv', 'de')
 migel> Migel::Importer.new.update_products_by_migel_code('15.30.50.00.1')

Result

  • migeld works with the different database (migel_test)
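
The key=value arguments passed to bin/migeld above can be pictured as overrides on top of a default configuration. The following is only a sketch of that idea, not the actual option handling of migeld: default_config and parse_overrides are hypothetical names, and the default values are assumptions based on the URL and database name that appear elsewhere in this log.

```ruby
# Hypothetical defaults (assumed from the values seen in this log).
def default_config
  {
    'server_url' => 'druby://localhost:33000',
    'db_name'    => 'migel',
  }
end

# Applies key=value arguments on top of the defaults.
# Unknown keys and arguments without '=' are ignored.
def parse_overrides(argv, defaults = default_config)
  argv.each_with_object(defaults.dup) do |arg, config|
    key, value = arg.split('=', 2)
    next unless value && config.key?(key)
    config[key] = value
  end
end

config = parse_overrides(['server_url=druby://localhost:77777', 'db_name=migel_test'])
puts config['server_url']
puts config['db_name']
```

Because both settings are overridden together, the development copy never touches the database or the DRb port of the instance that is running the importer.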

Fix price sort order

Problem

  • Prices are sorted in dictionary (string) order instead of numeric order
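
The difference between the two orders is easy to reproduce in plain Ruby. The price strings below are made-up sample values, not data from the migel database:

```ruby
# Sample values only; real prices come from the product records.
prices = ['9.50', '10.00', '100.00', '25.90']

dictionary = prices.sort            # compares character by character
numeric    = prices.sort_by(&:to_f) # compares the numeric values

puts dictionary.inspect
puts numeric.inspect
```

In dictionary order '10.00' and '100.00' come before '9.50' because '1' sorts before '9'; converting with to_f before comparing gives the expected numeric order.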

Note

  • I found a new bug
  • When I click 'Packungsgrösse' (package size), the following error message appears:
error in SBSM::Session#process: /de/gcc/sort/state_id/70078219690680/sortvalue/size
NoMethodError
private method `gsub' called for nil:NilClass
(druby://localhost:10000) /home/masa/ywesee/oddb.org/src/remote/multilingual.rb:63:in `all'
(druby://localhost:10000) /home/masa/ywesee/oddb.org/src/remote/multilingual.rb:63:in `collect'
(druby://localhost:10000) /home/masa/ywesee/oddb.org/src/remote/multilingual.rb:63:in `all'
(druby://localhost:10000) /home/masa/ywesee/oddb.org/src/remote/multilingual.rb:49:in `<=>'
(druby://localhost:10000) /home/masa/ywesee/oddb.org/src/state/global.rb:769:in `compare_entries'
  • This may be caused by a comparison between nil data and a Multilingual instance

Experiment

  • src/state/migel/items.rb
class Items < State::Migel::Global
  VIEW = ODDB::View::Migel::Items
  def compare_entries(a, b)
    @sortby.each { |sortby|
      if sortby == :ppub
        return a.ppub.to_f <=> b.ppub.to_f
      else
        aval, bval = nil
        begin
          aval = umlaut_filter(a.send(sortby))
          bval = umlaut_filter(b.send(sortby))
        rescue
          next
        end
        res = if (aval.nil? && bval.nil?)
          0
        elsif (aval.nil?)
          1
        elsif (bval.nil?)
          -1
        else
          aval <=> bval
        end
        return res unless(res == 0)
      end
    }
    0
  end

Result

Debug gsub nil error

Problem

  • When I click 'Packungsgrösse' (package size), the following error message appears:
error in SBSM::Session#process: /de/gcc/sort/state_id/70078219690680/sortvalue/size
NoMethodError
private method `gsub' called for nil:NilClass
(druby://localhost:10000) /home/masa/ywesee/oddb.org/src/remote/multilingual.rb:63:in `all'
(druby://localhost:10000) /home/masa/ywesee/oddb.org/src/remote/multilingual.rb:63:in `collect'
(druby://localhost:10000) /home/masa/ywesee/oddb.org/src/remote/multilingual.rb:63:in `all'
(druby://localhost:10000) /home/masa/ywesee/oddb.org/src/remote/multilingual.rb:49:in `<=>'
(druby://localhost:10000) /home/masa/ywesee/oddb.org/src/state/global.rb:769:in `compare_entries'

Experiment

migel> Migel::Model::Product.find_by_pharmacode('2126975').size
-> undefined method `<=>' for nil:NilClass
  • lib/migel/util/multilingual.rb
      def to_s
        #@canonical.values.sort.first.to_s
        @canonical.values.compact.sort.first.to_s
      end

Experiment

migel> Migel::Model::Product.find_by_pharmacode('2126975').size.to_s
-> 
migel> Migel::Model::Product.find_by_pharmacode('2126975').size.to_s.class
-> String
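
The effect of adding compact can be reproduced with a plain Array. This is only an illustration with made-up values; it does not use the Multilingual class:

```ruby
# Made-up values; nil stands for a missing translation.
values = ['50 ml', nil, '10 ml']

begin
  values.sort
  sort_failed = false
rescue StandardError => e
  # The log above shows NoMethodError; newer Ruby versions raise
  # ArgumentError instead. Either way the sort does not complete.
  sort_failed = true
  puts "sort failed: #{e.class}"
end

# Dropping the nils first makes the sort safe.
first = values.compact.sort.first.to_s
puts first

# With only nils, first returns nil and to_s turns it into "".
empty = [nil, nil].compact.sort.first.to_s
puts empty.inspect
```

The last case corresponds to the empty String returned by size.to_s in the experiment above.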

Sort by size in the view

  • The error still occurs
  • lib/migel/util/multilingual.rb
      def all
        #terms = super.concat(@synonyms)
        terms = super.concat(@synonyms).compact

Result

  • No error occurs, and sorting works

Behaviours (Testcases) and refactoring migel

List

migel server side

  1. model
    1. group.rb
    2. subgroup.rb
    3. migelid.rb
    4. product.rb
  2. util
    1. swissindex.rb
    2. server.rb
    3. multilingual.rb
  3. others
    1. importer.rb
    2. model_super.rb

oddb side (client side)

  1. model
    1. migel/group.rb
    2. items.rb
  2. util
    1. oddbapp.rb
    2. session.rb
  3. state
    1. global.rb
    2. migel/alphabetical.rb
    3. migel/items.rb
    4. migel/result.rb
  4. view
    1. migel/group.rb
    2. migel/items.rb
    3. migel/limitationtext.rb
    4. migel/product.rb
    5. migel/result.rb
    6. subgroup.rb
    7. pointersteps.rb
    8. pointervalue.rb

e.g.

Run

Page last modified on March 12, 2013, at 02:08 PM