
20130828-export-therapeuticus-2



Summary

  • fix error with doctors.csv and index_therapeuticus.csv
  • cleanup unit test under test/test_view

Commits


cleanup unit test under test/test_view

lookandfeel.rb had been changed, but its test classes had not. Fixed the errors in test_model and about two thirds of the errors in test_view with the following commits:

TODO: We should get rid of test/mock.rb and replace it with flexmock. It is only used in a few places.

fix error with doctors.csv and index_therapeuticus.csv

Analysing why I don't get any debug output. I must run sudo -u apache /var/www/oddb.org/ext/export/bin/exportd in a different screen to see the output. Added some debug output to ext/export/src/odba_exporter.rb and limited the output to 5 items to speed things up. An export now completes in 7 minutes but still does not give me more insight.

With my ruby version I get utf-8? (siehe auch Entzündungshemmende Mittel 07.10.10) -> (siehe auch Entzündungshemmende Mittel 07.10.10).

Now trying an older Ruby version, ruby-1.9.3-p0. Using a script to install the same gems as on thinpower:

gem install --no-ri --no-rdoc activesupport  --version=2.3.3
gem install --no-ri --no-rdoc builder  --version=2.1.2
gem install --no-ri --no-rdoc dbd-pg  --version=0.3.8
gem install --no-ri --no-rdoc dbi  --version=0.4.2
gem install --no-ri --no-rdoc deprecated  --version=2.0.1
gem install --no-ri --no-rdoc facets  --version=1.8.54
gem install --no-ri --no-rdoc gd2  --version=1.1.1
gem install --no-ri --no-rdoc haml  --version=2.2.2
gem install --no-ri --no-rdoc hpricot  --version=0.8.1
gem install --no-ri --no-rdoc innate  --version=2010.07
gem install --no-ri --no-rdoc mechanize  --version=0.9.3
gem install --no-ri --no-rdoc money  --version=2.1.3
gem install --no-ri --no-rdoc nokogiri  --version=1.3.3
gem install --no-ri --no-rdoc paypal  --version=2.0.0
gem install --no-ri --no-rdoc pg  --version=0.8.0
gem install --no-ri --no-rdoc rack  --version=1.2.1
# gem install --no-ri --no-rdoc ramaze  --version=2009.10
# gem install --no-ri --no-rdoc rmagick  --version=2.11.0
gem install --no-ri --no-rdoc rmail  --version=1.0.0
gem install --no-ri --no-rdoc ruby-ole  --version=1.2.10
gem install --no-ri --no-rdoc rubyzip  --version=0.9.4
gem install --no-ri --no-rdoc spreadsheet  --version=0.6.4
gem install --no-ri --no-rdoc turing  --version=0.0.11

gem install odba flickraw sbsm yus

Now I get the following error when running ruby /var/www/oddb.org/ext/export/bin/exportd

/home/vagrant/.rvm/rubies/ruby-1.9.3-p0/lib/ruby/site_ruby/1.9.1/rubygems/core_ext/kernel_require.rb:51:in `require': iconv will be deprecated in the future, use String#encode instead.
/home/vagrant/.rvm/rubies/ruby-1.9.3-p0/lib/ruby/site_ruby/1.9.1/rubygems/core_ext/kernel_require.rb:51:in `require': /home/vagrant/.rvm/gems/ruby-1.9.3-p0/gems/hpricot-0.8.1/lib/fast_xs.so: undefined symbol: ruby_digitmap - /home/vagrant/.rvm/gems/ruby-1.9.3-p0/gems/hpricot-0.8.1/lib/fast_xs.so (LoadError)
        from /home/vagrant/.rvm/rubies/ruby-1.9.3-p0/lib/ruby/site_ruby/1.9.1/rubygems/core_ext/kernel_require.rb:51:in `require'

Must apply the following patches for the newly installed ruby.

  • /usr/local/share/columninfo.rb.patch
  • /usr/local/share/csv.rb.patch
  • /usr/local/share/notification.rb.patch
  • /usr/local/share/row.rb.patch
  • /usr/local/share/statement.rb.patch

Found that on my VM I don't have a /etc/env.d/02locale and my LANG is different from thinpower. Thinpower has

  • LC_COLLATE=C
  • LANG=de_CH.UTF-8
  • LANGUAGE=de_CH.UTF-8

whereas on my VM I have

  • LC_COLLATE=POSIX
  • LANG=en_US.UTF-8

Running the export again with changed settings: export LC_COLLATE=C; export LANG=de_CH.UTF-8; export LANGUAGE=de_CH.UTF-8 before starting exportd. Then running

sudo -u apache LC_COLLATE=C LANG=de_CH.UTF-8 LANGUAGE=de_CH.UTF-8 ruby /var/www/oddb.org/jobs/mail_index_therapeuticus_csv
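To compare the two machines more systematically, a small Ruby script can print the locale variables next to the external encoding Ruby derived from them. This is only a sketch: locale_report and LOCALE_KEYS are hypothetical names, not part of oddb.org, and the list of variables is an assumption based on the ones compared above.

```ruby
# Hypothetical helper: collect the locale-related environment variables
# compared above, plus the encoding Ruby uses by default for external I/O.
LOCALE_KEYS = %w[LC_ALL LC_COLLATE LANG LANGUAGE].freeze

def locale_report(env = ENV)
  report = {}
  LOCALE_KEYS.each { |key| report[key] = env[key] || '(unset)' }
  report['default_external'] = Encoding.default_external.name
  report
end

# Print one KEY=value line per entry, ready to diff between two hosts.
locale_report.each { |key, value| puts "#{key}=#{value}" }
```

Running it once as the normal user and once via sudo -u apache shows whether the variables actually reach the process that runs the export.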

A problem is that I don't see any difference between item.to_s and item.to_s.force_encoding('utf-8') in my logs. Both print all the umlauts, é, etc. correctly. I am trying to add :encoding => "utf-8" when opening the CSV, as it is currently saved as an ISO 8859-14 document.
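One likely reason the two log lines look identical: force_encoding only changes the encoding label of a string, not its bytes, so on a string that is already UTF-8 it has no visible effect. Only encode rewrites the bytes. A minimal sketch, with hypothetical helper names and a sample string borrowed from the entry above:

```ruby
# Relabel only: the bytes stay exactly as they were.
def relabel_as_utf8(str)
  str.dup.force_encoding('utf-8')
end

# Real conversion: the bytes are rewritten for the target encoding.
def transcode_to_latin1(str)
  str.encode('ISO-8859-1')
end

# \u00fc is "ü"; the escape keeps this file independent of its own encoding.
text = "Entz\u00fcndungshemmende Mittel"

puts text.bytes.to_a == relabel_as_utf8(text).bytes.to_a  # true
puts text.bytesize                                        # 27 (ü is two bytes in UTF-8)
puts transcode_to_latin1(text).bytesize                   # 26 (ü is one byte in ISO-8859-1)
```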

Found the following lines in src/plugin/csv_export.rb
def export_index_therapeuticus
   @options = { :iconv => 'ISO-8859-1//TRANSLIT//IGNORE' }

Changing the :iconv value to 'UTF-8' seems to work. Therefore reverting all other changes and running the export again. The patch (for export_drugs and export_index_therapeuticus) would be:

diff --git a/src/plugin/csv_export.rb b/src/plugin/csv_export.rb
index 8d529e9..659fbc6 100644
--- a/src/plugin/csv_export.rb
+++ b/src/plugin/csv_export.rb
@@ -25,7 +25,7 @@ module ODDB
                        EXPORT_SERVER.export_doc_csv(ids, EXPORT_DIR, 'doctors.csv')
                end
     def export_drugs
-      @options = { :iconv => 'ISO-8859-1//TRANSLIT//IGNORE', :compression => 'zip'}
+      @options = { :iconv => 'UTF-8', :compression => 'zip'}
       recipients.concat self.class::ODDB_RECIPIENTS
       _export_drugs 'oddb', [ :rectype, :iksnr, :ikscd, :ikskey, :barcode,
         :bsv_dossier, :pharmacode, :name_base, :galenic_form,
@@ -152,7 +152,7 @@ module ODDB
       raise
     end
     def export_index_therapeuticus
-      @options = { :iconv => 'ISO-8859-1//TRANSLIT//IGNORE' }
+      @options = { :iconv => 'UTF-8' }
       recipients.concat self.class::ODDB_RECIPIENTS
       ids = @app.indices_therapeutici.sort.collect { |code, idx| idx.odba_id }
       files = []

This is okay. Now running jobs/export_csv to see whether exporting the drugs works, too.
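The effect of switching the target from ISO-8859-1 to UTF-8 can be illustrated with String#encode, which the iconv deprecation warning above recommends. How the exporter itself consumes the :iconv option is not shown here, so convert_for_export is a hypothetical stand-in. The replace options only roughly mimic //TRANSLIT//IGNORE: they substitute '?' instead of transliterating. The sample string is made up for illustration.

```ruby
# Hypothetical stand-in for the exporter's conversion step.
# :invalid/:undef => :replace avoids exceptions for characters the target
# encoding cannot represent; unlike iconv's //TRANSLIT it does not
# transliterate, it just writes '?'.
def convert_for_export(str, target)
  str.encode(target, :invalid => :replace, :undef => :replace, :replace => '?')
end

# \u00fc is "ü" (exists in Latin-1), \u2013 is an en dash (does not).
sample = "Entz\u00fcndungshemmende Mittel \u2013 07.10.10"

old_style = convert_for_export(sample, 'ISO-8859-1')
new_style = convert_for_export(sample, 'UTF-8')

puts old_style.encode('UTF-8')  # the en dash has become '?'
puts new_style == sample        # true: nothing is lost with a UTF-8 target
```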

Pushed commit "Export drugs & index_therapeuticus as UTF-8 files".

Almost by accident found the corresponding unit tests. Fixed them with commit "Fix csv-exporter. Tests are using UTF-8 now".
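A quick sanity check for an exported file is to read its raw bytes and ask Ruby whether they form valid UTF-8. This is a sketch, not the check used in the oddb.org tests: utf8_file? is a hypothetical helper and the file names and CSV line are invented for the demonstration.

```ruby
require 'tmpdir'

# Hypothetical check: true if the file's raw bytes are valid UTF-8.
# force_encoding is appropriate here because we only want to label the
# bytes as UTF-8 and validate them, not convert them.
def utf8_file?(path)
  File.binread(path).force_encoding('UTF-8').valid_encoding?
end

Dir.mktmpdir do |dir|
  good = File.join(dir, 'utf8.csv')
  bad  = File.join(dir, 'latin1.csv')
  line = "07.10.10;Entz\u00fcndungshemmende Mittel\n"

  # Binary mode writes the bytes unchanged, without any transcoding.
  File.open(good, 'wb') { |f| f.write(line) }
  File.open(bad, 'wb')  { |f| f.write(line.encode('ISO-8859-1')) }

  puts utf8_file?(good)  # true
  puts utf8_file?(bad)   # false: the Latin-1 byte 0xFC is never valid in UTF-8
end
```

A pure-ASCII file passes this check in either case, so it is only meaningful for exports that contain umlauts or accented characters.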

Page last modified on August 28, 2013, at 11:17 PM