

20141208-gem-import-atc-code

Summary

  • gem to get atc_codes from ean of swiss drugs
  • Supervise import_doctors on thinpower
  • Check why import_swissmedic takes so long

Commits

Index

Keep in Mind
  • Fix dojo error http://www.sitepen.com/blog/2012/10/31/debugging-dojo-common-error-messages/#forgot-dom-ready
  • On May-27 I removed the tests for ix_registrationss, fix_sequences, fix_compositions, fix_packages from test/test_plugin/swissmedic.rb, as I could not find any references to them in the source code. Did I erroneously remove stuff when cleaning up the swissmedic import earlier?
  • The whole test for older/newer Packages must be adapted to xlsx. One must compare the rows (e.g. by creating csv files) and do the same stuff in xlsx!

---

Do not modify frozen strings for searchterms

From thinpowers log

2014-12-08 09:42:44 +0100:  MedregDoctorPlugin Searching for doctor with GLN 7601000001139. Skipped 0, created 0 updated 28 of 33209).
RuntimeError: can't modify frozen String when updating index 'doctor_index' with a ODDB::Doctor
["/var/www/oddb.org/src/util/searchterms.rb:91:in `force_encoding'", "/var/www/oddb.org/src/util/searchterms.rb:91:in `block in search_terms'", "/var/www/oddb.org/src/util/searchterms.rb:87:in `each'", "/var/www/oddb.org/src/util/searchterms.rb:87:in `inject'"]

Trying to fix the frozen string error with the following diff

diff --git a/src/util/searchterms.rb b/src/util/searchterms.rb
index 731b52c..bcda396 100644
--- a/src/util/searchterms.rb
+++ b/src/util/searchterms.rb
@@ -88,7 +88,7 @@ module ODDB
       if(opts[:downcase])
         term = term.downcase
       end
-      term.force_encoding('UTF-8')
+      term.force_encoding('UTF-8') unless term.frozen?
                        parts = term.split(/[\/-]/u)
                        if(parts.size > 1)
         terms.push(ODDB.search_term(parts.first))

From log on oddb-ci with applied patch

2014-12-08 16:55:28 +0100:  MedregDoctorPlugin Searching for doctor with GLN 7601000001122. Skipped 0, created 0 updated 27 of 33209).
2014-12-08 16:55:34 +0100:  MedregDoctorPlugin Searching for doctor with GLN 7601000001139. Skipped 0, created 0 updated 28 of 33209).
2014-12-08 16:55:39 +0100:  MedregDoctorPlugin Searching for doctor with GLN 7601000001177. Skipped 0, created 0 updated 29 of 33209).

Pushed commit Do not force encoding for frozen strings
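
The patch above simply skips the re-encoding when the term is frozen. As a minimal sketch of an alternative (utf8_term is a hypothetical helper, not code from src/util/searchterms.rb), one could work on a copy instead, since force_encoding mutates its receiver and therefore raises on a frozen String:

```ruby
# Hypothetical helper for illustration only; not part of oddb.org.
def utf8_term(term)
  # force_encoding changes the receiver in place, so a frozen String raises.
  # Working on a copy leaves the caller's frozen object untouched.
  term = term.dup if term.frozen?
  term.force_encoding('UTF-8')
end

frozen = 'aspirin'.freeze
begin
  frozen.force_encoding('UTF-8')
rescue RuntimeError => e # FrozenError on newer Rubies is a RuntimeError subclass
  puts "raised #{e.class}"
end
puts utf8_term(frozen).encoding
```

Unlike the pushed patch, this variant always returns a UTF-8 tagged string, at the cost of one extra copy per frozen term.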

Check why import_swissmedic takes so long

After my changes (see this commit) the Swissmedic import took far longer to execute, as the following log entries show

grep minutes /var/www/oddb.org/log/oddb/debug/2014/1?.log
log/oddb/debug/2014/10.log:2014-10-15 09:22:58 +0200: /var/www/oddb.org/src/plugin/swissmedic.rb: 102 cp /var/www/oddb.org/data/xls/Packungen-2014.10.15.xlsx /var/www/oddb.org/data/xls/Packungen-latest.xlsx after 119 minutes
log/oddb/debug/2014/11.log:2014-11-05 09:19:24 +0100: /var/www/oddb.org/src/plugin/swissmedic.rb: 102 cp /var/www/oddb.org/data/xls/Packungen-2014.11.05.xlsx /var/www/oddb.org/data/xls/Packungen-latest.xlsx after 110 minutes
log/oddb/debug/2014/12.log:2014-12-04 06:53:12 +0100: /var/www/oddb.org/src/plugin/swissmedic.rb: 98 cp /var/www/oddb.org/data/xls/Packungen-2014.12.03.xlsx /var/www/oddb.org/data/xls/Packungen-latest.xlsx after 750 minutes
log/oddb/debug/2014/12.log:2014-12-05 00:38:36 +0100: /var/www/oddb.org/src/plugin/swissmedic.rb: 98 cp /var/www/oddb.org/data/xls/Packungen-2014.12.04.xlsx /var/www/oddb.org/data/xls/Packungen-latest.xlsx after 811 minutes
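
To follow these durations over time, the "after N minutes" entries can be extracted with a small script. This is only a sketch: the regular expression is modelled on the four lines above, and import_durations is a hypothetical name.

```ruby
require 'time'

# Matches the "cp <dated file> .../Packungen-latest.xlsx after N minutes" entries.
DURATION_RE = /(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} [+-]\d{4}):.*Packungen-latest\.xlsx after (\d+) minutes/

# Returns an array of [finish_time, minutes] pairs, skipping unrelated lines.
def import_durations(lines)
  lines.each_with_object([]) do |line, result|
    match = DURATION_RE.match(line)
    next unless match
    finished = Time.strptime(match[1], '%Y-%m-%d %H:%M:%S %z')
    result << [finished, match[2].to_i]
  end
end

sample = [
  'log/oddb/debug/2014/10.log:2014-10-15 09:22:58 +0200: /var/www/oddb.org/src/plugin/swissmedic.rb: 102 cp /var/www/oddb.org/data/xls/Packungen-2014.10.15.xlsx /var/www/oddb.org/data/xls/Packungen-latest.xlsx after 119 minutes',
  'log/oddb/debug/2014/12.log:2014-12-04 06:53:12 +0100: /var/www/oddb.org/src/plugin/swissmedic.rb: 98 cp /var/www/oddb.org/data/xls/Packungen-2014.12.03.xlsx /var/www/oddb.org/data/xls/Packungen-latest.xlsx after 750 minutes'
]
import_durations(sample).each do |finished, minutes|
  puts "#{finished.strftime('%Y-%m-%d')}: #{minutes} minutes"
end
```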

Analysing the log entries to find out why import_swissmedic took so long. We look at the import that started on 2014-12-04 at around 11 AM


2014-12-04 11:06:55 +0100: /var/www/oddb.org/src/plugin/swissmedic.rb: 263 skip writing /var/www/oddb.org/data/xls/Packungen-2014.12.04.xlsx as it already exists and is 2568182 bytes.
2014-12-04 11:06:55 +0100: /var/www/oddb.org/src/plugin/swissmedic.rb: 62 update target "/var/www/oddb.org/data/xls/Packungen-2014.12.04.xlsx" 2568182 bytes. Latest /var/www/oddb.org/data/xls/Packungen-latest.xlsx 2113687 bytes
2014-12-04 11:06:55 +0100: /var/www/oddb.org/src/plugin/swissmedic.rb: 263 skip writing /var/www/oddb.org/data/xls/Präparateliste-2014.12.04.xlsx as it already exists and is 2090918 bytes.
2014-12-04 11:06:55 +0100: /var/www/oddb.org/src/plugin/swissmedic.rb: 303 cp /var/www/oddb.org/data/xls/Präparateliste-2014.12.04.xlsx /var/www/oddb.org/data/xls/Präparateliste-latest.xlsx
2014-12-04 11:14:19 +0100: /var/www/oddb.org/src/plugin/swissmedic.rb: 73 Compared /var/www/oddb.org/data/xls/Packungen-2014.12.04.xlsx 2568182 bytes with /var/www/oddb.org/data/xls/Packungen-latest.xlsx 2113687 bytes
2014-12-04 11:14:19 +0100: /var/www/oddb.org/src/plugin/swissmedic.rb: 77 Found 0 news, 17249 updates, 0 replacements and 0 package_deletions

<..>
2014-12-04 11:14:33 +0100: /var/www/oddb.org/src/plugin/swissmedic.rb: 846: update_sequence 00274 atc_code_swissmedic J06AA atc_code_sequence J06AA equal true 
2014-12-04 11:14:35 +0100: /var/www/oddb.org/src/plugin/swissmedic.rb: 855: res Cardio-Pulmo-Rénal Sérocytol, suppositoire == Cardio-Pulmo-Rénal Sérocytol, suppositoire? seqnr 01 args {:composition_text=>"globulina equina (immunisé avec coeur, tissu pulmonaire, reins de porcins) 8 mg, propylenglycolum, conserv.: E 216, E 218, excipiens pro suppositorio.", :name_base=>"Cardio-Pulmo-Rénal Sérocytol", :name_descr=>"suppositoire", :dose=>nil, :sequence_date=>#<DateTime: 2010-04-26T00:00:00+00:00 ((2455313j,0s,0n),+0s,2299161j)>, :export_flag=>nil} 
<..>

2014-12-04 22:48:33 +0100: /var/www/oddb.org/src/plugin/swissmedic.rb: 855: res Exviera 250 mg, Filmtabletten == Exviera 250 mg, Filmtabletten? seqnr 01 args {:composition_text=>"dasabuvirum 250 mg ut dasabuvirum natricum corresp. dasabuvirum natricum monohydricum 270.26 mg, excipiens pro compresso obducto.", :name_base=>"Exviera 250 mg", :name_descr=>"Filmtabletten", :dose=>nil, :sequence_date=>#<DateTime: 2014-11-25T00:00:00+00:00 ((2456987j,0s,0n),+0s,2299161j)>, :export_flag=>nil}
2014-12-04 22:48:33 +0100: update_compositions: row[0] 65302 iksnr 65302 01 seq Exviera 250 mg, Filmtabletten opts {:create_only=>false, :date=>#<Date: 2014-12-04 ((2456996j,0s,0n),+0s,2299161j)>, :composition=>0, :label=>nil} cell_content dasabuvirum 250 mg ut dasabuvirum natricum corresp. dasabuvirum natricum monohydricum 270.26 mg, excipiens pro compresso obducto.
<.. What is happening in the next 110 minutes? ..>
2014-12-05 00:38:36 +0100: /var/www/oddb.org/src/plugin/swissmedic.rb: 181: delete 0 items  
2014-12-05 00:38:36 +0100: /var/www/oddb.org/src/plugin/swissmedic.rb: 98 cp /var/www/oddb.org/data/xls/Packungen-2014.12.04.xlsx /var/www/oddb.org/data/xls/Packungen-latest.xlsx after 811 minutes

Comparing it with import_daily from 2014-11-01, which took about 4 hours

2014-11-01 07:30:41 +0100: /var/www/oddb.org/src/plugin/swissmedic.rb: 296 skip writing /var/www/oddb.org/data/xls/Packungen-2014.11.01.xlsx as /var/www/oddb.org/data/xls/Packungen-latest.xlsx is 2723179 bytes. Returning latest
<..>
2014-11-01 11:09:31 CET return_value_BsvXmlPlugin.update = "/var/www/oddb.org/data/xml/XMLPublications-2014.11.01.zip"
2014-11-01 11:09:31 CET getin log_notify_bsv
2014-11-01 11:09:31 CET date=#<Date: 2014-11-01 ((2456963j,0s,0n),+0s,2299161j)>
2014-11-01 11:09:31 CET after pointer creating
2014-11-01 11:10:13 CET bsv_xml: attached file Duplicate_Registrations_in_SL_01.11.2014.txt is saved
<..>
2014-11-01 11:10:14 CETUtil.send_mail send_mail_with_attachments ["oddb_bsv"]  
2014-11-01 11:10:14 CET Util.log_and_deliver_mail to=["ngiger@ywesee.com", "paul.wiederkehr@pharmasuisse.org", "zdavatz@ywesee.com"] subject ch.ODDB.org Report - SL-Update (XML) - 11/2014 size Created SL-Entries                                           46  
Updated SL-Entries                                         9467  
Deleted SL-Entries                                          102  
Created Limitation-Texts                                     38  
Updated Limitation-Texts                                   1817  
Deleted Limitation-Texts                                      0  

We would also like to get rid of all the errors when updating atc_code (frozen strings)

It looks as if updating a single sequence takes at least 2 seconds, sometimes more. There were 3787 sequences where an update was necessary. We logged 32217 occurrences where no update was necessary.

We also get warnings like ODBA::Stub was unable to replace ODDB::Text::Chapter#31103789 from ODDB::FachinfoDocument2001:#31103775
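
As a rough cross-check of these numbers, the lower bound for the time spent on updated sequences can be computed directly. This is only back-of-the-envelope arithmetic using the two figures from this paragraph; the constant and method names are made up.

```ruby
SECONDS_PER_UPDATE = 2    # "at least 2 seconds" per updated sequence
UPDATED_SEQUENCES  = 3787 # sequences where an update was necessary

# Lower bound in minutes for updating `count` sequences.
def minimum_update_minutes(count, seconds_each)
  (count * seconds_each) / 60.0
end

puts minimum_update_minutes(UPDATED_SEQUENCES, SECONDS_PER_UPDATE).round # => 126
```

So the 2 second floor alone explains roughly 126 of the 811 minutes; the rest must come from updates that take longer than 2 seconds or from other steps of the import.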

Ran /usr/local/bin/ruby jobs/import_swissmedic on oddb-ci (after removing data/xls/Präparateliste-latest.xlsx and data/xls/Packungen-latest.xlsx). Some characteristics:

  • data/xls/Präparateliste-latest.xlsx created after 2 minutes
  • update_compositions entries appear from 6 minutes until 16 minutes after start.

I am also concerned whether the changes really get propagated to the user. E.g. Packungen.xlsx has the line

63179	1	Anouk, Tabletten	Spirig HealthCare AG	09.02.1.	G03AC09	Synthetika human	23.12.13	23.12.13	22.12.18	001	28	Tablette(n)	B	desogestrelum	desogestrelum 75 µg, excipiens pro compresso.	Orale Kontrazeption

and we find a concerning log entry

2014-12-04 22:36:54 +0100: /var/www/oddb.org/src/plugin/swissmedic.rb: 846: update_sequence 63179 atc_code_swissmedic G03AC09 atc_code_sequence G03AA09 equal false

but when looking at http://ch.oddb.org/de/gcc/drug/reg/63179/seq/01 we find ATC-Klassierung G03AA09. Confirmed by using bin/admin

ch.oddb> registration('63179').sequence('01').atc_class.code
-> G03AA09

Does this mean that the (often used) code snippet @app.update ptr, args, :swissmedic does not work as expected? Why do I sometimes have to resort to methods like @@sequence.atc_class=

Trying the following code snippet in bin/admin

ch.oddb> registration('63179').sequence('01').atc_class.code
-> G03AA09
ch.oddb> registration('63179').sequence('01').atc_class=atc_classes['G03AC09']
-> Desogestrel
ch.oddb> registration('63179').sequence('01').atc_class.code
-> G03AC09
ch.oddb> atc_classes['G03AC09'] == atc_class('G03AC09')
-> true
# do not forget to persist it by calling
ch.oddb> registration('63179').sequence('01').odba_store
-> Anouk, Tabletten

Okay. You may get an ATC class via atc_class('G03AC09') or atc_classes['G03AC09']. After restarting oddbd, http://oddb-ci2.dyndns.org/de/gcc/drug/reg/63179/seq/01 displays the changed atc_code G03AC09. After logging in as admin user I changed the atc_code manually to L01CD01 and saved the changes. Afterwards checking in bin/admin

ch.oddb> registration('63179').sequence('01').atc_class.code
-> L01CD01

Restarting oddbd and rechecking in bin/admin.

Creating a watir test to check whether the atc code changes as expected.

Today on oddb-ci2 running swissmedic_update was a lot shorter than last week as attested by these log entries

2014-12-04 01:30:48 +0100: /var/www/oddb.org/src/plugin/swissmedic.rb: 98 cp /var/www/oddb.org/data/xls/Packungen-2014.12.03.xlsx /var/www/oddb.org/data/xls/Packungen-latest.xlsx after 479 minutes
2014-12-08 12:12:00 +0100: /var/www/oddb.org/src/plugin/swissmedic.rb: 98 cp /var/www/oddb.org/data/xls/Packungen-2014.12.08.xlsx /var/www/oddb.org/data/xls/Packungen-latest.xlsx after 64 minutes

Created a new method to update all atc_codes, which worked fine for the first 5 items. Now running it for all. The atc_code is only updated if the swissmedic atc_code has length 7.
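
The length rule can be sketched as a small predicate. atc_update_needed? is a hypothetical name and this is not the method added to src/plugin/swissmedic.rb; it only illustrates the decision described above.

```ruby
FULL_ATC_LENGTH = 7 # e.g. G03AC09; shorter codes such as J06AA are ignored

# True if the Swissmedic code is a full 7 character code that differs
# from the code currently stored for the sequence.
def atc_update_needed?(swissmedic_code, sequence_code)
  return false if swissmedic_code.nil?
  code = swissmedic_code.strip
  code.length == FULL_ATC_LENGTH && code != sequence_code
end

puts atc_update_needed?('G03AC09', 'G03AA09') # differs, full length
puts atc_update_needed?('J06AA', 'J06AA')     # only 5 characters
```

With this rule a sequence holding only B03AC would be updated to B03AC01, while a 5 character Swissmedic code never overwrites an existing entry.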

Updating started with 2014-12-08 17:20:31 +0100: update_atc_codes starting with /var/www/oddb.org/data/xls/Packungen-2014.12.08.xlsx and finished with the following error some 8 minutes later

2014-12-08 17:28:41 +0100: /var/www/oddb.org/src/plugin/swissmedic.rb: 87: atc_code update nr 130 for iksnr 57851/1 atc_code_swissmedic B03AC01 atc_code_sequence B03AC
2014-12-08 17:28:41 CETlog notify Error: swissmedic: start outgoing process ["log"]. Must attach 0 files and 0 parts. 
2014-12-08 17:28:41 CETUtil.send_mail list_and_recipients ["log"]
2014-12-08 17:28:41 CET /var/www/oddb.org/src/util/mail.rb: Configured email using /var/www/oddb.org/etc/oddb.yml @cfg is now "smtp.gmail.com" 587 "ngiger@ywesee.com"
2014-12-08 17:28:43 CET Util.log_and_deliver_mail to=["ngiger@ywesee.com"] subject ch.ODDB.org Report - Error: swissmedic - 12/2014 size 3587 with 0 attachments. Plugin: ODDB::SwissmedicPlugin
Error: DBI::ProgrammingError
Message: 
Backtrace:
/usr/local/lib/ruby/gems/1.9.1/gems/dbd-pg-0.3.9/lib/dbd/pg/statement.rb:62:in `rescue in execute'
/usr/local/lib/ruby/gems/1.9.1/gems/dbd-pg-0.3.9/lib/dbd/pg/statement.rb:37:in `execute'
/usr/local/lib/ruby/gems/1.9.1/gems/dbi-0.4.5/lib/dbi/base_classes/database.rb:96:in `execute'
/usr/local/lib/ruby/gems/1.9.1/gems/dbi-0.4.5/lib/dbi/handles/database.rb:81:in `execute'
/usr/local/lib/ruby/gems/1.9.1/gems/dbi-0.4.5/lib/dbi/handles/database.rb:128:in `select_all'
/usr/local/lib/ruby/gems/1.9.1/gems/odba-1.1.0/lib/odba/connection_pool.rb:39:in `block in method_missing'
/usr/local/lib/ruby/gems/1.9.1/gems/odba-1.1.0/lib/odba/connection_pool.rb:29:in `next_connection'
/usr/local/lib/ruby/gems/1.9.1/gems/odba-1.1.0/lib/odba/connection_pool.rb:38:in `method_missing'
/usr/local/lib/ruby/gems/1.9.1/gems/odba-1.1.0/lib/odba/storage.rb:508:in `restore_collection'
/usr/local/lib/ruby/gems/1.9.1/gems/odba-1.1.0/lib/odba/cache.rb:236:in `fetch_collection'
/usr/local/lib/ruby/gems/1.9.1/gems/odba-1.1.0/lib/odba/cache.rb:632:in `restore'
/usr/local/lib/ruby/gems/1.9.1/gems/odba-1.1.0/lib/odba/cache.rb:318:in `block in fetch_or_restore'
/usr/local/lib/ruby/gems/1.9.1/gems/odba-1.1.0/lib/odba/cache.rb:313:in `call'
/usr/local/lib/ruby/gems/1.9.1/gems/odba-1.1.0/lib/odba/cache.rb:313:in `fetch_or_do'
/usr/local/lib/ruby/gems/1.9.1/gems/odba-1.1.0/lib/odba/cache.rb:317:in `fetch_or_restore'
/usr/local/lib/ruby/gems/1.9.1/gems/odba-1.1.0/lib/odba/cache.rb:65:in `block in bulk_restore'
/usr/local/lib/ruby/gems/1.9.1/gems/odba-1.1.0/lib/odba/cache.rb:62:in `each'
/usr/local/lib/ruby/gems/1.9.1/gems/odba-1.1.0/lib/odba/cache.rb:62:in `bulk_restore'
/usr/local/lib/ruby/gems/1.9.1/gems/odba-1.1.0/lib/odba/cache.rb:56:in `bulk_fetch'
/usr/local/lib/ruby/gems/1.9.1/gems/odba-1.1.0/lib/odba/cache.rb:260:in `fetch_collection'
/usr/local/lib/ruby/gems/1.9.1/gems/odba-1.1.0/lib/odba/cache.rb:632:in `restore'
/usr/local/lib/ruby/gems/1.9.1/gems/odba-1.1.0/lib/odba/cache.rb:318:in `block in fetch_or_restore'
/usr/local/lib/ruby/gems/1.9.1/gems/odba-1.1.0/lib/odba/cache.rb:313:in `call'
/usr/local/lib/ruby/gems/1.9.1/gems/odba-1.1.0/lib/odba/cache.rb:313:in `fetch_or_do'
/usr/local/lib/ruby/gems/1.9.1/gems/odba-1.1.0/lib/odba/cache.rb:317:in `fetch_or_restore'
/usr/local/lib/ruby/gems/1.9.1/gems/odba-1.1.0/lib/odba/cache.rb:640:in `restore_object'
/usr/local/lib/ruby/gems/1.9.1/gems/odba-1.1.0/lib/odba/cache.rb:605:in `load_object'
/usr/local/lib/ruby/gems/1.9.1/gems/odba-1.1.0/lib/odba/cache.rb:226:in `block in fetch'
/usr/local/lib/ruby/gems/1.9.1/gems/odba-1.1.0/lib/odba/cache.rb:313:in `call'
/usr/local/lib/ruby/gems/1.9.1/gems/odba-1.1.0/lib/odba/cache.rb:313:in `fetch_or_do'
/usr/local/lib/ruby/gems/1.9.1/gems/odba-1.1.0/lib/odba/cache.rb:225:in `fetch'
/usr/local/lib/ruby/gems/1.9.1/gems/odba-1.1.0/lib/odba/stub.rb:49:in `odba_receiver'
/usr/local/lib/ruby/gems/1.9.1/gems/odba-1.1.0/lib/odba/stub.rb:112:in `method_missing'
/var/www/oddb.org/src/util/oddbapp.rb:224:in `clean_invoices'
/var/www/oddb.org/src/util/oddbapp.rb:1485:in `clean'
/usr/local/lib/ruby/gems/1.9.1/gems/sbsm-1.2.5/lib/sbsm/drbserver.rb:140:in `block (3 levels) in run_cleaner'
<internal:prelude>:10:in `synchronize'
/usr/local/lib/ruby/gems/1.9.1/gems/sbsm-1.2.5/lib/sbsm/drbserver.rb:139:in `block (2 levels) in run_cleaner'
/usr/local/lib/ruby/gems/1.9.1/gems/sbsm-1.2.5/lib/sbsm/drbserver.rb:137:in `loop'
/usr/local/lib/ruby/gems/1.9.1/gems/sbsm-1.2.5/lib/sbsm/drbserver.rb:137:in `block in run_cleaner'
2014-12-08 17:28:45 CETlog notify Error: swissmedic: sent mail

I am puzzled! Restarting it another time, but this time using odba_isolated_store instead of .odba_store.

Supervise import_doctors on thinpower

Started (in /var/www/oddb.org) sudo -u apache /usr/local/bin/ruby jobs/import_regmed_doctors in screen import_doctors

gem to get atc_codes from ean of swiss drugs

Estimate the effort to build a new gem which reads an input file with EAN codes of authorized Swiss drugs and produces an output CSV file with the ATC code for each EAN.

Zeno suggests using the file "Excel-Version Zugelassene Verpackungen" from https://www.swissmedic.ch/arzneimittel/00156/00221/00222/00230/index.html?lang=de. Therefore the solution is simple.

  • Download the Verpackungen file (approx 5 MB)
  • Load the Packungen via RubyXL
  • read the input file and find the corresponding ATC-Codes.
  • produce the csv
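
The lookup and CSV steps above could look roughly like this. It is a sketch under several assumptions: loading the workbook with RubyXL is left out and the rows are assumed to arrive as plain arrays; the column positions are read off the sample Packungen row earlier in this log; and the EAN is assumed to be built as 7680 + 5 digit registration number + 3 digit package code + EAN-13 check digit. All method names are hypothetical.

```ruby
require 'csv'

# Assumed column positions, taken from the sample Packungen row in this log.
COL_IKSNR = 0
COL_ATC = 5
COL_PACKAGE = 10

# Standard EAN-13 check digit over the first 12 digits (weights 1,3,1,3,...).
def ean13_check_digit(digits12)
  sum = 0
  digits12.each_char.with_index { |c, i| sum += c.to_i * (i.even? ? 1 : 3) }
  (10 - sum % 10) % 10
end

# Assumption: Swiss drug EAN = 7680 + registration number + package code + check digit.
def swiss_ean13(iksnr, package_code)
  body = format('7680%05d%03d', iksnr.to_i, package_code.to_i)
  body + ean13_check_digit(body).to_s
end

# Builds a hash EAN => ATC code from the data rows of the Packungen sheet.
def atc_by_ean(rows)
  rows.each_with_object({}) do |row, index|
    index[swiss_ean13(row[COL_IKSNR], row[COL_PACKAGE])] = row[COL_ATC]
  end
end

# Produces the output CSV; unknown EANs get an empty atc_code column.
def eans_to_csv(eans, index)
  CSV.generate do |csv|
    csv << %w[ean atc_code]
    eans.each { |ean| csv << [ean, index[ean]] }
  end
end

row = ['63179', '1', 'Anouk, Tabletten', 'Spirig HealthCare AG', '09.02.1.',
       'G03AC09', 'Synthetika human', '23.12.13', '23.12.13', '22.12.18', '001']
index = atc_by_ean([row])
puts eans_to_csv([swiss_ean13('63179', '001'), '0000000000000'], index)
```

Keeping the index as a plain hash makes the lookup for each input EAN trivial once the sheet has been read.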

This should be simple to program (2 to 4 h), assuming a current Ruby version. Add 1-2h overhead for getting the gem ready with travis-ci and unit tests. Add 1h per additional Ruby version to support, possibly longer for older versions, e.g. 1.8.

Add 1h if you want to download the XLS file only every 24h or so, or to check against the date provided in the link.

Add 1h-2h if you want to speed up parsing the xlsx (in this case I would produce a YAML file as cache for the Pharma file).
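
The "only every 24h" variant could be decided from the modification time of the local copy. A minimal sketch, assuming the file is kept on disk under some path chosen by the gem; download_needed? is a hypothetical name.

```ruby
ONE_DAY = 24 * 60 * 60 # seconds

# True if there is no local copy yet or if it is older than max_age seconds.
def download_needed?(path, now = Time.now, max_age = ONE_DAY)
  return true unless File.exist?(path)
  now - File.mtime(path) > max_age
end

puts download_needed?('no_such_file.xlsx')
```

Checking against the date given in the link would need an extra request to the Swissmedic page and is not covered here.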

Page last modified on December 08, 2014, at 05:42 PM