20150204-oddb2xml-calc

Summary

  • oddb2xml should correctly handle units and package size
  • Fix typo Zulassungsinhaber5

Commits

Index

Keep in Mind for work to do
  • Fix dojo error http://www.sitepen.com/blog/2012/10/31/debugging-dojo-common-error-messages/#forgot-dom-ready
  • On May 27 I removed the tests for fix_registrations, fix_sequences, fix_compositions and fix_packages from test/test_plugin/swissmedic.rb, as I could not find any references to them in the source code. Did I erroneously remove them when cleaning up the swissmedic import earlier?
  • The whole test for older/newer Packages must be adapted to xlsx. One must compare the rows (e.g. by creating csv files) and do the same stuff in xlsx!
  • Create a gem. Task: the input is a file with EAN codes, the standard output shows each EAN code with its ATC code. The source is Swissmedic Packungen.xlsx or XML.
  • Display 10 recalls, not only those from this month
  • Import via data/medreg_companies.yaml

Fix typo Zulassungsinhaber5

See commit Fix type Zulassungsinhaber5

oddb2xml should correctly handle units and package size

Pushed commit Renamed --galenic to --calc

Travis-ci reminded me that I had introduced an error when adding my first unit test. Must fix it, too. Should be done with commit Fixed handling other options than --calc

To make the analysis of the treated lines easier, I also generate a CSV file. See commit Added csv output for easy manual tests. Generalized spec tests

Pushed two commits to have galenic_forms/groups in csv and xml file. See

This showed me that my initial implementation had also added 7254 entries for tablets, as seen with

grep -c "Tablette(n)" oddb_calc.csv
7254

Checking whether tablets are handled correctly, e.g.

grep 7680655530034 -A10 -B1 oddb_calc.xml 
  <ARTICLE>
    <GTIN>7680655530034</GTIN>
    <NAME>Esomeprazol Axapharm 20 mg, magensaftresistente Filmtabletten</NAME>
    <PKG_SIZE>60</PKG_SIZE>
    <COUNT>60</COUNT>
    <MULTI class="multi"/>
    <MEASURE>0</MEASURE>
    <ADDITION>0</ADDITION>
    <SCALE>1</SCALE>
    <GALENIC_FORM>magensaftresistente Filmtabletten</GALENIC_FORM>
    <GALENIC_GROUP>unbekannt</GALENIC_GROUP>
  </ARTICLE>

The XML is not good: I would expect the field MULTI to contain 1. Fixed with commit Fixed xml-generation and skip-download with --calc. Now MULTI contains 1.

We also have 1477 Kapseln and 212 Suppositorien, but only 4 Zäpfchen. See

grep -i "pfchen" *.csv
oddb_calc.csv:7680463510532,Osa Schmerz- und Fieberzäpfchen,10,10,1,0,0,1,Suppositorien,Unbekannt,unbekannt
oddb_calc.csv:7680463510617,Osa Schmerz- und Fieberzäpfchen,10,10,1,0,0,1,Suppositorien,Unbekannt,unbekannt
oddb_calc.csv:7680622940019,Arilin,1,1,1,0,0,1,Vaginalzäpfchen,Vaginalzäpfchen,unbekannt
oddb_calc.csv:7680622940026,Arilin,2,2,1,0,0,1,Vaginalzäpfchen,Vaginalzäpfchen,unbekannt

Handling is not yet good when Packungen.xlsx contains a French term. E.g. for 00274 1 Cardio-Pulmo-Rénal Sérocytol, suppositoire we have

  <ARTICLE>
    <GTIN>7680002740017</GTIN>
    <NAME>Cardio-Pulmo-Rénal Sérocytol, suppositoire</NAME>
    <PKG_SIZE>3</PKG_SIZE>
    <COUNT>3</COUNT>
    <MULTI>1</MULTI>
    <MEASURE>0</MEASURE>
    <ADDITION>0</ADDITION>
    <SCALE>1</SCALE>
    <GALENIC_FORM>suppositoire</GALENIC_FORM>
    <GALENIC_GROUP>unbekannt</GALENIC_GROUP>
  </ARTICLE>

Adding this as my second test case.

When counting the number of unknown galenic_groups I get 8083, which means that I have already attributed a galenic_form with a known galenic_group to over 50% of the lines in swissmedic_packages.xlsx. Many of them show that the links between galenic_form and galenic_group (inherited from ch.oddb.org) are missing or wrong. But we also have 5854 lines where the galenic_form is unknown. Some sample lines from oddb_calc.csv are

7680654620019,Perindopril-Amlodipine Servier,30,30,1,0,0,1,Tablette(n),Unbekannt,unbekannt
7680653180019,Floramed Beruhigungstee,"20 x 1,3 g",20,1,0,0,1,Beutel,Unbekannt,unbekannt
7680653090011,Veclavam,10 x 10,10,10,0,0,1,Kautabletten,Unbekannt,unbekannt
7680653330018,Eprivalan Pour-On ad us. vet.  solution,250,250,1,0,0,1,ml,Unbekannt,unbekannt
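
Such counts can also be reproduced outside of grep. The following is only a minimal sketch: it parses the four sample lines above with Ruby's standard csv library and counts the lines whose last column is 'unbekannt'. That the last column is the galenic_group and the ninth column the galenic_form is an assumption made for this sketch, and the helper name unknown_group? is hypothetical and not part of oddb2xml.

```ruby
require 'csv'

# The four sample lines quoted above, verbatim.
SAMPLE_LINES = <<~ROWS
  7680654620019,Perindopril-Amlodipine Servier,30,30,1,0,0,1,Tablette(n),Unbekannt,unbekannt
  7680653180019,Floramed Beruhigungstee,"20 x 1,3 g",20,1,0,0,1,Beutel,Unbekannt,unbekannt
  7680653090011,Veclavam,10 x 10,10,10,0,0,1,Kautabletten,Unbekannt,unbekannt
  7680653330018,Eprivalan Pour-On ad us. vet.  solution,250,250,1,0,0,1,ml,Unbekannt,unbekannt
ROWS

# CSV.parse keeps the quoted package size "20 x 1,3 g" as one field,
# which a naive split(',') would not.
rows = CSV.parse(SAMPLE_LINES)

# Assumption: the last column holds the galenic_group and
# 'unbekannt' marks a line without a known group.
def unknown_group?(row)
  row.last.to_s.strip.downcase == 'unbekannt'
end

unknown_count = rows.count { |row| unknown_group?(row) }
puts "#{unknown_count} of #{rows.size} sample lines have an unknown galenic_group"

# Assumption: the ninth column (index 8) holds the galenic_form.
forms = Hash.new(0)
rows.each { |row| forms[row[8]] += 1 }
forms.each { |form, count| puts "#{form}: #{count}" }
```

Replacing SAMPLE_LINES with File.read('oddb_calc.csv') should give the totals for the whole file, provided the file has no header line.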

We must skip the veterinary products, too. Skipping lines with an ATC code starting with 'Q' or with Heilmittelcode == 'Tierarzneimittel' eliminated 1495 products. Fixed with commit Skip veterinary products for --calc, too
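
The two skip criteria can be sketched as a small predicate. This is not the code from the commit: the Row struct, its field names and the method veterinary? are hypothetical, and the ATC codes and Heilmittelcode values in the sample rows are illustrative only.

```ruby
# Hypothetical row structure; oddb2xml reads these values from
# Packungen.xlsx, the field names here are assumptions.
Row = Struct.new(:gtin, :name, :atc_code, :heilmittelcode)

# Value of the Heilmittelcode column that marks veterinary products.
VET_HEILMITTELCODE = 'Tierarzneimittel'

# True when the ATC code starts with 'Q' or the Heilmittelcode
# marks the product as veterinary. Missing values count as "not veterinary".
def veterinary?(row)
  row.atc_code.to_s.strip.upcase.start_with?('Q') ||
    row.heilmittelcode.to_s.strip == VET_HEILMITTELCODE
end

# Illustrative sample rows (ATC codes and Heilmittelcode are assumed values).
rows = [
  Row.new('7680653330018', 'Eprivalan Pour-On ad us. vet.  solution', 'QP54AA04', 'Tierarzneimittel'),
  Row.new('7680655530034', 'Esomeprazol Axapharm 20 mg, magensaftresistente Filmtabletten', 'A02BC05', 'Synthetika human')
]

kept = rows.reject { |row| veterinary?(row) }
puts "kept #{kept.size} of #{rows.size} rows"
```

Checking both columns is deliberate in this sketch: either criterion alone would miss a line where the other column is empty or inconsistent.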

After discussing with Zeno we agreed on the following priorities.

  • Add a field SELLING_UNITS which should say how many individual items could be sold
  • Always use the part after the last ',' of the column Präparatebezeichnung as galenic_group
  • Show on the standard output all names that do not contain a ',' (about 200)
  • Show on the standard output all galenic_groups which are not present in the export of ch.oddb.org (about 740)
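
The second and third priority can be sketched together: take the text after the last ',' of a name, and report the names without any ','. This is a minimal sketch under assumptions; the method name form_from_name is hypothetical and not taken from oddb2xml, and the names are the ones quoted earlier on this page.

```ruby
# Returns the text after the last ',' of a Präparatebezeichnung,
# or nil when the name contains no ','.
def form_from_name(name)
  _, separator, tail = name.to_s.rpartition(',')
  return nil if separator.empty?

  tail.strip
end

names = [
  'Esomeprazol Axapharm 20 mg, magensaftresistente Filmtabletten',
  'Cardio-Pulmo-Rénal Sérocytol, suppositoire',
  'Arilin'
]

# Names without a ',' go to the standard output for manual review.
without_comma = names.select { |name| form_from_name(name).nil? }
without_comma.each { |name| puts "no ',' in name: #{name}" }

names.each do |name|
  form = form_from_name(name)
  puts "#{name} => #{form}" if form
end
```

One caveat of this approach: a name that ends in a decimal dosage written with a comma and carries no galenic form would yield the fraction after the comma, so the output for such names needs a manual check.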

See Attach:oddb_calc_1.txt

Pushed commit Move selling units into Calc class for better testing

Page last modified on February 04, 2015, at 06:28 PM