See commit Fix type Zulassungsinhaber5
Pushed commit Renamed --galenic to --calc
Travis CI reminded me that I had introduced an error when adding my first unit test, so that must be fixed, too. This should be done with commit Fixed handling other options than --calc
To make the analysis of the processed lines easier, I also generate a CSV file. See Added csv output for easy manual tests. Generalized spec tests
Pushed two commits to include galenic_forms/groups in the CSV and XML files.
This showed me that my initial implementation had also added 7254 tablet entries, as seen with
grep -c "Tablette(n)" oddb_calc.csv
7254
Checking whether tablets are handled correctly, e.g.
grep 7680655530034 -A10 -B1 oddb_calc.xml
<ARTICLE>
  <GTIN>7680655530034</GTIN>
  <NAME>Esomeprazol Axapharm 20 mg, magensaftresistente Filmtabletten</NAME>
  <PKG_SIZE>60</PKG_SIZE>
  <COUNT>60</COUNT>
  <MULTI class="multi"/>
  <MEASURE>0</MEASURE>
  <ADDITION>0</ADDITION>
  <SCALE>1</SCALE>
  <GALENIC_FORM>magensaftresistente Filmtabletten</GALENIC_FORM>
  <GALENIC_GROUP>unbekannt</GALENIC_GROUP>
</ARTICLE>
The XML is not correct: I would expect the field MULTI to contain 1. Fixed with commit Fixed xml-generation and skip-download with --calc. Now MULTI contains 1.
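A manual grep only checks one article at a time. The following is a minimal sketch of how the same check could be run over a whole file with REXML. The helper name `articles_with_empty_multi` and the `<ARTICLES>` wrapper element in the usage example are assumptions for illustration; they are not taken from oddb2xml.

```ruby
require 'rexml/document'

# Return the GTINs of all ARTICLE elements whose MULTI element is
# missing or has no text (e.g. <MULTI class="multi"/>).
# Hypothetical helper, not part of oddb2xml.
def articles_with_empty_multi(xml_text)
  doc = REXML::Document.new(xml_text)
  suspicious = REXML::XPath.match(doc, '//ARTICLE').select do |article|
    multi = article.elements['MULTI']
    multi.nil? || multi.text.to_s.strip.empty?
  end
  suspicious.map { |article| article.elements['GTIN']&.text }
end

# Usage with an assumed wrapper element around a single article.
xml = <<~XML
  <ARTICLES>
    <ARTICLE>
      <GTIN>7680655530034</GTIN>
      <MULTI class="multi"/>
    </ARTICLE>
  </ARTICLES>
XML
puts articles_with_empty_multi(xml).inspect
```

An empty result would indicate that every article carries a value in MULTI.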
We also have 1477 Kapseln and 212 Suppositorien, but only 4 Zäpfchen. See
grep -i "pfchen" *.csv
oddb_calc.csv:7680463510532,Osa Schmerz- und Fieberzäpfchen,10,10,1,0,0,1,Suppositorien,Unbekannt,unbekannt
oddb_calc.csv:7680463510617,Osa Schmerz- und Fieberzäpfchen,10,10,1,0,0,1,Suppositorien,Unbekannt,unbekannt
oddb_calc.csv:7680622940019,Arilin,1,1,1,0,0,1,Vaginalzäpfchen,Vaginalzäpfchen,unbekannt
oddb_calc.csv:7680622940026,Arilin,2,2,1,0,0,1,Vaginalzäpfchen,Vaginalzäpfchen,unbekannt
French terms in Packungen.xlsx are not yet handled well. E.g. for 00274 1 Cardio-Pulmo-Rénal Sérocytol, suppositoire
we have
<ARTICLE>
  <GTIN>7680002740017</GTIN>
  <NAME>Cardio-Pulmo-Rénal Sérocytol, suppositoire</NAME>
  <PKG_SIZE>3</PKG_SIZE>
  <COUNT>3</COUNT>
  <MULTI>1</MULTI>
  <MEASURE>0</MEASURE>
  <ADDITION>0</ADDITION>
  <SCALE>1</SCALE>
  <GALENIC_FORM>suppositoire</GALENIC_FORM>
  <GALENIC_GROUP>unbekannt</GALENIC_GROUP>
</ARTICLE>
Adding this as my second test case.
Counting the unknown galenic_groups gives 8083, which means that I have already attributed a galenic_form with a known galenic_group to over 50% of the lines in swissmedic_packages.xlsx. Many of the remaining lines show that the link between galenic_form and galenic_group (inherited from ch.oddb.org) is missing or wrong. But we also have 5854 lines where the galenic_form is unknown. Some sample lines from oddb_calc.csv are
7680654620019,Perindopril-Amlodipine Servier,30,30,1,0,0,1,Tablette(n),Unbekannt,unbekannt
7680653180019,Floramed Beruhigungstee,"20 x 1,3 g",20,1,0,0,1,Beutel,Unbekannt,unbekannt
7680653090011,Veclavam,10 x 10,10,10,0,0,1,Kautabletten,Unbekannt,unbekannt
7680653330018,Eprivalan Pour-On ad us. vet. solution,250,250,1,0,0,1,ml,Unbekannt,unbekannt
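Counts like these can also be produced per column instead of with grep, which avoids accidental matches in the product name and copes with quoted fields such as "20 x 1,3 g". This is only a sketch: the helpers `tally_column` and `unknown_share` are hypothetical, and the column positions (last column for the galenic group, second to last for the galenic form) are an assumption read off the sample lines.

```ruby
require 'csv'

# Count how often each value occurs in one column of CSV text.
# index may be negative to count from the end of the row.
# Hypothetical helper, not part of oddb2xml.
def tally_column(csv_text, index)
  counts = Hash.new(0)
  CSV.parse(csv_text) do |row|
    next if row.empty?
    counts[row[index]] += 1
  end
  counts
end

# Fraction of rows whose value equals the given label (0.0 if there are no rows).
def unknown_share(counts, label = 'unbekannt')
  total = counts.values.sum
  return 0.0 if total.zero?
  counts.fetch(label, 0).to_f / total
end

sample = <<~CSV
  7680654620019,Perindopril-Amlodipine Servier,30,30,1,0,0,1,Tablette(n),Unbekannt,unbekannt
  7680653180019,Floramed Beruhigungstee,"20 x 1,3 g",20,1,0,0,1,Beutel,Unbekannt,unbekannt
CSV

groups = tally_column(sample, -1) # assumed: galenic group column
forms  = tally_column(sample, -2) # assumed: galenic form column
puts groups.inspect
puts forms.inspect
puts unknown_share(groups)
```

For the real file, the text could be read with `File.read('oddb_calc.csv')` and passed to `tally_column` unchanged.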
We must skip the veterinary products, too. Skipping lines with an ATC code starting with 'Q' or with Heilmittelcode == 'Tierarzneimittel' eliminated 1495 products. Fixed with commit Skip veterinary products for --calc, too
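The skip rule can be sketched as a small predicate. This is not the code from the commit: the row representation (a Hash with `:atc_code` and `:heilmittelcode` keys) and the helper names are assumptions, and the veterinary label is passed as a parameter because its exact spelling in the spreadsheet should be checked against the data.

```ruby
# True if the row looks like a veterinary product: its ATC code starts
# with 'Q' or its Heilmittelcode equals the veterinary label.
# Hypothetical helper; rows are assumed to be Hashes.
def veterinary?(row, vet_code: 'Tierarzneimittel')
  row[:atc_code].to_s.start_with?('Q') || row[:heilmittelcode] == vet_code
end

# Return only the rows that are not veterinary products.
def skip_veterinary(rows, vet_code: 'Tierarzneimittel')
  rows.reject { |row| veterinary?(row, vet_code: vet_code) }
end

# Made-up example rows, for illustration only.
rows = [
  { name: 'Human product', atc_code: 'C09BB04', heilmittelcode: 'Synthetika' },
  { name: 'Vet product by ATC', atc_code: 'QP54AA', heilmittelcode: 'Synthetika' },
  { name: 'Vet product by code', atc_code: nil, heilmittelcode: 'Tierarzneimittel' }
]
puts skip_veterinary(rows).map { |row| row[:name] }.inspect
```

Calling `to_s` on the ATC code keeps the check safe for rows where the cell is empty.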
After discussing with Zeno we agreed on the following priorities.
Pushed commit Move selling units into Calc class for better testing