Stuff to do today includes:
TODO
Adapting lib/oddb2xml/calc.rb to use the new library was easy. I had to add some lines to catch errors when the parser could not recognise a line. There were 653 such lines.
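The error catching can be sketched roughly as follows. This is only an illustration, not the code in calc.rb: parse_composition and ParseError are hypothetical stand-ins so that the example runs without any gem. With parslet itself, the exception to rescue would be Parslet::ParseFailed.

```ruby
# Hypothetical stand-in for the real parser error (parslet raises Parslet::ParseFailed).
class ParseError < StandardError; end

# Hypothetical toy parser: it only accepts lines of the form "name qty unit".
def parse_composition(line)
  match = /\A(?<name>[a-z ]+?) (?<qty>\d+(?:\.\d+)?) (?<unit>\S+)\z/.match(line)
  raise ParseError, "failed parsing ==> #{line}" unless match
  { name: match[:name], qty: match[:qty].to_f, unit: match[:unit] }
end

# Parse every line and collect the failures instead of aborting the whole run.
def parse_all(lines)
  parsed = []
  failed = []
  lines.each do |line|
    begin
      parsed << parse_composition(line)
    rescue ParseError
      failed << line
    end
  end
  [parsed, failed]
end

parsed, failed = parse_all(['acidum citricum 5 mg', 'color.: E 127, E 132'])
puts "#{failed.size} line(s) failed parsing"
```

Collecting the failed lines in a list makes it possible to count them and to inspect them later.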
Must fix: the label is_active_agents has to be added for each substance. Added a few lines for this.
Running oddb2xml --calc --skip-download now takes about 430 seconds, i.e. over 7 minutes (previously about 2 minutes). This is probably the price we have to pay for the better parsing.
Running spec/calc_spec.rb signals 53 errors. Here I list the most important ones.
669: failed parsing ==> haemagglutininum influenzae A (H1N1) (Virus-Stamm A/California/7/2009 (H1N1)-like: reassortant virus NYMC X-179A) 15 µg, haemagglutininum influenzae A (H3N2) (Virus-Stamm A/Texas/50/2012 (H3N2)-like: reassortant virus NYMC X-223A) 15 µg, haemagglutininum influenzae B (Virus-Stamm B/Massachusetts/2/2012-like: B/Massachusetts/2/2012) 15 µg, natrii chloridum, kalii chloridum, dinatrii phosphas dihydricus, kalii dihydrogenophosphas, residui: formaldehydum max. 100 µg, octoxinolum-9 max. 500 µg, ovalbuminum max. 0.05 µg, saccharum nihil, neomycinum nihil, aqua ad iniectabilia q.s. ad suspensionem pro 0.5 ml
669: failed parsing ==> lactobacillus acidophilus cryodesiccatus min. 10^9 CFU, bifidobacterium infantis min. 10^9 CFU, color.: E 127, E 132, E 104, excipiens pro capsula
669: failed parsing ==> haemagglutininum influenzae A (H1N1) (Virus-Stamm A/California/7/2009 (H1N1)-like: reassortant virus NYMC X-179A) 15 µg, haemagglutininum influenzae A (H3N2) (Virus-Stamm A/Texas/50/2012 (H3N2)-like: reassortant virus NYMC X-223A) 15 µg, haemagglutininum influenzae B (Virus-Stamm B/Massachusetts/2/2012-like: B/Massachusetts/2/2012) 15 µg, natrii chloridum, kalii chloridum, dinatrii phosphas dihydricus, kalii dihydrogenophosphas, residui: formaldehydum max. 100 µg, octoxinolum-9 max. 500 µg, ovalbuminum max. 0.05 µg, saccharum nihil, neomycinum nihil, aqua ad iniectabilia q.s. ad suspensionem pro 0.5 ml
669: failed parsing ==> I) et II) et III) corresp.: aminoacida 48 g/l, carbohydrata 150 g/l, materia crassa 50 g/l, in emulsione recenter mixta 1250 ml
669: failed parsing ==> lactobacillus acidophilus cryodesiccatus min. 10^9 CFU, bifidobacterium infantis min. 10^9 CFU, color.: E 127, E 132, E 104, excipiens pro capsula
669: failed parsing ==> lactobacillus acidophilus cryodesiccatus min. 10^9 CFU, bifidobacterium infantis min. 10^9 CFU, color.: E 127, E 132, E 104, excipiens pro capsula
669: failed parsing ==> I) et II) et III) corresp.: aminoacida 32 g/l, acetas 32 mmol/l, acidum citricum monohydricum, in emulsione recenter mixta 1250 ml
expected: "Toxoidum Diphtheriae" got: "Toxoidum Diphtheriae 30 U.i., Toxoidum Tetani 40 U.i., Toxoidum Pertussis 25 ?g Et Haemagglutininum
expected: "U.I/ml" got: "U."
expected: "Viscum Album (mali) Recens" got: "Extractum Aquosum Liquidum Fermentatum 0.05 Mg Ex Viscum Album (mali)"
Pushed commit "Use parslet for --calc". Many unit tests still fail, although the generated oddb_calc.xml file looks fine.
First try to fix all the errors where the excipiens contains a dose, which must be used when the excipiens contains something like "pro". Done with commit "Fixed handling unit if pro ml is given in excipiens".
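A minimal sketch of the idea, assuming the dose follows the word "pro" as a number plus a unit. The constant PRO_DOSE and the helper dose_from_excipiens are hypothetical names chosen for this illustration. The actual fix lives in the parslet grammar and may work differently.

```ruby
# Hypothetical pattern: "pro", then a number, then a unit made of letters.
PRO_DOSE = /\bpro\s+(?<qty>\d+(?:\.\d+)?)\s*(?<unit>[[:alpha:]]+)/

# Returns the dose found in the excipiens text, or nil if "pro" has no quantity.
def dose_from_excipiens(excipiens)
  match = PRO_DOSE.match(excipiens)
  return nil unless match
  { qty: match[:qty].to_f, unit: match[:unit] }
end

with_dose    = dose_from_excipiens('aqua ad iniectabilia q.s. ad suspensionem pro 0.5 ml')
without_dose = dose_from_excipiens('excipiens pro capsula')
puts with_dose.inspect
puts without_dose.inspect
```

The second call returns nil because "pro capsula" names a dosage form without a quantity, so no unit can be derived from it.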
Pushed commit "Many small fixes (label, qty, mineralia, residui)". 567 lines still fail parsing; 156 of them contain the ratio keyword.
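Such counts can be obtained with a few lines of Ruby. The helper count_with_keyword is a hypothetical name, and the sample lines are made up for the illustration; they are not taken from the real data.

```ruby
# Hypothetical helper: how many of the failed lines contain a given keyword?
def count_with_keyword(lines, keyword)
  lines.count { |line| line.include?(keyword) }
end

# Made-up sample of lines that failed parsing.
failed_lines = [
  'extractum siccum 50 mg, ratio: 4-7:1',
  'excipiens pro capsula',
  'extractum liquidum, ratio: 1:10'
]

ratio_count = count_with_keyword(failed_lines, 'ratio')
puts "#{ratio_count} of #{failed_lines.size} failed lines contain ratio"
```

Counting by keyword shows which grammar rule is worth fixing first.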
Substance name for 7680656280013 is Vipera Aspis > 1000 Ld50 Mus
and must be corrected to Vipera Aspis > 1000 Ld50 Mus
Also for 7680616310026 the description for the label A is wrong. To be fixed tomorrow.