When parsing the FI of Cipralex 62184, the chapters were not detected correctly.
To correct this, the methods detect_chapter and chapter in ext/fiparse/src/textinfo_hpricot.rb had to be changed. The test cases pass now.
Ran an import on my VM of 32917, 58267, 60916 and 62184 to verify the results.
Cipralex: Remaining errors:
Aussehen der Filmtabletten Tropfen mit 10 mg/ml Escitalopram, 1 ml corresp. 20 Tropfen corresp. 10 mg Escitalopram. Tropfen mit 20 mg/ml Escitalopram, 1 ml corresp. 20 Tropfen corresp. 20 mg Escitalopram und enthält 12% vol. Alkohol. Aussehen und Geschmack der Tropflösung
After checking the yaml file under data/html/fachinfo/de/ I discovered that a lot of fachinfos got parsed with the format :compendium, even though they were actually in the newer :swissmedicinfo format. Changed the recognition to look for Section7000 or section1 in the HTML. Pushed the commit Fix recognition of format :compendium or :swissmedicinfo
Pushed the commit Fixed to recognize Section7000 (e.g. Cipralex)
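The format recognition described above could look roughly like the following sketch. It is not the code from the commits: the method name guess_fachinfo_format and the constant are hypothetical, and it assumes that either marker (Section7000 or section1) indicates the newer :swissmedicinfo format, with :compendium as the fallback.

```ruby
# Hypothetical sketch, not the actual oddb.org implementation.
# Assumption: finding either marker in the HTML means :swissmedicinfo;
# anything else is treated as the older :compendium format.
SWISSMEDICINFO_MARKERS = /Section7000|section1/

def guess_fachinfo_format(html)
  # =~ returns the match position (truthy) or nil (falsy)
  (html =~ SWISSMEDICINFO_MARKERS) ? :swissmedicinfo : :compendium
end

# Usage with made-up HTML fragments
puts guess_fachinfo_format('<p id="section1">Zusammensetzung</p>')     # swissmedicinfo
puts guess_fachinfo_format('<a name="Section7000">Zusammensetzung</a>') # swissmedicinfo
puts guess_fachinfo_format('<div class="paragraph">Zusammensetzung</div>') # compendium
```

Checking for the markers in the HTML itself, rather than relying on how a fachinfo was parsed before, keeps the decision independent of stale entries in the yaml file.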
Now Cipralex has the galenic form displayed after the composition.
Made test_fachinfo_writer.rb pass all unit tests. Added a unit test that Cipralex should have italics (e.g. Hilfsstoffe). These changes broke test_patinfo_hpricot.rb. Investigating. They broke nasivin 45138. Looking at the newly generated nasivin patinfo I noticed various silly spaces like Wenn Si e schwanger sind
(Probably a different problem. I don't think that they are a result of my changes.)
Cipralex displays a lot better, but a few lines, e.g. Aussehen der Filmtabletten, are still not displayed in italics.
Pushed the following commit Some cleanups. Fixed missing line breaks in FI
P.S.: Found that a search for the trademark "nasivin" led to the patinfo http://ch.oddb.org/de/gcc/patinfo/reg/36352/seq/03 being displayed, which contained information about Cardiospermum Salbe Cosmochema.