view · edit · sidebar · attach · print · history

20110214-testcases-fiparse-oddb_org

<< | Index | >>


  1. Update test-cases test_fachinfo_doc_parser.rb

Goal/Estimate
  • test_fachinfo_doc_parser test-cases / 90%
Milestones
  1. test_fachinfo_doc_parser.rb
  2. test_fachinfo_hpricot.rb
  3. test_fachinfo_pdf.rb
  4. test_fiparse.rb
  5. test_indications.rb
  6. test_patinfo_hpricot.rb
  7. test_minifi.rb
Summary
Commits
ToDo Tomorrow
Keep in Mind
  1. no more test_minifi.rb
  2. On Ice

Update test-cases fiparse

Confirm the current status

masa@masa ~/ywesee/oddb.org/ext/fiparse/test $ ruby test_fachinfo_doc_parser.rb 
...
  1) Failure:
test_composition10(TestFachinfoDocParser10) [test_fachinfo_doc_parser.rb:1089]:
<nil> expected to be an instance of
<ODDB::Text::Chapter> but was
<NilClass>.

  2) Failure:
test_galenic_form10(TestFachinfoDocParser10) [test_fachinfo_doc_parser.rb:1102]:
<"Galenische Form und Wirkstoffmengen pro Einheit"> expected but was
<"Tropfen. 1 ml enth\303\244lt: 25 mg Hamameliswasser, 5 mg Augentrosttinktur, 0,9 mg Dexpanthenol.">.

  3) Failure:
test_iksnrs10(TestFachinfoDocParser10) [test_fachinfo_doc_parser.rb:1120]:
<nil> expected to be an instance of
<ODDB::Text::Chapter> but was
<NilClass>.

  4) Failure:
test_indications10(TestFachinfoDocParser10) [test_fachinfo_doc_parser.rb:1113]:
<nil> expected to be an instance of
<ODDB::Text::Chapter> but was
<NilClass>.

  5) Failure:
test_name10(TestFachinfoDocParser10) [test_fachinfo_doc_parser.rb:1082]:
<"Tendro, Augentropfen\n"> expected but was
<"Tendro, Augentropfen\n\t\t\t\t\t\t\t\t\t\tTentan AG\n">.

  6) Failure:
test_registration_owner10(TestFachinfoDocParser10) [test_fachinfo_doc_parser.rb:1130]:
<nil> expected to be an instance of
<ODDB::Text::Chapter> but was
<NilClass>.

  7) Failure:
test_composition12(TestFachinfoDocParser12) [test_fachinfo_doc_parser.rb:1236]:
<nil> expected to be an instance of
<ODDB::Text::Chapter> but was
<NilClass>.

  8) Failure:
test_galenic_form12(TestFachinfoDocParser12) [test_fachinfo_doc_parser.rb:1249]:
<"Forme gal\351nique et quantit\351 de principe actif par unit\351"> expected but was
<"a       Principe actif : extrait de millepertuis (Hyperici herba extractum).">.

  9) Failure:
test_iksnrs12(TestFachinfoDocParser12) [test_fachinfo_doc_parser.rb:1267]:
<nil> expected to be an instance of
<ODDB::Text::Chapter> but was
<NilClass>.

 10) Failure:
test_indications12(TestFachinfoDocParser12) [test_fachinfo_doc_parser.rb:1260]:
<nil> expected to be an instance of
<ODDB::Text::Chapter> but was
<NilClass>.

 11) Failure:
test_name12(TestFachinfoDocParser12) [test_fachinfo_doc_parser.rb:1230]:
<"Yakona-Hypericum\n"> expected but was
<"Yakona-Hypericum\n\t\t\t\t\t\t\t\t\t\tTentan AG\n">.

 12) Failure:
test_registration_owner12(TestFachinfoDocParser12) [test_fachinfo_doc_parser.rb:1277]:
<nil> expected to be an instance of
<ODDB::Text::Chapter> but was
<NilClass>.

 13) Error:
test_date13(TestFachinfoDocParser13):
NoMethodError: undefined method `text' for nil:NilClass
    test_fachinfo_doc_parser.rb:1355:in `test_date13'

 14) Failure:
test_name13(TestFachinfoDocParser13) [test_fachinfo_doc_parser.rb:1303]:
<1> expected but was
<2>.

 15) Failure:
test_company5(TestFachinfoDocParser5) [test_fachinfo_doc_parser.rb:401]:
<"Salmon Pharma"> expected but was
<"Salmon Pharma\n\n\n\nZusammensetzung">.

 16) Failure:
test_composition5(TestFachinfoDocParser5) [test_fachinfo_doc_parser.rb:407]:
<nil> expected to be an instance of
<ODDB::Text::Chapter> but was
<NilClass>.

 17) Failure:
test_company8(TestFachinfoDocParser8) [test_fachinfo_doc_parser.rb:1001]:
<"ARS VITAE"> expected but was
<"ARS VITAE\n\n\n\nComposition">.

 18) Failure:
test_composition8(TestFachinfoDocParser8) [test_fachinfo_doc_parser.rb:898]:
<nil> expected to be an instance of
<ODDB::Text::Chapter> but was
<NilClass>.

87 tests, 359 assertions, 17 failures, 1 errors

Note

  • There are only two types of error
    1. ODDB::Text::Chapter is expected but NilClass
    2. expected result is a little bit different
  • And there is a part where the segmentatio fault occurs in the test-case

Next

  • Focus on the first one, the result becomes NilClass
 18) Failure:
test_composition8(TestFachinfoDocParser8) [test_fachinfo_doc_parser.rb:898]:
<nil> expected to be an instance of
<ODDB::Text::Chapter> but was
<NilClass>.

Experiment

ext/fiparse/src/fachinfo_doc.rb#run_of_text

      def run_of_text(text, char_props)
        text = @iconv.iconv(text)
if writer = @writers[0] and writer.composition
print "@writers[0].composition.class="
p @writers[0].composition.class
end
        text.split(/\v/u).each_with_index { |run, idx|
          if(idx > 0 && @writer)
            @writer.send_line_break
          end
          _run_of_text(run, char_props)
        }
      end

Result

masa@masa ~/ywesee/oddb.org/ext/fiparse/test $ ruby test_fachinfo_doc_parser.rb |more
Loaded suite test_fachinfo_doc_parser
Started
F
Finished in 0.055343 seconds.

  1) Failure:
test_composition8(TestFachinfoDocParser8) [test_fachinfo_doc_parser.rb:898]:
<nil> expected to be an instance of
<ODDB::Text::Chapter> but was
<NilClass>.

1 tests, 1 assertions, 1 failures, 0 errors

Note

  • '@writers[0].composition' is never generated

Consideration

  • @writers[0].compostion is probably set in the FachinfoDocWriter#new_font
  • 'FachinfoDocWriter' class inherits 'FachinfoWriter' class
  • and 'FachinfoWriter' class inherits 'Writer' class
  • I should check 'Writer' and 'FachinfoWriter' classes

Memo

  • There is no test-case for 'FachinfoWriter' class
  • '@composition' looks set in 'FachinfoWriter' class here:

ext/fiparse/src/fachinfo_writer.rb#set_templates

      private
      def set_templates(chapter)
        if(@amzv.nil?)
          case chapter.heading
          when /9\.11\.2001/u, /AMZV/u, /OEM.d/u
            @amzv = chapter
            @templates = named_chapters [
              :composition, :galenic_form, :indications,
              :usage, :contra_indications, :restrictions,
              :interactions, :pregnancy, :driving_ability,
              :unwanted_effects, :overdose, :effects, :switch,
            ]
          when /Galenische\s*Form/iu, /Forme\s*gal.nique/iu
            ## this is an amzv-FI without Declaration, switch to amzv-mode.
            @galenic_form = chapter
            named_chapter(:amzv)
            @templates = named_chapters [
              :indications, :usage, :contra_indications, :restrictions,
              :interactions, :pregnancy, :driving_ability, :unwanted_effects,
              :overdose, :effects, :switch,
            ]
          when /Zusammensetzung/u, /Composition/u, /Principes\s*actifs/u
            @composition = chapter
            @templates = named_chapters [
              :switch,
            ]

Next

  • Where is the 'set_templates' method called from?

Answer

  • The 'set_templates' is called only here in 'FachinfoDocWriter' class

ext/fiparse/fachinfo_doc.rb#new_font

      def new_font(char_props, text=nil)
        if(@chapter_flag)
          @chapter_flag = nil
          if(@chapter == @switch)
            # switch between old and new (2001) FI-Schema
            set_templates(@chapter)

Note

  • Not only 'set_templates' but also '@composition = @chapter' set the @compostion variable

Experiment

ext/fiparse/fachinfo_doc.rb#new_font

      def new_font(char_props, text=nil)
        if(@chapter_flag)
print "@chapter.heading="
p @chapter.heading

Result

masa@masa ~/ywesee/oddb.org/ext/fiparse/test $ ruby test_fachinfo_doc_parser.rb |more
Loaded suite test_fachinfo_doc_parser
Started
@chapter.heading="ARS VITAE\n\n\n\n"
@chapter.heading="Forme gal\303\251nique et quantit\303\251 de principe actif par unit\303\251\n\n"
@chapter.heading="Indications/Possibilit\303\251s d\342\200\231emploi\n\n"
@chapter.heading="Posologie/Mode d\342\200\231emploi\n\n"
@chapter.heading="Contre-indications\n\n"
@chapter.heading="Mises en garde et pr\303\251cautions\n\n"
@chapter.heading="Interactions\n\n"
@chapter.heading="Grossesse/Allaitement\n\n"
@chapter.heading="Effet sur l\342\200\231aptitude \303\240 la conduite et \303\240 l\342\200\231utilisation de machines\n\n"
@chapter.heading="Effets ind\303\251sirables\n\n"
@chapter.heading="Surdosage\n\n"
@chapter.heading="Propri\303\251t\303\251s/Effets\n\n"
@chapter.heading="Pharmacocin\303\251tique\n\n"
@chapter.heading="Donn\303\251es pr\303\251cliniques\n\n"
@chapter.heading="Autres remarques\n\n"
@chapter.heading="Estampille\n\n"
@chapter.heading="Titulaire de l\342\200\231autorisation\n\n"
@chapter.heading="Mise \303\240 jour de l\342\200\231information\n\n"
F
Finished in 0.056092 seconds.

  1) Failure:
test_composition8(TestFachinfoDocParser8) [test_fachinfo_doc_parser.rb:898]:
<nil> expected to be an instance of
<ODDB::Text::Chapter> but was
<NilClass>.

1 tests, 1 assertions, 1 failures, 0 errors

Note

  • '@chapter.heading' has never matched 'Composition'
  • It seems that only the 'Composition' text is skipped

Question

  • When (Where) is '@chapter.heading' set?

Memo

  • The 'new_font' method is called from '_run_of_text' method of 'FachinfoTextHandler' class

Experiment

ext/fiparse/src/fachinfo_doc.rb#new_font

      def new_font(char_props, text=nil)
if @chapter
  print "@chapter.heading="
  p @chapter.heading
end

ext/fiparse/src/fachinfo_doc.rb#_run_of_text

      def _run_of_text(text, char_props)
        #puts sprintf("%2i %s -> %s",char_props.fontsize, same_font?(@current_char_props, char_props), text[0,10])
p text[0..20]
        if(!same_font?(@current_char_props, char_props))
          if(char_props.fontsize >= @cutoff_fontsize \
             && (@writer.nil? || @writer.complete?))
p "FachinfoDocWriter.new"

Result

masa@masa ~/ywesee/oddb.org/ext/fiparse/test $ ruby test_fachinfo_doc_parser.rb 
...
"FachinfoDocWriter.new"
 text="\302\256"
 text=" ASS 300/500"
 text="ARS VITAE"
 text="Composition"
@chapter.heading="ARS VITAE\n\n\n\n"
 text="Principe actif\302\240:"
@chapter.heading="ARS VITAE\n\n\n\nComposition\n\n"
 text=" "
 text="Acidum acetylsalicyli"
@chapter.heading=""
...

Note

  • From this result, it is certain that the setting place of '@chapter.heading' is the one after 'FachinfoDocWriter.new'
  • The 'FachinfoDocWriter.new' is called only once

Memo

  • I guess the '@chapter.heading' is set in somewhere of 'FachinfoDocWriter' class

Experiment

ext/fiparse/src/fachinfo_doc.rb#new_font

          elsif(@chapter && @chapter.sections.size == 1 \
                && @chapter.sections.first.empty?)
            # stay with the previous heading
# Composition
if(text == "Composition")
p "===== Composition"
@chapter_flag = true
@chapter = next_chapter
@section = @chapter.next_section
set_target(@chapter.heading)
end

Result

masa@masa ~/ywesee/oddb.org/ext/fiparse/test $ ruby test_fachinfo_doc_parser.rb 
...
Loaded suite test_fachinfo_doc_parser
Started
"===== FachinfoDocWriter.new ====="
@chapter.heading="ARS VITAE\n\n\n\n"
"===== Composition"
@chapter.heading="Composition\n"
...
Finished in 0.057219 seconds.

1 tests, 1 assertions, 0 failures, 0 errors

Note

  • The test passed
  • Namely, the 'new_font' method is the place where a new chapter is recognized by the kind of font, bold and italic etc.

Consideration

  • It seems as follows:
    • If the bold lines are continued, it is recognized as the same chapter
    • Then the @chapter.heading is NOT updated
  • I should update this 'new_font' method to recognize a new chapter even if the bold lines are continued

Experiment

ext/fiparse/src/fachinfo_doc.rb#new_font

          elsif(@chapter && @chapter.sections.size == 1 \
                && @chapter.sections.first.empty? && check_exception?(text))
                #&& @chapter.sections.first.empty?)
            # stay with the previous heading
...
      def check_exception?(text)
        case text.strip
        when /Composition/u
          false
        else
          true
        end
      end

Note

  • I set exception cases even if the bold lines are continued

Result

masa@masa ~/ywesee/oddb.org/ext/fiparse/test $ ruby test_fachinfo_doc_parser.rb 
Loaded suite test_fachinfo_doc_parser
Started
.
Finished in 0.054611 seconds.

1 tests, 7 assertions, 0 failures, 0 errors

Note

  • Good

Next

  • Check the other NilClass cases

To begin with,

  • Which, test-case or souce code, should I update?
  • I have to check the dates of test-case and source code.

History test_fachinfo_doc_parser.rb

/ext/fiparse/test/test_fachinfo_doc_parser.rb

2008-02-28 Improved FI-Doc-Parser can deal with slightly different...
2008-02-25 Beta-0 Version of the Table-Parser. Still has some...
2008-02-21 Further improvements on the FI-Doc-Parser, less depende...
2008-02-21 Better results when parsing Word-Documents - recognize...
2008-02-20 Activate Word-FI-Parser. This is only a partial commit...
2005-05-26 ywesee ChangeSet 1.2: Import changeset

History of fachinfo_doc.rb

2009-03-17 Migration of ch.oddb.org to UTF-8 encoding
2008-07-18 Ignore Bold table-titles in French doc-files when parsi...
2008-07-18 Add support for fachinfo-doc-files with inconsistent...
2008-07-17 Minor changes to accommodate the structure of some...
2008-02-28 Remove fake AMZV-Heading
2008-02-28 Improved FI-Doc-Parser can deal with slightly different...
2008-02-28 Minor fixes in the Word-Fachinfo-Parser: recognize...
2008-02-25 Beta-0 Version of the Table-Parser. Still has some...
2008-02-21 Further improvements on the FI-Doc-Parser, less depende...
2008-02-21 Better results when parsing Word-Documents - recognize...

The following commit is the update of the chapter heading

So

  • I should update the test-case, not source code
  • but wait
 Add support for fachinfo-doc-files with inconsistent chapter headings (nonbold-zerolength parts)
  • The falilure part 'Composition' is bold
  • The source code may be wrong
  • Also in the other failure and error cases, I should look at both carefully

IMPORTANT

  • We should update BOTH source code and the corresponding test-case at the same time
  • Otherwise, a test-case will be useless, rather a hindrance

Confirmation

  • The source code is wrong
  • The fachinfo_doc.rb is used the manual loading of fachinfo doc file below

The temporary update

          elsif(@chapter && @chapter.sections.size == 1 \
                && @chapter.sections.first.empty? && check_exception?(text))
                #&& @chapter.sections.first.empty?)
            # stay with the previous heading
...
      def check_exception?(text)
        case text.strip
        when /Composition/u
          false
        else
          true
        end
      end

Note

  • This is not a fundamental solution
  • Let's think after checking the other failure and error cases

Next

  • Check the other NilClass cases
masa@masa ~/ywesee/oddb.org/ext/fiparse/test $ ruby test_fachinfo_doc_parser.rb 
Loaded suite test_fachinfo_doc_parser
Started
F
Finished in 0.058998 seconds.

  1) Failure:
test_kinetic3(TestFachinfoDocParser3) [test_fachinfo_doc_parser.rb:282]:
<nil> expected to be an instance of
<ODDB::Text::Chapter> but was
<NilClass>.

1 tests, 1 assertions, 1 failures, 0 errors

ext/fiparse/test/test_fachinfo_doc_parser.rb#test_kinetic3

  def test_kinetic3
    writer = @text_handler.writers.first
    chapter = writer.kinetic
    assert_instance_of(ODDB::Text::Chapter, chapter)
    assert_equal('Pharmakokinetik', chapter.heading)
    assert_equal(4, chapter.sections.size)
    section = chapter.sections.first
    assert_equal("Absorption\n", section.subheading)
  end

Note

  • 'Pharmakokinetik' is not recognized as '@chapter.heading'
  • This does not look the same as the previous case

Experiment

ext/fiparse/src/fachinfo_doc.rb#new_font

      def new_font(char_props, text=nil)
if @kinetic
  print "@kinetic="
  p @kinetic
end

Result

masa@masa ~/ywesee/oddb.org/ext/fiparse/test $ ruby test_fachinfo_doc_parser.rb 
Loaded suite test_fachinfo_doc_parser
Started
F
Finished in 0.059392 seconds.

  1) Failure:
test_kinetic3(TestFachinfoDocParser3) [test_fachinfo_doc_parser.rb:282]:
<nil> expected to be an instance of
<ODDB::Text::Chapter> but was
<NilClass>.

1 tests, 1 assertions, 1 failures, 0 errors

Note

  • The '@kinetic' is never set

Question

  • Where (When) is the '@kinetic' set?

Answer

Memo

  • It means that 'set_templates' in fachinfo_doc.rb is never called

Next

  • Check the condition to call the 'set_templates' in fachinfo_doc.rb

ext/fiparse/src/fachinfo_doc.rb#new_font

      def new_font(char_props, text=nil)
        if(@chapter_flag)
          @chapter_flag = nil
          if(@chapter == @switch)
            # switch between old and new (2001) FI-Schema
            set_templates(@chapter)

Note

  • The condition is as follows
    • @chapter_flag is neither nil nor false
    • @chapter == @switch
  • @chapter.headking is actually set as 'Pharmakokinetik' but @kinetic is never set

Conclusion

  • Skip this test-case

Next

masa@masa ~/ywesee/oddb.org/ext/fiparse/test $ ruby test_fachinfo_doc_parser.rb 
Loaded suite test_fachinfo_doc_parser
Started
F
Finished in 0.058547 seconds.

  1) Failure:
test_usage3(TestFachinfoDocParser3) [test_fachinfo_doc_parser.rb:266]:
<"Dosierung/Anwendung"> expected but was
<"Indikationen/Anwendungsm\303\266glichkeiten">.

1 tests, 2 assertions, 1 failures, 0 errors

Note

  • This is also old format of fachinfo doc, so I will skip this too.

Interim summary

Result

masa@masa ~/ywesee/oddb.org/ext/fiparse/test $ ruby test_fachinfo_doc_parser.rb 
Loaded suite test_fachinfo_doc_parser
Started
...........FFFFFF......FFFFFF.E..F..
>------------------------------------------------------------------------
                   Disktest (10 &#65533;g<
>------------------------------------------------------------------------
                   Disktest (10 µ<
.................................................
Finished in 14.857685 seconds.

  1) Failure:
test_composition10(TestFachinfoDocParser10) [test_fachinfo_doc_parser.rb:1041]:
<nil> expected to be an instance of
<ODDB::Text::Chapter> but was
<NilClass>.

  2) Failure:
test_galenic_form10(TestFachinfoDocParser10) [test_fachinfo_doc_parser.rb:1054]:
<"Galenische Form und Wirkstoffmengen pro Einheit"> expected but was
<"Tropfen. 1 ml enth\303\244lt: 25 mg Hamameliswasser, 5 mg Augentrosttinktur, 0,9 mg Dexpanthenol.">.

  3) Failure:
test_iksnrs10(TestFachinfoDocParser10) [test_fachinfo_doc_parser.rb:1072]:
<nil> expected to be an instance of
<ODDB::Text::Chapter> but was
<NilClass>.

  4) Failure:
test_indications10(TestFachinfoDocParser10) [test_fachinfo_doc_parser.rb:1065]:
<nil> expected to be an instance of
<ODDB::Text::Chapter> but was
<NilClass>.

  5) Failure:
test_name10(TestFachinfoDocParser10) [test_fachinfo_doc_parser.rb:1034]:
<"Tendro, Augentropfen\n"> expected but was
<"Tendro, Augentropfen\n\t\t\t\t\t\t\t\t\t\tTentan AG\n">.

  6) Failure:
test_registration_owner10(TestFachinfoDocParser10) [test_fachinfo_doc_parser.rb:1082]:
<nil> expected to be an instance of
<ODDB::Text::Chapter> but was
<NilClass>.

  7) Failure:
test_composition12(TestFachinfoDocParser12) [test_fachinfo_doc_parser.rb:1185]:
<nil> expected to be an instance of
<ODDB::Text::Chapter> but was
<NilClass>.

  8) Failure:
test_galenic_form12(TestFachinfoDocParser12) [test_fachinfo_doc_parser.rb:1198]:
<"Forme gal\351nique et quantit\351 de principe actif par unit\351"> expected but was
<"a       Principe actif : extrait de millepertuis (Hyperici herba extractum).">.

  9) Failure:
test_iksnrs12(TestFachinfoDocParser12) [test_fachinfo_doc_parser.rb:1216]:
<nil> expected to be an instance of
<ODDB::Text::Chapter> but was
<NilClass>.

 10) Failure:
test_indications12(TestFachinfoDocParser12) [test_fachinfo_doc_parser.rb:1209]:
<nil> expected to be an instance of
<ODDB::Text::Chapter> but was
<NilClass>.

 11) Failure:
test_name12(TestFachinfoDocParser12) [test_fachinfo_doc_parser.rb:1180]:
<"Yakona-Hypericum\n"> expected but was
<"Yakona-Hypericum\n\t\t\t\t\t\t\t\t\t\tTentan AG\n">.

 12) Failure:
test_registration_owner12(TestFachinfoDocParser12) [test_fachinfo_doc_parser.rb:1226]:
<nil> expected to be an instance of
<ODDB::Text::Chapter> but was
<NilClass>.

 13) Error:
test_date13(TestFachinfoDocParser13):
NoMethodError: undefined method `text' for nil:NilClass
    test_fachinfo_doc_parser.rb:1303:in `test_date13'

 14) Failure:
test_name13(TestFachinfoDocParser13) [test_fachinfo_doc_parser.rb:1251]:
<1> expected but was
<2>.

85 tests, 360 assertions, 13 failures, 1 errors
view · edit · sidebar · attach · print · history
Page last modified on February 14, 2011, at 04:57 PM