

20150616-oddb2xml-o-fails

Summary

  • Running oddb2xml -o fails

Commits

On VCR branch

Index

Keep in Mind for work to do
  • Fix dojo error http://www.sitepen.com/blog/2012/10/31/debugging-dojo-common-error-messages/#forgot-dom-ready
  • On May 27 I removed the tests for fix_registrations, fix_sequences, fix_compositions and fix_packages from test/test_plugin/swissmedic.rb, as I could not find any references to them in the source code. Did I erroneously remove stuff when cleaning up the swissmedic import earlier?
  • The whole test for older/newer Packages must be adapted to xlsx. One must compare the rows (e.g. by creating csv files) and do the same stuff in xlsx!
  • Create gem: task: input = file with ean-codes, standard output shows ean-codes + atc-code. Source is Swissmedic Packungen.xlsx or XML.
  • Import via data/medreg_companies.yaml
  • Fix problem with radioactivatum 99m-technetio when parsing Wirkstoffe
  • Fix galenic_forms when parsing swissmedic.xlsx
  • Cleanup generic_type. Replace it everywhere by sl_generic_type and adapt code accordingly.
  • Get updated ATC-codes from EPha for oddb.org, too.
  • Display new fields (LABEL, MORE_INFO, CORRESP) for compositions in oddb.org.
  • Use refdatabase for oddb.org, too.

Use vcr with oddb2xml

Merging changes from oddb2xml -o. Discovered after quite a few trials that VCR is not thread-safe. Therefore many spec tests failed. Disabled threading (running takes a little bit longer).
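
The change boils down to running the jobs one after another instead of in parallel. A minimal sketch of that idea, not the actual oddb2xml code: run_jobs, use_threads and the placeholder jobs are hypothetical names made up for illustration.

```ruby
# Hypothetical helper (not oddb2xml's actual code): runs the given jobs
# either in parallel threads or one after another.
def run_jobs(jobs, use_threads: false)
  if use_threads
    # Parallel: faster, but the jobs share VCR at the same time.
    jobs.map { |job| Thread.new { job.call } }.map(&:value)
  else
    # Sequential: a bit slower, but only one job uses VCR at a time.
    jobs.map(&:call)
  end
end

# Placeholder jobs standing in for the real downloads.
jobs = [-> { :first }, -> { :second }, -> { :third }]
results = run_jobs(jobs, use_threads: false)
```

Both branches return the results in job order, so callers should not need to change when threading is switched off.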

Pushed commits "Avoid using threads to fix problems when running rspec", "Merge branch 'master' into vcr" and "Made all spec-tests (except for builder) pass again".

Must fix builder_spec.rb for the changed items.

Running oddb2xml -o fails

Banging my head on how to create the same output as before with the new nokogiri version. Our output should look like this:

  <KMP MONTYPE="fi" LANG="DE" DT="">
    <name>
      <p>3TC®</p>
    </name>
    <owner>
      <p>ViiV Healthcare GmbH</p>
    </owner>
    <monid>53663</monid>
    <paragraph><![CDATA[<title><p>3TC®</p></title><div class="paragraph" id="Section7000">
    <div class="absTitle">Zusammensetzung</div>

Tried

pry(#<Oddb2xml::Builder>)> puts /.+zusammensetzung/im.match(info[:paragraph].root)
 (169Er) Erbiumcitrat CIS bio international Kolloidale Suspension zu lokalen InjektionERMM-1 Zusammensetzung

pry(#<Oddb2xml::Builder>)> puts /.+zusammensetzung/im.match(info[:paragraph].root.to_s)
<html><body><div xmlns="http://www.w3.org/1999/xhtml">
<p class="s2"> </p>
<p class="s6" id="section1"><span class="s3"><span>(</span></span><sup class="s4"><span>169</span></sup><span class="s3"><span>Er)</span></span><span class="s5"><span> </span></span><span class="s5"><span>Erbiumcitrat CIS bio international</span></span><span class="s3"><span> </span></span></p>
<p class="s6"><span class="s7"><span>Kolloidale Suspension zu lokalen Injektion</span></span></p>
<p class="s8"><span class="s7"><span>ERMM-1</span></span></p>
<p class="s2"> </p>
<p class="s9" id="section2"><span class="s5"><span>Zusammensetzung
=> nil

pry(#<Oddb2xml::Builder>)> puts /.+zusammensetzung/im.match(info[:paragraph].to_html)
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<?xml version="1.0" encoding="utf-8"?><html><body><div xmlns="http://www.w3.org/1999/xhtml">
<p class="s2"> </p>
<p class="s6" id="section1"><span class="s3"><span>(</span></span><sup class="s4"><span>169</span></sup><span class="s3"><span>Er)</span></span><span class="s5"><span> </span></span><span class="s5"><span>Erbiumcitrat CIS bio international</span></span><span class="s3"><span> </span></span></p>
<p class="s6"><span class="s7"><span>Kolloidale Suspension zu lokalen Injektion</span></span></p>
<p class="s8"><span class="s7"><span>ERMM-1</span></span></p>
<p class="s2"> </p>
<p class="s9" id="section2"><span class="s5"><span>Zusammensetzung
=> nil

Same output if using to_html instead of to_s
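
A note on the pry sessions above: puts always returns nil, so the "=> nil" is the return value of puts and not a failed match. The following is a minimal sketch of what the /im flags do, using a shortened stand-in string rather than the real fachinfo (the last paragraph is a placeholder). The partition call is only one possible way to get at the text after the heading, not what the builder does.

```ruby
# Shortened stand-in for the fachinfo HTML; the last paragraph is a placeholder.
sample = <<~HTML
  <p id="section1">Erbiumcitrat CIS bio international</p>
  <p id="section2">Zusammensetzung</p>
  <p>placeholder text</p>
HTML

# /m lets "." match newlines and /i ignores case, so the greedy ".+"
# swallows everything from the start through the last "Zusammensetzung".
greedy = /.+zusammensetzung/im.match(sample)

# String#partition splits at the first match and keeps what follows it.
before, heading, after = sample.partition(/zusammensetzung/i)
```

Here greedy[0] spans both lines up to the heading, while after holds everything that follows it.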

Now using nokogiri 1.5.11. I could not find out which version we used in 2013, as the nokogiri version did not show up in the gemspec or Gemfile.lock. Finally found out how to save the output without the enclosing declaration via http://stackoverflow.com/questions/8218711/print-an-xml-document-without-the-xml-header-line-at-the-top.

require 'nokogiri'
doc = Nokogiri.XML('<hello world="true" />')
# NO_DECLARATION leaves out the declaration line at the top of the output
puts doc.to_html(:save_with => Nokogiri::XML::Node::SaveOptions::NO_DECLARATION)
<hello world="true"></hello>

Using Nokogiri::HTML.fragment(pac.content.force_encoding('UTF-8')) to extract the paragraph resolves part of this problem. Also added the style information for each fachinfo to the generated xml. All spec tests pass again. Now running test_options.rb.
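
What the force_encoding('UTF-8') step does can be sketched in plain Ruby. This assumes the content arrives as valid UTF-8 bytes that are tagged as binary, which I did not verify for pac.content. The Nokogiri call itself is left out so the snippet has no dependencies.

```ruby
# Assumption: the bytes are valid UTF-8 but tagged as binary (ASCII-8BIT).
raw = "3TC\xC2\xAE".b

# force_encoding only relabels the string; the bytes are not converted.
text = raw.dup.force_encoding('UTF-8')

puts raw.length            # 5 (counted in bytes)
puts text.length           # 4 (counted in characters)
puts text.valid_encoding?  # true
```

If the bytes were not valid UTF-8, valid_encoding? would return false, which makes it a cheap check to run before handing the string to the parser.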

Pushed commits "Re-enable some checks for extractor" and "Fix running with -o option for fachinfo".

Page last modified on June 16, 2015, at 04:23 PM