could not connect to www.swissreg.ch: #<Net::HTTPInternalServerError:0x007f8a7d69bb58>
---
I tried the ydocx gem to convert Attach:sinovial_de.txt. This example produces the following exception:
/opt/src/ydocx/lib/ydocx/templates/fachinfo.rb:109:in `parse_title': undefined method `inner_text' for nil:NilClass (NoMethodError)
  from /opt/src/ydocx/lib/ydocx/templates/fachinfo.rb:130:in `parse_block'
  from /opt/src/ydocx/lib/ydocx/parser.rb:219:in `parse_paragraph'
  from /opt/src/ydocx/lib/ydocx/parser.rb:37:in `block in parse'
  from /home/niklaus/.rvm/gems/ruby-1.9.3-p484/gems/nokogiri-1.6.1/lib/nokogiri/xml/node_set.rb:237:in `block in each'
  from /home/niklaus/.rvm/gems/ruby-1.9.3-p484/gems/nokogiri-1.6.1/lib/nokogiri/xml/node_set.rb:236:in `upto'
  from /home/niklaus/.rvm/gems/ruby-1.9.3-p484/gems/nokogiri-1.6.1/lib/nokogiri/xml/node_set.rb:236:in `each'
  from /opt/src/ydocx/lib/ydocx/parser.rb:30:in `map'
  from /opt/src/ydocx/lib/ydocx/parser.rb:30:in `parse'
  from /opt/src/ydocx/lib/ydocx/document.rb:126:in `block in read'
  from /opt/src/ydocx/lib/ydocx/parser.rb:24:in `initialize'
  from /opt/src/ydocx/lib/ydocx/document.rb:124:in `new'
  from /opt/src/ydocx/lib/ydocx/document.rb:124:in `read'
  from /opt/src/ydocx/lib/ydocx/document.rb:33:in `initialize'
  from /opt/src/ydocx/lib/ydocx/document.rb:19:in `new'
  from /opt/src/ydocx/lib/ydocx/document.rb:19:in `open'
  from /opt/src/ydocx/lib/ydocx/command.rb:99:in `run'
  from bin/docx2xml:12:in `<main>'
I will fork the gem, add a Gemfile and a .travis.yml, then add tests with this example.
Found out that the transformation works when passing the parameter --plain (the document is neither a fachinfo nor a patinfo).
Pushed the following commits:
Agreed on the following requirements with Zeno:
Looking at the examples provided by Zeno, I see that I still need some more parsing to get a nice result. E.g. I would like to be able to look up (as in fachinfo) registration('62630').indication. Using 'ydocx' as it is, I get results like
inhalt.keys => ["Titel", "titel", "zusammens", "ind_anwmoegl", "dos_anw", "kontraind", "warnhinw", "interakt", "unerwwirkungen", "eigensch", "sonstige_h", "packungen", "hersteller"]
My solution worked fine for Sinovial_0.8_DE.docx, but failed miserably with Sinovial_0.8_DE.docx, as the inner structure was quite different even though both files look quite similar!
Comparing the XML output I find far fewer differences. Therefore I think the correct way is to first create an XML file and extract the chapter info afterwards. With the following loops I can iterate over the parts of the XML document that interest me:
require 'nokogiri'

doc = Nokogiri::XML(open('/opt/src/ydocx/spec/data/Sinovial_DE.xml'))

# Print the heading and the text of every chapter that has a heading
doc.xpath('//chapters/chapter').each do |x|
  next if x.xpath('heading').size == 0
  puts "\n\n" + x.xpath('heading').text
  puts x.xpath('paragraph').text
end

# name: bold text of the paragraph directly below chapters
name = doc.xpath('//chapters/paragraph/bold').text

# Collect one 13-digit number (EAN13) per paragraph of the "Packungen" chapter
ean13 = []
doc.xpath('//chapters/chapter').each do |x|
  next unless x.xpath('heading').text.match(/Packungen/)
  x.xpath('paragraph').each do |p|
    m = p.text.match(/(\d{13})($|\s|\W)/)
    ean13 << m[1] if m
  end
end
ean13
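One detail of the pattern used above is worth illustrating: `(\d{13})($|\s|\W)` only checks what follows the 13 digits, so it also picks 13 digits out of the tail of a longer number. The sketch below works on plain strings, without Nokogiri, and compares it with a stricter variant. The constant names, the helper `extract_ean13` and the sample lines are made up for illustration and are not taken from any real document or from ydocx.

```ruby
# Sketch only: sample lines and numbers below are invented.
EAN13_LOOSE  = /(\d{13})($|\s|\W)/ # pattern from the loop above
EAN13_STRICT = /\b\d{13}\b/        # assumption: require a word boundary on both sides

# Returns every stand-alone 13-digit number found in the given lines.
def extract_ean13(lines)
  lines.flat_map { |line| line.scan(EAN13_STRICT) }
end

sample = [
  'Packung A: 7680123456789',   # exactly 13 digits
  'Lot 76801234567890',         # 14 digits, not an EAN13
  'Keine Nummer in dieser Zeile'
]

p extract_ean13(sample)
# The loose pattern still finds 13 digits inside the 14-digit run:
p sample[1].match(EAN13_LOOSE)[1]
```

Whether the stricter variant is needed depends on whether the "Packungen" chapter can contain longer digit runs at all; if it cannot, the original pattern is sufficient.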
Searching by Anwendung (indications), e.g. for Konjunktivitis, does not report all occurrences in the section "Anwendung" of the Fachinfo. Probably the index is corrupted or not set up correctly. Analysing the problem with bin/admin:
indications.find_all{ |ind| /Konjunktivitis/.match(ind.search_text) }.size
-> 6
indications.find_all{ |ind| /Konjunktivitis/.match(ind.search_text) }.first.search_text
-> Symptomatische Behandlung der saisonalen allergischen Konjunktivitis und Rhinokonjunktivitis
indications.find_all{ |ind| /Konjunktivitis/.match(ind.search_text) }.first.registrations.first.iksnr
-> 56248
ch.oddb> @all = indications.find_all{ |ind| /Konjunktivitis/.match(ind.search_text) }
-> Array
@all.collect{|indication| indication.registrations.first.iksnr }
-> ["56248", "52803", "53308", "52044", "56310", "47036"]
@all2 = registrations.find_all{ |reg| reg[1].fachinfo and /Konjunktivitis/.match(reg[1].fachinfo.search_text(:de)) }
-> IndikationenAnwendungsmoeglichkeiten Blepharitis Hordeolum Bakterielle Keratitis und Konjunktivitis
@all2.size
-> 266
@all2.first[1].iksnr
-> 13131
@all2.first[1].fachinfo.search_text(:de)
-> IndikationenAnwendungsmoeglichkeiten Blepharitis Hordeolum Bakterielle Keratitis und Konjunktivitis
These few lines demonstrate that going through indications (which I suspect rely on the index) we find only 6 occurrences of Konjunktivitis, whereas scanning the German Fachinfo search text of all registrations we find 266 occurrences.
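To see which registrations are affected, the two result sets can be compared by IKSNR. The following is a minimal sketch of that comparison, assuming simplified stand-ins for the ODDB objects: the `Indication` and `Registration` structs and all three helper methods are hypothetical, and the sample data is invented apart from the two IKSNRs quoted above.

```ruby
require 'set'

# Hypothetical stand-ins; the real ODDB classes are richer than this.
Indication   = Struct.new(:search_text, :iksnrs)
Registration = Struct.new(:iksnr, :fachinfo_text)

# IKSNRs reachable through indications whose text matches the pattern.
def iksnrs_via_indications(indications, pattern)
  indications.select { |ind| pattern.match(ind.search_text) }
             .flat_map(&:iksnrs).to_set
end

# IKSNRs of registrations whose Fachinfo text matches the pattern.
def iksnrs_via_fachinfo(registrations, pattern)
  registrations.select { |reg| reg.fachinfo_text && pattern.match(reg.fachinfo_text) }
               .map(&:iksnr).to_set
end

# Registrations found by the full-text scan but not through indications.
def missing_from_indications(indications, registrations, pattern)
  iksnrs_via_fachinfo(registrations, pattern) -
    iksnrs_via_indications(indications, pattern)
end

indications = [
  Indication.new('Symptomatische Behandlung der allergischen Konjunktivitis', ['56248']),
  Indication.new('Hypertonie', ['00001'])
]
registrations = [
  Registration.new('56248', 'Allergische Konjunktivitis'),
  Registration.new('13131', 'Bakterielle Keratitis und Konjunktivitis'),
  Registration.new('00002', nil) # registration without a Fachinfo
]

p missing_from_indications(indications, registrations, /Konjunktivitis/).to_a
```

Unlike the bin/admin session, which only looks at the first registration of each indication, this sketch collects all IKSNRs per indication, so the difference lists exactly the registrations that the indication path cannot reach.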
Added some debug output and ran jobs/import_swissmedic on oddb-ci2. It gets further but fails with the following error:
Plugin: ODDB::SwissregPlugin
Error: NameError
Message: undefined local variable or method `patents' for #<ODDB::SwissregPlugin:0x007fc8eb3db740>
Backtrace:
  /var/www/oddb.org/src/plugin/swissreg.rb:61:in `update_registrations'
  /var/www/oddb.org/src/plugin/swissreg.rb:55:in `block in update_news'
  /var/www/oddb.org/src/plugin/swissreg.rb:52:in `each_key'
  /var/www/oddb.org/src/plugin/swissreg.rb:52:in `update_news'
  /var/www/oddb.org/src/util/updater.rb:505:in `update_immediate'
  /var/www/oddb.org/src/util/updater.rb:418:in `update_swissreg_news'
  /var/www/oddb.org/src/util/updater.rb:406:in `update_swissmedic_followers'
  jobs/import_swissmedic:15:in `block in <module:Util>'
  /var/www/oddb.org/src/util/job.rb:40:in `call'
  /var/www/oddb.org/src/util/job.rb:40:in `run'
  jobs/import_swissmedic:12:in `<module:Util>'
  jobs/import_swissmedic:11:in `<module:ODDB>'
  jobs/import_swissmedic:10:in `<main>'
The error comes from the debug statement. Corrected it and restarted the import. Now I get the following error:
Plugin: ODDB::SwissregPlugin
Error: RuntimeError
Message: could not connect to www.swissreg.ch https://www.swissreg.ch/srclient/faces/jsp/spc/sr3.jsp: #<Net::HTTPInternalServerError:0x007f13e5613668>
Backtrace:
  (druby://localhost:10007) /var/www/oddb.org/src/util/http.rb:98:in `post'
  (druby://localhost:10007) /var/www/oddb.org/ext/swissreg/src/session.rb:203:in `post'
  (druby://localhost:10007) /var/www/oddb.org/ext/swissreg/src/session.rb:196:in `get_result_list'
  (druby://localhost:10007) /var/www/oddb.org/ext/swissreg/src/swissreg.rb:12:in `search'
  (druby://localhost:10007) /usr/lib64/ruby/1.9.1/drb/drb.rb:1548:in `perform_without_block'
  (druby://localhost:10007) /usr/lib64/ruby/1.9.1/drb/drb.rb:1508:in `perform'
  (druby://localhost:10007) /usr/lib64/ruby/1.9.1/drb/drb.rb:1586:in `block (2 levels) in main_loop'
  (druby://localhost:10007) /usr/lib64/ruby/1.9.1/drb/drb.rb:1582:in `loop'
  (druby://localhost:10007) /usr/lib64/ruby/1.9.1/drb/drb.rb:1582:in `block in main_loop'
  /var/www/oddb.org/src/plugin/swissreg.rb:62:in `update_registrations'
  /var/www/oddb.org/src/plugin/swissreg.rb:55:in `block in update_news'
  /var/www/oddb.org/src/plugin/swissreg.rb:52:in `each_key'
  /var/www/oddb.org/src/plugin/swissreg.rb:52:in `update_news'
  /var/www/oddb.org/src/util/updater.rb:505:in `update_immediate'
  /var/www/oddb.org/src/util/updater.rb:418:in `update_swissreg_news'
  /var/www/oddb.org/src/util/updater.rb:406:in `update_swissmedic_followers'
  jobs/import_swissmedic:15:in `block in <module:Util>'
  /var/www/oddb.org/src/util/job.rb:40:in `call'
  /var/www/oddb.org/src/util/job.rb:40:in `run'
  jobs/import_swissmedic:12:in `<module:Util>'
  jobs/import_swissmedic:11:in `<module:ODDB>'
  jobs/import_swissmedic:10:in `<main>'
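Note that Net::HTTPInternalServerError is the Ruby response class for HTTP status 500, so the connection to www.swissreg.ch was established and the server failed while handling the POST. A minimal sketch of how such responses could be told apart is shown below; the helper `describe_response` is hypothetical and not part of oddb.org.

```ruby
require 'net/http'

# Hypothetical helper: maps a Net::HTTP response to a short diagnosis,
# so that a 5xx answer is not reported as a connection problem.
def describe_response(response)
  case response
  when Net::HTTPSuccess     then :ok
  when Net::HTTPRedirection then :redirect
  when Net::HTTPClientError then :client_error # 4xx: the request was rejected
  when Net::HTTPServerError then :server_error # 5xx: connected, server failed
  else :unknown
  end
end

# Responses can be built by hand for a quick check without network access.
failed = Net::HTTPInternalServerError.new('1.1', '500', 'Internal Server Error')
puts describe_response(failed)
```

A server error of this kind usually cannot be fixed on the client side; whether a later retry helps depends on the remote service.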
Before running jobs/import_swissmedic I manually removed all files containing latest in their name under data/xls/. This broke the import, as a debug message tried to get the size of the non-existent files.
Pushed commit "Avoid error when latest does not exist".
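The commit itself is not reproduced here. As a rough illustration of this kind of guard, a debug statement can check for the file before asking for its size. The helper `size_for_log` and the file name below are assumptions made for this sketch, not the actual oddb.org code.

```ruby
# Hypothetical helper: returns a printable size, or 'missing' when the
# file does not exist, instead of letting File.size raise Errno::ENOENT.
def size_for_log(path)
  File.exist?(path) ? "#{File.size(path)} bytes" : 'missing'
end

# Assumed file name, used only for illustration.
path = File.join('data', 'xls', 'example-latest.xls')
puts "#{path}: #{size_for_log(path)}"
```

Ruby also offers `File.size?`, which returns nil for a missing file, but it returns nil for an empty file as well, so the explicit check keeps the two cases apart in the log.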