$ sh setup.de.oddb.org
$ sh setup.de.postgresql.org
masa@masa ~/ywesee/de.oddb.org/jobs $ ruby import_gkv
*** /home/masa/ywesee/de.oddb.org/lib/oddb/html/view/drugs/package.rb:373: warning: parenthesize argument(s) for future version
No error output
It looks like the following line in /lib/oddb/util/updater.rb does not work:
if url = importer.latest_url(Mechanize.new, opts)
Notes
def latest_url agent, opts={}
  host = 'https://www.gkv-spitzenverband.de'
  url = '/Befreiungsliste_Arzneimittel_Versicherte.gkvnet'
  page = agent.get host + url
  if link = (page/'span[\@class=pdf]/a').first
    host + link.attributes["href"]
  end
end
if link = (page/'span[\@class=pdf]/a').first
evaluates to nil, which is why import_gkv does not work.
Note
Explain
Test
require 'mechanize'

def setup_page(url, html, mech=nil)
  response = {'content-type' => 'text/html'}
  Mechanize::Page.new(URI.parse(url), response, html, 200, mech)
end

agent = Mechanize.new
url = "http://www.hogehoge.com"
html = "<html><body><span class=pdf><a href='hogehoge'>abc</a></body></html>"
page = setup_page url, html, agent
p page.search("span").inner_text
(page/'span[@class=pdf]/a').each do |item|
  p item.inner_text
end
link = (page/'span[@class=pdf]/a').first
p link.attributes["href"]
Result
masa@masa ~/work $ ruby test.rb
"abc"
"abc"
#<Nokogiri::XML::Attr:0x3fb089168068 name="href" value="hogehoge">
https://www.gkv-spitzenverband.de/Befreiungsliste_Arzneimittel_Versicherte.gkvnet
There is no <span class="pdf"> tag in the page above, but there is an <a ... class="pdf"> tag.
Test
require 'mechanize'

agent = Mechanize.new
url = "https://www.gkv-spitzenverband.de/Befreiungsliste_Arzneimittel_Versicherte.gkvnet"
page = agent.get url
link = (page/'a[@class=pdf]').first
p link.attributes["href"]
Result
masa@masa ~/work $ ruby test.rb
#<Nokogiri::XML::Attr:0x3fa8944058cc name="href" value="/upload/Zuzahlungsbefreit_sort_Name_100901_14383.pdf">
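The test above suggests that only the selector in latest_url has to change. The actual commit is not shown in this log, so the following is only a sketch of what the updated method might look like. Reading the attribute with link['href'] (which returns a plain String) instead of link.attributes["href"] is my assumption, not something confirmed by the source.

```ruby
# Sketch of an updated latest_url: look for <a class="pdf"> directly
# instead of <span class="pdf"><a>. Not the committed code.
def latest_url(agent, opts = {})
  host = 'https://www.gkv-spitzenverband.de'
  url  = '/Befreiungsliste_Arzneimittel_Versicherte.gkvnet'
  page = agent.get(host + url)
  # first returns nil when no matching link exists, so the method
  # returns nil in that case, as the original did.
  if link = (page / 'a[@class=pdf]').first
    # Assumption: link['href'] yields the attribute value as a String.
    host + link['href']
  end
end

# Usage (needs network access and the mechanize gem):
#   require 'mechanize'
#   puts latest_url(Mechanize.new)
```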
Summary
Confirm
masa@masa ~/ywesee/de.oddb.org $ ruby test/import/test_gkv.rb
Loaded suite test/import/test_gkv
Started
........
Finished in 0.0906060000000001 seconds.

8 tests, 43 assertions, 0 failures, 0 errors
Commit
Updated Gkv#latest_url and its test case.
import_gkv now appears to run, but I got an error:
WARNING: nonstandard use of \' in a string literal
ZEILE 2: VALUES (214062, 'pfizer ltd. \'', 66064)
                         ^
TIP: Use '' to write quotes in strings, or use the escape string syntax (E'...').
WARNING: nonstandard use of \' in a string literal
ZEILE 2: VALUES (214062, 'pfizer ltd. \'', 133820)
                         ^
TIP: Use '' to write quotes in strings, or use the escape string syntax (E'...').
WARNING: nonstandard use of \' in a string literal
ZEILE 2: VALUES (214062, 'pfizer ltd. \'', 163939)
                         ^
TIP: Use '' to write quotes in strings, or use the escape string syntax (E'...').
WARNING: nonstandard use of \' in a string literal
ZEILE 2: VALUES (214062, 'pfizer ltd. \'', 168431)
                         ^
TIP: Use '' to write quotes in strings, or use the escape string syntax (E'...').
WARNING: nonstandard use of \' in a string literal
ZEILE 2: VALUES (214062, 'pfizer ltd. \'', 150549)
                         ^
TIP: Use '' to write quotes in strings, or use the escape string syntax (E'...').
WARNING: nonstandard use of \' in a string literal
ZEILE 2: VALUES (214062, 'pfizer ltd. \'', 153327)
                         ^
TIP: Use '' to write quotes in strings, or use the escape string syntax (E'...').
WARNING: nonstandard use of \' in a string literal
ZEILE 2: VALUES (214062, 'pfizer ltd. \'', 150536)
                         ^
TIP: Use '' to write quotes in strings, or use the escape string syntax (E'...').
WARNING: nonstandard use of \' in a string literal
ZEILE 2: VALUES (214062, 'pfizer ltd. \'', 150539)
                         ^
TIP: Use '' to write quotes in strings, or use the escape string syntax (E'...').
E, [2010-09-09T12:37:58.026660 #6373] ERROR -- Gkv: both user and secret are required
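The PostgreSQL warnings are about the backslash-escaped quote in 'pfizer ltd. \''. The TIP line suggests doubling the quote instead. A minimal sketch of that quoting follows; sql_literal is a hypothetical helper name. This log does not show which layer builds the INSERT statement, and a real fix would normally rely on the database driver's own quoting rather than a hand-written helper.

```ruby
# Hypothetical helper: quote a value as a standard SQL string literal
# by doubling embedded single quotes (no backslash escapes).
def sql_literal(value)
  "'" + value.to_s.gsub("'", "''") + "'"
end

# The value from the warning, a company name ending in a single quote:
puts sql_literal("pfizer ltd. '")
```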
Search for the error message 'both user and secret are required'
masa@masa ~/ywesee/de.oddb.org $ grep -r "both user and secret are required" *
README: /usr/lib/ruby/1.8/net/smtp.rb:562:in `check_auth_args': both user and secret are required (ArgumentError)
The only match is in Davatz-san's log in the README.
There must be a configuration somewhere that sets the Mail-sending method.
Search for the place where the error is raised
masa@masa /usr/lib/ruby/1.8 $ grep -r "both user and secret are required" *
net/smtp.rb: raise ArgumentError, 'both user and secret are required'\
Look at /usr/lib/ruby/1.8/net/smtp.rb
Add debug output (p) to check_auth_args
def check_auth_args( user, secret, authtype )
  # masa
  p "get-in check_auth_args"
  print caller(0).pretty_inspect.join("\n").to_s,"\n"
  raise ArgumentError, 'both user and secret are required'\
    unless user and secret
  auth_method = "auth_#{authtype || 'cram_md5'}"
  raise ArgumentError, "wrong auth type #{authtype}"\
    unless respond_to?(auth_method, true)
end
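The check in check_auth_args raises as soon as either the user or the secret is nil, so a mail setting with only one of the two filled in would produce exactly this message. A minimal sketch that mirrors the check is below. The config hash and its key names (smtp_user, smtp_pass) are hypothetical and are not taken from the ODDB configuration.

```ruby
# Mirrors the first check of check_auth_args in net/smtp.rb.
def check_auth_args(user, secret)
  raise ArgumentError, 'both user and secret are required' unless user && secret
end

# Hypothetical mail settings lookup; the key names are illustrative only.
def smtp_credentials(config)
  user   = config['smtp_user']
  secret = config['smtp_pass']
  check_auth_args(user, secret)
  [user, secret]
end

begin
  # The password is missing here, so the check raises.
  smtp_credentials('smtp_user' => 'updater@example.com')
rescue ArgumentError => e
  puts e.message
end
```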
import_gkv takes about one hour.
It looks like Gkv#import is what takes so long.
def import fh, opts={}
  parser = Rpdf2txt::Parser.new(fh.read, 'utf8')
  handler = GkvHandler.new method(:process_page)
  parser.extract_text handler
In particular, parser.extract_text handler takes over 30 minutes.
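To measure each step instead of watching numbered p output, a small timing wrapper could be used. This is only a sketch: timed is a hypothetical helper, and Process.clock_gettime needs Ruby 2.1 or later, so it would not run on the Ruby 1.8 installation used in this log (Time.now could be substituted there).

```ruby
# Hypothetical helper: run the block, report the elapsed time on stderr,
# and return the block's own result so it can wrap existing calls.
def timed(label)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result  = yield
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  warn format('%s: %.2f s', label, elapsed)
  result
end

# Possible use inside Gkv#import (not run here):
#   parser  = timed('Parser.new')   { Rpdf2txt::Parser.new(fh.read, 'utf8') }
#   handler = GkvHandler.new method(:process_page)
#   timed('extract_text') { parser.extract_text handler }
```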
Result
"get-in Updater.import_gkv" "a" "b" "get-in Gkv#download_latest" "11" "12" "13" "14" "15" "16" "17" "18" "19" "20" "21" "c" "get-in Updater._reported_import" "A" "get-in Gkv#import" "1" "2" WARNING: nonstandard use of \' in a string literal ZEILE 2: VALUES (214062, 'pfizer ltd. \'', 66064) ^ TIP: Use '' to write quotes in strings, or use the escape string syntax (E'...'). WARNING: nonstandard use of \' in a string literal ZEILE 2: VALUES (214062, 'pfizer ltd. \'', 133820) ^ TIP: Use '' to write quotes in strings, or use the escape string syntax (E'...'). WARNING: nonstandard use of \' in a string literal ZEILE 2: VALUES (214062, 'pfizer ltd. \'', 163939) ^ TIP: Use '' to write quotes in strings, or use the escape string syntax (E'...'). WARNING: nonstandard use of \' in a string literal ZEILE 2: VALUES (214062, 'pfizer ltd. \'', 168431) ^ TIP: Use '' to write quotes in strings, or use the escape string syntax (E'...'). WARNING: nonstandard use of \' in a string literal ZEILE 2: VALUES (214062, 'pfizer ltd. \'', 150549) ^ TIP: Use '' to write quotes in strings, or use the escape string syntax (E'...'). WARNING: nonstandard use of \' in a string literal ZEILE 2: VALUES (214062, 'pfizer ltd. \'', 153327) ^ TIP: Use '' to write quotes in strings, or use the escape string syntax (E'...'). WARNING: nonstandard use of \' in a string literal ZEILE 2: VALUES (214062, 'pfizer ltd. \'', 150536) ^ TIP: Use '' to write quotes in strings, or use the escape string syntax (E'...'). WARNING: nonstandard use of \' in a string literal ZEILE 2: VALUES (214062, 'pfizer ltd. \'', 150539) ^ TIP: Use '' to write quotes in strings, or use the escape string syntax (E'...'). "3" "4" "B" "get-in check_auth_args" E, [2010-09-09T16:23:02.892603 #9785] ERROR -- Gkv: undefined method `join' for #<String:0x7f472212eb78> "d" "e"
Consideration
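The new error, undefined method `join' for a String, comes from my own debug line and not from the import: pretty_inspect already returns a single String, so join cannot be called on it. caller(0) itself is an Array of Strings and can be joined directly. A small sketch of the difference follows; format_backtrace is a hypothetical helper name.

```ruby
require 'pp'

# caller(0) returns an Array of Strings, one per stack frame.
frames = caller(0)

# pretty_inspect returns a String, so join raises NoMethodError,
# which is the error seen in the log.
begin
  frames.pretty_inspect.join("\n")
rescue NoMethodError => e
  puts "debug line fails with #{e.class}"
end

# Hypothetical helper: join the Array itself, one frame per line.
def format_backtrace(frames)
  frames.join("\n")
end

puts format_backtrace(frames)
```

In check_auth_args the debug line could then simply be puts caller(0).join("\n").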
Run again
I will check the result tomorrow.