The update_company_textinfos job failed with a timeout error.
I could not reproduce the encoding error.
Plugin: ODDB::TextInfoPlugin Error: Net::HTTP::Persistent::Error Message: too many connection resets (due to Connection reset by peer - Errno::ECONNRESET) after 105 requests on 77491120
Added timeout error handling.
The problem does not seem to occur for requests made from Switzerland.
in src/plugin/text_info.rb
def init_agent
  agent = Mechanize.new
  agent.user_agent = "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_4_11; de-de) AppleWebKit/525.18 (KHTML, like Gecko) Version/3.1.2 Safari/525.22"
  agent.keep_alive = false # added
  agent
end
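def get_with_retry(agent, url)
  page = nil
  tried = 0
  begin
    tried += 1
    page = agent.get(url)
  rescue Net::HTTP::Persistent::Error => e
    # retry up to 3 attempts, but only for timeout messages
    if e.message =~ /Timeout/i && tried < 3
      sleep 5
      retry
    else
      raise # re-raise the original exception, keeping its class and backtrace
    end
  end
  page
end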
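The retry above only matches messages containing "Timeout", while the logged error reports connection resets. Below is a minimal sketch of a more general retry helper, not the code used in ODDB. `with_retry` and `FakePersistentError` are hypothetical names; the fake error class only stands in for Net::HTTP::Persistent::Error so that the sketch runs without any gems.

```ruby
# Hypothetical helper (not part of ODDB): run the block again when it raises
# one of the given error classes and the message matches the pattern.
def with_retry(errors:, pattern:, max_tries: 3, wait: 5)
  tried = 0
  begin
    tried += 1
    yield tried
  rescue *errors => e
    if e.message =~ pattern && tried < max_tries
      sleep wait
      retry
    end
    raise # give up: re-raise the original exception unchanged
  end
end

# Stand-in for Net::HTTP::Persistent::Error (assumption, for illustration only).
class FakePersistentError < StandardError; end

attempts = []
result = with_retry(errors: [FakePersistentError],
                    pattern: /timeout|connection reset/i,
                    wait: 0) do |n|
  attempts << n
  # simulate a request that fails twice before succeeding
  raise FakePersistentError, 'too many connection resets' if n < 3
  :page
end

p attempts
p result
```

With Mechanize, the block would presumably wrap `agent.get(url)` or `agent.submit(form)`, and `errors:` would be `[Net::HTTP::Persistent::Error]`. Whether retrying a POST is safe depends on the remote site.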
That did not solve it.
The problem occurs only with POST requests:
agent.submit
Then I tried the following.
def init_agent
  agent = Mechanize.new
  agent.user_agent = "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_4_11; de-de) AppleWebKit/525.18 (KHTML, like Gecko) Version/3.1.2 Safari/525.22"
  agent.keep_alive = false   # added
  agent.idle_timeout = 20    # added
  agent.read_timeout = 60    # added
  agent
end
The updater could handle more requests, but the same problem still occurs.
I tried to debug with a static HTML file.
* jobs/update_company_textinfos
* update_company_textinfos (src/util/updater.rb)
* import_company (src/plugin/text_info.rb)
* search_company
* search
* init_searchform
* import_companies
* submit_event
* import_products
* identify_eventtargets
* import_product
* update_product
-> * parse_fachinfo
-> * parse_patinfo
* update_fachinfo
* update_patinfo
* store_orphaned
Debugged fiparsed via TextInfoPlugin#parse_fachinfo.
ch.oddb> TextInfoPlugin.new(self).parse_fachinfo('/home/yasuhiro/Downloads/olanzapin.htm').name
-> Olanzapin Sandoz® Filmtabletten/Schmelztabletten
ch.oddb> TextInfoPlugin.new(self).parse_fachinfo('/home/yasuhiro/Downloads/Olanzapin-Mepha®_-_oro.html').name
-> undefined method `inner_text' for nil:NilClass
in ext/fiparse/src/textinfo_hpricot.rb
def text(elem)
  return '' unless elem
  p elem.to_s.encoding
end
#=>
#<Encoding:UTF-8>
#<Encoding:US-ASCII>
#<Encoding:UTF-8>
..
But this does not raise any error (exception).
I tried the following version.
On the local machine, everything works.
But the production server raises the following error.
ywesee@thinpower /var/www/oddb.org $ RUBYOPT="" bin/admin
ch.oddb> TextInfoPlugin.new(self).parse_fachinfo('/home/ywesee/test.html')
-> invalid byte sequence in US-ASCII
I noticed that the production server has no locale set.
This is a problem for the hpricot gem.
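A minimal sketch of why a missing locale can lead to this error, assuming that Ruby 1.9 tags external data as US-ASCII when no locale is set. The helper names `regexp_error` and `retag_utf8` are hypothetical and exist only for this illustration.

```ruby
# Returns the message raised when a regexp runs over +str+, or nil if none.
def regexp_error(str)
  str.gsub(/o/, '0')
  nil
rescue ArgumentError => e
  e.message
end

# Re-tags (does not transcode) the bytes of +str+ as UTF-8.
def retag_utf8(str)
  str.dup.force_encoding('UTF-8')
end

# UTF-8 bytes of "Zadorin" followed by the registered sign ("\xC2\xAE"),
# mis-tagged as US-ASCII the way data may be when no locale is set.
mis_tagged = "Zadorin\xC2\xAE".dup.force_encoding('US-ASCII')

puts Encoding.default_external             # depends on LANG / LC_ALL
puts mis_tagged.valid_encoding?            # bytes above 0x7F are not ASCII
puts regexp_error(mis_tagged).inspect      # regexp work on the broken string
puts retag_utf8(mis_tagged).valid_encoding?
```

Running `ruby -e 'p Encoding.default_external'` on both machines is a quick way to compare their default encodings.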
Then I tried the following code via bin/admin on the production server.
module Hpricot
  def self.uxs(str)
    str.to_s.force_encoding('utf-8').
      gsub(/\&(\w+);/) { [NamedCharacters[$1] || 63].pack("U*") }. # 63 = ?? (query char)
      gsub(/\&\#(\d+);/) { [$1.to_i].pack("U*") }
  end
  class Text
    def to_s
      str = content.force_encoding('utf-8')
      Hpricot.uxs(str)
    end
  end
end
/path/to/ruby/lib/ruby/gems/1.9.1/gems/hpricot-0.8.4(or 6)/lib/hpricot/builder.rb
I could get the expected object.
ywesee@thinpower /var/www/oddb.org $ RUBYOPT="" bin/admin
ch.oddb> TextInfoPlugin.new(self).parse_fachinfo('/home/ywesee/test.html')
-> #<ODDB::FachinfoDocument2001:0x0000000cf88138>
ch.oddb> TextInfoPlugin.new(self).parse_fachinfo('/home/ywesee/test.html').name
-> Zadorin®
ch.oddb
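The effect of the patched uxs method can be tried without hpricot. The following is only a sketch: `ENTITIES` is a hypothetical, tiny stand-in for Hpricot::NamedCharacters, `unescape` is a hypothetical name, and the `dup` (to avoid changing the caller's string) is an addition that the patch above does not have.

```ruby
# Hypothetical stand-in for Hpricot::NamedCharacters (only a few entries).
ENTITIES = { 'reg' => 174, 'uuml' => 252, 'amp' => 38 }.freeze

# Same shape as the patched uxs: tag the input as UTF-8 first, then expand
# named and numeric character references. Unknown names become "?" (63).
def unescape(str)
  str.to_s.dup.force_encoding('utf-8').
    gsub(/&(\w+);/) { [ENTITIES[$1] || 63].pack('U*') }.
    gsub(/&#(\d+);/) { [$1.to_i].pack('U*') }
end

# Input mis-tagged as US-ASCII although it carries UTF-8 bytes.
raw = "Zadorin\xC2\xAE &amp; Olanzapin-Mepha&reg; &#252;".dup.force_encoding('US-ASCII')

text = unescape(raw)
puts text.encoding
puts text.valid_encoding?
puts text
```

Without the `force_encoding('utf-8')` step, the first `gsub` would run on a string with invalid US-ASCII bytes, which is where the error on the production server appears to come from.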
Updated the style of ATC links in the result list.
