update_company_textinfos
The job received a timeout error.
I could not reproduce the encoding error.
Plugin: ODDB::TextInfoPlugin Error: Net::HTTP::Persistent::Error Message: too many connection resets (due to Connection reset by peer - Errno::ECONNRESET) after 105 requests on 77491120
Added timeout error handling.
This problem does not seem to occur for requests sent from Switzerland.
in src/plugin/text_info.rb
    def init_agent
      agent = Mechanize.new
      agent.user_agent = "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_4_11; de-de) AppleWebKit/525.18 (KHTML, like Gecko) Version/3.1.2 Safari/525.22"
->    agent.keep_alive = false
      agent
    end
    def get_with_retry(agent, url)
      page = nil
      tried = 0
      begin
        tried += 1
        page = agent.get(url)
      rescue Net::HTTP::Persistent::Error => e
        if e.message =~ /Timeout/i and tried < 3
          sleep 5
          retry
        else
          raise e.message
        end
      end
      page
    end
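The retry pattern above can be tried in isolation, without Mechanize or a network connection. The following is only a minimal sketch, not the code in text_info.rb: `with_retry`, `FakeTimeout` and the `max_tries`/`wait` parameters are hypothetical names introduced here for illustration.

```ruby
# Hypothetical stand-in for Net::HTTP::Persistent::Error (assumption of this sketch).
class FakeTimeout < StandardError; end

# Runs the block and retries while the error message mentions a timeout
# and fewer than max_tries attempts have been made. Otherwise the
# original exception is re-raised unchanged.
def with_retry(max_tries: 3, wait: 0)
  tried = 0
  begin
    tried += 1
    yield tried
  rescue FakeTimeout => e
    if e.message =~ /timeout/i && tried < max_tries
      sleep wait
      retry
    end
    raise
  end
end

# Simulated request that fails twice and then succeeds.
attempts = []
result = with_retry do |n|
  attempts << n
  raise FakeTimeout, 'Timeout while reading' if n < 3
  :page
end
```

Unlike `raise e.message`, the bare `raise` keeps the original exception class, so a caller can still rescue it specifically.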
This was not a solution.
The problem occurs only on POST requests.
agent.submit
Then I tried the following.
    def init_agent
      agent = Mechanize.new
      agent.user_agent = "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_4_11; de-de) AppleWebKit/525.18 (KHTML, like Gecko) Version/3.1.2 Safari/525.22"
->    agent.keep_alive = false
->    agent.idle_timeout = 20
->    agent.read_timeout = 60
      agent
    end
The updater could handle more requests, but the same problem still occurs.
I tried to debug with static HTML file.
* jobs/update_company_textinfos
* update_company_textinfos (src/util/updater.rb)
* import_company (src/plugin/text_info.rb)
* search_company
* search
* init_searchform
* import_companies
* submit_event
* import_products
* identify_eventtargets
* import_product
* update_product
-> * parse_fachinfo
-> * parse_patinfo
* update_fachinfo
* update_patinfo
* store_orphaned
Debugged fiparsed via TextInfoPlugin#parse_fachinfo.
ch.oddb> TextInfoPlugin.new(self).parse_fachinfo('/home/yasuhiro/Downloads/olanzapin.htm').name
-> Olanzapin Sandoz® Filmtabletten/Schmelztabletten
ch.oddb> TextInfoPlugin.new(self).parse_fachinfo('/home/yasuhiro/Downloads/Olanzapin-Mepha®_-_oro.html').name
-> undefined method `inner_text' for nil:NilClass
in ext/fiparse/src/textinfo_hpricot.rb
    def text(elem)
      return '' unless elem
      p elem.to_s.encoding
    end
    #=> #<Encoding:UTF-8>
        #<Encoding:US-ASCII>
        #<Encoding:UTF-8>
        ..
But this does not cause an error (no exception is raised).
I tried the following version.
On the local machine, everything works.
But the production server shows the following error.
ywesee@thinpower /var/www/oddb.org $ RUBYOPT="" bin/admin
ch.oddb> TextInfoPlugin.new(self).parse_fachinfo('/home/ywesee/test.html')
-> invalid byte sequence in US-ASCII
I noticed that the production server does not have a locale set.
This is a problem for the hpricot gem.
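The error message can be reproduced outside oddb.org. The following is a minimal sketch, assuming that the page text reaches the parser as UTF-8 bytes tagged US-ASCII (which is what Ruby 1.9 and later tend to do when no locale is set). The sample string and variable names are made up for illustration.

```ruby
# UTF-8 bytes for a name containing the registered sign (U+00AE),
# deliberately mislabelled as US-ASCII.
raw = "Sandoz\xC2\xAE  Filmtabletten".dup.force_encoding('US-ASCII')

# Regexp-based methods refuse to work on a string whose bytes are
# not valid in its declared encoding.
error = begin
  raw.gsub(/\s+/, ' ')
  nil
rescue ArgumentError => e
  e.message
end

# Relabelling the same bytes as UTF-8 makes the string valid again.
fixed = raw.dup.force_encoding('UTF-8')
cleaned = fixed.gsub(/\s+/, ' ')

puts error
puts raw.valid_encoding?
puts fixed.valid_encoding?
```

`force_encoding` changes only the label, not the bytes, so it is appropriate only when the data is known to be UTF-8 already.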
Then I tried the following code via bin/admin on the production server.
    module Hpricot
      def self.uxs(str)
        str.to_s.force_encoding('utf-8').
          gsub(/\&(\w+);/) { [NamedCharacters[$1] || 63].pack("U*") }. # 63 = ?? (query char)
          gsub(/\&\#(\d+);/) { [$1.to_i].pack("U*") }
      end
      class Text
        def to_s
          str = content.force_encoding('utf-8')
          Hpricot.uxs(str)
        end
      end
    end
/path/to/ruby/lib/ruby/gems/1.9.1/gems/hpricot-0.8.4(or 6)/lib/hpricot/builder.rb
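The entity handling in this patch can be tried without Hpricot. The following is a sketch under stated assumptions: `NAMED` is a hypothetical two-entry stand-in for Hpricot's `NamedCharacters` table, and `unescape_entities` is an illustrative name, not part of the gem.

```ruby
# Hypothetical subset of an entity-name to code-point table.
NAMED = { 'reg' => 174, 'amp' => 38 }

# Labels the input as UTF-8, then replaces named entities (unknown
# names become "?", code point 63) and decimal numeric entities with
# the corresponding UTF-8 characters.
def unescape_entities(str)
  str.to_s.dup.force_encoding('UTF-8').
    gsub(/&(\w+);/) { [NAMED[$1] || 63].pack('U*') }.
    gsub(/&#(\d+);/) { [$1.to_i].pack('U*') }
end

# Input tagged US-ASCII, as it may arrive on a server without a locale.
input = 'Zadorin&reg; &#174; &unknown;'.dup.force_encoding('US-ASCII')
output = unescape_entities(input)
puts output.encoding
```

The `dup` keeps the caller's string untouched, whereas the patch above relabels the original string in place.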
I got the expected object.
ywesee@thinpower /var/www/oddb.org $ RUBYOPT="" bin/admin
ch.oddb> TextInfoPlugin.new(self).parse_fachinfo('/home/ywesee/test.html')
-> #<ODDB::FachinfoDocument2001:0x0000000cf88138>
ch.oddb> TextInfoPlugin.new(self).parse_fachinfo('/home/ywesee/test.html').name
-> Zadorin®
ch.oddb
Updated style of ATC links in Result list.
:e.g.