
20100906 debug gkv



  1. Confirm the error last week again
  2. Mechanize Library
  3. Local test again
  4. Update test code and Commit
  5. test_import

Goal
  • Finish debugging gkv / 80%
Milestones
  1. Review last joblog 7:50
  2. BraSt Todo 8:00
  3. Confirm Error 8:15
  4. Distillment fundamental part 9:10
    1. Consideration
  5. BraSt ToDo 10:30
  6. Study Mechanize library 10:50
  7. Local test again / Confirm Error 11:15
  8. Experiment 14:30
  9. Update / Commit 14:40
  10. Check other failures
    1. test_import(ODDB::Import::TestGkv) [test/import/test_gkv.rb:106]
      1. I have to read ODDB::Import::Gkv#import method
    2. test_import__ml(ODDB::Import::TestGkv) [test/import/test_gkv.rb:178]
      1. I have to read ODDB::Import::Gkv#import__ml method
  11. Reinstall Ruby
Summary
Commits
  1. test_gkv.rb http://scm.ywesee.com/?p=de.oddb.org/.git;a=commit;h=7c6b8266b16dc105d685a1bc4c13c1c7e647da66
ToDo Tomorrow
Keep in Mind

BraSt Todo

  1. Confirm the error again
  2. Backtrace one by one

Confirm the error again

masa@masa ~/ywesee/de.oddb.org $ ruby test/import/test_gkv.rb 
Loaded suite test/import/test_gkv
Started
E
Finished in 0.002834 seconds.

  1) Error:
test_latest_url(ODDB::Import::TestGkv):
NoMethodError: undefined method `html_parser' for nil:NilClass
    /usr/lib64/ruby/gems/1.8/gems/mechanize-1.0.0/lib/mechanize/page.rb:82:in `parser'
    /home/masa/ywesee/de.oddb.org/lib/oddb/import/gkv.rb:88:in `latest_url'
    test/import/test_gkv.rb:85:in `test_latest_url'

1 tests, 0 assertions, 0 failures, 1 errors

Distillment

require 'mechanize'

# test_latest_url
url = 'https://www.gkv-spitzenverband.de/Befreiungsliste_Arzneimittel_Versicherte.gkvnet'
html_dir = "/home/masa/ywesee/de.oddb.org/test/import/data/html/gkv"
path = File.join html_dir, 'Befreiungsliste_Arzneimittel_Versicherte.gkvnet'
html = File.read(path)

# setup_page
response = {'content-type' => 'text/html'}
page = Mechanize::Page.new(URI.parse(url), response, html, 200)

agent = Mechanize.new

# Gkv::latest_url
host = 'https://www.gkv-spitzenverband.de'
url = '/Befreiungsliste_Arzneimittel_Versicherte.gkvnet'
page2 = agent.get host + url


a = ''
if link = (page2/'span[@class=pdf]/a').first
  p true
  a = host + link.attributes["href"]
else
  p false
end

This script runs without raising an error.

Consideration

  1. In fact, (page/'span[@class=pdf]/a').first becomes nil.
  2. The html_parser error is caused by flexmock.
  3. If agent = flexmock(Mechanize.new) is changed to agent = Mechanize.new, the error does not happen.
  4. Instead, however, a failure happens because the return value becomes nil (see 1.).

What to do next

  • Modify the test code
    • flexmock part
  • Study the Mechanize library, in particular the meaning of page/'span[@class=pdf]/a'

Study Mechanize library

Simple Sample

require 'mechanize'

agent = Mechanize.new
page = agent.get('http://masa.o.oo7.jp/')
p page.at('title').inner_text

Result

"\nCV\n"
  • page.at('title').inner_text returns the text enclosed in the title tag.
  • The at method is delegated to the Nokogiri library.
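
The delegation noted above can be sketched with Ruby's standard Forwardable module. This is only an illustration of the idea, not Mechanize's or Nokogiri's actual code: StubDocument and StubPage are hypothetical stand-ins, and the "selector" here is just a tag name looked up in a hash.

```ruby
require 'forwardable'

# Hypothetical stand-in for a parsed document (NOT Nokogiri itself).
class StubDocument
  def initialize(nodes)
    @nodes = nodes # tag name => array of text contents
  end

  # Return every text stored under the given tag name.
  def search(tag)
    @nodes.fetch(tag, [])
  end

  # Return only the first match, or nil when nothing matches.
  def at(tag)
    search(tag).first
  end
end

# Hypothetical page wrapper that forwards lookups to its parser object.
class StubPage
  extend Forwardable
  def_delegators :@parser, :search, :at

  def initialize(parser)
    @parser = parser
  end

  # `page / selector` as another spelling of `page.search(selector)`.
  def /(selector)
    search(selector)
  end
end

page = StubPage.new(StubDocument.new('title' => ["\nCV\n"], 'li' => %w[one two]))
p page.at('title')                   # first match for the tag
p page.search('li') == (page / 'li') # both spellings reach the same parser method
p page.at('span')                    # nil when nothing matches
```

The last line shows why a call chain such as .first or .at can hand back nil: the wrapper only forwards the lookup, so a selector that matches nothing yields nil rather than an error.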

Sample2

require 'mechanize'

agent = Mechanize.new
page = agent.get('http://masa.o.oo7.jp/')
page.search("li[@class='list']").each do |item|
  p item.inner_text
end
  • This selects the <li> tags with class=list.

Test1: both forms give the same result

(page/'li').each do |item|
  p item.inner_text
end
page.search("li").each do |item|
  p item.inner_text
end

Test2: both forms give the same result

page.search("li[@class=list]/a").each do |item|
  p item.inner_text
end
(page/'li[@class=list]/a').each do |item|
  p item.inner_text
end

Summary

  1. page/'span[@class=pdf]/a' means
    1. parsing the 'page'
    2. searching <span> tag
    3. with class property pdf
    4. searching <a> tag in the <span> tag
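
The reading above can be tried without network access. The following is a minimal sketch that uses REXML from Ruby's standard distribution as a stand-in for Nokogiri, on a made-up HTML fragment (the href values are invented for illustration). The path is written with the attribute value quoted, as standard XPath requires for string literals.

```ruby
require 'rexml/document'

# Made-up fragment: one <span class="pdf"> and one other <span>.
html = <<~HTML
  <div>
    <span class="pdf"><a href="/upload/list.pdf">PDF</a></span>
    <span class="other"><a href="/other.html">Other</a></span>
  </div>
HTML

doc = REXML::Document.new(html)

# <span> elements whose class attribute is "pdf", then their <a> children.
links = REXML::XPath.match(doc, "//span[@class='pdf']/a")
hrefs = links.map { |a| a.attributes['href'] }
p hrefs

# A selector that matches nothing gives an empty list, so .first is nil.
missing = REXML::XPath.match(doc, "//span[@class='zip']/a").first
p missing
```

Only the link inside the span with class pdf is selected, and the second lookup shows where a nil return value comes from when the page has no such span.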

Link
  1. http://route477.net/rubyscraping/?Mechanize (japanese)
  2. http://d.hatena.ne.jp/kitamomonga/20081209/kaisetsu_for_ver_0_9_ruby_www_mechanize (japanese)
  3. http://nokogiri.org/

Local test again

test_gkv.local.rb

require 'mechanize'
require 'flexmock'

include FlexMock::TestCase

# test_latest_url
url = 'https://www.gkv-spitzenverband.de/Befreiungsliste_Arzneimittel_Versicherte.gkvnet'
html_dir = "/home/masa/ywesee/de.oddb.org/test/import/data/html/gkv"
path = File.join html_dir, 'Befreiungsliste_Arzneimittel_Versicherte.gkvnet'
html = File.read(path)

# setup_page
response = {'content-type' => 'text/html'}
page = Mechanize::Page.new(URI.parse(url), response, html, 200)

#agent = Mechanize.new
agent = flexmock(Mechanize.new)
agent.should_receive(:get).with(url).and_return(page)


# Gkv::latest_url
host = 'https://www.gkv-spitzenverband.de'
url = '/Befreiungsliste_Arzneimittel_Versicherte.gkvnet'
page = agent.get host + url

a = ''
if link = (page/'span[@class=pdf]/a').first
  p true
  a = host + link.attributes["href"]
else
  p false
end

Result

/usr/lib64/ruby/gems/1.8/gems/mechanize-1.0.0/lib/mechanize/page.rb:82:in `parser': undefined method `html_parser' for nil:NilClass (NoMethodError)
        from test_gkv.rb:28

Hypothesis

  1. The flexmock-wrapped Mechanize object lacks some methods.

Experiment

require 'mechanize'
require 'flexmock'

include FlexMock::TestCase

# test_latest_url
url = 'https://www.gkv-spitzenverband.de/Befreiungsliste_Arzneimittel_Versicherte.gkvnet'
html_dir = "/home/masa/ywesee/de.oddb.org/test/import/data/html/gkv"
path = File.join html_dir, 'Befreiungsliste_Arzneimittel_Versicherte.gkvnet'
html = File.read(path)

# setup_page
response = {'content-type' => 'text/html'}

agent = flexmock(Mechanize.new)
page_return = Mechanize::Page.new(URI.parse(url), response, html, 200, agent)
agent.should_receive(:get).with(url).and_return(page_return)

# Gkv::latest_url
host = 'https://www.gkv-spitzenverband.de'
url = '/Befreiungsliste_Arzneimittel_Versicherte.gkvnet'
page = agent.get host + url

a = ''
if link = (page/'span[@class=pdf]/a').first
  p true
  a = host + link.attributes["href"]
else
  p false
end

Result

masa@masa ~/work $ ruby test_gkv.rb 
true
Loaded suite test_gkv
Started

Finished in 9.6e-05 seconds.

0 tests, 0 assertions, 0 failures, 0 errors

Point

  1. page_return = Mechanize::Page.new(URI.parse(url), response, html, 200, agent)
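  2. The last argument is assigned to the @mech instance variable. Without it, @mech stays nil, which matches the "undefined method `html_parser' for nil:NilClass" error raised in parser.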
  3. See Mechanize Page class new method http://mechanize.rubyforge.org/mechanize/Mechanize/Page.html
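
The points above can be illustrated with a small sketch. SketchAgent and SketchPage are hypothetical stand-ins, not Mechanize classes. The only assumption taken from the backtrace is that the page asks its agent for html_parser when it is parsed.

```ruby
# Hypothetical stand-in for the agent; html_parser returns a dummy value.
class SketchAgent
  def html_parser
    :stub_parser
  end
end

# Hypothetical stand-in for the page; the agent is an optional last argument.
class SketchPage
  def initialize(body, mech = nil)
    @body = body
    @mech = mech
  end

  # Asks the agent for its parser; fails when no agent was given.
  def parser
    @mech.html_parser
  end
end

# Helper for this sketch: return the parser, or the error that was raised.
def parser_or_error(page)
  page.parser
rescue NoMethodError => e
  e
end

without_agent = parser_or_error(SketchPage.new('<html></html>'))
with_agent    = parser_or_error(SketchPage.new('<html></html>', SketchAgent.new))

p without_agent.class # NoMethodError, because @mech is nil
p with_agent          # :stub_parser
```

Under this reading, the mock itself is not missing a method. The page object simply never received an agent, so passing the agent as the last constructor argument removes the error.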

Update test code and Commit

Commit http://scm.ywesee.com/?p=de.oddb.org/.git;a=commit;h=7c6b8266b16dc105d685a1bc4c13c1c7e647da66

Before

masa@masa ~/ywesee/de.oddb.org $ ruby test/import/test_gkv.rb 
Loaded suite test/import/test_gkv
Started
.FFE....
Finished in 0.045239 seconds.

  1) Failure:
test_import(ODDB::Import::TestGkv) [test/import/test_gkv.rb:106]:
<2> expected but was
<0>.

  2) Failure:
test_import__ml(ODDB::Import::TestGkv) [test/import/test_gkv.rb:178]:
<1> expected but was
<3>.

  3) Error:
test_latest_url(ODDB::Import::TestGkv):
NoMethodError: undefined method `html_parser' for nil:NilClass
    /usr/lib64/ruby/gems/1.8/gems/mechanize-1.0.0/lib/mechanize/page.rb:83:in `parser'
    /home/masa/ywesee/de.oddb.org/lib/oddb/import/gkv.rb:88:in `latest_url'
    test/import/test_gkv.rb:77:in `test_latest_url'

8 tests, 13 assertions, 2 failures, 1 errors

After

masa@masa ~/ywesee/de.oddb.org $ ruby test/import/test_gkv.rb 
Loaded suite test/import/test_gkv
Started
.FF.....
Finished in 0.046833 seconds.

  1) Failure:
test_import(ODDB::Import::TestGkv) [test/import/test_gkv.rb:106]:
<2> expected but was
<0>.

  2) Failure:
test_import__ml(ODDB::Import::TestGkv) [test/import/test_gkv.rb:178]:
<1> expected but was
<3>.

8 tests, 14 assertions, 2 failures, 0 errors

Reinstall Ruby

When I installed the ruby-gtk2 library, the current Ruby was removed and 1.8.7 was installed instead!!

Even worse, ruby-1.8.6_p369.ebuild is no longer available online.

Page last modified on July 13, 2011, at 12:03 PM