
20120628-update-atc-links-fachinfo-image



Summary

  • Updated external links by ATC class
    • added Teilbarkeit link, updated French text

Commits

Index


Update ATC links

Added Teilbarkeit link.
Updated style, text.

NOTE

The HtmlGrid::Grid class has no readable instance variables except @width and @length.

* HtmlGrid::Composite#grid
  * HtmlGrid::Grid
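
To illustrate the note: when a class defines readers for only some of its instance variables, the others can still be inspected by name with instance_variable_get. FakeGrid below is a hypothetical stand-in, not HtmlGrid::Grid itself, and its @rows variable is an assumption made for the example.

```ruby
# Hypothetical stand-in for a grid class that exposes only two readers.
class FakeGrid
  attr_reader :width, :length

  def initialize
    @width  = 2
    @length = 3
    @rows   = [[:a, :b]] # no reader is defined for this one
  end
end

grid = FakeGrid.new
# The public readers cover only @width and @length.
READABLE = FakeGrid.public_instance_methods(false).sort
# Anything else has to be fetched by name.
HIDDEN_ROWS = grid.instance_variable_get(:@rows)
```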

Parse fachinfo images

Store fachinfo images

Fachinfo images are in data directory.

e.g.
 <p class="noSpacing">
   <div class="image"><img align="middle" border="0" src="&#xA;          data/pictures/DF_23229_5_1.gif&#xA;       "></div>
 </p>
ch.oddb> TextInfoPlugin.new(self).parse_fachinfo('/path/to/fachinfo.htm')
-> 
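
Note that the src attribute in the sample is padded with &#xA; (newline) references and spaces, so it must be decoded and trimmed before it is usable as a path. A minimal sketch, assuming a regex scan is acceptable for illustration; image_sources is a hypothetical helper and is not how TextInfoPlugin parses the page.

```ruby
# Sketch: pull image paths out of raw fachinfo HTML without an HTML parser.
# The regex approach is an assumption chosen to keep the example
# dependency-free.
def image_sources(html)
  html.scan(/<img\b[^>]*\bsrc="([^"]*)"/i).map do |(raw)|
    # Decode hex character references such as &#xA; and trim whitespace.
    raw.gsub(/&#x([0-9a-f]+);/i) { [$1.hex].pack('U') }.strip
  end
end

SAMPLE_HTML = <<~HTML
  <p class="noSpacing">
    <div class="image"><img align="middle" border="0" src="&#xA;          data/pictures/DF_23229_5_1.gif&#xA;       "></div>
  </p>
HTML
```

image_sources(SAMPLE_HTML) then yields the bare relative path data/pictures/DF_23229_5_1.gif.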

Mechanize::Page has images and images_with methods. (Mechanize::Page::Image)

page.images_with(:src => %r/jpg\Z/).each do |img|
  img.fetch.save
end

Refs

Check import_fulltext

Mechanize version?

ch.oddb> TextInfoPlugin.new(self).import_fulltext ['podhaler']
-> undefined method `images_with' for #<Mechanize::Page:0x000000049248f8>
$ gem list -d mechanize

*** LOCAL GEMS ***

mechanize (2.0.1)
    Authors: Eric Hodel, Aaron Patterson, Mike Dalessio
    Rubyforge: http://rubyforge.org/projects/mechanize
    Homepage: http://mechanize.rubyforge.org
    Installed at: /home/yasuhiro/.rbenv/versions/1.9.3-p194/lib/ruby/gems/1.9.1

    The Mechanize library is used for automating interaction with
    websites

$ gem search mechanize --remote
...
mechanize (2.5.1 ruby)
...
[11] 1.9.3-p194(main)> Mechanize::Page.instance_methods.grep /link/
=> [:links_with, :link_with, :link, :links]
[12] 1.9.3-p194(main)> Mechanize::Page.instance_methods.grep /image/
=> [:images, :image_urls]

The old Mechanize::Page::Image class also does not have a fetch method.
So I tried the newest version of Mechanize (2.5.1).
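
For comparison, selection by src can be approximated on a version that only has images, as listed above. A minimal sketch, assuming each image object responds to src; images_matching, FakeImage and FakePage are hypothetical names, and the sketch does not replace the missing fetch method.

```ruby
# Sketch: select images by src, with or without images_with.
# Assumption: every image object responds to #src.
def images_matching(page, pattern)
  if page.respond_to?(:images_with)
    page.images_with(:src => pattern) # available in newer Mechanize
  else
    page.images.select { |img| img.src.to_s =~ pattern }
  end
end

# Hypothetical stand-ins so the sketch runs without Mechanize installed.
FakeImage = Struct.new(:src)
FakePage  = Struct.new(:images)
```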

Images are saved to the following directories:

  • /data/html/fachinfo/de/#{Fachinfo Name}_files/#{Image File Name}
  • /data/html/fachinfo/fr/#{Fachinfo Name}_files/#{Image File Name}
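
The layout above can be sketched as a small path builder. fachinfo_image_path and its parameters (base_dir, lang, fachinfo_name, image_name) are hypothetical; reducing the image name to its basename is also an assumption.

```ruby
# Sketch: build the save path for one fachinfo image.
# All parameter names are hypothetical; base_dir is the directory that
# contains data/.
def fachinfo_image_path(base_dir, lang, fachinfo_name, image_name)
  File.join(base_dir, 'data', 'html', 'fachinfo', lang.to_s,
            "#{fachinfo_name}_files", File.basename(image_name))
end
```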

Image saving works fine.
But I still have to check everything that depends on the old Mechanize.

Check other jobs with Mechanize 2.5.1.

$ grep -r mechanize src
src/plugin/swissmedic.rb:require 'mechanize'
src/plugin/swissmedicjournal.rb:require 'mechanize'
src/plugin/text_info.rb:require 'mechanize'
src/plugin/who.rb:require 'mechanize'
src/plugin/drugbank.rb:require 'mechanize'
src/plugin/analysis.rb:require 'mechanize'
src/plugin/dosing.rb:require 'mechanize'
src/plugin/comarketing.rb:require 'mechanize'
src/plugin/flockhart.rb:require 'mechanize'
src/plugin/bsv_xml.rb:require 'mechanize'

$ grep -r mechanize ext
ext/swissindex/src/swissindex.rb:require 'mechanize'

Check existing ODDB::Text::ImageLink class

I found an existing ODDB::Text::ImageLink class.

ch.oddb> fachinfos.values.select{|f| !f.description('de').chapters.select{|c| !f.description('de').send(c.to_sym).nil? and !f.description('de').send(c.to_sym).paragraphs.select{|p| p.class == ODDB::Text::ImageLink }.empty? }.empty? }.length
-> 2

Two fachinfos have paragraphs of this class:

  • 54299
  • 37490
ch.oddb> registration('37490').fachinfo.description('de').chapters.select{|c| !registration('37490').fachinfo.description('de').send(c.to_sym).nil? and !registration('37490').fachinfo.description('de').send(c.to_sym).paragraphs.select{|p|p.class == ODDB::Text::ImageLink}.empty? }
-> [:effects]
ch.oddb> registration('37490').fachinfo.description('de').effects.paragraphs.select{|pa, i| pa.class == ODDB::Text::ImageLink }.first
-> (image)
ch.oddb> registration('37490').fachinfo.description('de').effects.paragraphs.each_with_index{|pa, i| if pa.class == ODDB::Text::ImageLink; then p i; end }
-> Array #= 7
ch.oddb> registration('37490').fachinfo.description('de').effects.paragraphs[7].attributes
-> {"src"=>"/resources/images/fachinfo/de/00010.gif"}
ch.oddb>
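
The nested select calls used in this session are easier to read as a helper. A minimal sketch with stand-in classes so it runs outside ch.oddb: ImageLink here replaces ODDB::Text::ImageLink, and chapters_with_images, FakeChapter and FakeDoc are hypothetical names. It uses is_a? rather than comparing class, so subclasses would also match.

```ruby
# Stand-in for ODDB::Text::ImageLink so the sketch runs outside ch.oddb.
class ImageLink; end
FakeChapter = Struct.new(:paragraphs)

# Returns the chapter names whose paragraphs include at least one image link.
# doc must respond to #chapters and to each chapter name.
def chapters_with_images(doc, klass = ImageLink)
  doc.chapters.select do |name|
    chapter = doc.send(name.to_sym)
    chapter && chapter.paragraphs.any? { |para| para.is_a?(klass) }
  end
end

# Hypothetical document with two chapters.
FakeDoc = Struct.new(:usage, :effects) do
  def chapters
    [:usage, :effects]
  end
end
```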

/var/www/oddb.org/doc/resources/images/fachinfo/de is almost empty.

NOTE
  ## identification of Pseudo-Fachinfos happens at download-time.
  #  but because we still want to extract the iksnrs, we just mark them
  #  and defer inaction until here:
  unless fi_flags[:pseudo] || fis.empty?
#=> this means
  if !fi_flags[:pseudo] && !fis.empty?
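
By De Morgan's law, unless a || b runs its body only when both a and b are falsy, which is the same as if !a && !b. A quick check over all combinations, including nil for a missing :pseudo flag; with_unless and with_if are hypothetical helper names.

```ruby
# The unless form, as written in the plugin code.
def with_unless(pseudo, empty)
  result = false
  result = true unless pseudo || empty
  result
end

# The equivalent if form after applying De Morgan's law.
def with_if(pseudo, empty)
  !pseudo && !empty
end

# nil models fi_flags[:pseudo] when the key is absent.
CASES = [true, false, nil].product([true, false])
AGREE = CASES.all? do |pseudo, empty|
  with_unless(pseudo, empty) == with_if(pseudo, empty)
end
```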

Tried the reparse option

ch.oddb> TextInfoPlugin.new(self, {:reparse => true}).import_fulltext(['podhaler'])

I updated fiparsed to create ODDB::Text::ImageLink objects.

Result
ch.oddb> TextInfoPlugin.new(self).parse_fachinfo('/home/yasuhiro/Downloads/tobi_podhaler.htm')
-> #<ODDB::FachinfoDocument2001:0x0000000635fb50>
ch.oddb> TextInfoPlugin.new(self, {:reparse => true}).import_fulltext(['podhaler'])
-> ["podhaler"]
ch.oddb> registration('60565').fachinfo.description('de').chapters.select{|c| !registration('60565').fachinfo.description('de').send(c.to_sym).nil? and !registration('60565').fachinfo.description('de').send(c.to_sym).paragraphs.select{|p|p.class == ODDB::Text::ImageLink}.empty? }
-> [:usage, :effects]

Refs

Display fachinfo images.

Updated resource path.

commit
Page last modified on June 28, 2012, at 11:51 AM