
20101028-debug-bsv_follower-autorun



  1. Check FileUtils.cmp method in detail
  2. Check logging process in oddb.org
  3. Description of the update_bsv (update_bsv_followers) process in detail
  4. Think about possible flows and return values of update_bsv method
  5. Update wrap_update method to report an error log by email
  6. Description of the update_swissmedic (update_swissmedic_followers) process in detail

Goal
  • debug bsv_followers (oddb.org) autorun / 70 %
Milestones
  1. Check FileUtils.cmp in detail 8:30
  2. Check logging system in oddb.org (Log LogFile classes) 10:00
  3. Describe the updating process of files in detail (update_bsv, update_bsv_followers)
  4. Logging test locally
    • Update source code and commit and pull on production server 15:15
  5. Think about the possible flows in which the update_bsv method is suspended
  6. Update wrap_update method to report the error by email
  7. Describe the updating process of files in detail (update_swissmedic, update_swissmedic_followers)
    • Update source code and commit and pull on production server
Summary
Commits
ToDo Tomorrow
Keep in Mind
Attached Files

Check FileUtils.cmp method in detail

The following are equivalent (cmp and identical? are aliases of compare_file)

  • FileUtils.cmp(file_a, file_b)
  • FileUtils.compare_file(file_a, file_b)
  • FileUtils.identical?(file_a, file_b)

compare_file(a, b)

# File fileutils.rb, line 792
  def compare_file(a, b)
    return false unless File.size(a) == File.size(b)
    File.open(a, 'rb') {|fa|
      File.open(b, 'rb') {|fb|
        return compare_stream(fa, fb)
      }
    }
  end

compare_stream(a, b)

# File fileutils.rb, line 810
  def compare_stream(a, b)
    bsize = fu_stream_blksize(a, b)
    sa = sb = nil
    while sa == sb
      sa = a.read(bsize)
      sb = b.read(bsize)
      unless sa and sb
        if sa.nil? and sb.nil?
          return true
        end
      end
    end
    false
  end

Notes

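  • According to the source code above,
    • compare_file first compares the file sizes and returns false immediately if they differ
    • Only when the sizes match does it compare the file contents block by block via compare_stream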
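
The behaviour described in the notes above can be checked with a small script. This is only a sketch: the file names and contents are made up for illustration, and it uses nothing beyond FileUtils.cmp and temporary files.

```ruby
require 'fileutils'
require 'tmpdir'

# Returns the results of FileUtils.cmp for an identical copy, a file of a
# different size, and a same-size file with different contents.
def cmp_demo
  Dir.mktmpdir do |dir|
    a = File.join(dir, 'a.zip')   # hypothetical file names
    b = File.join(dir, 'b.zip')
    File.open(a, 'wb') { |fh| fh << 'abc' }

    FileUtils.cp(a, b)
    same = FileUtils.cmp(a, b)      # same size, same bytes

    File.open(b, 'wb') { |fh| fh << 'abcd' }
    longer = FileUtils.cmp(a, b)    # sizes differ

    File.open(b, 'wb') { |fh| fh << 'abx' }
    changed = FileUtils.cmp(a, b)   # same size, different bytes

    [same, longer, changed]
  end
end

p cmp_demo  # => [true, false, false]
```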

Reference

Check logging process in oddb.org

Notes

  • The logging processes in oddb.org and de.oddb.org look different
  • de.oddb.org mainly uses the Logger standard library
  • oddb.org defines its own Log class and LogFile module
    • The Log class in oddb.org mainly handles sending mail, whereas de.oddb.org delegates sending email to mail.rb
    • The LogFile module in oddb.org handles writing log files
    • It seems that cronolog is used for the output files from LogFile in oddb.org
  • It would be better to use the Logger standard library, since it is easy, simple, and reliable.
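
For comparison, a minimal sketch of what logging with the Logger standard library could look like. The program name and messages are only examples, and the log is written to an in-memory buffer so that the script has no side effects.

```ruby
require 'logger'
require 'stringio'

# Writes two example entries with Logger and returns the logged text.
def logger_demo
  buffer = StringIO.new              # stand-in for a log file
  logger = Logger.new(buffer)
  logger.progname = 'update_bsv'     # example program name
  logger.debug('getin update_bsv')
  logger.info("return value = #{nil.inspect}")
  buffer.string
end

puts logger_demo
```

Logger.new also accepts a file path and a rotation period, e.g. Logger.new('log/oddb/debug.log', 'monthly'), where the path is only an assumption for illustration.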

Check cronolog processes on production server

$ ps aux|grep cronolog
....
root     27580  0.0  0.0   3852   524 ?        S    03:10   0:00 /usr/sbin/cronolog -l /var/www/de.oddb.org/log/mm/access_log /var/www/de.oddb.org/log/mm/%Y/%m/%d/access_log
root     27581  0.0  0.0   3852   524 ?        S    03:10   0:00 /usr/sbin/cronolog -l /var/www/de.oddb.org/log/oddb/access_log /var/www/de.oddb.org/log/oddb/%Y/%m/%d/access_log
root     27583  0.0  0.0   3856   528 ?        S    03:10   0:00 /usr/sbin/cronolog -l /var/www/ch.oddb.org/log/oddb/access_log /var/www/ch.oddb.org/log/oddb/%Y/%m/%d/access_log

References

Focusing on LogFile.rb in oddb.org

src/util/logfile.rb

#!/usr/bin/env ruby
# LogFile -- ODDB -- 21.10.2003 -- hwyss@ywesee.com

require 'util/oddbconfig'
require 'date'
require 'fileutils'

module ODDB
  module LogFile
    LOG_ROOT = File.expand_path('log', PROJECT_ROOT)
    def append(key, line, time=Time.local)
      file = filename(key, time)
      dir = File.dirname(file)
      FileUtils.mkdir_p(dir)
      timestr = time.strftime('%Y-%m-%d %H:%M:%S %Z')
      File.open(file, 'a') { |fh| fh << [timestr, line, "\n"].join }
    end
    def filename(key, time)
      path = [
        key,
        time.year,
        sprintf('%02i', time.month) + '.log',
      ].join('/')
      File.expand_path(path, LOG_ROOT)
    end
    def read(key, time)
      begin
        File.read(filename(key, time))
      rescue(StandardError)
        ''
      end
    end
    module_function :append
    module_function :filename
    module_function :read
  end
end

Test src/util/updater.rb

    def update_bsv
      LogFile.append(:masa, "getin update_bsv")
=begin
      logs_pointer = Persistence::Pointer.new([:log_group, :bsv_sl])
      logs = @app.create(logs_pointer)
      this_month = Date.new(@@today.year, @@today.month)
      if (latest = logs.newest_date) && latest > this_month
        this_month = latest
      end
      klass = BsvXmlPlugin
      plug = klass.new(@app)
      subj = 'SL-Update (XML)'
      wrap_update(klass, subj) { 
        if plug.update
          log_notify_bsv(plug, this_month, subj)
        end
      }
=end
    end

Run servers

  • oddb.org/bin/oddbd
  • oddb.org/ext/meddata/bin/meddatad

Run jobs/import_daily

Result

masa@masa ~/ywesee/oddb.org $ cat log/masa/2010/10.log 
2010-10-28 10:07:25 CESTgetin update_bsv

Check log directory on production server (we have to use a directory name that differs from the existing ones)

/var/www/oddb.org/log $ ls oddb
Let's put the debug directory here.

Notes

  • Let's use :debug keyword as a directory name
  • Use oddb/debug directory
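
How a key such as 'oddb/debug' ends up as a path can be sketched as follows. log_filename is a stand-alone copy of the logic in LogFile.filename shown above, and LOG_ROOT is set to the production log directory purely for illustration (a Unix-style file system is assumed).

```ruby
# Stand-alone copy of the path logic of ODDB::LogFile.filename;
# LOG_ROOT is assumed here instead of being derived from PROJECT_ROOT.
LOG_ROOT = '/var/www/oddb.org/log'

def log_filename(key, time)
  path = [
    key,
    time.year,
    format('%02i', time.month) + '.log',
  ].join('/')
  File.expand_path(path, LOG_ROOT)
end

puts log_filename('oddb/debug', Time.local(2010, 10, 28))
# prints /var/www/oddb.org/log/oddb/debug/2010/10.log
```

So one file per key and month is written, and a key containing a slash simply creates a nested directory.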

Description of the update_bsv (update_bsv_followers) process in detail

1. jobs/import_daily

#!/usr/bin/env ruby
# must be scheduled in crontab to run as the same user as oddb

$: << File.expand_path('../src', File.dirname(__FILE__))
$: << File.expand_path('..', File.dirname(__FILE__))

require 'util/job'
require 'util/updater'

module ODDB
  module Util
    Job.run do |system|
      Updater.new(system).run
    end
  end
end

Notes

  • This job is called from crond every day

2. src/util/updater.rb

    def run
      logfile_stats
      if(update_swissmedic)
        update_swissmedic_followers
      end
      update_swissmedicjournal
      update_vaccines


      if(update_bsv)
        update_bsv_followers
      end


      update_narcotics
      run_on_monthday(1) {
        update_interactions 
      }
    end

Notes

  • update_bsv_followers runs only if update_bsv returns a true value (neither nil nor false)
  • The point: whether update_bsv runs correctly or not
  • In other words: whether update_bsv returns nil or not
  • The return value of the update_bsv method is not clear from looking at the source code alone

3. src/util/updater.rb

    def update_bsv
      logs_pointer = Persistence::Pointer.new([:log_group, :bsv_sl])
      logs = @app.create(logs_pointer)
      this_month = Date.new(@@today.year, @@today.month)
      if (latest = logs.newest_date) && latest > this_month
        this_month = latest
      end
      klass = BsvXmlPlugin
      plug = klass.new(@app)
      subj = 'SL-Update (XML)'
      wrap_update(klass, subj) { 


        if plug.update
          log_notify_bsv(plug, this_month, subj)
        end


      }
    end

Notes

  • The important part: the plug.update method (plug is a BsvXmlPlugin)
  • If its return value is neither nil nor false, the log_notify_bsv method sends the report email

4. src/plugin/bsv_xml.rb

    def update
      path = download_to ARCHIVE_PATH
      if File.exist?(@latest) && FileUtils.cmp(@latest, path)
        FileUtils.rm path
        return
      end
      _update path
      FileUtils.cp path, @latest
      path
    end

Notes

  • This is the main part, which compares the downloaded file (XMLPublications-2010.xx.xx.zip) with the latest file (XMLPublications-latest.zip) and then replaces the latter
  • The XML file is updated if either of the following conditions holds
    • XMLPublications-latest.zip does NOT exist
    • XMLPublications-latest.zip exists, but the downloaded file and the latest file are different
  • Otherwise (the latest file exists and is identical to the downloaded file), the downloaded file is removed and update returns nil, so no update takes place
  • We should check these conditions on the production server
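
The conditions above can be tried out in isolation. The following is a simplified sketch, not the plugin itself: update_if_changed is a hypothetical stand-in for BsvXmlPlugin#update in which the download and the import (_update) are left out, so only the compare/remove/copy decision remains.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical stand-in for BsvXmlPlugin#update without download and import.
def update_if_changed(path, latest)
  if File.exist?(latest) && FileUtils.cmp(latest, path)
    FileUtils.rm path
    return                     # bare return gives nil
  end
  FileUtils.cp path, latest
  path
end

def update_demo
  Dir.mktmpdir do |dir|
    latest = File.join(dir, 'XMLPublications-latest.zip')
    path   = File.join(dir, 'XMLPublications-2010.10.28.zip')
    File.open(path, 'wb') { |fh| fh << 'v1' }

    first  = update_if_changed(path, latest)   # no latest file yet
    second = update_if_changed(path, latest)   # identical to the latest file

    [first == path, second, File.exist?(path)]
  end
end

p update_demo  # => [true, nil, false]
```

The second call returns nil and removes the downloaded file, which is the case in which update_bsv_followers would not run.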

Logging test of a manual job run on the local machine

src/util/updater.rb

    def run
      logfile_stats
=begin
      if(update_swissmedic)
        update_swissmedic_followers
      end
      update_swissmedicjournal
      update_vaccines
=end
return_value_update_bsv = update_bsv
LogFile.append('oddb/debug', " return_value_update_bsv=" + return_value_update_bsv.inspect.to_s, Time.now)
      #if(update_bsv)
      if(return_value_update_bsv)
        update_bsv_followers
      end
=begin
      update_narcotics
      run_on_monthday(1) {
        update_interactions 
      }
=end
    end

src/util/updater.rb

    def update_bsv
LogFile.append('oddb/debug', " getin update_bsv", Time.now)

      logs_pointer = Persistence::Pointer.new([:log_group, :bsv_sl])
      logs = @app.create(logs_pointer)
      this_month = Date.new(@@today.year, @@today.month)
      if (latest = logs.newest_date) && latest > this_month
        this_month = latest
      end
      klass = BsvXmlPlugin
      plug = klass.new(@app)
      subj = 'SL-Update (XML)'
      wrap_update(klass, subj) {
return_value_plug_update = plug.update
LogFile.append('oddb/debug', " return_value_BsvXmlPlugin.update = " + return_value_plug_update.inspect.to_s, Time.now)
        #if plug.update
        if return_value_plug_update
          log_notify_bsv(plug, this_month, subj)
        end
      }
    end

src/util/updater.rb

    def update_bsv_followers
LogFile.append('oddb/debug', " getin update_bsv_followers", Time.now)
=begin
      update_trade_status
      update_medwin_packages
      update_lppv
      update_price_feeds
      export_oddb_csv
      # export_oddb2_csv # Disabled 4.1.2010
      export_ouwerkerk
      export_generics_xls
      export_competition_xlss
=end
    end

src/util/updater.rb

    def log_notify_bsv(plug, date, subj='SL-Update')
LogFile.append('oddb/debug', " getin log_notify_bsv", Time.now)
      pointer = Persistence::Pointer.new([:log_group, :bsv_sl], [:log, date])
      values = log_info(plug)
      if log = pointer.resolve(@app)
        change_flags = values[:change_flags]
        if previous = log.change_flags
          previous.each do |ptr, flgs|
            if flags = change_flags[ptr]
              flags.concat flgs
              flags.uniq!
            else
              change_flags[ptr] = flgs
            end
          end
        end
      end
      log = @app.update(pointer.creator, values)
      #log.notify(subj)
return_value_log_notify = log.notify(subj)
LogFile.append('oddb/debug', " return_value_log_notify = " + return_value_log_notify.inspect.to_s, Time.now)
      log2 = Log.new(date)
      log2.update_values log_info(plug, :log_info_bsv)
return_value_log2_notify = log2.notify(subj)
LogFile.append('oddb/debug', " return_value_log2_notify = " + return_value_log2_notify.inspect.to_s, Time.now)
      #log2.notify(subj)
return_value_log2_notify
    end

src/plugin/bsv_xml.rb

    def update
LogFile.append('oddb/debug', " getin BsvXmlPlugin.update", Time.now)
      path = download_to ARCHIVE_PATH
LogFile.append('oddb/debug', " path = " + path.inspect.to_s, Time.now)
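# In the recorded run below this line read @latet (typo, an unset instance variable), hence the logged "@latest = nil"
LogFile.append('oddb/debug', " @latest = " + @latest.inspect.to_s, Time.now)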
file_exists = File.exist?(@latest)
comp_files = ""
comp_files = FileUtils.cmp(@latest, path) if file_exists
LogFile.append('oddb/debug', ' File.exist?(@latest) = ' + file_exists.inspect.to_s, Time.now)
LogFile.append('oddb/debug', ' FileUtils.cmp(@latest, path) = ' + comp_files.inspect.to_s, Time.now)
      #if File.exist?(@latest) && FileUtils.cmp(@latest, path)
      if file_exists && comp_files
        FileUtils.rm path
LogFile.append('oddb/debug', " FileUtils.rm #{path} ", Time.now)
        return
      end
      _update path
LogFile.append('oddb/debug', " FileUtils.cp #{path}, #{@latest}", Time.now)
      FileUtils.cp path, @latest
      path
    end

src/util/log.rb

    def notify(subject = nil, reply_to = nil)
LogFile.append('oddb/debug', " getin Log.notify (SL-Update)", Time.now) if subject == 'SL-Update'
      subj = [
        'ch.ODDB.org Report',
        subject,
        (@date_str || @date.strftime('%m/%Y')),
      ].compact.join(' - ')

      text = text_part(@report)

Run jobs/import_daily manually

Result

masa@masa ~/ywesee/oddb.org $ cat log/oddb/debug/2010/10.log 

2010-10-28 14:19:32 CEST getin update_bsv
2010-10-28 14:19:36 CEST getin BsvXmlPlugin.update
2010-10-28 14:19:45 CEST path = "/home/masa/ywesee/oddb.org/data/xml/XMLPublications-2010.10.28.zip"
2010-10-28 14:19:45 CEST @latest = nil
2010-10-28 14:19:45 CEST File.exist?(@latest) = false
2010-10-28 14:19:45 CEST FileUtils.cmp(@latest, path) = ""
2010-10-28 14:30:49 CEST FileUtils.cp /home/masa/ywesee/oddb.org/data/xml/XMLPublications-2010.10.28.zip, /home/masa/ywesee/oddb.org/data/xml/XMLPublications-latest.zip
2010-10-28 14:30:49 CEST return_value_BsvXmlPlugin.update = "/home/masa/ywesee/oddb.org/data/xml/XMLPublications-2010.10.28.zip"
2010-10-28 14:30:49 CEST getin log_notify_bsv
2010-10-28 14:31:10 CEST getin Log.notify (SL-Update)
2010-10-28 14:31:16 CEST return_value_log_notify = ["mhatakeyama@ywesee.com"]
2010-10-28 14:31:16 CEST getin Log.notify (SL-Update)
2010-10-28 14:31:20 CEST return_value_log2_notify = ["mhatakeyama@ywesee.com"]
2010-10-28 14:31:20 CEST return_value_update_bsv=["mhatakeyama@ywesee.com"]
2010-10-28 14:31:20 CEST getin update_bsv_followers

masa@masa ~/ywesee/oddb.org $ ls data/xml/ -al
insgesamt 6640
drwxr-xr-x 2 masa masa      16 28. Okt 13:35 .
drwxr-xr-x 9 masa masa      56 28. Okt 11:30 ..
-rw-r--r-- 1 masa masa 3398226 28. Okt 13:24 XMLPublications-2010.10.28.zip
-rw-r--r-- 1 masa masa 3398226 28. Okt 13:35 XMLPublications-latest.zip

Commit Added the logging of update_bsv process

Think about possible flows and return values of update_bsv method

How to make the return value of update_bsv clear

BraSt

  • Unclear points
    1. Return value of wrap_update method (block)
    2. Return value of if-statement if 'if condition' becomes nil or false
    3. Return value of log_notify_bsv method
  • In Ruby, every value is treated as true in a condition unless it is nil or false
  • For example, 123, 'hello', :abc, 0, "", and '' are all treated as true.

Test

$ cat test.rb
p 1 if 123
p 2 if "hello"
p 3 if :abc
p 4 if 0
p 5 if ""
p 6 if ''

$ ruby test.rb 
1
2
3
4
5
6

An if statement returns the value of the last expression evaluated in the executed branch. If the condition is nil or false and there is no else part, the if statement returns nil.
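
A short test of this rule, in the same style as test.rb above. notify_if is a hypothetical method whose body is a single if statement without else, like the block passed to wrap_update. The mail address is a placeholder.

```ruby
# The method returns whatever the if statement returns.
def notify_if(updated)
  if updated
    ['someone@example.com']   # placeholder for the result of log.notify
  end
end

p notify_if('/path/to/file.zip')  # => ["someone@example.com"]
p notify_if(nil)                  # => nil
p notify_if(false)                # => nil
```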

References

Update wrap_update method to report an error log by email

Refer

src/util/updater.rb#wrap_update

    def wrap_update(klass, subj, &block)
      begin
        block.call
      rescue Exception => e #RuntimeError, StandardError => e
        notify_error(klass, subj, e)
        raise
      end
    rescue StandardError
      nil
    end

Notes

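  • This method does two things: 1. it catches any error (Exception) raised in the block and 2. it sends an error report mail via notify_error.
  • After the report is sent, the error is raised again. A StandardError is then rescued by the outer rescue clause and nil is returned, while other exceptions propagate to the caller
  • If no error happens in the block, the return value of the block is returned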
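
The return values can be confirmed with a stand-alone copy of wrap_update. In this sketch notify_error is a hypothetical stand-in that records the report in an array instead of sending mail, and String and 'demo' are arbitrary arguments.

```ruby
REPORTS = []

# Stand-in for Updater#notify_error: records the report instead of mailing it.
def notify_error(klass, subj, error)
  REPORTS << "#{klass} #{subj}: #{error.message}"
end

# Same control flow as wrap_update above.
def wrap_update(klass, subj, &block)
  begin
    block.call
  rescue Exception => e
    notify_error(klass, subj, e)
    raise
  end
rescue StandardError
  nil
end

p wrap_update(String, 'demo') { :ok }            # => :ok
p wrap_update(String, 'demo') { raise 'boom' }   # => nil

begin
  # NotImplementedError is a ScriptError, not a StandardError
  wrap_update(String, 'demo') { raise NotImplementedError, 'todo' }
rescue NotImplementedError => e
  puts "propagated: #{e.class}"
end

p REPORTS  # => ["String demo: boom", "String demo: todo"]
```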

Description of the update_swissmedic (update_swissmedic_followers) process in detail

1. jobs/import_daily

#!/usr/bin/env ruby
# must be scheduled in crontab to run as the same user as oddb

$: << File.expand_path('../src', File.dirname(__FILE__))
$: << File.expand_path('..', File.dirname(__FILE__))

require 'util/job'
require 'util/updater'

module ODDB
  module Util
    Job.run do |system|
      Updater.new(system).run
    end
  end
end

Notes

  • This job is called from crond every day

2. src/util/updater.rb

    def run
      logfile_stats


      if(update_swissmedic)
        update_swissmedic_followers
      end


      update_swissmedicjournal
      update_vaccines
      if(update_bsv)
        update_bsv_followers
      end
      update_narcotics
      run_on_monthday(1) {
        update_interactions 
      }
    end

Notes

  • Up to this point, the flow is the same as for update_bsv

3. src/util/updater.rb

    def update_swissmedic(*args)
      logs_pointer = Persistence::Pointer.new([:log_group, :swissmedic])
      logs = @app.create(logs_pointer)
      klass = SwissmedicPlugin
      plug = klass.new(@app)
      wrap_update(klass, "swissmedic") {
        if(plug.update(*args))
          month = @@today << 1
          pointer = logs.pointer + [:log, Date.new(month.year, month.month)]
          log = @app.update(pointer.creator, log_info(plug))
          log.notify('Swissmedic XLS')
        end
      }
    end

Notes

  • This also has the same structure as update_bsv

4. src/plugin/swissmedic.rb

    def update(agent=Mechanize.new, target=get_latest_file(agent))
      if(target)
        initialize_export_registrations agent
        diff target, @latest, [:atc_class, :sequence_date]
        update_registrations @diff.news + @diff.updates, @diff.replacements
        update_export_registrations @export_registrations
        update_export_sequences @export_sequences
        sanity_check_deletions(@diff)
        delete @diff.package_deletions
        deactivate @diff.sequence_deletions
        deactivate @diff.registration_deletions
        FileUtils.cp target, @latest
        @change_flags = @diff.changes.inject({}) { |memo, (iksnr, flags)|
          memo.store Persistence::Pointer.new([:registration, iksnr]), flags
          memo
        }
      end
    end

Notes

  • This has a quite different structure from the BsvXmlPlugin.update method
  • As far as I can tell from this source code, if the get_latest_file method does not return a file (target is nil), the update_swissmedic method returns nil and update_swissmedic_followers does not run.

5. src/plugin/swissmedic.rb#get_latest_file

    def get_latest_file(agent, keyword='Packungen')
      page = agent.get @index_url
      links = page.links.select do |link|
        ptrn = keyword.gsub /[^A-Za-z]/u, '.'
        /#{ptrn}/iu.match link.attributes['title']
      end
      link = links.first or raise "could not identify url to #{keyword}.xls"
      file = agent.get(link.href)
      download = file.body
      latest_name = File.join @archive, "#{keyword}-latest.xls"
      latest = ''
      if(File.exist? latest_name)
        latest = File.read latest_name
      end
      if(download[-1] != ?\n)
        download << "\n"
      end
      target = File.join @archive, @@today.strftime("#{keyword}-%Y.%m.%d.xls")
      if(download != latest)
        File.open(target, 'w') { |fh| fh.puts(download) }
        target
      end
    end

Notes

  • This method downloads the latest file online
  • What will be the return value of this method?
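
Reading the source above, the method appears to return the path of the newly written file when the download differs from the latest file, and nil otherwise (an if statement without else). It raises an error when no matching link is found. The following sketch tests only that last part: save_if_changed is a hypothetical stand-in in which the Mechanize download is replaced by a plain string.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical stand-in for the end of get_latest_file.
def save_if_changed(download, latest_name, target)
  latest = ''
  latest = File.read(latest_name) if File.exist?(latest_name)
  download += "\n" unless download.end_with?("\n")
  if download != latest
    File.open(target, 'w') { |fh| fh.puts(download) }
    target
  end
end

def latest_file_demo
  Dir.mktmpdir do |dir|
    latest_name = File.join(dir, 'Packungen-latest.xls')
    target      = File.join(dir, 'Packungen-2010.10.28.xls')

    first = save_if_changed('data', latest_name, target)
    FileUtils.cp target, latest_name   # done by SwissmedicPlugin#update afterwards
    second = save_if_changed('data', latest_name, target)

    [first == target, second]
  end
end

p latest_file_demo  # => [true, nil]
```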
Page last modified on July 13, 2011, at 11:57 AM