view · edit · sidebar · attach · print · history

20101025-debug-import_gkv_autorun

<< Masa.20101026-update-import_gkv debug-import_dimdi | 2010 | Masa.20101022-debug-export_chde_xls update-xls2odat >>


  1. Thread Test
  2. Admin tool
  3. Check exportd in de.oddb.org
  4. Debug bag_followers suspend
  5. Debug import_gkv autorun

Goal
  • debug import_gkv autorun / 70 %
Milestones
  1. Review the joblog last Friday
  2. Read source code bag_followers
  3. run import_gkv manually and check the result
  4. auto run import_gkv again
  5. BraSt
Summary
Commits
ToDo Tomorrow
Keep in Mind
Attached Files

Thread Test

Test Ruby thread (1.8.6)

$flag = false

threads = []
3.times do
  puts "create new thread"
  threads << Thread.new do
    loop do
      break if $flag
    end
  end
  puts "Sub thread: sleep 5 sec"
  sleep 5
end

puts "Main thread: sleep 10 sec"
sleep 10
$flag = true

threads.each do |thread|
  thread.join
end
puts "done"

Result

Notes

  • in Ruby 1.8, in my local environment (Funtoo linux), Ruby thread looks using only one CPU.

Reference

Admin tool

Experiment (in my local)

masa@masa ~/ywesee/de.oddb.org $ bin/admin
de.oddb> Thread.list
-> #<Thread:0x7f8bd6b9e650>#<Thread:0x7f8bd6b9f168>#<Thread:0x7f8bd6bf2980>#<Thread:0x7f8bde576fe0>#<Thread:0x7f8be0a68df8>#<Thread:0x7f8bd6b9df70>#<Thread:0x7f8bd6a645c8>#<Thread:0x7f8bd6bdbbe0>
de.oddb> @exporter
-> #<Thread:0x7f8bd6b9f168>
de.oddb> @updater
-> #<Thread:0x7f8bd6b9e650>
de.oddb> Thread.main
-> #<Thread:0x7f8be0a68df8>
de.oddb> Thread.list.length
-> 8
de.oddb> @exporter.alive?
-> true
de.oddb> @updater.alive?
-> true
de.oddb> @exporter.status
-> sleep
de.oddb> @updater.status
-> sleep

Experiment (on production server)

/var/www/de.oddb.org $ bin/admin 
de.oddb> Thread.list.length
-> 21
de.oddb> Thread.list
-> Array
de.oddb> @exporter
-> #<Thread:0x7fed979b73c8>
de.oddb> @updater
-> #<Thread:0x7fed979b7008>
de.oddb> @exporter.alive?
-> false
de.oddb> @updater.alive?
-> true
de.oddb> @exporter.status
-> 
de.oddb> @updater.status
-> sleep

Notes

  • @exporter thread looks not alive

Experiment (in my local)

lib/oddb/export/server.rb

      def Server.safe_export(exporter_class, name, *args, &block)
        dir = ODDB.config.export_dir or raise "Please configure 'export_dir'"
        FileUtils.mkdir_p(dir)
        Tempfile.open(name, dir) { |fh|
          raise "MASA"

Before run export_chde_xls

masa@masa ~/ywesee/de.oddb.org $ bin/admin 
de.oddb> @exporter.alive?
-> true

Run export_chde_xls

After run export_chde_xls

masa@masa ~/ywesee/de.oddb.org $ bin/admin 
de.oddb> @exporter.alive?
-> true

Notes

  • Even if the RuntimeError (StandrdError) happens, @exporter thread is still alive.
  • I have not understood yet why @exporter thread died on production server.

Reboot de.oddb.org on production server

Check

/var/www/de.oddb.org $ bin/admin
de.oddb> Thread.list.length
-> 21
de.oddb> @exporter.alive?
-> true
de.oddb> @updater.alive?
-> true

Notes

  • Now it looks good. We will see the exporter and updater processes next month.

Check exportd in de.oddb.org

There is a daemon service /service/de.oddb-exporter on production server.

Question

  • What is this?

bin/exportd

module ODDB
  log_file = @config.log_file_export
  if(log_file.is_a?(String))
    FileUtils.mkdir_p(File.dirname(log_file))
    log_file = File.open(log_file, 'a')
    at_exit { log_file.close }
  end
  logger = Logger.new(log_file)
  logger.level = Logger.const_get(@config.log_level_export)
  @logger = logger

  Currency = DRbObject.new(nil, @config.currency_rates)

  begin
    server = ODDB::Export::Server
    server.extend(DRbUndumped)
    @server = server

    url = @config.server_url_export
    url.untaint
    DRb.start_service(url, server)
    $SAFE = 1
    logger.info('start') {
      sprintf("starting export-server on %s", url) }
    DRb.thread.join
  rescue Exception => error
    logger.error('fatal') { error }
    raise
  end
end

Notes

  • bin/exportd starts DRb service of ODDB::Export::Server

lib/oddb/util/server.rb

      def run_exporter
        @exporter = run_at(ODDB.config.export_hour) { Exporter.run }
      end

Notes

  • ODDB::Util::Server, which is called from bin/oddbd, calls ODDB::Export.run

lib/oddb/util/exporter.rb

module ODDB
  Exporter = DRb::DRbObject.new(nil, @config.server_url_export)
end

Notes

  • ODDB::Export is an independent DRb process from ODDB::Util::Server
  • It means that ODDB::Export::Server process is independent from ODDB::Util::Server (main) process
  • ODDB::Export::Server process is called from a sub thread of ODDB::Util::Server process.
  • ODDB::Util::Updater process is done in a sub thread but in the same DRb process as main.
    • It is not independet from main DRb process (ODDB::Util::Server).

Interim Summar

I have read the source code.
As far as I understand,

There are two DRb processes in de.oddb.org.
1. main (ODDB::Util::Server)
2. exportd (ODDB::Exporter)

and there are some threads in the main DRb process.
1. main thread
2. exporter thread
3. updater thread

(There are more sub threads)

exportd is called from the exporter thread in the main DRb process.
updater is done in the updater thread of the main DRb process.

So, I guess
exporter (probably) runs in a different CPU from the main de.oddb.org process.
but updater runs on the same CPU as the main process.

I have not understood why the updater (import_gkv) has not ran automatically,
because updater thread on production server was alive.

Let's see the next exporter and updater jobs.

Test locally

lib/oddb/config.rb

    'export_hour'           => 11,
    'update_hour'           => 12,

lib/oddb/util/server.rb

      def run_at(hour, &block)
        Thread.new {
          loop {
            now = Time.now
            run_at = Time.local(now.year, now.month, now.day, hour)
            while(now > run_at)
              run_at += 24*60*60
            end
print "now=", now, "\n"
print "run_at", run_at, "\n"
            sleep(run_at - now)
            block.call
          }
        }
      end
now=Mon Oct 25 10:55:57 +0200 2010
run_atMon Oct 25 11:00:00 +0200 2010
now=Mon Oct 25 10:55:57 +0200 2010
run_atMon Oct 25 12:00:00 +0200 2010

Notes

  • I will check the result after lunch

Result

admin

masa@masa ~/ywesee/de.oddb.org $ bin/admin 
de.oddb> @exporter.status
-> sleep
de.oddb> @updater.status
-> sleep
de.oddb> Thread.list.length
-> 8

files

masa@masa ~/ywesee/de.oddb.org $ ls var/downloads/
masa@masa ~/ywesee/de.oddb.org $ 

masa@masa ~/ywesee/de.oddb.org $ ls log
import_dimdi  import_gkv

masa@masa ~/ywesee/de.oddb.org $ cat log/import_dimdi 

masa@masa ~/ywesee/de.oddb.org $ cat log/import_gkv 
E, [2010-10-25T12:19:47.057275 #7444] ERROR -- Gkv: No such file or directory - log_file=/home/masa/ywesee/de.oddb.org/log/import_gkv

Notes

  • It looks run Exporter.run, but export_chde_xls did not run, since it is not 1st day in a month.
  • import_gkv and import_dimdi looked starting to run but they stopped with an error.

Debug bag_followers

Email ch.oddb: bag_followers

On ch.oddb the

       bag_followers

do not run. The only thing that runs is

       bag_xml_update

Check mails

  1. The last SL-Update (XML) ch.ODDB.org Report - SL-Update (XML) - 10/2010 on 20101002
  2. ch.ODDB.org Report - Trade-Status on 20101002
  3. ch.ODDB.org Report - Medwin-Packages on 20101002
  4. ch.ODDB.org Report - Lppv prices on 20101002
  5. ch.ODDB.org Report - oddb.csv on 20101002
  6. ch.ODDB.org Report - Med-Drugs - Swissmedic on 20101002
  7. ch.ODDB.org Report - Generikaliste on 20101002
  8. ch.ODDB.org Report - Error: Fach- und Patienteninfo News on 20101002

Debug import_gkv autorun again

Test run import_gkv manually

masa@masa ~/ywesee/de.oddb.org $ jobs/import_gkv

Result

Mon Oct 25 13:49:38 2010: de.oddb.org ODDB::Import::Gkv#import
Imported  6557 Zubef-Entries on 25.10.2010:
Visited   6557 existing Zubef-Entries
Visited   6557 existing Companies
Visited   1029 existing Substances
Created      0 new Zubef-Entries
Created      0 new Products
Created      0 new Sequences
Created      0 new Companies
Created      0 new Substances
Assigned    27 Chemical Equivalences
Assigned     3 Companies
Created      0 Incomplete Packages:
Created      3 Product(s) without a name (missing product name):
http://de.oddb.org/de/drugs/product/uid/3480899
http://de.oddb.org/de/drugs/product/uid/3572900
http://de.oddb.org/de/drugs/product/uid/3572913

Notes

  • It looks good manually.

Test autorun import_gkv

lib/oddb/util/updater.rb

      def Updater.run(today = Date.today)
p "getin Updater.run"
#        run_logged_job 'import_dimdi'
        run_logged_job 'import_gkv'
=begin
        case today.day
        when 1
          run_logged_job 'import_pharmnet'
          run_logged_job 'import_whocc'
        when 15
          run_logged_job 'import_pharma24'
        end
=end
      end

lib/oddb/util/server.rb

      def run_at(hour, &block)
        Thread.new {
          loop {
            now = Time.now
            #run_at = Time.local(now.year, now.month, now.day, hour)
            run_at = Time.local(now.year, now.month, now.day, hour, 50)
            while(now > run_at)
              run_at += 24*60*60
            end
print "now ", now, "\n"
print "run_at ", run_at, "\n"
            sleep(run_at - now)
            block.call
          }
        }
      end

lib/oddb/config.rb

    'update_hour'           => 14,

Check before oddbd run

masa@masa ~/ywesee/de.oddb.org $ git status
# On branch master
# Changed but not updated:
#   (use "git add <file>..." to update what will be committed)
#   (use "git checkout -- <file>..." to discard changes in working directory)
#
#       modified:   lib/oddb/config.rb
#       modified:   lib/oddb/util/server.rb
#       modified:   lib/oddb/util/updater.rb
#
no changes added to commit (use "git add" and/or "git commit -a")

run oddbd

masa@masa ~/ywesee/de.oddb.org $ bin/oddbd
now Mon Oct 25 14:42:00 +0200 2010
run_at Mon Oct 25 14:50:00 +0200 2010

Result (after updater run)

I, [2010-10-25T14:54:01.361746 #10042]  INFO -- start: starting oddb-server on druby://localhost:11000
"getin Updater.run_logged_job"
"getin Updater.run"
/home/masa/ywesee/de.oddb.org/lib/oddb/html/view/drugs/package.rb:373: warning: parenthesize argument(s) for future version
now Mon Oct 25 14:56:00 +0200 2010
run_at Tue Oct 26 14:56:00 +0200 2010
masa@masa ~/ywesee/de.oddb.org $ git status
# On branch master
# Changed but not updated:
#   (use "git add <file>..." to update what will be committed)
#   (use "git checkout -- <file>..." to discard changes in working directory)
#
#       modified:   lib/oddb/config.rb
#       modified:   lib/oddb/util/server.rb
#       modified:   lib/oddb/util/updater.rb
#
# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#
#       log/
no changes added to commit (use "git add" and/or "git commit -a")
masa@masa ~/ywesee/de.oddb.org $ ls log -al
insgesamt 12
drwxr-xr-x  2 masa masa    8 25. Okt 14:50 .
drwxr-xr-x 12 masa masa 4096 25. Okt 14:56 ..
-rw-r--r--  1 masa masa  135 25. Okt 14:56 import_gkv
masa@masa ~/ywesee/de.oddb.org $ cat log/import_gkv 
E, [2010-10-25T14:56:00.697468 #10079] ERROR -- Gkv: No such file or directory - log_file=/home/masa/ywesee/de.oddb.org/log/import_gkv

Notes

  • Updater.run looked called, but import_gkv looked NOT running
  • Where is the error message of log/import_gkv coming from?

Experiment

lib/oddb/util/updater.rb

      def Updater.run_logged_job job
p "getin Updater.run_logged_job"
        dir = ODDB.config.oddb_dir
        cmd = File.join dir, 'jobs', job
        log = File.join dir, 'log', job
print "dir=",dir,"\n"
print "cmd=",cmd,"\n"
print "log=",log,"\n"
        IO.popen "#{cmd} log_file=#{log}" do |io|
          # wait for importer to exit
        end
      end

Run bin/oddbd

Result

"getin Updater.run"
"getin Updater.run_logged_job"
dir=/home/masa/ywesee/de.oddb.org
cmd=/home/masa/ywesee/de.oddb.org/jobs/import_gkv
log=/home/masa/ywesee/de.oddb.org/log/import_gkv

Experiment

masa@masa ~/ywesee/de.oddb.org $ /home/masa/ywesee/de.oddb.org/jobs/import_gkv log_file=/home/masa/ywesee/de.oddb.org/log/import_gkv
...
/home/masa/ywesee/de.oddb.org/lib/oddb/html/view/drugs/package.rb:373: warning: parenthesize argument(s) for future version
getin Updater.import_gkv

(ends)

masa@masa ~/ywesee/de.oddb.org $ cat log/import_gkv 
E, [2010-10-25T15:38:51.334689 #10853] ERROR -- Gkv: No such file or directory - log_file=/home/masa/ywesee/de.oddb.org/log/import_gkv

Experiment

lib/oddb/import/gkv.rb

  def download_latest(url, opts={}, &block)
p "getin download_latest"
p "1"
    opts = {:date => Date.today}.merge(opts)
p "2"
    file = File.basename(url)
p "3"
    pdf_dir = File.join(ODDB.config.var, 'pdf/gkv')
p "4"
    FileUtils.mkdir_p(pdf_dir)
p "5"
    dest = File.join(pdf_dir, file)
p "6"
    archive = File.join(ODDB.config.var, 'pdf/gkv',
                sprintf("%s-%s", opts[:date].strftime("%Y.%m.%d"), file))
p "7"
    content = open(url).read
p "8"
    if(!File.exist?(dest) || content.size != File.size(dest))
p "9"
      open(archive, 'w') { |local|
        local << content
      }
p "10"
      open(archive, 'r', &block)
p "11"
      open(dest, 'w') { |local|
        local << content
      }
p "12"
    end
  rescue StandardError => error
p "13"
    ODDB.logger.error('Gkv') { error.message }
  end

Run import_gkv

masa@masa ~/ywesee/de.oddb.org $ /home/masa/ywesee/de.oddb.org/jobs/import_gkv log_file=/home/masa/ywesee/de.oddb.org/log/import_gkv

Result

"getin download_latest"
"1"
"2"
"3"
"4"
"5"
"6"
"7"
"13"

Notes

  • This means a RuntimeError happens at the following point

http://scm.ywesee.com/?p=de.oddb.org/.git;a=blob;f=lib/oddb/import/gkv.rb;h=0d34d765e7384a919cb41a29b340203695b45972;hb=HEAD#l72

content = open(url).read

Experiment

  def download_latest(url, opts={}, &block)
    opts = {:date => Date.today}.merge(opts)
    file = File.basename(url)
    pdf_dir = File.join(ODDB.config.var, 'pdf/gkv')
    FileUtils.mkdir_p(pdf_dir)
    dest = File.join(pdf_dir, file)
    archive = File.join(ODDB.config.var, 'pdf/gkv',
                sprintf("%s-%s", opts[:date].strftime("%Y.%m.%d"), file))

print "url=", url, "\n"

    content = open(url).read
    if(!File.exist?(dest) || content.size != File.size(dest))
      open(archive, 'w') { |local|
        local << content
      }
      open(archive, 'r', &block)
      open(dest, 'w') { |local|
        local << content
      }
    end
  rescue StandardError => error

print "error=",error,"\n"

    ODDB.logger.error('Gkv') { error.message }
  end

Result

url=log_file=/home/masa/ywesee/de.oddb.org/log/import_gkv
error=No such file or directory - log_file=/home/masa/ywesee/de.oddb.org/log/import_gkv

Notes

  • 'log_file=/home/masa/ywesee/de.oddb.org/log/import_gkv' should be in opts variable, NOT in url

Experiment

lib/oddb/util/updater.rb

      def Updater.import_gkv(opts = {})
STDOUT.print "getin Updater.import_gkv\n"
        importer = Import::Gkv.new
print "opts="
p opts
        if url = opts[:pdf] || importer.latest_url(Mechanize.new, opts)
print "url=",url,"\n"

Result

getin Updater.import_gkv
opts={:pdf=>"log_file=/home/masa/ywesee/de.oddb.org/log/import_gkv"}
url=log_file=/home/masa/ywesee/de.oddb.org/log/import_gkv

Conclusion

Commit Turn back to the origial because import_gkv does not run automatically

Notes

  • I have to think how to add pdf url link directly to the import_gkv job
view · edit · sidebar · attach · print · history
Page last modified on July 13, 2011, at 11:57 AM