<< Masa.20101026-update-import_gkv debug-import_dimdi | 2010 | Masa.20101022-debug-export_chde_xls update-xls2odat >>
suspend
Test Ruby thread (1.8.6)
$flag = false threads = [] 3.times do puts "create new thread" threads << Thread.new do loop do break if $flag end end puts "Sub thread: sleep 5 sec" sleep 5 end puts "Main thread: sleep 10 sec" sleep 10 $flag = true threads.each do |thread| thread.join end puts "done"
Result
Notes
Reference
Experiment (in my local)
masa@masa ~/ywesee/de.oddb.org $ bin/admin de.oddb> Thread.list -> #<Thread:0x7f8bd6b9e650>#<Thread:0x7f8bd6b9f168>#<Thread:0x7f8bd6bf2980>#<Thread:0x7f8bde576fe0>#<Thread:0x7f8be0a68df8>#<Thread:0x7f8bd6b9df70>#<Thread:0x7f8bd6a645c8>#<Thread:0x7f8bd6bdbbe0> de.oddb> @exporter -> #<Thread:0x7f8bd6b9f168> de.oddb> @updater -> #<Thread:0x7f8bd6b9e650> de.oddb> Thread.main -> #<Thread:0x7f8be0a68df8> de.oddb> Thread.list.length -> 8 de.oddb> @exporter.alive? -> true de.oddb> @updater.alive? -> true de.oddb> @exporter.status -> sleep de.oddb> @updater.status -> sleep
Experiment (on production server)
/var/www/de.oddb.org $ bin/admin de.oddb> Thread.list.length -> 21 de.oddb> Thread.list -> Array de.oddb> @exporter -> #<Thread:0x7fed979b73c8> de.oddb> @updater -> #<Thread:0x7fed979b7008> de.oddb> @exporter.alive? -> false de.oddb> @updater.alive? -> true de.oddb> @exporter.status -> de.oddb> @updater.status -> sleep
Notes
Experiment (in my local)
def Server.safe_export(exporter_class, name, *args, &block) dir = ODDB.config.export_dir or raise "Please configure 'export_dir'" FileUtils.mkdir_p(dir) Tempfile.open(name, dir) { |fh| raise "MASA"
Before run export_chde_xls
masa@masa ~/ywesee/de.oddb.org $ bin/admin de.oddb> @exporter.alive? -> true
Run export_chde_xls
After run export_chde_xls
masa@masa ~/ywesee/de.oddb.org $ bin/admin de.oddb> @exporter.alive? -> true
Notes
Reboot de.oddb.org on production server
Check
/var/www/de.oddb.org $ bin/admin de.oddb> Thread.list.length -> 21 de.oddb> @exporter.alive? -> true de.oddb> @updater.alive? -> true
Notes
Now it looks good. We will see the exporter and updater processes next month.
There is a daemon service /service/de.oddb-exporter on production server.
Question
module ODDB log_file = @config.log_file_export if(log_file.is_a?(String)) FileUtils.mkdir_p(File.dirname(log_file)) log_file = File.open(log_file, 'a') at_exit { log_file.close } end logger = Logger.new(log_file) logger.level = Logger.const_get(@config.log_level_export) @logger = logger Currency = DRbObject.new(nil, @config.currency_rates) begin server = ODDB::Export::Server server.extend(DRbUndumped) @server = server url = @config.server_url_export url.untaint DRb.start_service(url, server) $SAFE = 1 logger.info('start') { sprintf("starting export-server on %s", url) } DRb.thread.join rescue Exception => error logger.error('fatal') { error } raise end end
Notes
def run_exporter @exporter = run_at(ODDB.config.export_hour) { Exporter.run } end
Notes
module ODDB Exporter = DRb::DRbObject.new(nil, @config.server_url_export) end
Notes
Interim Summar
I have read the source code. As far as I understand, There are two DRb processes in de.oddb.org. 1. main (ODDB::Util::Server) 2. exportd (ODDB::Exporter) and there are some threads in the main DRb process. 1. main thread 2. exporter thread 3. updater thread (There are more sub threads) exportd is called from the exporter thread in the main DRb process. updater is done in the updater thread of the main DRb process. So, I guess exporter (probably) runs in a different CPU from the main de.oddb.org process. but updater runs on the same CPU as the main process. I have not understood why the updater (import_gkv) has not ran automatically, because updater thread on production server was alive. Let's see the next exporter and updater jobs.
Test locally
'export_hour' => 11, 'update_hour' => 12,
def run_at(hour, &block) Thread.new { loop { now = Time.now run_at = Time.local(now.year, now.month, now.day, hour) while(now > run_at) run_at += 24*60*60 end print "now=", now, "\n" print "run_at", run_at, "\n" sleep(run_at - now) block.call } } end
now=Mon Oct 25 10:55:57 +0200 2010 run_atMon Oct 25 11:00:00 +0200 2010 now=Mon Oct 25 10:55:57 +0200 2010 run_atMon Oct 25 12:00:00 +0200 2010
Notes
admin
masa@masa ~/ywesee/de.oddb.org $ bin/admin de.oddb> @exporter.status -> sleep de.oddb> @updater.status -> sleep de.oddb> Thread.list.length -> 8
files
masa@masa ~/ywesee/de.oddb.org $ ls var/downloads/ masa@masa ~/ywesee/de.oddb.org $ masa@masa ~/ywesee/de.oddb.org $ ls log import_dimdi import_gkv masa@masa ~/ywesee/de.oddb.org $ cat log/import_dimdi masa@masa ~/ywesee/de.oddb.org $ cat log/import_gkv E, [2010-10-25T12:19:47.057275 #7444] ERROR -- Gkv: No such file or directory - log_file=/home/masa/ywesee/de.oddb.org/log/import_gkv
Notes
Email ch.oddb: bag_followers
On ch.oddb the bag_followers do not run. The only thing that runs is bag_xml_update
Check mails
masa@masa ~/ywesee/de.oddb.org $ jobs/import_gkv
Result
Mon Oct 25 13:49:38 2010: de.oddb.org ODDB::Import::Gkv#import Imported 6557 Zubef-Entries on 25.10.2010: Visited 6557 existing Zubef-Entries Visited 6557 existing Companies Visited 1029 existing Substances Created 0 new Zubef-Entries Created 0 new Products Created 0 new Sequences Created 0 new Companies Created 0 new Substances Assigned 27 Chemical Equivalences Assigned 3 Companies Created 0 Incomplete Packages: Created 3 Product(s) without a name (missing product name): http://de.oddb.org/de/drugs/product/uid/3480899 http://de.oddb.org/de/drugs/product/uid/3572900 http://de.oddb.org/de/drugs/product/uid/3572913
Notes
def Updater.run(today = Date.today) p "getin Updater.run" # run_logged_job 'import_dimdi' run_logged_job 'import_gkv' =begin case today.day when 1 run_logged_job 'import_pharmnet' run_logged_job 'import_whocc' when 15 run_logged_job 'import_pharma24' end =end end
def run_at(hour, &block) Thread.new { loop { now = Time.now #run_at = Time.local(now.year, now.month, now.day, hour) run_at = Time.local(now.year, now.month, now.day, hour, 50) while(now > run_at) run_at += 24*60*60 end print "now ", now, "\n" print "run_at ", run_at, "\n" sleep(run_at - now) block.call } } end
'update_hour' => 14,
Check before oddbd run
masa@masa ~/ywesee/de.oddb.org $ git status # On branch master # Changed but not updated: # (use "git add <file>..." to update what will be committed) # (use "git checkout -- <file>..." to discard changes in working directory) # # modified: lib/oddb/config.rb # modified: lib/oddb/util/server.rb # modified: lib/oddb/util/updater.rb # no changes added to commit (use "git add" and/or "git commit -a")
run oddbd
masa@masa ~/ywesee/de.oddb.org $ bin/oddbd now Mon Oct 25 14:42:00 +0200 2010 run_at Mon Oct 25 14:50:00 +0200 2010
Result (after updater run)
I, [2010-10-25T14:54:01.361746 #10042] INFO -- start: starting oddb-server on druby://localhost:11000 "getin Updater.run_logged_job" "getin Updater.run" /home/masa/ywesee/de.oddb.org/lib/oddb/html/view/drugs/package.rb:373: warning: parenthesize argument(s) for future version now Mon Oct 25 14:56:00 +0200 2010 run_at Tue Oct 26 14:56:00 +0200 2010
masa@masa ~/ywesee/de.oddb.org $ git status # On branch master # Changed but not updated: # (use "git add <file>..." to update what will be committed) # (use "git checkout -- <file>..." to discard changes in working directory) # # modified: lib/oddb/config.rb # modified: lib/oddb/util/server.rb # modified: lib/oddb/util/updater.rb # # Untracked files: # (use "git add <file>..." to include in what will be committed) # # log/ no changes added to commit (use "git add" and/or "git commit -a") masa@masa ~/ywesee/de.oddb.org $ ls log -al insgesamt 12 drwxr-xr-x 2 masa masa 8 25. Okt 14:50 . drwxr-xr-x 12 masa masa 4096 25. Okt 14:56 .. -rw-r--r-- 1 masa masa 135 25. Okt 14:56 import_gkv masa@masa ~/ywesee/de.oddb.org $ cat log/import_gkv E, [2010-10-25T14:56:00.697468 #10079] ERROR -- Gkv: No such file or directory - log_file=/home/masa/ywesee/de.oddb.org/log/import_gkv
Notes
Experiment
def Updater.run_logged_job job p "getin Updater.run_logged_job" dir = ODDB.config.oddb_dir cmd = File.join dir, 'jobs', job log = File.join dir, 'log', job print "dir=",dir,"\n" print "cmd=",cmd,"\n" print "log=",log,"\n" IO.popen "#{cmd} log_file=#{log}" do |io| # wait for importer to exit end end
Run bin/oddbd
Result
"getin Updater.run" "getin Updater.run_logged_job" dir=/home/masa/ywesee/de.oddb.org cmd=/home/masa/ywesee/de.oddb.org/jobs/import_gkv log=/home/masa/ywesee/de.oddb.org/log/import_gkv
Experiment
masa@masa ~/ywesee/de.oddb.org $ /home/masa/ywesee/de.oddb.org/jobs/import_gkv log_file=/home/masa/ywesee/de.oddb.org/log/import_gkv ... /home/masa/ywesee/de.oddb.org/lib/oddb/html/view/drugs/package.rb:373: warning: parenthesize argument(s) for future version getin Updater.import_gkv (ends) masa@masa ~/ywesee/de.oddb.org $ cat log/import_gkv E, [2010-10-25T15:38:51.334689 #10853] ERROR -- Gkv: No such file or directory - log_file=/home/masa/ywesee/de.oddb.org/log/import_gkv
Experiment
def download_latest(url, opts={}, &block) p "getin download_latest" p "1" opts = {:date => Date.today}.merge(opts) p "2" file = File.basename(url) p "3" pdf_dir = File.join(ODDB.config.var, 'pdf/gkv') p "4" FileUtils.mkdir_p(pdf_dir) p "5" dest = File.join(pdf_dir, file) p "6" archive = File.join(ODDB.config.var, 'pdf/gkv', sprintf("%s-%s", opts[:date].strftime("%Y.%m.%d"), file)) p "7" content = open(url).read p "8" if(!File.exist?(dest) || content.size != File.size(dest)) p "9" open(archive, 'w') { |local| local << content } p "10" open(archive, 'r', &block) p "11" open(dest, 'w') { |local| local << content } p "12" end rescue StandardError => error p "13" ODDB.logger.error('Gkv') { error.message } end
Run import_gkv
masa@masa ~/ywesee/de.oddb.org $ /home/masa/ywesee/de.oddb.org/jobs/import_gkv log_file=/home/masa/ywesee/de.oddb.org/log/import_gkv
Result
"getin download_latest" "1" "2" "3" "4" "5" "6" "7" "13"
Notes
content = open(url).read
Experiment
def download_latest(url, opts={}, &block) opts = {:date => Date.today}.merge(opts) file = File.basename(url) pdf_dir = File.join(ODDB.config.var, 'pdf/gkv') FileUtils.mkdir_p(pdf_dir) dest = File.join(pdf_dir, file) archive = File.join(ODDB.config.var, 'pdf/gkv', sprintf("%s-%s", opts[:date].strftime("%Y.%m.%d"), file)) print "url=", url, "\n" content = open(url).read if(!File.exist?(dest) || content.size != File.size(dest)) open(archive, 'w') { |local| local << content } open(archive, 'r', &block) open(dest, 'w') { |local| local << content } end rescue StandardError => error print "error=",error,"\n" ODDB.logger.error('Gkv') { error.message } end
Result
url=log_file=/home/masa/ywesee/de.oddb.org/log/import_gkv error=No such file or directory - log_file=/home/masa/ywesee/de.oddb.org/log/import_gkv
Notes
Experiment
def Updater.import_gkv(opts = {}) STDOUT.print "getin Updater.import_gkv\n" importer = Import::Gkv.new print "opts=" p opts if url = opts[:pdf] || importer.latest_url(Mechanize.new, opts) print "url=",url,"\n"
Result
getin Updater.import_gkv opts={:pdf=>"log_file=/home/masa/ywesee/de.oddb.org/log/import_gkv"} url=log_file=/home/masa/ywesee/de.oddb.org/log/import_gkv
Conclusion
Commit Turn back to the origial because import_gkv does not run automatically
Notes