
20120604-timer-update-job





Updater/Exporter job of de.oddb.org

de.oddb.org runs its Updater/Exporter jobs without crontab.

  • Exporter.run
  • Updater.run
NOTE
  • de.oddb.org has no process other than the main server (e.g. a crawler).
  • ch.oddb.org has many update/export jobs that take more than 5 hours.
  • If several processes run jobs at the same time, we might need to check for duplicate processes to prevent data corruption.
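
One possible way to do the duplicate-process check from the last note is an exclusive, non-blocking file lock. This is only a sketch: with_job_lock and the lock file path are hypothetical names that do not exist in oddb.org, and it assumes all jobs run on the same host and share the lock file.

```ruby
require 'tmpdir'

# Run the block only when no other holder has the lock file locked.
# Returns the block's value, or nil when the lock is already taken.
def with_job_lock(lock_path)
  File.open(lock_path, File::RDWR | File::CREAT, 0644) do |file|
    # LOCK_NB makes flock return false at once instead of waiting.
    return nil unless file.flock(File::LOCK_EX | File::LOCK_NB)
    begin
      yield
    ensure
      file.flock(File::LOCK_UN)
    end
  end
end

# Hypothetical lock file location, for illustration only.
lock = File.join(Dir.tmpdir, 'oddb_update_example.lock')

first = with_job_lock(lock) do
  # A second attempt while the lock is held is refused.
  second = with_job_lock(lock) { :second_job_ran }
  second.nil? ? :second_job_skipped : second
end
```

The operating system releases the lock when the holding process exits, so a crashed job does not leave a stale lock behind.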

in lib/oddb/util/server.rb

...
     def initialize(*args)
       super
       @rss_mutex = Mutex.new
       run_exporter if(ODDB.config.run_exporter)
       run_updater if(ODDB.config.run_updater)
     end
...
     def run_at(hour, &block)
       Thread.new {
         loop {
           now = Time.now
           # today at hour:00 local time
           run_at = Time.local(now.year, now.month, now.day, hour)
           # already passed: move to the same hour on the next day
           while(now > run_at)
             run_at += 24*60*60
           end
           sleep(run_at - now)
           block.call
         }
       }
     end
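
The time calculation inside run_at can be tried on its own, without the thread and the sleep. The sketch below copies that logic into a helper; next_run_time is a hypothetical name, not part of oddb.org, and the example dates are arbitrary.

```ruby
# Return the next Time at which the local clock shows hour:00,
# using the same rule as run_at: start from today and add
# 24 hours while that moment is already in the past.
def next_run_time(now, hour)
  run_at = Time.local(now.year, now.month, now.day, hour)
  run_at += 24 * 60 * 60 while now > run_at
  run_at
end

now = Time.local(2012, 6, 4, 10, 30)
passed = next_run_time(now, 9)   # 09:00 is over, so the next day
ahead  = next_run_time(now, 23)  # 23:00 is still to come today
```

Adding a fixed 24*60*60 seconds ignores daylight saving changes, so on a switch-over day the job would start one hour off.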

Updater.run and Exporter.run use IO.popen.
IO.popen runs the command in a sub-process (child process), and the block form waits until that process exits.

in lib/oddb/util/updater.rb

      def Updater.run_logged_job job 
        dir = ODDB.config.oddb_dir
        cmd = File.join dir, 'jobs', job 
        log = File.join dir, 'log', job 
        IO.popen "#{cmd} log_file=#{log}" do |io|
          # wait for importer to exit
        end 
      end
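
To see how the block form of IO.popen waits for the child process, a small stand-alone sketch may help. run_job is a hypothetical helper and not the oddb.org code; it starts a throwaway Ruby child instead of a real job script, and it also reads the exit status, which Updater.run_logged_job does not do.

```ruby
require 'rbconfig'

# Run a command in a child process, collect its output (stdout and
# stderr together) and report whether it exited successfully.
def run_job(*cmd)
  output = IO.popen(cmd, err: [:child, :out]) { |io| io.read }
  # After the block form of IO.popen returns, $? holds the child's status.
  [output, $?.success?]
end

ruby = RbConfig.ruby  # path of the running Ruby interpreter
out, ok = run_job(ruby, '-e', 'puts "job done"')
_, failed_ok = run_job(ruby, '-e', 'exit 3')
```

Passing the command as an array avoids the shell, so arguments are not re-parsed by it.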



Random Updater of ch.oddb.org

update_fachinfo is called via Updater.run_random.
It behaves like the updater started from bin/admin.

in src/util/oddbapp.rb

    def run_random_updater
      Thread.new {
        Thread.current.abort_on_exception = true 
        update_hour = rand(24)
        update_min = rand(60)
        day = (update_hour > Time.now.hour) ? \
          today : today.next
        loop {
          next_run = Time.local(day.year, day.month, day.day,
            update_hour, update_min)
          puts "next random update will take place at #{next_run}"
          $stdout.flush
          sleep(next_run - Time.now)
          Updater.new(self).run_random
          @system.recount
          GC.start
          day = today.next
          update_hour = rand(24)
          update_min = rand(60)
        }    
      }    
    end

Then, as an experiment, I created an ontime updater method.
This run_ontime_updater calls only Updater.new(self).run (the daily updater).

in src/util/oddbapp.rb

    def run_ontime_updater
      Thread.new {
        Thread.current.abort_on_exception = true 
        hour = ODDB.config.update_hour
        loop {
          now = Time.now
          run_at = Time.local(now.year, now.month, now.day, hour)
          run_at += 24*60*60 if(now > run_at)
          puts "\nNext ontime update will take place at #{run_at}"
          $stdout.flush
          sleep(run_at - now) 
          Updater.new(self).run
          @system.recount
          GC.start
        }    
      }    
    end 

I will test it on my local machine.
It takes 25-30 hours (from Japan).


Debug updater by cron

The updater above works fine, but we need the updater job to run in a new process.
So I debugged the cron job once again.

I found a problem in the recent debug logs of import_daily run via cron.
It seems that import_daily stops during parsing because of some error.

I suspect that the problem is the environment,
because the apache user does not have a shell or a LANG variable.
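
If the missing LANG variable is indeed the cause, one way to test the idea is to hand the job an explicit environment when starting it. The sketch below is an assumption and not the current oddb.org code: run_with_env and JOB_ENV are hypothetical names, the locale value is only an example, and the child is a throwaway Ruby process instead of import_daily.

```ruby
require 'rbconfig'

# Environment the job should see, no matter which user starts it.
# The locale value is an assumption; use one installed on the host.
JOB_ENV = { 'LANG' => 'en_US.UTF-8' }

# Start cmd in a child process with env merged into its environment
# and return everything the child writes to stdout.
def run_with_env(env, *cmd)
  IO.popen(env, cmd) { |io| io.read }
end

seen = run_with_env(JOB_ENV, RbConfig.ruby, '-e', 'print ENV["LANG"]')
```

In a crontab the same effect can be had by putting a LANG=... line above the job entry.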

The debug logs are:

no-change
2012-06-03 09:01:21 CEST getin update_bsv
2012-06-03 09:01:29 CEST getin BsvXmlPlugin.update
2012-06-03 09:01:29 CEST target_url = http://bag.e-mediat.net/SL2007.Web.External/File.axd?file=XMLPublications.zip
2012-06-03 09:01:29 CEST save_dir   = /var/www/oddb.org/data/xml
2012-06-03 09:01:29 CEST getin download_file
2012-06-03 09:01:32 CEST save_file   = /var/www/oddb.org/data/xml/XMLPublications-2012.06.03.zip
2012-06-03 09:01:32 CEST latest_file = /var/www/oddb.org/data/xml/XMLPublications-latest.zip
2012-06-03 09:01:32 CEST File.exists?(/var/www/oddb.org/data/xml/XMLPublications-latest.zip) = true
2012-06-03 09:01:32 CEST FileUtils.compare_file(/tmp/foo20120603-29047-10ni8hd, /var/www/oddb.org/data/xml/XMLPublications-latest.zip) = true
2012-06-03 09:01:32 CEST path = nil
2012-06-03 09:01:32 CEST return_value_BsvXmlPlugin.update = nil
2012-06-03 09:01:32 CEST return_value_update_bsv=nil
error
2012-06-01 09:01:33 CEST getin update_bsv
2012-06-01 09:01:41 CEST getin BsvXmlPlugin.update
2012-06-01 09:01:41 CEST target_url = http://bag.e-mediat.net/SL2007.Web.External/File.axd?file=XMLPublications.zip
2012-06-01 09:01:41 CEST save_dir   = /var/www/oddb.org/data/xml
2012-06-01 09:01:41 CEST getin download_file
2012-06-01 09:01:45 CEST save_file   = /var/www/oddb.org/data/xml/XMLPublications-2012.06.01.zip
2012-06-01 09:01:45 CEST latest_file = /var/www/oddb.org/data/xml/XMLPublications-latest.zip
2012-06-01 09:01:45 CEST File.exists?(/var/www/oddb.org/data/xml/XMLPublications-latest.zip) = true
2012-06-01 09:01:45 CEST FileUtils.compare_file(/tmp/foo20120601-24706-ahx79w, /var/www/oddb.org/data/xml/XMLPublications-latest.zip) = false
2012-06-01 09:01:45 CEST path = "/var/www/oddb.org/data/xml/XMLPublications-2012.06.01.zip"
2012-06-01 09:01:45 CEST entry.name = GL_Diff_SB.xml
2012-06-01 09:01:45 CEST entry.name = Gestrichene_Packungen_Emballages_radies.xls
2012-06-01 09:01:45 CEST entry.name = ItCodes.xml
2012-06-01 09:04:07 CEST entry.name = PR120601.txt
2012-06-01 09:04:07 CEST entry.name = Preparations.xml
2012-06-01 09:40:21 CEST getin Log.notify (SL-Update)
...
2012-06-01 09:40:21 CEST return_value_update_bsv=nil

To be continued tomorrow.

Page last modified on June 05, 2012, at 10:14 AM