
20120409-update-import-gkv-de-oddb-org-test-suite-ch-oddb-org



Summary

de.oddb.org
  • debugged the import_gkv job
    • updated SSL verification for the HTTP request used for downloading
    • checked with an extracted PDF
ch.oddb.org
  • checked duplicate pharmacodes
    • ran Updater.new(self).export_oddb_csv
  • updated test/suite.rb
    • Fixed syntax errors and encoding problems
    • removed old unused test-case
    • tested with ruby 1.9.3

Commit

test
other
de.oddb.org



Update import_gkv

SSL certificate error

See
E, [2012-04-09T09:25:20.880881 #5144] ERROR -- Gkv: certificate verify failed
Flow
  • job import_gkv (jobs/import_gkv)
    • Updater.import_gkv (lib/oddb/util/updater.rb)
      • Import::Gkv (/lib/oddb/import/gkv.rb)
        • Import::GkvHandler < Rpdf2text::SimpleHandler
          • download_latest
          • import

in /lib/oddb/import/gkv.rb

...
  def download_latest(url, opts={}, &block)
    opts = {:date => Date.today}.merge(opts)
    file = File.basename(url)
    pdf_dir = File.join(ODDB.config.var, 'pdf/gkv')
    FileUtils.mkdir_p(pdf_dir)
    dest = File.join(pdf_dir, file)
    archive = File.join(ODDB.config.var, 'pdf/gkv',
                sprintf("%s-%s", opts[:date].strftime("%Y.%m.%d"), file))
    content = open(url).read
    if(!File.exist?(dest) || content.size != File.size(dest))
      open(archive, 'w') { |local|
        local << content
      }   
      open(archive, 'r', &block)
      open(dest, 'w') { |local|
        local << content
      }   
    end 
  rescue StandardError => error
    ODDB.logger.error('Gkv') { error.message }
  end 
  def latest_url agent, opts={}
    host = 'https://www.gkv-spitzenverband.de'
    url = '/Befreiungsliste_Arzneimittel_Versicherte.gkvnet'
    page = agent.get host + url 
    file_base_name = "Zuzahlungsbefreit"
...
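
The page does not show how SSL verification was changed, so the following is only a sketch of one way to do it, assuming the plain `open(url)` call in `download_latest` goes through open-uri. `ssl_options` and `fetch` are hypothetical helpers, not part of lib/oddb/import/gkv.rb. The open-uri options `:ssl_verify_mode` and `:ssl_ca_cert` do exist. Pointing at a CA bundle keeps verification on, while `VERIFY_NONE` turns it off and should be treated as a workaround.

```ruby
require 'open-uri'
require 'openssl'

# Hypothetical helper (not from the oddb source): builds the option hash
# that open-uri accepts for HTTPS requests.
def ssl_options(ca_file = nil)
  if ca_file
    # Preferred: keep verification on and point open-uri at a CA bundle.
    { :ssl_ca_cert => ca_file, :ssl_verify_mode => OpenSSL::SSL::VERIFY_PEER }
  else
    # Workaround only: disables certificate checks entirely.
    { :ssl_verify_mode => OpenSSL::SSL::VERIFY_NONE }
  end
end

# Hypothetical download wrapper. On Ruby 1.9 with open-uri loaded, the call
# would be open(url, ssl_options(ca_file)) instead of URI.open.
def fetch(url, ca_file = nil)
  URI.open(url, ssl_options(ca_file)).read
end
```

With such a helper, `content = open(url).read` in `download_latest` would become something like `content = fetch(url)`.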

Debug import_gkv

Extract a few pages from the downloaded PDF

function pdfpextr()
{
    # this function uses 3 arguments:
    #     $1 is the first page of the range to extract
    #     $2 is the last page of the range to extract
    #     $3 is the input file
    #     output file will be named "inputfile_pXX-pYY.pdf"
    # arguments are quoted so that file names containing spaces still work
    gs -sDEVICE=pdfwrite -dNOPAUSE -dBATCH -dSAFER \
       -dFirstPage="${1}" \
       -dLastPage="${2}" \
       -sOutputFile="${3%.pdf}_p${1}-p${2}.pdf" \
       "${3}"
}
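
For use from a Ruby script, the same Ghostscript call could be assembled as an argument list. This is a sketch only: `pdf_extract_command` is a hypothetical name and the file name in the usage line is made up. It mirrors the flags and the output naming of the shell function above.

```ruby
# Hypothetical helper: builds the gs argument list used by pdfpextr above.
# Returning an array lets system(*cmd) run it without shell quoting issues.
def pdf_extract_command(first_page, last_page, input)
  # Same naming rule as the shell function: strip ".pdf", append the range.
  output = "#{input.sub(/\.pdf\z/, '')}_p#{first_page}-p#{last_page}.pdf"
  ['gs', '-sDEVICE=pdfwrite', '-dNOPAUSE', '-dBATCH', '-dSAFER',
   "-dFirstPage=#{first_page}",
   "-dLastPage=#{last_page}",
   "-sOutputFile=#{output}",
   input]
end

# Usage (requires Ghostscript to be installed; the file name is an example):
#   system(*pdf_extract_command(1, 3, 'Zuzahlungsbefreit.pdf'))
```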

Commented out the creation of the archive file
in lib/oddb/import/gkv.rb

def download_latest(url, opts={}, &block)
...
    if(!File.exist?(dest) || content.size != File.size(dest))
      #open(archive, 'w') { |local|
      #  local << content
      #}  
      open(archive, 'r', &block)
      open(dest, 'w') { |local|
        local << content
      }   
    end 
...
end

It works.

Result
Mon Apr  9 09:55:04 2012: de.oddb.org ODDB::Import::Gkv#import
Imported     0 Zubef-Entries on 09.04.2012:
Visited      0 existing Zubef-Entries
Visited      0 existing Companies
Visited      0 existing Substances
Created      0 new Zubef-Entries
Created      0 new Products
Created      0 new Sequences
Created      0 new Companies
Created      0 new Substances
Assigned     0 Chemical Equivalences
Assigned     0 Companies
Created      0 Incomplete Packages:
Created      1 Product(s) without a name (missing product name):
http://de.oddb.org/de/drugs/product/uid/3480899

Refs


Duplicated pharmacode in csv

ch.ODDB.org Report - CSV-Export includes 4 duplicates - 04/2012

CSV-Export includes 4 duplicates:
28557027
51830041
51830042
60922001

I ran exporter_job with the newest data from production
and counted the duplicate pharmacodes.

No.     EAN             Pharmacode
1       7680285570271   341988
2       7680285570271   341988
3       7680006790117   754839
4       7680401140173   754839
5       7680302330185   1552262
6       7680571970020   1552262
7       7680536080191   1699717
8       7680580660097   1699717
9       7680003730017   1877047
10      7680003730031   1877047
11      7680158120602   2520459
12      7680158120794   2520459
13      7680006280021   2837535
14      7680006280038   2837535
15      7680576190027   3477837
16      7680576190034   3477837
17      7680524760210   3665651
18      7680524760388   3665651
19      7680524760319   3665668
20      7680524760463   3665668
21      7680577350017   3748199
22      7680577350031   3748199
23      7680609220011   4980952
24      7680609220011   4980952
25      7680518300415   5130612
26      7680518300415   5130612
27      7680518300422   5130629
28      7680518300422   5130629
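
In the table, four pairs repeat the same EAN (rows 1-2, 23-24, 25-26, 27-28), and these appear to be the four codes named in the report. The other pairs are different EANs that share one pharmacode. The counting could be sketched as below. The column names `ean13` and `pharmacode` and the `;` separator are assumptions, not taken from the real oddb.csv layout. The sample rows are a subset of the table above.

```ruby
require 'csv'

# Subset of the table above; header names and separator are assumed.
SAMPLE = <<~DATA
  ean13;pharmacode
  7680285570271;341988
  7680285570271;341988
  7680006790117;754839
  7680401140173;754839
  7680003730017;1877047
DATA

# Returns { pharmacode => [ean, ...] } for every pharmacode that occurs
# on more than one row.
def duplicate_pharmacodes(csv_text)
  rows = CSV.parse(csv_text, :headers => true, :col_sep => ';')
  rows.group_by { |row| row['pharmacode'] }
      .select { |_code, group| group.size > 1 }
      .map { |code, group| [code, group.map { |row| row['ean13'] }] }
      .to_h
end

# True when all rows sharing a pharmacode also share the same EAN,
# i.e. the row itself was exported twice.
def same_ean?(eans)
  eans.uniq.size == 1
end

duplicate_pharmacodes(SAMPLE).each do |code, eans|
  kind = same_ean?(eans) ? 'repeated row' : 'shared by different EANs'
  puts "#{code}: #{eans.join(', ')} (#{kind})"
end
```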

Suspended.

Refs


Check duplicate pharmacode in oddb.csv


test/suite.rb for ch.oddb.org

Warning output of test/suite.rb, captured with:
$ ruby test/suite.rb 2>test.txt

Attach:test-20120409.txt

1. readonlyd does not work.

2. config file loading error.

3. old plugins

  • removed test for MigelPlugin

4. namespace

  • updated namespace

5. migel model

  • commented out migel model (moved to other process)

6. odba stub update

  • update odba stub classes in test

7. encoding

  • converted encoding of some files to utf-8
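
The page does not say which encoding the files had before the conversion. As a sketch, assuming ISO-8859-1, the conversion of a string read from such a file could look like this (`latin1_to_utf8` is a hypothetical helper). Ruby 1.9 also needs a magic comment such as `# encoding: utf-8` at the top of any source file that contains non-ASCII literals.

```ruby
# encoding: utf-8

# Hypothetical helper: reinterprets the raw bytes as ISO-8859-1 (an
# assumption about the original files) and transcodes them to UTF-8.
def latin1_to_utf8(bytes)
  bytes.dup.force_encoding('ISO-8859-1').encode('UTF-8')
end

legacy = "M\xFCller"            # 0xFC is "ü" in ISO-8859-1
converted = latin1_to_utf8(legacy)
puts converted.encoding         # UTF-8
puts converted.valid_encoding?  # true
```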

Results

$ ruby test/test_util/suite.rb

Finished in 6.351517521 seconds.

515 tests, 670 assertions, 10 failures, 33 errors, 0 pendings, 0 omissions, 3 notifications
92.0388% passed

81.08 tests/s, 105.49 assertions/s

$ ruby test/test_command/suite.rb

Finished in 0.003894709 seconds.

3 tests, 2 assertions, 0 failures, 1 errors, 0 pendings, 0 omissions, 0 notifications
66.6667% passed

770.28 tests/s, 513.52 assertions/s

$ ruby test/test_custom/suite.rb

Finished in 0.960992447 seconds.

43 tests, 43 assertions, 9 failures, 0 errors, 0 pendings, 0 omissions, 0 notifications
79.0698% passed

44.75 tests/s, 44.75 assertions/s

$ ruby test/test_model/suite.rb

Finished in 1.159574164 seconds.

661 tests, 1196 assertions, 3 failures, 87 errors, 0 pendings, 0 omissions, 11 notifications
86.3843% passed

570.04 tests/s, 1031.41 assertions/s

$ ruby test/test_plugin/suite.rb

Finished in 1.925475044 seconds.

93 tests, 159 assertions, 1 failures, 16 errors, 0 pendings, 0 omissions, 1 notifications
82.7957% passed

48.30 tests/s, 82.58 assertions/s

$ ruby test/test_remote/suite.rb

Finished in 0.004551177 seconds.

6 tests, 6 assertions, 0 failures, 0 errors, 0 pendings, 0 omissions, 0 notifications
100% passed

1318.34 tests/s, 1318.34 assertions/s

$ ruby test/test_state/suite.rb

Finished in 2.148220868 seconds.

785 tests, 828 assertions, 16 failures, 54 errors, 0 pendings, 0 omissions, 1 notifications
91.0828% passed

365.42 tests/s, 385.44 assertions/s

$ ruby test/test_view/suite.rb

Finished in 4.654328211 seconds.

906 tests, 786 assertions, 32 failures, 210 errors, 0 pendings, 0 omissions, 2 notifications
73.2892% passed

194.66 tests/s, 168.88 assertions/s

$ ruby test/suite.rb

Loaded suite
 /var/www/oddb.org/test/test_util/suite,
 /var/www/oddb.org/test/test_model/suite,
 /var/www/oddb.org/test/test_plugin/suite,
 /var/www/oddb.org/test/test_state/suite,
 /var/www/oddb.org/test/test_view/suite,
 /var/www/oddb.org/test/../ext/suite,
 /var/www/oddb.org/test/test_custom/suite,
 /var/www/oddb.org/test/test_command/suite,
 /var/www/oddb.org/test/test_remote/suite

Finished in 33.172388749 seconds.


[]
3387 tests, 4711 assertions, 76 failures, 423 errors

Required DRb servers

  • oddbd
  • readonlyd
  • fiparsed
  • swissindex_pharma
  • swissindex_nonpharma
  • exportd
Page last modified on April 09, 2012, at 06:45 PM