20150126-oddb2xml-no-veterinary

Summary

  • oddb2xml should not produce article/products for veterinary use
  • finish gtin2atc

Commits

Index

Keep in Mind for work to do
  • Fix dojo error http://www.sitepen.com/blog/2012/10/31/debugging-dojo-common-error-messages/#forgot-dom-ready
  • On May 27 I removed the tests for fix_registrations, fix_sequences, fix_compositions and fix_packages from test/test_plugin/swissmedic.rb, as I could not find any references to them in the source code. Did I erroneously remove stuff when cleaning up the swissmedic import earlier?
  • The whole test for older/newer Packages must be adapted to xlsx. One must compare the rows (e.g. by creating csv files) and do the same stuff in xlsx!
  • Create gem: task: input=file with EAN codes, standard output shows EAN codes + ATC code. Source is Swissmedic Packungen.xlsx or XML.
  • Display 10 recalls not only those from this month
  • Import via data/medreg_companies.yaml

---

finish gtin2atc

We must finish the work, e.g. parsing the input to read the list of GTINs. Parsing Swissmedic_package.xlsx takes several minutes.

After discussing with Zeno we decided to offer a switch '--compare' which downloads and compares the three files (BAG, SwissIndex and RefData) and outputs a list of all differences. This switch overrides all others. To fetch the ATC codes for one or several GTINs, specify them as parameters; the tool then gets the data from SwissIndex only. This takes only about 8 seconds for the first run and 3 seconds for subsequent runs (within the next 24 hours).

Okay. Looks good to me now. E.g.

bundle exec bin/gtin2atc --compare 2>&1 | tee compare.log
rm -f /opt/src/gtin2atc/XMLPublications.zip
Opened /opt/src/gtin2atc/log.log
2015-01-26 14:47:08 +0100: Resumen:
  Found infos about 21577  entries
  BAG 9212 entries. 8 entries had not GTIN field.. Fetched from http://bag.e-mediat.net/SL2007.Web.External/File.axd?file=XMLPublications.zip
  SwissIndex 15775 entries. Fetched from https://index.ws.e-mediat.net/Swissindex/Pharma/ws_Pharma_V101.asmx?WSDL
  SwissMedic 18870 entries. Fetched from http://www.swissmedic.ch/arzneimittel/00156/00221/00222/00230/index.html?lang=de
  Matching 8964 items.
  Not in BAG 4590
  Not in SwissIndex 5677
  Not in Packungen 2346
  ATC-Codes differ 0
2015-01-26 14:47:08 +0100: 32 done. Took 46 seconds

or looking for the ATC code of two GTINs

bundle exec bin/gtin2atc 7680147690482 7680353660163; cat gtin2atc.csv 
gtin,ATC,pharmacode
7680147690482,N07BC02,41803
7680353660163,B03AE10,20273

Pushed commit "Fixed remaining problems".

Zeno wishes the following improvements:

  • Compare Swissmedic and SwissIndex, especially whether the ATC code has the same length
  • gtin2atc_packungen.csv should be renamed to gtin2atc_swissmedic.csv
  • all *.csv files should include the product name, too.
  • Must remove trailing linefeed when parsing input
  • Must accept pharmacode as input, too.
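
The two input-related wishes (stripping the trailing linefeed and accepting pharmacodes) could be handled roughly as sketched here. This is not the gtin2atc code; `parse_ids` is a hypothetical helper, and the rules "13 digits means GTIN" and "1 to 7 digits means pharmacode" are assumptions derived only from the examples on this page.

```ruby
# Hypothetical helper, not part of gtin2atc.
# Assumption: a GTIN is exactly 13 digits, a pharmacode is 1 to 7 digits.
def parse_ids(raw)
  result = { gtins: [], pharmacodes: [], rejected: [] }
  raw.each_line do |line|
    token = line.strip             # removes trailing "\n" or "\r\n" and blanks
    next if token.empty?           # ignore blank lines
    if token =~ /\A\d{13}\z/
      result[:gtins] << token
    elsif token =~ /\A\d{1,7}\z/
      result[:pharmacodes] << token
    else
      result[:rejected] << token   # keep what we could not classify
    end
  end
  result
end

ids = parse_ids("7680147690482\n41803\r\n\nfoo\n")
puts ids.inspect
```

Keeping the rejected tokens lets the caller report bad input instead of silently dropping it.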

See commits

Running gtin2atc --compare now produces

Result of verifing data from BAG (SL):
  BAG-data fetched from http://bag.e-mediat.net/SL2007.Web.External/File.axd?file=XMLPublications.zip.
  BAG had 9212 entries
  8 entries had no GTIN field
  Not in SwissMedic 486
  Not in SwissIndex 248
  Comparing ATC-Codes between BAG and Swissmedic
      8123 items had the same ATC code in BAG, SwissIndex and SwissMedic
       830 are the same in SwissMedic and BAG
       204 are different in SwissMedic and BAG
       265 are shorter in SwissMedic than in BAG
        11 are longer in SwissMedic than in BAG
  Comparing ATC-Codes between BAG and Swissindex
      8123 items had the same ATC code in BAG, SwissIndex and SwissMedic
       830 are the same in SwissIndex and BAG
         0 are different in SwissMedic and BAG
         0 are shorter in SwissIndex than in BAG
        11 are longer in SwissIndex than in BAG
Result of verifing data from swissmedic:
  SwissMedic had 18870 entries. Fetched from http://www.swissmedic.ch/arzneimittel/00156/00221/00222/00230/index.html?lang=de
  SwissIndex 15775 entries. Fetched from https://index.ws.e-mediat.net/Swissindex/Pharma/ws_Pharma_V101.asmx?WSDL
  BAG 9212 entries. 8 entries had no GTIN field. Fetched from http://bag.e-mediat.net/SL2007.Web.External/File.axd?file=XMLPublications.zip
  Matching 8603 items.
  Not in BAG 4590
  Not in SwissIndex 5677
  Comparing ATC-Codes between Swissmedic and Swissindex
      3334 match
       158 are different
      3334 are the same in SwissIndex and SwissMedic
        66 are shorter in SwissIndex
      1032 are longer in SwissIndex
Comparing all GTIN-codes:
  Found infos about 21577 entries
  BAG 9212 entries. 8 entries had no GTIN field. Fetched from http://bag.e-mediat.net/SL2007.Web.External/File.axd?file=XMLPublications.zip
  SwissIndex 15775 entries. Fetched from https://index.ws.e-mediat.net/Swissindex/Pharma/ws_Pharma_V101.asmx?WSDL
  SwissMedic 18870 entries. Fetched from http://www.swissmedic.ch/arzneimittel/00156/00221/00222/00230/index.html?lang=de
    8592 items had the same ATC code in BAG, SwissIndex and SwissMedic
    4590 not in BAG
    5677 not in SwissIndex
    2707 not in SwissMedic
      11 ATC-Codes differed
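
The same/different/shorter/longer buckets in the report above could be produced by a classifier along these lines. This is only a sketch: `compare_atc` is a hypothetical name, the buckets are assumed to be mutually exclusive, and "shorter"/"longer" are assumed to refer to string length alone; the page does not say how gtin2atc actually decides. The truncated codes in the sample pairs are made up for illustration.

```ruby
# Hypothetical classifier, not the gtin2atc implementation.
# Assumption: buckets are exclusive and based on length only.
def compare_atc(left, right)
  return :same    if left == right
  return :shorter if left.length < right.length
  return :longer  if left.length > right.length
  :different      # same length, different content
end

# Illustrative pairs (left = one source, right = the other).
pairs = [
  %w[N07BC02 N07BC02],
  %w[N07BC   N07BC02],
  %w[B03AE10 B03AE],
  %w[N07BC02 B03AE10],
]

counts = Hash.new(0)
pairs.each { |left, right| counts[compare_atc(left, right)] += 1 }
counts.each { |kind, n| puts format('%6d are %s', n, kind) }
```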

oddb2xml should not produce article/products for veterinary use

Zeno remarked that in https://srv.elexis.info/jenkins/view/Artikelstamm/job/Artikelstamm%20Full%20Build/1/console we found entries like

7680559950068:Removing VetMed Article with ATC QP53AC11

This should not be necessary.

The relevant parts of the Jenkins CI build are

rvm use ruby-1.9.3-p448
gem update oddb2xml
/usr/local/bin/oddb2xml -e
JAVA="/usr/bin/java"
$JAVA -jar ConvertOddb2XmlToArtikelstamm.jar --oddb2xmlArticleFile oddb_article.xml --oddb2xmlLimitationFile oddb_limitation.xml --oddb2xmlProductFile oddb_product.xml
/usr/bin/xmllint --noout --schema /opt/artikelstamm/Elexis_Artikelstamm_v002.xsd artikelstamm_*.xml

Creating a unit-test for 7680559950068. The relevant part in SwissIndex_Pharma_DE.xml is

      <ITEM DT="2014-10-17T00:00:00">
        <GTIN>7680559950068</GTIN>
        <PHAR>2930393</PHAR>
        <STATUS>A</STATUS>
        <STDATE>2005-02-02T00:00:00</STDATE>
        <LANG>DE</LANG>
        <DSCR>SCALIBOR Protectorband 65 cm gross f Hunde</DSCR>
        <ADDSCR>1 Stk</ADDSCR>
        <COMP>
          <NAME>MSD Animal Health GmbH</NAME>
          <GLN>7601001053854</GLN>
        </COMP>
      </ITEM>

I don't know how we could determine that this is for veterinary use. Or should the absence of 7680559950068 in Publications.xls trigger this action? When building article.xml I have the following info at hand (from pry).

Skipping vet ?? 7680559950068 {:refdata=>true, :_type=>:pharma, :ean=>"7680559950068", :pharmacode=>"2930393", :stat_date=>"", :lang=>"DE", :desc=>"SCALIBOR Protectorband 65 cm gross f Hunde", :atc_code=>"", :additional_desc=>"1 Stk", :company_name=>"MSD Animal Health GmbH", :company_ean=>"7601001053854"}

Or should I exclude all articles from the company 'MSD Animal Health GmbH'?

Discussed the problem with Marco Descher. He treats every ATC code in group 'Q' as veterinary.
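
A filter based on the ATC group could look like the sketch below (ATCvet codes start with 'Q'). `veterinary?` is a hypothetical predicate, not an oddb2xml method, and the item hashes only mimic the keys seen in the pry dump above. Note that the pry record has an empty :atc_code, so this check would only catch 7680559950068 if the ATC code (QP53AC11 in the Jenkins log) is first merged in from another source.

```ruby
# Hypothetical predicate, not part of oddb2xml.
# Assumption: items are hashes with an :atc_code key, as in the pry dump.
def veterinary?(item)
  item[:atc_code].to_s.upcase.start_with?('Q')
end

items = [
  { ean: '7680559950068', atc_code: 'QP53AC11' }, # ATC as seen in the Jenkins log
  { ean: '7680147690482', atc_code: 'N07BC02' },
  { ean: '7680559950068', atc_code: '' },         # as seen in the pry dump
]

human = items.reject { |item| veterinary?(item) }
puts human.map { |item| item[:ean] }.inspect
```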

Side note:

His biggest problem with oddb2xml is the 2316 "Invalid number" entries, e.g. 7680628610022: Invalid number string 6 x 2.5 ml in Product/PackGrSwissmedic, where he would like a correct description of the package size/unit. I created a small Ruby helper, Attach:oddb2xml_invalid_number.rb, and used it to build a unique list of the patterns in use; see Attach:oddb2xml_invalid_number_log.txt. Most of them would be easy to parse, but a few will probably be impossible to convert, as they don't fit the quantity/size pattern, e.g. 84 (4 x 21) or 84+88.
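
For the easy cases a parser might look like this minimal sketch. It is not the attached helper; `PACK_SIZE` and `parse_pack_size` are hypothetical names, and the pattern "optional count x size, optional unit" is an assumption based only on the examples quoted here. Strings that do not fit return nil so that they can be logged and handled manually.

```ruby
# Hypothetical parser, unrelated to Attach:oddb2xml_invalid_number.rb.
# Assumed pattern: [count x] size [unit], e.g. "6 x 2.5 ml" or "1 Stk".
PACK_SIZE = /\A\s*(?:(\d+)\s*x\s*)?(\d+(?:\.\d+)?)\s*([[:alpha:]]*)\s*\z/i

def parse_pack_size(text)
  m = PACK_SIZE.match(text)
  return nil unless m              # does not fit the quantity/size pattern
  { count: (m[1] || 1).to_i, size: m[2].to_f, unit: m[3] }
end

['6 x 2.5 ml', '1 Stk', '84 (4 x 21)', '84+88'].each do |text|
  puts "#{text.inspect} => #{parse_pack_size(text).inspect}"
end
```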

Page last modified on January 26, 2015, at 06:11 PM