---
We must finish the work, e.g. parsing input for reading the list of GTINs. Parsing the Swissmedic_package.xlsx takes several minutes.
After discussing with Zeno, we decided to offer a switch '--compare' which downloads and compares the three files (BAG, SwissIndex and RefData) and outputs a list of all differences. This switch overrides all others. To fetch the ATC codes for one or several GTINs, you may specify them as parameters and it will just get the data from SwissIndex. This takes only about 8 seconds for the first run and 3 seconds for subsequent runs (within the next 24 hours).
Okay. Looks good to me now. E.g.
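The faster subsequent runs suggest that the downloaded data is reused for a day. A minimal sketch of such a freshness check follows. The helper name `fresh_download?` and the use of the file modification time are assumptions for illustration, not necessarily how gtin2atc does it.

```ruby
require 'tempfile'

ONE_DAY = 24 * 60 * 60 # seconds

# Hypothetical helper: true if the file exists and was modified
# less than max_age seconds before `now`.
def fresh_download?(path, max_age = ONE_DAY, now = Time.now)
  File.exist?(path) && (now - File.mtime(path)) < max_age
end

# Demonstration with a temporary file standing in for a downloaded file.
Tempfile.create('swissindex') do |file|
  puts fresh_download?(file.path)                                  # file was just created
  puts fresh_download?(file.path, ONE_DAY, Time.now + 2 * ONE_DAY) # simulated two days later
end
```

A download step would then only be triggered when the check returns false.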
bundle exec bin/gtin2atc --compare 2>&1 | tee compare.log
rm -f /opt/src/gtin2atc/XMLPublications.zip
Opened /opt/src/gtin2atc/log.log
2015-01-26 14:47:08 +0100: Resumen:
Found infos about 21577 entries
BAG 9212 entries. 8 entries had not GTIN field.. Fetched from http://bag.e-mediat.net/SL2007.Web.External/File.axd?file=XMLPublications.zip
SwissIndex 15775 entries. Fetched from https://index.ws.e-mediat.net/Swissindex/Pharma/ws_Pharma_V101.asmx?WSDL
SwissMedic 18870 entries. Fetched from http://www.swissmedic.ch/arzneimittel/00156/00221/00222/00230/index.html?lang=de
Matching 8964 items.
Not in BAG 4590
Not in SwissIndex 5677
Not in Packungen 2346
ATC-Codes differ 0
2015-01-26 14:47:08 +0100: 32 done. Took 46 seconds
or looking for the ATC code of two GTINs
bundle exec bin/gtin2atc 7680147690482 7680353660163; cat gtin2atc.csv
gtin,ATC,pharmacode
7680147690482,N07BC02,41803
7680353660163,B03AE10,20273
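The resulting CSV is easy to consume from other scripts. The following sketch builds a GTIN to ATC lookup from the content shown above. The helper name `atc_by_gtin` is made up for illustration, and the sample text is copied from the gtin2atc.csv output rather than read from disk.

```ruby
require 'csv'

# Sample content copied from the gtin2atc.csv output above.
CSV_TEXT = <<~CSV
  gtin,ATC,pharmacode
  7680147690482,N07BC02,41803
  7680353660163,B03AE10,20273
CSV

# Hypothetical helper: build a GTIN => ATC lookup from the CSV text.
def atc_by_gtin(csv_text)
  CSV.parse(csv_text, headers: true).each_with_object({}) do |row, lookup|
    lookup[row['gtin']] = row['ATC']
  end
end

lookup = atc_by_gtin(CSV_TEXT)
puts lookup['7680147690482']
```

To read the real file instead, `File.read('gtin2atc.csv')` could be passed to the helper.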
Pushed commit "Fixed remaining problems".
Zeno wishes the following improvements:
See commits
Running gtin2atc --compare now produces
Result of verifing data from BAG (SL):
  BAG-data fetched from http://bag.e-mediat.net/SL2007.Web.External/File.axd?file=XMLPublications.zip.
  BAG had 9212 entries
  8 entries had no GTIN field
  Not in SwissMedic 486
  Not in SwissIndex 248
  Comparing ATC-Codes between BAG and Swissmedic
      8123 items had the same ATC code in BAG, SwissIndex and SwissMedic
       830 are the same in SwissMedic and BAG
       204 are different in SwissMedic and BAG
       265 are shorter in SwissMedic than in BAG
        11 are longer in SwissMedic than in BAG
  Comparing ATC-Codes between BAG and Swissindex
      8123 items had the same ATC code in BAG, SwissIndex and SwissMedic
       830 are the same in SwissIndex and BAG
         0 are different in SwissMedic and BAG
         0 are shorter in SwissIndex than in BAG
        11 are longer in SwissIndex than in BAG
Result of verifing data from swissmedic:
  SwissMedic had 18870 entries. Fetched from http://www.swissmedic.ch/arzneimittel/00156/00221/00222/00230/index.html?lang=de
  SwissIndex 15775 entries. Fetched from https://index.ws.e-mediat.net/Swissindex/Pharma/ws_Pharma_V101.asmx?WSDL
  BAG 9212 entries. 8 entries had no GTIN field. Fetched from http://bag.e-mediat.net/SL2007.Web.External/File.axd?file=XMLPublications.zip
  Matching 8603 items.
  Not in BAG 4590
  Not in SwissIndex 5677
  Comparing ATC-Codes between Swissmedic and Swissindex
      3334 match
       158 are different
      3334 are the same in SwissIndex and SwissMedic
        66 are shorter in SwissIndex
      1032 are longer in SwissIndex
Comparing all GTIN-codes:
  Found infos about 21577 entries
  BAG 9212 entries. 8 entries had no GTIN field. Fetched from http://bag.e-mediat.net/SL2007.Web.External/File.axd?file=XMLPublications.zip
  SwissIndex 15775 entries. Fetched from https://index.ws.e-mediat.net/Swissindex/Pharma/ws_Pharma_V101.asmx?WSDL
  SwissMedic 18870 entries. Fetched from http://www.swissmedic.ch/arzneimittel/00156/00221/00222/00230/index.html?lang=de
    8592 items had the same ATC code in BAG, SwissIndex and SwissMedic
    4590 not in BAG
    5677 not in SwissIndex
    2707 not in SwissMedic
      11 ATC-Codes differed
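The categories same/different/shorter/longer in the report above can be sketched as follows. This is an illustration only: it assumes that "shorter" and "longer" mean that one ATC code is a proper prefix of the other, which the output does not state explicitly. The function names and the sample keys are hypothetical.

```ruby
# Classify how the ATC code of `other` relates to the one in `reference`.
# Assumption: "shorter"/"longer" mean one code is a proper prefix of the other.
def compare_atc(reference, other)
  return :missing if reference.to_s.empty? || other.to_s.empty?
  return :same if reference == other
  return :shorter if reference.start_with?(other) # other is a prefix of reference
  return :longer if other.start_with?(reference)  # reference is a prefix of other
  :different
end

# Tally the comparison results for all GTINs present in both sources.
def tally_atc(reference_by_gtin, other_by_gtin)
  counts = Hash.new(0)
  (reference_by_gtin.keys & other_by_gtin.keys).each do |gtin|
    counts[compare_atc(reference_by_gtin[gtin], other_by_gtin[gtin])] += 1
  end
  counts
end

# Illustrative data only; the keys are placeholders, not real GTINs.
bag        = { 'a' => 'N07BC02', 'b' => 'B03AE10', 'c' => 'A01AB' }
swissmedic = { 'a' => 'N07BC02', 'b' => 'B03AE',   'c' => 'C09AA' }
p tally_atc(bag, swissmedic)
```

With BAG as the reference, `:shorter` then corresponds to "shorter in SwissMedic than in BAG".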
Zeno remarked that in https://srv.elexis.info/jenkins/view/Artikelstamm/job/Artikelstamm%20Full%20Build/1/console we found entries like
7680559950068:Removing VetMed Article with ATC QP53AC11
This should not be necessary.
The relevant parts of the Jenkins CI build are
rvm use ruby-1.9.3-p448
gem update oddb2xml
/usr/local/bin/oddb2xml -e
JAVA="/usr/bin/java"
$JAVA -jar ConvertOddb2XmlToArtikelstamm.jar --oddb2xmlArticleFile oddb_article.xml --oddb2xmlLimitationFile oddb_limitation.xml --oddb2xmlProductFile oddb_product.xml
/usr/bin/xmllint --noout --schema /opt/artikelstamm/Elexis_Artikelstamm_v002.xsd artikelstamm_*.xml
Creating a unit-test for 7680559950068. The relevant part in SwissIndex_Pharma_DE.xml is
      <ITEM DT="2014-10-17T00:00:00">
        <GTIN>7680559950068</GTIN>
        <PHAR>2930393</PHAR>
        <STATUS>A</STATUS>
        <STDATE>2005-02-02T00:00:00</STDATE>
        <LANG>DE</LANG>
        <DSCR>SCALIBOR Protectorband 65 cm gross f Hunde</DSCR>
        <ADDSCR>1 Stk</ADDSCR>
        <COMP>
          <NAME>MSD Animal Health GmbH</NAME>
          <GLN>7601001053854</GLN>
        </COMP>
      </ITEM>
I don't know how we could determine that this is for veterinary use. Or should the absence of 7680559950068 in Publications.xls trigger this action? When building the article.xml I have the following info at hand (from pry).
Skipping vet ?? 7680559950068 {:refdata=>true, :_type=>:pharma, :ean=>"7680559950068", :pharmacode=>"2930393", :stat_date=>"", :lang=>"DE", :desc=>"SCALIBOR Protectorband 65 cm gross f Hunde", :atc_code=>"", :additional_desc=>"1 Stk", :company_name=>"MSD Animal Health GmbH", :company_ean=>"7601001053854"}
Or should I exclude all articles from the company 'MSD Animal Health GmbH'?
Discussed the problem with Marco Descher. He marked all ATC codes from the group 'Q'.
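A check on the 'Q' group could look like the sketch below. ATCvet codes are formed by prefixing the human ATC code with "Q", as in QP53AC11 above. The helper name `vet_atc?` is made up for illustration.

```ruby
# Hypothetical helper: true if the ATC code belongs to the ATCvet group 'Q'.
def vet_atc?(atc_code)
  atc_code.to_s.strip.upcase.start_with?('Q')
end

# Fields taken from the pry output above; note the empty :atc_code.
item = { ean: '7680559950068', atc_code: '' }

puts vet_atc?('QP53AC11')
puts vet_atc?(item[:atc_code])
```

Since the SwissIndex entry for 7680559950068 carries an empty ATC code, such a check only helps when the ATC code comes from another source.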
Side note:
His biggest problem with oddb2xml is the 2316 "Invalid number" entries, e.g. 7680628610022: Invalid number string 6 x 2.5 ml in Product/PackGrSwissmedic, where he would like to have a correct description of the package size/unit. Created a small Ruby helper Attach:oddb2xml_invalid_number.rb and created a unique list of the patterns used. See Attach:oddb2xml_invalid_number_log.txt. Most of them would be easy to parse, but a few will probably be impossible to convert, as they don't fit the pattern of quantity/size, e.g. 84 (4 x 21), 84+88.
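As a rough illustration of the easy cases, the sketch below parses strings of the form "count x size unit" and returns nil for anything else. The pattern and the name `parse_pack_size` are assumptions made here. They are not taken from oddb2xml or from the attached helper.

```ruby
# Matches an optional "count x", a size, and an optional unit, e.g. "6 x 2.5 ml".
PACK_PATTERN = /\A\s*(?:(\d+)\s*x\s*)?(\d+(?:\.\d+)?)\s*([[:alpha:]]+)?\s*\z/i

# Hypothetical helper: returns count, size and unit, or nil if the
# string does not fit the quantity/size pattern.
def parse_pack_size(text)
  match = PACK_PATTERN.match(text)
  return nil unless match

  { count: (match[1] || 1).to_i, size: match[2].to_f, unit: match[3] }
end

['6 x 2.5 ml', '84 (4 x 21)', '84+88'].each do |sample|
  puts "#{sample.inspect} => #{parse_pack_size(sample).inspect}"
end
```

Strings such as 84 (4 x 21) or 84+88 come back as nil and would still need manual treatment.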