---
Suddenly we got errors like this:
Plugin: ODDB::TextInfoPlugin
Error: NoMemoryError
Message: failed to allocate memory
Backtrace: /var/www/oddb.org/src/model/text.rb:395:in `block in wrap'
Observed that the import consumed over 13 GB of RAM before import_daily failed on oddb-ci2. The Aips*.xml file grew from 595 MB in November 2013 to 733 MB in May 2014, and using Nokogiri's XPath is very memory-consuming. Trying to switch to sax-machine. Made a local patch and restarted import_daily.
Called sudo gem install sax-machine --version 0.1.0
on oddb-ci2. Fixed some nil accesses in the new code. After 9 minutes memory seems to stay between 6700 and 7100 MB. After 50 minutes memory went up to about 9000 MB.
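To illustrate why a streaming parser needs less memory than XPath on a full document tree, here is a minimal sketch. It is not the local patch itself and it does not use sax-machine: to stay free of extra gems it swaps in the REXML stream parser that ships with Ruby. The element name title and the helper stream_titles are hypothetical and were chosen only for the example. The real Aips*.xml layout may differ.

```ruby
require "rexml/document"
require "rexml/streamlistener"

# Collects the text of every <title> element (hypothetical element name)
# while the parser streams through the input. No document tree is built,
# so only the text currently being read is held in memory.
class TitleListener
  include REXML::StreamListener
  attr_reader :titles

  def initialize
    @titles = []
    @in_title = false
    @buffer = +""
  end

  # Called for each opening tag.
  def tag_start(name, _attrs)
    return unless name == "title"
    @in_title = true
    @buffer = +""
  end

  # Called for each chunk of character data.
  def text(data)
    @buffer << data if @in_title
  end

  # Called for each closing tag.
  def tag_end(name)
    return unless name == "title"
    @titles << @buffer.strip
    @in_title = false
  end
end

# Accepts a String or an IO (for example File.open(path)).
def stream_titles(xml)
  listener = TitleListener.new
  REXML::Document.parse_stream(xml, listener)
  listener.titles
end
```

For a large file one would pass an open File instead of a String, so the XML is never loaded as a whole.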
Will export the old xls files as csv and create a corresponding xlsx file. (test/data/xls/Packungen.xls and test/data/xls/Packungen.older.xls)
The idea for the implementation is to mark all items coming from refdata by adding a refdata flag (BagXmlExtractor, SwissIndexExtractor, MigelExtractor, SwissmedicInfoExtractor). When emitting (builder.rb) we test for this flag.
Added some unit tests and made the changes. Migel products required a separate lookup. Now running oddb2xml -e to check the result. Adapted oddb2xml.xsd. Bumped version to 1.8.5. The new element REF_DATA will always be created when emitting oddb_article.xml.
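The flag-and-test idea described above can be sketched as follows. This is a simplified illustration and not the code in builder.rb: the helpers mark_refdata and ref_data_element, the :refdata key and the hash layout keyed by EAN13 are assumptions made for the example.

```ruby
# Hypothetical stand-in for an extractor result: a hash of EAN13 => item hash.
# Every item that comes from refdata gets the flag set.
def mark_refdata(items)
  items.each_value { |item| item[:refdata] = true }
  items
end

# Hypothetical emitter helper: REF_DATA is always written,
# with 1 for flagged items and 0 for everything else.
def ref_data_element(item)
  "<REF_DATA>#{item[:refdata] ? 1 : 0}</REF_DATA>"
end

refdata_items = mark_refdata("7680000000001" => { name: "From refdata" })
other_item    = { name: "Only in transfer.dat" }

puts ref_data_element(refdata_items["7680000000001"])
puts ref_data_element(other_item)
```

Writing the element unconditionally keeps the schema simple, because REF_DATA can then be declared as a required element in oddb2xml.xsd.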
Checking the results
2014-06-04 10:41:49: build_article. Done 159363 of 159363 articles
DE
  Pharma products: 15590
  NonPharma products: 28992
FR
  Pharma products: 15590
  NonPharma products: 28992
Prices zur Rose: 136962
2014-06-04 10:41:54 +0200: 103 done. Took 2187 seconds
Added 52997 via pharmacodes of 136962 items when extracting the transfer.dat from "Zur Rose"
found 775 lines with duplicated ean13
niklaus@ng-tr /o/s/oddb2xml> grep -c "REF_DATA>0" oddb_article.xml
115512
niklaus@ng-tr /o/s/oddb2xml> grep -c "REF_DATA>1" oddb_article.xml
44602
There is a discrepancy: 15590 Pharma plus 28992 NonPharma products give only 44582 refdata products, but I emitted 44602 articles with REF_DATA set to 1. Why? How can I find the 20 items that cause this difference? Brute-force debugging is probably needed.
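One way to do the brute-force search is to compare the identifiers of the refdata products with the identifiers of the articles emitted with REF_DATA 1. The sketch below assumes both lists can be collected as arrays of strings (for example EAN13 codes). The helper explain_discrepancy is hypothetical and not part of oddb2xml. It needs Ruby 2.7 or newer for Array#tally.

```ruby
require "set"

# expected: identifiers of all Pharma and NonPharma refdata products
# emitted:  identifiers of all articles written with REF_DATA 1
def explain_discrepancy(expected, emitted)
  expected_set = expected.to_set
  {
    # emitted with the flag although not a known refdata product
    unexpected: emitted.uniq.reject { |id| expected_set.include?(id) },
    # emitted more than once, which also inflates the grep count
    emitted_twice: emitted.tally.select { |_id, count| count > 1 }.keys,
    # refdata products that were never emitted
    missing: expected.uniq - emitted
  }
end

# The numbers from the run above:
puts 15_590 + 28_992          # 44582 expected refdata products
puts 44_602 - (15_590 + 28_992) # 20 articles unaccounted for
```

If unexpected and emitted_twice together contain 20 entries, they are the candidates to inspect first.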
Pushed the commits "Bumped version to 1.8.5" and "Added refdata field".
Running rake test to ensure that everything is okay.