could not connect to www.swissreg.ch: #<Net::HTTPInternalServerError:0x007f8a7d69bb58>
---
For the motivation see http://tunein.yap.tv/ruby/2012/06/08/elegant-xml-parsing/.
Changing to use the SAX parser http://nokogiri.org/Nokogiri/XML/SAX/Document.html. We hope this solves the problem on Windows, where we get the following error when running oddb2xml -e
All read methods of File(IO) class can not read whole data on Windows. (got "failed to allocate memory").
I could not reproduce this behaviour on my Windows 7 virtual machine.
I had patches lying around to use the builder gem, see Attach:use_builder_patch.txt. Tried them and noticed that the output contained
<send>DSCRDKendural Depottabl </send>
instead of <DSCRD>Kendural Depottabl </DSCRD>
for oddb_product.xml.
Therefore I am not looking any closer at this alternative.
Running oddb2xml -e --log
only on an ARM Linux machine with 256 MB RAM @ 1 GHz to see whether we need more memory. Also patching test_options.rb to prepend a time -v
to each command to get an exact value of consumed memory.
Using a SAX parser (and sax-machine) to reduce the memory usage was much more invasive (and time-consuming to add) than expected. For the first part see Attach:use_sax_machine_patch.txt.
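To illustrate why an event-based parser needs less memory than building a full document tree, here is a minimal sketch. It deliberately uses REXML's stream listener from the Ruby distribution instead of Nokogiri or sax-machine, so that it runs without extra gems; the callbacks of Nokogiri::XML::SAX::Document are analogous. The class name DescriptionListener, the PRODUCT and PRD wrapper elements, and the second sample product are assumptions made for the example and are not taken from oddb2xml.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'
require 'stringio'

# Hypothetical listener: collects the text of every <DSCRD> element.
# Only the element currently being read is held in memory, no DOM tree is built.
class DescriptionListener
  include REXML::StreamListener

  attr_reader :descriptions

  def initialize
    @descriptions = []
    @buffer = nil # non-nil only while we are inside a <DSCRD> element
  end

  def tag_start(name, _attrs)
    @buffer = +'' if name == 'DSCRD'
  end

  def text(data)
    @buffer << data if @buffer
  end

  def tag_end(name)
    return unless name == 'DSCRD' && @buffer

    @descriptions << @buffer.strip
    @buffer = nil
  end
end

# Made-up sample input; the real oddb_product.xml is much larger.
SAMPLE_XML = <<~XML
  <PRODUCT>
    <PRD><DSCRD>Kendural Depottabl </DSCRD></PRD>
    <PRD><DSCRD>Example Product </DSCRD></PRD>
  </PRODUCT>
XML

listener = DescriptionListener.new
REXML::Document.parse_stream(StringIO.new(SAMPLE_XML), listener)
puts listener.descriptions.inspect
```

The price of this approach is that every piece of code which previously navigated a parsed tree has to be rewritten around callbacks or mapped classes, which is probably why the change turned out to be so invasive.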
Adapting the other 2 occurrences of the XML parser, too. Now rake spec
works again. Running rake test
fails with
/opt/src/oddb2xml/lib/oddb2xml/extractor.rb:264:in `block (2 levels) in to_hash': undefined method `to_a' for #<Oddb2xml::LimitationElement:0x000000034e1be8> (NoMethodError)
	from /opt/src/oddb2xml/lib/oddb2xml/extractor.rb:233:in `each'
	from /opt/src/oddb2xml/lib/oddb2xml/extractor.rb:233:in `block in to_hash'
	from /opt/src/oddb2xml/lib/oddb2xml/extractor.rb:202:in `each'
	from /opt/src/oddb2xml/lib/oddb2xml/extractor.rb:202:in `to_hash'
	from /opt/src/oddb2xml/lib/oddb2xml/cli.rb:231:in `block (2 levels) in download'
	from /opt/src/oddb2xml/lib/oddb2xml/cli.rb:230:in `synchronize'
	from /opt/src/oddb2xml/lib/oddb2xml/cli.rb:230:in `block in download'
There still seem to be differences in the emitted limitations, but the maximum memory use went down. E.g. for bundle exec bin/oddb2xml --skip-download --log -f xml
version 1.8.0 used Maximum resident set size (kbytes): 1832776
whereas with my local changes we need now Maximum resident set size (kbytes): 1277148
or about a 30% reduction. But as we hold a lot of information in memory, memory consumption is still very high. If we wanted to change this, we would be forced to use a database, e.g. sqlite3, and limit its memory usage (see e.g. this link). The build also seems to complete in less time (also approximately 30% less).
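The 30% figure can be checked from the two time -v lines quoted above. The following is a small sketch; the helper names max_rss_kbytes and reduction_percent are made up for this example and are not part of oddb2xml or test_options.rb.

```ruby
# Extract the peak memory value from a line of `time -v` output.
# Returns nil if the line does not contain the expected field.
def max_rss_kbytes(time_output)
  match = time_output.match(/Maximum resident set size \(kbytes\): (\d+)/)
  match && match[1].to_i
end

# Relative saving in percent, rounded to one decimal place.
def reduction_percent(before, after)
  ((before - after) * 100.0 / before).round(1)
end

before = max_rss_kbytes('Maximum resident set size (kbytes): 1832776') # version 1.8.0
after  = max_rss_kbytes('Maximum resident set size (kbytes): 1277148') # local changes

puts "saved #{before - after} kbytes (#{reduction_percent(before, after)}%)"
```

So the run with the local changes needs roughly 1.2 GB instead of 1.7 GB at its peak.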
Solved with commit "Adapt to new location of epha interaction csv-file".