Until now we kept a copy of each day with names like interactions_de_utf8-2017.08.15.csv or evidentia_fi_link-2017.08.16.csv. This is not desired.
Done with commit Remove old epha_interactions and evidentia_fi_link files
But Zeno clarified the requirement: he wants the old files to be removed only if they have the same content as today's. This would align the behaviour of src/util/latest.rb with that of swissmedic. We have various implementations of get_latest, as the requirements differ (e.g. XMLPublications.xml comes from a zip file, or we want to ignore the daily timestamp). We find it in the following files:
A previous attempt to unify them failed. I found an error in src/util/latest.rb which was not covered by a unit test. Fixed with commit Remove only yesterday files if equal latest
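The clarified requirement (delete an older dated copy only when its content equals the latest file) can be sketched as follows. This is a minimal illustration, not the code in src/util/latest.rb: the helper name remove_copies_equal_to_latest and the "-latest.csv" file name are assumptions made for the example.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical helper, NOT the implementation in src/util/latest.rb.
# Deletes every file matching `pattern` whose content is byte-identical
# to `latest_path`. Files with different content are kept.
def remove_copies_equal_to_latest(latest_path, pattern)
  removed = []
  Dir.glob(pattern).sort.each do |candidate|
    # Never delete the latest file itself, even if the pattern matches it.
    next if File.expand_path(candidate) == File.expand_path(latest_path)
    next unless FileUtils.compare_file(candidate, latest_path)
    FileUtils.rm(candidate)
    removed << candidate
  end
  removed
end

# Small demonstration in a throwaway directory.
Dir.mktmpdir do |dir|
  latest    = File.join(dir, 'interactions_de_utf8-latest.csv') # assumed name
  same      = File.join(dir, 'interactions_de_utf8-2017.08.15.csv')
  different = File.join(dir, 'interactions_de_utf8-2017.08.14.csv')
  File.write(latest, "a;b\n")
  File.write(same, "a;b\n")
  File.write(different, "a;c\n")

  pattern = File.join(dir, 'interactions_de_utf8-*.csv')
  removed = remove_copies_equal_to_latest(latest, pattern)
  puts "removed: #{removed.map { |path| File.basename(path) }.join(', ')}"
  puts "copy with different content kept: #{File.exist?(different)}"
end
```

Comparing content instead of timestamps means a day on which the upstream file really changed still leaves a dated copy behind.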
First I had to enable saving more than one changelog entry for the FI. Storing in ODBA failed because I had not added the line
    ODBA_SERIALIZABLE = ['@time', '@diff']
in the class FachinfoDocument::ChangeLogItem. I am reworking src/plugin/textinfo to save the existing changelog entries correctly, using
    sudo -u apache bundle-240 exec /usr/local/bin/ruby-240 jobs/update_textinfo_swissmedicinfo --target=fi --reparse 60384 55945
to test some examples after reloading the db dump of August 14.
Saving and restoring new change_log_items has problems. After saving in the plugin I restart ch.oddb and in a bin/admin session I get
    ch.oddb> $x = ODBA.cache.fetch(35161723)
    -> Array
    ch.oddb> $x.size
    -> 2
    ch.oddb> $x.first.time
    -> 2016-08-30
    ch.oddb> $x.last.time
    -> undefined method `time' for nil:NilClass
    Did you mean? timeout
Why was the time not saved correctly, even though the ChangeLogItem class has the following definition?
    class ChangeLogItem
      include Persistence
      attr_accessor :time, :diff
      ODBA_SERIALIZABLE = ['@time', '@diff']
      def <=>(anOther)
        # [diff.to_s, time] <=> [anOther.diff.to_s, anOther.time]
        diff.to_s <=> anOther.diff.to_s
      end
      def pointer_descr
        time.strftime('%d.%m.%Y')
      end
    end
I was also surprised that in TextInfoPlugin::store_fachinfo the call fachinfo.fr.text
returned the German text, whereas fachinfo.description(fr).text
correctly returned the French text.
Now I am getting the following error when trying to save the change_log
    [1] pry(#<ODDB::FachinfoDocument2001>)> odba_store
    IOError: closed stream
    from /var/www/oddb.org/vendor/ruby/2.4.0/gems/odba-1.1.2/lib/odba/marshal.rb:10:in `internal_encoding'
    [20] pry(ODDB::TextInfoPlugin)> reg.fachinfo.fr.change_log.last.diff.class
    => Diffy::Diff
In the same pry session, while trying to store the fachinfo, the ODBA object looks like this
    [29] pry(ODDB::TextInfoPlugin)> reg.fachinfo.fr.change_log.last.time
    => 2017-08-21
    [30] pry(ODDB::TextInfoPlugin)> reg.fachinfo.fr.change_log.last.odba_id
    => 36113256
bin/admin reports
    ch.oddb> registration('55945').fachinfo.fr
    -> #<ODDB::FachinfoDocument2001:0x007feac6536688>
    ch.oddb> registration('55945').fachinfo.fr.change_log.size
    -> 1
I reverted src/model/fachinfo.rb to its state of last year, as I suspect the error was in src/plugin/fachinfo.rb, where we did not preserve the old change_log.
This did not work and I got
    [13] pry(ODDB::TextInfoPlugin)> reg.iksnr
    => "60384"
    [14] pry(ODDB::TextInfoPlugin)> lang
    => "fr"
    [15] pry(ODDB::TextInfoPlugin)> app.registration('60384').fachinfo.fr.change_log.first.odba_id
    => 35827855
    [16] pry(ODDB::TextInfoPlugin)> app.registration('60384').fachinfo.fr.change_log.last.odba_id
    => 36113198
    [17] pry(ODDB::TextInfoPlugin)> x = ODBA.cache.fetch(36113198)
    ODBA::OdbaError: Unknown odba_id 36113198
    from /var/www/oddb.org/vendor/ruby/2.4.0/gems/odba-1.1.2/lib/odba/cache.rb:639:in `restore_object'
    [18] pry(ODDB::TextInfoPlugin)> app.registration('60384').fachinfo.fr.change_log.last.time
    => 2017-08-21
I suspect that I must add a few lines to the Diffy::Diff class to persist it correctly, e.g. I will try to add
    class Diffy::Diff
      include Persistence
      ODBA_SERIALIZABLE = ['@default_format', '@default_options', '@string1', '@string2', '@options']
    end
Finally I found the culprit. The Diffy gem keeps an instance variable @tempfiles in its Diff class, which contains closed files, and ODBA is unable to dump them. This can be fixed with the following monkey patch
    class Diffy::Diff
      def close_tempfiles
        @tempfiles.each{|x| x.close unless x.closed?}
        @tempfiles = []
      end
    end
and adding item.diff.close_tempfiles before calling item.odba_store.
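The effect of dropping the tempfile references before serializing can be reproduced without ODBA or Diffy. The sketch below is only an approximation: FakeDiff is a hypothetical stand-in for Diffy::Diff, and plain Marshal stands in for ODBA's own marshalling, which may fail with a different exception than the one seen in the pry session.

```ruby
require 'tempfile'

# Hypothetical stand-in for Diffy::Diff: it keeps references to tempfiles
# in @tempfiles, which is the state that blocked serialization.
class FakeDiff
  attr_reader :string1, :string2

  def initialize(string1, string2)
    @string1 = string1
    @string2 = string2
    @tempfiles = [string1, string2].map do |content|
      file = Tempfile.new('fake_diff')
      file.write(content)
      file.close
      file
    end
  end

  # Same idea as the monkey patch: close what is still open,
  # then drop the references so only plain data remains.
  def close_tempfiles
    @tempfiles.each { |file| file.close unless file.closed? }
    @tempfiles = []
  end
end

# Returns true when Marshal can serialize the object, false otherwise.
def dumpable?(object)
  Marshal.dump(object)
  true
rescue TypeError, IOError
  false
end

diff = FakeDiff.new("old text\n", "new text\n")
before = dumpable?(diff)
diff.close_tempfiles
after = dumpable?(diff)
puts "dumpable before cleanup: #{before}, after cleanup: #{after}"
```

Emptying @tempfiles is the step that matters here; closing the files alone would still leave file objects inside the object graph that gets dumped.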
But I think the best way is to fork the Diffy gem, patch it and submit a pull request.
This works nicely, as seen with the attached screenshot:
Pushed the following commits
Reimporting thinpower database and running import_daily before pushing the changes for oddb.org.