
20120618-search-result-atc-export-oddb-yaml



Summary

  • Updated the dosing.de link in the search result by sequence.
  • Debugged export_fachinfo_yaml
    • The encoding (escaped Unicode) comes from the YAML engine syck.
    • The YAML engines psych and syck are not compatible with each other.
    • We have to update the ODDB objects that are handled as binary by the psych engine.

Commits

Index


dosing.de link in search sequence resultlist

Goal: create a link to dosing.de in the search result by sequence (Markenname).

This result list already has the link.

This result list does not have the dosing.de link yet.

Problem

When updating via update_bsv, the Package model holds a pointer to atc.code under the key atc_class (not atc_code).
The atc_class pointer of each Package has to be updated so that it contains the new atc.ni_id.
However, I could not update this pointer of Package.

in src/plugin/bsv_xml.rb

...
          if pac.public? && pac.sl_entry
            @known_packages.store pac.pointer, {
              :name_base           => pac.name_base,
              :atc_class           => (atc = pac.atc_class) && atc.code,
              :pharmacode_oddb     => pac.pharmacode,
              :swissmedic_no5_oddb => pac.iksnr,
              :swissmedic_no8_oddb => pac.ikskey,
            }   
...

The search result list, in turn, creates new ATC and SearchResult objects:

...
              sequences.each { |seq|
                      code = (atc = seq.atc_class) ? atc.code : 'n.n'
                      new_atc = atc_classes.fetch(code) { 
                              atc_class = ODDB::AtcClass.new(code)
                              atc_class.descriptions = atc.descriptions unless(atc.nil?)
                              atc_classes.store(code, atc_class)
                      }
                      new_atc.sequences.push(seq)
              }
...
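
The grouping step above can be tried outside ODDB. The following is a minimal sketch, assuming simple Struct stand-ins (AtcStub and SeqStub are hypothetical, not the real ODDB::AtcClass or sequence classes) and made-up sample data. It shows that Hash#fetch with a block, combined with Hash#store, creates one new ATC object per code and collects sequences without an ATC class under 'n.n'.

```ruby
# Hypothetical stand-ins for ODDB::AtcClass and a sequence; illustration only.
AtcStub = Struct.new(:code, :descriptions, :sequences)
SeqStub = Struct.new(:name, :atc_class)

# Group sequences under a freshly created ATC object per code,
# mirroring the fetch/store pattern of the search result list.
def group_by_atc(sequences)
  atc_classes = {}
  sequences.each do |seq|
    atc  = seq.atc_class
    code = atc ? atc.code : 'n.n'
    new_atc = atc_classes.fetch(code) do
      fresh = AtcStub.new(code, atc && atc.descriptions, [])
      atc_classes.store(code, fresh) # Hash#store returns the stored value
    end
    new_atc.sequences.push(seq)
  end
  atc_classes
end

# Sample data (assumed for the example).
c07  = AtcStub.new('C07AA05', { 'de' => 'Propranolol' }, [])
seqs = [SeqStub.new('Inderal 10', c07),
        SeqStub.new('Inderal 40', c07),
        SeqStub.new('Unknown', nil)]

group_by_atc(seqs).each do |code, atc|
  puts "#{code}: #{atc.sequences.map(&:name).join(', ')}"
end
```

Because the result list works on these new objects rather than on the stored ones, anything the link needs has to be copied over at creation time, as descriptions is here.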

I updated this creation of the search result list.

commit
result

e.g.


Debug_export_oddb_yaml

The following exporter jobs write escaped Unicode characters into their YAML files.

  • jobs/export_fachinfo_yaml (Exporter.export_fachinfo_yaml)
  • jobs/export_patinfo_yaml (Exporter.export_patinfo_yaml)

e.g. (fachinfo.yaml)

...
text: "1 Tablette Inderal enth\xC3\xA4lt 10 mg oder 40 mg Propranololi hydrochloridum."
...
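
The escapes in this line are the UTF-8 bytes of the original character written out as literal \xNN text (\xC3\xA4 is "ä"). As a minimal sketch, assuming Ruby 2.0 or later for String#b, the hypothetical helper unescape_hex below (not part of ODDB) turns such text back into readable UTF-8. It is meant for inspecting an exported file, not as a fix for the exporter.

```ruby
# Hypothetical helper, not part of ODDB: turn literal \xNN sequences
# back into bytes and label the result as UTF-8.
def unescape_hex(text)
  text.b.gsub(/\\x([0-9A-Fa-f]{2})/) { $1.hex.chr }.force_encoding('UTF-8')
end

# Single quotes keep the backslashes literal, as they appear in the file.
puts unescape_hex('1 Tablette Inderal enth\xC3\xA4lt 10 mg')
```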

The exporter jobs need exportd (in ext/export).

$ ruby jobs/export_fachinfo_yaml

YAML.dump

First I checked YAML.dump on its own, outside exportd.

test
require 'yaml'

test = ["Grüße", "Öl", "Käse"]
dump = YAML.dump(test)
p dump # => "---\n- Grüße\n- Öl\n- Käse\n"
File.open('test', 'w') do |fh|
  test = ["Grüße", "Öl", "Käse"]
  YAML.dump(test, fh) 
end
# =>
---
- Grüße
- Öl
- Käse

Tempfile.open

I added an :encoding argument to Tempfile.open in ODBAExporter.safe_export, but the encoding problem still occurs.

in ext/export/odba_exporter.rb

...
Tempfile.open(name, dir, :encoding => 'utf-8') { |fh|
        block.call(fh)
        fh.close
...
experiment
require 'tempfile'
require 'fileutils'
require 'yaml'

Tempfile.open('test', './') do |fh|
  test = ["Grüße", "Öl", "Käse"]
  YAML.dump(test, fh) 
  fh.close
  FileUtils.mv(fh.path, 'result')
  FileUtils.chmod(0644, 'result')
end
# =>
---
- Grüße
- Öl
- Käse
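
One possible reason the :encoding argument makes no difference: once the text is escaped it is plain ASCII, so the file's external encoding has nothing left to convert. A minimal sketch, assuming a hypothetical helper write_and_read_back and a hand-written escaped string:

```ruby
require 'tempfile'

# Hypothetical helper: write text through a UTF-8 Tempfile and read it back.
def write_and_read_back(text)
  Tempfile.open('enc-test', encoding: 'utf-8') do |fh|
    fh.write(text)
    fh.flush
    File.read(fh.path)
  end
end

# Literal backslashes, as produced by the escaping YAML engine.
escaped = '- "Gr\xC3\xBC\xC3\x9Fe"'
puts write_and_read_back(escaped) == escaped
```

The escaped text comes back byte for byte, which suggests the escaping has to be prevented before the string reaches the file handle.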

Refs.

OdbaExporter.compress

Next I suspected that OdbaExporter.compress has an encoding problem.

in ext/exporter/src/odba_exporter.rb

...
      File.open(name, "r") { |fh|
        fh.each { |line|
          p line
          gzwriter << line
          zipwriter.puts(line)
        }
      }
...

But each line is already escaped when it is read, so compress is not the cause.

...
 heading: \"Mise \\xC3\\xA0 jour de l\\xE2\\x80\\x99information\"\n"
...
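
That compression itself leaves such lines untouched can be checked in isolation. This is a minimal sketch using Ruby's standard zlib library; gzip_round_trip is a hypothetical helper and does not reproduce OdbaExporter.compress (the zip writer is left out).

```ruby
require 'zlib'
require 'stringio'

# Hypothetical helper: gzip one line in memory and decompress it again.
def gzip_round_trip(line)
  buffer = StringIO.new(String.new) # binary in-memory buffer
  writer = Zlib::GzipWriter.new(buffer)
  writer << line
  writer.close
  Zlib::GzipReader.new(StringIO.new(buffer.string)).read
end

line = "heading: \"Mise \\xC3\\xA0 jour\"\n"
puts gzip_round_trip(line) == line
```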

YAML engine

I noticed that the YAML engine is set to syck in oddbd and exportd.

experiment
 def OdbaExporter.export_yaml(odba_ids, dir, name, opts={})
...
begin
            #YAML.dump(ODBA.cache.fetch(odba_id, nil), fh)
            YAML.dump("üüüü", fh) 
            fh.puts
...

# => "--- \"\\xC3\\xBC\\xC3\\xBC\\xC3\\xBC\\xC3\\xBC\"\n"

A standalone script with the engine set explicitly shows the same escaping:

require 'tempfile'
require 'fileutils'
require 'yaml'
#YAML::ENGINE.yamler = "psych"
YAML::ENGINE.yamler = "syck"

Tempfile.open('test', './') do |fh|
  test = ["Grüße", "Öl", "Käse"]
  YAML.dump(test, fh) 
  fh.close
  FileUtils.mv(fh.path, 'result')
  FileUtils.chmod(0644, 'result')
end
# =>
--- 
- "Gr\xC3\xBC\xC3\x9Fe"
- "\xC3\x96l"
- "K\xC3\xA4se"

However, we have to keep using syck in oddb.
exportd uses the quick_emit method, which is deprecated in the psych engine.
If oddb switches to the new psych engine, the whole YAML format breaks.
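
For reference, psych's counterpart to a quick_emit-based to_yaml is the encode_with(coder) hook, which psych calls when an object defines it. The following is a minimal sketch, assuming a hypothetical ChapterStub class with made-up fields. It is not the real ODDB text object and does not reproduce the existing export format.

```ruby
require 'yaml'

# Hypothetical stand-in for an ODDB text object; not the real class.
class ChapterStub
  def initialize(heading, text)
    @heading = heading
    @text = text
  end

  # Psych calls this hook when dumping; the coder receives the fields to emit.
  def encode_with(coder)
    coder['heading'] = @heading
    coder['text'] = @text
  end
end

puts YAML.dump(ChapterStub.new('Zusammensetzung', 'Inderal enthält Propranolol'))
```

Whether the tags and layout produced this way can be made to match the syck output that consumers of the export expect is a separate question that this sketch does not answer.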

Attach:fachinfo-with-psych-engine-20120618.yaml.txt

I tried overriding the to_yaml method of each object,
but I have not been able to fix the problem yet.
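
One way strings end up handled as binary by psych: a String tagged ASCII-8BIT that contains non-ASCII bytes is dumped as a base64 !binary scalar. A minimal sketch, assuming the stored bytes really are valid UTF-8; dump_as_utf8 is a hypothetical helper, and whether ODDB's stored strings are in this state is itself an assumption.

```ruby
require 'yaml'

# Hypothetical helper: relabel a copy of the string as UTF-8 before dumping.
# force_encoding only changes the encoding tag; the bytes stay the same.
def dump_as_utf8(str)
  YAML.dump(str.dup.force_encoding('UTF-8'))
end

binary = 'Käse'.b          # same bytes, tagged ASCII-8BIT
puts YAML.dump(binary)     # emitted with a !binary tag
puts dump_as_utf8(binary)  # emitted as readable text
```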

Refs

Page last modified on June 19, 2012, at 12:32 PM