Reference
Note
zubef2011.pdf
%PDF-1.4 %????^M 1 0 obj << /Title (?ﺯ8Z???^^???u???~K?^@^X?\\~? ?) /Author (?٭?S???^P???a???n`) /Creator (?ɮ^_S???^K???d???ez?\n^R?.W?^F??b!) /Producer (?ɮ^_S???^K???=?ȣ"C?^A^N?tC?7??p!??$???x0MDZ?o?Z?????Y^Y???Sѧ?e^S?/^Q) /CreationDate (˗?i^C???H???!?ȴ!$?HZ?$) >> endobj
Note
zubef2010.pdf
<</Length 3674/Subtype/XML/Type/Metadata>>stream ... endstream^Mendobj^M 2536 0 obj^M<</EncryptMetadata false/Filter/Standard/Length 128/O(?\r???i8?h?}^Q^W??3^Fn?Q>??^B?d?b????) ...
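The dump above shows an encryption dictionary with /Filter/Standard, which is why the strings in the info dictionary are unreadable as stored. A minimal sketch for spotting that marker in raw file bytes follows. standard_encryption? is a hypothetical helper, not part of rpdf2txt, and it is only a naive byte scan: it does not parse the trailer, so the same byte sequence appearing elsewhere in the file would produce a false positive.

```ruby
# Hypothetical helper (not part of rpdf2txt): naive scan of raw PDF bytes
# for a dictionary entry that names the Standard security handler.
def standard_encryption?(raw)
  # Treat the input as binary so encrypted bytes cannot raise encoding errors.
  data = raw.dup.force_encoding(Encoding::BINARY)
  # Allow optional whitespace between the /Filter key and its name value.
  !!(data =~ %r{/Filter\s*/Standard\b}n)
end
```

For a file on disk the call would look like standard_encryption?(File.binread('zubef2010.pdf')).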
Note
Experiment (lib/rpdf2txt/parser.rb)
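def check_producer
  # Walk every object in the catalogue and print its class.
  object_catalogue.values.each do |v|
    p v.class
    begin
      # Print the :producer entry when the object's contents have one.
      if producer = v.contents[:producer]
        p producer
      end
    rescue
      # Skip objects whose contents cannot be looked up by :producer.
    end
  end
end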
#parser.extract_text(handler)
parser.check_producer
Run
masa@masa ~/ywesee/rpdf2txt $ ruby -I lib bin/rpdf2txt zubef2011.pdf > test.dat
Result
...
Rpdf2txt::PageLeaf
Rpdf2txt::Unknown
Rpdf2txt::PdfHash
"\377\311\256\037S\371\251\335\v\210\364\363=\277\310\243\"C\333\001\016\365tC\3027\263\335p!\272\203$\373?\315x0M\307\261\340o\333Z\215\372\323\370\341Y\031\372\241\320S\321\247\271e\023\265/\021"
Rpdf2txt::Unknown
Rpdf2txt::Unknown
Rpdf2txt::Stream
...
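The Producer value printed above is mostly non-printable bytes, which suggests that the raw, still-encrypted string is being returned. A rough way to flag such values during experiments is sketched below. mostly_printable? and its 0.9 threshold are assumptions made for illustration, not part of rpdf2txt. The check is a heuristic only: a legitimate PDF text string stored as UTF-16BE with a byte order mark would also fail it.

```ruby
# Hypothetical helper: true when at least `threshold` of the bytes are
# printable ASCII (0x20..0x7e). The threshold value is an arbitrary choice.
def mostly_printable?(str, threshold = 0.9)
  bytes = str.bytes
  return true if bytes.empty?
  printable = bytes.count { |b| b.between?(0x20, 0x7e) }
  printable.to_f / bytes.size >= threshold
end

# The first bytes of the value printed by check_producer above.
raw_producer = "\377\311\256\037S\371\251\335\v\210\364\363="
warn "Producer looks encrypted or binary" unless mostly_printable?(raw_producer)
```

A readable value such as the one Origami returns further down this page passes the same check.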
Note
Experiment
test.rb
require 'origami'
include Origami

pdf = PDF.read('zubef2011.pdf', {:verbosity => Parser::VERBOSE_QUIET})
docinfo = pdf.get_document_info
pro = Origami::Name.new('Producer')
p docinfo[pro]
Result
"pdfFactory 3.25 (Windows Server 2003 R2 Standard Edition German)"