view · edit · sidebar · attach · print · history

20101203-update-outline-spreadsheet

<< Masa.20101206-update-outline-spreadsheet | 2010 | Masa.20101202-update-outline-sreadsheet >>


  1. Review
  2. Look at bit data of column info

Goal
  • Check outline function spreadsheet / 80%
Milestones
  1. Review
  2. BraSt
  3. Dig into Writer and Reader classes
    • Look at bit data 10:30
    • BraSt
Summary
Commits
ToDo Tomorrow
Keep in Mind
  1. todos
  2. BAG: Files published obviously only on the first ch.ODDB.org Report - Error: Generikaliste - 12/2010 log passwd
  3. Testcase of 'Object Stream' parser (for rpdf2txt/lib/rpdf2txt/parser.rb#rebuild_object_catalogue method) 20101126-update-rpdf2txt
  4. Testcases of lib/oddb/html/state/global.rb#grant_download, lib/oddb/html/view/download.rb#to_html
  5. Debug testcases in test/export/test_server.rb de.oddb.org
  6. A bug import_gkv Tue Nov 16 02:00:10 2010: de.oddb.org Zubef (PDF)
  7. Compression (refer to lib/oddb/export/server.rb), Test cases (grant_download, Logging, Reporting)
  8. Log Error: on production server, de.oddb.org/log/import_dimdi, import_pharmnet
  9. On Ice
  10. emerge --sync

Review

masa@masa ~/ywesee/spreadsheet $ cat test.rb 
require 'spreadsheet'

book = Spreadsheet::Workbook.new
sheet = book.create_worksheet

sheet.column(1).hidden = true
sheet.column(2).hidden = true

book.write 'out.xls' 

Result

  • 'B' and 'C' columns are hidden, but outline (grouping) buttons are not shown.

BraSt

  • Is there some code (methods) to utilize outline (grouping)?
  • I have to read more Reader and Writer classes
    • If there is, try to use it and confirm
    • If there is not, I have to check and read xls byte structure
  • Check the bit structure of column

Look at bit data of column info

Experiment

  1. test1.xls: normal file
  2. test2.xls: 'C' and 'D' columns are not outlined but hidden
  3. test3.xls: 'C' and 'D' columns are outlined but not hidden
  4. test4.xls: 'C' and 'D' columns are outlined and hidden
  5. test5.xls: 'C' and 'D' columns are hidden
    • test1.xls - test4.xls: made by MS Office
    • test5.xls: made by spreadsheet

lib/spreadsheet/excel/reader.rb#read_colinfo

  def read_colinfo worksheet, work, pos, len
p "getin read_colinfo"

test.rb

require 'spreadsheet'

file = ARGV[0]
book = Spreadsheet.open(file, 'rb')
sheet= book.worksheet(0)
5.times {|i| sheet[0,i] = i}
book.write "out.xls"

Result

masa@masa ~/ywesee/spreadsheet $ ruby -I lib test.rb test1.xls 
masa@masa ~/ywesee/spreadsheet $ ruby -I lib test.rb test2.xls 
"getin read_colinfo"
masa@masa ~/ywesee/spreadsheet $ ruby -I lib test.rb test3.xls 
"getin read_colinfo"
masa@masa ~/ywesee/spreadsheet $ ruby -I lib test.rb test4.xls 
"getin read_colinfo"
"getin read_colinfo"
masa@masa ~/ywesee/spreadsheet $ ruby -I lib test.rb test5.xls 
"getin read_colinfo"
"getin read_colinfo"
"getin read_colinfo"

Notes

  • Interesing
  • It seems that 'outline' and 'hidden' functions are save independetly
  • I have to look at them in more detail

Experiment (Rename files)

  1. normal.xls: normal file
  2. hidden.xls: 'C' and 'D' columns are not outlined but hidden
  3. outline.xls: 'C' and 'D' columns are outlined but not hidden
  4. outline_hidden.xls: 'C' and 'D' columns are outlined and hidden
  5. hidden_by_sp.xls: 'C' and 'D' columns are hidden made by spreadsheet

lib/spreadsheet/excel/reader.rb#read_colinfo

  def read_colinfo worksheet, work, pos, len
    # Offset  Size  Contents
    #      0     2  Index to first column in the range
    #      2     2  Index to last column in the range
    #      4     2  Width of the columns in 1/256 of the width of the zero
    #               character, using default font (first FONT record in the
    #               file)
    #      6     2  Index to XF record (&#10140; 6.115) for default column formatting
    #      8     2  Option flags:
    #               Bits  Mask    Contents
    #                  0  0x0001  1 = Columns are hidden
    #               10-8  0x0700  Outline level of the columns (0 = no outline)
    #                 12  0x1000  1 = Columns are collapsed
    #     10     2  Not used
    first, last, width, xf, opts = work.unpack binfmt(:colinfo)[0..-2]
print "fitst\tlast\twidth\txf\topts\n"
print first,"\t",last,"\t",width,"\t",xf,"\t",opts,"\n\n"

Result

masa@masa ~/ywesee/spreadsheet $ ruby -I lib test.rb hidden.xls 
fitst   last    width   xf      opts
2       3       0       15      3

masa@masa ~/ywesee/spreadsheet $ ruby -I lib test.rb outline.xls 
fitst   last    width   xf      opts
2       3       2925    15      256

masa@masa ~/ywesee/spreadsheet $ ruby -I lib test.rb outline_hidden.xls 
fitst   last    width   xf      opts
2       3       0       15      259

fitst   last    width   xf      opts
4       4       2925    15      4096

masa@masa ~/ywesee/spreadsheet $ ruby -I lib test.rb hidden_by_sp.xls 
fitst   last    width   xf      opts
0       1       2560    0       0

fitst   last    width   xf      opts
2       3       2560    0       1

fitst   last    width   xf      opts
4       4       2560    0       0

Notes

  • 'width' does not look so important
  • in the case of 'outline' and 'hidden' together, it looks like sum of them
  • but there is something set in the last column ('E' column)
  • It looks strange in the case of 'width=2560' of hidden columns (2 - 3 columns) (hidden_by_sp.xls)
  • and 'opts' is not '3' but '1', this is different from the file of MS Office
  • 'xf' is also different

Question

  • What is XF?

Experiment

lib/spreadsheet/excel/reader.rb#read_colinfo

  def read_colinfo worksheet, work, pos, len
    # Offset  Size  Contents
    #      0     2  Index to first column in the range
    #      2     2  Index to last column in the range
    #      4     2  Width of the columns in 1/256 of the width of the zero
    #               character, using default font (first FONT record in the
    #               file)
    #      6     2  Index to XF record (&#10140; 6.115) for default column formatting
    #      8     2  Option flags:
    #               Bits  Mask    Contents
    #                  0  0x0001  1 = Columns are hidden
    #               10-8  0x0700  Outline level of the columns (0 = no outline)
    #                 12  0x1000  1 = Columns are collapsed
    #     10     2  Not used
    first, last, width, xf, opts = work.unpack binfmt(:colinfo)[0..-2]
print "fitst\tlast\twidth\txf\topts\n"
print first,"\t",last,"\t",width,"\t",xf,"\t",opts,"\n\n"
    first.upto last do |col|
print "col=", col, "\n"
print "\twidth=\t\t", width.to_f / 256, "\n"
print "\thidden=\t\t", (opts & 0x0001) > 0, "\n"
print "\touline_level=\t", (opts & 0x0700), "\n"
print "\tcollapsed=\t", (opts & 0x1000) > 0, "\n"
      column = Column.new col, @workbook.format(xf),
                          :width         => width.to_f / 256,
                          :hidden        => (opts & 0x0001) > 0,
                          :collapsed     => (opts & 0x1000) > 0,
                          :outline_level => (opts & 0x0700)
      column.worksheet = worksheet
      worksheet.columns[col] = column
    end
puts
  end

Result

masa@masa ~/ywesee/spreadsheet $ ruby -I lib test.rb normal.xls 
masa@masa ~/ywesee/spreadsheet $ ruby -I lib test.rb hidden.xls 
fitst   last    width   xf      opts
2       3       0       15      3

col=2
        width=          0.0
        hidden=         true
        ouline_level=   0
        collapsed=      false
col=3
        width=          0.0
        hidden=         true
        ouline_level=   0
        collapsed=      false

masa@masa ~/ywesee/spreadsheet $ ruby -I lib test.rb outline.xls 
fitst   last    width   xf      opts
2       3       2925    15      256

col=2
        width=          11.42578125
        hidden=         false
        ouline_level=   256
        collapsed=      false
col=3
        width=          11.42578125
        hidden=         false
        ouline_level=   256
        collapsed=      false

masa@masa ~/ywesee/spreadsheet $ ruby -I lib test.rb outline_hidden.xls 
fitst   last    width   xf      opts
2       3       0       15      259

col=2
        width=          0.0
        hidden=         true
        ouline_level=   256
        collapsed=      false
col=3
        width=          0.0
        hidden=         true
        ouline_level=   256
        collapsed=      false

fitst   last    width   xf      opts
4       4       2925    15      4096

col=4
        width=          11.42578125
        hidden=         false
        ouline_level=   0
        collapsed=      true

masa@masa ~/ywesee/spreadsheet $ ruby -I lib test.rb hidden_by_sp.xls 
fitst   last    width   xf      opts
0       1       2560    0       0

col=0
        width=          10.0
        hidden=         false
        ouline_level=   0
        collapsed=      false
col=1
        width=          10.0
        hidden=         false
        ouline_level=   0
        collapsed=      false

fitst   last    width   xf      opts
2       3       2560    0       1

col=2
        width=          10.0
        hidden=         true
        ouline_level=   0
        collapsed=      false
col=3
        width=          10.0
        hidden=         true
        ouline_level=   0
        collapsed=      false

fitst   last    width   xf      opts
4       4       2560    0       0

col=4
        width=          10.0
        hidden=         false
        ouline_level=   0
        collapsed=      false

Experiment

create_new_file.rb

require 'spreadsheet'
book = Spreadsheet::Workbook.new
sheet = book.create_worksheet
5.times {|i| sheet[0,i] = i}
sheet.column(2).outline_level = 256
sheet.column(3).outline_level = 256
book.write 'out.xls'

Result

masa@masa ~/ywesee/spreadsheet $ ruby -I lib create_new_file.rb 
masa@masa ~/ywesee/spreadsheet $ vim create_new_file.rb 
masa@masa ~/ywesee/spreadsheet $ mv out.xls aaa.xls
masa@masa ~/ywesee/spreadsheet $ ruby -I lib test.rb aaa.xls 
fitst   last    width   xf      opts
0       1       2560    0       0

col=0
        width=          10.0
        hidden=         false
        ouline_level=   0
        collapsed=      false
col=1
        width=          10.0
        hidden=         false
        ouline_level=   0
        collapsed=      false

fitst   last    width   xf      opts
2       3       2560    0       0

col=2
        width=          10.0
        hidden=         false
        ouline_level=   0
        collapsed=      false
col=3
        width=          10.0
        hidden=         false
        ouline_level=   0
        collapsed=      false

fitst   last    width   xf      opts
4       4       2560    0       0

col=4
        width=          10.0
        hidden=         false
        ouline_level=   0
        collapsed=      false

Notes

  • 'outline_level' is set '0' not '256'
  • 'outline_level' value does not look to be converted into byte code

Experiment

lib/spreadsheet/excel/writer/worksheet.rb

  def write_colinfo bunch
    col = bunch.first
print "col.idx=", col.idx, "\n"
print "col.hidden?=", col.hidden?, "\n"
print "col.outline_level=", col.outline_level, "\n"
...

create_new_file.rb

require 'spreadsheet'


book = Spreadsheet::Workbook.new
sheet = book.create_worksheet
5.times {|i| sheet[0,i] = i}
#sheet.column(2).hidden = true
sheet.column(2).outline_level = 256
#sheet.column(3).hidden = true
sheet.column(3).outline_level = 256
book.write 'out.xls'

Result

masa@masa ~/ywesee/spreadsheet $ ruby -I lib create_new_file.rb 
col.idx=0
col.hidden?=false
col.outline_level=0
col.idx=2
col.hidden?=false
col.outline_level=256
col.idx=4
col.hidden?=false
col.outline_level=0

Notes

  • In 'write_colinfo' method, the outline_level is certainly 256

Next

lib/spreadsheet/excel/writer/worksheet.rb

  def write_colinfo bunch
    col = bunch.first
print "col.idx=", col.idx, "\n"
print "col.hidden?=", col.hidden?, "\n"
print "col.outline_level=", col.outline_level, "\n"
print "col.collapsed?=", col.collapsed?, "\n"
    width = col.width.to_f * 256
    xf_idx = @workbook.xf_index @worksheet.workbook, col.default_format
    opts =  0
    opts |= 0x0001 if col.hidden?
    opts |= col.outline_level.to_i << 8
    opts |= 0x1000 if col.collapsed?
print "opts=", opts, "\n"
...

Result

masa@masa ~/ywesee/spreadsheet $ ruby -I lib create_new_file.rb 
col.idx=0
col.hidden?=false
col.outline_level=0
col.collapsed?=false
opts=0

col.idx=2
col.hidden?=false
col.outline_level=256
col.collapsed?=false
opts=65536

col.idx=4
col.hidden?=false
col.outline_level=0
col.collapsed?=false
opts=0

Note

  • 'opts=65536' is strange

Experiment (create a new file forcedly with the same setting of outlined xls file)

lib/spreadsheet/excel/writer/worksheet.rb

  def write_colinfo bunch
    col = bunch.first
print "col.idx=", col.idx, "\n"
print "col.hidden?=", col.hidden?, "\n"
print "col.outline_level=", col.outline_level, "\n"
print "col.collapsed?=", col.collapsed?, "\n"
    width = col.width.to_f * 256
    xf_idx = @workbook.xf_index @worksheet.workbook, col.default_format
    opts =  0
    opts |= 0x0001 if col.hidden?
    opts |= col.outline_level.to_i << 8
    opts |= 0x1000 if col.collapsed?
print "opts=", opts, "\n"
if col.idx == 2
  opts = 256
end
xf_idx = 15

Run create

masa@masa ~/ywesee/spreadsheet $ ruby -I lib create_new_file.rb 

Result

masa@masa ~/ywesee/spreadsheet $ ruby -I lib test.rb aaa.xls 
fitst   last    width   xf      opts
0       1       2560    15      0

col=0
        width=          10.0
        hidden=         false
        ouline_level=   0
        collapsed=      false
col=1
        width=          10.0
        hidden=         false
        ouline_level=   0
        collapsed=      false

fitst   last    width   xf      opts
2       3       2560    15      256

col=2
        width=          10.0
        hidden=         false
        ouline_level=   256
        collapsed=      false
col=3
        width=          10.0
        hidden=         false
        ouline_level=   256
        collapsed=      false

fitst   last    width   xf      opts
4       4       2560    15      0

col=4
        width=          10.0
        hidden=         false
        ouline_level=   0
        collapsed=      false

Notes

  • These check values above are same as the outlined xls file (outline.xls)
  • but the file actually is NOT outlined
  • also outline function (grouping) does not work in OpenOffice

My guess

  • Not only 'outline_level' but something else is necessary to activate the outline (grouping) function

Check file sizes

masa@masa ~/ywesee/spreadsheet $ ls -la|grep xls
-rwxr--r--  1 masa masa 23552  3. Dez 08:40 hidden.xls
-rwxr--r--  1 masa masa 23040  3. Dez 08:40 normal.xls
-rwxr--r--  1 masa masa 23552  3. Dez 08:40 outline_hidden.xls
-rw-r--r--  1 masa masa 23552  3. Dez 08:40 outline.xls

Notes

  • All the file except for normal.xls are same

Consideration

  • This means that hidden and outline functions maybe uses the same part of data in a xls file
  • and no additional data for the setting (even if hidden or not, showing outlined button or not)
  • Probably 'on' 'off' switching data to show the outline buttons are there somewhere

BraSt

  • What should I do next?
  • Search binary viewer for xls file (Biff viewer?)
  • Look at and analyze binary data directly (not use spreadsheet methods)

Install BiffView Tool (on Windows7)

Link http://b2xtranslator.sourceforge.net/download.html

Create Biff data for each file

  1. Attach:normal.xls.html
  2. Attach:hidden.xls.html
  3. Attach:outline.xls.html
  4. Attach:outline_hidden.xls.html
  5. Attach:hidden_by_sp.xls.html

Diff

Notes

  • There is no 'COLINFO' in nomal.xls
  • There is only one 'COLINFO' in hidden.xls and outline.xls
  • There are two 'COLINFO' in outline_hidden.xls
  • There is no 'GUTS' in hidden_by_sp.xls

Focus on 'diff hidden.xls outline.xls' first

  • There is two differences
hidden.xls:  BIFF GUTS (80h) 	8 	00 00 00 00 00 00 00 00 
outline.xls: BIFF GUTS (80h) 	8 	00 00 1D 00 00 00 02 00 
hidden.xls:  BIFF COLINFO (7Dh) 	12 	02 00 03 00 00 00 0F 00  03 00 00 00 
outline.xls: BIFF COLINFO (7Dh) 	12 	02 00 03 00 6D 0B 0F 00  00 01 00 00 

My guess

  • The GUTS data is the outline buttons information

grep search

masa@masa ~/ywesee/spreadsheet $ grep -r GUTS lib
lib/spreadsheet/excel/internals.rb:    :guts         => 0x0080, # &#9675;  GUTS &#10140; 6.49
lib/spreadsheet/excel/writer/worksheet.rb:    # &#9675;  GUTS &#10140; 5.53

lib/spreadsheet/excel/writer/worksheet.rb#write_from_scratch

  def write_from_scratch
    # &#9679;  BOF Type = worksheet (&#10140; 5.8)
    write_bof
    # &#9675;  UNCALCED &#10140; 5.105
    # &#9675;  INDEX &#10140; 4.7 (Row Blocks), &#10140; 5.59
    # &#9675;  Calculation Settings Block &#10140; 4.3
    write_calccount
    write_refmode
    write_iteration
    write_saverecalc
    # &#9675;  PRINTHEADERS &#10140; 5.81
    # &#9675;  PRINTGRIDLINES &#10140; 5.80
    # &#9675;  GRIDSET &#10140; 5.52
    # &#9675;  GUTS &#10140; 5.53
    # &#9675;  DEFAULTROWHEIGHT &#10140; 5.31
    write_defaultrowheight
    # &#9675;  WSBOOL &#10140; 5.113
    write_wsbool
...

Notes

  • The method to create GUTS data is NOT implemented yet.

Next

  • Try to create GUTS data
view · edit · sidebar · attach · print · history
Page last modified on July 13, 2011, at 12:06 PM