
20121206-update-fiparsed



Summary

  • Update TextInfoPlugin
  • Update fiparsed

Commits

Index


Update fiparsed

table: handling of pre-formatted (old-style) table markup changed

These old-style tables contain no <tbody>, <thead>, or <th> tags. Each row is a single <td> whose columns are aligned with spaces.

source
   <table cellSpacing="0" cellPadding="0" border="0">
      <tr>
        <td>Alter       Körper-   Einzeldosis    Maximale       </td>
      </tr>
      <tr>
        <td>            gewicht                  Tagesdosis     </td>
      </tr>
      <tr>
        <td>Erwachsene  &gt;40 kg    1–2 Tabletten  8 Tabletten    </td>
      </tr>
      <tr>
        <td>und Jugend-           nach Bedarf    (= 4 g         </td>
      </tr>
      <tr>
        <td>liche ab                             Paracetamol)   </td>
      </tr>
      <tr>
        <td class="rowSepBelow">12 Jahren                                           </td>
      </tr>
      <tr>
      ...
    </table>
origin
parsed

Table detection was then updated to distinguish a line table from a pre-formatted paragraph.

updated
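
A minimal sketch of how such a detection could look. This is not the fiparsed code: the helper name preformatted_table? and the rule (every row has exactly one cell, and at least one cell aligns columns with runs of two or more spaces) are assumptions made for illustration.

```ruby
# Hypothetical helper, not part of fiparsed.
# rows: array of rows, each row an array of cell strings.
def preformatted_table?(rows)
  return false if rows.empty?
  # Assumption: an old-style table puts the whole line into one <td> ...
  return false unless rows.all? { |cells| cells.size == 1 }
  # ... and separates its columns with runs of two or more spaces.
  rows.any? { |cells| cells.first.strip.match?(/ {2,}/) }
end

old_style = [
  ["Alter       Körper-   Einzeldosis    Maximale       "],
  ["            gewicht                  Tagesdosis     "],
]
regular = [["Alter", "Einzeldosis"], ["Erwachsene", "1–2 Tabletten"]]

puts preformatted_table?(old_style)
puts preformatted_table?(regular)
```

Rows that pass this check would be rendered as a pre-formatted paragraph, everything else as a normal table.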

table: rowspan handling

Updated src/view/chapter.rb to apply row_span.

table
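                # Both branches pass row_span next to col_span.
                if cell.is_a? Text::MultiCell
                  # MultiCell: render its contents as paragraphs.
                  context.td('colspan' => cell.col_span, 'rowspan' => cell.row_span) {
                    paragraphs(context, cell.contents)
                  }
                else
                  # Any other cell: render it as formatted text.
                  context.td('colspan' => cell.col_span, 'rowspan' => cell.row_span) {
                    formats(context, cell)
                  }
                end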
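
For illustration only, a self-contained sketch of emitting a cell with both span attributes. The Cell struct and the td helper are hypothetical stand-ins, not the ODDB Text classes or the view's context object. Dropping spans of 1 or less (including an empty rowspan="") is an assumption of this sketch, not something the snippet above does.

```ruby
require 'cgi'

# Hypothetical stand-in for a parsed table cell.
Cell = Struct.new(:text, :col_span, :row_span)

# Builds a <td> string; spans that are nil, empty, or 1 are left out (assumption).
def td(cell)
  attrs = { 'colspan' => cell.col_span, 'rowspan' => cell.row_span }
  attr_str = attrs.reject { |_, v| v.to_i <= 1 }
                  .map { |k, v| %( #{k}="#{v.to_i}") }
                  .join
  "<td#{attr_str}>#{CGI.escapeHTML(cell.text)}</td>"
end

puts td(Cell.new("picture", 1, 3))
puts td(Cell.new("a & b", 2, ""))
```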

chapter: "Packungen" chapter

The Packungen chapter has a strange </img> closing tag inside a table cell.

       <td class="picture" rowspan="3">
          <img src="http://pictures.documed.ch/WV_GetPictures/560_PIF_M.jpg?px=144" style="width:72px; vertical-align:middle;" alt="ACETALGIN Tabl 500 mg">
          </img>
        </td>

The parser also reports strange whitespace-only text nodes inside the cell:

[6] 1.9.3-p194(#<ODDB::FiParse::FachinfoHpricot>)> child.parent
=> {elem
 <td class="picture" rowspan="3">
 "\r\n          "         # <= This
 {emptyelem
  <img
   src="http://pictures.documed.ch/WV_GetPictures/560_PIF_M.jpg?px=144"
   style="width:72px; vertical-align:middle;"
   alt="ACETALGIN Tabl 500 mg">}
 "\r\n          "
 {bogusetag </img>}       # <= Invalid Tag
 "\r\n        "
 </td>}

The same whitespace-only text nodes appear between tags:

=> {elem
 <table
  class="tblArticles"
  style="border-top: solid 0px white; border-bottom: solid 1pt #E5E7E8; margin-top:2px;">
 "\r\n      "                # <= This
 {elem
  <tr name="article">
  "\r\n        "             # <= This
  {elem
   <td
    class="product"
    style="     font-family: Arial, Helvetica, sans-serif;"
    rowspan="">
...

These invalid text nodes and the bogus end tag are now skipped, and rowspan, colspan, and the image are applied.
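
A rough sketch of the skipping step, assuming the children of a cell arrive as a flat list. TextNode, Element, BogusETag, and significant_children are hypothetical names that only mimic the node kinds shown in the dumps above. They are not the Hpricot classes and not the actual fiparsed code.

```ruby
# Hypothetical node types standing in for the parser's text, element,
# and bogus-end-tag nodes.
TextNode  = Struct.new(:content)
Element   = Struct.new(:name, :attributes)
BogusETag = Struct.new(:name)

# Returns the children without whitespace-only text and stray end tags.
def significant_children(children)
  children.reject do |child|
    case child
    when BogusETag then true                        # stray closing tag such as </img>
    when TextNode  then child.content.strip.empty?  # e.g. "\r\n          "
    else false
    end
  end
end

cell_children = [
  TextNode.new("\r\n          "),
  Element.new("img", { "alt" => "ACETALGIN Tabl 500 mg" }),
  TextNode.new("\r\n          "),
  BogusETag.new("img"),
  TextNode.new("\r\n        "),
]

kept = significant_children(cell_children)
puts kept.map(&:name).inspect
```

Text nodes that carry real content are kept, so only the layout whitespace and the invalid tag disappear.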

package
Page last modified on December 06, 2012, at 12:25 PM