<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:clearspace="http://www.jivesoftware.com/xmlns/jive/rss" version="2.0">
  <channel>
    <title>Adobe Community: Message List - PDF File with Run Length Encoding?</title>
    <link>https://forums.adobe.com/community/design_development/pdf_language_and_specifications?view=discussions</link>
    <description>Most recent forum messages</description>
    <language>en</language>
    <pubDate>Tue, 25 Mar 2014 14:53:10 GMT</pubDate>
    <generator>Jive Engage 7.0.0.1  (http://jivesoftware.com/products/)</generator>
    <dc:date>2014-03-25T14:53:10Z</dc:date>
    <dc:language>en</dc:language>
    <item>
      <title>Re: PDF File with Run Length Encoding?</title>
      <link>https://forums.adobe.com/message/6240557?tstart=0#6240557</link>
      <description>&lt;!-- [DocumentBodyStart:f3c6e02f-1c87-4ff6-b7b8-d566060d383c] --&gt;&lt;div class="jive-rendered-content"&gt;&lt;p&gt;I was talking to lrosenth.. &lt;/p&gt;&lt;p&gt;Your post was rather helpful.. &lt;/p&gt;&lt;/div&gt;&lt;!-- [DocumentBodyEnd:f3c6e02f-1c87-4ff6-b7b8-d566060d383c] --&gt;</description>
      <pubDate>Tue, 25 Mar 2014 14:53:10 GMT</pubDate>
      <author>forums_noreply@adobe.com</author>
      <guid>https://forums.adobe.com/message/6240557?tstart=0#6240557</guid>
      <dc:date>2014-03-25T14:53:10Z</dc:date>
      <clearspace:dateToText>7 months 3 weeks ago</clearspace:dateToText>
      <clearspace:objectType>0</clearspace:objectType>
    </item>
    <item>
      <title>Re: PDF File with Run Length Encoding?</title>
      <link>https://forums.adobe.com/message/6240306?tstart=0#6240306</link>
      <description>&lt;!-- [DocumentBodyStart:9e29b2f5-77c4-4532-af36-0ca9374c1704] --&gt;&lt;div class="jive-rendered-content"&gt;&lt;p&gt;I'm sorry you think that small deviations from the definition are unimportant. (And by the way I apologise for using "page stream" when I misremembered "Content stream"). You are asking people who have digested the 1000 page PDF specification (in some cases maybe even helped to edit it) about very specific and fine details from it. You must expect them to be very precise so that the answer can be accurate. The only way to deal with a specification like this is to be very pedantic indeed, so no apologies for that at all.&lt;/p&gt;&lt;/div&gt;&lt;!-- [DocumentBodyEnd:9e29b2f5-77c4-4532-af36-0ca9374c1704] --&gt;</description>
      <pubDate>Tue, 25 Mar 2014 13:45:00 GMT</pubDate>
      <author>forums_noreply@adobe.com</author>
      <guid>https://forums.adobe.com/message/6240306?tstart=0#6240306</guid>
      <dc:date>2014-03-25T13:45:00Z</dc:date>
      <clearspace:dateToText>7 months 3 weeks ago</clearspace:dateToText>
      <clearspace:replyCount>1</clearspace:replyCount>
      <clearspace:objectType>0</clearspace:objectType>
    </item>
    <item>
      <title>Re: PDF File with Run Length Encoding?</title>
      <link>https://forums.adobe.com/message/6239783?tstart=0#6239783</link>
      <description>&lt;!-- [DocumentBodyStart:dd240658-3501-4386-bf0b-7dd0c38a13cd] --&gt;&lt;div class="jive-rendered-content"&gt;&lt;p&gt;I think it's pretty clear what I meant: ISO 32000-1 (I've already mentioned it in a previous post), but thanks for picking up on that small, insignificant fact.&amp;nbsp; You don't have to give me the ins and outs of binary, glyphs and complicating factors which detract from the question.&lt;/p&gt;&lt;p style="min-height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;I've been parsing PDFs for years, but they have all been Flate or LZW encoded, some with ASCII filters. All I asked is to ascertain whether text would be encoded in any other way, as I'm building on an existing automated process I have in place.&lt;/p&gt;&lt;p style="min-height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;By assuming that I'm stupid and complicating things by highlighting small deviations in my chosen vocabulary, you are not seriously addressing the issue at hand.&lt;/p&gt;&lt;p style="min-height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;In a slang way I'm obviously referring to a stream in a text object. The Tj operator treats each element (yes, a binary value) as a character code.&lt;/p&gt;&lt;p style="min-height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;Two posts above answers my question anyway..&lt;/p&gt;&lt;/div&gt;&lt;!-- [DocumentBodyEnd:dd240658-3501-4386-bf0b-7dd0c38a13cd] --&gt;</description>
      <pubDate>Tue, 25 Mar 2014 09:42:35 GMT</pubDate>
      <author>forums_noreply@adobe.com</author>
      <guid>https://forums.adobe.com/message/6239783?tstart=0#6239783</guid>
      <dc:date>2014-03-25T09:42:35Z</dc:date>
      <clearspace:dateToText>7 months 3 weeks ago</clearspace:dateToText>
      <clearspace:replyCount>2</clearspace:replyCount>
      <clearspace:objectType>0</clearspace:objectType>
    </item>
    <item>
      <title>Re: PDF File with Run Length Encoding?</title>
      <link>https://forums.adobe.com/message/6239726?tstart=0#6239726</link>
      <description>&lt;!-- [DocumentBodyStart:1e6e4af3-4b10-4b7c-b717-f95b79c34aef] --&gt;&lt;div class="jive-rendered-content"&gt;&lt;p&gt;Since there is no such thing in ISO 32000, can you tell us what you mean by a "text stream"?&amp;nbsp; Do you mean a content stream, where the page content drawing instructions are?&amp;nbsp; Those can include arbitrary data (as mentioned by TSN) and not just drawing instructions.&amp;nbsp; Also, the instructions you find there aren't actual text - they are just references to glyphs in the font.&amp;nbsp; You would need to decode the font program streams as well if you wished to get actual text from a PDF.&lt;/p&gt;&lt;/div&gt;&lt;!-- [DocumentBodyEnd:1e6e4af3-4b10-4b7c-b717-f95b79c34aef] --&gt;</description>
      <pubDate>Tue, 25 Mar 2014 08:51:40 GMT</pubDate>
      <author>forums_noreply@adobe.com</author>
      <guid>https://forums.adobe.com/message/6239726?tstart=0#6239726</guid>
      <dc:date>2014-03-25T08:51:40Z</dc:date>
      <clearspace:dateToText>7 months 3 weeks ago</clearspace:dateToText>
      <clearspace:replyCount>3</clearspace:replyCount>
      <clearspace:objectType>0</clearspace:objectType>
    </item>
    <item>
      <title>Re: PDF File with Run Length Encoding?</title>
      <link>https://forums.adobe.com/message/6239746?tstart=0#6239746</link>
      <description>&lt;!-- [DocumentBodyStart:0557f697-7bb4-4f84-9c27-35f8af045c43] --&gt;&lt;div class="jive-rendered-content"&gt;&lt;p&gt;If you mean a page stream, the likely filters are LZW and Flate, optionally with ASCII85. It depends whether you want to support all theoretical PDFs or just those you're likely to find in the field. I guess you are saying that you only intend to decompress page streams (and presumably form XObjects), likely to extract text.&lt;/p&gt;&lt;p style="min-height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;I see no ambiguity in the statement. And I agree with lrosenth that talking about "only text" really does nothing to simplify your problem. All filters are simply implemented as binary octets in, binary octets out. If it is a page stream, it will usually contain only visible characters, but by no means always as the text strings passed to Tj (etc.) can contain arbitrary binary data, and there may be inline images.&lt;/p&gt;&lt;/div&gt;&lt;!-- [DocumentBodyEnd:0557f697-7bb4-4f84-9c27-35f8af045c43] --&gt;</description>
      <pubDate>Tue, 25 Mar 2014 08:33:48 GMT</pubDate>
      <author>forums_noreply@adobe.com</author>
      <guid>https://forums.adobe.com/message/6239746?tstart=0#6239746</guid>
      <dc:date>2014-03-25T08:33:48Z</dc:date>
      <clearspace:dateToText>7 months 3 weeks ago</clearspace:dateToText>
      <clearspace:replyCount>4</clearspace:replyCount>
      <clearspace:objectType>0</clearspace:objectType>
    </item>
    <item>
      <title>Re: PDF File with Run Length Encoding?</title>
      <link>https://forums.adobe.com/message/6239774?tstart=0#6239774</link>
      <description>&lt;!-- [DocumentBodyStart:93d5e67f-6ace-4cb9-80d4-2bbaa0f084c0] --&gt;&lt;div class="jive-rendered-content"&gt;&lt;p&gt;You are complicating a relatively simple question.&amp;nbsp; Please be reasonable.&lt;/p&gt;&lt;p style="min-height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;Yes, all files are ultimately represented in binary if you break them down.&lt;/p&gt;&lt;p style="min-height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;But streams (yes, ultimately binary) are configured to represent something: images, text, etc. I really don't understand why you made that point.&lt;/p&gt;&lt;p style="min-height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;A stream representing text is casually referred to as a text stream in this case, or more correctly a string stream.. or just a string.&amp;nbsp; Because we're talking about PDFs, yes, ultimately you can break this text stream down to 8-bit bytes.&lt;/p&gt;&lt;p style="min-height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;So, if the stream is meant to represent text as opposed to an image or something else, then what would be a realistic list of encoding types which could possibly be applied to it?&amp;nbsp; I even pasted the section from ISO 32000 above to show you why there may be some ambiguity.&lt;/p&gt;&lt;/div&gt;&lt;!-- [DocumentBodyEnd:93d5e67f-6ace-4cb9-80d4-2bbaa0f084c0] --&gt;</description>
      <pubDate>Tue, 25 Mar 2014 08:25:21 GMT</pubDate>
      <author>forums_noreply@adobe.com</author>
      <guid>https://forums.adobe.com/message/6239774?tstart=0#6239774</guid>
      <dc:date>2014-03-25T08:25:21Z</dc:date>
      <clearspace:dateToText>7 months 3 weeks ago</clearspace:dateToText>
      <clearspace:replyCount>5</clearspace:replyCount>
      <clearspace:objectType>0</clearspace:objectType>
    </item>
    <item>
      <title>Re: PDF File with Run Length Encoding?</title>
      <link>https://forums.adobe.com/message/6239720?tstart=0#6239720</link>
      <description>&lt;!-- [DocumentBodyStart:46bbf760-45b6-4180-8b88-d1fefaa376d8] --&gt;&lt;div class="jive-rendered-content"&gt;&lt;p&gt;I don't know what you mean by "only text though".&amp;nbsp; PDF is a binary format not a text format.&amp;nbsp; There are certainly places that you can have a text string or a stream of text-like data, but the entire format is binary.&lt;/p&gt;&lt;/div&gt;&lt;!-- [DocumentBodyEnd:46bbf760-45b6-4180-8b88-d1fefaa376d8] --&gt;</description>
      <pubDate>Tue, 25 Mar 2014 07:44:27 GMT</pubDate>
      <author>forums_noreply@adobe.com</author>
      <guid>https://forums.adobe.com/message/6239720?tstart=0#6239720</guid>
      <dc:date>2014-03-25T07:44:27Z</dc:date>
      <clearspace:dateToText>7 months 3 weeks ago</clearspace:dateToText>
      <clearspace:replyCount>6</clearspace:replyCount>
      <clearspace:objectType>0</clearspace:objectType>
    </item>
    <item>
      <title>Re: PDF File with Run Length Encoding?</title>
      <link>https://forums.adobe.com/message/6239574?tstart=0#6239574</link>
      <description>&lt;!-- [DocumentBodyStart:ea3c2978-8ea1-401b-9ed1-29104f19fc3d] --&gt;&lt;div class="jive-rendered-content"&gt;&lt;p&gt;No worries..&lt;/p&gt;&lt;p style="min-height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;I'm in the process of building a decoder to parse PDF files compressed with multiple different encoding types. Only text though..&lt;/p&gt;&lt;p style="min-height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;I've just been reading Table 6 of ISO 32000-1:2008 (pasted below) on RunLengthDecode.&amp;nbsp; Note the reference to text.&lt;/p&gt;&lt;p style="min-height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;This assumes that I can't reprint the PDF (using a PDF writer) with FlateDecode for text streams, which seems to be the standard now.&amp;nbsp; I've included LZW, FlateDecode, still deciding on RunLengthDecode, and of course the ASCII85 decode filter, which will be combined with all three if necessary, or at least Flate and LZW.&lt;/p&gt;&lt;p style="min-height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;Any other suggestions on that?&amp;nbsp; For decoding text streams mainly.&lt;/p&gt;&lt;p style="min-height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;RunLengthDecode&lt;/strong&gt; (parameters: no) &lt;em&gt;Decompresses data encoded using a byte-oriented run-length encoding algorithm, reproducing the original &lt;strong&gt;text&lt;/strong&gt; or binary data (typically monochrome image data, or any data that contains frequent long runs of a single byte value).&lt;/em&gt;&lt;/p&gt;&lt;/div&gt;&lt;!-- [DocumentBodyEnd:ea3c2978-8ea1-401b-9ed1-29104f19fc3d] --&gt;</description>
      <pubDate>Tue, 25 Mar 2014 07:01:51 GMT</pubDate>
      <author>forums_noreply@adobe.com</author>
      <guid>https://forums.adobe.com/message/6239574?tstart=0#6239574</guid>
      <dc:date>2014-03-25T07:01:51Z</dc:date>
      <clearspace:dateToText>7 months 3 weeks ago</clearspace:dateToText>
      <clearspace:replyCount>7</clearspace:replyCount>
      <clearspace:objectType>0</clearspace:objectType>
    </item>
    <item>
      <title>Re: PDF File with Run Length Encoding?</title>
      <link>https://forums.adobe.com/message/6213872?tstart=0#6213872</link>
      <description>&lt;!-- [DocumentBodyStart:31aab08a-131b-4914-a45a-ebf17cd44c4d] --&gt;&lt;div class="jive-rendered-content"&gt;&lt;p&gt;I think it's almost inconceivable anyone would have made one, except for a test, since run length applied to text streams would only ever make them bigger.&amp;nbsp; You might find it applied to images. Indeed Acrobat Distiller offers this as an option for compression of monochrome images. You could also write an Acrobat plug-in that compressed a text stream in this way. Both of course require purchase of Adobe software, but our hosts probably wouldn't see that as a bad thing.&lt;/p&gt;&lt;/div&gt;&lt;!-- [DocumentBodyEnd:31aab08a-131b-4914-a45a-ebf17cd44c4d] --&gt;</description>
      <pubDate>Sun, 16 Mar 2014 12:26:43 GMT</pubDate>
      <author>forums_noreply@adobe.com</author>
      <guid>https://forums.adobe.com/message/6213872?tstart=0#6213872</guid>
      <dc:date>2014-03-16T12:26:43Z</dc:date>
      <clearspace:dateToText>7 months 1 month ago</clearspace:dateToText>
      <clearspace:replyCount>8</clearspace:replyCount>
      <clearspace:objectType>0</clearspace:objectType>
    </item>
    <item>
      <title>Re: PDF File with Run Length Encoding?</title>
      <link>https://forums.adobe.com/message/6213795?tstart=0#6213795</link>
      <description>&lt;!-- [DocumentBodyStart:d23e5840-bca3-49a7-bc1b-e8f8f3ce1c99] --&gt;&lt;div class="jive-rendered-content"&gt;&lt;p&gt;Maybe someone can suggest how to find one of these?&amp;nbsp; Is there some way of creating one with software? &lt;/p&gt;&lt;p style="min-height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;Thanks&lt;/p&gt;&lt;/div&gt;&lt;!-- [DocumentBodyEnd:d23e5840-bca3-49a7-bc1b-e8f8f3ce1c99] --&gt;</description>
      <pubDate>Sun, 16 Mar 2014 10:55:36 GMT</pubDate>
      <author>forums_noreply@adobe.com</author>
      <guid>https://forums.adobe.com/message/6213795?tstart=0#6213795</guid>
      <dc:date>2014-03-16T10:55:36Z</dc:date>
      <clearspace:dateToText>7 months 1 month ago</clearspace:dateToText>
      <clearspace:replyCount>9</clearspace:replyCount>
      <clearspace:objectType>0</clearspace:objectType>
    </item>
    <item>
      <title>PDF File with Run Length Encoding?</title>
      <link>https://forums.adobe.com/message/6117577?tstart=0#6117577</link>
      <description>&lt;!-- [DocumentBodyStart:875191ef-9102-4349-a562-eafb4a58c134] --&gt;&lt;div class="jive-rendered-content"&gt;&lt;p&gt;Would anyone have a PDF file containing text streams compressed in Run-Length Encoding?&amp;nbsp; I need one to test a decompressor.&amp;nbsp; I know they are probably rare and not used anymore which is why one is so difficult to find. &lt;/p&gt;&lt;p style="min-height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;If you do have one could you attach it in a reply or refer me to it.&amp;nbsp; &lt;span aria-label="Happy" class="emoticon-inline emoticon_happy" style="height:16px;width:16px;"&gt;&lt;/span&gt;&lt;/p&gt;&lt;p style="min-height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;Thanks&lt;/p&gt;&lt;/div&gt;&lt;!-- [DocumentBodyEnd:875191ef-9102-4349-a562-eafb4a58c134] --&gt;</description>
      <pubDate>Thu, 13 Feb 2014 17:29:13 GMT</pubDate>
      <author>forums_noreply@adobe.com</author>
      <guid>https://forums.adobe.com/message/6117577?tstart=0#6117577</guid>
      <dc:date>2014-02-13T17:29:13Z</dc:date>
      <clearspace:dateToText>8 months 1 month ago</clearspace:dateToText>
      <clearspace:replyCount>10</clearspace:replyCount>
      <clearspace:objectType>0</clearspace:objectType>
    </item>
  </channel>
</rss>

