10 Replies Latest reply on Jan 23, 2011 3:16 PM by Tom Violence

    Writing my own dng reader

    Tom Violence

      I have begun writing my own DNG reader for a school project, and I have run into some problems. For testing I'm using a DNG file that was converted from an original NEF file with the Adobe DNG Converter, and I'm working from the DNG, TIFF 6.0 and TIFF/EP specifications. Page 18 of the DNG specification says that the raw data is compressed with lossless JPEG. My problem is that I don't know which predictor is used. Lossless JPEG defines eight predictor selection values (0 to 7), and I don't know which tag holds this information. TIFF 6.0 defines a tag named JPEGLosslessPredictors, but first, my reader doesn't find this tag in my file, and second, here it is written that this tag is no longer valid. So my question is: where is the information about the predictor? It must be somewhere, but I don't know where.


        • 1. Re: Writing my own dng reader
          sandy_mc Level 3

          Umm, good luck with that.   Not that many of us wouldn't very much like to see a fully free and open source DNG SDK.


          You don't need that tag, even if it existed (which it doesn't anymore). The DNG specification is explicit on variants of JPEG:


          If PhotometricInterpretation = 6 (YCbCr) and BitsPerSample = 8/8/8, or if PhotometricInterpretation = 1 (BlackIsZero) and BitsPerSample = 8, then the JPEG variant must be baseline DCT JPEG.
          Otherwise, the JPEG variant must be lossless Huffman JPEG.


          So you just take a look at the fields in question, and use the appropriate JPEG compression variant......
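A minimal sketch of the rule quoted above, in Python. The function name and the string return values are hypothetical; only the tag values 6 (YCbCr) and 1 (BlackIsZero) come from the quoted text.

```python
YCBCR = 6          # PhotometricInterpretation value named in the quoted rule
BLACK_IS_ZERO = 1  # PhotometricInterpretation value named in the quoted rule

def dng_jpeg_variant(photometric_interpretation, bits_per_sample):
    """Return the JPEG variant a reader should expect (hypothetical helper)."""
    bits = tuple(bits_per_sample)
    if photometric_interpretation == YCBCR and bits == (8, 8, 8):
        return "baseline DCT"
    if photometric_interpretation == BLACK_IS_ZERO and bits == (8,):
        return "baseline DCT"
    # Everything else, including CFA and LinearRaw data, is lossless Huffman.
    return "lossless Huffman"
```

For a 16-bit CFA image (PhotometricInterpretation 32803) this returns "lossless Huffman".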


          However, so far as I know, baseline DCT is actually never used, so in practice you're only ever dealing with Huffman (PhotometricInterpretation on the main image is either CFA or LinearRaw on every image I've ever seen).


          BTW, this kind of question is probably best posted on the DNG SDK list.



          • 2. Re: Writing my own dng reader
            Tom Violence Level 1

            No, I'm not writing a DNG SDK. I'm just a student and this is a small semester project. Actually, it is a plugin for our simple graphics-editing program.


            To go back to my question: what do you mean by "JPEG variant"? To be honest, I don't know how JPEG works. I just know that there is the (lossy) JPEG standard and the lossless JPEG standard. I googled lossless JPEG and found this: http://en.wikipedia.org/wiki/Lossless_JPEG

            The same thing is written in TIFF 6.0. So if I understand correctly, I first calculate the prediction error and then compress it with Huffman coding (is that what you meant when you said that I'm only dealing with Huffman?). So I still don't know which predictor is used.


            I'm sorry if there are grammatical errors. English is only my third language (but the most important).

            • 3. Re: Writing my own dng reader
              sandy_mc Level 3

              I'm a bit confused by your question as regards predictors. So far as I know, the Huffman compression used in DNG is just the "regular" compression as described in the main JPEG specification, specifically in section H.1.2 (predictors and stuff) and section K.2. So the predictors you would use are as shown on the Wikipedia page in the "Selection-value Prediction" table, and the actual predictor selection is encoded in the scan header in the "Ss" parameter (section B.2.3 of the JPEG standard). No TIFF tags required.
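As an illustration, the selection-value table can be written as one small Python function. This is a sketch, not code from any SDK. The function and argument names are made up here: ra is the sample to the left, rb the sample above, rc the sample above-left. The divisions by two are done as arithmetic right shifts.

```python
def predict(selection, ra, rb, rc):
    """Prediction for the current sample from its left (ra), upper (rb)
    and upper-left (rc) neighbours, for selection values 1 to 7."""
    if selection == 1:
        return ra
    if selection == 2:
        return rb
    if selection == 3:
        return rc
    if selection == 4:
        return ra + rb - rc
    if selection == 5:
        return ra + ((rb - rc) >> 1)
    if selection == 6:
        return rb + ((ra - rc) >> 1)
    if selection == 7:
        return (ra + rb) >> 1
    raise ValueError("selection value must be in 1..7 for lossless coding")
```

The selection value passed in is whatever the scan header's Ss field contains.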


              But a more important question: if all you want is a plug-in, why aren't you just using the DNG SDK? 



              • 4. Re: Writing my own dng reader
                Tom Violence Level 1

                No wonder you're confused, it's my fault. I thought that lossless JPEG was a format of its own, not a mode of JPEG. I did not even know that the raw data in the DNG file includes metadata. However, now it's easy to find the selected predictor (for my test picture the value is 1, i.e. only the left pixel). But I have two other questions about this, if you don't mind.

                First, in the frame header, the parameter P (for my test picture) is set to zero, but it should be 16 (as far as I know the raw picture has two bytes per sample); according to table B.2 the value zero is even invalid. The other parameters are OK.

                Second,  I'm a bit confused about this:


                         The difference between the prediction  value  and the input is calculated modulo 2^16. In the decoder the difference is
                         decoded and added, modulo 2^16, to the prediction.


                Am I doing this:


                         compressed sample =  difference between the prediction  value  and the input (mod)  65536? But this is always zero (for a 16 bit pixel or less).


                To answer your question: I'm not using the DNG SDK because the point of the semester project is to understand how digital images are represented and to learn how to work with documentation. (When I began, I thought that all I would need was the DNG documentation.)

                • 5. Re: Writing my own dng reader
                  sandy_mc Level 3

                  I can't think of any reason for P to be set to zero. Does the file you're dealing with pass Adobe's dng_validator without any errors? If it does, I'd suspect that somehow the JPEG decoder wasn't working correctly.


                  As regards the modulo, that means remainder:


                  compressedSample = (pixelValue - predictor) & 0xffff;
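A small Python sketch of the same idea in both directions may help. The function names are hypothetical; the arithmetic is just the modulo 2^16 rule quoted earlier in the thread.

```python
MASK = 0xFFFF  # keeping the low 16 bits is the same as reducing modulo 2**16

def encode_difference(pixel_value, prediction):
    """Encoder side: difference between input and prediction, modulo 2**16."""
    return (pixel_value - prediction) & MASK

def decode_sample(prediction, difference):
    """Decoder side: add the decoded difference to the prediction, modulo 2**16."""
    return (prediction + difference) & MASK
```

For example, a pixel value of 768 with a prediction of 32768 gives a difference of 33536 (that is, -32000 modulo 65536), and decode_sample(32768, 33536) returns 768 again. The result is not always zero; the modulo only wraps values that fall outside the range 0..65535.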




                  • 6. Re: Writing my own dng reader
                    Tom Violence Level 1

                    The parameter P, that was my fault, now it's ok.

                    I hope I'm not boring with my questions.

                    When reading the frame and scan headers I get two image components and two Huffman tables. I just don't know what they represent. My image is a CFA color image, so I would expect three components (R, G and B). The JPEG specification is a bit confusing. Is this OK:


                    Assume that this is the beginning of the compressed image data: 1111 1110 00000001 1111 1110

                    The first eight bits are 1111 1110, which is one value from the HUFFCODE array, and the corresponding value in the HUFFVAL array is 15. Now I read the next 15 bits, and these bits are the decoded error value (predictor - sample). Then I use the HUFFCODE and HUFFVAL arrays from the second component. Is this reasoning OK?


                    Btw, thanks for reading and answering my questions.

                    • 7. Re: Writing my own dng reader
                      sandy_mc Level 3

                      Umm, well, it's a long time since I was that deep into JPEG coding.


                      But I'd think it depends on what you mean by "compressed image data" - what you should be seeing is a DHT marker, then the Huffman length, then a stream of N table entries.....



                      • 8. Re: Writing my own dng reader
                        Tom Violence Level 1

                        Actually, after reading the documentation again, my reasoning about the decoding process was OK. But the main question (which I can't answer from the JPEG documentation) is: what are my two image components? The photometric interpretation is CFA, so why are there two components and not three? Calculating the prediction error without considering three different components doesn't make sense. When I calculate the prediction error for pixel A (assume a green color filter), its left neighbor would be under a blue or red color filter, and their difference can be very large (which doesn't make sense for predictive coding). I am very likely wrong, but there must be a catch.

                        • 9. Re: Writing my own dng reader
                          sandy_mc Level 3

                          Sorry, you lost me - don't know what "two image components" you're talking about here. Compression is done on a component by component basis, so far as I know.



                          • 10. Re: Writing my own dng reader
                            Tom Violence Level 1

                            So, I'm back.



                            I thought that the plugin was almost finished, but I was wrong. I just don't know where the problem is, so I'm asking for help.

                            I think it's best if I describe what I did so that someone can find the fault. So here goes.



                            First I read the frame and scan headers. In the frame header the number of lines (the tile length) is 256, the number of samples per line is 128, and the sample precision is 16. I guess that means 128 samples per component per line, because the tile width in the TIFF tag is 256 and I have two components. The horizontal and vertical sampling factors are 1. In the scan header the selected predictor is 1 (left pixel) and the point transform is 0.
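A hedged sketch of how these two headers can be parsed in Python, following the field layout of the JPEG frame header (section B.2.2) and scan header (section B.2.3). The function names, the dictionary keys and the component identifiers 0 and 1 in the hand-built test bytes are assumptions; the numeric values are the ones reported in this post. Each function expects the bytes that follow the marker, starting at the two-byte length field.

```python
import struct

def parse_frame_header(segment):
    """Parse a frame header body: Lf, P, Y, X, Nf, then (Ci, Hi/Vi, Tqi) per component."""
    lf, precision, lines, samples, nf = struct.unpack(">HBHHB", segment[:8])
    components = []
    for i in range(nf):
        cid, hv, tq = struct.unpack(">BBB", segment[8 + 3 * i:11 + 3 * i])
        components.append({"id": cid, "h": hv >> 4, "v": hv & 0x0F, "tq": tq})
    return {"precision": precision, "lines": lines,
            "samples_per_line": samples, "components": components}

def parse_scan_header(segment):
    """Parse a scan header body: Ls, Ns, (Csj, Tdj/Taj) per component, Ss, Se, Ah/Al."""
    ls, ns = struct.unpack(">HB", segment[:3])
    selectors = []
    for i in range(ns):
        cs, tables = struct.unpack(">BB", segment[3 + 2 * i:5 + 2 * i])
        selectors.append({"id": cs, "huffman_table": tables >> 4})
    ss, se, ah_al = struct.unpack(">BBB", segment[3 + 2 * ns:6 + 2 * ns])
    # In lossless mode Ss is the predictor selection and Al is the point transform.
    return {"predictor": ss, "point_transform": ah_al & 0x0F, "components": selectors}

# Hand-built headers using the values from the post (component ids are made up).
frame_bytes = struct.pack(">HBHHB", 14, 16, 256, 128, 2) + bytes([0, 0x11, 0, 1, 0x11, 0])
scan_bytes = struct.pack(">HB", 10, 2) + bytes([0, 0x00, 1, 0x10]) + bytes([1, 0, 0])
frame = parse_frame_header(frame_bytes)
scan = parse_scan_header(scan_bytes)
```

With two components of 128 samples each per line, one line holds 2 * 128 = 256 samples, which matches the tile width of 256 from the TIFF tag.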



                            After that I read the Huffman tables BITS and HUFFVAL, and then generate the HUFFSIZE table as described in section C.2 of the JPEG specification.
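A compact Python sketch of that table generation, for comparison. The function name and the dictionary layout (keys are (code length, code) pairs) are choices made for this example, not something the specification prescribes. The sample BITS list is the well-known default luminance DC table, used here only as test data.

```python
def build_huffman_table(bits, huffval):
    """bits[i] is the number of codes of length i + 1 (16 entries);
    huffval lists the symbol values in order of increasing code length.
    Returns a dict mapping (length, code) to the symbol value."""
    huffsize = []
    for length, count in enumerate(bits, start=1):
        huffsize.extend([length] * count)
    huffcode = []
    code = 0
    previous = huffsize[0] if huffsize else 0
    for size in huffsize:
        code <<= size - previous  # append zeros when the code length grows
        huffcode.append(code)
        code += 1
        previous = size
    return {(s, c): v for s, c, v in zip(huffsize, huffcode, huffval)}

example_bits = [0, 1, 5, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
example_table = build_huffman_table(example_bits, list(range(12)))
```

In this example the single 2-bit code is 00, the five 3-bit codes run from 010 to 110, and the 4-bit code is 1110.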



                            Then I decompressed the Huffman-coded error data. I generated a Huffman tree and read the data bit by bit. When I reached a leaf of the tree I read the HUFFVAL value for that Huffman code, and if it was, for example, 10, then I read the next ten bits. These 10 bits are integers in a complement code and are the error samples: if the first bit is zero the value is positive, otherwise it's negative. I'm pretty sure that this part is OK, because when decompressing the coded data I get exactly 256*256 values.
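For comparison, here is a sketch in Python of how one difference is decoded under the JPEG specification's EXTEND procedure. Note that its sign rule is the opposite of the one described above: additional bits that start with 1 are a positive value, and bits that start with 0 are a negative value. A category of 16 has no additional bits and stands for 32768. The class and function names, and the tiny hand-made table, are made up for this example; the table maps (code length, code) to the category.

```python
class BitReader:
    """Reads bits MSB first and skips the zero byte stuffed after 0xFF."""
    def __init__(self, data):
        self.data, self.pos, self.buf, self.count = data, 0, 0, 0

    def read_bit(self):
        if self.count == 0:
            self.buf = self.data[self.pos]
            self.pos += 1
            if (self.buf == 0xFF and self.pos < len(self.data)
                    and self.data[self.pos] == 0x00):
                self.pos += 1  # stuffed byte, not image data
            self.count = 8
        self.count -= 1
        return (self.buf >> self.count) & 1

    def read_bits(self, n):
        value = 0
        for _ in range(n):
            value = (value << 1) | self.read_bit()
        return value

def extend(value, ssss):
    """Map ssss additional bits to a signed difference."""
    if value < (1 << (ssss - 1)):      # leading bit is 0: negative value
        return value - (1 << ssss) + 1
    return value                        # leading bit is 1: positive value

def decode_difference(reader, table):
    code, length = 0, 0
    while length < 16:
        code = (code << 1) | reader.read_bit()
        length += 1
        if (length, code) in table:
            ssss = table[(length, code)]
            if ssss == 0:
                return 0
            if ssss == 16:
                return 32768            # no additional bits for this category
            return extend(reader.read_bits(ssss), ssss)
    raise ValueError("no Huffman code matched")

# Hypothetical table: code 0 -> category 3, 10 -> category 0, 110 -> category 16.
demo_table = {(1, 0b0): 3, (2, 0b10): 0, (3, 0b110): 16}
demo_reader = BitReader(bytes([0x52, 0xB7]))
demo_values = [decode_difference(demo_reader, demo_table) for _ in range(4)]
```

The two test bytes hold the bit groups 0 101, 0 010, 10 and 110, which decode to 5, -5, 0 and 32768.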



                            So these decoded values are not pixel values; they are the error data. When I want to get the first pixel in the tile (upper left corner) I compute it like this:



                                                         pixel = predictor - error



                            The predictor value is:



                                                         predictor = 2^(sampleprecision - point transform - 1) = 2^(16-0-1) = 2^15 = 32768



                            The second pixel (coordinate (0,1)) is also computed with the same predictor, because it's also the first pixel, but the first pixel for the second component.

                            Other pixels in the first line are computed with the "left pixel" predictor. The first pixel of each line uses the pixel from the line above as the predictor. The selected predictor is used for all other lines. So when I want to compute the pixel on coordinate (230, 150) I would do this:



                                                        pixel(230, 150) = pixel(230, 148) - error(230*256 + 148)



                            What I first noticed was that the error values for the first two pixels were about -32000, while all other values were between -50 and 50. That is the case for every single tile, but to me it didn't make sense.
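A sketch of the reconstruction step in Python, under the assumptions stated in this post: predictor 1, point transform 0, 16-bit precision, and two interleaved components. It differs from the formula above in one respect. The passage from the JPEG specification quoted earlier in the thread says the difference is added to the prediction, modulo 2^16, so the sketch adds where the post subtracts. The function name and the nested-list layout are choices made for this example.

```python
def reconstruct(diffs, lines, samples_per_line, n_components,
                precision=16, point_transform=0):
    """Rebuild samples from decoded differences with predictor 1 (left neighbour).
    Components are interleaved sample by sample within each line."""
    row_len = samples_per_line * n_components
    out = [[0] * row_len for _ in range(lines)]
    it = iter(diffs)
    for y in range(lines):
        for x in range(samples_per_line):
            for c in range(n_components):
                col = x * n_components + c
                if x == 0 and y == 0:
                    # First sample of each component: fixed initial prediction.
                    prediction = 1 << (precision - point_transform - 1)
                elif x == 0:
                    # First sample of a later line: the sample above.
                    prediction = out[y - 1][col]
                else:
                    # Left neighbour in the same component.
                    prediction = out[y][col - n_components]
                out[y][col] = (prediction + next(it)) & 0xFFFF
    return out

demo = reconstruct([-32000, -31968, 10, -5, 1, 2, 3, 4],
                   lines=2, samples_per_line=2, n_components=2)
```

With addition, a first difference of about -32000 gives 32768 - 32000 = 768, a plausible raw value, and the small later differences then describe neighbouring samples of the same component. With subtraction the first sample would come out near 64768, so the sign may be worth checking first.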



                            After that was done I assumed that these pixels are CFA pixels, because the PhotometricInterpretation tag was 32803. Reading the CFARepeatPatternDim and CFAPattern tags, I get that the first line is a red-green line, the second line is a green-blue line, and so on. Then I demosaiced every tile by bilinear interpolation.
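A minimal Python sketch of how the colour at a given position follows from those two tags, assuming the usual CFAPattern codes 0 = red, 1 = green, 2 = blue and the 2 x 2 red-green / green-blue layout described above. The function name and the default arguments are hypothetical.

```python
def cfa_color(row, col, repeat_dim=(2, 2), pattern=(0, 1, 1, 2)):
    """Return the colour code (0 = R, 1 = G, 2 = B) of the filter at (row, col).
    repeat_dim is (rows, cols) from CFARepeatPatternDim; pattern is CFAPattern
    read row by row."""
    rows, cols = repeat_dim
    return pattern[(row % rows) * cols + (col % cols)]
```

Here row and col are meant as coordinates in the whole image. When demosaicing tile by tile, the tile's offset has to be added first, otherwise the pattern phase is only right for tiles that start on an even row and column.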