Umm, good luck with that. Not that many of us wouldn't very much like to see a fully free and open source DNG SDK.
You don't need that tag, even if it existed (which it doesn't anymore). The DNG specification is explicit on variants of JPEG:
If PhotometricInterpretation = 6 (YCbCr) and BitsPerSample = 8/8/8, or if PhotometricInterpretation = 1 (BlackIsZero) and BitsPerSample = 8, then the JPEG variant must be baseline DCT JPEG.
Otherwise, the JPEG variant must be lossless Huffman JPEG.
So you just take a look at the fields in question and use the appropriate JPEG compression variant.
However, so far as I know, baseline DCT is actually never used, so in practice you're only ever dealing with Huffman (PhotometricInterpretation on the main image is either CFA or LinearRaw on every image I've ever seen).
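If it helps, that rule boils down to a couple of field checks. A rough sketch (the enum and function name are mine, not from the spec):

```c
/* Sketch of the DNG rule quoted above: pick the JPEG variant from
   PhotometricInterpretation and BitsPerSample. Tag values are the
   standard TIFF ones; the names here are my own invention. */
enum jpeg_variant { BASELINE_DCT, LOSSLESS_HUFFMAN };

enum jpeg_variant dng_jpeg_variant(int photometric, const int *bits, int count)
{
    /* PhotometricInterpretation = 6 (YCbCr), BitsPerSample = 8/8/8 */
    if (photometric == 6 && count == 3 &&
        bits[0] == 8 && bits[1] == 8 && bits[2] == 8)
        return BASELINE_DCT;
    /* PhotometricInterpretation = 1 (BlackIsZero), BitsPerSample = 8 */
    if (photometric == 1 && count == 1 && bits[0] == 8)
        return BASELINE_DCT;
    /* Everything else (CFA = 32803, LinearRaw = 34892, ...) */
    return LOSSLESS_HUFFMAN;
}
```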
BTW, this kind of question is probably best posted on the DNG SDK list.
No, I'm not writing a DNG SDK; I'm just a student, and this is a small semester project. Actually, it's a plugin for our simple graphics-editing program.
To go back to my question: what do you mean when you say JPEG variant? I'll be honest: I don't know how JPEG works. I just know that there is the JPEG standard (the lossy one) and the lossless JPEG standard. I googled for lossless JPEG and what I found is this: http://en.wikipedia.org/wiki/Lossless_JPEG
The same thing is written in the TIFF 6.0 specification. So if I understand correctly, first I calculate the prediction error and then I just compress it with Huffman coding (is that what you meant when you said I'm only dealing with Huffman?). But I still don't know which predictor is used.
I'm sorry if there are grammatical errors. English is only my third language (but the most important).
I'm a bit confused by your question as regards predictors. So far as I know, the Huffman compression used in DNG is just the "regular" compression as described in the main JPEG specification, specifically in section H.1.2 (predictors and such) and section K.2. So the predictors you would use are as shown on the Wikipedia page in the "Selection-value Prediction" table, and the actual predictor selection is encoded into the scan header in the "Ss" parameter (section B.2.3 of the JPEG standard). No TIFF tags required.
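For reference, the seven predictors from that table look like this (my own sketch; Ra, Rb, and Rc are the spec's names for the left, above, and above-left neighbours):

```c
/* The lossless-JPEG predictors from section H.1.2 of the JPEG spec,
   as also shown in Wikipedia's "Selection-value Prediction" table.
   ss is the selection value carried in the scan header's Ss field. */
int predict(int ss, int ra, int rb, int rc)
{
    switch (ss) {
    case 1: return ra;                   /* left */
    case 2: return rb;                   /* above */
    case 3: return rc;                   /* above-left */
    case 4: return ra + rb - rc;
    case 5: return ra + ((rb - rc) >> 1);
    case 6: return rb + ((ra - rc) >> 1);
    case 7: return (ra + rb) >> 1;
    default: return 0;                   /* Ss = 0: no prediction */
    }
}
```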
But a more important question: if all you want is a plug-in, why aren't you just using the DNG SDK?
No wonder you're confused; it's my fault. I thought lossless JPEG was its own format, not a variant of JPEG. I didn't even know that the raw data in the DNG file includes metadata. Anyway, now it's easy to find the selected predictor (for my test picture it's value 1, the left pixel only). But I have two other questions about this, if you don't mind.
First, in the frame header the parameter P (for my test picture) is set to zero, but it should be 16 (as far as I know, the raw picture has two bytes per sample); even in table B.2 the value zero is invalid. The other parameters are OK.
Second, I'm a bit confused about this:
The difference between the prediction value and the input is calculated modulo 2^16. In the decoder the difference is
decoded and added, modulo 2^16, to the prediction.
Am I doing this:
compressed sample = (prediction value - input) mod 65536? But this is always zero (for a 16-bit pixel or less).
To answer your question: I'm not using the DNG SDK because the point of the semester project is to understand how digital images are represented and to learn how to use documentation. (When I started, I thought all I needed was the DNG documentation.)
I can't think of any reason for P to be set to zero. Does the file you're dealing with pass Adobe's dng_validate without any errors? If it does, I'd suspect that somehow the JPEG decoder isn't working correctly.
As regards the modulo, that means remainder:
compressedSample = (pixelValue - predictor) & 0xffff;
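A tiny made-up round trip showing the wraparound (the pixel and predictor values here are arbitrary):

```c
/* The difference wraps into 16 bits on the encoder side, and the
   decoder adds it back, again modulo 2^16, recovering the pixel. */
unsigned encode_diff(unsigned pixel, unsigned predictor)
{
    return (pixel - predictor) & 0xffffu;  /* difference mod 2^16 */
}

unsigned decode_pixel(unsigned predictor, unsigned diff)
{
    return (predictor + diff) & 0xffffu;   /* add it back mod 2^16 */
}
```

Even when the raw difference is a large negative number, adding the wrapped value back modulo 2^16 restores the original pixel.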
The parameter P: that was my fault; it's OK now.
I hope I'm not boring you with my questions.
When reading the frame and scan headers I get two image components and two Huffman tables. I just don't know what they represent. My image is a CFA color image, so I would assume 3 components (R, G and B). The JPEG specification is a bit confusing. Is this OK:
Assume that this is the beginning of the compressed image data: 1111 1110 00000001 1111 1110
The first eight bits are 1111 1110, and that's one code from the HUFFCODE array; the corresponding value in the HUFFVAL array is 15. Now I read the next 15 bits, and these bits give the decoded error value (predictor - sample). Then I use the HUFFCODE and HUFFVAL arrays from the second component. Is this reasoning OK?
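If I understand the spec's EXTEND procedure (section F.2.2.1), turning those extra bits back into a signed difference would look roughly like this (my own sketch):

```c
/* Sketch of the EXTEND procedure from JPEG section F.2.2.1:
   t is the magnitude category (the HUFFVAL, e.g. 15), and v is the
   value of the t bits that follow the Huffman code in the stream. */
int extend(int v, int t)
{
    if (t == 0)
        return 0;
    if (v < (1 << (t - 1)))      /* leading bit 0: negative value */
        return v - (1 << t) + 1;
    return v;                    /* leading bit 1: positive value */
}
```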
Btw, thanks for reading and answering my questions.
Umm, well, it's a long time since I was that deep into JPEG coding.
But I'd think it depends on what you mean by "compressed image data" - what you should be seeing is a DHT marker, then the Huffman lengths, then a stream of N table entries.
Actually, reading the documentation again, my reasoning about the decoding process was OK. But the main question (and one I can't get from the JPEG documentation) is: what are my two image components? The photometric interpretation is CFA, but why then two and not three components? Calculating the prediction error without considering three different components doesn't make sense. So when I calculate the prediction error for pixel A (assume a green color filter), its left neighbour would be a blue or red color filter, and their difference can be huge (which doesn't make sense for predictive coding). I am very likely wrong, but there must be a catch.
Sorry, you lost me - don't know what "two image components" you're talking about here. Compression is done on a component by component basis, so far as I know.
So, I'm back.
I was thinking that the plugin was almost finished, but I was wrong. Where the problem is, I just don't know, so I'm asking for help.
I think it's best if I describe what I did and someone can find the fault. So I'll begin.
First I read the frame and the scan header. In the frame header the number of lines (or tile length) was 256, and the number of samples per line was 128; the sample precision is 16. I guess it's 128 samples per component per line, because the tile width in the TIFF tag was 256 and I have two components. Horizontal and vertical sampling factors are 1. In the scan header the selected predictor is 1 (left pixel). The point transform is 0.
After that I read the Huffman tables BITS and HUFFVAL, and then generate the HUFFSIZE table as described in Annex C of the JPEG specification.
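In code, my reading of the Annex C table-generation procedures looks roughly like this (the array sizes are my guesses):

```c
/* Sketch of the Annex C procedures: BITS[i] holds the number of codes
   of length i (i = 1..16). Builds the HUFFSIZE array and the canonical
   HUFFCODE array from it. */
void build_tables(const int bits[17], int huffsize[257], int huffcode[257],
                  int *num_codes)
{
    int k = 0;
    for (int i = 1; i <= 16; i++)      /* Generate_size_table (C.1) */
        for (int j = 0; j < bits[i]; j++)
            huffsize[k++] = i;
    huffsize[k] = 0;
    *num_codes = k;

    int code = 0, si = huffsize[0];    /* Generate_code_table (C.2) */
    for (k = 0; huffsize[k] != 0; ) {
        while (huffsize[k] == si)      /* assign codes of current length */
            huffcode[k++] = code++;
        code <<= 1;                    /* move to the next code length */
        si++;
    }
}
```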
Then I decompress the Huffman-coded error data. I generated a Huffman tree and read the data bit by bit. When I reach a leaf of the tree I read the HUFFVAL value for that Huffman code, and if it is, for example, 10, then I read the next ten bits. These 10 bits are the error sample, an integer in a complement code: if the first bit is zero the value is positive, otherwise it's negative. I'm pretty sure this part is OK, because when decompressing the coded data I get exactly 256*256 values.
So these decoded values are not pixel values; they are the error data. So when I want to get the first pixel in the tile (upper-left corner) I compute it like this:
pixel = predictor - error
The predictor value is:
predictor = 2^(sample precision - point transform - 1) = 2^(16-0-1) = 2^15 = 32768
The second pixel (coordinate (0,1)) is also computed with the same predictor, because it's also a first pixel - the first pixel of the second component.
The other pixels in the first line are computed with the "left pixel" predictor. The first pixel of each line uses the pixel from the line above as the predictor. The selected predictor is used for all other lines. So when I want to compute the pixel at coordinate (230, 150) I do this:
pixel(230, 150) = pixel(230, 148) - error(230*256 + 148)
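Writing the whole pass out as code (per the modulo rule quoted earlier, the decoder adds the decoded difference to the prediction), one component would look roughly like this (names and layout are my own):

```c
#include <stdint.h>

/* Sketch of the reconstruction pass for one component, assuming the
   scan header selected predictor 1 (left neighbour). diff[] holds the
   already-decoded differences in row-major order. Per the spec the
   decoder ADDS the difference to the prediction, modulo 2^16. */
void reconstruct(uint16_t *out, const int *diff, int width, int height,
                 int precision, int point_transform)
{
    int initial = 1 << (precision - point_transform - 1);  /* e.g. 32768 */
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            int pred;
            if (y == 0 && x == 0)
                pred = initial;                /* very first sample */
            else if (x == 0)
                pred = out[(y - 1) * width];   /* first in line: above */
            else
                pred = out[y * width + x - 1]; /* predictor 1: left */
            out[y * width + x] =
                (uint16_t)((pred + diff[y * width + x]) & 0xffff);
        }
    }
}
```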
What I first noticed was that the error values for the first two pixels were around -32000, and all the other values between -50 and 50. And that holds for every single tile, which didn't make sense to me.
Once that was done, I assumed these pixels are CFA pixels, because the photometric interpretation tag was 32803. Reading the CFARepeatPatternDim and CFAPattern tags I get that the first line is a red-green line, the second line a green-blue line, and so on. Then I demosaiced every tile by bilinear interpolation.