www.xbdev.net xbdev - software development
Friday December 14, 2018


“JPEG”

The mysterious world of the jpeg file format explained

By Benjamin Kenwright

 

Those of you who are about to set out on the journey of discovering the jpeg file format had better be prepared: I believe the principles behind it are easier to grasp than the actual coding of it.

The jpeg file format is all around you… it's probably one of the most widely used image formats… it's platform independent… and one of its biggest uses is on the internet.

 

Why use the jpeg file format?  Its biggest selling point is the fact that you can take a 3 MB bitmap file (coolimage.bmp) and convert it to a jpg of only a few hundred kilobytes instead (e.g. coolimage.jpg).

Of course it does have its downside: it's a lossy image format, so the image loses some of its original detail… but usually the losses are not noticeable to the human eye.

 

 

 

Give me a jpeg, and let's rip it to pieces!

 

How it's laid out, and how the information is stored, isn't too hard to grasp…. Simply put, the jpeg file format is arranged as a series of sections. Each section starts with a 2 byte marker whose value we can look up (0xffda, for example), and then a further 2 bytes follow that marker which tell us how many bytes make up that section.

 

One exception!!  There are a few uni-markers, as they say!  A uni-marker is just a 2 byte id which has some meaning on its own… SO A UNI-MARKER DOES NOT HAVE 2 BYTES FOLLOWING IT THAT GIVE ITS SIZE.
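
As a rough sketch of that rule, a helper can decide whether a marker is followed by a size. The function name marker_has_length is made up for illustration, and the list of uni-markers (SOI, EOI, RST0 to RST7, TEM) is taken from the JPEG standard rather than from this article.

```c
#include <stdint.h>

/* Returns 1 if the marker is followed by a 2 byte size,
   0 if it is a standalone ("uni") marker.
   marker is the full 16 bit value, e.g. 0xffd8.
   Hypothetical helper, not part of the tutorial's program. */
int marker_has_length(uint16_t marker)
{
    if (marker == 0xffd8) return 0;                      /* SOI  */
    if (marker == 0xffd9) return 0;                      /* EOI  */
    if (marker >= 0xffd0 && marker <= 0xffd7) return 0;  /* RST0..RST7 */
    if (marker == 0xff01) return 0;                      /* TEM  */
    return 1;                                            /* everything else carries a size */
}
```

So marker_has_length(0xffd8) gives 0 for SOI, while marker_has_length(0xffe0) gives 1 for APP0.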

 

 

 

 

Hmmm…markers…hmm….yup it all seems simple…and if you take it one step at a time it will eventually make sense!  The secret to understanding the jpeg file format is to break it up into pieces and understand fully how each section of the marker works, and its use in the overall picture.

 

 

 

Now, to start with… all jpegs should start with an SOI marker, which is a uni-marker and has the value 0xffd8.  The jpeg file format stores its 2 byte values in big endian format… What does that mean?  Well, the bytes are arranged so that the most significant byte comes first … so if we read in the first two bytes one at a time we would get “0xff” first followed by “0xd8”…. It only matters if we read more than one byte at a time.  On a little endian machine such as a PC, reading both bytes straight into a 16 bit WORD gives us 0xd8ff, so if we read more than one byte at a time we’ll have to shift it around.
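
One way to sidestep the byte order question is to always read single bytes and build the 16 bit value by hand. A minimal sketch, assuming the two bytes are already in memory in file order (the name be16 is just for illustration):

```c
#include <stdint.h>

/* Combine two bytes, in the order they appear in the file,
   into one 16 bit value. The first byte is the high byte. */
uint16_t be16(const unsigned char *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}
```

Feeding it the bytes 0xff, 0xd8 gives 0xffd8, and the bytes 0x00, 0x10 give 16, whatever the byte order of the machine running the code.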

 

One thing I found when writing a simple jpeg decoder/reader was that you have to be prepared to look at your data right down to the binary level!  It’s not enough to look at your data as bytes….you’ll have to actually decode 1’s and 0’s and compare them etc… and shift them … to get your image back.
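
To give a feel for what looking at the binary level means, here is a small sketch of a bit reader over a buffer in memory. Everything in it (BitReader, bits_init, bits_next, bits_read) is a made-up helper for illustration. It hands out bits starting from the top bit of each byte, which is the order JPEG compressed data uses, and it leaves out the 0xff handling a real decoder needs.

```c
#include <stddef.h>

/* A tiny bit reader over a memory buffer (hypothetical helper). */
typedef struct {
    const unsigned char *data;  /* bytes to read from */
    size_t size;                /* number of bytes in data */
    size_t pos;                 /* index of the current byte */
    int bit;                    /* next bit in that byte: 7 = top, 0 = bottom */
} BitReader;

void bits_init(BitReader *br, const unsigned char *data, size_t size)
{
    br->data = data;
    br->size = size;
    br->pos = 0;
    br->bit = 7;
}

/* Returns the next bit (0 or 1), or -1 when the buffer is used up. */
int bits_next(BitReader *br)
{
    int b;
    if (br->pos >= br->size)
        return -1;
    b = (br->data[br->pos] >> br->bit) & 1;
    if (--br->bit < 0) {        /* finished this byte, move to the next */
        br->bit = 7;
        br->pos++;
    }
    return b;
}

/* Reads count bits (count <= 16), first bit read becomes the top bit.
   Returns -1 if the buffer runs out part way through. */
int bits_read(BitReader *br, int count)
{
    int value = 0;
    while (count-- > 0) {
        int b = bits_next(br);
        if (b < 0)
            return -1;
        value = (value << 1) | b;
    }
    return value;
}
```

For the byte 0xa5 (binary 10100101), two calls to bits_next return 1 then 0, and bits_read with a count of 6 then returns the remaining bits 100101, which is 0x25.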

 

Let’s write a little code to see how the jpeg file format works… to give you a taster

 

 

 

/****************************************************************/
/*                                                              */
/* Title: How JPEG format works                                 */
/*                                                              */
/****************************************************************/

/************************** Start *******************************/
#include <windows.h>
#include <stdio.h>

/****************************************************************/
/*                                                              */
/* FeedBack Data                                                */
/*                                                              */
/****************************************************************/
// Append one line of text to our log file, t.txt
void abc(const char* s)
{
     FILE *fp = fopen("t.txt", "a+");
     if( fp == NULL )
          return;
     fprintf(fp, "%s\n", s);
     fclose(fp);
}

void readjpeg();

/****************************************************************/
/*                                                              */
/* Entry Point                                                  */
/*                                                              */
/****************************************************************/
int __stdcall WinMain (HINSTANCE hInst, HINSTANCE hPrev, LPSTR lpCmd, int nShow)
{
     readjpeg();
     return 0;
}

/****************************************************************/
/*                                                              */
/* jpeg functions                                               */
/*                                                              */
/****************************************************************/

/* Okay before we start getting overwhelmed by bits and bytes and
bit shifting and all sorts of special tricks.. we should first
read in the header...which is the first part of the file, and
can tell us a lot about the jpeg file. */

// First let's define some things
#define SOI  /*Start of Image*/  0xffd8
#define EOI  /*End of Image  */  0xffd9

void readjpeg()
{
     byte chunk[2];

     FILE *f = fopen("balloon.jpg", "rb");
     if( f == NULL )
     {
          abc("Could not open balloon.jpg");
          return;
     }

     // Let's read in the first word (e.g. a word is 2 bytes).
     if( fread(chunk, 1, 2, f) != 2 )
     {
          abc("File is too short to be a jpeg");
          fclose(f);
          return;
     }

     // really big buffer for text
     char buff[200];

     // Output what we have read in.. see what it is?
     sprintf(buff, "First 2 bytes are: 0x%x, 0x%x", chunk[0], chunk[1] );
     abc(buff);

     fclose(f);
}

// okay if you run the program you'll get as an output:
/*
            First 2 bytes are: 0xff, 0xd8
*/
// This tells us that the file we are dealing with is a jpeg,
// as it starts with 0xffd8 which means SOI (Start Of Image).

 

 

Well, it’s not the most exciting piece of code yet… but I’m a believer in starting simple….  Anyhow, I don’t want to lose you yet!… we haven’t even started on the Huffman coding…lol.

 

So what have we learned above?  Well, we have opened a file called “balloon.jpg”, something I just found lying around on my hard drive.  Then we read in the first 2 bytes from the very start of the file and printed them out to a text file called t.txt…. it’s not the most creative name, but you can change it if you want.

I like to write it to a text file so that we can examine what we have obtained as we go along.

 

All markers… (e.g. 2 byte identification chunks) always start with 0xff… it may not be so easy to follow if you read both bytes into a WORD on a PC, as the bytes come out the opposite way around, so what we’re really getting is 0xd8ff :)  But I think you get the point.  There is one exception: 0xff followed by 0x00 is not a marker at all.  This pair only appears in our encoded data (e.g. after the SOS, Start of Scan, header), where it stands for a plain 0xff data byte and the 0x00 is ignored.. but we’ll get to that in time.
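
As a sketch of how that exception is handled in practice, the function below walks through a buffer of encoded data and reports where the next real marker starts. The name find_scan_end is made up for illustration. Besides the 0xff 0x00 pair it also steps over the restart markers 0xffd0 to 0xffd7, which the JPEG standard allows inside the encoded data; this article does not cover them.

```c
#include <stddef.h>

/* Scan encoded data for the next real marker.
   Returns the index of the 0xff byte that starts it,
   or size if no marker is found.
   Hypothetical helper, assuming the data is already in memory. */
size_t find_scan_end(const unsigned char *data, size_t size)
{
    size_t i = 0;
    while (i + 1 < size) {
        if (data[i] == 0xff) {
            unsigned char next = data[i + 1];
            if (next == 0x00 || (next >= 0xd0 && next <= 0xd7)) {
                i += 2;         /* stuffed 0xff byte or restart marker: still scan data */
                continue;
            }
            if (next == 0xff) { /* fill byte: the marker may start at the next 0xff */
                i++;
                continue;
            }
            return i;           /* a real marker such as 0xffd9 */
        }
        i++;
    }
    return size;
}
```

For the bytes 0x12 0xff 0x00 0x34 0xff 0xd0 0x56 0xff 0xd9 it returns 7, the position of the 0xff that begins the EOI marker.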

 

Note: Sometimes I use the word chunk to represent the actual section ID of the part of the file… for example I say I’m reading in the chunk ID, which for example could be APP0 (e.g. 0xffe0), which will be followed by a chunk size of 2 bytes, then we go through the chunk.  Alternatively you can refer to them as sections, or markers..…just make sure you know what I’m going on about, and how it’s organized.

 

Our file would start like this:

 

[0xff] [0xd8] [0xff] [0xe0] [0x00] [0x10] …..

 

So what would this tell us???  Any ideas?…it goes like this:

 

[0xff] [0xd8] - first 2 bytes are a uni-marker representing SOI (Start Of Image).

[0xff] [0xe0] - in the further 2 bytes we have the marker APP0 (Application Marker)

[0x00] [0x10] - These are the further 2 bytes following our APP0 marker, telling us how long this section is!  Which is 0x0010…remember the high byte comes first.  So the section is 16 bytes (0x10) long, and that count includes the 2 bytes for the length itself but not the 2 bytes of the marker.
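
The same reading can be sketched in code. The struct and function below (SegmentHeader, read_segment_header) are hypothetical names, and the sketch assumes the bytes are already in a buffer and that the marker is one that carries a length.

```c
#include <stdint.h>
#include <stddef.h>

typedef struct {
    uint16_t marker;   /* e.g. 0xffe0 for APP0 */
    uint16_t length;   /* section length, including the 2 length bytes */
} SegmentHeader;

/* Reads a marker and its length from p, where avail is the number
   of bytes available at p. Only valid for markers that are followed
   by a length. Returns 0 on success, -1 if there are fewer than
   4 bytes or p does not start with 0xff. */
int read_segment_header(const unsigned char *p, size_t avail, SegmentHeader *out)
{
    if (avail < 4 || p[0] != 0xff)
        return -1;
    out->marker = (uint16_t)((p[0] << 8) | p[1]);
    out->length = (uint16_t)((p[2] << 8) | p[3]);
    return 0;
}
```

Pointing it just past the SOI marker of the bytes 0xff 0xd8 0xff 0xe0 0x00 0x10 fills in a marker of 0xffe0 and a length of 16.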

 

Now a good idea for a newbie to the jpeg file format is to write a small program which just goes through the file and sees what markers are in there!  We can read in the marker’s type, and we know how long that section is, so we can just jump to the next marker and read in what it is…get its size and skip to the next one.

 

Note:  All the markers with a size tell us the total amount of information in them “EXCEPT” the SOS marker, whose size only covers its small header.  The compressed image information follows straight after that header and is not counted in the size, so we have to scan through it to find where it ends.  The SOS data is usually located at the end of the file…well almost, the last marker is always EOI (End Of Image) and has the value 0xffd9.

 

And here we are… a nice little program which lists the markers in the jpg file:

 

 

#include <windows.h>
#include <stdio.h>

/****************************************************************/
/*                                                              */
/* FeedBack Data                                                */
/*                                                              */
/****************************************************************/
// Append one line of text to our log file, t.txt
void abc(const char* s)
{
     FILE *fp = fopen("t.txt", "a+");
     if( fp == NULL )
          return;
     fprintf(fp, "%s\n", s);
     fclose(fp);
}

void readjpeg();

/****************************************************************/
/*                                                              */
/* Entry Point                                                  */
/*                                                              */
/****************************************************************/
int __stdcall WinMain (HINSTANCE hInst, HINSTANCE hPrev, LPSTR lpCmd, int nShow)
{
     readjpeg();
     return 0;
}

/****************************************************************/
/*                                                              */
/* jpeg functions                                               */
/*                                                              */
/****************************************************************/

/* Okay before we start getting overwhelmed by bits and bytes and
bit shifting and all sorts of special tricks.. we should first
read in the header...which is the first part of the file, and
can tell us a lot about the jpeg file. */

// First let's define some things
#define SOI   /*Start of Image*/  0xffd8
#define EOI   /*End of Image  */  0xffd9

#define APP0  /*Application 0 */  0xffe0 /*to 0xffef APP15*/

// Really big buffer for text output
char buff[200];

void readjpeg()
{
     byte chunk[2];
     byte sizeofchunk[2];

     FILE *f = fopen("balloon.jpg", "rb");
     if( f == NULL )
     {
          abc("Could not open balloon.jpg");
          return;
     }

     // Let's read in the first 2 bytes (the SOI marker)
     fread(chunk, 1, 2, f);

     // Output what we have read in.. see what it is?
     sprintf(buff, "First 2 bytes are: 0x%x, 0x%x", chunk[0], chunk[1] );
     abc(buff);

     // Next comes a marker and its 2 byte size, high byte first
     fread(chunk, 1, 2, f);
     fread(sizeofchunk, 1, 2, f);
     unsigned short size = (unsigned short)((sizeofchunk[0] << 8) | sizeofchunk[1]);

     sprintf(buff, "Second 2 bytes are: 0x%x, 0x%x", chunk[0], chunk[1] );
     abc(buff);

     // Remember the size includes the 2 bytes for the size.
     sprintf(buff, "Size of our piece of data:%u", size);
     abc(buff);

     // Now we know how big the chunk is, we can read it in.
     // I know it's an APP0 chunk because the marker was 0xffe0
     char temp[100];
     size_t datalen = (size >= 2) ? (size_t)(size - 2) : 0;
     // Never read more than fits in temp, and leave room for the '\0'
     size_t want = (datalen < sizeof(temp) - 1) ? datalen : sizeof(temp) - 1;
     size_t got = fread(temp, 1, want, f);
     temp[got] = '\0'; // Null terminate the string :)

     // Skip any part of the chunk that did not fit in temp
     fseek(f, (long)(datalen - want), SEEK_CUR);

     sprintf(buff, "The APP0 value: %s", temp);
     abc(buff);

     // A stage further, opening up the various sections.

     //  Let's try and read in all the data...see what we get...
     //  Remember now, it's a 2 byte value which tells us what it is,
     //  then a 2 byte value of how big it is :)

     while(true)
     {
          if( fread(chunk, 1, 2, f) != 2 )
          {
               abc("Error: ran out of file before finding EOI");
               break;
          }

          // If the chunk we read in doesn't begin with 0xff
          // then it's not a valid marker, so exit.
          if( chunk[0] != 0xff )
          {
               sprintf(buff, "Error chunk[0] was: 0x%x, chunk[1]: 0x%x", chunk[0], chunk[1]);
               abc(buff);
               break;
          }

          // If we get 0xffd9 then it's the EOI (End Of Image)
          if( chunk[1] == 0xd9 )
          {
               abc("End Of Image");
               break;
          }

          if( fread(sizeofchunk, 1, 2, f) != 2 )
               break;
          size = (unsigned short)((sizeofchunk[0] << 8) | sizeofchunk[1]);
          if( size < 2 )
          {
               abc("Error: chunk size is smaller than 2");
               break;
          }

          if( chunk[1] == 0xda )
          {
               // Okay this means we have reached the encoded data.
               // The first byte of the SOS header is the component count.
               byte count = 0;
               fread(&count, 1, 1, f);
               sprintf(buff, "Start of scan count: %u", count);
               abc(buff);

               // Skip the rest of the SOS header (we have read 3 bytes of it)
               if( size > 3 )
                    fseek(f, size - 3, SEEK_CUR);

               // Walk through the encoded data until we hit a real marker
               int c;
               int marker = -1;
               while( (c = fgetc(f)) != EOF )
               {
                    if( c != 0xff )
                         continue;
                    do { c = fgetc(f); } while( c == 0xff ); // skip fill bytes
                    if( c == EOF )
                         break;
                    // 0xff00 is a plain 0xff data byte, and 0xffd0 to 0xffd7
                    // are restart markers inside the data, so keep going
                    if( c == 0x00 || (c >= 0xd0 && c <= 0xd7) )
                         continue;
                    marker = c;
                    break;
               }

               if( marker >= 0 )
                    sprintf(buff, "\nEnd of scan value: 0xff%02x", marker);
               else
                    sprintf(buff, "\nError: file ended inside the scan data");
               abc(buff);

               break;
          }

          sprintf(buff, "Chunk ID: 0x%02x%02x,   Size:%u",
                        chunk[0], chunk[1], size);
          abc(buff);

          // The size includes its own 2 bytes, which we have already read
          fseek(f, size-2, SEEK_CUR);
     }

     fclose(f);
}

 

 

If you run the above code you’ll get the following:

 

First 2 bytes are: 0xff, 0xd8

Second 2 bytes are: 0xff, 0xe0

Size of our piece of data:16

The APP0 value: JFIF

Chunk ID: 0xffdb,   Size:67

Chunk ID: 0xffdb,   Size:67

Chunk ID: 0xffc0,   Size:17

Chunk ID: 0xffc4,   Size:31

Chunk ID: 0xffc4,   Size:181

Chunk ID: 0xffc4,   Size:31

Chunk ID: 0xffc4,   Size:181

Start of scan count: 3

 

End of scan value: 0xffd9

 

Now believe it or not, we’ve got a whole lot of juicy information there… it tells us the size of each chunk, the order that they are in… and which chunks are in our image.
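
To put names to those chunk IDs, a small lookup helps. This is only a sketch: marker_name is a hypothetical helper that is not part of the program above, and the names come from the JPEG standard rather than from this tutorial.

```c
#include <stdint.h>

/* Returns a short name for a marker value such as 0xffdb.
   Covers only the common markers; anything else is "unknown". */
const char *marker_name(uint16_t marker)
{
    if (marker >= 0xffe0 && marker <= 0xffef) return "APPn (application data)";
    if (marker >= 0xffd0 && marker <= 0xffd7) return "RSTn (restart)";
    switch (marker) {
    case 0xffd8: return "SOI (start of image)";
    case 0xffd9: return "EOI (end of image)";
    case 0xffdb: return "DQT (quantization table)";
    case 0xffc0: return "SOF0 (baseline frame header)";
    case 0xffc4: return "DHT (Huffman table)";
    case 0xffda: return "SOS (start of scan)";
    case 0xffdd: return "DRI (restart interval)";
    case 0xfffe: return "COM (comment)";
    default:     return "unknown";
    }
}
```

Going by those names, the listing above reads as two DQT chunks, one SOF0 chunk and four DHT chunks, followed by the SOS scan that runs up to EOI.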

 

 

 

 

 
Copyright (c) 2002-2017 xbdev.net - All rights reserved.
Designated tutorial and software are the property of their respective owners.