Introduction to JPEG Compression

JPEG is an image compression standard developed by the Joint Photographic Experts Group. It was accepted as an international standard in 1992. JPEG is a lossy image compression method that uses the Discrete Cosine Transform (DCT) as its coding transform. The degree of compression can be adjusted, which allows a tradeoff between storage size and image quality.

The steps of JPEG image compression are as follows:

Step 1: The input image is divided into small blocks of 8x8 dimensions. Each block therefore contains 64 units, and each unit of the image is called a pixel.
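The block division can be sketched in a few lines of Python. This is only an illustration: the function name split_into_blocks is hypothetical, the image is assumed to be a plain list of rows, and its height and width are assumed to be multiples of 8 (a real encoder pads the edges instead).

```python
def split_into_blocks(image, size=8):
    """Split a 2-D list of pixel values into size x size blocks, row by row.

    Assumption for this sketch: height and width are multiples of `size`.
    """
    height, width = len(image), len(image[0])
    if height % size or width % size:
        raise ValueError("image dimensions must be multiples of the block size")
    blocks = []
    for top in range(0, height, size):
        for left in range(0, width, size):
            # Take `size` rows, and from each row the `size` columns of this block.
            block = [row[left:left + size] for row in image[top:top + size]]
            blocks.append(block)
    return blocks


# A 16x16 test image whose pixel value encodes its position (y * 16 + x).
image = [[y * 16 + x for x in range(16)] for y in range(16)]
blocks = split_into_blocks(image)
print(len(blocks))                          # 4 blocks
print(sum(len(row) for row in blocks[0]))   # 64 pixels per block
```

A 16x16 image gives four blocks, each holding 8 x 8 = 64 pixels.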

Step 2: JPEG uses the [Y,Cb,Cr] color model instead of the [R,G,B] model, so in the second step the RGB values are converted into YCbCr.
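As a rough illustration of this conversion, the sketch below uses the full-range coefficients commonly associated with JFIF files for 8-bit samples. The function name rgb_to_ycbcr and the choice to round and clamp the result to 0..255 are assumptions made for the example.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to YCbCr (full-range, JFIF-style coefficients)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    # Round to integers and clamp to the 8-bit range (a choice made for this sketch).
    return tuple(min(255, max(0, round(v))) for v in (y, cb, cr))


print(rgb_to_ycbcr(255, 255, 255))  # (255, 128, 128): white has full luma, neutral chroma
print(rgb_to_ycbcr(0, 0, 0))        # (0, 128, 128): black has zero luma, neutral chroma
print(rgb_to_ycbcr(255, 0, 0))      # (76, 85, 255): red pushes Cr to its maximum
```

Y carries the brightness, while Cb and Cr carry the color difference; gray pixels always map to Cb = Cr = 128.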

Step 3: After the color conversion, each block is forwarded to the DCT. The DCT uses only cosine functions, so it does not involve complex numbers. It converts the information in a block of pixels from the spatial domain to the frequency domain.
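To make the move from the spatial domain to the frequency domain concrete, here is a minimal, unoptimized sketch of the 8x8 forward DCT-II in pure Python. The name dct_2d and the orthonormal scaling factors are choices made for this example; real encoders use fast factorized versions of the same transform.

```python
import math

N = 8  # JPEG block size


def dct_2d(block):
    """Forward 8x8 DCT-II computed directly from its definition (slow but simple)."""
    def c(k):
        # Normalization factor: 1/sqrt(2) for the zero-frequency term, 1 otherwise.
        return 1 / math.sqrt(2) if k == 0 else 1.0

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = 0.25 * c(u) * c(v) * total
    return out


# A completely flat block contains no variation, so only the
# zero-frequency (DC) coefficient at position (0, 0) is nonzero.
flat = [[50] * N for _ in range(N)]
coeffs = dct_2d(flat)
print(round(coeffs[0][0], 6))  # 400.0
```

For the flat block, all 63 remaining coefficients are zero up to floating-point error, which shows how the transform concentrates the content of a smooth block into very few values.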

DCT Formula
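For an 8x8 block, the forward 2-D DCT is commonly written as below. The symbols are notation assumed here: f(x, y) is the pixel value at position (x, y) in the block, F(u, v) is the coefficient for horizontal and vertical frequency indices u and v, and C is a normalization factor.

```latex
F(u,v) = \frac{1}{4}\, C(u)\, C(v) \sum_{x=0}^{7} \sum_{y=0}^{7}
         f(x,y)\,
         \cos\!\left[\frac{(2x+1)\,u\,\pi}{16}\right]
         \cos\!\left[\frac{(2y+1)\,v\,\pi}{16}\right],
\qquad
C(k) =
\begin{cases}
  \dfrac{1}{\sqrt{2}} & \text{if } k = 0 \\[4pt]
  1 & \text{otherwise}
\end{cases}
```

F(0, 0) is called the DC coefficient and is proportional to the average value of the block; the other 63 values are the AC coefficients.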

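Step 4: The human eye is much less sensitive to high-frequency detail than to low-frequency content. After the DCT, most of the visually important information is concentrated in the low-frequency coefficients near the top-left corner of the matrix, so the high-frequency coefficients can be stored with less precision or dropped. Quantization is used to reduce the number of bits per sample: each coefficient is divided by a step size and rounded. This is the step in which information is lost.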

There are two types of Quantization:

Uniform Quantization
Non-Uniform Quantization
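The sketch below shows quantization as division by a step size followed by rounding, first with a single step size for every coefficient and then with a table whose step sizes grow with frequency. Both the coefficient block and the table are made up for illustration; the table is not the standard JPEG luminance table, and the function names are hypothetical.

```python
N = 8


def quantize(block, table):
    """Divide each coefficient by its step size and round to the nearest integer."""
    return [[round(c / q) for c, q in zip(brow, trow)]
            for brow, trow in zip(block, table)]


def dequantize(qblock, table):
    """Approximate reconstruction: multiply each quantized value by its step size."""
    return [[c * q for c, q in zip(brow, trow)]
            for brow, trow in zip(qblock, table)]


def count_zeros(block):
    return sum(value == 0 for row in block for value in row)


# Made-up DCT coefficients that decay as the frequency index u + v grows.
coeffs = [[240.0 / (1 + u + v) ** 2 for v in range(N)] for u in range(N)]

# One step size for every coefficient.
same_step = [[10] * N for _ in range(N)]
# Hypothetical table: larger steps for higher frequencies.
growing_step = [[10 + 5 * (u + v) for v in range(N)] for u in range(N)]

q_same = quantize(coeffs, same_step)
q_growing = quantize(coeffs, growing_step)

print(count_zeros(q_same))     # 43 of 64 coefficients become zero
print(count_zeros(q_growing))  # 54 of 64 coefficients become zero
```

Larger step sizes at high frequencies turn more coefficients into zeros, which is what makes the later scanning and run-length steps effective. Dequantizing returns only multiples of the step size, so the original values cannot be recovered exactly.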

Step 5: The zigzag scan is used to map the 8x8 matrix to a 1x64 vector. It orders the coefficients so that the low-frequency coefficients come first in the vector and the high-frequency coefficients come last. Because most high-frequency coefficients are zero after quantization, this ordering gathers the zeros of the quantized matrix into long runs that can be removed efficiently in the later steps.
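One way to generate the zigzag order is to walk the anti-diagonals of the block and alternate the direction on each one. The sketch below does this by sorting the positions; the function names zigzag_order and zigzag_scan are hypothetical.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zigzag order."""
    def key(pos):
        row, col = pos
        diagonal = row + col
        # Odd diagonals run top-right to bottom-left (row increasing),
        # even diagonals run bottom-left to top-right (row decreasing).
        return (diagonal, row if diagonal % 2 else -row)

    return sorted(((r, c) for r in range(n) for c in range(n)), key=key)


def zigzag_scan(block):
    """Flatten a square block into a list following the zigzag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


order = zigzag_order()
print(order[:6])  # [(0, 0), (0, 1), (1, 0), (2, 0), (1, 1), (0, 2)]

# A block whose value encodes its position (row * 8 + col).
block = [[r * 8 + c for c in range(8)] for r in range(8)]
vector = zigzag_scan(block)
print(len(vector), vector[:6])  # 64 [0, 1, 8, 16, 9, 2]
```

The first entry of the vector is the DC coefficient, and the remaining 63 entries are the AC coefficients in order of increasing frequency.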

Step 6: The next step is vectoring: differential pulse code modulation (DPCM) is applied to the DC component of each block. DC components are large and vary from block to block, but each one is usually close to the DC value of the previous block. DPCM therefore encodes the difference between the DC value of the current block and that of the previous block.
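A minimal sketch of DPCM on a sequence of DC values follows. The function names are hypothetical, the sample values are made up, and the predictor is assumed to start at 0 for the first block.

```python
def dpcm_encode(dc_values):
    """Replace each DC value by its difference from the previous block's DC value."""
    diffs, previous = [], 0  # assumption: the predictor starts at 0
    for dc in dc_values:
        diffs.append(dc - previous)
        previous = dc
    return diffs


def dpcm_decode(diffs):
    """Rebuild the DC values by accumulating the differences."""
    values, previous = [], 0
    for diff in diffs:
        previous += diff
        values.append(previous)
    return values


dc_values = [150, 155, 149, 152, 144]  # made-up DC coefficients of five blocks
diffs = dpcm_encode(dc_values)
print(diffs)               # [150, 5, -6, 3, -8]
print(dpcm_decode(diffs))  # [150, 155, 149, 152, 144]
```

Apart from the first value, the differences are much smaller than the DC values themselves, so they need fewer bits to encode. The step is lossless, since decoding restores the original values exactly.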

Step 7: In this step, Run Length Encoding (RLE) is applied to the AC components, because the AC components contain a lot of zeros. Each nonzero AC component is encoded as a pair (skip, value), in which skip is the number of zeros that precede it and value is the actual coded value of the nonzero component.
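The (skip, value) pairs can be sketched as follows. This is a simplified illustration with hypothetical function names: it always ends the block with a (0, 0) pair as an end-of-block marker and does not split runs longer than 15 zeros, which a real JPEG encoder has to handle.

```python
def rle_encode_ac(ac):
    """Encode AC coefficients as (skip, value) pairs, ending with (0, 0)."""
    pairs, run = [], 0
    for value in ac:
        if value == 0:
            run += 1                    # count zeros before the next nonzero value
        else:
            pairs.append((run, value))
            run = 0
    pairs.append((0, 0))                # simplified end-of-block marker
    return pairs


def rle_decode_ac(pairs, length=63):
    """Expand (skip, value) pairs back into `length` AC coefficients."""
    ac = []
    for skip, value in pairs:
        if (skip, value) == (0, 0):
            break
        ac.extend([0] * skip)
        ac.append(value)
    ac.extend([0] * (length - len(ac)))  # trailing zeros implied by end-of-block
    return ac


# Made-up AC vector: three nonzero values followed by zeros (63 entries in total).
ac = [6, 0, 0, -1, 0, 0, 0, 2] + [0] * 55
pairs = rle_encode_ac(ac)
print(pairs)  # [(0, 6), (2, -1), (3, 2), (0, 0)]
```

Here 63 coefficients shrink to four pairs, and decoding the pairs restores the original vector.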