Image Compression Standards Chapter 9
Summary
This document provides an overview of image compression standards, focusing on JPEG. It details JPEG's core principles, its implementation techniques (e.g., DCT, quantization, entropy coding), and the rationale behind certain design decisions. It also briefly mentions other image compression standards.
Full Transcript
Chapter 9 Image Compression Standards

Outline
9.1 The JPEG Standard
9.2 The JPEG2000 Standard (skip)
9.3 The JPEG-LS Standard (skip)
9.4 Bi-level Image Compression Standards (skip)

9.1 The JPEG Standard
JPEG is an image compression standard that was developed by the "Joint Photographic Experts Group". JPEG was formally accepted as an international standard in 1992.
JPEG is a lossy image compression method. It employs a transform coding method using the DCT (Discrete Cosine Transform).
An image is a function of i and j (or x and y) in the spatial domain. The 2D DCT is used as one step in JPEG in order to yield a frequency response which is a function F(u, v) in the spatial frequency domain, indexed by u and v.

Observations for JPEG Image Compression
The effectiveness of the DCT transform coding method in JPEG relies on 3 major observations:
Observation 1: Useful image contents change relatively slowly across the image in a small area, for example, within an 8×8 image block. Much of the information in an image is repeated, hence "spatial redundancy".
Observation 2: Psychophysical experiments suggest that humans are much less likely to notice the loss of very high spatial frequency components than the loss of lower frequency components. The spatial redundancy can be reduced by largely reducing the high spatial frequency contents.
Observation 3: Visual acuity is much greater for gray (luminance) than for color (chrominance). Chroma subsampling (4:2:0) is used in JPEG.

Fig. 9.1: Block diagram for JPEG encoder.

9.1.1 Main Steps in JPEG Image Compression
1. Transform RGB to YIQ or YUV and subsample color.
2. DCT on image blocks.
3. Quantization.
4. Zig-zag ordering and run-length encoding.
5. Entropy coding.

DCT on image blocks
Each image is divided into 8×8 blocks. The 2D DCT is applied to each block image f(i, j), with output being the DCT coefficients F(u, v) for each block.
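
The block transform described above can be sketched in a few lines. This is a minimal illustration rather than a codec implementation: it assumes the orthonormal DCT-II scaling, uses plain Python lists, and the helper names (`dct_matrix`, `matmul`, `dct2`, `idct2`) are hypothetical names chosen for this example.

```python
import math

N = 8  # JPEG block size


def dct_matrix(n=N):
    """Return the n x n orthonormal DCT-II basis matrix T, so that F = T f T^T."""
    rows = []
    for u in range(n):
        # Scale factor: sqrt(1/n) for the DC row, sqrt(2/n) for the others.
        c = math.sqrt(1.0 / n) if u == 0 else math.sqrt(2.0 / n)
        rows.append([c * math.cos((2 * i + 1) * u * math.pi / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[r][k] * b[k][c] for k in range(len(b)))
             for c in range(len(b[0]))] for r in range(len(a))]


def transpose(m):
    return [list(col) for col in zip(*m)]


def dct2(block):
    """2D DCT of a square block f(i, j), giving coefficients F(u, v)."""
    t = dct_matrix(len(block))
    return matmul(matmul(t, block), transpose(t))


def idct2(coeffs):
    """Inverse 2D DCT, recovering f(i, j) from F(u, v)."""
    t = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(t), coeffs), t)


# A perfectly flat block has all of its energy in the DC coefficient F(0, 0).
flat = [[100.0] * N for _ in range(N)]
F = dct2(flat)
print(round(F[0][0], 6))  # DC term; every AC term is zero up to rounding error
```

For a smooth block like this one, nearly all AC coefficients vanish, which is the behavior Observation 1 relies on.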
Using blocks, however, has the effect of isolating each block from its neighboring context. This is why JPEG images look blocky when a high compression ratio is specified by the user.

Quantization
$$\hat{F}(u, v) = \mathrm{round}\left(\frac{F(u, v)}{Q(u, v)}\right)$$
F(u, v) represents a DCT coefficient, Q(u, v) is a quantization matrix entry, and $\hat{F}(u, v)$ represents the quantized DCT coefficients which JPEG will use in the entropy coding.
The quantization step is the main source of loss in JPEG compression.
The entries of Q(u, v) tend to have larger values towards the lower right corner. This aims to introduce more loss at the higher spatial frequencies - a practice supported by Observations 1 and 2.
Tables 9.1 and 9.2 show the default Q(u, v) values obtained from psychophysical studies with the goal of maximizing the compression ratio while minimizing perceptual losses in JPEG images.

Table 9.1 The Luminance Quantization Table
Table 9.2 The Chrominance Quantization Table

Fig. 9.2: JPEG compression for a smooth image block.
Fig. 9.3: JPEG compression for a textured (complex) image block.

Run-length Coding (RLC) on AC coefficients
RLC aims to turn the $\hat{F}(u, v)$ values into sets {#-zeros-to-skip, next non-zero value}.
To make it most likely to hit a long run of zeros, a zig-zag scan is used to turn the 8×8 matrix $\hat{F}(u, v)$ into a 64-vector.

Fig. 9.4: Zig-Zag Scan in JPEG.

DPCM on DC coefficients
The DC coefficients are coded separately from the AC ones. Differential Pulse Code Modulation (DPCM) is the coding method.
If the DC coefficients for the first 5 image blocks are 150, 155, 149, 152, 144, then the DPCM would produce 150, 5, −6, 3, −8, assuming $d_i = DC_i - DC_{i-1}$ and $d_0 = DC_0$.

Entropy Coding
The DC and AC coefficients finally undergo an entropy coding step to gain a possible further compression.
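
Before looking at the entropy coder in detail, the quantization, zig-zag, run-length, and DPCM steps described above can be sketched together. This is an illustrative sketch under several assumptions: `Q_LUM` holds the widely published default luminance table, assumed here to correspond to Table 9.1; rounding is half-away-from-zero; a (0, 0) pair is used as the end-of-block marker; and the 15-zero run limit of the real bitstream is ignored. All function names are hypothetical.

```python
import math

# Widely published default luminance quantization table;
# assumed here to correspond to Table 9.1.
Q_LUM = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]


def round_half_away(x):
    """Round to the nearest integer, ties away from zero (an assumption)."""
    return int(math.copysign(math.floor(abs(x) + 0.5), x))


def quantize(F, Q=Q_LUM):
    """Quantized coefficient = round(F(u, v) / Q(u, v)) for every entry."""
    return [[round_half_away(F[u][v] / Q[u][v]) for v in range(8)]
            for u in range(8)]


def zigzag_order(n=8):
    """Visit anti-diagonals in turn, alternating direction on each one."""
    cells = ((u, v) for u in range(n) for v in range(n))
    return sorted(cells, key=lambda p: (p[0] + p[1],
                                        p[0] if (p[0] + p[1]) % 2 else p[1]))


def zigzag(block):
    """Flatten an 8x8 block into a 64-vector in zig-zag order."""
    return [block[u][v] for u, v in zigzag_order(len(block))]


def run_length_ac(zz):
    """Turn AC coefficients zz[1:] into (zeros-to-skip, next non-zero) pairs."""
    pairs, run = [], 0
    for value in zz[1:]:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    if run:
        pairs.append((0, 0))  # end-of-block: only zeros remain
    return pairs


def dpcm(dc_values):
    """d_0 = DC_0 and d_i = DC_i - DC_(i-1) for the following blocks."""
    out, prev = [], 0
    for dc in dc_values:
        out.append(dc - prev)
        prev = dc
    return out


# Hypothetical DCT output with only three non-zero coefficients.
F = [[0.0] * 8 for _ in range(8)]
F[0][0], F[0][1], F[1][0] = 515.0, 65.0, -12.0
zz = zigzag(quantize(F))
print(zz[:4])                            # [32, 6, -1, 0]
print(run_length_ac(zz))                 # [(0, 6), (0, -1), (0, 0)]
print(dpcm([150, 155, 149, 152, 144]))   # [150, 5, -6, 3, -8]
```

The zig-zag order places low-frequency coefficients first, so the zeros produced by quantizing the high frequencies collect at the end of the vector and collapse into a single end-of-block pair.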
Use DC as an example: each DPCM-coded DC coefficient is represented by (SIZE, AMPLITUDE), where SIZE indicates how many bits are needed for representing the coefficient, and AMPLITUDE contains the actual bits.
In the example we are using, codes 150, 5, −6, 3, −8 will be turned into (8, 10010110), (3, 101), (3, 001), (2, 11), (4, 0111). Negative values are represented by the one's complement of their magnitude, so −6 (magnitude 110) becomes 001.
SIZE is Huffman coded since smaller SIZEs occur much more often. AMPLITUDE is not Huffman coded; its value can change widely, so Huffman coding has no appreciable benefit.

Table 9.3 Baseline entropy coding details - size category.

9.1.2 Four Commonly Used JPEG Modes
Sequential Mode - the default JPEG mode. Each gray-level image or color image component is encoded in a single left-to-right, top-to-bottom scan.
Progressive Mode.
Hierarchical Mode.
Lossless Mode - discussed in Chapter 7, to be replaced by JPEG-LS (Section 9.3).

Progressive Mode
Progressive JPEG delivers low quality versions of the image quickly, followed by higher quality passes.
1. Spectral selection: Takes advantage of the spectral (spatial frequency spectrum) characteristics of the DCT coefficients: higher AC components provide detail information.
Scan 1: Encode DC and first few AC components, e.g., AC1, AC2.
Scan 2: Encode a few more AC components, e.g., AC3, AC4, AC5.
...
Scan k: Encode the last few ACs, e.g., AC61, AC62, AC63.
2. Successive approximation: Instead of gradually encoding spectral bands, all DCT coefficients are encoded simultaneously but with their most significant bits (MSBs) first.
Scan 1: Encode the first few MSBs, e.g., Bits 7, 6, 5, 4.
Scan 2: Encode a few more less significant bits, e.g., Bit 3.
...
Scan m: Encode the least significant bit (LSB), Bit 0.

9.1.3 A Glance at the JPEG Bitstream
Fig. 9.6: JPEG bitstream.
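
Returning to the (SIZE, AMPLITUDE) representation used in entropy coding, the mapping can be sketched as follows. This is a minimal illustration that assumes one's complement for negative values, which is consistent with the example codes in the text; the function names `size_amplitude` and `decode_amplitude` are hypothetical, and the Huffman coding of SIZE itself is not shown.

```python
def size_amplitude(value):
    """Map a DPCM-coded DC value to (SIZE, AMPLITUDE bit string)."""
    if value == 0:
        return 0, ""                      # SIZE 0 carries no amplitude bits
    size = abs(value).bit_length()        # bits needed for the magnitude
    if value > 0:
        bits = format(value, "b")
    else:
        # One's complement of the magnitude, kept to `size` bits.
        mask = (1 << size) - 1
        bits = format(~(-value) & mask, "0{}b".format(size))
    return size, bits


def decode_amplitude(size, bits):
    """Invert size_amplitude: a leading 0 bit marks a negative value."""
    if size == 0:
        return 0
    raw = int(bits, 2)
    return raw if bits[0] == "1" else raw - ((1 << size) - 1)


for d in [150, 5, -6, 3, -8]:
    print(d, size_amplitude(d))
# 150 (8, '10010110')
# 5 (3, '101')
# -6 (3, '001')
# 3 (2, '11')
# -8 (4, '0111')
```

Because a positive amplitude always starts with 1 and a negative one always starts with 0, the decoder needs only SIZE and the first bit to recover the sign, so no separate sign bit is spent.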