Congestion and Queuing (Chapter 3 PDF)
Document Details
Uploaded by WellPositionedMesa8527
Carleton University
Tags
Summary
This document provides an overview of various queuing algorithms used in networking. The topics covered include First-In-First Out (FIFO), priority queuing, round robin, and weighted round robin. It also discusses concepts related to router queuing components and congestion management.
Full Transcript
Congestion and Queuing ▪ Congestion can occur at any point in the network where there are points of speed mismatches or aggregation. ▪ Queuing manages congestion to provide bandwidth and delay guarantees. Queuing Algorithms ◼ First-in, first-out (FIFO) ◼ Priority queuing (PQ) ◼ R...
Congestion and Queuing ▪ Congestion can occur at any point in the network where there are points of speed mismatches or aggregation. ▪ Queuing manages congestion to provide bandwidth and delay guarantees. Queuing Algorithms ◼ First-in, first-out (FIFO) ◼ Priority queuing (PQ) ◼ Round robin ◼ Weighted round robin (WRR) FIFO ◼ First packet in is first packet out ◼ Simplest of all ◼ One queue ◼ All individual queues are FIFO Priority Queuing ◼ Uses multiple queues ◼ Allows prioritization ◼ Always empties first queue before going to the next queue: ◼ Empty queue number 1. ◼ If queue number 1 is empty, then dispatch one packet from queue number 2. ◼ If both queue number 1 and queue number 2 are empty, then dispatch one packet from queue number 3. ◼ Queues number 2 and number Round Robin Queuing ◼ Uses multiple queues ◼ No prioritization ◼ Dispatches one packet from each queue in each round: One packet from queue number 1 One packet from queue number 2 One packet from queue number 3 Then repeat Weighted Round Robin Queuing ◼ Allows prioritization ◼ Assign a weight to each queue ◼ Dispatches packets from each queue proportionately to an assigned weight: ◼ Dispatch up to four from queue number 1. ◼ Dispatch up to two from queue number 2. ◼ Dispatch 1 from queue number 3. ◼ Go back to queue number 1. Router Queuing Components ◼ Each physical interface has a hardware and a software queuing system. Hardware and Software Router Queuing Components ◼ The hardware queuing system always uses FIFO queuing. ◼ The software queuing system can be selected and configured depending on the platform and Cisco IOS version. The Software Queue ◼ Generally, a full hardware queue indicates interface congestion, and software queuing is used to manage it. ◼ When a packet is being forwarded, the router will bypass the software queue if the hardware queue has space in it (no congestion). The Hardware Queue ◼ Routers determine the length of the hardware queue based on the configured bandwidth of the interface. ◼ The length of the hardware queue can be adjusted with the tx- ring-limit command. ◼ Reducing the size of the hardware queue has two benefits: It reduces the maximum amount of time that packets wait in the FIFO queue before being transmitted. It accelerates the use of QoS in Cisco IOS software. ◼ Improper tuning of the hardware queue may produce undesirable results: A long transmit queue may result in poor performance of the software queuing system. A short transmit queue may result in a large number of interrupts, which causes high CPU utilization and low link utilization. Weighted Fair Queuing (WFQ) ◼ A queuing algorithm should share the bandwidth fairly among flows by: Reducing response time for interactive flows by scheduling them to the front of the queue Preventing high-volume flows from monopolizing an interface ◼ In the WFQ implementation, conversations are sorted into flows and transmitted by the order of the last bit crossing its channel. ◼ Unfairness is reinstated by introducing weight to give proportionately more bandwidth to flows with higher IP precedence (lower weight). ◼ The terms “WFQ flows” and “conversations” can be interchanged. Class-Based Weighted Fair Queuing ◼ CBWFQ is a mechanism that is used to guarantee bandwidth to classes. ◼ CBWFQ extends the standard WFQ functionality to provide support for user-defined traffic classes: Classes are based on user-defined match criteria. Packets satisfying the match criteria for a class constitute the traffic for that class. ◼ A queue is reserved for each class, and traffic belonging to a class is directed to that class queue. Low Latency Queuing (LLQ) ◼ A priority queue is added to CBWFQ for real-time traffic. ◼ High-priority classes are guaranteed: Low-latency propagation of packets Bandwidth ◼ High-priority classes are also policed when congestion occurs—they then cannot exceed their guaranteed bandwidth. ◼ Lower-priority classes use CBWFQ. Dropping/Congestion Management Managing Interface Congestion with Tail Drop ◼ Router interfaces experience congestion when the output queue is full: Additional incoming packets are dropped. Dropped packets may cause significant application performance degradation. Tail drop has significant drawbacks. Tail Drop Limitations ◼ In some situations, simple tail drop should be avoided because it contains significant flaws: Dropping can affect TCP synchronization. Dropping can cause TCP starvation. There is no differentiated drop—high-priority traffic is dropped as easily as low-priority traffic. Random Early Detection (RED) ◼ Tail drop can be avoided if congestion is prevented. ◼ RED is a mechanism that randomly drops packets before a queue is full. ◼ RED increases drop rate as the average queue size increases. ◼ RED result: TCP sessions slow to the approximate rate of output-link bandwidth. Average queue size is small (much less than the maximum queue size). TCP sessions are desynchronized by random drops. RED Drop Profiles RED Modes ◼ RED has three modes: No drop: When the average queue size is between 0 and the minimum threshold Random drop: When the average queue size is between the minimum and the maximum threshold Full drop (tail drop): When the average queue size is above the maximum threshold ◼ Random drop should prevent congestion (prevent tail drops). Weighted Random Early Detection (WRED) ◼ WRED can use multiple RED profiles. ◼ Each profile is identified by: Minimum threshold Maximum threshold Mark probability denominator ◼ WRED profile selection is based on: IP precedence (8 profiles) DSCP (64 profiles) ◼ WRED drops less important packets more aggressively than more important packets. ◼ WRED can be applied at the interface, VC, or class level. WRED Building Blocks Class-Based WRED (CBWRED) ◼ Class-based WRED is available when configured in combination with CBWFQ. ◼ Using CBWFQ with WRED allows the implementation of DiffServ Assured Forwarding PHB. ◼ Class-based configuration of WRED is identical to stand- alone WRED. Traffic Policing and Shaping Traffic Conditioners ◼ Policing Limits bandwidth by discarding traffic. Can re-mark excess traffic and attempt to send. Should be used on higher-speed interfaces. Can be applied inbound or outbound. ◼ Shaping Limits excess traffic by buffering. Buffering can lead to a delay. Recommended for slower-speed interfaces. Cannot re-mark traffic. Can only be applied in the outbound direction. Traffic Policing and Shaping Overview ◼ These mechanisms must classify packets before policing or shaping the traffic rate. ◼ Traffic policing typically drops or marks excess traffic to stay within a traffic rate limit. ◼ Traffic shaping queues excess packets to stay within the desired traffic rate. Why Use Policing? Why Use Shaping? ◼ To limit access to resources ◼ To prevent and manage when high-speed access is congestion in ATM, Frame used but not desired (subrate Relay, and Metro Ethernet access) networks, where asymmetric ◼ To limit the traffic rate of bandwidths are used along the certain applications or traffic traffic path classes ◼ To regulate the sending traffic ◼ To mark down (recolor) rate to match the subscribed exceeding traffic at Layer 2 or (committed) rate in ATM, Layer 3 Frame Relay, or Metro Ethernet networks ◼ To implement shaping at the network edge Policing Versus Shaping ◼ Incoming and outgoing directions. ◼ Outgoing direction only. ◼ Out-of-profile packets are ◼ Out-of-profile packets are queued dropped. until a buffer gets full. ◼ Dropping causes TCP retransmits. ◼ Buffering minimizes TCP ◼ Policing supports packet marking retransmits. or re-marking. ◼ Marking or re-marking not supported. ◼ Shaping supports interaction with Frame Relay congestion indication. Token Bucket ◼ Mathematical model used by routers and switches to regulate traffic flow. ◼ Tokens represent permission to send a number of bits into the network. ◼ Tokens are put into the bucket at a certain rate by IOS. ◼ Token bucket holds tokens. ◼ Tokens are removed from the bucket when packets are forwarded. ◼ If there are not enough tokens in the bucket to send the packet, traffic conditioning is invoked (shaping or policing). Single Token Bucket ◼ If sufficient tokens are available (conform action): Tokens equivalent to the packet size are removed from the bucket. The packet is transmitted. Single Token Bucket Exceed Action ◼ If sufficient tokens are not available (exceed action): Drop (or mark) the packet. Single Token Bucket Class-Based Policing Bc is normal burst size. Tc is the time interval. CIR is the committed information rate. CIR = Bc / Tc NET 4007 Multimedia Networking MPEG Audio Compression BIT - NET 4007 Slide 193 MPEG Audio Compression ◼ So far, we have concentrated on voice telephony applications – usually, LPC and CELP are tuned to speech parameters. ◼ In contrast, here, we consider compression methods applicable to general audio, such as music, broadcast digital TV. ◼ Instead of modeling speech, the method used is a waveform coding approach – one that attempts to make the decompressed amplitude- versus-time waveform as much as possible like the input signal. ◼ The main technique used in evaluating audio content for possible compression makes use of a psychoacoustic model of hearing. ◼ The kind of coding carried out, then, is generally referred to as perceptual coding. BIT - NET 4007 Slide 194 Psychoacoustics ◼ The range of human hearing is about 20 Hz to about 20 kHz. ◼ The frequency range of the voice is typically only from about 200 Hz to 4 kHz. ◼ The dynamic range, the ratio of the maximum sound amplitude to the quietest sound that humans can hear, is on the order of about 120 dB. BIT - NET 4007 Slide 195 Frequency Masking ◼ Lossy audio data compression methods, such as MPEG/Audio encoding, remove some sounds which are masked anyway. ◼ The general situation in regard to masking is as follows: ◼ A lower tone can effectively mask (make us unable to hear) a higher tone. ◼ The reverse is not true -- a higher tone does not mask a lower tone well. ◼ The greater the power in the masking tone, the wider is its influence -- the broader the range of frequencies it can mask. ◼ As a consequence, if two tones are widely separated in frequency then little masking occurs. BIT - NET 4007 Slide 196 Critical Bands ◼ Hearing has a limited, frequency-dependent resolution. ◼ In a complex tone, the critical bandwidth corresponds to the smallest frequency difference between two partials such that each can still be heard separately. ◼ The critical bandwidth represents the ear’s resolving power for simultaneous tones or partials. ◼ Experiments indicate that the critical bandwidth: ◼ for masking frequencies < 500 Hz: remains approximately constant in width ( about 100 Hz). ◼ for masking frequencies > 500 Hz: increases approximately linearly with frequency. BIT - NET 4007 Slide 197 Temporal Masking ◼ Recall that after the dance it takes quite a while for our hearing to return to normal. ◼ Phenomenon: any loud tone will cause the hearing receptors in the inner ear to become saturated and require time to recover. ◼ To quantify this type of behavior, we can measure the time sensitivity of hearing by another masking experiment. Suppose we play a masking tone at 1 kHz with a volume level of 60 dB, and a nearby tone at, say, 1.1 kHz with a volume level of 40 dB. Once the masking tone is turned off, we can again hear the 1.1 kHz tone, but only after a small amount of time. BIT - NET 4007 Slide 198 MPEG Audio ◼ MPEG audio compression takes advantage of psychoacoustic models, constructing a large multi- dimensional lookup table to transmit masked frequency components using fewer bits. ◼ MPEG Audio Overview ◼ Applies a filter bank to the input to break it into its frequency components. ◼ In parallel, a psychoacoustic model is applied to the data for bit allocation block. ◼ The number of bits allocated are used to quantize the info from the filter bank -- providing the compression. BIT - NET 4007 Slide 199 MPEG Audio Strategy ◼ MPEG approach to compression relies on ◼ Quantization ◼ Human auditory system is not accurate within the width of a critical band (perceived loudness and audibility of a frequency). ◼ MPEG encoder employs a bank of filters to ◼ Analyze the frequency components of the audio signal by calculating a frequency transform of a window of signal values. ◼ Decompose the signal into subbands by using a bank of filters (Layer 1 & 2: “quadrature-mirror”; Layer 3: adds a DCT; psychoacoustic model: Fourier transform) BIT - NET 4007 Slide 200 MPEG Audio Strategy ◼ Frequency masking: by using a psychoacoustic model to estimate the just noticeable noise level: ◼ Encoder balances the masking behavior and the available number of bits by discarding inaudible frequencies. ◼ Scaling quantization according to the sound level that is left over, above masking levels. ◼ May take into account the actual width of the critical bands: ◼ For practical purposes, audible frequencies are divided into 25 main critical bands. ◼ To keep simplicity, adopts a uniform width for all frequency analysis filters, using 32 overlapping subbands. BIT - NET 4007 Slide 201 MPEG Layers ◼ MPEG audio offers three compatible layers : ◼ Each succeeding layer able to understand the lower layers. ◼ Each succeeding layer offering more complexity in the psychoacoustic model and better compression for a given level of audio quality. ◼ each succeeding layer, with increased compression effectiveness, accompanied by extra delay. ◼ The objective of MPEG layers: a good tradeoff between quality and bit- rate. ◼ Layer 1 quality can be quite good provided a comparatively high bit-rate is available. Digital Audio Tape typically uses Layer 1 at around 192 kbps. ◼ Layer 2 has more complexity; was proposed for use in Digital Audio Broadcasting. ◼ Layer 3 (MP3) is most complex, and was originally aimed at audio transmission over ISDN lines. ◼ Most of the complexity increase is at the encoder, not the decoder – accounting for the popularity of MP3 players. BIT - NET 4007 Slide 202 Basic Algorithm ◼ The algorithm proceeds by dividing the input into 32 frequency subbands, via a filter bank. ◼ A linear operation taking 32 PCM samples, sampled in time; output is 32 frequency coefficients. ◼ In the Layer 1 encoder, the sets of 32 PCM values are first assembled into a set of 12 groups of 32s. ◼ The following figure shows how samples are organized. ◼ A Layer 2 or Layer 3, frame actually accumulates more than 12 samples for each subband: a frame includes 1,152 samples. BIT - NET 4007 Slide 203 MPEG Audio Frame Sizes BIT - NET 4007 Slide 204 Layer 2 of MPEG-1 Audio ◼ Layer 2 of the MPEG-1 audio codec includes small changes to effect bitrate reduction and quality improvement, at the price of an increase in complexity. BIT - NET 4007 Slide 205 Layer 3 of MPEG-1 Audio ◼ Main differences ◼ Employs a similar filter bank to that used in Layer 2, except using a set of filters with non-equal frequencies. ◼ Takes into account stereo redundancy. ◼ Uses Modified Discrete Cosine Transform (MDCT). BIT - NET 4007 Slide 206 NET 4007 Multimedia Networking Color in Image and Video BIT - NET 4007 Slide 1 Light and Spectrum Light is an electromagnetic wave. Its color is characterized by the wavelength content of the light. Laser light consists of a single wavelength: e.g., a ruby laser produces a bright, scarlet-red beam. Most light sources produce contributions over many wavelengths. However, humans cannot detect all light, just contributions that fall in the “visible wavelengths”. Short wavelengths produce a blue sensation, long wavelengths produce a red one. BIT - NET 4007 Slide 2 Human Vision The eye works like a camera, with the lens focusing an image binary image onto the retina (upside-down and left-right reversed). The retina consists of an array of rods and three kinds of cones. The rods come into play when light levels are low and produce a image in shades of gray (“all cats are gray at night!"). For higher light levels, the cones each produce a signal. Because of their different pigments, the three kinds of cones are most sensitive to red (R), green (G), and blue (B) light. The Blue sensitivity is much smaller than the curves for Red or Green - Blue is a late addition, in evolution. Statistically, Blue is the favorite color of humans, regardless of nationality - perhaps for this reason: Blue is a latecomer and thus is a bit surprising! BIT - NET 4007 Slide 3 Color-Matching Functions The amounts of R, G, and B the subject selects to match each single- wavelength light forms the color-matching curves. BIT - NET 4007 Slide 4 CIE Chromaticity Diagram Since the r ( ) color-matching curve has a negative lobe, a set of fictitious primaries were devised that lead to color-matching functions with only positives values. The matrix is chosen such that the middle standard color-matching function y ( ) exactly equals the luminous-efficiency curve. BIT - NET 4007 Slide 5 Transform XYZ to RGB BIT - NET 4007 Slide 6 Subtractive Color: CMY Color Model So far, we have effectively be dealing only with additive color. Namely, when two light beams impinge on a target, their colors add; when two phosphors on a CRT screen are turned on, their colors add. But for ink deposited on paper, the opposite situation holds: yellow ink subtracts blue from while illumination, but reflects red and green; it appears yellow. BIT - NET 4007 Slide 7 Transformation from RGB to CMY Simplest model we can invert to specify what ink density to lay down on paper, to make a certain desired RGB color: C 1 R M = 1 − G Y 1 B Then the inverse transform is: R 1 C G = 1 − M B 1 Y BIT - NET 4007 Slide 8 CMYK System Undercolor removal: Sharper and cheaper printer colors: calculate that part of the CMY mix that would be black, remove it from the color proportions, and add it back as real black. The new specification of inks is thus: K = min {C, M, Y} C C − K M = M − K Y Y − K BIT - NET 4007 Slide 9 Color Models in Video Largely derive from older analog methods of coding color for TV. Luminance is separated from color information. YIQ is used to transmit TV signals in North America and Japan. In Europe, video tape uses the PAL or SECAM codings, which are based on TV that uses a matrix transform called YUV. Digital video mostly uses a matrix transform called YCbCr that is closely related to YUV. BIT - NET 4007 Slide 10 YUV Color Model YUV codes a luminance signal (for gamma-corrected signals) equal to the luminance. Chrominance refers to the difference between a color and a reference white at the same luminance - use color differences U, V: For gray, R’ = G’ = B’, the luminance Y’ equals to that gray, since 0.299+0.587+0.114 = 1.0. And for a gray (“black and white") image the chrominance (U, V ) is zero. BIT - NET 4007 Slide 11 NET 4007 Multimedia Networking Graphics and Image Data Representations BIT - NET 4007 Slide 12 8-bit Gray-level Images Each pixel has a gray-value between 0 and 255. Each pixel is represented by a single byte; e.g., a dark pixel might have a value of 10, and a bright one might be 230. Bitmap: The two-dimensional array of pixel values that represents the graphics/image data. Image resolution refers to the number of pixels in a digital image (higher resolution always yields better quality). Fairly high resolution for such an image might be 1,600 * 1,200, whereas lower resolution might be 640 * 480. Frame buffer: Hardware used to store bitmap. Video card (actually a graphics card) is used for this purpose. The resolution of the video card does not have to match the desired resolution of the image, but if not enough video card memory is available then the data has to be shifted around in RAM for display. BIT - NET 4007 Slide 13 Dithering Dithering is used to calculate patterns of dots such that values from 0 to 255 correspond to patterns that are more and more filled at darker pixel values, for printing on a 1-bit printer. The main strategy is to replace a pixel value by a larger pattern, say 2*2 or 4*4, such that the number of printed dots approximates the varying- sized disks of ink used in analog, in halftone printing (e.g., for newspaper photos). Half-tone printing is an analog process that uses smaller or larger filled circles of black ink to represent shading, for newspaper printing. For example, if we use a 2 * 2 dither matrix 0 2 3 1 BIT - NET 4007 Slide 14 Dithering We can first re-map image values in 0..255 into the new range 0..4 by (integer) dividing by 256/5. Then, e.g., if the pixel value is 0 we print nothing, in a 2 * 2 area of printer output. But if the pixel value is 4 we print all four dots. The rule is: If the intensity is > the dither matrix entry then print an on dot at that entry location: replace each pixel by an n * n matrix of dots. Note that the image size may be much larger, for a dithered image, since replacing each pixel by a 4 * 4 array of dots, makes an image 16 times as large. BIT - NET 4007 Slide 15 24-bit Color Images In a color 24-bit image, each pixel is represented by three bytes, usually representing RGB. This format supports 256 * 256 * 256 possible combined colors, or a total of 16,777,216 possible colors. However such flexibility does result in a storage penalty: A 640*480 24-bit color image would require 921.6 kB of storage without any compression. An important point: many 24-bit color images are actually stored as 32- bit images, with the extra byte of data for each pixel used to store an alpha value representing special effect information (e.g., transparency). The following figure shows the image forestfire.bmp., a 24-bit image in Microsoft Windows BMP format. Also shown are the grayscale images for just the Red, Green, and Blue channels, for this image. BIT - NET 4007 Slide 16 8-bit Color Images Many systems can make use of 8 bits of color information (the so-called “256 colors”) in producing a screen image. Such image files use the concept of a lookup table to store color information. Basically, the image stores not color, but instead just a set of bytes, each of which is actually an index into a table with 3-byte values that specify the color for a pixel with that lookup table index. Note the great savings in space for 8-bit images, over 24-bit ones: a 640 * 480 8-bit color image only requires 300 kB of storage, compared to 921.6 kB for a color image (again, without any compression applied). BIT - NET 4007 Slide 17 Color Look-up Tables (LUTs) The idea used in 8-bit color images is to store only the index, or code value, for each pixel. Then, e.g., if a pixel stores the value 25, the meaning is to go to row 25 in a color look-up table (LUT). BIT - NET 4007 Slide 18 JPEG JPEG: The most important current standard for image compression. The human vision system has some specific limitations and JPEG takes advantage of these to achieve high rates of compression. JPEG allows the user to set a desired level of quality, or compression ratio (input divided by output). BIT - NET 4007 Slide 19 NET 4007 Multimedia Networking Lossless and Lossy Image Compression BIT - NET 4007 Slide 20 Lossless Image Compression Approaches of Differential Coding of Images Given an original image I (x, y), using a simple difference operator we can define a difference image d (x, y) as follows: d (x, y) = I (x, y) – I (x – 1, y) or use the discrete version of the 2-D Laplacian operator to define a difference image d (x, y) as d (x, y) = 4I (x, y) – I (x – 1, y) – I (x, y + 1) – I (x + 1, y) – I (x, y-1) Due to spatial redundancy existed in normal images I, the difference image d will have a narrower histogram and hence a small entropy. BIT - NET 4007 Slide 21 Lossless Image Compression BIT - NET 4007 Slide 22 Lossless JPEG Lossless JPEG: A special case of the JPEG image compression. The predictive method Forming a differential prediction: a predictor combines the values of up to three neighboring pixels as the predicted value for the current pixel, indicated by “X” in the following figure. The predictor can use any one of the seven schemes listed in the following table. Encoding: The encoder compares the prediction with the actual pixel value at position “X” and encodes the difference using one of the lossless compression techniques we have discussed. e.g., the Huffman coding scheme. BIT - NET 4007 Slide 23 Lossless JPEG BIT - NET 4007 Slide 24 Lossy Image Compression by Transform Coding Rationale: If Y is the result of a linear transform T of the input vector X in such a way that the components of Y are much less correlated, then Y can be coded more efficiently than X. If most information is accurately described by the first few components of a transformed vector, then the remaining components can be coarsely quantized, or even set to zero, with little signal distortion. Discrete Cosine Transform (DCT) will be studied first. In addition, we will examine the Karhunen-Loeve Transform (KLT) which optimally decorrelates the components of the input X. BIT - NET 4007 Slide 25 Spatial Frequency and DCT Spatial frequency indicates how many times pixel values change across an image block. The DCT formalizes this notion with a measure of how much the image contents change in correspondence to the number of cycles of a cosine wave per block The role of the DCT is to decompose the original signal into its DC and AC components; the role of the IDCT is to reconstruct (re-compose) the signal. http://www.sfu.ca/~cjenning/toybox/hjpeg/index.html BIT - NET 4007 Slide 26 Definition of DCT Given an input function f (i, j) over two integer variable i and j (a piece of an image), the 2D DCT transforms it into a new function F (u,v), with integer u and v running over the same range as i and j. The general definition of the transform is: where i, u = 0, 1, …, M-1; j, v = 0, 1, …, N-1; and the constants C (u) and C (v) are determine by BIT - NET 4007 Slide 27 Wavelet Transform Introduction Wavelet: a small wave. Convert a signal into a series of wavelets. The objective of the wavelet transform is to decompose the input signal into components that are easier to deal with, have special interpretations, or have some components that can be thresholded away, for compression purposes. BIT - NET 4007 Slide 28 Haar Wavelet Transform The simplest wavelet transform. Suppose we have the following input sequence: {xn, i} = {10, 13, 25, 26, 29, 21, 7, 15} where i = 0, 1, …, 7 indexes pixels, and n stands for the level. Consider the transform that replaces the original sequence with its pairwise average xn, i, and difference dn-1, i define as follows: Form a new sequence having length equal to that of the original sequence by concatenating the two sequences {xn, i } and {dn-1, i}. The resulting sequence is {xn, i , dn-1, i} = {11.5, 25.5, 25, 11, -1.5, -0.5, 4, -4} BIT - NET 4007 Slide 29 Haar Wavelet Transform Since the first half of the above sequence contain averages from the original sequence, we can view it as a coarser approximation to the original signal. The second half of this sequence can be viewed as the details or approximation errors of the first half. 2-D Haar Transform a b x = c d 1 a + b + c + d a −b+c −d y = 2 a +b −c −d a −b−c + d BIT - NET 4007 Slide 30 2-D Haar Wavelet Transform Apply this transform to a complete image. Group the pixels into 2 x 2 blocks and apply the transform. To view the result sensibly, we group all the top left components of the 2 x 2 blocks in y together to form the top left subimage. BIT - NET 4007 Slide 31 2-D Haar Wavelet Transform It is clear that most of the energy is contained in the top left subimage and the least energy is in the lower right subimage. Note that the top right subimage contains the near-vertical edges and lower left subimage contains the near-horizontal edges. The energies of the subimages and their percentages of the total are: Top left Top right Lower left Lower right 201.73 x 10 6 4.56 x 106 1.89 x 106 0.82 x 106 96.5% 2.2% 0.9% 0.4% Other examples: http://heliso.tripod.com/java_hls/wavelet/haar2d.htm BIT - NET 4007 Slide 32 Wavelet Transform Unlike families of linear transforms (e.g., DCT, DFT) where the same, static set of basis function are used for every input signal, wavelets employ a dynamic set of basis functions that represents the input function in the most efficient way. “The forest & the trees”: -- Notice gross features with a large “window”. -- Notice small features with a small “window”. Wavelets seek to represent a signal with good resolution in both time and frequency. Continuous wavelet transform (CWT) and Discrete wavelet transform (DWT). DWT is used in this course. BIT - NET 4007 Slide 33 Why Wavelet ? Stationary Signal: Signals with frequency content unchanged in time. All frequency components exist at all times. Non-Stationary Signal: Frequency changes in time. BIT - NET 4007 Slide 34 F T vs. Wavelet Transform Fourier Transform: Only gives what frequency components exist in the signal. The time and frequency information can not be seen at the same time. Wavelet Transform: Analyze the signal at different frequencies with different resolutions. Good time resolution and poor frequency resolution at high frequencies. Good frequency resolution and poor time resolution at low frequencies. Many signals in our daily life are non-stationary. We need to know whether and also when an incident was happened. BIT - NET 4007 Slide 35