
JPEG Compression

The acronym JPEG stands for the Joint Photographic Experts Group, a standards committee within the International Organization for Standardization (ISO). ISO formed the Photographic Experts Group (PEG) to research methods of transmitting video, still images, and text over ISDN lines. The goal was to produce a set of industry standards for the transmission of graphics and image data over digital communications networks.

The formats that preceded JPEG fell short of this goal. GIF has a maximum pixel depth of eight bits, for a maximum of 256 colors, and its LZW compression algorithm does not work very well on typical scanned image data, which contains few of the exactly repeated patterns that LZW depends on. Both TIFF and BMP are capable of storing 24-bit data but do not compress image data very well. JPEG provides a compression method that is capable of compressing continuous-tone image data with a pixel depth of 6 to 24 bits with reasonable speed and efficiency. And although JPEG itself does not define a standard image file format, several formats have been invented or modified to fill the need for JPEG data storage.

JPEG in Perspective

JPEG is not a single algorithm. Instead, it may be thought of as a toolkit of image compression methods that may be altered to fit the needs of the user. JPEG may be adjusted to produce very small compressed images that are of relatively poor quality in appearance but still suitable for many applications. Conversely, JPEG is capable of producing very high-quality compressed images that are still far smaller than the original uncompressed data.

JPEG was designed to compress color or gray-scale continuous-tone images of real-world subjects: photographs, video stills, or any complex graphics that resemble natural subjects. Animations, ray-traced scenes, line art, black-and-white documents, and typical vector graphics don't compress very well under JPEG and shouldn't be expected to. JPEG is an excellent way to store 24-bit photographic images, such as those used in imaging and multimedia applications.

The amount of compression achieved depends upon the content of the image data. A typical photographic-quality image may be compressed from 20:1 to 25:1 without experiencing any noticeable degradation in quality. Higher compression ratios will produce image files that differ noticeably from the original image but that still have overall good image quality. Achieving a 20:1 or better compression ratio in many cases not only saves disk space, it also reduces transmission time across data networks and phone lines.

You can "tune" the quality of a JPEG encoder using a parameter called a quality setting or a Q factor. Although different implementations have varying scales of Q factors, a range of 1 to 100 is typical. A factor of 1 produces the smallest, worst-quality image; a factor of 100 produces the largest, best-quality image. The optimal Q factor depends on the image content and is therefore different for every image. The art of JPEG compression is finding the lowest Q factor that produces an image that is visibly acceptable, and preferably as close to the original as possible.

The JPEG library supplied by the Independent JPEG Group uses a quality setting scale of 1 to 100. To find the optimal compression for an image using the JPEG library, follow these steps:
1. Encode the image using a quality setting of 75 (-Q 75).
2. If you observe unacceptable defects in the image, increase the value and re-encode the image.
3. If the image quality is acceptable, decrease the setting until the image quality is barely acceptable. This will be the optimal quality setting for this image.
4. Repeat this process for every image you have (or just encode them all using a quality setting of 75).
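This search is easy to automate. Below is a minimal sketch using the Pillow library's JPEG encoder, whose quality= argument plays the role of the IJG quality setting; looks_acceptable is a hypothetical callback standing in for the human judgment of "visibly acceptable".

    # Sketch: walk the 1-100 quality scale to find the lowest acceptable setting.
    from io import BytesIO
    from PIL import Image

    def smallest_acceptable_quality(path, looks_acceptable, start=75):
        img = Image.open(path).convert("RGB")

        def encode(q):
            buf = BytesIO()
            img.save(buf, format="JPEG", quality=q)  # re-encode at quality q
            return buf.getvalue()

        q = start
        # If the starting quality shows unacceptable defects, increase and re-encode.
        while q < 100 and not looks_acceptable(encode(q)):
            q = min(q + 5, 100)
        # Otherwise decrease until the image is barely acceptable.
        while q - 5 >= 1 and looks_acceptable(encode(q - 5)):
            q -= 5
        return q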
JPEG isn't always an ideal compression solution, for several reasons. JPEG doesn't fit every compression need: images containing large areas of a single color do not compress very well, and JPEG will introduce visible "artifacts" into them. JPEG can also be rather slow when it is implemented only in software; if fast decompression is required, a hardware-based JPEG solution may be needed.

Baseline JPEG

The JPEG specification defines a minimal subset of the standard called baseline JPEG, which all JPEG-aware applications are required to support.

This baseline uses an encoding scheme based on the Discrete Cosine Transform (DCT) to achieve compression. DCT-based encoding algorithms are always lossy by nature, but they are capable of achieving a high degree of compression with only minimal loss of data. The scheme is effective only for compressing continuous-tone images, in which the differences between adjacent pixels are usually small. In practice, JPEG works well only on images with depths of at least four or five bits per color channel. The baseline standard actually specifies eight bits per input sample. Data of lesser bit depth can be handled by scaling it up to eight bits per sample, but the results will be bad for low-bit-depth source data because of the large jumps between adjacent pixel values. For similar reasons, color-mapped source data does not work very well, especially if the image has been dithered.

The JPEG compression scheme is divided into the following stages:
1. Transform the image into an optimal color space.
2. Downsample the chrominance components by averaging groups of pixels together.
3. Apply a Discrete Cosine Transform (DCT) to blocks of pixels, thus removing redundant image data.
4. Quantize each block of DCT coefficients using weighting functions optimized for the human eye.
5. Encode the resulting coefficients (image data) using a Huffman variable word-length algorithm to remove redundancies in the coefficients.

Transform the image

The first step is to convert RGB, HSI, or CMYK data to a luminance/chrominance color model such as YUV or YCbCr. The JPEG algorithm is capable of encoding images that use any type of color space: JPEG itself encodes each component in a color model separately, and it is completely independent of any color-space model, such as RGB, HSI, or CMY. The best compression ratios result, however, if a luminance/chrominance color space such as YUV or YCbCr is used. Most of the visual information to which human eyes are most sensitive is found in the high-frequency, gray-scale luminance component (Y) of the YCbCr color space. The other two chrominance components (Cb and Cr) contain high-frequency color information to which the human eye is less sensitive; most of this information can therefore be discarded. In comparison, the RGB, HSI, and CMY color models spread their useful visual image information evenly across each of their three color components, making the selective discarding of information very difficult: all three color components would need to be encoded at the highest quality, resulting in a poorer compression ratio. Gray-scale images do not have a color space as such and therefore do not require transforming.

Downsample chrominance components

The simplest way of exploiting the eye's lesser sensitivity to chrominance information is simply to use fewer pixels for the chrominance channels. For example, in an image nominally 1000x1000 pixels, we might use a full 1000x1000 luminance pixels but only 500x500 pixels for each chrominance component. In this representation, each chrominance pixel covers the same area as a 2x2 block of luminance pixels. Remarkably, this 50 percent reduction in data volume has almost no effect on the perceived quality of most images. Equivalent savings are not possible with conventional color models such as RGB, because in RGB each color channel carries some luminance information, so any loss of resolution is quite visible. The JPEG standard allows several different choices for the sampling ratios, or relative sizes, of the downsampled channels. The luminance channel is always left at full resolution (1:1 sampling). Typically both chrominance channels are downsampled 2:1 horizontally and either 1:1 or 2:1 vertically, meaning that a chrominance pixel covers the same area as either a 2x1 or a 2x2 block of luminance pixels. JPEG refers to these downsampling processes as 2h1v and 2h2v sampling, respectively.

Another notation commonly used is 4:2:2 sampling for 2h1v and 4:2:0 sampling for 2h2v; this notation derives from television customs. 2h1v sampling is fairly common because it corresponds to NTSC standard TV practice, but it offers less compression than 2h2v sampling, with hardly any gain in perceived quality.
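The two steps just described are easy to see in code. Below is a small numpy sketch, assuming the BT.601/JFIF conversion constants (other variants exist) and 2h2v (4:2:0) averaging; the array sizes mirror the 1000x1000 example above.

    import numpy as np

    def rgb_to_ycbcr(rgb):
        # JFIF-style BT.601 conversion; Y carries the luminance the eye is
        # most sensitive to, Cb/Cr carry the chrominance.
        r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
        y  =  0.299    * r + 0.587    * g + 0.114    * b
        cb = -0.168736 * r - 0.331264 * g + 0.5      * b + 128.0
        cr =  0.5      * r - 0.418688 * g - 0.081312 * b + 128.0
        return y, cb, cr

    def downsample_2h2v(chan):
        # 2h2v (4:2:0) sampling: average each 2x2 block into one chroma pixel.
        h, w = chan.shape
        return chan[:h//2*2, :w//2*2].reshape(h//2, 2, w//2, 2).mean(axis=(1, 3))

    rgb = np.random.randint(0, 256, (1000, 1000, 3)).astype(np.float64)
    y, cb, cr = rgb_to_ycbcr(rgb)       # Y stays 1000x1000 (1:1 sampling)
    cb_small = downsample_2h2v(cb)      # 500x500
    cr_small = downsample_2h2v(cr)      # 500x500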

Apply a Discrete Cosine Transform

The image is divided into 8x8 blocks. From this point on, each color component is processed independently, so a "pixel" means a single value, even in a color image. A DCT is applied to each 8x8 block. The DCT converts the spatial image representation into a frequency map: the low-order or "DC" term represents the average value in the block, while successive higher-order ("AC") terms represent the strength of more and more rapid changes across the width or height of the block. The highest AC term represents the strength of a cosine wave alternating from maximum to minimum at adjacent pixels. The DCT calculation is fairly complex; in fact, this is the most costly step in JPEG compression. The point of doing it is that we have now separated out the high- and low-frequency information present in the image, so we can discard high-frequency data easily without losing low-frequency information. The DCT step itself is lossless except for roundoff errors.

Quantize each block

Quantization divides each DCT output value by a "quantization coefficient" and rounds the result to an integer. The larger the quantization coefficient, the more data is lost. Each of the 64 positions of the DCT output block has its own quantization coefficient, with the higher-order terms being quantized more heavily than the low-order terms. In addition, separate quantization tables are employed for luminance and chrominance data, with the chrominance data being quantized more heavily than the luminance data. This allows JPEG to exploit further the eye's differing sensitivity to luminance and chrominance. The compressor starts from a built-in table that is appropriate for a medium-quality setting and varies it in inverse proportion to the requested quality. The complete quantization tables actually used are recorded in the compressed file so that the decompressor will know how to reconstruct the DCT coefficients.
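Here is a sketch of these two steps on a single 8x8 block, using scipy's DCT-II. The quantization table is the example luminance table from Annex K of the JPEG specification; the subtraction of 128 is the level shift baseline JPEG applies to 8-bit samples.

    import numpy as np
    from scipy.fftpack import dct

    # Example luminance quantization table (JPEG spec, Annex K).
    QTABLE = np.array([
        [16, 11, 10, 16,  24,  40,  51,  61],
        [12, 12, 14, 19,  26,  58,  60,  55],
        [14, 13, 16, 24,  40,  57,  69,  56],
        [14, 17, 22, 29,  51,  87,  80,  62],
        [18, 22, 37, 56,  68, 109, 103,  77],
        [24, 35, 55, 64,  81, 104, 113,  92],
        [49, 64, 78, 87, 103, 121, 120, 101],
        [72, 92, 95, 98, 112, 100, 103,  99]])

    def dct2(block):
        # 2D DCT-II: transform the rows, then the columns.
        return dct(dct(block, norm='ortho', axis=0), norm='ortho', axis=1)

    block = np.random.randint(0, 256, (8, 8)).astype(np.float64)
    coeffs = dct2(block - 128.0)           # coeffs[0, 0] is the DC term
    quantized = np.round(coeffs / QTABLE)  # this rounding is where data is lost
    # A decoder reverses this with quantized * QTABLE followed by an inverse DCT.

Note how the large values toward the lower right of QTABLE quantize the high-order AC terms far more heavily than the DC and low-order terms.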
Encode the resulting coefficients

1. The resulting coefficients contain a significant amount of redundant data.
2. Huffman compression is applied to losslessly remove these redundancies.
3. The standard also allows arithmetic encoding to be used in place of Huffman coding for an even greater compression ratio.
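To make "Huffman variable word-length algorithm" concrete, here is a generic Huffman code builder over quantized coefficient values. It illustrates the principle only; baseline JPEG actually codes (run-length, size) symbols against standardized table layouts, which this sketch does not reproduce.

    import heapq
    from collections import Counter

    def huffman_code(symbols):
        # Build a prefix code: frequent symbols receive shorter bit strings.
        freq = Counter(symbols)
        heap = [(f, i, {s: ""}) for i, (s, f) in enumerate(freq.items())]
        heapq.heapify(heap)
        tie = len(heap)
        while len(heap) > 1:
            f1, _, c1 = heapq.heappop(heap)   # two least-frequent subtrees
            f2, _, c2 = heapq.heappop(heap)
            merged = {s: "0" + bits for s, bits in c1.items()}
            merged.update({s: "1" + bits for s, bits in c2.items()})
            heapq.heappush(heap, (f1 + f2, tie, merged))
            tie += 1
        return {s: bits or "0" for s, bits in heap[0][2].items()}

    # Quantized DCT output is dominated by zeros, so zero gets the shortest code.
    coeffs = [0] * 40 + [1] * 10 + [-1] * 8 + [2] * 4 + [5, -3]
    codes = huffman_code(coeffs)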

JPEG Extensions (Part 1)

A number of extensions have been defined in Part 1 of the JPEG specification that provide progressive image buildup, improved compression ratios using arithmetic encoding, and a lossless compression scheme. These features are beyond the needs of most JPEG implementations and have therefore been defined as "not required to be supported" extensions to the JPEG standard.

1. Progressive image buildup
a. With baseline JPEG, an image is displayed a few scan lines at a time as it is decoded; with progressive buildup, you instead first see a crude but recognizable rendering of the whole image.
b. Progressive buildup allows an image to be sent in layers rather than scan lines.
c. Instead of transmitting each bitplane or color channel in sequence, a succession of images built up from approximations of the original image is sent.
d. The first scan provides a low-accuracy representation of the entire image, in effect a very low-quality JPEG compressed image. Subsequent scans gradually refine the image by increasing the effective quality factor.
e. A limitation of progressive JPEG is that each scan takes essentially a full JPEG decompression cycle to display, so a very fast JPEG decoder is needed.

2. Arithmetic encoding
a. The baseline JPEG standard defines Huffman compression as the final step in the encoding process. A JPEG extension replaces the Huffman engine with a binary arithmetic entropy encoder.
b. The arithmetic coder reduces the resulting size of the JPEG data by a further 10 to 15 percent.
c. This matters most in implementations where enormous quantities of JPEG images are archived.
d. Arithmetic encoding has several drawbacks:
i. It is slower in both encoding and decoding than Huffman coding.
ii. The arithmetic coder used by JPEG (called a Q-coder) is owned by IBM and AT&T; you must obtain a license from the appropriate vendors if their Q-coders are to be used as the back end of your JPEG implementation.

3. Lossless JPEG compression
a. No matter what the value of the Q factor, baseline JPEG never becomes lossless.
b. Baseline JPEG is a lossy method of compression regardless of any adjustments you may make in the parameters, because DCT-based encoders are always lossy due to roundoff errors.
c. At best, you can suppress information loss in the downsampling and quantization steps.
d. The JPEG standard does, however, offer a separate lossless mode. This mode has nothing in common with the regular DCT-based algorithms, and it is currently implemented only in a few commercial applications.
e. Lossless JPEG is a form of predictive lossless coding using a 2D Differential Pulse Code Modulation (DPCM) scheme. The basic premise is that the values of up to three neighboring pixels are combined to form a predictor value, which is then subtracted from the original pixel value. When the entire bitmap has been processed, the resulting prediction errors are compressed using either the Huffman or the binary arithmetic entropy encoding methods described in the JPEG standard. A toy sketch of the predictor follows this list.
f. Lossless JPEG works on images with 2 to 16 bits per pixel, but performs best on images with 6 or more bits per pixel. For such images, the typical compression ratio achieved is 2:1. For image data with fewer bits per pixel, other compression schemes perform better.
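A toy numpy sketch of the 2D DPCM idea, using the a + b - c predictor (left, above, and upper-left neighbors), which is one of the seven selectable predictors defined by lossless JPEG; border handling and predictor selection are simplified here.

    import numpy as np

    def dpcm_residuals(img):
        # Predict each pixel from its neighbors and keep the prediction error;
        # the errors are what get entropy coded (Huffman or arithmetic).
        img = img.astype(np.int32)
        pred = np.zeros_like(img)
        pred[1:, 1:] = img[1:, :-1] + img[:-1, 1:] - img[:-1, :-1]  # a + b - c
        pred[0, 1:] = img[0, :-1]   # first row: only the left neighbor exists
        pred[1:, 0] = img[:-1, 0]   # first column: only the neighbor above
        return img - pred           # smooth images yield small residuals

On continuous-tone images the residuals cluster tightly around zero, which is what makes the subsequent entropy coding effective.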

JPEG Extensions (Part 3)

1. Variable quantization
a. The quantization process in JPEG quantizes each of the 64 DCT coefficients using a corresponding value from a quantization table. Quantization values may be redefined prior to the start of a scan but must not be changed once they are within a scan of the compressed data stream.
b. Variable quantization allows the scaling of quantization values within the compressed data stream. Quantization values may then be located and changed as needed.
c. The characteristics of an image can thus be changed to control the quality of the output based on a given model.
d. The decoder can constantly adjust during decoding to provide optimal output.
e. The variable quantization extension also allows JPEG to store image data originally encoded using a variable quantization scheme, such as MPEG. For MPEG data to be accurately transcoded into another format, the other format must support variable quantization to maintain a high compression ratio. This extension allows JPEG to support a data stream originally derived from a variably quantized source, such as an MPEG I-frame.

2. Selective refinement
a. Selective refinement is used to select a region of an image for further enhancement.
b. JPEG supports three types of selective refinement: hierarchical, progressive, and component.
i. Hierarchical selective refinement allows a region of a frame to be refined by the next differential frame of a hierarchical sequence.
ii. Progressive selective refinement allows a region of a frame to be refined by successive progressive scans.
iii. Component selective refinement may be used in any mode of operation. It allows a region of a frame to contain fewer colors than are defined in the frame header.

3. Image tiling
a. Tiling is used to divide a single image into two or more smaller subimages.
b. Tiling allows easier buffering of the image data in memory, quicker random access of the image data on disk, and the storage of images larger than 64Kx64K samples in size.
c. JPEG supports three types of tiling: simple, pyramidal, and composite.
d. Simple tiling divides an image into two or more fixed-size tiles that are contiguous and non-overlapping (see the sketch after this list).
e. Pyramidal tiling also divides the image into tiles, but each tile is additionally stored at several different levels of resolution; a JTIP image stores successive layers of the same image at different resolutions, each called an imagette.
f. Composite tiling allows multiple-resolution versions of images to be stored and displayed as a mosaic.

SPIFF (Still Picture Interchange File Format) is intended to replace the de facto JFIF (JPEG File Interchange Format) format in use today.
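A minimal sketch of simple tiling with numpy; the tile size is illustrative, and edge tiles simply come out smaller when the image dimensions are not multiples of it.

    import numpy as np

    def simple_tiles(img, th=256, tw=256):
        # Fixed-size, contiguous, non-overlapping tiles in row-major order.
        h, w = img.shape[:2]
        return [img[r:r + th, c:c + tw]
                for r in range(0, h, th)
                for c in range(0, w, tw)]

Each tile can then be compressed and accessed independently, which is what enables the buffering and random-access benefits listed above.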

MPEG Video Compression


MPEG is at the heart of digital television set-top boxes, DSS, HDTV decoders, DVD players, video conferencing, Internet video, and other applications. MPEG compression requires less storage space for archived video information and less bandwidth for the transmission of that video information.

MPEG Video Basics

The acronym MPEG stands for Moving Picture Experts Group, which generated its specifications under ISO, the International Organization for Standardization, and IEC, the International Electrotechnical Commission. "MPEG video" at present consists of three standards: MPEG-1, MPEG-2, and MPEG-4. The MPEG-1 and MPEG-2 standards are similar, being based on motion-compensated, block-based transform coding techniques, while MPEG-4 deviates from these more traditional approaches in its use of software image construct descriptors, targeting bit-rates in the very low range, below 64 Kb/sec. Because MPEG-1 and MPEG-2 are finalized standards that are both being utilized in a large number of applications, this paper concentrates on compression techniques relating only to these two standards.

Note that there is no reference to MPEG-3. It was originally anticipated that this standard would address HDTV applications, but it was found that minor extensions to the MPEG-2 standard would suffice for this higher bit-rate, higher-resolution application, so work on a separate MPEG-3 standard was abandoned.

MPEG-1 was finalized in 1991 and was originally optimized to work at video resolutions of 352x240 pixels at 30 frames/sec (NTSC based) or 352x288 pixels at 25 frames/sec (PAL based), commonly referred to as Source Input Format (SIF) video. It is often mistakenly thought that MPEG-1 resolution is limited to the above sizes, but it may in fact go as high as 4095x4095 at 60 frames/sec. The bit-rate is optimized for applications of around 1.5 Mb/sec, but again can be used at higher rates if required. MPEG-1 is defined for progressive frames only and has no direct provision for interlaced video applications, such as broadcast television.

MPEG-2 was finalized in 1994 and addressed issues directly related to digital television broadcasting, such as the efficient coding of field-interlaced video and scalability. The target bit-rate was also raised to between 4 and 9 Mb/sec, resulting in potentially very high quality video. MPEG-2 consists of profiles and levels. The profile defines the bitstream scalability and the color-space resolution, while the level defines the image resolution and the maximum bit-rate per profile. Probably the most common descriptor in use currently is Main Profile, Main Level (MP@ML), which refers to 720x480 resolution video at 30 frames/sec, at bit-rates up to 15 Mb/sec, for NTSC video. Another example is the HDTV resolution of 1920x1080 pixels at 30 frames/sec, at a bit-rate of up to 80 Mb/sec; this is an example of the Main Profile, High Level (MP@HL) descriptor. A complete table of the various legal combinations can be found in reference [2].
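The motion-compensated coding mentioned above rests on block matching: for each block of the current frame, the encoder searches a reference frame for the best match and transmits a motion vector plus a much smaller residual. Below is a toy exhaustive-search sketch using the sum of absolute differences (SAD); the 16x16 block size matches MPEG macroblocks, but the search range here is arbitrary.

    import numpy as np

    def best_motion_vector(ref, cur, by, bx, n=16, search=7):
        # Find where the n x n block at (by, bx) in the current frame best
        # matches the reference frame, by minimum SAD over a +/- search window.
        block = cur[by:by + n, bx:bx + n].astype(np.int32)
        best_sad, best_mv = None, (0, 0)
        for dy in range(-search, search + 1):
            for dx in range(-search, search + 1):
                y, x = by + dy, bx + dx
                if y < 0 or x < 0 or y + n > ref.shape[0] or x + n > ref.shape[1]:
                    continue  # candidate block falls outside the frame
                sad = int(np.abs(ref[y:y + n, x:x + n].astype(np.int32) - block).sum())
                if best_sad is None or sad < best_sad:
                    best_sad, best_mv = sad, (dy, dx)
        return best_mv  # transmitted along with the prediction residual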
