H. 261
H. 263
MPEG
Newer MPEG Standards
Reference: Chapter 6 of Steinmetz and Nahrstedt
- Uncompressed video data are huge. In HDTV, the bit-rate
could exceed 1 Gbps. --> big problems for storage and network
communications.
- We will discuss both Spatial and Temporal Redundancy
Removal -- Intra-frame and Inter-frame coding.
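To put a rough number on the HDTV claim above, the raw bit-rate is just width x height x frame rate x bits per pixel. The sketch below assumes a 1920 x 1080 picture at 30 frames per second with 24 bits per pixel; these parameters are illustrative assumptions, not values given in the notes.

```python
def uncompressed_bitrate(width, height, fps, bits_per_pixel):
    """Raw bit-rate in bits per second for uncompressed video."""
    return width * height * fps * bits_per_pixel

# Assumed HDTV parameters (not from the notes): 1920 x 1080 pixels,
# 30 frames per second, 24 bits per pixel (no chroma subsampling).
hdtv_bps = uncompressed_bitrate(1920, 1080, 30, 24)
print(f"{hdtv_bps / 1e9:.2f} Gbps")  # prints "1.49 Gbps"
```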
1. Overview of H. 261
- Frame Sequence
- Frame formats are CCIR 601 CIF (352 x 288) and QCIF (176 x 144)
images with 4:2:0 subsampling.
- Two frame types: Intra-frames (I-frames) and Inter-frames
(P-frames):
I-frames provide an access point; they are coded basically as in JPEG.
P-frames use "pseudo-differences" from the previous
frame ("predicted"), so frames depend on each other.
2. Intra-frame Coding
3. Inter-frame (P-frame) Coding
- A Coding Example (P-frame)
- Previous image is called reference image,
the image to encode is called target image.
- Points to emphasize:
- The difference image (not the target image itself) is encoded.
- Need to use the decoded image as reference image, not
the original.
- We use "Mean Absolute Error" (MAE) = sum(|E|)/N to decide the best block,
where E is the per-pixel error and N is the number of pixels in the block.
Can also use "Mean Squared Error" (MSE) = sum(E*E)/N
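The two matching criteria can be sketched as follows. Blocks are represented as lists of rows of pixel values; the function names and that representation are assumptions made for this illustration.

```python
def mae(target_block, candidate_block):
    """Mean Absolute Error: sum(|E|) / N over all pixels of the block."""
    errors = [t - c
              for t_row, c_row in zip(target_block, candidate_block)
              for t, c in zip(t_row, c_row)]
    return sum(abs(e) for e in errors) / len(errors)

def mse(target_block, candidate_block):
    """Mean Squared Error: sum(E*E) / N over all pixels of the block."""
    errors = [t - c
              for t_row, c_row in zip(target_block, candidate_block)
              for t, c in zip(t_row, c_row)]
    return sum(e * e for e in errors) / len(errors)

target = [[10, 12], [14, 16]]
candidate = [[11, 12], [10, 16]]
print(mae(target, candidate))  # prints 1.25
print(mse(target, candidate))  # prints 4.25
```

MSE penalizes a few large errors more heavily than MAE does, while MAE avoids the multiplications.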
4. H. 261 Encoder
- "Control" -- controlling the bit-rate. If the transmission
buffer is too full, then bit-rate will be reduced by changing the
quantization factors.
- "memory" -- used to store the reconstructed image (blocks) for the
purpose of motion vector search for the next P-frame.
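The "Control" feedback can be pictured as a loop that makes quantization coarser when the transmission buffer fills up and finer when it drains. The sketch below is only one plausible policy: the thresholds, the step of one per update, and the quantizer range of 1 to 31 are assumptions for illustration, not the rule used by any particular encoder.

```python
def next_quant(quant, buffer_bits, buffer_size, low=0.3, high=0.7):
    """Adjust the quantization factor from the buffer fullness.

    low / high are hypothetical fullness thresholds; 1..31 is the
    assumed legal range of the quantization factor.
    """
    fullness = buffer_bits / buffer_size
    if fullness > high:
        quant += 1      # buffer too full: coarser quantization, fewer bits
    elif fullness < low:
        quant -= 1      # buffer nearly empty: finer quantization, more bits
    return max(1, min(31, quant))

print(next_quant(10, 900, 1000))  # prints 11
print(next_quant(10, 100, 1000))  # prints 9
```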
5. Methods for Motion Vector Searches
5.1 Full Search Method
Sequentially search the whole [-p, p] region --> very slow
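A minimal sketch of the full search, assuming pictures are lists of rows of pixel values and MAE is the matching criterion. The function names and the rule of skipping candidates that fall outside the reference picture are choices made for this illustration.

```python
def block_mae(target, reference, tx, ty, rx, ry, n):
    """MAE between the n x n target block at (tx, ty) and the reference block at (rx, ry)."""
    total = 0
    for j in range(n):
        for i in range(n):
            total += abs(target[ty + j][tx + i] - reference[ry + j][rx + i])
    return total / (n * n)

def full_search(target, reference, tx, ty, n, p):
    """Try every displacement in [-p, p] x [-p, p]; return (dx, dy, mae) of the best."""
    height, width = len(reference), len(reference[0])
    best = None
    for dy in range(-p, p + 1):
        for dx in range(-p, p + 1):
            rx, ry = tx + dx, ty + dy
            if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
                continue  # candidate block lies outside the reference picture
            err = block_mae(target, reference, tx, ty, rx, ry, n)
            if best is None or err < best[2]:
                best = (dx, dy, err)
    return best

# Synthetic test picture: every pixel has a unique value, and the target
# is the reference shifted so that the true motion vector is (2, 1).
W = 16
reference = [[y * W + x for x in range(W)] for y in range(W)]
target = [[reference[min(y + 1, W - 1)][min(x + 2, W - 1)] for x in range(W)]
          for y in range(W)]
print(full_search(target, reference, 6, 6, 4, 3))  # prints (2, 1, 0.0)
```

The cost explains why this is slow: (2p + 1)^2 candidate positions, each needing n x n pixel comparisons, for every macroblock.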
5.2 Two-Dimensional Logarithmic Search
5.3 Hierarchical Motion Estimation
- Form several low-resolution versions of the target and reference pictures.
- Find the best-match motion vector in the lowest-resolution version.
- Refine the motion vector level by level when going up to higher resolutions.
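The three steps above can be sketched as follows. This is an illustration under stated assumptions, not a reference implementation: each level halves the resolution by averaging 2 x 2 pixels, the coarse search is a small full search, and each refinement looks only at +/- 1 pixel around the doubled vector. All function names are hypothetical.

```python
def downsample(img):
    """Halve the resolution by averaging each 2 x 2 neighbourhood."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [[(img[2 * y][2 * x] + img[2 * y][2 * x + 1] +
              img[2 * y + 1][2 * x] + img[2 * y + 1][2 * x + 1]) / 4
             for x in range(w)] for y in range(h)]

def block_mae(target, reference, tx, ty, rx, ry, n):
    total = 0
    for j in range(n):
        for i in range(n):
            total += abs(target[ty + j][tx + i] - reference[ry + j][rx + i])
    return total / (n * n)

def search_around(target, reference, tx, ty, n, cx, cy, r):
    """Best (dx, dy, mae) among vectors within +/- r of the candidate (cx, cy)."""
    h, w = len(reference), len(reference[0])
    best = None
    for dy in range(cy - r, cy + r + 1):
        for dx in range(cx - r, cx + r + 1):
            rx, ry = tx + dx, ty + dy
            if rx < 0 or ry < 0 or rx + n > w or ry + n > h:
                continue  # outside the reference picture
            err = block_mae(target, reference, tx, ty, rx, ry, n)
            if best is None or err < best[2]:
                best = (dx, dy, err)
    return best

def hierarchical_search(target, reference, tx, ty, n, p, levels=3):
    # Step 1: build the pyramids; index 0 is full resolution.
    pyr_t, pyr_r = [target], [reference]
    for _ in range(levels - 1):
        pyr_t.append(downsample(pyr_t[-1]))
        pyr_r.append(downsample(pyr_r[-1]))
    # Step 2: search the lowest resolution over the scaled-down range.
    scale = 2 ** (levels - 1)
    dx, dy, err = search_around(pyr_t[-1], pyr_r[-1], tx // scale, ty // scale,
                                max(n // scale, 1), 0, 0, max(p // scale, 1))
    # Step 3: double the vector and refine by +/- 1 at each finer level.
    for level in range(levels - 2, -1, -1):
        scale = 2 ** level
        dx, dy, err = search_around(pyr_t[level], pyr_r[level], tx // scale,
                                    ty // scale, max(n // scale, 1),
                                    2 * dx, 2 * dy, 1)
    return dx, dy, err

# Synthetic picture whose true motion vector is (4, 4).
W = 32
reference = [[y * W + x for x in range(W)] for y in range(W)]
target = [[reference[min(y + 4, W - 1)][min(x + 4, W - 1)] for x in range(W)]
          for y in range(W)]
print(hierarchical_search(target, reference, 8, 8, 8, 8))  # prints (4, 4, 0.0)
```

Because most candidates are examined on small pictures, far fewer pixel comparisons are needed than in a full search over [-p, p], at the risk of locking onto a wrong match at the coarse level.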
6. Some Important Issues
- Avoiding propagation of errors
- Send an I-frame every once in a while
- Make sure you use decoded frame for comparison
- Bit-rate control
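Why the decoded frame must be used for comparison can be seen in a one-pixel toy example. The sketch below follows a single pixel over five frames with a plain uniform quantizer on the difference (no DCT); the first frame is assumed to be sent exactly, like an I-frame. The function names and the quantizer are assumptions for illustration only.

```python
def quantize(diff, step):
    """Uniform quantizer: index of the nearest multiple of step."""
    return (diff + step // 2) // step

def decoder_output(samples, step, closed_loop):
    """Values the decoder reconstructs for one pixel over time.

    closed_loop=True : the encoder predicts from the decoded value.
    closed_loop=False: the encoder predicts from the original value,
                       which the decoder never sees.
    """
    recon = [samples[0]]  # first frame assumed to arrive exactly
    for t in range(1, len(samples)):
        encoder_ref = recon[-1] if closed_loop else samples[t - 1]
        level = quantize(samples[t] - encoder_ref, step)
        recon.append(recon[-1] + level * step)  # decoder only has recon
    return recon

samples = [100, 103, 106, 109, 112]
print(decoder_output(samples, 8, closed_loop=True))   # prints [100, 100, 108, 108, 116]
print(decoder_output(samples, 8, closed_loop=False))  # prints [100, 100, 100, 100, 100]
```

With the decoded value as reference, the error stays within half a quantization step; with the original as reference, the quantization errors accumulate from frame to frame until the next I-frame resets them.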
7. Details
7.1 How the Macroblock is Coded?
- Many macroblocks will be exact matches (or close enough).
So send address of each block in image --> Addr
- Sometimes no good match can be found, so send INTRA block --> Type
- Will want to vary the quantization to fine tune compression,
so send quantization value --> Quant
- Motion vector --> vector
- Some blocks in macroblock will match well,
others match poorly. So send bitmask indicating which blocks are
present (Coded Block Pattern, or CBP).
- Send the blocks (4 Y, 1 Cr, 1 Cb) as in JPEG.
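The Coded Block Pattern idea can be sketched as a six-bit mask with one bit per block of the macroblock, set when that block has at least one non-zero quantized coefficient. The bit order used here (first block in the most significant bit) and the function name are assumptions for illustration.

```python
def coded_block_pattern(blocks):
    """Build a 6-bit mask from the six blocks of a macroblock.

    blocks: six lists of quantized coefficients.
    A bit is 1 when the corresponding block must be sent.
    Assumed bit order: first block -> most significant bit.
    """
    cbp = 0
    for block in blocks:
        coded = any(c != 0 for c in block)
        cbp = (cbp << 1) | int(coded)
    return cbp

empty = [0] * 64
blocks = [empty, [3] + [0] * 63, empty, empty, empty, [1] + [0] * 63]
print(format(coded_block_pattern(blocks), "06b"))  # prints 010001
```

Only the two blocks whose bits are set would then be transmitted; the other four cost no coefficient data at all.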
7.2 H. 261 Bitstream Structure
- Need to delineate boundaries between pictures, so send Picture Start Code --> PSC
- Need timestamp for picture (used later for audio synchronization), so send Temporal Reference --> TR
- Is this a P-frame or an I-frame? Send Picture Type --> PType
- Picture is divided into regions of 11 x 3 macroblocks called
Groups of Blocks --> GOB
- Might want to skip whole groups, so send Group Number (Grp #)
- Might want to use one quantization value for whole group, so send Group Quantization Value --> GQuant
- Overall, the bitstream is designed so we can skip data whenever possible while remaining unambiguous.
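As a quick check of the picture layout, the sketch below counts macroblocks and Groups of Blocks for the two H. 261 formats. It assumes a macroblock covers 16 x 16 luminance pixels, which follows from the 4 Y blocks per macroblock if each block is 8 x 8 as in JPEG; the function name is hypothetical.

```python
MB_SIZE = 16               # assumed: macroblock = 16 x 16 luminance pixels
GOB_COLS, GOB_ROWS = 11, 3  # a GOB is 11 x 3 macroblocks

def gob_layout(width, height):
    """Return (macroblocks per picture, GOBs per picture)."""
    mb_cols, mb_rows = width // MB_SIZE, height // MB_SIZE
    gobs = (mb_cols // GOB_COLS) * (mb_rows // GOB_ROWS)
    return mb_cols * mb_rows, gobs

print(gob_layout(352, 288))  # CIF:  prints (396, 12)
print(gob_layout(176, 144))  # QCIF: prints (99, 3)
```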
- H. 263 is an improved standard for low bit-rate video, adopted
in March 1996.
Like H. 261, it uses transform coding for intra-frames
and predictive coding for inter-frames.
- Advanced Options:
- Half-pixel precision in motion compensation
- Unrestricted motion vectors
- Syntax-based arithmetic coding
- Advanced prediction and PB-frames
- In addition to CIF and QCIF, H. 263 also supports SQCIF, 4CIF,
and 16CIF.
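Half-pixel precision means a motion vector may point between pixels of the reference picture, so the predicted value has to be interpolated. A minimal sketch, assuming bilinear averaging of the neighbouring integer pixels with round-to-nearest; the exact rounding rule and the function name are assumptions here, and the coordinates are given in half-pixel units.

```python
def half_pel_sample(ref, x2, y2):
    """Reference value at (x2 / 2, y2 / 2), coordinates in half-pixel units.

    Integer positions return the pixel itself; half positions return the
    rounded average of the 2 or 4 surrounding pixels. Assumes non-negative
    coordinates and that the neighbouring pixels exist.
    """
    x0, y0 = x2 // 2, y2 // 2
    x1, y1 = x0 + (x2 % 2), y0 + (y2 % 2)
    total = ref[y0][x0] + ref[y0][x1] + ref[y1][x0] + ref[y1][x1]
    return (total + 2) // 4

ref = [[10, 20],
       [30, 40]]
print(half_pel_sample(ref, 0, 0))  # integer position:  prints 10
print(half_pel_sample(ref, 1, 0))  # horizontal half:   prints 15
print(half_pel_sample(ref, 0, 1))  # vertical half:     prints 20
print(half_pel_sample(ref, 1, 1))  # centre of 4:       prints 25
```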
The following is a summary of video formats supported by
H. 261 and H. 263:
Video Formats Supported
------------------------------------------------------------------------------------------
Video    Luminance     Chrominance   H.261      H.263      Uncompressed bit-rate    BPPmax
format   resolution    resolution    support    support    (Mbit/s, 30 fps)         (Kb)
                                                           B/W        Color
------------------------------------------------------------------------------------------
SQCIF    128 x 96      64 x 48       n/a        Required   3.0        4.4           64
QCIF     176 x 144     88 x 72       Required   Required   6.1        9.1           64
CIF      352 x 288     176 x 144     Optional   Optional   24.3       36.5          256
4CIF     704 x 576     352 x 288     n/a        Optional   97.3       146.0         512
16CIF    1408 x 1152   704 x 576     n/a        Optional   389.3      583.9         1024
------------------------------------------------------------------------------------------
BPPmax = maximum bits allowed per picture.
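The uncompressed bit-rate columns can be reproduced from the resolutions alone. The sketch below assumes 8 bits per sample (not stated in the notes) and counts one luminance plane for B/W plus two chrominance planes at the listed chrominance resolution for color.

```python
# Luminance (width, height) for each format in the table.
FORMATS = {
    "SQCIF": (128, 96),
    "QCIF": (176, 144),
    "CIF": (352, 288),
    "4CIF": (704, 576),
    "16CIF": (1408, 1152),
}

def raw_mbps(width, height, color, fps=30, bits_per_sample=8):
    """Uncompressed bit-rate in Mbit/s (assumes 8 bits per sample)."""
    samples = width * height  # luminance samples per picture
    if color:
        # two chrominance planes, each half the width and half the height
        samples += 2 * (width // 2) * (height // 2)
    return samples * bits_per_sample * fps / 1e6

for name, (w, h) in FORMATS.items():
    print(f"{name:6s} B/W {raw_mbps(w, h, False):6.1f}   Color {raw_mbps(w, h, True):6.1f}")
```

The results agree with the table to within rounding (SQCIF B/W computes to about 2.95 Mbit/s, listed as 3.0). Color costs exactly 1.5 times B/W, because the two chrominance planes together hold half as many samples as the luminance plane.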
1. What is MPEG?
- "Moving Picture Experts Group", established in 1988 to
create standards for delivery of video and audio.
- MPEG-1 Target: VHS quality on a CD-ROM or Video CD (VCD)
(352 x 240 + CD audio @ 1.5 Mbits/sec)
- Standard had three parts:
Video, Audio, and System (control interleaving of streams)
2. MPEG Video
- Problem: some macroblocks need information not in the previous
reference frame.
Example: The darkened macroblock in Current frame does not have a
good match from the Previous frame, but it will find a good match
in the Next frame.
- MPEG solution: add third frame type: bidirectional frame, or B-frame
In B-frames, search for matching macroblocks in both past and
future frames.
- Typical pattern is IBBPBBPBB IBBPBBPBB IBBPBBPBB
Actual pattern is up to encoder, and need not be regular.
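Because a B-frame refers to a future frame, that future reference has to reach the decoder before the B-frames that depend on it, so the coding order differs from the display order. The notes do not spell this reordering out; the sketch below is an illustration with a hypothetical function name that holds each B-frame back until the next I- or P-frame has been emitted.

```python
def coding_order(display_types):
    """Map a display-order string such as "IBBPBBP" to coding order.

    Returns the display indices in the order the frames would be coded.
    """
    order, pending_b = [], []
    for index, frame_type in enumerate(display_types):
        if frame_type == "B":
            pending_b.append(index)   # wait for the future reference
        else:
            order.append(index)       # I- or P-frame goes first
            order.extend(pending_b)   # then the B-frames that needed it
            pending_b = []
    order.extend(pending_b)  # trailing B-frames with no later reference here
    return order

display = "IBBPBBP"
order = coding_order(display)
print(order)                                # prints [0, 3, 1, 2, 6, 4, 5]
print("".join(display[i] for i in order))   # prints IPBBPBB
```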
3. Differences from H. 261
4. MPEG Video Bitstream
5. Decoding MPEG Video in Software
Unlike MPEG-1, which is basically a standard for storing and playing
video on a single computer at low bit-rates, MPEG-2 is a standard for
digital TV. It meets the requirements for HDTV and
DVD (Digital Video/Versatile Disc).
MPEG-2 Level Table:
--------------------------------------------------------------------------------------
Level       Max           Max   Max          Max Coded          Application
            Resolution    fps   Pixels/sec   Data Rate (Mb/s)
--------------------------------------------------------------------------------------
Low         352 x 288     30    3 M          4                  consumer tape equiv.
Main        720 x 576     30    10 M         15                 studio TV
High 1440   1440 x 1152   60    47 M         60                 consumer HDTV
High        1920 x 1152   60    63 M         80                 film production
--------------------------------------------------------------------------------------
- Other Differences from MPEG-1:
- Supports both field prediction and frame prediction.
- Besides 4:2:0, also allows 4:2:2 and 4:4:4 chroma subsampling
- Scalable Coding Extensions: (so the same set of signals works for
both HDTV and standard TV)
- SNR (quality) Scalability -- similar to JPEG DCT-based
Progressive mode, adjusting the quantization steps of the DCT
coefficients.
- Spatial Scalability -- similar to hierarchical JPEG,
multiple spatial resolutions.
- Temporal Scalability -- different frame rates.
- Frame sizes could be as large as 16383 x 16383
- Non-linear macroblock quantization factor
- Many minor fixes (see MPEG FAQ for more details)
- MPEG-3: Originally planned for HDTV, got folded into MPEG-2
- MPEG-4: Version 1 approved Oct. 1998, Version 2 to be approved Dec. 1999.
- Originally targeted at very low bit-rate communication
(4.8 to 64 Kb/sec), it now aims at the following ranges of bit-rates:
- video -- 5 Kb to 10 Mb per second
- audio -- 2 Kb to 64 Kb per second
- It emphasizes the concept of Visual Objects
--> Video Object Plane (VOP)
- objects can be of arbitrary shape, VOPs can be
non-overlapped or overlapped
- supports content-based scalability
- supports object-based interactivity
- individual audio channels can be associated with objects
- Good for video composition, segmentation, and compression;
networked VRML, audiovisual communication systems (e.g., text-to-speech
interface, facial animation), etc.
- Standards being developed for shape coding, motion coding,
texture coding, etc.
- MPEG-7: International Standard due by July 2001.
- MPEG-7 is a content representation standard for multimedia
information search, filtering, management and processing.
- Descriptors for multimedia objects
and Description Schemes for the descriptors and their
relationships.
- A Description Definition Language (DDL) for
specifying Description Schemes.
- For visual content, the lower-level descriptions will be color,
texture, shape, size, etc.; the higher level could include a semantic
description such as "this is a scene with cars on the highway".
Further Exploration
- The MPEG Home Page
- A good MPEG FAQ
- MPEG Resources on the Web