The Microsoft VC-1 encoder is used to create video elementary streams. It can be used in conjunction
with the group profile to create an ASF file that includes both video and audio, or a live stream. It can also
be used to create a Blu-ray compatible VC-1 file for use in a Blu-ray authoring program.
General
For most encodes, the Display frame size settings should match the height and width of the Encode
frame size. Specifying a different value for display resolution than for encoded resolution can help to
enable some encoding scenarios. For example, video may be scaled and encoded at a low resolution but
displayed at full resolution. This can reduce network bandwidth usage when live streaming content.
Frame Rate
Set the frame rate by choosing an option from the list, or specifying Custom and entering a frame rate.
Enabling this setting ensures that the target frame rate is maintained, even in cases where there are
insufficient bits to be encoded, by inserting an explicit skipped frame flag into the bit stream. The skipped
frame flag is an indication that the current frame is a visual duplicate of the previous frame. When
disabled, the encoder extends the duration of the previous frame to compensate for a dropped frame.
For IPTV operation you must enable this setting, so that frame-based metadata such as closed captioning
can be applied to a frame.
VC-1 Profile
VC-1 supports 3 profiles, and each profile supports specific features, bit rates, and resolutions. Once you
have set a profile, the level will be set automatically by the encoder, based on your other settings.
You should select a profile based on the requirements of your decoder or playback device.
Field/Frame Mode
This setting is only supported for Advanced Profile.
Use this to indicate whether the source video is progressive, interlaced with top field first, or interlaced
with bottom field first.
Note that choosing Progressive will NOT deinterlace the video. If you need to deinterlace the
video, use either the Digital Rapids hardware deinterlacer or the software deinterlacer plugin.
Complexity Level
There are 6 complexity settings, ranging from Fastest (best performance) to Extreme (best quality).
The highest setting you can use in real time will be determined by your system’s CPU speed and number
of CPU cores.
Rate Control
Bitrate
The resulting file size is proportional to the value entered for bitrate. The range is 1 kbps to 135 Mbps.
This setting is not used when Rate Control Mode is set to 1-pass VBR.
Peak Bitrate
This setting is only used for 2-pass VBR Peak Constrained.
The Peak bitrate setting determines how many bps are allocated to the frames of the video that are
hardest to encode.
Enter a value for the desired VBV buffer size, in milliseconds (ms).
For streaming content from Windows Media Services with Advanced Fast Start and Fast Cache modes,
use a buffer size that corresponds to a duration of 8000 milliseconds.
Lower buffer sizes/durations are useful when attempting to encode for low latency connections.
To calculate the VBV buffer size in bytes, use the following equation:
VBV buffer (bytes) = bit rate (kbps) x buffer duration (seconds) x (1000 bits/kbit / 8 bits/byte)
For a 500 kbps encode with an 8-second buffer duration, the VBV buffer size would be
500 x 8 x 125 = 500,000 bytes.
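The calculation above can be sketched in a few lines of Python. This is only an illustration of the arithmetic; the function name and parameters are hypothetical and are not part of the encoder.

```python
def vbv_buffer_bytes(bitrate_kbps: float, buffer_ms: float) -> int:
    """Return the VBV buffer size in bytes for a bit rate and a buffer duration.

    Hypothetical helper: it only reproduces the manual's equation.
    """
    buffer_seconds = buffer_ms / 1000.0
    # 1000 bits per kilobit divided by 8 bits per byte = 125 bytes per kilobit.
    return round(bitrate_kbps * buffer_seconds * 125)


# The manual's example: 500 kbps with an 8000 ms (8 second) buffer.
print(vbv_buffer_bytes(500, 8000))  # 500000
```

Because the buffer duration is entered in milliseconds in the user interface, the sketch takes milliseconds and converts to seconds before applying the equation.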
Look ahead rate control mode optimizes the tradeoff between video quality and bit usage in scenarios
that require the use of short buffer and GOP sizes, such as live broadcast over IP networks. It works by
applying greater compression to B-frames, thus freeing up more bits for higher quality I-frames.
Timecode
If you would like to embed timecode in the VC-1 file, then you can enable this setting. This will enable the
Timecode settings button. Click this button to select whether you want to embed timecode from the
source video, or from a user-specified starting timecode.
GOP Settings
The encoder may insert additional keyframes at a shorter interval, depending on whether or not a scene
change has been detected.
Note that a higher maximum key frame distance value will potentially yield better compression, while a
lower value will allow you to stop and restart the video, as well as scrub the video, more smoothly.
If you are using B-frames, the duration must be greater than or equal to (the number of B-frames + 1)
divided by the number of frames per second.
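As a rough check of the B-frame constraint above, the following sketch computes the smallest allowed duration, reading the rule as (number of B-frames + 1) divided by the frame rate. That reading and the function name are assumptions made for illustration.

```python
from fractions import Fraction


def min_gop_duration_seconds(num_b_frames: int, frames_per_second) -> Fraction:
    """Smallest key frame distance, in seconds, for a given B-frame count.

    Assumes the rule is (B-frames + 1) / frame rate. Pass the frame rate as an
    int or a Fraction (for example Fraction(30000, 1001) for 29.97 fps).
    """
    return Fraction(num_b_frames + 1) / Fraction(frames_per_second)


# Two B-frames at 25 fps need a duration of at least 3/25 s (0.12 s).
print(float(min_gop_duration_seconds(2, 25)))
```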
If you require a fixed GOP duration, that is you do not want the encoder to insert additional keyframes
based on scene detection, then you can enable this setting.
When Fixed GOP Duration is enabled, the Adaptive GOP and Look Ahead settings will be disabled.
Closed GOP
A group of pictures (GOP) can be either closed or open. A closed GOP does not contain frames that
depend on adjacent GOPs. Closed GOPs are mainly used for chapter points on optical discs or for files
encoded for VOD servers; they are not required for playback in a Windows Media Player.
Adaptive GOP
This setting cannot be enabled at the same time as “Look Ahead Rate Control” or “Fixed GOP Duration”.
When you enable this setting the encoder resets the count for the maximum key frame distance at each
key frame.
When this setting is disabled, the encoder counts the maximum key frame distance starting from the first
key frame and inserts an additional key frame at a regular interval, regardless of whether
additional key frames have been inserted by the encoder due to scene detection.
For example, assume the Maximum key frame distance is 8 frames, with the Look Ahead setting enabled
and an I-frame inserted at the 5th frame due to a scene change. The following would be the GOP
structure depending on whether Adaptive GOP were enabled or disabled:
Frames: 1-2-3-4-5-6-7-8-1-2-3-4-5-6-7-8-1-...
Enabled: I-B-P-B-I-B-P-B-P-B-P-B-I-B-P-B-P-...
Disabled: I-B-P-B-I-B-P-B-I-B-P-B-P-B-P-B-I-...
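The key frame placement in this example can be reproduced with a small simulation. It is a sketch of the counting rule as described in this section only, not of the encoder's internal logic, and the function and parameter names are hypothetical.

```python
def keyframe_positions(total_frames, max_distance, scene_changes, adaptive):
    """Return the 1-based frame numbers that would be encoded as key frames.

    Hypothetical model: scene-change frames are always key frames. With
    adaptive=True the distance counter restarts at every key frame; with
    adaptive=False it keeps counting from the first key frame.
    """
    scene = set(scene_changes)
    keyframes = []
    last_key = None
    for frame in range(1, total_frames + 1):
        if frame == 1 or frame in scene:
            is_key = True
        elif adaptive:
            is_key = frame - last_key >= max_distance
        else:
            is_key = (frame - 1) % max_distance == 0
        if is_key:
            keyframes.append(frame)
            last_key = frame
    return keyframes


# Maximum key frame distance of 8, scene change at frame 5, 17 frames shown.
print(keyframe_positions(17, 8, [5], adaptive=True))   # [1, 5, 13]
print(keyframe_positions(17, 8, [5], adaptive=False))  # [1, 5, 9, 17]
```

The two results match the I-frame positions in the Enabled and Disabled rows above.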
Look Ahead
This setting cannot be enabled at the same time as “Look Ahead Rate Control” or “Fixed GOP Duration”.
It can only be used with 1-pass encoding modes.
Look Ahead allows the encoder to insert I-frames and B-frames based on content analysis,
specifically scene change detection, fade detection, and flash detection.
Adding I-frames and B-frames based on content analysis will optimize the compression. This optimization
offers some of the B frame quality improvements that would otherwise require 2-pass encoding.
Enter a value for the number of B-frames between key frames. The valid range is 0 to 7; the
recommended value is 1.
Closed Captioning
The closed captioning that is embedded into the VC-1 data is the type of captioning that is used by
hardware playback devices such as set-top boxes. It is not the closed captioning that can be played back
using software players (such as Windows Media Player).
Mode
Indicate if the source for the closed captions will be from line 21 of the video input or from an SCC file.
Advanced Settings
When you click this button, you will see additional settings that may be used to control the encoder.
For 1-pass or 2-pass CBR, this represents the maximum frame quantization value. If there are not
enough bits for the next frame to be at or below the maximum frame quantization value, then that frame
will not be encoded, and those bits will be allocated to the subsequent frame. Thus, a lower fixed
quantization value may result in dropped frames. In general, you should choose the highest value that
provides the minimum acceptable visual quality in order to reduce the number of dropped frames.
This method is only supported when the Number of B frames is greater than 0. Not supported for Simple
profile. By default, the encoder will use dynamic B frame delta QP settings.
The dead zone is created during the quantization step of image compression. It represents the bin in
which all AC coefficients that quantize to zero are stored. AC coefficient values close to zero commonly
represent noise and subtle picture detail. Increasing the dead zone therefore ensures that more of this subtle
image detail is lost during quantization, which can in turn reduce the data size of compressed frames.
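To make the effect of the dead zone concrete, here is a minimal sketch of a generic uniform quantizer with an adjustable dead zone. It is a textbook-style illustration, not the VC-1 encoder's actual quantizer; the function name, step size, and coefficient values are all assumptions.

```python
def quantize(coeff: float, step: float, dead_zone: float) -> int:
    """Quantize one AC coefficient; magnitudes below dead_zone map to 0.

    Generic dead-zone quantizer for illustration only.
    """
    magnitude = abs(coeff)
    if magnitude < dead_zone:
        return 0  # The coefficient falls into the zero bin (the dead zone).
    level = int((magnitude - dead_zone) / step) + 1
    return level if coeff >= 0 else -level


coefficients = [-9, -3, 1, 2, 4, 10]  # made-up AC coefficient values
narrow = [quantize(c, step=4, dead_zone=2) for c in coefficients]
wide = [quantize(c, step=4, dead_zone=5) for c in coefficients]
print(narrow.count(0), wide.count(0))  # 1 4
```

With the wider dead zone, more of the small coefficients quantize to zero, which is why less data is needed to represent the frame and why subtle detail is lost.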
The adaptive dead zone method dynamically increases the size of the dead zone in macroblocks
containing textured areas. In the context of encoding with rate control, this often translates to lower QP
and higher quality in smooth areas due to more bits being available overall.
The adaptive dead zone strength controls how readily textured areas are mapped to larger dead
zones. The recommended strength is the lowest adaptive dead zone strength. Using higher values can
result in image detail being eroded too readily from textured areas.
Adaptive dead zone methods can be used with the quantization option, the quantization strength for P frames,
and the quantization strength for B frames for a combined approach to perceptual optimization. While usually
effective on film sources with natural noise (such as film grain), these methods are not meant for use as
generic encoding enhancements and should be used with caution.
Higher values mean stronger quantization. The following table lists the possible values:
Value Description
0 Off (Default)
DQuant can improve video quality in smooth areas containing very fine detail or gradients, or very dark
uniform areas because those areas are prone to blocking artifacts at high quantizer levels.
The drawback to using DQuant is that using lower QPs for certain macroblocks can use up too many bits
for the entire frame, resulting in a higher general QP for the rest of the macroblocks in the frame. In other
words, improved quality in targeted areas might result in reduced quality in the rest of the image. Using
more than 2 levels of quantizers can also add overhead to compressed sample sizes due to
the necessity to signal different quantizer levels for each macroblock.
DQuant applied only to I and P frames usually produces the best results.
Note that DQuant in Main profile doesn't actually apply to I-frames, but the I+P and I+P+B
settings are used in the normal fashion.
Must be non-zero when the differential quantization setting is 2 or 3. Otherwise, this value must be zero.
Recommendation: 0 should be used for most content. Try setting 1 if you notice blocking in smooth
regions after initial encoding and setting 2 if you still see blocking after trying setting 1.
Specify 0 for Simple profile or 1-pass VBR. Must be non-zero when the differential quantization option is
3. Otherwise, this value must be zero.
Recommendation: 0
De-noising is generally performed during the preprocessing phase in Digital Rapids encoding hardware;
however, this filter can be useful when preprocessing is not an option. When using this filter you cannot
preview the noise reduction separately from the encoder, as you can with the Digital Rapids hardware
noise reduction filter.
The in-loop filter reduces blocking artifacts during encoding to improve the quality of P and B frames. It is
also used when decoding, which means it can reduce performance during playback. Although the in-loop
filter can reduce image detail in individual frames, the overall quality of the video improves. The biggest
downside to using in-loop filtering is the additional decoding performance cost, which can be a problem
for low-power playback devices, such as cell phones. The In-Loop Filter will typically increase CPU
requirements for a given encode by about 15%.
The median filter improves motion estimation processing by factoring out noise artifacts. This can improve
the quality of very noisy video and reduce the size of the compressed data. Note that this filter is not the
same as median blur filters found in many video editing and post-processing applications.
A noisy frame edge is usually caused by the vertical blanking interval (VBI) data from a frame of
broadcast television being visible. The VBI is the first 21 scan lines of a broadcast frame. When a
television signal is recorded by a capture card, the VBI is usually removed from the frame.
The noisy edge detection and correction filter can only correct an edge that has 3 or fewer lines of noise.
Motion estimation settings control how the codec searches for motion in the frame. These settings can
have a dramatic effect on quality and an even more dramatic effect on encoding time. The following table
lists the possible values:
Value Description
0 Off. (Default)
Including chroma in motion estimation can significantly improve the quality of encoded video when
chroma changes happen where luma changes do not. For example, motion graphics, cel animation, and
screen recordings can be significantly improved with this setting. Motion search with luma and true
chroma will yield the best quality, but at the highest performance cost. The two adaptive modes and the
nearest-integer chroma mode provide reasonable compromises between quality and performance.
Adaptive modes apply chroma search to the 50 percent of the blocks in the frame that are predicted to
have the most benefit. This provides most of the value of chroma search with only half the encoding
performance reduction. The default depends on the complexity level. The following table lists the possible
values:
Value Description
0 Luma only. The VC-1 encoder searches for motion in luminance values
only. Provides fastest performance (encoding speed).
2 Luma with true chroma. Provides the best quality with the lowest
performance.
range too high can also lead to false positives, so it's important to set the motion search window to a
range adequate for the video. The following table lists the possible values:
Value Description
1 +127.75/-128.0 H, +63.75/-64.0 V
2 +511.75/-512.0 H, +127.75/-128.0 V
3 +1023.75/-1024.0 H, +255.75/-256.0 V
Recommendation: setting 4, except when using Simple profile (when you must use 0).
Value Description
1 RD cost. This option configures the codec to account for both rate
and distortion when computing cost.
Value Description
0 Static method. This option uses the same motion vector cost
estimate for all macroblocks.
1 Dynamic method. This option varies the motion vector cost between
macroblocks to achieve optimal visual quality.
Letterbox detection is dynamic and should correctly detect changes in video mattes. However, there may
be cases where detection does not immediately find a change, particularly when video frames contain
mostly black regions.
Key pop reduction is most useful for shorter GOP encodes, like those targeted for optical discs. Key pop
reduction can also cause some softness in the video, and therefore may not be appropriate for all
content, particularly content that uses long GOPs where they are not needed.
Recommendation: One thread for pictures less than 128 lines in height, two threads for pictures between
128 and 256 lines, and four threads for larger pictures.
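The thread recommendation above can be written as a small lookup. This is a sketch only: the function name is hypothetical, and treating heights of exactly 128 and 256 lines as part of the two-thread range is an assumption, since the text does not say which side the boundaries fall on.

```python
def recommended_threads(picture_height: int) -> int:
    """Return the recommended encoder thread count for a picture height in lines.

    Assumes the 128 and 256 line boundaries belong to the two-thread range.
    """
    if picture_height < 128:
        return 1
    if picture_height <= 256:
        return 2
    return 4


for height in (96, 240, 480, 1080):
    print(height, recommended_threads(height))
```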
For example, to make the VC-1 encoder use processors zero, one, two, and three, the low-order byte of
the affinity mask would be 00001111 (decimal value of 15 and a hexadecimal value of 0xF).
If the encoding computer has 8 processors, then processors four, five, six and seven would use an affinity
mask of 11110000 (decimal value of 240 and a hexadecimal value of 0xF0).
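The affinity mask values in these examples follow from setting one bit per processor index. A minimal sketch, assuming processor zero maps to the lowest-order bit as in the examples above (the function name is hypothetical):

```python
def affinity_mask(processors) -> int:
    """Build an affinity mask with one bit set for each processor index."""
    mask = 0
    for index in processors:
        mask |= 1 << index  # processor 0 is the lowest-order bit
    return mask


first_four = affinity_mask([0, 1, 2, 3])
last_four = affinity_mask([4, 5, 6, 7])
print(format(first_four, "08b"), first_four, hex(first_four))  # 00001111 15 0xf
print(format(last_four, "08b"), last_four, hex(last_four))     # 11110000 240 0xf0
```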
This setting specifies the percentage of available processing to use for encoding. The range is 1 – 100,
and the default is 80.
This method can dynamically vary the complexity of the encoding to ensure a fixed encoding time,
regardless of the complexity of the input video stream. Setting a lower target CPU usage value implies
lower complexity and, therefore, faster encoding, while making more system resources available for other
components that the application controls. Setting a higher value implies potentially higher complexity and
more utilization of CPU resources by the encoder itself.
Recommendation: Use the highest value that leaves sufficient resources for other processes on the
computer.
For display on a computer VGA monitor or an HDTV monitor, use an aspect ratio of 1:1.
For display on an NTSC monitor, for D1/DV use 10:11 and for D1/DV anamorphic widescreen use 40:33.
For display on a PAL monitor, for CCIR-601 use 16:15, for D1/DV use 59:54, and for D1/DV anamorphic
widescreen use 118:81.
This is not to be confused with the display aspect ratio, which defines the ratio between the width and
height of the video display.
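The relationship between the pixel aspect ratio and the display aspect ratio can be illustrated as follows. This is a sketch under stated assumptions: the function name is hypothetical, and the 704-pixel active picture width used for the standard-definition examples is an assumption, not a value taken from this manual.

```python
from fractions import Fraction


def display_aspect_ratio(width: int, height: int, par_x: int, par_y: int) -> Fraction:
    """Display aspect ratio = (width x PAR width) / (height x PAR height)."""
    return Fraction(width * par_x, height * par_y)


# Square pixels on an HD frame.
print(display_aspect_ratio(1920, 1080, 1, 1))   # 16/9
# NTSC examples, assuming a 704 x 480 active picture.
print(display_aspect_ratio(704, 480, 10, 11))   # 4/3
print(display_aspect_ratio(704, 480, 40, 33))   # 16/9
```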
A value of 1 will set a 1:1 aspect ratio for Blu-ray and a value of 5 will set 4:3.
Value Description
1 (PAR=1:1 (square))
2 (PAR=12:11)
3 (PAR=10:11)
4 (PAR=16:11)
5 (PAR=40:33)
6 (PAR=24:11)
7 (PAR=20:11)
8 (PAR=32:11)
9 (PAR=80:33)
10 (PAR=18:11)
11 (PAR=15:11)
12 (PAR=64:33)
13 (PAR=160:99)
15 (PAR=custom)
Value Description
RAW: The encoder generates output for a container format, such as ASF.
ES: The encoder generates output for an elementary stream with an entry point start code inserted for
each GOP. Sequence start codes are inserted as needed.
ES_SH: The encoder generates output for an elementary stream with both entry point and sequence start
codes inserted for each GOP. Choose this setting for Blu-ray.