Experiments on the benefits of Optimal Truncation for the JPEG2000 Codestream

Andrew Willis
April 26, 2002

The JPEG2000 in-house simulator currently pads passes with 4 additional bytes in order to ensure correct decoding. This practice is known to be suboptimal. Hence, a more optimal truncation scheme was implemented for the simulator. This page is devoted to the analysis of the relative benefits of each of the methods.

NOTE:
Optimal truncation as outlines in the verification model and Taubman & Marcellin's book is not possible given our current implementation of the entropy coder. For true absolute optimal truncation, one must have access to previously encoded bytes and be able to remove them from the codestream. This accounts for the specific instance of repeated 0xFF7F entropy coded pairs. Since they are generated by the bit un-stuffer (i.e. BYTEIN() ) algorithm, we can sometimes truncate a pass before the end of the marked length Ln. Aside from this very unlikely instance, there is the additional issue of witholding the emission of 0xFF bytes from the entropy coder. In the case of optimal truncation, these bytes are "delayed" pending the end of the pass or the end of the segment. The presence of a delayed byte is indicated by the E and En variables on a byte-by-byte and pass-by-pass basis respectively.  This may then be exploited in an attempt to save a byte here or there in the codestream. It is unlikely that this will make any significant contribution in improving our rate/distortion statistics.

Given the statements in the above paragraph, I cannot truly call the truncation scheme implemented as optimal. Hence, I will call it sub-optimal in the sense that it is missing exactly those features listed above which would be required to make it "truly" optimal.

Functional Requirements:

Pass Padding
This method has really no functional requirements other than having access to the 4 bytes of arithmetic coded data immediately following the byte at which one would like to truncate. (i.e. if I want to truncate after the 5th pass of entropy coding, then I would require the 4 bytes which immediately follow the last byte of entropy coded data from the 5th pass).
Sub-Optimal Truncation
This method requires the entropy coder to save the state variables after each pass. The saved variables include:

int     CTn[NPASSES];             The value of the CT register.
UINT    An[NPASSES];              The value of the A register.
int     Cn[NPASSES];              The value of the C register.
BYTE    Bn[NPASSES];              The contents of the byte buffer.
int     En[NPASSES];              Boolean variable which indicates if the previous value
                                  sent to BYTEOUT() was 0xFF. This is not necessary for our implementation.
int     Ln[NPASSES];              The length in bytes of the codestream where 0 indicates the start of the
                                  current codeblock.

Additional variables in the entropy coder:
UCHAR   E;                        Boolean variable -- same as En[] above.
int     L;                        Length of segment --- same as Ln[] above.

In addition, this truncation method will require access to the bytes following the pass selected for truncation. This is the same as the analogous requirement for pass-padding truncation.
Computational Complexity
The actual computational complexity of the truncation scheme is very low and requires about 3 comparisons, 3 bitwise operations (i.e. shift, and, etc...), and 1 addition. This being a rough estimate based on my non-optimized code at the moment.  The operations above are executed in a loop which will execute up to a worst-case of 4 times. Each loop iteration corresponds to appending a byte to the truncated pass. Hence, according to my experiments this loop with execute on average about 2.3-2.4 times per code block in the image.

Example:
An image with 684 codeblocks could expect approximately 1573-1642 loop executions.

Preliminary Results
Synthetic contexts and bits were generated for the entropy coder over a 2 pass arithmetic coding scheme. The resulting codestrean was subsequently truncated between passes 1 and 2. Statistics were then collected on the generated truncated pass lengths. Given a uniform probability distribution on the bits and contexts generated for the entropy coder, it was found experimentally that one would save approximately 1.7 bytes per truncated pass.

Results for Real Images
The new truncation scheme was integrated with the in-house simulator and tests were run on real images. For the results provided below, the images were encoded such that every codeblock would be truncated at a constant pass number (i.e. for 10 passes, for each codeblock passes 0-9 were kept by in the code stream). Therefore , the provided results will be biased since this effectively bypasses the rate distortion algorithm. However, the results here should provide useful insight to the benefits which one would expect when using sub-optimal truncation.

Coding Parameters
Number of encoded passes = 10
Wavelet Levels = 3
Tilesize = 256x256
Core = 9/7 Irreversible

Image Size
Padded Size (bytes)
Sub-Optimal Size (bytes)
Bitrate Improvement (bpp)
Codeblocks
1024x768x3
748738
747607
0.011505
684
1024x768x3
779799
778679
0.011393
684
1024x768x3
637312
636171
0.011607
684
1024x768x3
494413
493245
0.011882
684

Coding Parameters
Number of encoded passes = 10
Wavelet Levels = 3
Tilesize = 512x512
Core = 9/7 Irreversible

Image Size
Padded Size (bytes)
Sub-Optimal Size (bytes)
Bitrate Improvement (bpp)
Codeblocks
1024x768x3
656687
655700
 0.0100402
588
1024x768x3
684613
683625
 0.010050
588


The following experiment allows the number of encoded passes to vary to verify that the bitrate improvement is invariant to this parameter.

Coding Parameters
File = adi1.ppm
Wavelet Levels = 3
Tilesize = 512x512
Core = 9/7 Irreversible

Image Size
Number of Encoded Passes
Padded Size (bytes)
Sub-Optimal Size (bytes)
Bitrate Improvement (bpp)
Codeblocks
1024x768x3
4
127750
126718
0.01049
588
1024x768x3
8
544603
543599
0.0102132
588
1024x768x3
12
972115
971245
0.008850
588

Conclusions
Use of sub-optimal truncation should improve our bitrate by approximately 1%. I think use of "truly" optimal truncation will not significantly increase this value.