Generalized Bitplane-by-Bitplane Shift Method for JPEG2000 Region of Interest Image Coding
Zhou Wang & Alan C. Bovik
(a) Background: Region of interest (ROI) image coding allows for encoding the ROIs in an image with better quality than the background (BG). It is very useful for visual communication applications where the available bandwidth is limited. ROI coding is one of the requirements in the upcoming JPEG 2000 image coding standard, where the ROI coding is based on a scaling based method.
At the encoder, the wavelet transform is applied to the image and the resulting coefficients associated with the BG are scaled down (shifted down) so that the ROI-associated bits are placed in higher bitplanes. During the embedded bitplane coding process, the bits in the higher bitplanes are placed before those in the lower bitplanes. The scaling value and the shape information of the ROIs are also added to the encoded bitstream. At the decoder, the bitplanes are reconstructed and the BG-associated coefficients are scaled up to their original bitplanes before the inverse wavelet transform. If the encoded bitstream is truncated or the encoding is terminated before the image is fully coded, the ROIs will have a higher quality than the BG. The relative importance of the ROIs and the BG is determined by the scaling value s, which defines the number of bitplanes to be shifted. This method has two major drawbacks. First, it is not convenient to deal with different wavelet subbands in different ways, which is sometimes desired by users. Second, it needs to encode and transmit the shape information of the ROIs. This significantly increases the codec's complexity and decreases the overall coding efficiency, especially when the ROIs are of arbitrary shapes.
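The scaling idea described above can be illustrated with a minimal Python sketch. It is not the JPEG 2000 codec: the function names (`scale_roi`, `unscale_roi`, `truncate`) and the toy coefficient values are assumptions made for illustration, signs are ignored, and "shifting the BG down" is modeled as shifting the ROI magnitudes up by s, which gives the same relative bitplane placement in integer arithmetic.

```python
def scale_roi(coeffs, roi_mask, s):
    """Place ROI magnitudes s bitplanes above the BG magnitudes.

    Shifting the BG down relative to the ROI is modeled here by shifting
    the ROI magnitudes up by s, so no BG bits are lost.
    """
    return [c << s if in_roi else c for c, in_roi in zip(coeffs, roi_mask)]


def unscale_roi(scaled, roi_mask, s):
    """Undo scale_roi. Note that the decoder needs the ROI mask (shape info)."""
    return [c >> s if in_roi else c for c, in_roi in zip(scaled, roi_mask)]


def truncate(coeffs, kept_planes, total_planes):
    """Simulate a truncated embedded bitstream: keep only the top bitplanes."""
    dropped = total_planes - kept_planes
    return [(c >> dropped) << dropped for c in coeffs]


# Toy wavelet magnitudes with 4 bitplanes; entries 0 and 2 belong to the ROI.
coeffs = [13, 6, 9, 3]
roi_mask = [True, False, True, False]
s = 2  # scaling value: number of bitplanes to shift

scaled = scale_roi(coeffs, roi_mask, s)                    # 4 + s = 6 bitplanes
partial = truncate(scaled, kept_planes=4, total_planes=6)  # bitstream cut early
decoded = unscale_roi(partial, roi_mask, s)

print(decoded)  # ROI magnitudes are exact, BG magnitudes are coarser
```

In this toy run the two ROI magnitudes survive the truncation intact while the BG magnitudes lose their two lowest bitplanes, which is the quality difference the scaling value s controls.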
A very smart solution, namely the maximum shift (Maxshift) method, was proposed for JPEG 2000. It does not require any shape coding or any shape information to be explicitly transmitted to the decoder. In the Maxshift method, the scaling value must be chosen to be greater than or equal to the largest number of magnitude bitplanes for any background coefficient in any code-block in the current component. After scaling, all significant bits associated with the ROI will be in higher bitplanes than all the significant bits associated with the background. At the decoder, the ROI coefficients and the BG coefficients can be classified simply by looking at the coefficients' magnitudes or the bitplane levels of their most significant bits (MSBs). There is no need to tell the decoder explicitly about the shape information of the ROIs. The BG coefficients are scaled back up before the inverse wavelet transform is applied. With this method, it is also very easy to treat different wavelet subbands differently. Its drawback is that it does not have the flexibility to allow for the selection of an arbitrary scaling value to define the relative importance of the ROI and the BG wavelet coefficients, as the general scaling based method does.
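A small Python sketch can show why Maxshift needs no shape information. It works on unsigned integer magnitudes only, and the names `maxshift_encode` and `maxshift_decode` are hypothetical. As a simplifying assumption, the relative shift is again modeled by moving the ROI magnitudes up by s rather than the BG magnitudes down.

```python
def maxshift_encode(coeffs, roi_mask):
    """Choose s as the largest BG bitplane count and lift the ROI above it."""
    bg = [c for c, in_roi in zip(coeffs, roi_mask) if not in_roi]
    # s must be >= the number of magnitude bitplanes of every BG coefficient.
    s = max((c.bit_length() for c in bg), default=0)
    scaled = [c << s if in_roi else c for c, in_roi in zip(coeffs, roi_mask)]
    return scaled, s


def maxshift_decode(scaled, s):
    """Classify by magnitude alone: anything >= 2**s must be an ROI coefficient."""
    threshold = 1 << s
    return [c >> s if c >= threshold else c for c in scaled]


coeffs = [13, 6, 9, 3]                 # toy magnitudes
roi_mask = [True, False, True, False]  # used by the encoder only

scaled, s = maxshift_encode(coeffs, roi_mask)
restored = maxshift_decode(scaled, s)  # no ROI mask passed to the decoder

print(s, scaled, restored)
```

The decoder receives only the scaling value s. Because every nonzero ROI magnitude ends up at or above 2**s and every BG magnitude stays below it, a threshold test replaces the transmitted shape. The price is that s is dictated by the BG dynamic range and cannot be chosen freely.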
(b) Bitplane by Bitplane Shift (BbBShift) Method: We suggest a new method, BbBShift, for JPEG 2000 ROI coding. BbBShift has the advantages of both the general scaling based method and the Maxshift method. It does not need any shape coding or any shape information to be transmitted to the decoder. It can choose an arbitrary integer scaling value as in the general scaling based method. In addition, it allows for the treatment of different wavelet subbands in different ways. The cost is only one more parameter than the Maxshift method. It can be shown that Maxshift is actually a special case of BbBShift.
Z. Wang and A. C. Bovik, "Bitplane-by-bitplane shift (BbBShift) - a suggestion for JPEG 2000 region of interest coding," IEEE Signal Processing Letters, vol. 9, no. 5, pp. 160-162, May 2002.
(c) Generalized Bitplane-by-Bitplane Shift (GBbBShift) Method: We generalize the BbBShift method and propose a GBbBShift method for JPEG2000 ROI coding. The Maxshift and BbBShift methods are special cases of the GBbBShift method, while GBbBShift provides more flexibility. GBbBShift has many advantages over the current general scaling based method and the Maxshift method defined in the standard. The major contribution of GBbBShift is the extension of the functionality and flexibility of the current JPEG2000 ROI coding methods. In comparison with the general scaling based methods defined in JPEG2000 Part II, where only rectangle and ellipse ROI shapes are allowed, GBbBShift supports arbitrarily shaped ROI coding. Compared with Maxshift and BbBShift, GBbBShift has more flexibility to adjust the bitplane-shift strategy. It is not necessary for the GBbBShift method to have a shape coding component, which is essential in the general scaling based methods. The general scaling based methods also require a complex ROI mask generation procedure, which is different for different ROI shapes and significantly increases the computation and implementation expenses. By contrast, Maxshift, BbBShift and GBbBShift do not require any shape coding, and their ROI/BG identification process is much less computationally complex. Similar to the general scaling based method and the Maxshift method, the coding efficiency of BbBShift and GBbBShift decreases in comparison with JPEG2000 without any ROI coding. The reason is that bitplane shifting increases the dynamic range (or number of bitplanes) of the wavelet coefficients being encoded. It is reported that for lossless coding of images with ROIs, the Maxshift method increases the bit rate by 1-8%, compared to lossless coding of an image without ROI (and less compared to the general scaling based method, depending on the scaling value used).
Evidently, if the point of lossless coding is reached, the Maxshift, BbBShift and GBbBShift methods will result in basically the same bit rate (which is confirmed by our experiments), because they have the same number of bitplanes and the information to be coded in each bitplane is exactly the same. The only difference is that the bitplanes are placed in a different order. Our experiments also show that GBbBShift can provide images of better visual quality than Maxshift at low bit rates.
Z. Wang, S. Banerjee, B. L. Evans and A. C. Bovik, "Generalized Bitplane-by-Bitplane Shift Method for JPEG2000 ROI Coding," accepted by IEEE International Conference on Image Processing, Sept. 2002.
The current JPEG2000 image coding standard defines two kinds of region of interest (ROI) coding methods: the general scaling based method and the maximum shift (Maxshift) method. The former requires shape coding of the ROIs, which leads to increased complexity of codec implementations and limits the choice of ROI shapes (currently, only rectangle and ellipse shapes are defined). The latter allows for arbitrarily shaped ROI coding without explicitly transmitting any shape information to the decoder, but does not have the flexibility to select an arbitrary scaling value to define the relative importance of the ROI and the background wavelet coefficients.
Can the JPEG2000 ROI coding be improved?
Yes! We propose a generalized bitplane-by-bitplane shift (GBbBShift) method, which delivers much more flexibility than both Maxshift and BbBShift for "degree-of-interest" adjustment of the ROI, with only a trivial reduction in coding efficiency and a trivial increase in computational complexity. Experiments show that it can provide significantly better visual quality than Maxshift at low bit rates. See paper and demo images below.
JPEG2000 ROI coding results of 24bpp RGB "Barbara" image using the Maxshift method (s = 12) and the GBbBShift method (BP_mask = 111111000000111111000000).
Left: Maxshift; Right: GBbBShift; Top: 0.5bpp; Middle: 1.0bpp; Bottom: 2.0bpp.
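The BP_mask in the caption above can be read as a bitplane ordering, and the following Python sketch illustrates one such reading. This page does not define the mask format, so the interpretation used here is an assumption: the mask is read from the most significant output bitplane downward, "1" marks a plane that carries the next ROI bitplane, and "0" marks a plane that carries the next BG bitplane. The function names are hypothetical, only unsigned magnitudes are handled, and the decoder rule (classify by the plane of the most significant nonzero bit) is inferred from the Maxshift description rather than taken from the paper.

```python
def plane_positions(bp_mask, kind):
    """Output bit positions, MSB first, assigned to ROI ('1') or BG ('0') planes."""
    top = len(bp_mask) - 1
    return [top - i for i, bit in enumerate(bp_mask) if bit == kind]


def gbbbshift_encode(value, in_roi, bp_mask):
    """Scatter the bits of one magnitude onto its ROI or BG output planes."""
    positions = plane_positions(bp_mask, "1" if in_roi else "0")
    n = len(positions)
    assert value < (1 << n), "magnitude needs more bitplanes than the mask provides"
    out = 0
    for k, pos in enumerate(positions):
        bit = (value >> (n - 1 - k)) & 1  # k-th most significant source bit
        out |= bit << pos
    return out


def gbbbshift_decode(coded, bp_mask):
    """Classify by the plane of the most significant nonzero bit, then gather bits."""
    if coded == 0:
        return 0, False  # an all-zero coefficient is the same either way
    msb = coded.bit_length() - 1
    kind = bp_mask[len(bp_mask) - 1 - msb]
    value = 0
    for pos in plane_positions(bp_mask, kind):
        value = (value << 1) | ((coded >> pos) & 1)
    return value, kind == "1"


caption_mask = "111111000000111111000000"  # mask from the figure caption
maxshift_mask = "1" * 12 + "0" * 12        # all ROI planes first (s = 12)

roi_coded = gbbbshift_encode(0xABC, True, caption_mask)
bg_coded = gbbbshift_encode(0x123, False, caption_mask)
print(gbbbshift_decode(roi_coded, caption_mask))  # recovers the value, flags ROI
print(gbbbshift_decode(bg_coded, caption_mask))   # recovers the value, flags BG
```

Under this reading, a mask with all ROI planes ahead of all BG planes reproduces Maxshift: an ROI magnitude is simply shifted up by 12 and a BG magnitude is left in place. Interleaving the ones and zeros changes only the order in which the same bitplanes are coded, which is consistent with the statement that the lossless bit rates of the methods are essentially the same.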