Vertex-Based Shape Coding in Polar Coordinates for MPEG-4
Chil-Cheang Ma and Mei-Juan Chen
Institute of Electrical Engineering,
National Dong-Hwa University, Hualien, Taiwan
MPEG-4 standard will allow the transmission of the arbitrarily shaped video object. It allows the encoding of video objects using motion, texture and shape information. In this paper, a new vertex-based shape coding with polygon approximation in polar coordinates is proposed. To optimize the coding efficiency of polygon approximation of object boundary in polar coordinates, a new vertex selection scheme and the area distortion criterion are presented. The proposed vertex selection scheme can reduce the shape points of the boundary for lossless coding. For lossy coding, the proposed area distortion criterion provides good rate-distortion trading-off. The proposed algorithm makes compression more efficient and degradation due to errors more graceful.
Due to the rapid growth of multimedia communications for internet applications, video technologies have been paid much attention recently. MPEG-4, with emphasis on content-based interactivity, universal access, and compression performance, will soon be finalized in late 1998 [1-5]. For content-based interactivity, MPEG-4 will assist the efficient and flexible coding and representation of both natural as well as synthetic data. For compression efficiency, MPEG-4 supports an important functionality at very low bit rates below 64 kbit/s. As far as universal access, multimedia audio-visual data need to be transmitted and accessed in network environments.(Figure 1).
Object-oriented video compression is an important part for digital video processing. Within the object-oriented framework, analysis and synthesis of an image is done by treating it as a collection of disjoint video object planes (VOPs). The purpose of using shape is to achieve better subjective picture quality, increased coding efficiency as well as an object-based video representation. Two main classes of binary shape coder have emerged in the MPEG-4 standardization effort; bitmap-based and vertex-based (or contour-based) coder [6,7]. In this paper, we present a new vertex-based shape coding technique using polygon approximation in polar coordinates for lossless or lossy coding of object boundary. We use a more natural representation of the shape for polar coordinates instead of Cartesian coordinates. The Cartesian x and y coordinates function are transformed to the domain of polar coordinates yielding an and a function. Using Cartesian coordinates means describing the shape in terms of straight lines and blocks, while for using polar coordinates, this is in terms of circles and curves . Because of this more natural representation, compression can be more efficient and degradation due to errors or low bit rates will be more graceful. To enable distortion control and bit allocation for arbitrarily shaped video objects in communication environment, polygon approximation method is exploited to express outline of object. For the contour of object, it is important to select the vertices to represent boundary of the object. We propose a new vertex selection technique exploiting the transition of the radius and the angle of the successive vertices to detect turning points of the boundary of the object. The area distortion criterion is utilized to sample turning points of boundary for further reduction of the bit-rate. Huffman coding or adaptive arithmetic coding can be adopted as an entropy coding to improve efficiency. The framework of the whole system is shown in Figure 2.
Figure 2: Block diagram of the proposed technique
The paper is organized as follows: The vertex-based shape coding in Cartesian coordinate is briefly reviewed in section II. The transformation of Cartesian coordinate to polar coordinate is described in section III. The polygon approximation with a new vertex selection scheme is discussed in section IV. Section V details the method for vertex sampling with area distortion criterion. And the entropy coding is described in section VI. Section VII addresses the experiment results. Conclusions are made in the section VIII.
II. Review of the Vertex-based Shape Coding in Cartesian Coordinate
The vertex-based shape coding in Cartesian coordinates [8,9] codes the outline of the boundary. The approach starts with finding the axis of the shape and uses the two end points as the initial polygon (as shown in Figure 3). Let us assume that PkPk+1 approximates the original contour between the vertices Pk and Pk+1 .The original contour segment associated with this approximation PkPk+1 segment is called Ck with the contour points Ck,i, with k identifying the segment and I the number of contour points of Ck. With d(PkPk+1,Ck,i) being the Euclidean distance between contour point Ck,i and the line PkPk+1, the approximation error for a segment Ck of the original contour is given by dmax(k)
For each side of the polygon, it is checked whether the approximation lies within a given tolerance dmax(k) < threshold (Figure 3), If not, a new vertex is inserted at the point of the largest approximation error. Then, for each new polygon side, it is decided whether it lies within the allowable shape approximation error is lower than threshold. The described vertex selection method selects all vertices on the object boundary. After this vertex selection, the N vertex positions in Cartesian coordinates are encoded. To save bits, the vertices are renumber such that the largest difference in x-or y-coordinates appears between vertex P0 and PN-1. The above procedure is done in Cartesian coordinates system.
Figure 3: Vertex selection method in Cartesian coordinate
III. Transformation to Polar Coordinates
The technique is based on the description of the object shape in polar place. It is discovered that the polar coordinate is a more natural representation of describing two dimension objects as opposed to the Cartesian coordinate system. Each of the contour pixels yields, instead of an xi and a yi coordinate, an i and a i coordinate which gives for each shape the function i and i ; with i the index of the contour pixel
where cx and cy are the coordinate of the center Rx,y of the shape which is taken as the origin of the coordinate system. The center point Rxy(cx,cy) is got by the below definition
where Rxy is center point vector of distance, N is the total pixel numbers of the contour and i is the index.
IV. A New Vertex Selection Scheme for Polygon Approximation
The efficiency of the shape coding depends to a large extent on the encoder. The art of vertex-based shape coding lies in selecting the appropriate vertices for the polygons. At the same time, the reason that the approach of detection of feature points is popular is that feature points provide a sufficient constraint to compute image displacements reliably. One of the most intuitive types of feature points is the corner. Corner detectors have been widely used as feature point detectors because corners correspond to image locations with a high information content, and they can be matched between images reliably. To be useful for feature point matching, a new method of the corner selection is proposed to choose optimal control points of the boundary of the object. We use the transition rule of the angles and the radii among the successive points to select turning point in order to remove the redundant points. Figure 4 illustrates how to select the control points of the boundary. The algorithm is based on the variation of the angles and the radii of the successive points along tracing line passing through the examed points. We define 3-order pair point (Pi-1, Pi, Pi+1) on tracing line. The technique is controlled by the radii of 3-order pair (i-1 ,i, i+1) and the correspond angles (i-1, i, i+1). For example, considering the clockwise scan of shape points, shown in Figure 4(a), when i-1 increases toi and i decreases to i+1 and the transition of the angles is coherent, the midpoint Pi is selected as a turning point of the boundary. In other hand, as depicted in Figure 4(b) when the i-1 decreases to i and i increases to i+1 and the change tendency of the radii is coherent, the Pi is chosen as a turning point. Consequently, The rule of select turning points is formulated by below
By the detection of changing of the radii and the angles, the midpoint Pi is selected ,where i is the index of pixels on the boundary.
|(i+1 -i) (i -i-1)<0
(i+1-i) (i -i-1)>0
|(i+1 -i) (i -i-1)>0
(i+1-i) (i -i-1)<0
All the selected turning points are formed the vertices of the approximate polygon. Although the number of pixels to represent the contour is reduced, it is lossless representation. The method reduces data storage compared to the original shape points of the object boundary. The proposed scheme performs an optimal boundary approximation, and reduces redundant boundary data.
V. Vertex Sampling with Area Distortion Criterion
For low bit-rate lossy coding, a vertex sampling with area distortion criterion between the boundary and the approximated shape is used for finding the optimal locations of the vertices. We perform numerical integral to estimate the area distortion region. The higher the resolution of the original boundary, the closer the numerical integral becomes to the measure involving the count of mislabeled pixels. Furthermore, employing a continuous area metric makes it possible for the decoder to reconstruct the boundary on a finer quality. Let us define by the boundary distortion for 3-point pair(Pi-1, Pi, Pi+1), as shown in Figure 5.
The area distortion A is defined in (Eq.5)
where r1=(Pi,x-Pi-1,x , Pi,y-Pi-1,y), r2=(Pi+1,x-Pi,x , Pi+1,y-Pi-1,y), and is the angle between vector r1 and vector r2.
To optimize the coding and sample the control points of the boundary, we present the area distortion to apply coding the boundary object for enable finer reconstruct quality in decoder. The procedure is shown in Figure 6.
Figure 6: The procedure of sampling the vertices
If area distortion A <threshold, the midpoint Pi is ignored. The system executes vertex-transferring procedure. The initial point Pi-1 is still, and the point Pi+1 is shifted to the next point Pi+2 and the point Pi position is shifted to the neighbor point Pi+1. At the same time, the original Pi+2 symbol is become Pi+1, and Pi+1 is transferred to Pi symbol. After the vertex-transferring proceeding, it is repeated the procedure of judgment of the sampled condition of area distortion among (Pi, Pi+1, Pi+2) (Figure.5).
If the condition is not satisfied, the pixel Pi of the boundary is taken and coded. Then, the initial point Pi-1 shifted to the encoded point Pi and for point Pi, it is considered the next time sampled initial point and iterative to area distortion proceeding for selecting another sampled point. Using this regulation not only selects less vertices to reduce data storage and achieve low bit-rate coding but also lesser error distortion in reconstruction arbitrary boundary.
VI. Entropy Coding
Two efficient coding algorithms, the Huffman coding scheme and arithmetic coding ,are widely utilized for encoding contour line. Huffman code is well-known to be an easily implemented and efficient method for data compression, and arithmetic code is well- known for their ability to achieve code rates that are close to optimal performance bound. Therefore, we exploit the arithmetic code to achieve better coding efficiency.
VII. Simulation Results
To compare the performance of different shape coders, evaluation criteria have to be defined . Within MPEG-4, the measure dn that is the number of erroneously represented pels of the coded shape divided by the total number of pels belonging to the original shape. It is defined in (Eq.6).
This metric is used by MPEG-4 to evaluate performance. Two test sequences, namely Akiyo and News are used for the performance evaluation. Both of the sequences are in QCIF format (i,e. 176 by 144 pixels). Figure 7 shows the pictures and the corresponding alpha planes. The proposed method is compared to the vertex-based algorithm in Cartesian coordinate. Figure 8 shows the approximation bitmap of Akiyo and News test sequences. For the test sequence, it shows bit-rate of the proposed in polar coordinate is lower than vertex-based in Cartesian coordinate when the distortion dn is almost as same. The reconstruct boundary of the object is finer than the conventional vertex-based shape coding. Figure 9 shows an experimental result about rate-distortion in the Akiyo test sequence. Figure 10 shows the other of experimental result in the News test sequence. From the experimental results, our method provides a large of efficiency in reduce bits storage requirement for lossy coding. And for multimedia application, the proposed method is suitable to be adopted in low bit rate for MPEG-4 standard requirement.
This paper presents a new vertex-based shape-encoding algorithm, which uses a polygon approximation in polar coordinates combined with the area distortion criterion measure. The result informs that the proposed method is superior to vertex-based shape coding for MPEG-4 in Cartesian Coordinate, and the bit rate is lower for comparable distortion rate. When compared with vertex-based coding for MPEG-4, the algorithm has 30%-50% rate reduction for comparable distortion. The outstanding results for the proposed method provides a higher coding efficiency. In addition, the technique has the advantage of less computation complexity for area distortion criterion requirement and reduce bit rate .It enables the application to network environment, and bit allocation for lossy coding.Possible future work includes applying the proposed method to digital libraries for fast browsing and search, and extending the current method from 2D to 3D to enable fast 3D rendering and manipulation in computer graphics.
[ 1]T. Sikora ,"MPEG-4 Very Low Bit Rate Video", 1997 IEEE International Symposium on Circuits and System , pp.1440-1443, June 1997.
[ 2]T. Sikora and L. Chiariglione,"MPEG-4 Video and its Potential for Future Multimedia Services",1997 IEEE International Symposium on Circuits and System, pp. 1468-1471, June 1997.
[ 3]T. Ebrahimi,"MPEG-4 video verification model; A video encoding/decoding algorithm based on content representation", Signal Processing :Image Communication , pp. 367-384 1997.
[ 4]A. K. Katsaggelos ,L. P. Kondi, F. W. Meier, J. Ostermann and G .M. Schuster, "MPEG-4 and Rate-Distortion-Based Shape-Coding Techniques", Proceedings of the IEEE , Vol. 86, no. 6, June 1998.
[ 5]MPEG Video Group: ISO/IEC JTCI/SC29/WG11 N1584,"Technical description of core experiments on MPEG-4 binary shape coding",February 1997.
[ 6]J. Ostermann, E. S. Jane, Jae-Seob Shin, T. Chen. "Coding of Arbitrarily Shaped Video Objects in MPEG-4".Vol. 1, no.1440, pp.496-499, ICIP 97.
[ 7]N. Brady. F. Bossen, N. Murphy, "Context-based arithmetic encoding of 2D shape sequences", Special session on shape coding, Vol.1, no.731 pp.29-32, ICIP 97.
[ 8]K.J.O'Connell, "Object-Adaptive Vertex-Based Shape Coding Method", IEEE Transactions on Circuits and system for video technology, Vol.7, no.1,February, 1997.
[ 9]R. J. Qian and M. I. Sezan, "Content-Scalable Shape Representation and coding", Vol.2,no.641,pp.819-823,ICIP 1997.
F. H. P. Spaan, R. L. Lagendijk and J. Biemond,"Shape Coding Using Polar Coordinates and the Discrete Cosine Transform", Vol.1,no.1107,pp.516-520, ICIP 1997.
Figure 7. "Akiyo" and "News" test sequences and corresponding alpha planes
Bit-rate :1224 , dn :0.0038
(proposal in polar coordinates)
Bit-rate :2240 , dn :0.032
(vertex-base in Cartesian coordinate)
Bit-rate :1944 , dn :0.0013
(proposal in polar coordinates)
Bit-rate :2816 , dn :0.037
(vertex-base in Cartesian coordinate)
Figure 8. Lossy encoding using vertex-based shape coding using
(a) in polar coordinate and (b) in Cartesian coordinate
Figure 9. Comparison Rate-distortion curve of the test Akiyo sequence in Cartesian and Polar Coordinates
Figure 10. Comparison Rate-distortion curve of the test News sequence in Cartesian and Polar Coordinates