**Vertex-Based Shape Coding in
Polar Coordinates for MPEG-4**

*Chil-Cheang Ma and Mei-Juan Chen*

Institute of Electrical Engineering,

National Dong-Hwa University, Hualien, Taiwan

*Abstract*

MPEG-4 standard will allow the transmission of the arbitrarily
shaped video object. It allows the encoding of video objects
using motion, texture and shape information. In this paper, a new
vertex-based shape coding with polygon approximation in polar
coordinates is proposed. To optimize the coding efficiency of
polygon approximation of object boundary in polar coordinates, a
new vertex selection scheme and the area distortion criterion are
presented. The proposed vertex selection scheme can reduce the
shape points of the boundary for *lossless* coding. For *lossy*
coding, the proposed area distortion criterion provides good
rate-distortion trading-off. The proposed algorithm makes
compression more efficient and degradation due to errors more
graceful.

*I. Introduction*

Due to the rapid growth of multimedia communications for internet applications, video technologies have been paid much attention recently. MPEG-4, with emphasis on content-based interactivity, universal access, and compression performance, will soon be finalized in late 1998 [1-5]. For content-based interactivity, MPEG-4 will assist the efficient and flexible coding and representation of both natural as well as synthetic data. For compression efficiency, MPEG-4 supports an important functionality at very low bit rates below 64 kbit/s[1]. As far as universal access, multimedia audio-visual data need to be transmitted and accessed in network environments.(Figure 1).[2]

Object-oriented video compression is an important
part for digital video processing. Within the object-oriented
framework, analysis and synthesis of an image is done by treating
it as a collection of disjoint video object planes (VOPs). The
purpose of using shape is to achieve better subjective picture
quality, increased coding efficiency as well as an object-based
video representation. Two main classes of binary shape coder have
emerged in the MPEG-4 standardization effort; bitmap-based and
vertex-based (or contour-based) coder [6,7]. In this paper, we
present a new vertex-based shape coding technique using polygon
approximation in polar coordinates for *lossless*** **or

Figure 2: Block diagram of the proposed technique

The paper is organized as follows: The vertex-based shape coding in Cartesian coordinate is briefly reviewed in section II. The transformation of Cartesian coordinate to polar coordinate is described in section III. The polygon approximation with a new vertex selection scheme is discussed in section IV. Section V details the method for vertex sampling with area distortion criterion. And the entropy coding is described in section VI. Section VII addresses the experiment results. Conclusions are made in the section VIII.

*II. Review of the Vertex-based Shape
Coding in Cartesian Coordinate*

The vertex-based shape coding in Cartesian coordinates [8,9]
codes the outline of the boundary. The approach starts with
finding the axis of the shape and uses the two end points as the
initial polygon (as shown in Figure 3). Let us assume that *P**k**P**k+1*
approximates the original contour between the vertices *P**k* and *P**k+1*
.The original contour segment associated with this approximation *P**k**P**k+1* segment
is called *C**k* with the contour
points *C**k,i**,* with *k*
identifying the segment and *I* the number of contour points
of *C**k*. With *d*(*P**k**P**k+1*,*Ck,i*)
being the Euclidean distance between contour point *C**k,i* and the line *P**k**P**k+1*, the approximation error for a segment
*C**k* of the original contour is
given by *d*max(k)

For each side of the polygon, it is checked
whether the approximation lies within a given tolerance *d*max(*k*) < *threshold* (Figure 3),
If not, a new vertex is inserted at the point of the largest
approximation error. Then, for each new polygon side, it is
decided whether it lies within the allowable shape approximation
error is lower than threshold. The described vertex selection
method selects all vertices on the object boundary. After this
vertex selection, the *N* vertex positions in Cartesian
coordinates are encoded. To save bits, the vertices are renumber
such that the largest difference in x-or y-coordinates appears
between vertex *P**0* and *P**N-1*. The above procedure is done in
Cartesian coordinates system.

Figure 3: Vertex selection method in Cartesian coordinate

*III. Transformation to Polar Coordinates*

The technique is based on the description of the object shape
in polar place. It is discovered that the polar coordinate is a
more natural representation of describing two dimension objects
as opposed to the Cartesian coordinate system. Each of the
contour pixels yields, instead of an *x**i**
*and a *y**i*** **coordinate,
an i and a i coordinate which gives for each shape the
function i and i ; with i the index of the contour pixel

where *c**x* and *c**y* are the coordinate of the center *R**x,y* of the shape which is taken as the
origin of the coordinate system. The center point *R**xy**(c**x**,c**y**)* is got by the below definition

where *R**xy* is center point
vector of distance,* N* is the total pixel numbers of the
contour and *i* is the index.

*IV. A New Vertex Selection Scheme for
Polygon Approximation *

The** **efficiency of the shape coding depends to a
large extent on the encoder. The art of vertex-based shape coding
lies in selecting the appropriate vertices for the polygons. At
the same time, the reason that the approach of detection of
feature points is popular is that feature points provide a
sufficient constraint to compute image displacements reliably.
One of the most intuitive types of feature points is the corner.
Corner detectors have been widely used as feature point detectors
because corners correspond to image locations with a high
information content, and they can be matched between images
reliably. To be useful for feature point matching, a new method
of the corner selection is proposed to choose optimal control
points of the boundary of the object. We use the transition rule
of the angles and the radii among the successive points to select
turning point in order to remove the redundant points. Figure 4
illustrates how to select the control points of the boundary. The
algorithm is based on the variation of the angles and the radii
of the successive points along tracing line passing through the
examed points. We define 3-order pair point (

By the detection of changing of the radii and the angles, the midpoint Pi is selected ,where i is the index of pixels on the boundary.

(i+1 -i) (i -i-1)<0 (i+1-i) (i -i-1)>0 (a) |
(i+1 -i) (i -i-1)>0 (i+1-i) (i -i-1)<0 (b) |

Figure 4

All the selected turning points are formed the vertices of the approximate polygon. Although the number of pixels to represent the contour is reduced, it is lossless representation. The method reduces data storage compared to the original shape points of the object boundary. The proposed scheme performs an optimal boundary approximation, and reduces redundant boundary data.

*V. Vertex Sampling with Area Distortion
Criterion*

For low bit-rate lossy coding, a vertex sampling with area
distortion criterion between the boundary and the approximated
shape is used for finding the optimal locations of the vertices.
We perform numerical integral to estimate the area distortion
region. The higher the resolution of the original boundary, the
closer the numerical integral becomes to the measure involving
the count of mislabeled pixels. Furthermore, employing a
continuous area metric makes it possible for the decoder to
reconstruct the boundary on a finer quality. Let us define by the
boundary distortion for 3-point pair*(P**i-1**,
P**i**, P**i+1**)*,
as shown in Figure 5.

The area distortion *A* is defined in (Eq.5)

where *r**1*=(*P**i,x**-P**i-1,x*
, *P**i,y**-P**i-1,y*),
*r**2*=(*P**i+1,x**-P**i,x** *, *P**i+1,y**-P**i-1,y*), and is the angle
between vector *r**1* and vector *r**2.*

To optimize the coding and sample the control points of the boundary, we present the area distortion to apply coding the boundary object for enable finer reconstruct quality in decoder. The procedure is shown in Figure 6.

Figure 6: The procedure of sampling the vertices

If area distortion *A* <*threshold*, the midpoint
*P**i* is ignored. The system
executes vertex-transferring procedure. The initial point *P**i-1* is still, and the point *P**i+1* is shifted to the next point *P**i+2 *and the point *P**i**
*position is shifted to the neighbor point *P**i+1*. At the same time, the original *P**i+2 *symbol is become *P**i+1*, and* P**i+1**
*is transferred to *P**i** *symbol.
After the vertex-transferring proceeding, it is repeated the
procedure of judgment of the sampled condition of area distortion
among (*P**i**, P**i+1**, P**i+2*)
(Figure.5).

If the condition is not satisfied, the pixel *P**i** *of the boundary is taken and
coded. Then, the initial point *P**i-1*
shifted to the encoded point *P**i**
*and for point *P**i,* it is
considered the next time sampled initial point and iterative to
area distortion proceeding for selecting another sampled point.
Using this regulation not only selects less vertices to reduce
data storage and achieve low bit-rate coding but also lesser
error distortion in reconstruction arbitrary boundary.

*VI. Entropy Coding*

Two efficient coding algorithms, the Huffman coding scheme and arithmetic coding ,are widely utilized for encoding contour line. Huffman code is well-known to be an easily implemented and efficient method for data compression, and arithmetic code is well- known for their ability to achieve code rates that are close to optimal performance bound. Therefore, we exploit the arithmetic code to achieve better coding efficiency.

*VII. Simulation Results*

To compare the performance of different shape coders,
evaluation criteria have to be defined [4]. Within MPEG-4, the
measure* d**n** *that is the
number of erroneously represented pels of the coded shape divided
by the total number of pels belonging to the original shape. It
is defined in (Eq.6).

This metric is used by MPEG-4 to evaluate performance. Two
test sequences, namely *Akiyo* and *News* are used for
the performance evaluation. Both of the sequences are in QCIF
format (i,e. 176 by 144 pixels). Figure 7 shows the pictures and
the corresponding alpha planes. The proposed method is compared
to the vertex-based algorithm in Cartesian coordinate. Figure 8
shows the approximation bitmap of *Akiyo *and *News*
test sequences. For the test sequence, it shows bit-rate of the
proposed in polar coordinate is lower than vertex-based in
Cartesian coordinate when the distortion *d**n*
is almost as same. The reconstruct boundary of the object is
finer than the conventional vertex-based shape coding. Figure 9
shows an experimental result about rate-distortion in the *Akiyo*
test sequence. Figure 10 shows the other of experimental result
in the *News* test sequence. From the experimental results,
our method provides a large of efficiency in reduce bits storage
requirement for lossy coding. And for multimedia application, the
proposed method is suitable to be adopted in low bit rate for
MPEG-4 standard requirement.

*VIII. Conclusion*

This paper presents a new vertex-based shape-encoding algorithm, which uses a polygon approximation in polar coordinates combined with the area distortion criterion measure. The result informs that the proposed method is superior to vertex-based shape coding for MPEG-4 in Cartesian Coordinate, and the bit rate is lower for comparable distortion rate. When compared with vertex-based coding for MPEG-4, the algorithm has 30%-50% rate reduction for comparable distortion. The outstanding results for the proposed method provides a higher coding efficiency. In addition, the technique has the advantage of less computation complexity for area distortion criterion requirement and reduce bit rate .It enables the application to network environment, and bit allocation for lossy coding.Possible future work includes applying the proposed method to digital libraries for fast browsing and search, and extending the current method from 2D to 3D to enable fast 3D rendering and manipulation in computer graphics.

*Reference*

[ 1]T. Sikora ,"MPEG-4 Very Low Bit Rate Video",*
1997 IEEE International Symposium on Circuits and System *,
pp.1440-1443, June 1997.

[ 2]T. Sikora and L. Chiariglione,"MPEG-4 Video and its
Potential for Future Multimedia Services",*1997 IEEE
International Symposium on Circuits and System*, pp.
1468-1471, June 1997.

[ 3]T. Ebrahimi,"MPEG-4 video verification model; A video
encoding/decoding algorithm based on content
representation", *Signal Processing :Image Communication*
, pp. 367-384 1997.

[ 4]A. K. Katsaggelos ,L. P. Kondi, F. W. Meier, J. Ostermann
and G .M. Schuster, "MPEG-4 and Rate-Distortion-Based
Shape-Coding Techniques", *Proceedings of the IEEE , *Vol.
86, no. 6, June 1998.

[ 5]MPEG Video Group: ISO/IEC JTCI/SC29/WG11 N1584,"Technical description of core experiments on MPEG-4 binary shape coding",February 1997.

[ 6]J. Ostermann, E. S. Jane, Jae-Seob Shin, T. Chen. "Coding of Arbitrarily Shaped Video Objects in MPEG-4".Vol. 1, no.1440, pp.496-499, ICIP 97.

[ 7]N. Brady. F. Bossen, N. Murphy, "Context-based arithmetic encoding of 2D shape sequences", Special session on shape coding, Vol.1, no.731 pp.29-32, ICIP 97.

[ 8]K.J.O'Connell, "Object-Adaptive Vertex-Based Shape
Coding Method", *IEEE Transactions on Circuits and system
for video technology,* Vol.7, no.1,February, 1997.

[ 9]R. J. Qian and M. I. Sezan, "Content-Scalable Shape Representation and coding", Vol.2,no.641,pp.819-823,ICIP 1997.

[10]F. H. P. Spaan, R. L. Lagendijk and J. Biemond,"Shape Coding Using Polar Coordinates and the Discrete Cosine Transform", Vol.1,no.1107,pp.516-520, ICIP 1997.

Figure 7. "*Akiyo*" and "*News*"
test sequences and corresponding alpha planes

Bit-rate :1224 , dn :0.0038 (proposal in polar coordinates) |
Bit-rate :2240 , dn :0.032 (vertex-base in Cartesian coordinate) |

Bit-rate :1944 , dn :0.0013 (proposal in polar coordinates) |
Bit-rate :2816 , dn :0.037 (vertex-base in Cartesian coordinate) |

Figure 8. Lossy encoding using vertex-based shape coding using

(a) in polar coordinate and (b) in Cartesian coordinate

Figure 9. Comparison Rate-distortion curve of the test Akiyo sequence in Cartesian and Polar Coordinates

Figure 10. Comparison Rate-distortion curve of the test News sequence in Cartesian and Polar Coordinates