Lossless Data Compression Using Golumb Codes
Mr. Suhas D.
Kakde1, Mr. Shashank S. Mane2
1Assistant Professor, Dept. of Electronics and Tele Communication
Engineering, PJLCE, Nanadanwan,
Nagpur,
India 440009.
2Assistant Professor, Dept. of
Electronics and Communication Engineering, SBITM, Betul,
India 460001.
*Corresponding Author E-mail: suahs.kakde@gmail.com,
mane.shashank@gmail.com
Abstract:
This
paper describes the detailed study and analysis of Golumb
codes used for the test vector compression in VLSI testing. The Golumb Codes are widely used for lossless Data compression
due to its simpler encoding and Decoding methods.
Furthermore, a comparative study of various compression techniques is also
presented in this paper. This paper also give out an
idea about the design of Golumb Encoder and Decoder
using VHDL for the test vectors thus achieving a good height of compression
ratio. The developed algorithm can be verified using the Modelsim
6.3f , Xilinx 9.2i and ALTERA Quartus
II software.
KEY WORDS: Golumb,
Compression, lossless, EGC, MEGC.
INTRODUCTION:
The rapid widespread growth of digital technologies
such as digital television, internet access, huge amount of data storage and
video calls have increased the demand for high storage data and transmission
capacity in order to fit the growing needs[1]. Data compression involves
encoding information using fewer bits than the original representation.
Compression of data in large-sized files reduces storage space and transmission
time in a network and hence reduces the cost of storage and transmission. Most
of the files has lot of redundancy which can be
removed using Compression.
Compression can be achieved through a lossy or lossless mechanism. The selection of compression
method depends on the application. Lossy compression
of a tolerable limit can be used for video transmission, where the loss of data
in the reconstruction cannot be easily noticed by human visual system.
However, compression of data of crucial information
such as the database of a bank needs to be lossless. There have been extensive
research efforts in this field since the last 50 years.
Golomb coding is one of the lossless data
compression techniques. It is competent of compressing larger sized data into a
smaller sized data while still allowing the original data to be reconstructed
back after decompression [1]. Due to its lower design complexity and
computational load, Golumb codes are more preferred
over other lossless data compression codes.
Exp-Golomb VLC for entropy
coding in H.264, which is mostly used for video coding standard is the most
efficient application of golumb code
. Golumb codes are also used for colour image [4], [6] and audio compression. Other application of golumb codes is for embedded cores in a system-on-a-chip
(SOC) [7]. The key features of Golomb coding for test data include very elevated
compression, analytically predictable compression results, and a stumpy cost
and scalable on-chip decoder.
In this paper, a detail study of Golomb
coding algorithms for test vector compression and decompression is presented.
In order to have simplicity in development and testing, the Golomb
coding parameter m is set to 4 [2].The goal of this work was to increase
the compression ratio as high as possible without any loss in the original
data.
The remainder of the paper is organised
as follows. Section 1 presents the details of Golomb
Coding and the basic compression and decompression method. Section 2 presents
the review of past work carried out on golumb codes.
Section 3 presents the future scope in the field of Golumb
Codes. Lastly, Section 4 concludes the paper.
The details regarding Golomb
Coding basic background information is described. In Golomb
Coding, the group size, m, defines the code structure. Thus, choosing the m parameter
decides variable length code structure which will have direct impact on the
compression efficiency [2].
After finalization of parameter m, a table which maps
the runs of zeros until the code is ended with a one is created. A Run length
of multiples of m are grouped into Ak
and given the same prefix, which is (k – 1) number of one’s followed by a
zero, which can also termed as quotient and can be represented in the form of
unary codes. A tail is given for each member of the group, which is the binary
representation of log2m bits. The other term for tail is Remainder
of the division of run length by m. The codeword is then produced by combining
the prefix and the tail. An example of the analytically encoding the data
stream using Golumb Encoder is given in table 2.
Table I. Golomb Encoding example with parameter m = 4
|
Group |
Run
length |
Group
prefix (Quotient) |
Tail
(Remainder) |
Codeword
(Prefix + tail) |
|
A1 |
0 |
0 |
00 |
000 |
|
1 |
01 |
001 |
||
|
2 |
10 |
010 |
||
|
3 |
11 |
011 |
||
|
A2 |
4 |
10 |
00 |
1000 |
|
5 |
01 |
1001 |
||
|
6 |
10 |
1010 |
||
|
7 |
11 |
1011 |
||
|
A3 |
8 |
110 |
00 |
11000 |
|
9 |
01 |
11001 |
||
|
10 |
10 |
11010 |
||
|
11 |
11 |
11011 |
Table II. Golomb
Encoding example with parameter m = 4
|
Data Set |
001 0001 00001 0000001 000001 000101 |
||||||
|
Subset |
001 |
0001 |
00001 |
0000001 |
000001 |
0001 |
01 |
|
Encoded Output |
010 |
1000 |
1001 |
1010 |
1001 |
011 |
001 |
The golomb Encoder and
Decoder models can be well explained with the help of flowcharts given in
figure 1 and figure 2 [2]. The design of the algorithm is made with the
assumption that the input data string will be terminated with a “1”. This will
not always be the case because the input data string may also be terminated
with a “0”.To overcome this problem, a modified version of Golumb
Compression and decompression methods are used.
In order to avoid this problem, the algorithm must be
capable of detecting the end of data and if the last bit is a ‘0’ then
additional ‘1’ must be added during the encoding process and at the time of
decompressing the encoded data,this
extra appended 1 should be removed by the decoder..
Figure
1. Golomb Encoder Flowchart with
parameter m =4
Figure
2. Golomb Decoder Flowchart
with parameter m = 4
A lossless color image compression and decompression hardware
architecture reduces the memory requirement and bandwidth of the system to a
larger extent. The architecture consists of differential-differential pulse
code modulation (DDPCM) and Golomb-Rice coding. The
original image frame is organized as m by n sub-window arrays, to which
Differential differential Pulse code modulation
(DDPCM) is applied to produce one seed and m×n−1 pieces of differential data [4]. Then using the Golomb algorithm, the differential data are encoded to
produce lossless compressed code. This gives the high compression rate and
throughput at the output for real time lossless Compression and decompression
operations for small area. The
performance comparison between the DDPCM + Golumb
Encoding scheme and some other encoder algorithms like JPEG-LS, FELICS and
arithmetic codes [9-11] is given in table 3.
The Receiver side consists of three steps of
unpacking, Golomb-decoding, and inverse DDPCM. The
unpacking hardware module decides whether the input is compressed or not according to the value of the V
bit stored in the V-buffer. If the V bit stored in V buffer is 1, then the data
is directly forwarded to the processor without any processing. Or else, Golomb-Rice decoding and inverse DDPCM operations are
performed to reconstruct the original 4 × 4 sub-window image data. The circuit
at the output side is divided into two pipeline stages in which the first stage
is Golumb decoding and the differential output of
decoder is provided as input to second stage which is inverse DDPCM. Table 4
gives the performance comparison scheme between Golumb
Decoder + DDPCM and Lossless JPEG decoder.
Table 3. Encoder
Comparison scheme
|
Algorithm |
Arithmetic
Coding |
FELICS |
JPEG-LS |
DDPCM
+ Golumb Rice |
|
Technology |
Xillinx
vertex 4 |
0.13um |
0.18um |
0.15um |
|
Operating
frequency |
123 |
273 |
40 |
170 |
|
Throughput
(MB/s) |
123 |
546 |
9.9 |
2720 |
|
Parallelism |
1 |
2 |
1 |
16 |
|
Compression
ratio |
1.76 |
235 |
NA |
1.52 |
|
Area
(GEs) |
NA |
12.97k |
17.6k |
16k |
|
Power
Efficiency |
NA |
16 |
1.38 |
607.14 |
Table
4. Decoder Performance comparison scheme
|
Algorithm |
Technology |
Operating frequency |
Throughput (MB/s) |
Logic Area (GEs) |
|
DDPCM+
Golumb decoder Rice [4] |
0.15um |
170 |
270 |
16.46k |
|
Lossless
JPEG decoder [11] |
0.18um |
200 |
500 |
39k |
Two methods to represent nonnegative integers based on
the principle of Golomb code is described in this
section. In both methods, the given integer is successively divided with a
divisor, the quotient and the remainders are then used to represent the value
of non-negative integer. First method is best suited for representing short
integers. Another method is best suited for representing both short and long
integers. Extended Golumb codes (
EGC) is a method to code the given nonnegative Integer n[8]. In this
method, a divisor d is selected and the integer n to be coded is divided
successively m times by d
until the quotient q becomes zero. In each division, the
remainder value is retained. The integer is then coded by coding m and the m
remainders as code(m)|code (rm,rm-1,…….r1)
Table 5. Extended Golumb Code Example
|
N |
d=2 |
d=3 |
d=4 |
|
1 |
1 |
10 |
10 |
|
2 |
010 |
11 |
110 |
|
3 |
011 |
0100 |
111 |
|
4 |
00100 |
01010 |
01000 |
|
5 |
00101 |
01011 |
01001 |
The second method is Modified Extended golumb code (MEGC) [5], in which the successive division of
the given integer n by the divisor d was done until the quotient q becomes
zero. In Modified Extended golumb code (MEGC), an
integer n is divided by a divisor d successively until the quotient q becomes
either 0 or 1 unlike the previous Extended Golumb
Codes where successive divisions were made until the quotient becomes zero. The
last remainder rm is coded in log2 (d-2)
bits when m>=2 and q=0 , in log2(d-1)
bits when m=1 and q=0 and in log2(d) bits for all other cases. The
advantage of Modified Extended golumb code (MEGC) is
that, if the
division is stopped when q reaches the
value 1 in that case, large integers will get benefited in its length of code.
In MEGC, short integers take more number of bits and large integers take less
number of bits than EGC. Table 5 and Table 6 gives the
example of Extended Golumb Encoding and Modified
Extended Golumb Encoding with different values of
divisor.
Table
6. Modified Extended Golumb
Code Example
|
N |
d=2 |
d=3 |
d=4 |
|
1 |
0 |
010 |
010 |
|
2 |
110 |
011 |
0110 |
|
3 |
111 |
110 |
0111 |
|
4 |
10100 |
1110 |
1100 |
|
5 |
10101 |
1111 |
1101 |
PROJECTED
SCOPE AND METHODOLOGY:
The development of Golomb
coding algorithms for data compression and decompression and their
implementation in Field Programmable Gate Array (FPGA) is studied successfully.
In Simple Golumb codes, the data is divided into
subsets which maps the runs of zeros until the code is ended with a one[2]. In future work of Golumb coding,a modified Golumb Coder is designed to serve as a good application for
the test vector Compression. In that
case, the input bit stream is divided
into two subsets in which first subset will contain the test vectors which maps
the runs of zeros until the code is ended with a one and the second subset will
contain the vectors which maps the runs of one’s until the code is ended with a
zero. For first subset, the data is divided by the divisor m and in the second subset, the data is divided by the divisor n. For lower
error probability, the group size m and n for both the subsets is kept equal.
The optimum value of m and n is set to 4. For verification, simulation and hardware
realization Golomb Coding compression and
decompression algorithms can be implemented on the Field Programmable Gate
Array (FPGA). The Modified Golumb Coder will also
serve as a good application for compressing an image.
CONCLUSION:
In this paper, algorithms of Golumb encoder and decoder for test vector compression has
been studied. A proportional study is also proposed between Golumb
Encoder and other data compression techniques. The successful working of these
algorithms can be proved by comparing the results generated using simulation of
the Golomb Encoder with the expected analytical
results which should be identical. The use of encoded data as the input data
for Golomb Decoder managed to generate back the
original data can prove the success of Golomb Decoder
System.
REFERENCES:
[1] S. W. Golomb, “Run
Length Encodings,” IEEE Transactions on Information Theory,vol. 12,
pp. 399-401, 1966.
[2] G. H. H’ng, M. F. M.
Salleh and Z. A. Halim,” Golomb Coding
Implementation in FPGA”, School
of Electricaland Electronic Engineering, Universiti Sains Malaysia,Seri
Ampangan,14300 Nibon Tebal,
Pulau Pinag, Malaysia. VOL. 10, NO. 2, 2008, 36-40.
[3] Walter D. Leon-Salas, Sina
Balkir, Khalid Sayood, and
Michael W. Hoffman, “An Analog-to-Digital Converter with Golomb–Rice
Output Codes” IEEE transactions on circuits and systems—ii: express
briefs, vol. 53, no. 4. APRIL 2006.
[4] Hong-Sik Kim, Joohong Lee, Hyunjin Kim, Sungh Kang, and Woo Chan Park, A Lossless Color
Image Compression Architecture using a
Parallel Golomb- Rice Hardware CODEC, IEEE
transactions on circuits and systems for video Technology, vol. 21, no. 11,
November 2011.
[5] K. Somasundaram and
S. Domnic, ”Extended Golomb Code for Integer Representation”, IEEE transactions
on multimedia, VOL. 9, NO. 2, February 2007.
[6] Chin-Chen Chang, “An Enhancement of JPEG Still
Image Compression with Adaptive Linear Regression and Golomb-Rice
coding, 2009 Ninth International Conference on Hybrid Intelligent
Systems.
[7] Anshuman Chandra,
Student Member, IEEE, and Krishnendu Chakrabarty, Senior Member, IEE “ System-on-a-Chip Test-Data
Compression and Decompression Architectures Based on Golomb
Codes”, IEEE transactions on computer
aided design of integrated circuits and systems, VOL. 20, NO. 3, MARCH 2001.
[8] Henrique S. Malvar,
“Lossless and Near-Lossless Audio Compression Using Integer- Reversible
Modulated Lapped Transforms, 2007 Data Compression Conference.
[9] X. Chen, N. Canagarajah,
J. L. Nunez-Yanez, and R.Vitulli,
“Hardware architecture for lossless image
compression based on context-based modeling and arithmetic coding,” in Proc.
IEEE Int. SoC Conf., Sep. 2007, pp. 251–254.
[10] T.-H. Tsai, Y.-H.
Lee, and Y.-Y. Lee, “Design and analysis of high throughput lossless
image compression engine using VLSI-oriented FELICS algorithm,” IEEE
Trans. Very Large Scale Integr. Syst., vol.18, no. 1, pp. 39–52, Jan. 2010.
[11] L. Xiaowen,
X. Chen, X. Xie, G. Li, L. Zhang, C. Zhang, and Z. Wang,“A low
power, fully pipelined JPEG-LS encoder for lossless image compression,” in Proc.
IEEE Int. Conf. Multimedia EXPO, Jul. 2007,pp. 1906–1909.
Received
on 20.06.2015 Modified on 25.06.2015
Accepted
on 30.07.2015 ©A&V Publications
All right reserved
Research J. Science and Tech. 7(4):Oct. – Dec. 2015;
Page 197-200
DOI: 10.5958/2349-2988.2015.00027.3