Chapter 7 Lossless Compression Algorithms


Multimedia Data Compression - Part I
Chapter 7: Lossless Compression Algorithms

Chapter 7: Lossless Compression Algorithms
1. Introduction
2. Basics of Information Theory
3. Lossless Compression Algorithms
   - Fixed-Length Coding: run-length coding, differential coding, dictionary-based coding
   - Variable-Length Coding: Shannon-Fano algorithm, Huffman coding algorithm

Introduction
- A digital file (video, image, or audio) can easily become very large; we deal with huge volumes of multimedia data.
- It can be useful or necessary to compress such files for ease of storage or delivery; we need more efficient data storage, processing, and transmission.
- However, while compression can save space or assist delivery, it can slow or delay the opening of the file, since it must be decompressed when displayed.
- Some forms of compression will also compromise the quality of the file.

Introduction
- Compression: the process of coding that effectively reduces the total number of bits needed to represent certain information.

Fig. 7.1: A General Data Compression Scheme.

Introduction
- Compression ratio denotes the relation between the size of the original data before compression and the size of the compressed data. The compression ratio therefore rates the effectiveness of a compression system in terms of its data-reduction capability:

  compression ratio = B0 / B1

  B0 - number of bits before compression
  B1 - number of bits after compression

- In general, we would like any codec (encoder/decoder scheme) to have a compression ratio much larger than 1. The higher the compression ratio, the better the lossless compression scheme.
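As a quick illustration (not part of the original slides), a minimal Python sketch of the compression-ratio formula:

```python
def compression_ratio(b0: int, b1: int) -> float:
    """Compression ratio = bits before compression (B0) / bits after compression (B1)."""
    return b0 / b1

# Example: 64 bits reduced to 16 bits, as in the motivating example later in the chapter.
print(compression_ratio(64, 16))  # 4.0 -> much larger than 1, so the codec is effective
```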

Introduction
- If the compression and decompression processes induce no information loss, then the compression scheme is lossless; otherwise, it is lossy.
- Lossy compression means that the quality of your file will be reduced. The right side of this image has been saved using lossy compression and doesn't look as good as the left.

Introduction
Lossless Compression Algorithms
- As the name suggests, there is no data loss in this type of compression: the decompressed data and the original data are identical. The compression and decompression algorithms are exact inverses of each other. The main mechanism is removing redundant data during compression and adding it back during decompression.
- Advantages: the original form of the data is preserved even though it is compressed.
- Disadvantages: the reduction in data size is small; sometimes the size can even increase instead of decrease.

Introduction
Lossy Compression Algorithms
- A compression technique that loses some of the data in the original source while trying to keep the visible quality almost the same. The compression ratio is very high, typically a value near 10. It removes information that is not sensitive to the human eye, so the compressed media is no longer identical to the media that existed before compression.
- Advantages: can reduce the file size more than lossless compression.
- Disadvantages: the original file cannot be recovered after decompression.

Motivating Example
- Transmit the data {250, 251, 251, 252, 253, 253, 254, 255} over the network.
- Rewrite the data sequence as binary vectors: {11111010, 11111011, 11111011, 11111100, 11111101, 11111101, 11111110, 11111111}
- In total, this requires 8 * 8 bits = 64 bits for transmission.
- The available bandwidth is limited: only 16 bits are available for transmitting the data.
- Compression is necessary!

Lossy Compression
- Encode: drop the least significant bits of each value (only the 2 most significant bits are kept).
- Encoded data: 8 * 2 bits = 16 bits

Lossless Compression
- Encode: keep the first value and encode the differences between consecutive values.
- Encoded data: 8 bits + 7 * 1 bit = 15 bits (the first value takes 8 bits; each of the 7 differences is 0 or 1, so 1 bit suffices).
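A small Python sketch of this motivating example (my own illustration, assuming the differences are taken between consecutive values, which here are all 0 or 1):

```python
data = [250, 251, 251, 252, 253, 253, 254, 255]

# Lossless differential encoding: keep the first value, then store successive differences.
first = data[0]
diffs = [b - a for a, b in zip(data, data[1:])]   # [1, 0, 1, 1, 0, 1, 1]

# Cost: 8 bits for the first value + 1 bit per difference (each difference is 0 or 1 here).
total_bits = 8 + len(diffs) * 1
print(diffs, total_bits)                          # [1, 0, 1, 1, 0, 1, 1] 15

# Decoding recovers the original sequence exactly, so the scheme is lossless.
decoded = [first]
for d in diffs:
    decoded.append(decoded[-1] + d)
assert decoded == data
```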

Lossless Compression Algorithms

Basics of Information Theory
Entropy
- The entropy η of an information source with alphabet S = {s1, s2, . . . , sn} is:

  η = H(S) = Σ_{i=1..n} p_i log2(1/p_i)    (7.2)
           = - Σ_{i=1..n} p_i log2 p_i     (7.3)

  p_i - probability that symbol s_i will occur in S.
  log2(1/p_i) - indicates the amount of information (self-information as defined by Shannon) contained in s_i, which corresponds to the number of bits needed to encode s_i.

Basics of Information Theory
Distribution of Gray-Level Intensities - Example
- For example, in an image with a uniform distribution of gray-level intensities, p_i = 1/256, so the number of bits needed to code each gray level is 8 bits. The entropy of this image is 8.

Fig. 7.2: Histograms for Two Gray-level Images.
- Fig. 7.2(a) shows the histogram of an image with a uniform distribution of gray-level intensities, i.e., p_i = 1/256 for all i. Hence, the entropy of this image is:

  η = Σ_{i=1..256} (1/256) log2 256 = 8    (7.4)

- Fig. 7.2(b) shows the histogram of an image with only two possible intensity values. Its entropy is 0.92.
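A small Python sketch of Eq. (7.2)/(7.3) applied to these two histograms (an added illustration; the 1/3 and 2/3 split for the two-valued image is assumed, since the slide only states the result 0.92):

```python
from math import log2

def entropy(probabilities):
    """H(S) = sum of p_i * log2(1 / p_i), skipping zero-probability symbols."""
    return sum(p * log2(1 / p) for p in probabilities if p > 0)

# Fig. 7.2(a): uniform distribution over 256 gray levels -> 8 bits/symbol.
print(entropy([1 / 256] * 256))              # 8.0

# Fig. 7.2(b): an image with two intensity values; with assumed probabilities
# 1/3 and 2/3 the entropy is about 0.92 bits/symbol.
print(round(entropy([1 / 3, 2 / 3]), 2))     # 0.92
```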

Basics of Information Theory
Entropy and Code Length
- The most significant interpretation of entropy in the context of lossless compression is that entropy represents the average amount of information contained per symbol in the source S, measured in bits/symbol.
- Entropy therefore gives a lower bound on the number of bits required to represent the source information, i.e., the minimum average number of bits per symbol needed to encode a message without loss.

Basics of Information Theory
Entropy Example
A. What is the entropy of the image below, where the numbers (0, 20, 50, 99) denote the gray-level intensities?

Entropy Example

Lossless Compression Algorithms
- Fixed-length coding
  - Run-length coding
  - Differential coding
  - Dictionary-based coding
- Variable-length coding
  - Shannon-Fano coding
  - Huffman coding

Fixed-length coding
- The length of the codeword is fixed.
1. Run-length coding
2. Differential coding
3. Dictionary-based coding

Fixed-length coding
1. Run-length coding
- Run-length coding is a data compression algorithm that encodes long runs of repeating items by sending only one item from the run and a counter showing how many times that item is repeated.
- Unfortunately, this technique is useless for compressing natural-language texts, because they do not contain long runs of repeating elements. On the other hand, it is useful for image compression, because images tend to have long runs of pixels with identical color.

Fixed-length coding
1. Run-length coding
- For example, the string AAABBCDDDD would be encoded as 3A2B1C4D.
  aaaabbaba -> 4a2b1a1b1a
- A real-life example where run-length encoding is quite effective is the fax machine. Most faxes are white sheets with occasional black text, so a run-length encoding scheme can take each line and transmit a code for white and the number of pixels, then the code for black and the number of pixels, and so on.
- This method of compression must be used carefully. If there is not a lot of repetition in the data, it is possible that the run-length encoding scheme would actually increase the size of a file.

Fixed-length coding
1. Run-length coding (Example)
- Original data: AAAAABBBAAAAAAAABBBB
  Using ASCII code: 20 * 8 bits = 20 * 1 byte = 20 bytes
- Run-length coding result: 5A3B8A4B
  Compressed data: 8 * 1 byte = 8 bytes < 20 bytes
- Compression ratio: 20/8 = 2.5 > 1 -> good

Fixed-length coding
1. Run-length coding (Example)
Extreme cases:
- Best case: AAAAAAAA -> 8A
  Compression ratio: 8/2 = 4
- Worst case: ABABABAB -> 1A1B1A1B1A1B1A1B
  Compression ratio: 8/16 = 0.5
- Negative compression: the resulting compressed file is larger than the original.

Example: Original data: ABABBABCABABBA. Use RLC and compute the compression ratio.
Run-length coding: 1A1B1A2B1A1B1C1A1B1A2B1A
Compression ratio: 14/24 < 1 -> negative compression
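The following Python sketch (an added illustration, assuming single-digit run lengths as in the slides' notation) reproduces these count-then-symbol results and compression ratios:

```python
from itertools import groupby

def rle_encode(text: str) -> str:
    """Count-then-symbol run-length encoding, e.g. 'AAABBCDDDD' -> '3A2B1C4D'."""
    return "".join(f"{len(list(run))}{symbol}" for symbol, run in groupby(text))

def rle_decode(encoded: str) -> str:
    # Assumes every run length is a single digit, as in the examples above.
    return "".join(int(encoded[i]) * encoded[i + 1] for i in range(0, len(encoded), 2))

for original in ["AAAAABBBAAAAAAAABBBB", "AAAAAAAA", "ABABABAB", "ABABBABCABABBA"]:
    encoded = rle_encode(original)
    ratio = len(original) / len(encoded)     # symbol counts stand in for byte counts
    print(f"{original} -> {encoded}  ratio = {ratio:.2f}")
    assert rle_decode(encoded) == original   # lossless: decoding restores the input
```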

Fixed-length coding
2. Differential coding
- In this method, a reference symbol is placed first. Then, for each symbol in the data, we store the difference between that symbol and the reference symbol.
- For example, using symbol A as the reference symbol, the string AAABBCDDDD would be encoded as A0001123333 (since A is the same as the reference symbol, B has a difference of 1 from the reference symbol, and so on).
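A small Python sketch of this differential scheme (an added illustration; it assumes single-digit differences, which holds for the slide's example):

```python
def differential_encode(text: str, reference: str = "A") -> str:
    """Encode each symbol as its alphabetic distance from a reference symbol,
    e.g. 'AAABBCDDDD' with reference 'A' -> 'A0001123333'."""
    return reference + "".join(str(ord(c) - ord(reference)) for c in text)

print(differential_encode("AAABBCDDDD"))     # A0001123333
```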

Fixed-length coding
3. Dictionary-based coding
- One of the best-known dictionary-based encoding algorithms is the Lempel-Ziv (LZ) compression algorithm.
- In this method, a dictionary (table) of variable-length strings (common phrases) is built. This dictionary contains almost every string that is expected to occur in the data.
- When any of these strings occurs in the data, it is replaced with the corresponding index into the dictionary.

Fixed-length coding
3. Dictionary-based coding
- In this method, instead of working with individual characters in text data, we treat each word as a string and output its index in the dictionary.
- For example, suppose the word "compression" has the index 4978 in one particular dictionary. To compress a body of text, each time the string "compression" appears, it would be replaced by 4978.
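A toy Python sketch in the spirit of this word-level description (an added illustration, not a real LZ/LZW implementation; the sample sentence and indices are made up):

```python
def build_dictionary(words):
    """Assign each distinct word the next free index, in order of first appearance."""
    dictionary = {}
    for word in words:
        dictionary.setdefault(word, len(dictionary))
    return dictionary

text = "data compression is lossless when decompression recovers the data"
words = text.split()
dictionary = build_dictionary(words)

# Encoding: replace every word by its dictionary index; repeated words reuse one index.
encoded = [dictionary[word] for word in words]
print(encoded)                                           # [0, 1, 2, 3, 4, 5, 6, 7, 0]

# Decoding: the inverse mapping from index back to word restores the text exactly.
inverse = {index: word for word, index in dictionary.items()}
print(" ".join(inverse[i] for i in encoded) == text)     # True
```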

Variable-length coding
- The length of the codeword is variable.
1. Shannon-Fano Algorithm
2. Huffman Coding Algorithm

Variable-length coding
1. Shannon-Fano Algorithm
Shannon-Fano Algorithm - a top-down approach
1. Sort the symbols according to the frequency count of their occurrences.
2. Recursively divide the symbols into two parts, each with approximately the same number of counts, until all parts contain only one symbol.

An Example: coding of "HELLO"

Symbol   H   E   L   O
Count    1   1   2   1

Frequency count of the symbols in "HELLO".

Fig. 7.3: Coding Tree for HELLO by Shannon-Fano.

Table 7.1: Result of Performing Shannon-Fano on HELLO

Symbol   Count   log2(1/p_i)   Code   # of bits used
L        2       1.32          0      2
H        1       2.32          10     2
E        1       2.32          110    3
O        1       2.32          111    3
TOTAL # of bits: 10

- On average it uses 10/5 = 2 bits to code each symbol, close to the lower bound of 1.92, so the result is satisfactory.
- p_i - probability that symbol s_i will occur in S.
- log2(1/p_i) - indicates the amount of information (self-information as defined by Shannon) contained in s_i, which corresponds to the number of bits needed to encode s_i.
- Entropy η = p_L * log2(1/p_L) + p_H * log2(1/p_H) + p_E * log2(1/p_E) + p_O * log2(1/p_O)
            = 0.4 * 1.32 + 0.2 * 2.32 + 0.2 * 2.32 + 0.2 * 2.32 = 1.92

Fig. 7.4: Another coding tree for HELLO by Shannon-Fano.

Symbol   Count   log2(1/p_i)   Code   # of bits used
L        2       1.32          00     4
H        1       2.32          01     2
E        1       2.32          10     2
O        1       2.32          11     2
TOTAL # of bits: 10

Example
- Suppose we have a message consisting of: [A(4), B(2), C(2), D(1), E(1)]
- Draw a Shannon-Fano tree for this distribution.

Steps
- First, sort the symbols based on their frequency.
- Second, try to split the collection into two groups of approximately equal total frequency, placing the group with the larger frequency on the right.
- Third, repeat steps (1, 2) in each subtree. (Splitting is done after sorting.)

Steps
- Only two splits are possible:
  {A(4), B(2)} | {C(2), D(1), E(1)}
  OR
  {A(4)} | {B(2), C(2), D(1), E(1)}

- The resulting Shannon-Fano tree: the root (10) splits into A(4) and (B,C,D,E)(6); (B,C,D,E) splits into B(2) and (C,D,E)(4); (C,D,E) splits into C(2) and (D,E)(2); (D,E) splits into D(1) and E(1).

Symbol   Code   Subtotal (# of bits)
A(4)     0      4*1 bit  = 4
B(2)     10     2*2 bits = 4
C(2)     110    2*3 bits = 6
D(1)     1110   1*4 bits = 4
E(1)     1111   1*4 bits = 4
TOTAL: 22 bits

- Average code length = 22 / 10 = 2.2 bits/symbol
- Compression Ratio = 8 / 2.2
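A recursive Python sketch of the top-down Shannon-Fano procedure (an added illustration; ties in the split may be resolved differently than in the figures, but on this input it reproduces the codes above, and it also reproduces Table 7.1 for "HELLO"):

```python
def shannon_fano(symbols):
    """symbols: list of (symbol, count) pairs sorted by decreasing count.
    Returns a dict mapping each symbol to its binary codeword."""
    if len(symbols) == 1:
        return {symbols[0][0]: ""}
    total = sum(count for _, count in symbols)
    best_diff, split, running = total, 1, 0
    # Choose the split point that divides the total count as evenly as possible.
    for i in range(1, len(symbols)):
        running += symbols[i - 1][1]
        diff = abs(total - 2 * running)
        if diff < best_diff:
            best_diff, split = diff, i
    left, right = symbols[:split], symbols[split:]
    codes = {s: "0" + c for s, c in shannon_fano(left).items()}
    codes.update({s: "1" + c for s, c in shannon_fano(right).items()})
    return codes

print(shannon_fano([("A", 4), ("B", 2), ("C", 2), ("D", 1), ("E", 1)]))
# {'A': '0', 'B': '10', 'C': '110', 'D': '1110', 'E': '1111'}  -> 22 bits in total
print(shannon_fano([("L", 2), ("H", 1), ("E", 1), ("O", 1)]))
# {'L': '0', 'H': '10', 'E': '110', 'O': '111'}  -> 10 bits, as in Table 7.1
```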

Variable-length coding
2. Huffman Coding Algorithm
Huffman Coding Algorithm - a bottom-up approach
1. Initialization: put all symbols in a list sorted according to their frequency counts.
2. Repeat until the list has only one symbol left:
   a. From the list, pick the two symbols with the lowest frequency counts. Form a Huffman subtree that has these two symbols as child nodes and create a parent node.
   b. Assign the sum of the children's frequency counts to the parent and insert it into the list such that the order is maintained.
   c. Delete the children from the list.
   d. Assign a codeword to each leaf based on the path from the root.

Fig. 7.5: Coding Tree for "HELLO" using the Huffman Algorithm.

In the figure, new symbols P1, P2, P3 are created to refer to the parent nodes in the Huffman coding tree. The contents of the list are illustrated below:
- After initialization: L H E O
- After iteration (a): L P1 H
- After iteration (b): L P2
- After iteration (c): P3
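A bottom-up Huffman sketch in Python (an added illustration; tie-breaking among equal frequencies may yield a different but equally optimal tree than Fig. 7.5):

```python
import heapq
from collections import Counter

def huffman_codes(text: str) -> dict:
    """Repeatedly merge the two lowest-frequency nodes, prefixing '0'/'1' to the
    codes of their leaves, until one tree remains (steps a-d above)."""
    # Each heap entry: (frequency, tie-breaker, {symbol: code-so-far}).
    heap = [(count, i, {symbol: ""})
            for i, (symbol, count) in enumerate(Counter(text).items())]
    heapq.heapify(heap)
    order = len(heap)
    while len(heap) > 1:
        f1, _, codes1 = heapq.heappop(heap)   # the two symbols/subtrees with the
        f2, _, codes2 = heapq.heappop(heap)   # lowest frequency counts
        merged = {s: "0" + c for s, c in codes1.items()}
        merged.update({s: "1" + c for s, c in codes2.items()})
        heapq.heappush(heap, (f1 + f2, order, merged))
        order += 1
    return heap[0][2]

codes = huffman_codes("HELLO")
print(codes)                                   # one optimal prefix code for H, E, L, O
print(sum(len(codes[ch]) for ch in "HELLO"))   # 10 bits in total, as on the slides
```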

Properties of Huffman Coding
- Unique prefix property: no Huffman code is a prefix of any other Huffman code; this precludes any ambiguity in decoding.
- Optimality: minimum-redundancy code, proved optimal for a given data model (i.e., a given, accurate probability distribution):
  - The two least frequent symbols will have Huffman codes of the same length, differing only in the last bit.
  - Symbols that occur more frequently will have shorter Huffman codes than symbols that occur less frequently.

Shannon-Fano vs. Huffman Encoding - Tutorial

Huffman Example (1)
A. What is the entropy (η) of the image below, where the numbers (0, 20, 50, 99) denote the gray-level intensities?
B. Show step by step how to construct the Huffman tree to encode the four intensity values in this image. Show the resulting code for each intensity value.
C. What is the average number of bits needed per pixel, using your Huffman code? How does it compare to η?


Huffman Example (2)

Huffman Example (2)
List contents after each Huffman iteration:
A(64), D(16), B(13), C(12), E(9), F(5)
A(64), D(16), P1(14), B(13), C(12)
A(64), P2(25), D(16), P1(14)
A(64), P3(30), P2(25)
A(64), P4(55)
P5(119)

Huffman Example (2)
- Average code length = Σ (P * L), where P is the probability of the symbol and L is the code length of the symbol.
- L_avg = 0.538 * 1 + 0.1092 * 3 + 0.1008 * 3 + 0.1345 * 3 + 0.0756 * 4 + 0.042 * 4 ≈ 2.042
- Entropy ≤ Average code length (bits/symbol), so Entropy ≤ 2.042 bits/symbol.
- Compression Ratio = Original symbol length before compression / Average code length after compression
- Compression Ratio = 8 / 2.042 ≈ 3.918 ≈ 4
  (8 because we are using ASCII, which stores each symbol in 1 byte, i.e., 8 bits)
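A quick Python check of this arithmetic (an added illustration; the symbol counts are those of Example (2) and the code lengths A:1, B:3, C:3, D:3, E:4, F:4 bits are read off the resulting Huffman tree):

```python
counts = {"A": 64, "B": 13, "C": 12, "D": 16, "E": 9, "F": 5}
code_lengths = {"A": 1, "B": 3, "C": 3, "D": 3, "E": 4, "F": 4}

total = sum(counts.values())                  # 119 symbols in the message
avg_len = sum(counts[s] / total * code_lengths[s] for s in counts)
print(round(avg_len, 3))                      # ~2.042 bits/symbol
print(round(8 / avg_len, 3))                  # compression ratio ~3.918 (8-bit ASCII input)
```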

Decompression
- Decompression is straightforward using the tree. Given the bit string "100001010101011":

  1                               -> A
  1 | 000                         -> A C
  1 | 000 | 0101                  -> A C E
  1 | 000 | 0101 | 0101           -> A C E E
  1 | 000 | 0101 | 0101 | 011     -> A C E E D

  100001010101011 -> ACEED
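A minimal Python decoder for this walkthrough (an added illustration; prefix_decode is a hypothetical helper, and the codebook lists only the codewords that appear in the example above):

```python
def prefix_decode(bits: str, codebook: dict) -> str:
    """Scan the bit string, emitting a symbol whenever the accumulated prefix
    matches a codeword; unambiguous because no codeword is a prefix of another."""
    inverse = {code: symbol for symbol, code in codebook.items()}
    decoded, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            decoded.append(inverse[current])
            current = ""
    return "".join(decoded)

# Codewords taken from the decoding steps above.
codebook = {"A": "1", "C": "000", "D": "011", "E": "0101"}
print(prefix_decode("100001010101011", codebook))   # ACEED
```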

