
Seminar Report

DATA COMPRESSION

Seminar Coordinator: Deepti Jain

Name: Nikhil Nanda    Enrollment No.: 14920802710

Department of Computer Science and Engineering / Information Technology, Bhagwan Parshuram Institute of Technology

DATA COMPRESSION

Table of Contents

1. INTRODUCTION
2. OBJECTIVE, MOTIVATION & APPLICATION
3. DATA COMPRESSION METHODS
   3.1 LOSSLESS COMPRESSION
       3.1.1 Run-length encoding
       3.1.2 Huffman encoding
       3.1.3 Lempel-Ziv
   3.2 LOSSY COMPRESSION
       3.2.1 MPEG
       3.2.2 JPEG
       3.2.3 MP3
4. PERFORMANCE ANALYSIS
5. CONCLUSION
6. REFERENCES

1. INTRODUCTION:

In computer science and information theory, data compression, source coding, or bit-rate reduction involves encoding information in fewer bits than the original representation. Lossless compression reduces bits by identifying and eliminating statistical redundancy; no information is lost. Lossy compression reduces bits by identifying unnecessary information and removing it. The process of reducing the size of a data file is popularly referred to as data compression, although its formal name is source coding (coding done at the source of the data before it is stored or transmitted).

Compression is useful because it reduces resource usage, such as storage space or transmission capacity. Because compressed data must be decompressed before use, this extra processing imposes computational or other costs; the situation is far from a free lunch. Data compression is therefore subject to a space-time complexity trade-off. For instance, a compression scheme for video may require expensive hardware for the video to be decompressed fast enough to be viewed as it is being decompressed, while decompressing the video in full before watching it may be inconvenient or require additional storage. The design of data compression schemes thus involves trade-offs among various factors, including the degree of compression, the amount of distortion introduced (when using lossy data compression), and the computational resources required to compress and decompress the data.


2. OBJECTIVE: The prime objective of data compression is to reduce resource usage, such as storage space or transmission capacity. To reduce data storage requirements and/or data communication costs, there is a need to reduce the redundancy in the data representation, i.e., to compress the data.

MOTIVATION: Documents for text editors and for legal, medical, and library applications require very high-capacity storage devices, and the number of systems handling such material is increasing rapidly. At the same time, the proliferation of computer communication networks and teleprocessing applications involves massive transfers of data over long-distance communication links. To reduce data storage requirements and/or data communication costs, there is a need to reduce the redundancy in the data representation, i.e., to compress the data. Data compression also reduces the load on input/output channels in a computer installation.

APPLICATION: This technology is applicable to legal, medical, and library applications, among other domains that store or transmit large volumes of data.

3. Data Compression Methods

Data compression is about storing and sending a smaller number of bits. There are two major categories of compression methods: lossless and lossy.

3.1 Lossless compression

Lossless compression reduces bits by identifying and eliminating statistical redundancy. No information is lost: the original data and the data after compression and decompression are exactly the same. Redundant data is removed during compression and added back during decompression. Lossless methods are used when we cannot afford to lose any data, e.g., legal and medical documents and computer programs.

3.1.1 Run-length encoding


Run-length encoding is the simplest compression method: replace consecutive repeated occurrences of a symbol with one occurrence of the symbol followed by the number of occurrences.

The method is even more efficient if the data uses only two symbols (0s and 1s) in its bit patterns and one symbol is more frequent than the other.
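The replace-a-run-with-symbol-and-count idea above can be sketched in a few lines of Python (function names are illustrative):

```python
def rle_encode(data):
    """Replace each run of a repeated symbol with a (symbol, count) pair."""
    if not data:
        return []
    runs = []
    current, count = data[0], 1
    for symbol in data[1:]:
        if symbol == current:
            count += 1          # extend the current run
        else:
            runs.append((current, count))
            current, count = symbol, 1
    runs.append((current, count))
    return runs

def rle_decode(runs):
    """Expand each (symbol, count) pair back into a run."""
    return "".join(symbol * count for symbol, count in runs)

encoded = rle_encode("AAAABBBCCD")
print(encoded)   # [('A', 4), ('B', 3), ('C', 2), ('D', 1)]
assert rle_decode(encoded) == "AAAABBBCCD"
```

Note that RLE only pays off when runs are long; on data with few repeats, the (symbol, count) pairs can be larger than the input.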


3.1.2 Huffman encoding


The basic idea in Huffman coding is to assign short codewords to input blocks with high probabilities and long codewords to those with low probabilities. A Huffman code is designed by merging together the two least probable characters and repeating this process until only one character remains. A code tree is thus generated, and the Huffman code is obtained from the labelling of the code tree.

3.1.3 Lempel-Ziv
It is dictionary-based encoding.

Basic idea: create a dictionary (a table) of strings used during communication. If both the sender and the receiver have a copy of the dictionary, then previously encountered strings can be substituted by their index in the dictionary.

Algorithm: extract the smallest substring that cannot be found in the remaining uncompressed string; store that substring in the dictionary as a new entry and assign it an index value; the substring is replaced with the index found in the dictionary; insert the index and the last character of the substring into the compressed string.
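The algorithm above corresponds to LZ78-style encoding; a small sketch (illustrative, not a production implementation):

```python
def lz78_encode(text):
    """Emit (dictionary index of the known prefix, next character)
    pairs; index 0 means 'no prefix'."""
    dictionary = {}              # phrase -> 1-based index
    output = []
    phrase = ""
    for ch in text:
        if phrase + ch in dictionary:
            phrase += ch         # keep extending a known phrase
        else:
            output.append((dictionary.get(phrase, 0), ch))
            dictionary[phrase + ch] = len(dictionary) + 1
            phrase = ""
    if phrase:                   # flush a trailing known phrase
        output.append((dictionary.get(phrase[:-1], 0), phrase[-1]))
    return output

def lz78_decode(pairs):
    """Rebuild the dictionary on the fly; no copy needs to be sent."""
    phrases = [""]
    out = []
    for index, ch in pairs:
        phrase = phrases[index] + ch
        phrases.append(phrase)
        out.append(phrase)
    return "".join(out)

encoded = lz78_encode("ABAABABAABAB")
print(encoded)   # [(0, 'A'), (0, 'B'), (1, 'A'), (2, 'A'), (4, 'A'), (4, 'B')]
assert lz78_decode(encoded) == "ABAABABAABAB"
```

Because the decoder reconstructs the dictionary from the pairs themselves, the dictionary never has to be transmitted.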

3.2 Lossy compression


In lossy compression, some loss of information is acceptable; dropping nonessential detail from the data source can save storage space. Lossy data compression schemes are informed by research on how people perceive the data in question. For example, the human eye is more sensitive to subtle variations in luminance than it is to variations in color.

3.2.1 MPEG
Used to compress video. Basic idea: each video is a rapid sequence of frames, and each frame is a spatial combination of pixels (a picture). Compressing video = spatially compressing each frame + temporally compressing a set of frames.

Spatial compression: each frame is spatially compressed with JPEG.

Temporal compression: redundant frames are removed. For example, in a static scene in which someone is talking, most frames are the same except for the segment around the speaker's lips, which changes from one frame to the next.
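The temporal-compression idea can be illustrated with a toy frame-differencing sketch, where flat lists of pixel values stand in for real frames (MPEG itself uses far more sophisticated motion compensation):

```python
def frame_delta(prev, curr):
    """Keep only the pixels that changed between consecutive frames,
    as (position, new_value) pairs."""
    return [(i, v) for i, (p, v) in enumerate(zip(prev, curr)) if p != v]

def apply_delta(prev, delta):
    """Reconstruct the next frame from the previous one plus the delta."""
    frame = list(prev)
    for i, v in delta:
        frame[i] = v
    return frame

frame1 = [10, 10, 10, 10, 10, 10, 10, 10]
frame2 = [10, 10, 10, 99, 98, 10, 10, 10]   # only the "lips" region moved
delta = frame_delta(frame1, frame2)
assert apply_delta(frame1, delta) == frame2
assert len(delta) == 2   # far fewer values to store than a full frame
```

For a mostly static scene, each delta is tiny compared with a full frame, which is exactly why removing redundant frames pays off.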

3.2.2 JPEG
Used to compress pictures and graphics. In JPEG, a grayscale picture is divided into 8x8 pixel blocks to decrease the number of calculations. Basic idea: transform the picture into a linear (vector) set of numbers that reveals the redundancies; the redundancies are then removed by one of the lossless compression methods.

DCT: Discrete Cosine Transform

DCT transforms the 64 values in an 8x8 pixel block in such a way that the relative relationships between pixels are kept but the redundancies are revealed.
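A naive, direct implementation of the 8x8 DCT-II (the transform JPEG applies to each block) shows how the energy concentrates in the low-frequency coefficients; real codecs use fast transforms rather than this O(n^4) loop:

```python
import math

def dct_2d(block):
    """Direct 8x8 2-D DCT-II; returns an 8x8 grid of coefficients
    with the low frequencies in the top-left corner."""
    n = 8
    def alpha(k):   # orthonormal scaling factors
        return math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = sum(block[x][y]
                    * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                    * math.cos((2 * y + 1) * v * math.pi / (2 * n))
                    for x in range(n) for y in range(n))
            out[u][v] = alpha(u) * alpha(v) * s
    return out

flat = [[128] * 8 for _ in range(8)]   # a uniform gray block
coeffs = dct_2d(flat)
# A constant block puts all its energy in the single DC coefficient:
assert round(coeffs[0][0]) == 1024
assert all(abs(coeffs[u][v]) < 1e-6
           for u in range(8) for v in range(8) if (u, v) != (0, 0))
```

JPEG then quantizes these coefficients, zeroing most of the high-frequency ones, and entropy-codes the result losslessly.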

3.2.3 MP3
These techniques remove non-audible (or less audible) components of the signal. They are used for speech or music. Speech: compress a 64 kbps digitized signal. Music: compress a 1.411 Mbps signal. There are two categories of techniques: predictive encoding and perceptual encoding.

Predictive encoding: only the differences between samples are encoded, not the whole sample values. Several standards use it: GSM (13 kbps), G.729 (8 kbps), and G.723.3 (6.4 or 5.3 kbps).

Perceptual encoding: CD-quality audio needs at least 1.411 Mbps and cannot be sent over the Internet without compression. MP3 (MPEG audio layer 3) uses perceptual encoding to compress audio.
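The encode-only-the-differences idea behind predictive encoding can be sketched as a toy delta coder (real codecs such as DPCM additionally quantize the differences):

```python
def predictive_encode(samples):
    """Encode each sample as its difference from the previous sample."""
    prev = 0
    diffs = []
    for s in samples:
        diffs.append(s - prev)
        prev = s
    return diffs

def predictive_decode(diffs):
    """Accumulate the differences to recover the original samples."""
    out, prev = [], 0
    for d in diffs:
        prev += d
        out.append(prev)
    return out

samples = [100, 102, 103, 103, 101, 99]
diffs = predictive_encode(samples)
assert diffs == [100, 2, 1, 0, -2, -2]   # small values, cheaper to encode
assert predictive_decode(diffs) == samples
```

Because adjacent audio samples are highly correlated, the differences cluster near zero and can be stored in far fewer bits than the raw sample values.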


4. Performance analysis: Data compression is an effective means of saving storage space and network bandwidth. A large number of compression schemes have been devised based on character encoding or on detection of repetitive strings [2, 18]. Many compression schemes achieve data reduction rates of 2.3-2.5 bits per character for English text [2], i.e., compression factors of about 3-4. Since compression schemes are so successful for network bandwidth, the advantageous effects of data compression on I/O performance in database systems are rather obvious, i.e., its effects on disk space, bandwidth, and throughput. However, we believe that the benefits of compression in database systems can be observed and exploited beyond I/O performance. Database performance strongly depends on the amount of available memory, be it as I/O buffers or as work space for query processing algorithms. Therefore, it seems logical to try to use all available memory as effectively as possible, in other words, to keep and manipulate data in memory in compressed form. This requires, of course, that the query processing algorithms can operate on compressed data.
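The compression factor quoted above (original size / compressed size) is easy to measure with Python's standard `zlib`; note the sample text here is highly repetitive, so it compresses far better than typical English prose would:

```python
import zlib

# Compression factor = original size / compressed size.
text = ("data compression is an effective means for saving storage "
        "space and network bandwidth. " * 50).encode("utf-8")

compressed = zlib.compress(text, 9)          # level 9 = best compression
factor = len(text) / len(compressed)
bits_per_char = 8 * len(compressed) / len(text)

print(f"factor: {factor:.1f}, bits/char: {bits_per_char:.2f}")
assert zlib.decompress(compressed) == text   # lossless round trip
assert factor > 3
```

On realistic English text, dictionary coders like this typically land in the 3-4x range the section cites; repeated text, as here, compresses much further.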

5. Conclusion: In this paper, we have outlined a set of fairly simple techniques to achieve database performance improvements through data compression. The key ideas are to compress attributes individually, to employ the same compression scheme for all attributes of a domain, and to perform data manipulations before decompressing the data. Not only do these techniques reduce space requirements on disk and improve I/O performance when measured in records per unit time for permanent and temporary data, they also reduce memory requirements, thus reducing the number of buffer faults that result in I/O. When data compression is used in conjunction with algorithms that use large work spaces, even modest compression can result in significant performance gains. Furthermore, our techniques for processing compressed data are very easy to implement. In a simple performance comparison, we have seen that for data sets larger than memory, performance gains larger than the compression factor can be obtained, because a larger fraction of the data can be retained in the workspace allocated to a query processing operator.


6. References:

1. "Average Profile for the Generalized Digital Search Tree and the Generalized Lempel-Ziv Algorithm" (with G. Louchard and J. Tang), SIAM J. Computing, 2004.
2. "Pattern Matching Image Compression with Prediction Loop: Preliminary Experimental Results" (with D. Arnaud), Purdue University, CSD-TR-96-069, 1996.
3. "A Suboptimal Lossy Data Compression Based on Approximate Pattern Matching" (with T. Lucak), IEEE Trans. Information Theory, 43, 1439-1451, 1997.
4. J. J. Rissanen and G. G. Langdon, Jr., "Run-Length Coding," IBM J. Research and Development, Vol. 23, No. 2, 1979.
5. T. Y. Young and P. S. Liu, "Overhead Storage Considerations for Data File Compression," IEEE Trans. Software Eng., Vol. 6, No. 4, 1980, pp. 340-347.
6. http://www.computer.org
