Lecture 7

 

Chapter 5

 

·        Data Compression – Makes files smaller

·        Advantages/Disadvantages

·        Uses less external storage

·        Faster read/write (transfer) time, since fewer bytes move to and from the device

·        Fewer seeks and less rotational latency, since a smaller file spans fewer tracks

·        Requires more CPU time to compress/decompress

·        Compact notation – Redundancy reduction

·        Example

·        State names require about 14 bytes (max)

·        State abbreviations require 2 bytes

·        State sequence numbers require 6 bits (50 states fit within 2^6 = 64 values)

·        Costs

·        Binary codes can’t be read (easily) by humans

·        Encoding/decoding costs time

·        Must include encode/decode logic in all programs requiring state names

·        Is the cost worth it?    It depends!
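
The state example above can be sketched in a few lines. This is only an illustration: the table `STATES` is a shortened, hypothetical list (a real one would hold all 50 names), and the helper names `encode_state`, `decode_state` and `bits_needed` are assumptions, not part of the lecture.

```python
# Hypothetical, shortened table; a real table would list all 50 states.
STATES = ["Alabama", "Alaska", "Arizona", "Arkansas", "California"]
CODE_OF = {name: number for number, name in enumerate(STATES)}


def bits_needed(n_values):
    """Smallest number of bits that can distinguish n_values different codes."""
    return max(1, (n_values - 1).bit_length())


def encode_state(name):
    """Replace a state name with its sequence number (the compact notation)."""
    return CODE_OF[name]


def decode_state(code):
    """Recover the human-readable name from its sequence number."""
    return STATES[code]


print(bits_needed(50))                         # bits required for 50 states
print(decode_state(encode_state("Arizona")))   # round trip back to the name
```

Note that every program that needs the state name must carry the table and the decode step, which is the cost listed above.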

·        Suppressing Repeating Sequences

·        Run-length encoding

·        Encode each run as a run-length indicator, a length value and the target (repeated) byte

·        Can produce output longer than the original if not careful (e.g. when short runs are encoded)
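
A minimal sketch of run-length encoding under stated assumptions: the indicator byte is taken to be `0xFF`, each run is written as three bytes (indicator, length, target byte), and only runs of at least four bytes are encoded. The names `RUN_MARKER`, `MIN_RUN`, `rle_encode` and `rle_decode` are chosen for illustration and do not come from the lecture.

```python
RUN_MARKER = 0xFF  # assumed run-length indicator byte
MIN_RUN = 4        # an encoded run costs 3 bytes, so shorter runs are copied as-is


def rle_encode(data):
    out = bytearray()
    i = 0
    while i < len(data):
        byte = data[i]
        run = 1
        # Count how many times the byte repeats (length must fit in one byte).
        while i + run < len(data) and data[i + run] == byte and run < 255:
            run += 1
        if run >= MIN_RUN or byte == RUN_MARKER:
            # A literal marker byte is always written as a run so that the
            # decoder is never confused; this is where output can grow.
            out += bytes([RUN_MARKER, run, byte])
        else:
            out += bytes([byte]) * run
        i += run
    return bytes(out)


def rle_decode(data):
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i] == RUN_MARKER:
            length, byte = data[i + 1], data[i + 2]
            out += bytes([byte]) * length
            i += 3
        else:
            out.append(data[i])
            i += 1
    return bytes(out)


packed = rle_encode(b"AAAAAABC")
print(len(packed), rle_decode(packed))
```

With these assumptions a single `0xFF` byte in the input expands to three bytes in the output, which is one way the encoded string can end up longer than the original.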

·        Variable-length Codes

·        Use shortest encodings for most frequently occurring characters

·        E.g. Morse Code – “e” and “t” get . and - respectively

·        Huffman Codes

·        Instantaneous Code – You know you are at the end of a coded character without having to examine the next character

·        No code is a prefix for another code

·        The characters with the highest probabilities of occurrence are assigned the shortest codes

·        Codes can be constructed using a binary tree (see if you can figure out the algorithm!)
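
One possible answer to the binary-tree question, offered as a sketch rather than the definitive method: repeatedly merge the two least frequent subtrees, then read each character's code off the path from the root (left = 0, right = 1). The function names `huffman_codes` and `decode_bits` are hypothetical, and character frequencies are simply counted from the sample text.

```python
import heapq
from collections import Counter


def huffman_codes(text):
    """Build a prefix-free code table {char: bit string} from character counts."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries are (weight, tie_breaker, tree); a tree is a char or a pair.
    heap = [(w, i, ch) for i, (ch, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)    # two lowest-weight subtrees
        w2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next_id, (left, right)))
        next_id += 1
    codes = {}

    def walk(tree, prefix):
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:
            codes[tree] = prefix

    walk(heap[0][2], "")
    return codes


def decode_bits(bits, codes):
    """Decode without lookahead: emit a character as soon as a code matches."""
    inverse = {code: ch for ch, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return "".join(out)


table = huffman_codes("aaaabbc")
encoded = "".join(table[ch] for ch in "aaaabbc")
print(table, decode_bits(encoded, table))
```

Because no code is a prefix of another, `decode_bits` can emit each character the moment its last bit arrives, which is the instantaneous property noted above.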

·        Irreversible Compression Techniques

·                    Some information is lost, so the original cannot be reconstructed exactly

·                    Typical uses: voice, pictures, etc.
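
A small illustration of why such techniques are irreversible, assuming 8-bit samples (as in simple audio or grayscale data) that are reduced to 4 bits by discarding the low-order bits. The functions `quantize` and `reconstruct` are hypothetical names for this sketch and do not represent any particular audio or image format.

```python
def quantize(samples, keep_bits=4):
    """Keep only the top keep_bits of each 8-bit sample (information is discarded)."""
    shift = 8 - keep_bits
    return [s >> shift for s in samples]


def reconstruct(codes, keep_bits=4):
    """Approximate the original samples by the midpoint of each quantization bucket."""
    shift = 8 - keep_bits
    half = (1 << shift) >> 1
    return [(c << shift) + half for c in codes]


original = [0, 17, 100, 255]
approx = reconstruct(quantize(original))
print(original, approx)  # close to the original, but not identical
```

The stored data is half the size, and the reconstructed values are near the originals but cannot be made identical to them.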

 

 

·        The Master File Update Problem

·        Given:

·        a sorted old master file

·        a sorted transaction file containing adds, changes and deletes

·        Output a sorted new master with the transactions applied to the old master file

·        Write the pseudo-code for this algorithm
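
One possible solution sketch, written in Python rather than pseudo-code. It assumes both files are already in memory as lists sorted by key, that a master record is a `(key, data)` pair, that a transaction is a `(key, op, data)` triple with `op` being `"A"` (add), `"C"` (change) or `"D"` (delete), and that invalid transactions are collected in an error list instead of stopping the run. All of these representations, and the name `update_master`, are assumptions made for illustration.

```python
def update_master(old_master, transactions):
    """Merge sorted transactions into a sorted old master in a single pass.

    Returns (new_master, errors).
    """
    new_master, errors = [], []
    m = t = 0
    while m < len(old_master) or t < len(transactions):
        # The next key to process is the smaller of the two current keys.
        candidates = []
        if m < len(old_master):
            candidates.append(old_master[m][0])
        if t < len(transactions):
            candidates.append(transactions[t][0])
        key = min(candidates)

        # Load the old master record for this key, if there is one.
        if m < len(old_master) and old_master[m][0] == key:
            current, exists = old_master[m][1], True
            m += 1
        else:
            current, exists = None, False

        # Apply every transaction that carries this key, in file order.
        while t < len(transactions) and transactions[t][0] == key:
            _, op, data = transactions[t]
            if op == "A" and not exists:
                current, exists = data, True
            elif op == "C" and exists:
                current = data
            elif op == "D" and exists:
                current, exists = None, False
            else:
                errors.append((key, op, "invalid for current record state"))
            t += 1

        # Write the record to the new master unless it was deleted or never added.
        if exists:
            new_master.append((key, current))
    return new_master, errors


old = [(1, "a"), (3, "c"), (5, "e")]
trans = [(2, "A", "b"), (3, "C", "C!"), (5, "D", None), (7, "D", None)]
print(update_master(old, trans))
```

Because both inputs are sorted, each file is read sequentially exactly once, and the new master comes out sorted without a separate sort step.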