This project is split into 2 parts.
Part one: Read from a text file, then build a frequency table of the letters a-z, the space, the full stop, and the newline character, in this order. That is, you should read in the file and build an array long[] frequency = new long[29] so that frequency[0] gives the number of occurrences of ‘a’ in the text, frequency[1] that of ‘b’, and so on up to frequency[25] for ‘z’, followed by frequency[26] for ‘ ’, frequency[27] for ‘.’, and frequency[28] for ‘\n’. Other symbols should be ignored.
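A minimal sketch of the counting step in Java might look as follows. The class name LetterFrequency and the methods indexOf, count, and countFile are illustrative names, not part of the brief. The sketch assumes the index layout 0-25 for ‘a’-‘z’, 26 for the space, 27 for the full stop, and 28 for the newline, that the file is UTF-8, and that uppercase letters count as "other symbols" and are ignored (fold them with Character.toLowerCase if that is not what is wanted).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class LetterFrequency {
    // Assumed layout: 26 letters, then ' ', '.', '\n'.
    static final int SYMBOLS = 29;

    // Maps a character to its table index, or -1 if it should be ignored.
    static int indexOf(char c) {
        if (c >= 'a' && c <= 'z') {
            return c - 'a';
        }
        switch (c) {
            case ' ':  return 26;
            case '.':  return 27;
            case '\n': return 28;
            default:   return -1; // uppercase, digits, punctuation, '\r', ...
        }
    }

    // Counts the tracked symbols in a text that is already in memory.
    static long[] count(CharSequence text) {
        long[] frequency = new long[SYMBOLS];
        for (int k = 0; k < text.length(); k++) {
            int i = indexOf(text.charAt(k));
            if (i >= 0) {
                frequency[i]++;
            }
        }
        return frequency;
    }

    // Reads the whole file (assumed to be UTF-8) and counts its symbols.
    static long[] countFile(Path path) throws IOException {
        String text = new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
        return count(text);
    }
}
```

Reading the whole file at once keeps the sketch short; for very large inputs a buffered reader that counts character by character would avoid holding the text in memory.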
Part two: Huffman encoding is a lossless compression method based on a frequency analysis of the alphabet used. For example, in English the letters ‘e’ and ‘t’ occur often, whereas ‘z’, ‘q’, and ‘x’ occur rarely. The essential idea is to represent each letter of the alphabet uniquely by a sequence of bits (0 and 1 only), so that frequently occurring letters are represented by short bit sequences and rarely occurring ones by longer bit sequences. This is done by using the frequencies in the language (or the text) to build the so-called Huffman tree.
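As a rough illustration of how the tree can be built from a frequency table, the sketch below uses the standard greedy construction: put one leaf per symbol into a priority queue, then repeatedly merge the two lightest nodes until one root remains. The names HuffmanTree, Node, ALPHABET, and build are hypothetical. The sketch assumes a 29-entry table in the order a-z, space, full stop, newline, and it leaves out symbols whose frequency is zero.

```java
import java.util.PriorityQueue;

final class HuffmanTree {
    // A node is either a leaf (one symbol) or an internal node (two children).
    static final class Node implements Comparable<Node> {
        final long weight;
        final char symbol; // meaningful only for leaves
        final Node left;   // null for leaves
        final Node right;  // null for leaves

        Node(char symbol, long weight) {
            this.weight = weight;
            this.symbol = symbol;
            this.left = null;
            this.right = null;
        }

        Node(Node left, Node right) {
            this.weight = left.weight + right.weight;
            this.symbol = '\0';
            this.left = left;
            this.right = right;
        }

        boolean isLeaf() {
            return left == null && right == null;
        }

        @Override
        public int compareTo(Node other) {
            return Long.compare(weight, other.weight);
        }
    }

    // Assumed symbol order of the frequency table.
    static final String ALPHABET = "abcdefghijklmnopqrstuvwxyz .\n";

    static Node build(long[] frequency) {
        PriorityQueue<Node> queue = new PriorityQueue<>();
        for (int i = 0; i < ALPHABET.length(); i++) {
            if (frequency[i] > 0) { // symbols that never occur get no leaf
                queue.add(new Node(ALPHABET.charAt(i), frequency[i]));
            }
        }
        if (queue.isEmpty()) {
            throw new IllegalArgumentException("no symbol has a non-zero frequency");
        }
        // Merge the two lightest nodes until a single root is left.
        while (queue.size() > 1) {
            Node first = queue.poll();
            Node second = queue.poll();
            queue.add(new Node(first, second));
        }
        return queue.poll();
    }
}
```

Ties between equal weights are broken arbitrarily by the queue, so the exact tree shape (and therefore the exact bit patterns) is not unique, although the total encoded length is the same for every valid Huffman tree. If only one symbol occurs, the result is a single leaf, which needs special handling when assigning codes.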
(a) Use the frequencies computed in Part one to build a Huffman tree.
(b) Write two static methods encode and decode. encode takes a String and a Huffman tree and returns a String of 0s and 1s that encodes the string according to the Huffman tree. decode is the inverse method that takes the String of 0s and 1s and returns the corresponding original string.
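One possible shape for the two methods is sketched below. The class HuffmanCodec and its nested Node type are illustrative stand-ins for whatever tree representation is actually used. The sketch assumes that a left branch means 0 and a right branch means 1, that the tree has at least two leaves, and that decode also receives the tree, since the bit string alone is not enough to recover the text.

```java
import java.util.HashMap;
import java.util.Map;

final class HuffmanCodec {
    // Minimal tree node used only for this sketch.
    static final class Node {
        final char symbol; // meaningful only for leaves
        final Node left;
        final Node right;

        Node(char symbol) {
            this.symbol = symbol;
            this.left = null;
            this.right = null;
        }

        Node(Node left, Node right) {
            this.symbol = '\0';
            this.left = left;
            this.right = right;
        }

        boolean isLeaf() {
            return left == null && right == null;
        }
    }

    // Records the root-to-leaf path of every symbol (left = '0', right = '1').
    private static void collect(Node node, String path, Map<Character, String> codes) {
        if (node.isLeaf()) {
            codes.put(node.symbol, path);
            return;
        }
        collect(node.left, path + '0', codes);
        collect(node.right, path + '1', codes);
    }

    static String encode(String text, Node tree) {
        Map<Character, String> codes = new HashMap<>();
        collect(tree, "", codes);
        StringBuilder bits = new StringBuilder();
        for (char c : text.toCharArray()) {
            String code = codes.get(c);
            if (code == null) {
                throw new IllegalArgumentException("symbol not in tree: " + c);
            }
            bits.append(code);
        }
        return bits.toString();
    }

    static String decode(String bits, Node tree) {
        StringBuilder text = new StringBuilder();
        Node node = tree;
        for (char bit : bits.toCharArray()) {
            if (bit != '0' && bit != '1') {
                throw new IllegalArgumentException("not a bit: " + bit);
            }
            node = (bit == '0') ? node.left : node.right;
            if (node.isLeaf()) { // a complete code has been read
                text.append(node.symbol);
                node = tree;
            }
        }
        if (node != tree) {
            throw new IllegalArgumentException("bit string ends in the middle of a code");
        }
        return text.toString();
    }
}
```

For a hand-built tree in which ‘a’ is the left child of the root and ‘b’ and ‘c’ sit under the right child, the codes are 0, 10, and 11, so encode("abca", tree) gives "010110" and decode("010110", tree) gives back "abca". Because no code is a prefix of another, decode never needs separators between symbols.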
The following example of a Huffman tree is taken from Wikipedia, http://en.wikipedia.org/wiki/DOT_%28graph_description_language%29#mediaviewer/File:Huffman_%28To_be_or_not_to_be%29.svg.