溫馨提示×

溫馨提示×

您好,登錄后才能下訂單哦!

密碼登錄×
登錄注冊×
其他方式登錄
點(diǎn)擊 登錄注冊 即表示同意《億速云用戶服務(wù)條款》

java實(shí)現(xiàn)哈夫曼壓縮與解壓縮的方法

發(fā)布時(shí)間:2020-10-10 03:12:55 來源:腳本之家 閱讀:328 作者:楊濤的博客 欄目:編程語言

一哈夫曼樹以及文件壓縮原理:

1.哈夫曼樹 :

給定N個(gè)權(quán)值作為N個(gè)葉子結(jié)點(diǎn),構(gòu)造一棵二叉樹,若該樹的帶權(quán)路徑長度達(dá)到最小,稱這樣的二叉樹為最優(yōu)二叉樹,也稱為哈夫曼樹。哈夫曼樹是帶權(quán)路徑長度最短的樹,權(quán)值較大的結(jié)點(diǎn)離根較近(頻率越高的結(jié)點(diǎn)離根越進(jìn))。

以 下數(shù)組為例,構(gòu)建哈夫曼樹

int a[] = {0,1,2,3,4,5,6,7,8}

java實(shí)現(xiàn)哈夫曼壓縮與解壓縮的方法

我們可以發(fā)現(xiàn)以下規(guī)律

1:9個(gè)數(shù)構(gòu)成的哈夫曼樹一共有17個(gè)結(jié)點(diǎn),也就是可以n個(gè)數(shù)可以生產(chǎn)2*n-1個(gè)結(jié)點(diǎn)

2:數(shù)字越大的數(shù)離根節(jié)點(diǎn)越近,越小的數(shù)離根節(jié)點(diǎn)越近。

2.如何利用haffman編碼實(shí)現(xiàn)文件壓縮:

比如abc.txt文件中有以下字符aaaabbbccde,

1.進(jìn)行字符統(tǒng)計(jì)

aaaabbbccde
 
a : 4次
b : 3次
c : 2次
d : 1次
e : 1次

2.用統(tǒng)計(jì)結(jié)果構(gòu)建哈夫曼樹

java實(shí)現(xiàn)哈夫曼壓縮與解壓縮的方法

3.用哈夫曼樹生成哈夫曼編碼(從根結(jié)點(diǎn)開始,路徑左邊記為0,右邊記為1):

a的編碼:1
b的編碼:01
c的編碼:000
d的編碼:0011
e的編碼:0010

4.哈夫曼編碼代替字符,進(jìn)行壓縮。

源文件內(nèi)容為:aaaabbbccde
將源文件用對應(yīng)的哈夫曼編碼(haffman code)替換,則有:11110101 01000000 00110010 (總共3個(gè)字節(jié))

由此可見,源文件一共有11個(gè)字符,占11字節(jié)的內(nèi)存,但是經(jīng)過用haffman code替換之后,只占3個(gè)字節(jié),這樣就能達(dá)到壓縮的目的

二主要技術(shù)點(diǎn):

1.哈夫曼樹算法(哈夫曼壓縮的基本算法)

2.哈希算法(字符統(tǒng)計(jì)時(shí)候會(huì)用到,也可以直接用HashMap統(tǒng)計(jì))

3.位運(yùn)算(涉及到將指定位,置0或置1)

4.java文件操作,以及緩沖操作。

5.存儲(chǔ)模式(大端存儲(chǔ),小端存儲(chǔ),能看懂文件16進(jìn)制的形式)

7.設(shè)置壓縮密碼,解壓輸入密碼解壓(小編自己加的內(nèi)容)

三實(shí)現(xiàn)過程:

以上述aaaabbbccde為例

1.字符統(tǒng)計(jì):

public class FreqHuf {
	public static int BUFFER_SIZE = 1 << 18;
	int freq[] = new int[256];
	File file;
	int count;
	List<HuffmanFreq> list;
	
	FreqHuf(String pathname) throws Exception {
		list = new ArrayList<>();
		this.file = new File(pathname);
		if(!file.exists()){
			throw new Exception("文件不存在");
		}
		System.out.println("進(jìn)行字符統(tǒng)計(jì)中");
		CensusChar();
		System.out.println("字符統(tǒng)計(jì)完畢");
	}
	
	public void CensusChar() throws IOException{
		int intchar;
		FileInputStream fis = new FileInputStream(file);
		System.out.println("統(tǒng)計(jì)中");
 
//這種統(tǒng)計(jì)處理方案,速度極慢,不建議使用,以下采用緩存讀數(shù)據(jù)。
//		while((intchar = fis.read()) != -1){
//			freq[intchar]++;
//		}
 
		//這里采用緩存機(jī)制,一次讀1 << 18個(gè)字節(jié),大大提高效率。
		byte[] bytes = new byte[BUFFER_SIZE];
		while((intchar = fis.read(bytes))!= -1){
			for(int i = 0; i < intchar;i++){
				int temp = bytes[i]& 0xff;
				freq[temp]++;
			}
		}
		
		
		
		fis.close();
		
		for(int i = 0; i < 256; i++){
			if(freq[i] != 0){
				this.count++;
			}
		}
		
		int index = 0;
		for(int i = 0; i < 256; i++){
			if(freq[i] != 0){
				HuffmanFreq huffman = new HuffmanFreq();
				huffman.character = (char)i;
				huffman.freq = freq[i];
				list.add(index, huffman);
			}
		}
	}
}
//統(tǒng)計(jì)每個(gè)字符和其頻率的類
public class HuffmanFreq {
	char character;
	int freq;
	
	HuffmanFreq() {
	}
	
	HuffmanFreq(int character,int freq) {
		this.character = (char)character;
		this.freq = freq;
	}
 
	char getCharacter() {
		return character;
	}
 
	void setCharacter(int character) {
		this.character = (char)character;
	}
 
	int getFreq() {
		return freq;
	}
 
	void setFreq(int freq) {
		this.freq = freq;
	}
	
	byte[] infoToByte(){
		byte[] bt = new byte[6];
		
		byte[] b1 = ByteAnd8Types.charToByte(character);
		for(int i= 0; i < b1.length;i++){
			bt[i] = b1[i];
		}
		
		byte[] b2 = ByteAnd8Types.intToBytes2(freq);
		int index = 2;
		for(int i= 0; i < b2.length;i++){
			bt[index++] = b2[i];
		}
		
		return bt;
	}
 
	@Override
	public String toString() {
		return "Huffman [character=" + character + ", freq=" + freq + "]";
	}
}

2.用統(tǒng)計(jì)結(jié)果構(gòu)建哈夫曼樹:

//treeSize為總節(jié)點(diǎn)數(shù)
private void creatTree(int treeSize){
		int temp;
		treeList = new ArrayList<HuffTreeNode>();
		for(int i = 0; i < treeSize; i++){
			HuffTreeNode node = new HuffTreeNode();
			treeList.add(i, node);
		}
		
		for(int i = 0; i < charCount; i++){
			HuffTreeNode node = treeList.get(i);
			node.freq.freq = charList.get(i).getFreq();
			node.freq.character = charList.get(i).getCharacter();
			node.left = -1;
			node.right = -1;
			node.use = 0;
		}
		
		for(int i = charCount; i < treeSize; i++){
			int index = i;
			HuffTreeNode node = treeList.get(i);
			node.use = 0;
			node.freq.character = '#';
			node.right = searchmin(index);
			node.left = searchmin(index);
			node.freq.freq = treeList.get(node.right).freq.freq + treeList.get(node.left).freq.freq;
			temp = searchmin(++index);
			if(temp == -1){
				break;
			}
			treeList.get(temp).use = 0;
		}
	}
	
	private int searchmin(int count){
		int minindex = -1;
		
		for(int i = 0; i < count; i++){
			if(treeList.get(i).use == 0){
				minindex = i;
				break;
			}
		}
		if(minindex == -1){
			return -1;
		}
		for(int i = 0; i < count; i++){
			if((treeList.get(i).freq.freq <= treeList.get(minindex).freq.freq) && treeList.get(i).use == 0){
				minindex = i;
			}
		}
		treeList.get(minindex).use = 1;
		return minindex;
	}

3.用哈夫曼樹生成哈夫曼編碼(從根結(jié)點(diǎn)開始,路徑左邊記為0,右邊記為1):

  private void bulidhuftreecode(int root, String str){
		if(treeList.get(root).getLeft() != -1 && treeList.get(root).getRight() != -1){
			bulidhuftreecode(treeList.get(root).getLeft(), str+"0");
			bulidhuftreecode(treeList.get(root).getRight(), str + "1");
		}
		else{
			treeList.get(root).code = str;
		}	
	}

java實(shí)現(xiàn)哈夫曼壓縮與解壓縮的方法

4.哈夫曼編碼代替字符,進(jìn)行壓縮,壓縮前首先要將文件頭(文件標(biāo)志,字符數(shù)量,最后一個(gè)字節(jié)有效位,密碼)字符和其頻率的那張表格寫入文件,以便于解壓縮

public void creatCodeFile(String path) throws Exception{
		byte value = 0;
		int index = 0;
		int arr[] = new int[256];
		int intchar;
		
		for(int i = 0; i < charCount; i++){
			arr[treeList.get(i).freq.character] = i;
			
		}
		File file = new File(path);
    if(!file.exists()){
       if(!file.createNewFile()){
      	 throw new Exception("創(chuàng)建文件失敗");
       }
    }
		int count = charList.size();
		HuffmanHead head = new HuffmanHead(count, howlongchar(count), password);
        //將文件頭信息寫入文件
		this.write = new RandomAccessFile(file, "rw");
		write.write(head.InfoToByte());
        //將字符及其頻率的表寫入文件
		for(HuffmanFreq freq : charList){
			byte[] bt = freq.infoToByte();
			write.write(bt);
		}
		//將字符用哈夫曼編碼進(jìn)行壓縮,這里讀寫都是采用緩存機(jī)制
		byte[] readBuffer = new byte[BUFFER_SIZE];
		while((intchar = read.read(readBuffer))!= -1){
			ProgressBar.SetCurrent(read.getFilePointer());
			for(int i = 0; i < intchar;i++){
				int temp = readBuffer[i]& 0xff; 
				String code = treeList.get(arr[temp]).code;
				char[] chars = code.toCharArray();
				
				for(int j = 0; j < chars.length; j++){
					if(chars[j] == '0'){
						value = CLR_BYTE(value, index);
					}
					if(chars[j] == '1'){
						value = SET_BYTE(value, index);
					}
					if(++index >= 8){
						index = 0;
						writeInBuffer(value);
					}
				}
			}
		}
		//此方法速度較慢
//		while((intchar = is.read()) != -1){
//			String code = treeList.get(arr[intchar]).code;
//			char[] chars = code.toCharArray();
//			
//			for(int i = 0; i < chars.length; i++){
//				if(chars[i] == '0'){
//					value = CLR_BYTE(value, index);
//				}
//				if(chars[i] == '1'){
//					value = SET_BYTE(value, index);
//				}
//				if(++index >= 8){
//					index = 0;
//					oos.write(value);
//				}
//			}
//		}
		if(index != 0){
			writeInBuffer(value);
		}
	  byte[] Data = Arrays.copyOfRange(writeBuffer, 0, writeBufferSize);
	  write.write(Data);
	  write.close();
		read.close();
	}
    //指定位,置1
    byte SET_BYTE(byte value, int index){
		return (value) |= (1 << ((index) ^ 7));
	}	
    //指定位,置0
	byte CLR_BYTE(byte value, int index){ 
		return (value) &= (~(1 << ((index) ^ 7)));
	}
    //判斷指定位是否為0,0為false,1為true
	boolean GET_BYTE(byte value, int index){ 
		return ((value) & (1 << ((index) ^ 7))) != 0;
	}

如果一個(gè)字節(jié)一個(gè)字節(jié)往文件里寫,速度會(huì)極慢,為了提高效率,寫也采用緩存,先寫到緩存區(qū),緩存區(qū)滿了后寫入文件,

    private void writeInBuffer(byte value) throws Exception {
		if(writeBufferSize < BUFFER_SIZE){
			writeBuffer[writeBufferSize] = value;
			if(++writeBufferSize >= BUFFER_SIZE){
				write.write(writeBuffer);
				writeBufferSize = 0;
			}
		} else{
			throw new Exception("寫入文件出錯(cuò)");
		}
	}

到這里壓縮就完成了,以下為解壓縮方法

1.從寫入文件中的字符統(tǒng)計(jì)的表讀出放入list里

public void init() throws Exception{
		char isHUf = read.readChar();
        //驗(yàn)證文件頭信息
		if(isHUf != '哈'){
			throw new Exception("該文件不是HUFFMAN壓縮文件");
		}
		this.charCount = read.readChar();
		this.treeSize = 2*charCount -1;
		this.lastIndex = read.readChar();
		int password = read.readInt();
		if(password != this.password.hashCode()){
			System.out.println("密碼錯(cuò)誤");
		} else{
			System.out.println("密碼正確,正在解壓");
		}
		
        //從文件中將字符統(tǒng)計(jì)的表讀出
		byte[] buffer = new byte[charCount * 6];
		read.seek(10);
		read.read(buffer, 0, charCount * 6);
		ProgressBar.SetCurrent(read.getFilePointer());
		for(int i = 0; i < buffer.length; i+=6){
			byte[] buff = Arrays.copyOfRange(buffer, i, i+2);
			ByteBuffer bb = ByteBuffer.allocate (buff.length);
		  bb.put (buff);
		  bb.flip ();
		  CharBuffer cb = cs.decode (bb);
		  byte[] buff1 = Arrays.copyOfRange(buffer, i+2, i+6);
		  int size = ByteAnd8Types.bytesToInt2(buff1, 0);
		  HuffmanFreq freq = new HuffmanFreq(cb.array()[0], size);
		  charList.add(freq);
		}
	}

2.用統(tǒng)計(jì)結(jié)果構(gòu)建哈夫曼樹(和以上代碼一樣)

3.用哈夫曼樹生成哈夫曼編碼(從根結(jié)點(diǎn)開始,路徑左邊記為0,右邊記為1)(和以上代碼一樣)

4.遍歷文件每個(gè)字節(jié),根據(jù)哈夫曼編碼找到對應(yīng)的字符,將字符寫入新文件

 public void creatsourcefile(String pathname) throws Exception{
		int root = treeList.size() - 1;
		int fininsh = 1;
		long len;
		File file = new File(pathname);
		if(!file.exists()){
			 if(!file.createNewFile()){
				 throw new Exception("創(chuàng)建文件失敗");
	     }
		}
		write = new RandomAccessFile(file, "rw");
		
		int intchar;
		byte[] bytes = new byte[1<<18];
		int index = 0;
		while((intchar = read.read(bytes))!= -1){
			len = read.getFilePointer();
			ProgressBar.SetCurrent(len);
			for(int i = 0; i < intchar;i++){
				for(;index < 8 && fininsh != 0;){
					if(GET_BYTE(bytes[i], index)){
						root = treeList.get(root).right;
					} else{
						root = treeList.get(root).left;
					}
					if(treeList.get(root).right== -1 && treeList.get(root).left == -1){
						byte temp = (byte)treeList.get(root).freq.character;
						writeInBuffer(temp);
						root = treeList.size() - 1;
					}
					index++;
					if(len == this.goalfilelenth && i == intchar-1){
						if(index >= this.lastIndex){
							fininsh = 0;
						}
					}
				}
				index = 0;
			}
		}
		byte[] Data = Arrays.copyOfRange(writeBuffer, 0, writeBufferSize);
		write.write(Data);
		write.close();
		write.close();
		read.close();
	}

四運(yùn)行展示:

java實(shí)現(xiàn)哈夫曼壓縮與解壓縮的方法

java實(shí)現(xiàn)哈夫曼壓縮與解壓縮的方法

以上就是本文的全部內(nèi)容,希望對大家的學(xué)習(xí)有所幫助,也希望大家多多支持億速云。

向AI問一下細(xì)節(jié)

免責(zé)聲明:本站發(fā)布的內(nèi)容(圖片、視頻和文字)以原創(chuàng)、轉(zhuǎn)載和分享為主,文章觀點(diǎn)不代表本網(wǎng)站立場,如果涉及侵權(quán)請聯(lián)系站長郵箱:is@yisu.com進(jìn)行舉報(bào),并提供相關(guān)證據(jù),一經(jīng)查實(shí),將立刻刪除涉嫌侵權(quán)內(nèi)容。

AI