bzip2

Language: C

Compression / Archiving

Julian Seward developed bzip2 between 1996 and 1998 as an open-source compression library and command-line tool. It became popular on Unix-like systems for compressing large datasets because of its high compression efficiency and simplicity.

bzip2 is a high-quality, lossless data compression library and tool built on the Burrows-Wheeler transform and Huffman coding. It typically achieves better compression ratios than gzip, particularly on text, at the cost of slower compression and decompression.
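To give a feel for the Burrows-Wheeler step, here is a deliberately naive sketch that sorts every rotation of a short string and keeps the last column. It is an illustration only: bzip2 itself uses a far faster block-sorting scheme and adds further stages such as run-length and move-to-front coding. The names `bwt_naive` and `cmp_rotations` are made up for this example.

```c
#include <stdlib.h>
#include <string.h>

/* Text being transformed; file scope so the qsort comparator can reach it. */
static const char *bwt_text;
static size_t bwt_len;

/* Compare two rotations of bwt_text, identified by their start offsets. */
static int cmp_rotations(const void *a, const void *b) {
    size_t i = *(const size_t *)a, j = *(const size_t *)b;
    for (size_t k = 0; k < bwt_len; k++) {
        unsigned char ci = (unsigned char)bwt_text[(i + k) % bwt_len];
        unsigned char cj = (unsigned char)bwt_text[(j + k) % bwt_len];
        if (ci != cj) return ci < cj ? -1 : 1;
    }
    return 0;
}

/* Naive Burrows-Wheeler transform (hypothetical helper, not part of libbz2).
   Writes the last character of each sorted rotation of `in` to `out`, which
   must hold strlen(in) + 1 bytes. Returns the row of the original string in
   the sorted table (needed to invert the transform), or -1 on failure. */
long bwt_naive(const char *in, char *out) {
    size_t n = strlen(in);
    if (n == 0) { out[0] = '\0'; return -1; }

    size_t *rot = malloc(n * sizeof *rot);
    if (rot == NULL) return -1;
    for (size_t i = 0; i < n; i++) rot[i] = i;

    bwt_text = in;
    bwt_len = n;
    qsort(rot, n, sizeof *rot, cmp_rotations);

    long primary = -1;
    for (size_t i = 0; i < n; i++) {
        out[i] = in[(rot[i] + n - 1) % n];  /* last char of this rotation */
        if (rot[i] == 0) primary = (long)i; /* row holding the original text */
    }
    out[n] = '\0';
    free(rot);
    return primary;
}
```

Calling `bwt_naive("banana", out)` leaves `nnbaaa` in `out`. Identical characters end up next to each other, which is what makes the later coding stages effective.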

Installation

linux: sudo apt install libbz2-dev bzip2
mac: brew install bzip2
windows: Use binaries from https://sourceforge.net/projects/bzip2/

Usage

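bzip2 provides both a command-line tool for file compression and a C library (libbz2) for compressing and decompressing data programmatically. The library offers a low-level streaming interface, a high-level file interface (the `BZ2_bzRead` / `BZ2_bzWrite` family), and in-memory buffer-to-buffer functions. The command-line tool is commonly combined with other tools such as tar.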

Compressing a file using command line

bzip2 myfile.txt

Compresses `myfile.txt` to `myfile.txt.bz2`. By default the original file is removed; pass `-k` to keep it.

Decompressing a file

bunzip2 myfile.txt.bz2

Decompresses `myfile.txt.bz2` back to `myfile.txt` and removes the compressed file. `bzip2 -d myfile.txt.bz2` is equivalent.

Using bzip2 library in C

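#include <bzlib.h>
#include <stdio.h>

int main(void) {
    FILE *source = fopen("input.txt", "rb");
    FILE *dest = fopen("output.bz2", "wb");
    if (source == NULL || dest == NULL) {
        perror("fopen");
        return 1;
    }
    /* Block size 9 (900 kB), verbosity 0, work factor 30. */
    int bzerror;
    BZFILE *bz = BZ2_bzWriteOpen(&bzerror, dest, 9, 0, 30);
    if (bzerror != BZ_OK) {
        fprintf(stderr, "BZ2_bzWriteOpen failed: %d\n", bzerror);
        return 1;
    }
    char buffer[1024];
    size_t n;
    /* Stop reading as soon as a write reports an error. */
    while (bzerror == BZ_OK && (n = fread(buffer, 1, sizeof(buffer), source)) > 0) {
        BZ2_bzWrite(&bzerror, bz, buffer, (int)n);
    }
    /* After an error, close with abandon = 1 so no further output is attempted. */
    int write_error = bzerror;
    BZ2_bzWriteClose(&bzerror, bz, write_error != BZ_OK, NULL, NULL);
    fclose(source);
    fclose(dest);
    return (write_error == BZ_OK && bzerror == BZ_OK) ? 0 : 1;
}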

Compresses `input.txt` to `output.bz2` using the libbz2 high-level file interface. Link with `-lbz2`.

Decompressing using bzip2 library

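#include <bzlib.h>
#include <stdio.h>

int main(void) {
    FILE *source = fopen("output.bz2", "rb");
    FILE *dest = fopen("restored.txt", "wb");
    if (source == NULL || dest == NULL) {
        perror("fopen");
        return 1;
    }
    /* Verbosity 0, small 0 (normal memory use), no leftover bytes to pass in. */
    int bzerror;
    BZFILE *bz = BZ2_bzReadOpen(&bzerror, source, 0, 0, NULL, 0);
    if (bzerror != BZ_OK) {
        fprintf(stderr, "BZ2_bzReadOpen failed: %d\n", bzerror);
        return 1;
    }
    char buffer[1024];
    while (bzerror == BZ_OK) {
        int n = BZ2_bzRead(&bzerror, bz, buffer, (int)sizeof(buffer));
        /* The final call reports BZ_STREAM_END and may still deliver data. */
        if (bzerror == BZ_OK || bzerror == BZ_STREAM_END) {
            fwrite(buffer, 1, (size_t)n, dest);
        }
    }
    /* Anything other than BZ_STREAM_END here means the data was bad or truncated. */
    int read_error = bzerror;
    BZ2_bzReadClose(&bzerror, bz);
    fclose(source);
    fclose(dest);
    return read_error == BZ_STREAM_END ? 0 : 1;
}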

Decompresses `output.bz2` to `restored.txt` using libbz2, treating any result other than `BZ_STREAM_END` as a failure. Link with `-lbz2`.

Error Handling

BZ_CONFIG_ERROR: The library was miscompiled for the platform (for example, integer types have unexpected sizes). Rebuild or reinstall libbz2.
BZ_IO_ERROR: An error occurred reading or writing the underlying file. Verify file paths, permissions, and disk space.
BZ_MEM_ERROR: Memory allocation failed. Use a smaller block size when compressing, or set `small` to 1 in `BZ2_bzReadOpen` (`-s` on the command line) to reduce memory use when decompressing.

Best Practices

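Choose an appropriate compression level (1–9). The level sets the block size from 100 kB to 900 kB, so it mainly trades memory use against compressed size; speed differences between levels are comparatively small. The command-line default is 9.

Process large files in fixed-size chunks through the file or streaming interfaces rather than loading them into memory whole.

Always call `BZ2_bzWriteClose` (or `BZ2_bzReadClose`) before closing the underlying `FILE`. The write close flushes the final block, and skipping it leaves a truncated, unreadable file.

Use the `.bz2` extension for compressed files so that tools such as `bunzip2` can derive the original name.

Combine with tar (`.tar.bz2`) to archive multiple files, since bzip2 itself compresses a single file at a time; for example, `tar -cjf archive.tar.bz2 mydir/`.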
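To make the compression-level choice concrete, the sketch below estimates memory use per level from the approximate formulas in the bzip2 manual: about 400 kB plus 8 times the block size for compression, and about 100 kB plus 4 times the block size for decompression (2.5 times in the reduced-memory `small` mode). The function names are hypothetical and the results are rough guides, not measurements.

```c
/* Approximate compression memory in kB for level 1..9; -1 if level is invalid. */
int bz2_compress_mem_kb(int level) {
    if (level < 1 || level > 9) return -1;
    int block_kb = level * 100;          /* level N means an N * 100 kB block */
    return 400 + 8 * block_kb;
}

/* Approximate decompression memory in kB for data written at `level`.
   Pass small != 0 for the slower, reduced-memory decompression mode. */
int bz2_decompress_mem_kb(int level, int small) {
    if (level < 1 || level > 9) return -1;
    int block_kb = level * 100;
    return small ? 100 + (5 * block_kb) / 2   /* 2.5 x block size */
                 : 100 + 4 * block_kb;
}
```

For level 9 these give roughly 7600 kB to compress and 3700 kB to decompress, which is why a lower level can be worth considering on memory-constrained systems.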