File compression and decompression in C using Zlib
To create a C program that uses Zlib to compress or decompress a file based on command-line arguments, you need to follow these steps:
Install Zlib for Development on Ubuntu:
Open a terminal and run the following command to install Zlib development libraries:
sudo apt-get install zlib1g-dev
Write the C Program:
The program will use Zlib functions to compress or decompress files.
It takes three arguments: the input filename, the operation ('compress' or 'decompress'), and the output filename.
Compile the Program with GCC:
- Use the -lz flag to link against Zlib.
Let's break down each step in detail.
Step 1: Install Zlib Development Libraries
Run this command in the terminal:
sudo apt-get install zlib1g-dev
This command installs the Zlib development libraries and headers necessary for compiling programs that use Zlib.
Step 2: Writing the C Program
Here is an example program that accomplishes the task:
#include <stdio.h>
#include <string.h>
#include <assert.h>
#include <zlib.h>
#define CHUNK 16384
void compressFile(FILE *source, FILE *dest) {
    int ret, flush;
    unsigned have;
    z_stream strm;
    unsigned char in[CHUNK];
    unsigned char out[CHUNK];

    // Initialize the zlib stream for compression
    strm.zalloc = Z_NULL;
    strm.zfree = Z_NULL;
    strm.opaque = Z_NULL;
    ret = deflateInit(&strm, Z_DEFAULT_COMPRESSION);
    if (ret != Z_OK) return;

    // Compress until end of file
    do {
        strm.avail_in = fread(in, 1, CHUNK, source);
        if (ferror(source)) {
            (void)deflateEnd(&strm);
            return;
        }
        flush = feof(source) ? Z_FINISH : Z_NO_FLUSH;
        strm.next_in = in;

        // Run deflate() on input until output buffer not full
        do {
            strm.avail_out = CHUNK;
            strm.next_out = out;
            ret = deflate(&strm, flush);
            assert(ret != Z_STREAM_ERROR);
            have = CHUNK - strm.avail_out;
            if (fwrite(out, 1, have, dest) != have || ferror(dest)) {
                (void)deflateEnd(&strm);
                return;
            }
        } while (strm.avail_out == 0);
        assert(strm.avail_in == 0);
    } while (flush != Z_FINISH);
    assert(ret == Z_STREAM_END);

    // Clean up
    (void)deflateEnd(&strm);
}
void decompressFile(FILE *source, FILE *dest) {
    int ret;
    unsigned have;
    z_stream strm;
    unsigned char in[CHUNK];
    unsigned char out[CHUNK];

    // Initialize the zlib stream for decompression
    strm.zalloc = Z_NULL;
    strm.zfree = Z_NULL;
    strm.opaque = Z_NULL;
    ret = inflateInit(&strm);
    if (ret != Z_OK) return;

    // Decompress until deflate stream ends or end of file
    do {
        strm.avail_in = fread(in, 1, CHUNK, source);
        if (ferror(source)) {
            (void)inflateEnd(&strm);
            return;
        }
        if (strm.avail_in == 0) break;
        strm.next_in = in;

        // Run inflate() on input until output buffer not full
        do {
            strm.avail_out = CHUNK;
            strm.next_out = out;
            ret = inflate(&strm, Z_NO_FLUSH);
            assert(ret != Z_STREAM_ERROR);
            switch (ret) {
            case Z_NEED_DICT:
            case Z_DATA_ERROR:
            case Z_MEM_ERROR:
                (void)inflateEnd(&strm);
                return;
            }
            have = CHUNK - strm.avail_out;
            if (fwrite(out, 1, have, dest) != have || ferror(dest)) {
                (void)inflateEnd(&strm);
                return;
            }
        } while (strm.avail_out == 0);
        // Done when inflate() says it's done
    } while (ret != Z_STREAM_END);

    // Clean up
    (void)inflateEnd(&strm);
}
int main(int argc, char **argv) {
    if (argc != 4) {
        printf("Usage: %s <input file> <compress|decompress> <output file>\n", argv[0]);
        return 1;
    }
    FILE *inFile = fopen(argv[1], "rb");
    FILE *outFile = fopen(argv[3], "wb");
    if (inFile == NULL || outFile == NULL) {
        fprintf(stderr, "Could not open files\n");
        // Close whichever file did open before exiting
        if (inFile != NULL) fclose(inFile);
        if (outFile != NULL) fclose(outFile);
        return 1;
    }
    int status = 0;
    if (strcmp(argv[2], "compress") == 0) {
        compressFile(inFile, outFile);
    } else if (strcmp(argv[2], "decompress") == 0) {
        decompressFile(inFile, outFile);
    } else {
        fprintf(stderr, "Invalid operation\n");
        status = 1;
    }
    // Close both files on every path, including an invalid operation
    fclose(inFile);
    fclose(outFile);
    return status;
}
This C program is designed to either compress or decompress a file using the Zlib library. It takes three command-line arguments: the input file name, the operation (either "compress" or "decompress"), and the output file name. Let's break down the code in detail:
Includes and Macro Definition
- #include statements: Include standard headers and the Zlib header.
  - <stdio.h>: Standard input/output functions.
  - <string.h>: String handling functions.
  - <assert.h>: Provides the assert macro for debugging.
  - <zlib.h>: Zlib library for compression/decompression functions.
- #define CHUNK 16384: Defines a macro for the size of the buffer used in compression/decompression. Here, it's set to 16,384 bytes.
Function: compressFile(FILE *source, FILE *dest)
This function compresses the data read from source and writes the compressed data to dest.
Local Variables:
- z_stream strm: Struct used by Zlib to maintain compression state.
- unsigned char in[CHUNK], out[CHUNK]: Buffers for input and output data.
- int ret, flush: ret holds the status returned by Zlib calls; flush tells deflate whether more input will follow.
- unsigned have: The number of compressed bytes produced by each deflate call.
Initialization:
- Initializes the z_stream and checks if deflateInit was successful.
Compression Loop:
- Reads data from source and checks for file errors.
- Sets flush based on whether the end of the file is reached.
- Compresses the data in the in buffer into the out buffer, then writes the out buffer to dest.
- Continues until all data is compressed (flush is Z_FINISH).
Function: decompressFile(FILE *source, FILE *dest)
This function decompresses data from source and writes the decompressed data to dest.
Local Variables: Similar to compressFile, but for decompression.
Initialization:
- Initializes the z_stream for decompression and checks if inflateInit was successful.
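Decompression Loop:
- Reads and checks for errors similarly.
- Decompresses data and writes to the output file.
- Handles different return statuses from inflate: Z_NEED_DICT, Z_DATA_ERROR, and Z_MEM_ERROR end decompression early, while Z_STREAM_END ends the loop normally.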
Function: main(int argc, char **argv)
This is the entry point of the program.
Argument Check:
- Checks if the program received exactly 4 arguments (including the program name).
File Operations:
- Opens the input and output files in binary mode.
- Checks for file opening errors.
Operation Selection:
- Compares the second argument to decide whether to compress or decompress.
- Calls compressFile or decompressFile accordingly.
Cleanup:
- Closes the input and output files.
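One design point worth noting: main opens the output file before it validates the operation, so a mistyped operation still truncates an existing output file. A possible refinement, sketched here with a hypothetical Operation enum and parseOperation function, is to parse the argument first and open files only when it is valid.

```c
#include <string.h>

/* Hypothetical enum naming the operations the program supports. */
typedef enum { OP_INVALID, OP_COMPRESS, OP_DECOMPRESS } Operation;

/* Hypothetical helper: translate the command-line word into an Operation.
 * Anything other than the two exact keywords is rejected. */
Operation parseOperation(const char *arg) {
    if (arg == NULL) return OP_INVALID;
    if (strcmp(arg, "compress") == 0) return OP_COMPRESS;
    if (strcmp(arg, "decompress") == 0) return OP_DECOMPRESS;
    return OP_INVALID;
}
```

In main, calling parseOperation(argv[2]) right after the argument-count check would let the program exit before any file is touched.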
Flow of the Program
- Start: The program starts in main, parsing the command-line arguments.
- File Handling: Opens the source and destination files.
- Operation Execution: Based on the user's choice, it either compresses or decompresses the file.
- Completion: Closes the files and ends the program.
Error Handling
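- The program checks for file opening errors and reports if either the source or destination files cannot be opened.
- During compression and decompression, it also checks for errors related to file reading/writing and Zlib operations. On such an error, compressFile and decompressFile release the Zlib stream and return early, but because both functions return void, main cannot tell that the operation failed and still exits with status 0.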
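One way to tighten this, assuming compressFile and decompressFile were changed to report success or failure to their caller, is to check the result of fclose as well and delete the output file when anything went wrong. The helper below is a sketch of that idea; finishOutput and its parameters are hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: close the output file and delete it if either the
 * operation or the final flush failed. Returns 0 on success, 1 otherwise.
 * `operationOk` is assumed to come from a status-returning variant of
 * compressFile/decompressFile, which the program above does not have. */
int finishOutput(FILE *out, const char *path, int operationOk) {
    int closeOk = (fclose(out) == 0);   /* fclose flushes buffered data */
    if (operationOk && closeOk) return 0;
    remove(path);                       /* do not leave a partial file behind */
    return 1;
}
```

The return value could then be used directly as the exit status of main.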
Zlib Specifics
- deflateInit and inflateInit: Initialize compression and decompression streams.
- deflate and inflate: Functions for compressing and decompressing data.
- The assert calls halt the program if Zlib reports an unexpected state (for example Z_STREAM_ERROR), which is useful for debugging. They are compiled out when NDEBUG is defined, so they are not a substitute for error handling.
Important Considerations
- The program opens files in binary mode, making it suitable for any file type (not just text).
- Error handling is basic and might need enhancement for robust applications.
- The CHUNK size is a trade-off between memory usage and efficiency.
Step 3: Compile the Program with GCC
Use this command in the terminal:
gcc -o myprogram myprogram.c -lz
Replace myprogram with your desired executable name and myprogram.c with the name of your source file. The -lz flag links your program with the Zlib library.