File Compression for Modern Computing
Compression Dictionaries
A dictionary is a file that stores patterns common to a set of small, similar files, so that zstd can compress each small file more effectively. A dictionary is assembled from a group of typical small files that contain similar information, preferably over 100 files. For greatest efficiency, their combined size should be about one hundred times the size of the dictionary produced from them. If the files used are fewer or smaller than recommended, zstd will display a warning but still allow the dictionary to be created (Figure 2).
To create a dictionary, use the command:
zstd --train FILES
The dictionary will be saved with the default name dictionary and a default maximum size of 112,640 bytes (110KB). To give the dictionary its own name, add the -o option; for example, -o quick saves the dictionary as quick. You can also select and tune the training algorithm by appending parameters to a variant of the train option; for example, --train-cover=k=SEGMENT_SIZE sets the segment size used by the cover algorithm.
A specific maximum size can be set with the option --maxdict=SIZE, and a specific ID with --dictID=NUMBER, which the decoder uses to verify that it has the right dictionary. To use a dictionary, add the option -D FILE to the command, both when compressing and when decompressing. Nothing in the output will indicate that the dictionary is in use.
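The steps above can be strung together in a short shell session. The following is a minimal sketch, assuming zstd is installed; the directory, file, and dictionary names (samples, quick.dict, and so on) are invented for illustration, and the generated records stand in for whatever small files you actually have.

```shell
#!/bin/sh
# Sketch only: all file and directory names here are hypothetical.
set -eu
work=$(mktemp -d)
cd "$work"
mkdir samples

# Generate 300 small, similar JSON records to serve as the training set.
i=1
while [ "$i" -le 300 ]; do
  printf '{"id": %d, "name": "user%d", "group": "team%d", "quota": %d, "shell": "/bin/bash", "active": true}\n' \
    "$i" "$i" $((i % 7)) $((i * 37 % 1000)) > "samples/rec$i.json"
  i=$((i + 1))
done

# Train a dictionary, name it quick.dict, and cap it at 4096 bytes.
# The sample set is smaller than recommended, so zstd may warn about it.
zstd -q --train samples/*.json -o quick.dict --maxdict=4096

# Compress one record with the dictionary, then restore it with the same dictionary.
zstd -q -D quick.dict samples/rec1.json -o rec1.json.zst
zstd -q -d -D quick.dict rec1.json.zst -o roundtrip.json

# Confirm that the restored file matches the original.
cmp samples/rec1.json roundtrip.json && echo "round trip OK"
```

Note that decompression needs the same dictionary file that was used for compression, so the dictionary has to be kept alongside the archives that depend on it.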
In general, the smaller the file, the greater the improvement in compression. According to the man page, a dictionary typically improves the compression of a 64KB file by only about 10 percent, compared with a fivefold improvement for a file of less than 1KB.
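To check the effect on your own data, you can compare the compressed size of one small file with and without a dictionary. The sketch below assumes zstd is installed and uses a throwaway sample set with made-up names (samples, small.dict, new.json). The sizes it prints depend on the data and the zstd version, so treat them as an illustration rather than a benchmark.

```shell
#!/bin/sh
# Sketch only: all file and directory names here are hypothetical.
set -eu
work=$(mktemp -d)
cd "$work"
mkdir samples

# Build a training set of 300 small, similar JSON records.
i=1
while [ "$i" -le 300 ]; do
  printf '{"id": %d, "name": "user%d", "group": "team%d", "quota": %d, "shell": "/bin/bash", "active": true}\n' \
    "$i" "$i" $((i % 7)) $((i * 37 % 1000)) > "samples/rec$i.json"
  i=$((i + 1))
done
zstd -q --train samples/*.json -o small.dict --maxdict=4096

# A new record with the same layout that was not part of the training set.
printf '{"id": 9999, "name": "user9999", "group": "team3", "quota": 512, "shell": "/bin/bash", "active": true}\n' > new.json

# Measure the sizes in bytes; -c writes the compressed data to standard output.
original=$(($(wc -c < new.json)))
without_dict=$(($(zstd -q -c new.json | wc -c)))
with_dict=$(($(zstd -q -c -D small.dict new.json | wc -c)))

echo "original:           $original bytes"
echo "without dictionary: $without_dict bytes"
echo "with dictionary:    $with_dict bytes"
```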
Benchmarking
Using zstd to its full potential requires experimentation. To use the advanced compression options, you probably will need to research the compression algorithm. However, with the methods listed here, zstd is sure to be efficient.
But how efficient? More particularly, how does zstd compare with other compression tools? zstd provides its own answer with a small selection of benchmarking options. To start, you can use the option -bLEVEL to set the compression level to test. Alternatively, you can use -bLEVEL to indicate the start of a range of compression levels and -eLEVEL to indicate the end of the range (Figure 3). You can also change the minimum length of each test from the default of three seconds with -iSECONDS. Of course, you can also make notes as you gain experience with zstd.
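As a rough illustration of these options, the following sketch benchmarks compression levels 1 through 3 on a generated text file, with the minimum test time shortened to one second. It assumes zstd is installed; the file name data.txt and its contents are invented for the example. The results vary by machine and zstd version, so no sample output is shown.

```shell
#!/bin/sh
# Sketch only: data.txt is a hypothetical test file.
set -eu
work=$(mktemp -d)
cd "$work"

# Generate about 2000 lines of repetitive text to benchmark against.
i=1
while [ "$i" -le 2000 ]; do
  printf 'line %d: the quick brown fox jumps over the lazy dog %d times\n' "$i" $((i * 7 % 101))
  i=$((i + 1))
done > data.txt

# -b1 starts the range at level 1, -e3 ends it at level 3,
# and -i1 sets the minimum evaluation time to 1 second instead of 3.
# The report is captured from both output streams so that it can be saved or compared.
bench_output=$(zstd -b1 -e3 -i1 data.txt 2>&1)
printf '%s\n' "$bench_output"
```

Saving the report to a file for each data set is one simple way to keep the notes mentioned above.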
zstd has been released recently enough that, in many ways, it is still an expert's tool. However, although the documentation can be spotty for the advanced features, there is still enough to make zstd an alternative tool for any level of user, especially those who want a compression tool designed for modern computing.
Infos
- zstd: https://en.wikipedia.org/wiki/Zstandard
- LZ77 algorithm: https://en.wikipedia.org/wiki/LZ77_and_LZ78