What is LZ4 compression
Last updated: April 1, 2026
Key Facts
- LZ4 achieves compression speeds of 250-500 MB/s and decompression speeds of 1-2 GB/s on modern processors
- The algorithm uses dictionary-based compression similar to LZ77, maintaining a sliding window of recent data
- LZ4 typically reduces data size by about 20-40%, less than deflate achieves, but it runs much faster
- LZ4 is widely deployed in data systems such as Apache Hadoop, Cassandra, and Redis for block compression
- The LZ4 library is open source under the BSD 2-Clause license, with implementations in C, Java, Python, Go, and Rust
Overview
LZ4 is a fast, lossless data compression algorithm created by Yann Collet that prioritizes compression and decompression speed over achieving maximum compression ratios. It operates on the principle of replacing redundant data sequences with shorter references, enabling rapid processing of large data volumes in real-time applications.
Technical Mechanism
LZ4 uses dictionary-based compression by maintaining a sliding window of recently processed data and searching for matching patterns within that window. When matches are found, the algorithm replaces longer sequences with shorter tokens containing a reference distance and length. This design emphasizes finding matches quickly rather than the longest possible matches, enabling its exceptional speed.
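The matching step described above can be sketched in a few lines of Python. This is a simplified illustration, not the LZ4 reference implementation: the names `compress`, `decompress`, `MIN_MATCH`, and `WINDOW` are chosen for this example, and the output is a plain list of `(literals, offset, match_len)` tuples rather than LZ4's real byte format. It does keep the central idea of remembering only the most recent position of each 4-byte prefix and accepting the first match found instead of searching for the longest one.

```python
MIN_MATCH = 4      # shortest match worth encoding (illustrative constant)
WINDOW = 65535     # furthest distance a match may point back (fits in 16 bits)


def compress(data: bytes):
    """Greedy match finder returning a list of (literals, offset, match_len).

    The last tuple carries only trailing literals and uses offset 0, length 0.
    """
    table = {}      # 4-byte prefix -> most recent position where it was seen
    sequences = []
    anchor = 0      # start of the literal run that has not been emitted yet
    i = 0
    while i + MIN_MATCH <= len(data):
        key = data[i:i + MIN_MATCH]
        candidate = table.get(key)
        table[key] = i
        if candidate is not None and i - candidate <= WINDOW:
            # Take the first candidate and extend it while the bytes agree.
            length = MIN_MATCH
            while (i + length < len(data)
                   and data[candidate + length] == data[i + length]):
                length += 1
            sequences.append((data[anchor:i], i - candidate, length))
            i += length
            anchor = i
        else:
            i += 1
    sequences.append((data[anchor:], 0, 0))
    return sequences


def decompress(sequences) -> bytes:
    """Rebuild the input by replaying literals and back-references."""
    out = bytearray()
    for literals, offset, match_len in sequences:
        out += literals
        start = len(out) - offset
        # Copy byte by byte so a match may overlap the bytes it is producing.
        for k in range(match_len):
            out.append(out[start + k])
    return bytes(out)


if __name__ == "__main__":
    text = b"abcdabcdabcdabcdxyz"
    print(compress(text))
    print(decompress(compress(text)) == text)
```

The byte-by-byte copy in `decompress` matters: a match is allowed to run past the point where it started writing, which is how a short repeating pattern expands into a long run.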
Speed and Compression Characteristics
LZ4 achieves compression speeds of around 250-500 MB/s and decompression speeds of 1-2 GB/s, placing it among the fastest compression algorithms available. It typically reduces data size by about 20-40%, compared with 30-50% for deflate, so it gives up some compression in exchange for much higher speed. Actual figures vary with the hardware and the data being compressed. This speed-versus-ratio tradeoff makes LZ4 well suited to performance-critical applications.
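Because these figures depend on the machine and the input, it can help to measure them directly. The sketch below is one way to do that, using only the Python standard library. The `measure` helper and the `sample` data are made up for this example. It times deflate through `zlib` at two levels because `zlib` ships with Python; an LZ4 binding (for instance the third-party `lz4` package's `lz4.frame.compress` and `lz4.frame.decompress`, which this sketch does not import) could be passed in the same way.

```python
import time
import zlib


def measure(name, compress, decompress, data, repeats=3):
    """Return size reduction and throughput (MB/s) for one codec on one input."""
    best_c = best_d = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        packed = compress(data)
        t1 = time.perf_counter()
        restored = decompress(packed)
        t2 = time.perf_counter()
        best_c = min(best_c, t1 - t0)
        best_d = min(best_d, t2 - t1)
    if restored != data:
        raise ValueError(f"{name}: round trip did not restore the input")
    megabytes = len(data) / 1e6
    return {
        "codec": name,
        "ratio": len(data) / len(packed),                   # e.g. 2.0 means half size
        "saved_pct": 100 * (1 - len(packed) / len(data)),   # size reduction in percent
        "compress_mb_s": megabytes / max(best_c, 1e-9),
        "decompress_mb_s": megabytes / max(best_d, 1e-9),
    }


# Hypothetical, highly repetitive sample; real data will compress differently.
sample = b"timestamp=2026-04-01 level=INFO msg=request served\n" * 20000

if __name__ == "__main__":
    for level in (1, 9):
        result = measure(
            f"zlib-{level}",
            lambda d, lvl=level: zlib.compress(d, lvl),
            zlib.decompress,
            sample,
        )
        print(result)
```

Reporting both `ratio` and `saved_pct` avoids the common ambiguity between "compressed to 40% of the original" and "reduced by 40%".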
Primary Use Cases
LZ4 is extensively used in big data systems like Apache Hadoop for compressing data blocks, in NoSQL databases like Cassandra and Redis for reducing storage and memory overhead, and in streaming protocols where decompression speed matters more than maximum compression. Many applications use LZ4 at multiple layers: for network transmission, in-memory caching, and disk storage.
Ecosystem and Adoption
LZ4 is available under the BSD 2-Clause license and has been implemented in numerous languages including C, Java, Python, Go, Rust, JavaScript, and C#. Major projects such as Apache Kafka support LZ4 as a compression option, demonstrating wide industry adoption in systems that prioritize throughput and latency over storage efficiency.
Related Questions
What is the difference between LZ4 and gzip compression?
LZ4 prioritizes speed, compressing at 250-500 MB/s with lower ratios, while gzip uses deflate to achieve better ratios at slower speeds. Choose LZ4 for real-time systems and gzip for archival and downloads, where bandwidth matters more than latency.
How does LZ4 compare to Snappy compression?
LZ4 and Snappy are both fast compression algorithms, but LZ4 typically decompresses faster (1-2 GB/s vs 500-900 MB/s) and generally achieves better compression ratios, making it increasingly preferred in modern systems.
What are dictionary-based compression algorithms?
Dictionary-based algorithms like LZ4 and LZ77 identify repeated byte sequences and replace them with references to a sliding window dictionary. This reduces redundancy; performance depends on dictionary size, search speed, and match distance/length encoding efficiency.
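To make the "match distance/length encoding" concrete, the sketch below follows the layout described in the published LZ4 block format: each sequence starts with a one-byte token whose high four bits hold the literal count and whose low four bits hold the match length minus 4. A value of 15 in either field means extra length bytes follow, and the match distance is stored as a two-byte little-endian offset. The function names are hypothetical, the code does not enforce the format's end-of-block restrictions, and it is meant as a reading aid rather than a replacement for the reference library.

```python
def _extra_length_bytes(value: int) -> bytes:
    """Encode the part of a length that did not fit in the 4-bit token field."""
    out = bytearray()
    while value >= 255:
        out.append(255)
        value -= 255
    out.append(value)
    return bytes(out)


def encode_sequence(literals: bytes, offset: int = 0, match_len: int = 0) -> bytes:
    """Encode one sequence; leave offset at 0 for the final, literals-only one."""
    has_match = offset > 0
    if has_match and (offset > 0xFFFF or match_len < 4):
        raise ValueError("offset must fit in 16 bits and match_len must be >= 4")
    lit_len = len(literals)
    stored_match = match_len - 4 if has_match else 0
    lit_field = min(lit_len, 15)
    match_field = min(stored_match, 15)
    out = bytearray([(lit_field << 4) | match_field])     # the token byte
    if lit_field == 15:
        out += _extra_length_bytes(lit_len - 15)
    out += literals
    if has_match:
        out += offset.to_bytes(2, "little")
        if match_field == 15:
            out += _extra_length_bytes(stored_match - 15)
    return bytes(out)


def _read_extra_length(block: bytes, i: int):
    """Read extra length bytes starting at i; return (added_length, new_i)."""
    total = 0
    while True:
        b = block[i]
        i += 1
        total += b
        if b != 255:
            return total, i


def decode_block(block: bytes) -> bytes:
    """Decode a sequence of tokens back into the original bytes."""
    out = bytearray()
    i = 0
    while i < len(block):
        token = block[i]
        i += 1
        lit_len = token >> 4
        if lit_len == 15:
            extra, i = _read_extra_length(block, i)
            lit_len += extra
        out += block[i:i + lit_len]
        i += lit_len
        if i >= len(block):
            break                      # the final sequence has no match part
        offset = int.from_bytes(block[i:i + 2], "little")
        i += 2
        match_field = token & 0x0F
        match_len = match_field + 4
        if match_field == 15:
            extra, i = _read_extra_length(block, i)
            match_len += extra
        start = len(out) - offset
        for k in range(match_len):     # byte-wise copy allows overlapping matches
            out.append(out[start + k])
    return bytes(out)


if __name__ == "__main__":
    block = encode_sequence(b"abcd", offset=4, match_len=12) + encode_sequence(b"vwxyz")
    print(len(block), decode_block(block))
```

In this example, 21 bytes of input are described by 13 bytes of output: one token, four literals, and a two-byte offset stand in for the first 16 bytes.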