Some Summary

  1. Map of Groups of Compression Methods:

     +---------------------------+------------------------+--------------------------+
     |                           |      Statistical       |       Transforming       |
     |                           +------------+-----------+-------------+------------+
     |                           |   stream   |   block   |   stream    |   block    |
     +---------------------------+------------+-----------+-------------+------------+
     | for words, i.e.           | DMC,       | *pre-     | all LZ      | ST,        |
     | "Markov source" model     | all PPM    | conditio- |             | incl. BWT  |
     |                           |            | ned PPM   |             |            |
     +---------------------------+------------+-----------+-------------+------------+
     | for bytes, i.e. either    | adaptive   | static    | SEM,        | VQ, DCT,   |
     | "Bernoulli source" or     | Huffman    | Huffman   | MTF, DC     | DWT, FT,   |
     | "Analog signal" model     |            |           |             | SC, Fractal|
     +---------------------------+------------+-----------+-------------+------------+
     | for bytes or bits         | adaptive   | static    | RLE, LPC,   | PBS,       |
     |                           | Arithmetic | Arithmetic| incl. Delta | ENUC       |
     +---------------------------+------------+-----------+-------------+------------+
    *Block-based PPM is a practically unexplored field: no(?) algorithms other than the pre-conditioned PPM of C. Bloom.

    All definitions are given in the "A Practical Introduction to Data Compression" article. A BLOCK is a finite piece of digital information; a STREAM is a portion with unknown borders: data arrives byte after byte, not block after block. An N-bit BYTE is a sequence of N bits; a WORD is a finite sequence of bytes. Abbreviations are explained at the bottom of this page.

    Every group (branch, family) contains many methods. I think any one-step compression method belongs to one of these families. Some descriptions are given in the "Compression of Multimedia Information" article.

    All stream methods can be applied to blocks, but the inverse is not true: block methods cannot be applied to streams, since they cannot start working before the length of the buffer holding the data is known. Not all methods for N-bit bytes can be applied to bits (1-bit bytes). It is not good to apply methods designed for bytes to words or bits, or methods designed for words to bytes or bits (BWT output, for example).

    Although this Map is very useful, I have not seen any analogues of it before. I'm very interested in your comments! Please send them!
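    The stream/block distinction above can be illustrated with a minimal sketch (not from the article): MTF, a transforming stream method from the Map, processes each byte as soon as it arrives, while BWT, a block method, cannot produce any output before the whole buffer is available.

    ```python
    def mtf_encode(data: bytes):
        """Move To Front: a stream method, encodes one byte at a time."""
        table = list(range(256))
        out = []
        for b in data:              # each byte is handled as soon as it arrives
            i = table.index(b)
            out.append(i)
            table.pop(i)
            table.insert(0, b)      # recently seen bytes move to the front
        return out

    def bwt_encode(block: bytes):
        """Burrows-Wheeler Transform: a block method, needs the whole buffer."""
        n = len(block)
        # Sorting all rotations is impossible before the block's end is known.
        rotations = sorted(range(n), key=lambda i: block[i:] + block[:i])
        last_column = bytes(block[(i - 1) % n] for i in rotations)
        return last_column, rotations.index(0)

    print(mtf_encode(b"aaab"))      # -> [97, 0, 0, 98]
    print(bwt_encode(b"banana"))    # -> (b'nnbaaa', 3)
    ```

    Note that `mtf_encode` could be fed an endless stream with no change to its logic, whereas `bwt_encode` must know `len(block)` before doing anything; this is exactly why block methods cannot be applied to streams.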
  2. This definition can probably be used: non-physical information is information that can be obtained at any point of space-time as a result of research on the properties of space-time.

    Today (4th of June 2001) a web search returns about 50 results for "non-physical information", but no definition of it was found. Why is it important? See answer 12 of "A Practical Introduction": only non-physical information is always accessible; it does not depend on material objects. Its size is infinite. And when we study how it can be applied (wavelets and fractals, for example), we actually study how we depend on it.

    If we only assume that all references used in a (compressing) description will be accessible, but do not know it exactly, we make "potentially lossy" compression. A short example of this unwanted effect: take the file (153K) and try to decompress it with the latest version of INFO-ZIP. You get a nice message:

        skipping: OBJECTS.DAT  `shrink' method not supported

    The UNZIP provided by Simtel.Net to unpack all .ZIPs (49K) gives the same result: it is UnZip 5.40 of 21 November 1998, by Info-ZIP. Can you immediately say what version of what program can unpack this file, and where it can be downloaded from? Do you still think that the Definition and "potentially lossy" compression are theoretical or even philosophical items?

    I'm very interested in your comments! Please send them!

Abbreviations:

  DMC   Dynamic Markov Coding
  PPM   Prediction by Partial Match
  LZ    Lempel-Ziv methods, incl. LZ77 (zip, rar etc.), LZ78, LZW (gif, v.42bis)
  ST    Sort Transform, including BWT
  BWT   Burrows-Wheeler Transform
  SEM   Separate Exponents and Mantissas
  MTF   Move To Front
  DC    Distance Coding
  FT    Fourier Transform, including DCT
  DCT   Discrete Cosine Transform (jpeg, mp3)
  DWT   Discrete Wavelet Transform (jpeg-2000)
  SC    Subband Coding
  VQ    Vector Quantization
  RLE   Run Length Encoding
  LPC   Linear Prediction Coding, including delta coding, ADPCM, CELP and MELP
  PBS   Parallel Blocks Sorting
  ENUC  Enumerative Coding