                                The Art
                              of Lossless
                           Data Compression
                                vol. 19t

Here are the results of tests performed in September 2000 to compare
lossless compression of English texts by all programs known to us that are
good enough at this task, including RK, DC, YBS, Bzip2, IMP, RAR and 7-zip.

See Archive Comparison Test by J.Gilchrist for more details:  http://act.by.net

If you want to start or continue such tests,
or can suggest other sets of texts or other compression programs
 (executable programs only, not sources or algorithm descriptions),
or know of something important that we have missed
 (some fantastic new technology, an algorithm, or even a program capable
 of lossless compression of up to 1000:1, etc.),
please let us know immediately: artest@hotmail.ru   Thank you!


[[1]] COMPRESSION QUALITY
=========================
             (see also
             [[2]] Speed
             [[3]] Details
             [[4]] Comments)

In each table the fifth line shows results for the sum of the four Canterbury
Corpus Large Set files, and the tenth line for the sum of all 556 files in the
five sets. Sizes are relative to the best result in each row, shown as 100%.


Original ACE32    BEE     BIX     BOA     BA    BZip2     DC      ERI     IMP
length -m5-d4096 -m3-d3 -m1 -mdg  -m15 -k50 -m  -k -9 -b16300-mt5 (none) -2-s4

581.79% 138.67  108.95  129.00  106.46  109.61  121.55  104.85  112.32  119.84
411.40% 112.54  105.04  105.48  100.56  103.86  110.95  101.39  106.17  109.09
582.55% 139.98  106.19  130.78  106.37  106.98  120.52  102.53  109.57  118.23
657.05% 139.67  112.21  137.08  112.45  110.49  130.05  110.92  112.48  128.20
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
523.75% 128.40  106.29  120.77  104.15  106.01  117.43  102.85  108.43  115.51

485.12% 134.76  105.29  129.30  104.67  106.57  116.69  101.84  110.39  115.42
395.58% 130.60  104.45  124.51  102.76  105.56  113.01  100.95  109.19  112.70
432.57% 134.01  104.07  128.51  103.36  106.45  115.88  101.71  110.58  115.55
723.25% 147.93  112.09  143.07  110.68  118.26  135.44  109.89  118.12  143.21
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
448.75% 133.44  104.25  127.84  103.27  106.50  116.14  101.61  110.15  116.28


ArHanGel PPMonstr  SBC    RAR     RK     SZip     777   7-zip     YBS     ZZip
-2-mm-mt -o8-m58  -b19 -m5-mm-mde -mx3 -o10-b41 -m5-mu32  -mx   -m16mu -b20-mx

115.91  103.48  111.74  138.73  *100%   111.26  114.79  159.77  105.39  109.54
 100%   102.55  101.83  112.46  102.13  103.83  100.50  111.08  102.00  103.38
115.28  101.98  109.04  141.03  *100%   111.22  112.14  161.22  102.81  106.94
139.25  104.59  112.95  141.29   100%   115.21  127.33  184.90  109.73  110.23
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
111.81  102.04  106.61  128.87   100%   108.11  109.42  144.02  103.13  105.77

113.92  100.61  110.78  134.99  *100%   110.86  112.33  152.34  104.56  107.07
107.58   100%   109.23  134.61  100.57  109.27  107.97  142.02  103.44  106.12
110.45  ^100%   110.75  135.33  100.69  109.62  109.09  147.50  104.24  107.05
137.70  105.95  117.14  153.76   100%   117.12  116.00  178.32  115.11  118.63
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
111.38   100%   110.16  135.48  100.01  109.47  108.91  147.89  104.24  107.04

* RK -mx2  (not  -mx3 )
^ PPMonstr -o9 -m56 (not -o8 -m56)


[[2]] Speed
===========
Canterbury Corpus Large Set http://corpus.canterbury.ac.nz/ftp/large.zip
was used for this test, run on an AMD K6-400 machine with 64M RAM and Windows98.

 Programs,options        Overall      Average      Compress Extract  Compressed
                          score,       Users'        time,   time,     size,
                                       score,       seconds seconds    bytes
                      seconds  %    seconds  %
777       a -m5 -mu32  1354   156%   1171   140%      203     222     3343996
777       a -mg -s     1880   217%   1262   150%      688     139     3793939
7zip      a            1307   151%   1232   147%       83       4     4393623
7zip      a -mx        1358   156%   1240   148%      131       4     4401160
acb       B            2540   293%   1818   217%      803     808     3346915
acb       b            2997   346%   2059   246%     1042    1047     3267480
acb       u            3802   439%   2496   298%     1452    1456     3221349
ace32  a               1265   146%   1132   135%      148       7     3998222
ace32  a -d4096        1265   146%   1123   134%      158       7     3962314
ace32  a -d4096 -s-    1265   146%   1123   134%      159       7     3962374
ace32  a -d4096 -m1    1221   141%   1150   137%       80       7     4086782
ace32  a -d4096 -m5    1552   179%   1142   136%      456       7     3923686
arhangel  a -2  -mm    1203   139%   1117   133%       96      94     3647060
arhangel  a -mt        1173   135%   1069   127%      115     109     3417110
arhangel  a -mtf       1177   136%   1071   128%      118     110     3418181
ba       -k            1057   122%    988   118%       78      26     3432541
ba       -k -1         1170   135%   1122   134%       54      26     3927264
ba       -k -50        1046   120%    954   114%      103      17     3337823
bee      a -m1         1297   149%   1143   136%      171     178     3414048
bee      a -m2         1371   158%   1177   140%      215     222     3361009
bee      a -m3         1615   186%   1303   155%      347     353     3295506
bee      a -m1 -d3     1247   144%   1114   133%      148     168     3353767
bee      a -m2 -d3     1312   151%   1143   136%      188     210     3289365
bee      a -m3 -d3     1534   177%   1268   151%      296     336     3248025
bee      a -m3 -s      1846   213%   1430   171%      463     466     3303624
bee      a -d3 -s      1363   157%   1176   140%      209     216     3378513
bix    a               1243   143%   1063   127%      201       3     3743319
bix    a -mdg          1245   143%   1051   125%      215       4     3690815
bix    a -m9           1246   144%   1064   127%      202       4     3743319
bix    a -mdg -m9      1249   144%   1052   125%      219       5     3690815
bix    a -mdg -s       1274   147%   1054   126%      244       5     3690984
boa    -m1             1623   187%   1387   165%      263     281     3886856
boa    -a              1560   180%   1266   151%      327     340     3217347
boa    -m15            1588   183%   1277   152%      346     358     3182732
bzip2    -k -1         1201   138%   1159   138%       47      13     4109767
bzip2    -k -5         1089   125%   1046   125%       48      14     3697142
bzip2    -k -9         1070   123%   1023   122%       53      15     3611558
dc    e                 948   109%    917   109%       35      19     3218290
dc    e  -ft            954   110%    921   110%       37      20     3232273
dc    e -b16300        1024   118%    872   104%      170      69     2826931
dc    e -b16300 -mt5    995   115%    869   103%      141      70     2826931
dc    e -b12000         867   100%    836   100%       35      18     2931168
dc    e -b12000 -mt5    865   100%    836   100%       33      18     2931168
eri    a -m1           1110   128%    982   117%      143      29     3378440
eri    a -m2           1108   128%    975   116%      148      30     3346586
eri    a -m3           1114   128%    970   116%      160      32     3318853
eri    a               1127   130%    971   116%      175      33     3313568
eri    a -m5           1162   134%    975   116%      208      33     3313559
imp98     a -2         1043   120%   1002   119%       46      11     3547964
imp98     a -2  -s4    1040   120%    998   119%       48      11     3535351
imp_d     a -2  -s4    1041   120%   1001   119%       45      11     3548156
pkzip  -es             1659   191%   1655   197%        5       3     5945608
pkzip  -a              1326   153%   1307   156%       22       2     4691477
pkzip  -exx            1498   173%   1303   155%      217       2     4605928
ppmd     e -o5          953   110%    934   111%       21      22     3276542
ppmd     e -o7          967   111%    941   112%       29      32     3260462
ppmd     e -o9         1027   118%    990   118%       42      45     3387445
ppmd     e -o5 -m56     948   109%    931   111%       20      22     3266132
ppmd     e -o6 -m56     927   107%    906   108%       24      26     3159004
ppmd     e -o7 -m56     914   105%    890   106%       27      29     3090636
ppmd     e -o8 -m56     917   106%    885   105%       36      36     3045769
ppmd     e -o9 -m56     956   110%    919   109%       42      42     3142087
ppmonstr e -o5         1025   118%    975   116%       57      59     3276610
ppmonstr e -o7         1038   120%    975   116%       70      75     3214871
ppmonstr e -o9         1106   127%   1022   122%       93      98     3293262
ppmonstr e -o5 -m56    1018   117%    971   116%       53      58     3267452
ppmonstr e -o7 -m56     983   113%    924   110%       65      69     3055431
ppmonstr e -o9 -m56    1048   121%    955   114%      104      96     3051781
rar       a            1226   141%   1134   135%      103       4     4029077
rar       a -m1        1247   144%   1205   144%       48       4     4304853
rar       a -s  -m5    1560   180%   1144   136%      463       4     3937052
rk    -mf1             1134   131%   1096   131%       43      29     3826096
rk    -mf2             1228   141%   1109   132%      133      81     3652520
rk    -mf3             1347   155%   1121   134%      252      83     3645264
rk    -mx1             1615   186%   1249   149%      407     352     3083632
rk    -mx2             1735   200%   1320   157%      461     418     3080372
rk    -mx2 -ft+ -fe+   1737   200%   1321   158%      463     419     3080372
rk    -mx3             1768   204%   1336   159%      480     437     3064076
rk    -mx3 -ft+ -fe+   1765   204%   1334   159%      479     435     3064076
sbc      c             1058   122%    993   118%       73      24     3459990
sbc      c -b9         1052   121%    967   115%       95      26     3352214
sbc      c -b19        1103   127%    958   114%      162      43     3233894
sbc      c -b19 -e     1033   119%    941   112%      103      26     3257878
szip    -v0 -b41       1019   117%    984   117%       39      34     3405120
szip    -o8 -b41       1021   118%    974   116%       53      36     3356744
szip    -o0 -b41       1055   121%    959   114%      107      24     3326271
ufa     a   -m5 -mu32  1378   159%   1185   141%      216     234     3343996
ufa     a   -m5 -mu10  1312   151%   1154   138%      177     195     3387619
ufa     a   -mg -s     1630   188%   1161   138%      522      28     3889878
uharc   a              1381   159%   1183   141%      220      27     4081072
uharc   a -m1          1354   156%   1244   148%      122      29     4333271
uharc   a -m3          1514   175%   1125   134%      432      26     3801399
ybs_d    -y             986   113%    932   111%       61      19     3265494
ybs_d   -m2mu           986   113%    932   111%       61      19     3265494
ybs_d   -m16mu          988   114%    925   110%       71      19     3236677
ybs_d   -m16mu -r       992   114%    930   111%       70      18     3257713
zzip     a             1033   119%    975   116%       65      25     3396007
zzip     a -mm         1615   186%   1555   186%       68      29     5468735
zzip     a -mm -b20    1436   166%   1364   163%       81      28     4780656
zzip     a -mm -mx     1030   119%    971   116%       66      26     3376260

The Overall score is the sum of the compression time, the extraction time, and
the time it would take to transfer the compressed file over a 28,800bps link:
(compressed_size)/3600 , because 28800 bits_per_second is 3600 bytes_per_second

The Average Users' score is the sum of (compress_time/10), extract_time, and
the same transfer time over a 28,800bps link.
Compression time is divided by 10 here because more than 90% of people will
never compress anything themselves (with compression programs), yet they
use compressed data almost _every_ time they use a computer and/or the Internet.
That is why compression time matters much less to them.
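
The two scores described above can be sketched in a few lines of Python.
This is only an illustration of the stated formulas; the function names are
ours, and the sample figures are taken from the "dc e -b12000" row of the
table. The times in the table appear to be rounded, so for other rows the
result may differ from the printed score by a second or so.

```python
# 28,800 bits per second is 3600 bytes per second.
LINK_BYTES_PER_SECOND = 28800 / 8


def transfer_seconds(compressed_size):
    """Time to send the compressed file over a 28,800bps link."""
    return compressed_size / LINK_BYTES_PER_SECOND


def overall_score(compress_s, extract_s, compressed_size):
    """Compression time + extraction time + transfer time, in seconds."""
    return compress_s + extract_s + transfer_seconds(compressed_size)


def average_users_score(compress_s, extract_s, compressed_size):
    """Same as the overall score, but compression time counts for 1/10."""
    return compress_s / 10 + extract_s + transfer_seconds(compressed_size)


# Row "dc e -b12000": 35 s to compress, 18 s to extract, 2931168 bytes.
print(round(overall_score(35, 18, 2931168)))        # 867
print(round(average_users_score(35, 18, 2931168)))  # 836
```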


[[3]] Details
=============
are no longer included in this main text
(738 lines reporting 22796 results on 556 files in 5 sets),
but can be found in the FULL version, together with TEXTS.DAT and *.BAT,
at http://geocities.com/SiliconValley/Bay/1995/artest19.zip
or http://artest1.tripod.com/artest19.zip


[[4]] Comments
==============
Links to download programs:
~~~~~~~~~~~~~~~~~~~~~~~~~~~
7-Zip  2.11   :W http://www.7-zip.com/dl/7zip211.exe                              493K
BIX 1.00b7    :W http://www.7-zip.com/dl/ufa/bix100b7.zip                          89K
777 0.04b1    :W http://www.7-zip.com/dl/ufa/777004b1.zip                          72K
UFA 0.04b1    :W http://www.7-zip.com/dl/ufa/ufa004b1.zip                          64K
ArHanGeL 1.40 :a http://geocities.com/SiliconValley/Lab/6606/arh140.zip            50K
ERI32  4.8fre :e http://geocities.com/eri32/eri48fre.zip                           91K
Imp     1.1   :e http://www.winimp.com/imp110d.zip                                266K
Imp-win 1.12  :W http://www.winimp.com/imp112.exe                                 122K
PkZip   2.50  :a ftp://ftp.simtel.net/pub/simtelnet/msdos/arcers/pk250dos.exe     202K
RK     1.03b1 :e http://malcolmt.tripod.com/downloads/rk103a1d.exe                478K
RK     1.03b1 :W http://malcolmt.tripod.com/downloads/rk103a1w.exe                380K
RAR32  2.71   :e ftp://ftp.netlab.sk/public/rarsoft/rar/rarx271.exe               257K
WinRAR 2.71   :W ftp://ftp.netlab.sk/public/rarsoft/rar/wrar271.exe               588K
PPMD var.F,
PPmonstr v.F  :W ftp://ftp.simtel.net/pub/simtelnet/win95/compress/ppmdf.zip       97K
ACB 2.00c     :e ftp://ftp.simtel.net/pub/simtelnet/msdos/compress/acb_200c.zip    42K
BOA 0.58b     :e ftp://ftp.cdrom.com/.3/sac/pack/boa058.zip                        74K
DC 0.98b      :W ftp://ftp.cdrom.com/.3/sac/pack/dc124.zip                         55K
BA 1.00 beta  :e ftp://ftp.cdrom.com/.3/sac/pack/ba100b.zip                        60K
Bzip2 1.0.1   :W ftp://sourceware.cygnus.com/pub/bzip2/v100/bzip2-100-x86-win32.exe 68K
SZip 1.12a    :W http://www.compressconsult.com/szip/szip_112a_win32.zip           71K
UHArc 0.2b    :e ftp://ftp.cdrom.com/.3/sac/pack/uharc02.zip                      101K
ZZip 0.35g    :W http://www.via.ecp.fr/~damien/zzip/zzip-win32.zip                 23K
ACE32 2.0b3   :W ftp://ftp.forlangs.net/pub/windows/winace/ace20b3.exe            573K
YBS 0.03e     :e http://members.nbci.com/vycct/ybs003ed.zip                        55K
YBS 0.03e     :W http://members.nbci.com/vycct/ybs003ew.zip                        43K
SBC 0.305b    :e http://geocities.com/sbcarchiver/sbc0305b.zip                    158K
BEE 0.4.8     :  mailto:Andrew.Filinsky@p11.f4.n452.z2.fidonet.org

:a - any DOS  - DOS programs, will run under pure DOS or in a DOS box
:e - extender - DOS programs using DOS extenders like DOS/4GW or CWSDPMI
:W - windoze  - Windows95/98/NT/etc programs

If a direct link doesn't work, most probably a newer version of the program has
appeared at the same site: visit the web page, or list the whole directory on
the ftp server (i.e. try the same URL, but without the filename).
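
For anyone checking many links at once, dropping the filename from a URL can
be sketched as follows. The helper name directory_url is ours, and the code
only rewrites the URL text; it does not contact any server.

```python
from urllib.parse import urlsplit, urlunsplit


def directory_url(url):
    """Return the same URL without the trailing filename."""
    parts = urlsplit(url)
    # Keep everything up to the last "/" of the path.
    directory = parts.path.rsplit("/", 1)[0] + "/"
    return urlunsplit((parts.scheme, parts.netloc, directory, "", ""))


print(directory_url("ftp://ftp.cdrom.com/.3/sac/pack/boa058.zip"))
# ftp://ftp.cdrom.com/.3/sac/pack/
```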


Homepages:
~~~~~~~~~~
Arhangel     : http://geocities.com/SiliconValley/Lab/6606
Eri32        : http://geocities.com/eri32
      mirror : http://artest1.tripod.com
RK           : http://malcolmt.tripod.com
Imp,WinImp   : http://www.technelysium.com.au
      mirror : http://www.winimp.com
ACE32        : http://www.winace.com
PkZip        : http://www.pkware.com
RAR,WinRAR   : http://www.rarsoft.com
BZip2        : http://sources.redhat.com/bzip2
SZip         : http://www.compressconsult.com/szip
ZZip         : http://www.via.ecp.fr/~damien/zzip
YBS          : http://members.nbci.com/vycct
SBC          : http://geocities.com/sbcarchiver
Ufa,777,
    BIX,7-Zip: http://www.7-zip.com
PPMD, PPMonstr, ACB, BA, Bee, BOA, DC, UHArc - no homepage.


What's new:
~~~~~~~~~~~
7 new programs were tested:
PPMD var.Gpre Sep29, PPMonstr var.Gpre Oct4, YBS 0.03e (DOS and Win32 versions),
ZZip 0.35f, SBC 0.304b, ERI32 4.8fre.
Newer versions of ZZip, SBC, ACE, UFA are ready, and will be tested next time.

Latest beta versions of BEE, DC, PPMonstr, UFA are available
from authors by e-mail request:
BEE: Andrew.Filinsky@p11.f4.n452.z2.fidonet.org
DC: EdgarBinder@t-online.de
PPMonstr: shkarin@arstel.ru , dmitry.shkarin@mtu-net.ru
UFA: support@7-zip.com

 ACB, UHArc and PKzip are no longer tested on all 556 text files;
 their results can be found in previous versions:
 ACB   - ARTest17
 UHArc - ARTest17
 PKzip - ARTest17,18

 Results of PPMD (an open source version of PPMonstr)
 are in the full version only (TEXTS.DAT file).
 UFA 0.04b1 performs on text files exactly as 777 0.04b1 does.

Results of old programs (not updated for more than 3 years, and with no
homepage) and of programs with a low overall score will not be included in the
latest versions of ARTest. Neither will results of programs that have had known
bugs (in compression/decompression functions) for more than half a year.


WARNINGS:
~~~~~~~~~
BA 1.00beta can't decompress any file compressed with -mf , and gives no
warning such as "CRC fails".

DC 0.99.158b failed to decompress 1DFRE10.dc , ANDES10.dc , and BTI0110.dc ,
saying "Corrupted block" (while the t(est) command reports "Test successful").

RK 1.03b1 was unable to correctly decompress 555 files (all except E.TXT)
compressed with "-mx3 -ft-" , reporting
ERROR 303: CRC check failed.

ERI32 4.8fre can't compress files larger than (free DPMI memory)/6, i.e.
about 10Mb on a PC with 64Mb RAM. The largest file (44Mb) was split into
5 chunks of 9000000 bytes each (the last chunk was 8894190 bytes).
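
The arithmetic behind this workaround can be sketched as follows. The function
names are ours, the total file size is only what the chunk figures above imply,
and the 60000000-byte figure for free DPMI memory is a made-up example.

```python
def eri_size_limit(free_dpmi_bytes):
    """Largest file ERI32 4.8fre can take: (free DPMI memory)/6."""
    return free_dpmi_bytes // 6


def chunk_sizes(total_bytes, chunk_bytes):
    """Sizes of the pieces when a file is cut into fixed-size chunks."""
    full, rest = divmod(total_bytes, chunk_bytes)
    return [chunk_bytes] * full + ([rest] if rest else [])


# Hypothetical example: 60000000 bytes of free DPMI memory.
print(eri_size_limit(60000000))

# Size implied by the report: four full chunks plus the last one.
total = 4 * 9000000 + 8894190
print(total)
print(chunk_sizes(total, 9000000))
```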

Bugs were found in the tested versions of SBC and ZZip,
but they have been fixed in the latest versions, ZZip 0.35g and SBC 0.305b .

No problems were found in any of the other compressors.


The LATEST RELEASE, and all previous versions of these tests can be found
at http://geocities.com/SiliconValley/Bay/1995/ and http://artest1.tripod.com/



The FINAL PART
==============
>     [[5]] PLEASE read THIS before replying to this article
was removed from this text, but can be easily found at
http://geocities.com/SiliconValley/Bay/1995/artest10.html
http://artest1.tripod.com/artest10.html

Send your suggestions and comments to artest@hotmail.ru
With kind regards,
RAO Inc.
