Compress v. 4.12 (31/5/1995, Kai Uwe Rommel)
Readme/What's new
Compress compresses files using a heavily modified version of the
LZW algorithm as described in IEEE Computer, June 1984.
See the comments in compress.c and the Usenet article at
the end of this file for more details.
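
For orientation only, here is a minimal sketch of plain textbook LZW encoding.
It is NOT the modified variant in compress.c: it uses a slow linear dictionary
search, a fixed 12-bit code space, and returns codes as ints instead of packing
them into a bit stream. The names lzw_encode, find_code and MAX_CODES are made
up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CODES 4096              /* 12-bit code space (illustrative) */

/* Dictionary entry i (i >= 256) means: string for prefix_of[i] + suffix_of[i].
 * Codes 0..255 implicitly stand for the single bytes 0..255. */
static int           prefix_of[MAX_CODES];
static unsigned char suffix_of[MAX_CODES];

/* Linear search for the string (prefix, c); returns its code or -1. */
static int find_code(int n_codes, int prefix, unsigned char c)
{
    int i;
    for (i = 256; i < n_codes; i++)
        if (prefix_of[i] == prefix && suffix_of[i] == c)
            return i;
    return -1;
}

/* Encode len bytes from in[] into codes in out[]; out[] must have room
 * for at least len entries.  Returns the number of codes written. */
size_t lzw_encode(const unsigned char *in, size_t len, int *out)
{
    size_t i, n_out = 0;
    int n_codes = 256;              /* next free dictionary slot */
    int cur;                        /* code of the longest match so far */

    assert(in != NULL && out != NULL);
    if (len == 0)
        return 0;

    cur = in[0];
    for (i = 1; i < len; i++) {
        unsigned char c = in[i];
        int code = find_code(n_codes, cur, c);

        if (code >= 0) {
            cur = code;             /* match grows by one byte */
        } else {
            out[n_out++] = cur;     /* emit longest match */
            if (n_codes < MAX_CODES) {
                prefix_of[n_codes] = cur;   /* remember match + c */
                suffix_of[n_codes] = c;
                n_codes++;
            }
            cur = c;                /* restart matching at c */
        }
    }
    out[n_out++] = cur;             /* flush the final match */
    return n_out;
}
```

For the 7-byte input "ABABABA" this sketch emits the four codes 65, 66, 256,
258. The dictionary is kept in static arrays, so the function is not reentrant.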
The "usermem" script attempts to determine the maximum process size. Some
editing of the script may be necessary (see the comments). If you can't get
it to work at all, just create file "USERMEM" containing the maximum process
size in decimal.
The following preprocessor symbols control the compilation of "compress.c":
o USERMEM Maximum process memory on the system
o SACREDMEM Amount to reserve for other processes
o SIGNED_COMPARE_SLOW Unsigned compare instructions are faster
o NO_UCHAR Don't use "unsigned char" types
o BITS Overrules default set by USERMEM-SACREDMEM
o vax Generate inline assembler
o interdata Defines SIGNED_COMPARE_SLOW
o M_XENIX Makes arrays < 65536 bytes each
o pdp11 BITS=12, NO_UCHAR
o z8000 BITS=12
o pcxt BITS=12
o SHORTNAMES Disallow long filenames ( > 14 characters)
o BSD4 Call setlinebuf(stderr), lstat vs stat, etc.
o VOIDSIG signal returns a void pointer
o DIRENT use <dirent.h> instead of <sys/dir.h>
See the comments at the beginning of the Makefile.
The difference "usermem-sacredmem" determines the maximum BITS that can be
specified with the "-b" flag.
    memory: at least    BITS
    ------  -- -----    ----
             433,484      16
             229,600      15
             127,536      14
              73,464      13
                   0      12
The default is BITS=16.
The maximum number of bits can be overridden by specifying "-DBITS=bits" at
compilation time.
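
The table above can be read as a simple threshold lookup. The sketch below
expresses it as a C function; the thresholds are copied from the table, but the
function name max_bits_for_memory is hypothetical, and compress.c itself makes
this choice at compile time with preprocessor symbols rather than at run time.

```c
#include <assert.h>

/* Largest BITS value permitted for a given USERMEM and SACREDMEM,
 * following the "memory: at least / BITS" table in this README. */
int max_bits_for_memory(long usermem, long sacredmem)
{
    long avail;

    assert(usermem >= 0 && sacredmem >= 0);
    avail = usermem - sacredmem;    /* memory left for compress */

    if (avail >= 433484L) return 16;
    if (avail >= 229600L) return 15;
    if (avail >= 127536L) return 14;
    if (avail >= 73464L)  return 13;
    return 12;                      /* minimum, always allowed */
}
```

For example, USERMEM=450000 with SACREDMEM=20000 leaves 430,000 bytes, which is
below the 433,484 threshold and therefore limits "-b" to 15.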
WARNING: files compressed on a large machine with more bits than allowed by
a version of compress on a smaller machine cannot be decompressed! Use the
"-b12" flag to generate a file on a large machine that can be uncompressed
on a 16-bit machine.
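
To see in advance whether a file will hit this limit, the header can be
inspected. The sketch below assumes the usual compress 3.0/4.x header layout:
two magic bytes (octal 037 and 0235) followed by a flag byte whose low five
bits hold the maximum code size and whose top bit marks block mode. The helper
names header_maxbits and can_decompress are hypothetical; check them against
compress.c before relying on them.

```c
#include <assert.h>
#include <stddef.h>

#define MAGIC_1    0x1f             /* first magic byte, octal 037 */
#define MAGIC_2    0x9d             /* second magic byte, octal 0235 */
#define BIT_MASK   0x1f             /* low 5 bits: maximum code size */
#define BLOCK_MASK 0x80             /* top bit: block mode (unused here) */

/* Returns the maximum bits recorded in a 3-byte header,
 * or -1 if the data does not look like compress output. */
int header_maxbits(const unsigned char *hdr, size_t len)
{
    assert(hdr != NULL);
    if (len < 3 || hdr[0] != MAGIC_1 || hdr[1] != MAGIC_2)
        return -1;
    return hdr[2] & BIT_MASK;
}

/* Returns 1 if a compress built with BITS == local_bits
 * could decompress this file, 0 otherwise. */
int can_decompress(const unsigned char *hdr, size_t len, int local_bits)
{
    int file_bits = header_maxbits(hdr, len);
    return file_bits >= 0 && file_bits <= local_bits;
}
```

A file written with the default 16 bits would fail this check on a machine
built with BITS=12, while one written with "-b12" would pass.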
WARNING: compatibility with compress 3.0 has not been tested in
the 4.1 release of compress.
The output of compress 4.0 is fully compatible with that of compress 3.0.
In other words, the output of compress 4.0 may be fed into uncompress 3.0 or
the output of compress 3.0 may be fed into uncompress 4.0.
The output of compress 4.0 is not compatible with that of
compress 2.0. However, compress 4.0 still accepts the output of
compress 2.0. To generate output that is compatible with compress
2.0, use the undocumented "-C" flag.
Check the Makefile, then "make".
Send comments, complaints and especially patches relating to
compress4.1 to csu@alembic.acs.com.
Random comments:
compress' handling of hard links has been criticized (it refuses to
compress a multiply linked file.) In general, this is the correct
thing to do. Hard links cannot cross file system boundaries, and if
the objective of compressing files is to free disk space in a file
system, compressing one link to a file won't help. Compress has no
way of knowing where the other links are. If you REALLY want to
compress a hard link, use the -f flag. Be aware that when it is
uncompressed, the hard link will not be recreated.
compress4.0's handling of symbolic links was (IMHO) incorrect.
Uncompressing a collection of files should yield exactly what
you had before you compressed them. This didn't happen with
symlinks. Version 4.1 simply ignores attempts to compress
symbolic links, along with anything else that isn't a regular
file. If you're accustomed to using compress followed by tar
to get everything that a directory references, both directly and
indirectly, this may come as something of a disappointment.
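
The two policies just described (skip anything that is not a regular file;
refuse multiply linked files unless -f is given) can be sketched with lstat().
This is an illustration under POSIX assumptions, not the code in compress.c;
the function name ok_to_compress and the message texts are invented here.

```c
#define _POSIX_C_SOURCE 200809L     /* for lstat() under strict C modes */
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns 1 if path should be compressed, 0 if it should be left alone.
 * force corresponds to the -f flag. */
int ok_to_compress(const char *path, int force)
{
    struct stat sb;

    assert(path != NULL);

    /* lstat() does not follow symbolic links, so a symlink is
     * reported as a symlink rather than as its target. */
    if (lstat(path, &sb) != 0) {
        perror(path);
        return 0;
    }
    if (!S_ISREG(sb.st_mode)) {
        /* symlinks, directories, devices, ...: never compressed */
        fprintf(stderr, "%s: not a regular file, unchanged\n", path);
        return 0;
    }
    if (sb.st_nlink > 1 && !force) {
        /* other hard links exist; compressing would not free space */
        fprintf(stderr, "%s: has %ld other link(s), unchanged\n",
                path, (long)(sb.st_nlink - 1));
        return 0;
    }
    return 1;
}
```

Note that in this sketch the force flag overrides only the hard-link check; a
symbolic link is skipped regardless, matching the behaviour described for 4.1.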
The following article from James A. Woods, one of the earlier
authors of compress, explains its relationship to the Unisys
patent on the LZW compression method:
From uunet!zephyr.ens.tek.com!uw-beaver!mit-eddie!wuarchive!usc!ucsd!ucbvax!agate!riacs!jaw Wed Aug 1 15:06:59 EDT 1990
Article: 1282 of gnu.misc.discuss
Path: alembic!uunet!zephyr.ens.tek.com!uw-beaver!mit-eddie!wuarchive!usc!ucsd!ucbvax!agate!riacs!jaw
From: jaw@riacs.edu (James A. Woods)
Newsgroups: gnu.misc.discuss
Subject: Sperry patent #4,558,302 does *not* affect 'compress'
Keywords: data compression, algorithm, patent
Message-ID: <1990Jul31.220935.1424@riacs.edu>
Date: 31 Jul 90 22:09:35 GMT
Organization: RIACS, NASA Ames Research Center
Lines: 69
# "The chief defect of Henry King
Was chewing little bits of string."
-- Hilaire Belloc, Cautionary Tales [1907]
As a co-author of 'compress' who has had contact with an attorney for
Unisys (nee Sperry), I would like to relay a very basic admission from Unisys
that noncommercial use of 'compress' is perfectly legal. 'Compress' is also
commercially distributed by AT&T as part of Unix System 5 release 4,
with no further restrictions placed upon the use of the binary, as far
as I am aware.
From conversations with Professor Abraham Lempel and others, it
appears that neither AT&T, Sun Microsystems, Hewlett Packard, nor IBM
are paying any sort of license fees to Unisys in conjunction with patent
#4,558,302. It may be true that some organizations are paying fees for
data compression technology licensed from one or more of the many holders
of compression patents, but this is all independent from 'compress'.
In particular, I received a letter at NASA dated October 1, 1987 from
John B. Sowell of the Unisys law department, informing me for the first
time that some form of LZW was patented. I naturally expressed
skepticism that an algorithm could be patented (a murky legal area
which remains so), stated that 'compress' is not identical to LZW,
and in fact was designed, developed, and distributed before the ink
on the patent was dry. Several telephone conversations later, Mr. Sowell
intimated that they would *not* seek any fees from users of 'compress'
but instead were signing licensees for hardware implementations of LZW.
So, regardless of what you believe about a shady legal area, if anyone
from Unisys contacts you to extract tribute for the use of 'compress', please
tell them that, first, it is not theirs to begin with, and, second, there is
someone who will testify in court about the conversation above.
It is not even clear if anyone can "own" 'compress', since original developer
Spencer Thomas, myself, and others placed the code in the public domain
long before the adoption of the Berne copyright convention.
In light of the events above, it seems that the Free Software
Foundation is being unduly paranoid about the use of 'compress'.
Now I can well believe that FSF is more likely to be a legal target
than a behemoth like AT&T, but if they are simply redistributing
untouched free software developed years ago in the public sector,
I see no problem.
Aside: I am investigating, possibly for a case history to be
recycled to USENET, the particulars of data compression patents.
I am aware of the following patents: IBM's Miller-Wegman LZ variant,
those of Telcor and ACT [losing candidates for the British Telecom modem
standard], James A. Storer's work on limited lookahead as explicated in his
text "Data Compression (methods and theory)", Computer Science Press, 1988,
and the various patents pending associated with the Fiala and Greene
CACM article of April, 1989 on textual substitution methods.
If you have any lore, send it this way.
Sincerely,
James A. Woods
NASA Ames Research Center (RIACS)
jaw@riacs.edu (or ames!jaw)
P.S. The algorithm patent issue certainly is a "topic A" at the moment.
One useful reference is the review article by Anthony and Colwell --
"Litigating the Validity and Infringement of Software Patents" in
Washington and Lee Law Review, volume 41, fall 1984. I know Robert Colwell
personally. As a practicing patent attorney, he tells me that, at a minimum,
use of an invention "for research purposes" is legitimate.