|
@@ -3055,6 +3055,13 @@ names. @xref{listing member and file names}.
|
|
|
Invokes a @acronym{GNU} extension when adding files to an archive that handles
|
|
|
sparse files efficiently. @xref{sparse}.
|
|
|
|
|
|
+@opsummary{sparse-version}
|
|
|
+@item --sparse-version=@var{version}
|
|
|
+
|
|
|
+Specifies the @dfn{format version} to use when archiving sparse
|
|
|
+files. Implies @option{--sparse}. @xref{sparse}. For the description
|
|
|
+of the supported sparse formats, see @ref{Sparse Formats}.
|
|
|
+
|
|
|
@opsummary{starting-file}
|
|
|
@item --starting-file=@var{name}
|
|
|
@itemx -K @var{name}
|
|
@@ -7726,1180 +7733,1163 @@ to create archives in @samp{gnu} format, however, future version will
|
|
|
switch to @samp{posix}.
|
|
|
|
|
|
@menu
|
|
|
-* Portability:: Making @command{tar} Archives More Portable
|
|
|
* Compression:: Using Less Space through Compression
|
|
|
* Attributes:: Handling File Attributes
|
|
|
+* Portability:: Making @command{tar} Archives More Portable
|
|
|
* cpio:: Comparison of @command{tar} and @command{cpio}
|
|
|
@end menu
|
|
|
|
|
|
-@node Portability
|
|
|
-@section Making @command{tar} Archives More Portable
|
|
|
-
|
|
|
-Creating a @command{tar} archive on a particular system that is meant to be
|
|
|
-useful later on many other machines and with other versions of @command{tar}
|
|
|
-is more challenging than you might think. @command{tar} archive formats
|
|
|
-have been evolving since the first versions of Unix. Many such formats
|
|
|
-are around, and are not always compatible with each other. This section
|
|
|
-discusses a few problems, and gives some advice about making @command{tar}
|
|
|
-archives more portable.
|
|
|
-
|
|
|
-One golden rule is simplicity. For example, limit your @command{tar}
|
|
|
-archives to contain only regular files and directories, avoiding
|
|
|
-other kind of special files. Do not attempt to save sparse files or
|
|
|
-contiguous files as such. Let's discuss a few more problems, in turn.
|
|
|
-
|
|
|
-@FIXME{Discuss GNU extensions (incremental backups, multi-volume
|
|
|
-archives and archive labels) in GNU and PAX formats.}
|
|
|
+@node Compression
|
|
|
+@section Using Less Space through Compression
|
|
|
|
|
|
@menu
|
|
|
-* Portable Names:: Portable Names
|
|
|
-* dereference:: Symbolic Links
|
|
|
-* old:: Old V7 Archives
|
|
|
-* ustar:: Ustar Archives
|
|
|
-* gnu:: GNU and old GNU format archives.
|
|
|
-* posix:: @acronym{POSIX} archives
|
|
|
-* Checksumming:: Checksumming Problems
|
|
|
-* Large or Negative Values:: Large files, negative time stamps, etc.
|
|
|
-* Other Tars:: How to Extract GNU-Specific Data Using
|
|
|
- Other @command{tar} Implementations
|
|
|
+* gzip:: Creating and Reading Compressed Archives
|
|
|
+* sparse:: Archiving Sparse Files
|
|
|
@end menu
|
|
|
|
|
|
-@node Portable Names
|
|
|
-@subsection Portable Names
|
|
|
+@node gzip
|
|
|
+@subsection Creating and Reading Compressed Archives
|
|
|
+@cindex Compressed archives
|
|
|
+@cindex Storing archives in compressed format
|
|
|
|
|
|
-Use portable file and member names. A name is portable if it contains
|
|
|
-only ASCII letters and digits, @samp{/}, @samp{.}, @samp{_}, and
|
|
|
-@samp{-}; it cannot be empty, start with @samp{-} or @samp{//}, or
|
|
|
-contain @samp{/-}. Avoid deep directory nesting. For portability to
|
|
|
-old Unix hosts, limit your file name components to 14 characters or
|
|
|
-less.
|
|
|
+@GNUTAR{} is able to create and read compressed archives. It supports the
|
|
|
+@command{gzip} and @command{bzip2} compression programs. For backward
|
|
|
+compatibility, it also supports the @command{compress} command, although
|
|
|
+we strongly recommend against using it, since there is a patent
|
|
|
+covering the algorithm it uses and you could be sued for patent
|
|
|
+infringement merely by running @command{compress}! Besides, it is less
|
|
|
+effective than @command{gzip} and @command{bzip2}.
|
|
|
|
|
|
-If you intend to have your @command{tar} archives to be read under
|
|
|
-MSDOS, you should not rely on case distinction for file names, and you
|
|
|
-might use the @acronym{GNU} @command{doschk} program for helping you
|
|
|
-further diagnosing illegal MSDOS names, which are even more limited
|
|
|
-than System V's.
|
|
|
+Creating a compressed archive is simple: you just specify a
|
|
|
+@dfn{compression option} along with the usual archive creation
|
|
|
+commands. The compression option is @option{-z} (@option{--gzip}) to
|
|
|
+create a @command{gzip} compressed archive, @option{-j}
|
|
|
+(@option{--bzip2}) to create a @command{bzip2} compressed archive, and
|
|
|
+@option{-Z} (@option{--compress}) to use the @command{compress} program.
|
|
|
+For example:
|
|
|
|
|
|
-@node dereference
|
|
|
-@subsection Symbolic Links
|
|
|
-@cindex File names, using symbolic links
|
|
|
-@cindex Symbolic link as file name
|
|
|
+@smallexample
|
|
|
+$ @kbd{tar cfz archive.tar.gz .}
|
|
|
+@end smallexample
|
|
|
|
|
|
-@opindex dereference
|
|
|
-Normally, when @command{tar} archives a symbolic link, it writes a
|
|
|
-block to the archive naming the target of the link. In that way, the
|
|
|
-@command{tar} archive is a faithful record of the file system contents.
|
|
|
-@option{--dereference} (@option{-h}) is used with @option{--create} (@option{-c}), and causes
|
|
|
-@command{tar} to archive the files symbolic links point to, instead of
|
|
|
-the links themselves. When this option is used, when @command{tar}
|
|
|
-encounters a symbolic link, it will archive the linked-to file,
|
|
|
-instead of simply recording the presence of a symbolic link.
|
|
|
+Reading a compressed archive is even simpler: you don't need to specify
|
|
|
+any additional options as @GNUTAR{} recognizes its format
|
|
|
+automatically. Thus, the following commands will list and extract the
|
|
|
+archive created in the previous example:
|
|
|
|
|
|
-The name under which the file is stored in the file system is not
|
|
|
-recorded in the archive. To record both the symbolic link name and
|
|
|
-the file name in the system, archive the file under both names. If
|
|
|
-all links were recorded automatically by @command{tar}, an extracted file
|
|
|
-might be linked to a file name that no longer exists in the file
|
|
|
-system.
|
|
|
+@smallexample
|
|
|
+# List the compressed archive
|
|
|
+$ @kbd{tar tf archive.tar.gz}
|
|
|
+# Extract the compressed archive
|
|
|
+$ @kbd{tar xf archive.tar.gz}
|
|
|
+@end smallexample
|
|
|
|
|
|
-If a linked-to file is encountered again by @command{tar} while creating
|
|
|
-the same archive, an entire second copy of it will be stored. (This
|
|
|
-@emph{might} be considered a bug.)
|
|
|
+The only case when you have to specify a decompression option while
|
|
|
+reading the archive is when reading from a pipe or from a tape drive
|
|
|
+that does not support random access. However, in this case @GNUTAR{}
|
|
|
+will indicate which option you should use. For example:
|
|
|
|
|
|
-So, for portable archives, do not archive symbolic links as such,
|
|
|
-and use @option{--dereference} (@option{-h}): many systems do not support
|
|
|
-symbolic links, and moreover, your distribution might be unusable if
|
|
|
-it contains unresolved symbolic links.
|
|
|
+@smallexample
|
|
|
+$ @kbd{cat archive.tar.gz | tar tf -}
|
|
|
+tar: Archive is compressed. Use -z option
|
|
|
+tar: Error is not recoverable: exiting now
|
|
|
+@end smallexample
|
|
|
|
|
|
-@node old
|
|
|
-@subsection Old V7 Archives
|
|
|
-@cindex Format, old style
|
|
|
-@cindex Old style format
|
|
|
-@cindex Old style archives
|
|
|
-@cindex v7 archive format
|
|
|
+If you see such diagnostics, just add the suggested option to the
|
|
|
+invocation of @GNUTAR{}:
|
|
|
|
|
|
-Certain old versions of @command{tar} cannot handle additional
|
|
|
-information recorded by newer @command{tar} programs. To create an
|
|
|
-archive in V7 format (not ANSI), which can be read by these old
|
|
|
-versions, specify the @option{--format=v7} option in
|
|
|
-conjunction with the @option{--create} (@option{-c}) (@command{tar} also
|
|
|
-accepts @option{--portability} or @samp{op-old-archive} for this
|
|
|
-option). When you specify it,
|
|
|
-@command{tar} leaves out information about directories, pipes, fifos,
|
|
|
-contiguous files, and device files, and specifies file ownership by
|
|
|
-group and user IDs instead of group and user names.
|
|
|
+@smallexample
|
|
|
+$ @kbd{cat archive.tar.gz | tar tfz -}
|
|
|
+@end smallexample
|
|
|
|
|
|
-When updating an archive, do not use @option{--format=v7}
|
|
|
-unless the archive was created using this option.
|
|
|
+Note also that there are several restrictions on operations on
|
|
|
+compressed archives. First of all, compressed archives cannot be
|
|
|
+modified, i.e., you cannot update (@option{--update} (@option{-u})) them or delete
|
|
|
+(@option{--delete}) members from them. Likewise, you cannot append
|
|
|
+another @command{tar} archive to a compressed archive using
|
|
|
+@option{--append} (@option{-r}). Secondly, multi-volume archives cannot be
|
|
|
+compressed.
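+
+@noindent
+To modify a compressed archive anyway, one possible workaround is to
+decompress it, change the plain archive, and compress it again. The
+following is only a sketch; @file{archive.tar.gz} and @file{newfile}
+are placeholder names:
+
+@smallexample
+$ @kbd{gunzip archive.tar.gz}
+$ @kbd{tar rf archive.tar newfile}
+$ @kbd{gzip archive.tar}
+@end smallexample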
|
|
|
|
|
|
-In most cases, a @emph{new} format archive can be read by an @emph{old}
|
|
|
-@command{tar} program without serious trouble, so this option should
|
|
|
-seldom be needed. On the other hand, most modern @command{tar}s are
|
|
|
-able to read old format archives, so it might be safer for you to
|
|
|
-always use @option{--format=v7} for your distributions.
|
|
|
+The following table summarizes compression options used by @GNUTAR{}.
|
|
|
|
|
|
-@node ustar
|
|
|
-@subsection Ustar Archive Format
|
|
|
+@table @option
|
|
|
+@opindex gzip
|
|
|
+@opindex ungzip
|
|
|
+@item -z
|
|
|
+@itemx --gzip
|
|
|
+@itemx --ungzip
|
|
|
+Filter the archive through @command{gzip}.
|
|
|
|
|
|
-@cindex ustar archive format
|
|
|
-Archive format defined by @acronym{POSIX}.1-1988 specification is called
|
|
|
-@code{ustar}. Although it is more flexible than the V7 format, it
|
|
|
-still has many restrictions (@xref{Formats,ustar}, for the detailed
|
|
|
-description of @code{ustar} format). Along with V7 format,
|
|
|
-@code{ustar} format is a good choice for archives intended to be read
|
|
|
-with other implementations of @command{tar}.
|
|
|
+You can use @option{--gzip} and @option{--gunzip} on physical devices
|
|
|
+(tape drives, etc.) and remote files as well as on normal files; data
|
|
|
+to or from such devices or remote files is reblocked by another copy
|
|
|
+of the @command{tar} program to enforce the specified (or default) record
|
|
|
+size. The default compression parameters are used; if you need to
|
|
|
+override them, set the @env{GZIP} environment variable, e.g.:
|
|
|
|
|
|
-To create archive in @code{ustar} format, use @option{--format=ustar}
|
|
|
-option in conjunction with the @option{--create} (@option{-c}).
|
|
|
+@smallexample
|
|
|
+$ @kbd{GZIP=--best tar cfz archive.tar.gz subdir}
|
|
|
+@end smallexample
|
|
|
|
|
|
-@node gnu
|
|
|
-@subsection @acronym{GNU} and old @GNUTAR{} format
|
|
|
+@noindent
|
|
|
+Another way would be to avoid the @option{--gzip} (@option{--gunzip}, @option{--ungzip}, @option{-z}) option and run
|
|
|
+@command{gzip} explicitly:
|
|
|
|
|
|
-@cindex GNU archive format
|
|
|
-@cindex Old GNU archive format
|
|
|
-@GNUTAR{} was based on an early draft of the
|
|
|
-@acronym{POSIX} 1003.1 @code{ustar} standard. @acronym{GNU} extensions to
|
|
|
-@command{tar}, such as the support for file names longer than 100
|
|
|
-characters, use portions of the @command{tar} header record which were
|
|
|
-specified in that @acronym{POSIX} draft as unused. Subsequent changes in
|
|
|
-@acronym{POSIX} have allocated the same parts of the header record for
|
|
|
-other purposes. As a result, @GNUTAR{} format is
|
|
|
-incompatible with the current @acronym{POSIX} specification, and with
|
|
|
-@command{tar} programs that follow it.
|
|
|
+@smallexample
|
|
|
+$ @kbd{tar cf - subdir | gzip --best -c - > archive.tar.gz}
|
|
|
+@end smallexample
|
|
|
|
|
|
-In the majority of cases, @command{tar} will be configured to create
|
|
|
-this format by default. This will change in the future releases, since
|
|
|
-we plan to make @samp{posix} format the default.
|
|
|
+@cindex corrupted archives
|
|
|
+About corrupted compressed archives: @command{gzip}'ed files have no
|
|
|
+redundancy, for maximum compression. The adaptive nature of the
|
|
|
+compression scheme means that the compression tables are implicitly
|
|
|
+spread all over the archive. If you lose a few blocks, the dynamic
|
|
|
+construction of the compression tables becomes unsynchronized, and there
|
|
|
+is little chance that you could recover later in the archive.
|
|
|
|
|
|
-To force creation a @GNUTAR{} archive, use option
|
|
|
-@option{--format=gnu}.
|
|
|
+There are pending suggestions for having a per-volume or per-file
|
|
|
+compression in @GNUTAR{}. This would allow for viewing the
|
|
|
+contents without decompression, and for resynchronizing decompression at
|
|
|
+every volume or file, in case of corrupted archives. Doing so, we might
|
|
|
+lose some compressibility. But this would make recovery easier.
|
|
|
+So, there are pros and cons. We'll see!
|
|
|
|
|
|
-@node posix
|
|
|
-@subsection @GNUTAR{} and @acronym{POSIX} @command{tar}
|
|
|
+@opindex bzip2
|
|
|
+@item -j
|
|
|
+@itemx --bzip2
|
|
|
+Filter the archive through @command{bzip2}. Otherwise like @option{--gzip}.
|
|
|
|
|
|
-@cindex POSIX archive format
|
|
|
-@cindex PAX archive format
|
|
|
-Starting from version 1.14 @GNUTAR{} features full support for
|
|
|
-@acronym{POSIX.1-2001} archives.
|
|
|
+@opindex compress
|
|
|
+@opindex uncompress
|
|
|
+@item -Z
|
|
|
+@itemx --compress
|
|
|
+@itemx --uncompress
|
|
|
+Filter the archive through @command{compress}. Otherwise like @option{--gzip}.
|
|
|
|
|
|
-A @acronym{POSIX} conformant archive will be created if @command{tar}
|
|
|
-was given @option{--format=posix} (@option{--format=pax}) option. No
|
|
|
-special option is required to read and extract from a @acronym{POSIX}
|
|
|
-archive.
|
|
|
+The @acronym{GNU} Project recommends you not use
|
|
|
+@command{compress}, because there is a patent covering the algorithm it
|
|
|
+uses. You could be sued for patent infringement merely by running
|
|
|
+@command{compress}.
|
|
|
|
|
|
-@menu
|
|
|
-* PAX keywords:: Controlling Extended Header Keywords.
|
|
|
-@end menu
|
|
|
+@opindex use-compress-program
|
|
|
+@item --use-compress-program=@var{prog}
|
|
|
+Use external compression program @var{prog}. Use this option if you
|
|
|
+have a compression program that @GNUTAR{} does not support. There
|
|
|
+are two requirements with which @var{prog} should comply:
|
|
|
|
|
|
-@node PAX keywords
|
|
|
-@subsubsection Controlling Extended Header Keywords
|
|
|
+First, when called without options, it should read data from standard
|
|
|
+input, compress it and output it on standard output.
|
|
|
|
|
|
-@table @option
|
|
|
-@opindex pax-option
|
|
|
-@item --pax-option=@var{keyword-list}
|
|
|
-Handle keywords in @acronym{PAX} extended headers. This option is
|
|
|
-equivalent to @option{-o} option of the @command{pax} utility.
|
|
|
+Secondly, if called with the @option{-d} argument, it should do exactly
|
|
|
+the opposite, i.e., read the compressed data from the standard input
|
|
|
+and produce uncompressed data on the standard output.
|
|
|
@end table
|
|
|
|
|
|
-@var{Keyword-list} is a comma-separated
|
|
|
-list of keyword options, each keyword option taking one of
|
|
|
-the following forms:
|
|
|
+@cindex gpg, using with tar
|
|
|
+@cindex gnupg, using with tar
|
|
|
+@cindex Using encrypted archives
|
|
|
+The @option{--use-compress-program} option, in particular, lets you
|
|
|
+implement your own filters, not necessarily dealing with
|
|
|
+compression/decompression. For example, suppose you wish to implement
|
|
|
+PGP encryption on top of compression, using @command{gpg} (@pxref{Top,
|
|
|
+gpg, gpg ---- encryption and signing tool, gpg, GNU Privacy Guard
|
|
|
+Manual}). The following script does that:
|
|
|
|
|
|
-@table @code
|
|
|
-@item delete=@var{pattern}
|
|
|
-When used with one of archive-creation commands,
|
|
|
-this option instructs @command{tar} to omit from extended header records
|
|
|
-that it produces any keywords matching the string @var{pattern}.
|
|
|
+@smallexample
|
|
|
+@group
|
|
|
+#! /bin/sh
|
|
|
+case $1 in
|
|
|
+-d) gpg --decrypt - | gzip -d -c;;
|
|
|
+'') gzip -c | gpg -s ;;
|
|
|
+*) echo "Unknown option $1">&2; exit 1;;
|
|
|
+esac
|
|
|
+@end group
|
|
|
+@end smallexample
|
|
|
|
|
|
-When used in extract or list mode, this option instructs tar
|
|
|
-to ignore any keywords matching the given @var{pattern} in the extended
|
|
|
-header records. In both cases, matching is performed using the pattern
|
|
|
-matching notation described in @acronym{POSIX 1003.2}, 3.13
|
|
|
-(@pxref{wildcards}). For example:
|
|
|
+Suppose you name it @file{gpgz} and save it somewhere in your
|
|
|
+@env{PATH}. Then the following command will create a compressed
|
|
|
+archive signed with your private key:
|
|
|
|
|
|
@smallexample
|
|
|
---pax-option delete=security.*
|
|
|
+$ @kbd{tar -cf foo.tar.gpgz --use-compress=gpgz .}
|
|
|
@end smallexample
|
|
|
|
|
|
-would suppress security-related information.
|
|
|
+@noindent
|
|
|
+Likewise, the following command will list its contents:
|
|
|
|
|
|
-@item exthdr.name=@var{string}
|
|
|
+@smallexample
|
|
|
+$ @kbd{tar -tf foo.tar.gpgz --use-compress=gpgz .}
|
|
|
+@end smallexample
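+
+@noindent
+Extraction should work the same way, since @command{tar} invokes the
+program with @option{-d} when reading an archive. A sketch, assuming
+the @file{gpgz} script shown above is in your @env{PATH}:
+
+@smallexample
+$ @kbd{tar -xf foo.tar.gpgz --use-compress-program=gpgz}
+@end smallexample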
|
|
|
|
|
|
-This keyword allows user control over the name that is written into the
|
|
|
-ustar header blocks for the extended headers. The name is obtained
|
|
|
-from @var{string} after making the following substitutions:
|
|
|
+@ignore
|
|
|
+The above is based on the following discussion:
|
|
|
|
|
|
-@multitable @columnfractions .25 .55
|
|
|
-@headitem Meta-character @tab Replaced By
|
|
|
-@item %d @tab The directory name of the file, equivalent to the
|
|
|
-result of the @command{dirname} utility on the translated pathname.
|
|
|
-@item %f @tab The filename of the file, equivalent to the result
|
|
|
-of the @command{basename} utility on the translated pathname.
|
|
|
-@item %p @tab The process ID of the @command{tar} process.
|
|
|
-@item %% @tab A @samp{%} character.
|
|
|
-@end multitable
|
|
|
+ I have one question, or maybe it's a suggestion if there isn't a way
|
|
|
+ to do it now. I would like to use @option{--gzip}, but I'd also like
|
|
|
+ the output to be fed through a program like @acronym{GNU}
|
|
|
+ @command{ecc} (actually, right now that's @samp{exactly} what I'd like
|
|
|
+ to use :-)), basically adding ECC protection on top of compression.
|
|
|
+ It seems as if this should be quite easy to do, but I can't work out
|
|
|
+ exactly how to go about it. Of course, I can pipe the standard output
|
|
|
+ of @command{tar} through @command{ecc}, but then I lose (though I
|
|
|
+ haven't started using it yet, I confess) the ability to have
|
|
|
+ @command{tar} use @command{rmt} for it's I/O (I think).
|
|
|
|
|
|
-Any other @samp{%} characters in @var{string} produce undefined
|
|
|
-results.
|
|
|
+ I think the most straightforward thing would be to let me specify a
|
|
|
+ general set of filters outboard of compression (preferably ordered,
|
|
|
+ so the order can be automatically reversed on input operations, and
|
|
|
+ with the options they require specifiable), but beggars shouldn't be
|
|
|
+ choosers and anything you decide on would be fine with me.
|
|
|
|
|
|
-If no option @samp{exthdr.name=string} is specified, @command{tar}
|
|
|
-will use the following default value:
|
|
|
+ By the way, I like @command{ecc} but if (as the comments say) it can't
|
|
|
+ deal with loss of block sync, I'm tempted to throw some time at adding
|
|
|
+ that capability. Supposing I were to actually do such a thing and
|
|
|
+ get it (apparently) working, do you accept contributed changes to
|
|
|
+ utilities like that? (Leigh Clayton @file{loc@@soliton.com}, May 1995).
|
|
|
+
|
|
|
+ Isn't that exactly the role of the
|
|
|
+ @option{--use-compress-prog=@var{program}} option?
|
|
|
+ I never tried it myself, but I suspect you may want to write a
|
|
|
+ @var{prog} script or program able to filter stdin to stdout to
|
|
|
+ way you want. It should recognize the @option{-d} option, for when
|
|
|
+ extraction is needed rather than creation.
|
|
|
|
|
|
-@smallexample
|
|
|
-%d/PaxHeaders.%p/%f
|
|
|
-@end smallexample
|
|
|
+ It has been reported that if one writes compressed data (through the
|
|
|
+ @option{--gzip} or @option{--compress} options) to a DLT and tries to use
|
|
|
+ the DLT compression mode, the data will actually get bigger and one will
|
|
|
+ end up with less space on the tape.
|
|
|
+@end ignore
|
|
|
|
|
|
-@item globexthdr.name=@var{string}
|
|
|
-This keyword allows user control over the name that is written into
|
|
|
-the ustar header blocks for global extended header records. The name
|
|
|
-is obtained from the contents of @var{string}, after making
|
|
|
-the following substitutions:
|
|
|
+@node sparse
|
|
|
+@subsection Archiving Sparse Files
|
|
|
+@cindex Sparse Files
|
|
|
|
|
|
-@multitable @columnfractions .25 .55
|
|
|
-@headitem Meta-character @tab Replaced By
|
|
|
-@item %n @tab An integer that represents the
|
|
|
-sequence number of the global extended header record in the archive,
|
|
|
-starting at 1.
|
|
|
-@item %p @tab The process ID of the @command{tar} process.
|
|
|
-@item %% @tab A @samp{%} character.
|
|
|
-@end multitable
|
|
|
+Files in the file system occasionally have @dfn{holes}. A @dfn{hole}
|
|
|
+in a file is a section of the file's contents which was never written.
|
|
|
+The contents of a hole read as all zeros. On many operating systems,
|
|
|
+actual disk storage is not allocated for holes, but they are counted
|
|
|
+in the length of the file. If you archive such a file, @command{tar}
|
|
|
+could create an archive longer than the original. To have @command{tar}
|
|
|
+attempt to recognize the holes in a file, use @option{--sparse}
|
|
|
+(@option{-S}). When you use this option, then, for any file using
|
|
|
+less disk space than would be expected from its length, @command{tar}
|
|
|
+searches the file for consecutive stretches of zeros. It then records
|
|
|
+in the archive for the file where the consecutive stretches of zeros
|
|
|
+are, and only archives the ``real contents'' of the file. On
|
|
|
+extraction (using @option{--sparse} is not needed on extraction) any
|
|
|
+such files have holes created wherever the continuous stretches of zeros
|
|
|
+were found. Thus, if you use @option{--sparse}, @command{tar} archives
|
|
|
+won't take more space than the original.
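+
+As an illustration, the following sketch creates a file consisting of
+a single 100-megabyte hole and archives it with and without
+@option{--sparse}. It assumes that the @command{truncate} utility
+from @acronym{GNU} coreutils is available and that the file system
+supports holes; the file names are arbitrary:
+
+@smallexample
+$ @kbd{truncate -s 100M hole.img}
+$ @kbd{tar -cf plain.tar hole.img}
+$ @kbd{tar -cSf sparse.tar hole.img}
+$ @kbd{ls -l plain.tar sparse.tar}
+@end smallexample
+
+@noindent
+On such a system, @file{sparse.tar} should turn out much smaller than
+@file{plain.tar}.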
|
|
|
|
|
|
-Any other @samp{%} characters in @var{string} produce undefined results.
|
|
|
+@table @option
|
|
|
+@opindex sparse
|
|
|
+@item -S
|
|
|
+@itemx --sparse
|
|
|
+This option instructs @command{tar} to test each file for sparseness
|
|
|
+before attempting to archive it. If the file is found to be sparse it
|
|
|
+is treated specially, which reduces the amount of space
|
|
|
+used by its image in the archive.
|
|
|
|
|
|
-If no option @samp{globexthdr.name=string} is specified, @command{tar}
|
|
|
-will use the following default value:
|
|
|
+This option is meaningful only when creating or updating archives. It
|
|
|
+has no effect on extraction.
|
|
|
+@end table
|
|
|
|
|
|
-@smallexample
|
|
|
-$TMPDIR/GlobalHead.%p.%n
|
|
|
-@end smallexample
|
|
|
+Consider using @option{--sparse} when performing file system backups,
|
|
|
+to avoid archiving the expanded forms of files stored sparsely in the
|
|
|
+system.
|
|
|
|
|
|
-@noindent
|
|
|
-where @samp{$TMPDIR} represents the value of the @var{TMPDIR}
|
|
|
-environment variable. If @var{TMPDIR} is not set, @command{tar}
|
|
|
-uses @samp{/tmp}.
|
|
|
+Even if your system has no sparse files currently, some may be
|
|
|
+created in the future. If you use @option{--sparse} while making file
|
|
|
+system backups as a matter of course, you can be assured the archive
|
|
|
+will never take more space on the media than the files take on disk
|
|
|
+(otherwise, archiving a disk filled with sparse files might take
|
|
|
+hundreds of tapes). @xref{Incremental Dumps}.
|
|
|
|
|
|
-@item @var{keyword}=@var{value}
|
|
|
-When used with one of archive-creation commands, these keyword/value pairs
|
|
|
-will be included at the beginning of the archive in a global extended
|
|
|
-header record. When used with one of archive-reading commands,
|
|
|
-@command{tar} will behave as if it has encountered these keyword/value
|
|
|
-pairs at the beginning of the archive in a global extended header
|
|
|
-record.
|
|
|
+However, be aware that the @option{--sparse} option has a serious
|
|
|
+drawback. Namely, in order to determine if the file is sparse
|
|
|
+@command{tar} has to read it before trying to archive it, so in total
|
|
|
+the file is read @strong{twice}. So, always bear in mind that the
|
|
|
+time needed to process all files with this option is roughly twice
|
|
|
+the time needed to archive them without it.
|
|
|
+@FIXME{A technical note:
|
|
|
|
|
|
-@item @var{keyword}:=@var{value}
|
|
|
-When used with one of archive-creation commands, these keyword/value pairs
|
|
|
-will be included as records at the beginning of an extended header for
|
|
|
-each file. This is effectively equivalent to @var{keyword}=@var{value}
|
|
|
-form except that it creates no global extended header records.
|
|
|
+Programs like @command{dump} do not have to read the entire file; by
|
|
|
+examining the file system directly, they can determine in advance
|
|
|
+exactly where the holes are and thus avoid reading through them. The
|
|
|
+only data it need read are the actual allocated data blocks.
|
|
|
+@GNUTAR{} uses a more portable and straightforward
|
|
|
+archiving approach; it would be fairly difficult for it to do
|
|
|
+otherwise. Elizabeth Zwicky writes to @file{comp.unix.internals}, on
|
|
|
+1990-12-10:
|
|
|
|
|
|
-When used with one of archive-reading commands, @command{tar} will
|
|
|
-behave as if these keyword/value pairs were included as records at the
|
|
|
-end of each extended header; thus, they will override any global or
|
|
|
-file-specific extended header record keywords of the same names.
|
|
|
-For example, in the command:
|
|
|
+@quotation
|
|
|
+What I did say is that you cannot tell the difference between a hole and an
|
|
|
+equivalent number of nulls without reading raw blocks. @code{st_blocks} at
|
|
|
+best tells you how many holes there are; it doesn't tell you @emph{where}.
|
|
|
+Just as programs may, conceivably, care what @code{st_blocks} is (care
|
|
|
+to name one that does?), they may also care where the holes are (I have
|
|
|
+no examples of this one either, but it's equally imaginable).
|
|
|
|
|
|
-@smallexample
|
|
|
-tar --format=posix --create \
|
|
|
- --file archive --pax-option gname:=user .
|
|
|
-@end smallexample
|
|
|
+I conclude from this that good archivers are not portable. One can
|
|
|
+arguably conclude that if you want a portable program, you can in good
|
|
|
+conscience restore files with as many holes as possible, since you can't
|
|
|
+get it right.
|
|
|
+@end quotation
|
|
|
+}
|
|
|
|
|
|
-the group name will be forced to a new value for all files
|
|
|
-stored in the archive.
|
|
|
+@cindex sparse formats, defined
|
|
|
+When using @samp{POSIX} archive format, @GNUTAR{} is able to store
|
|
|
+sparse files in three distinct ways, called @dfn{sparse
|
|
|
+formats}. A sparse format is identified by its @dfn{number},
|
|
|
+consisting, as usual, of two decimal numbers delimited by a dot. By
|
|
|
+default, format @samp{1.0} is used. If, for some reason, you wish to
|
|
|
+use an earlier format, you can select it using the
|
|
|
+@option{--sparse-version} option.
|
|
|
+
|
|
|
+@table @option
|
|
|
+@opindex sparse-version
|
|
|
+@item --sparse-version=@var{version}
|
|
|
+
|
|
|
+Select the format to store sparse files in. Valid @var{version} values
|
|
|
+are: @samp{0.0}, @samp{0.1} and @samp{1.0}. @xref{Sparse Formats},
|
|
|
+for a detailed description of each format.
|
|
|
@end table
|
|
|
|
|
|
-@node Checksumming
|
|
|
-@subsection Checksumming Problems
|
|
|
+Using the @option{--sparse-version} option implies @option{--sparse}.
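+
+For instance, the following command stores sparse files in format
+@samp{0.1} within a @samp{posix} archive. This is only a sketch;
+@file{archive.tar} and @file{dir} are placeholder names:
+
+@smallexample
+$ @kbd{tar -c --format=posix --sparse-version=0.1 -f archive.tar dir}
+@end smallexample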
|
|
|
|
|
|
-SunOS and HP-UX @command{tar} fail to accept archives created using
|
|
|
-@GNUTAR{} and containing non-ASCII file names, that
|
|
|
-is, file names having characters with the eight bit set, because they
|
|
|
-use signed checksums, while @GNUTAR{} uses unsigned
|
|
|
-checksums while creating archives, as per @acronym{POSIX} standards. On
|
|
|
-reading, @GNUTAR{} computes both checksums and
|
|
|
-accept any. It is somewhat worrying that a lot of people may go
|
|
|
-around doing backup of their files using faulty (or at least
|
|
|
-non-standard) software, not learning about it until it's time to
|
|
|
-restore their missing files with an incompatible file extractor, or
|
|
|
-vice versa.
|
|
|
+@node Attributes
|
|
|
+@section Handling File Attributes
|
|
|
+@UNREVISED
|
|
|
|
|
|
-@GNUTAR{} compute checksums both ways, and accept
|
|
|
-any on read, so @acronym{GNU} tar can read Sun tapes even with their
|
|
|
-wrong checksums. @GNUTAR{} produces the standard
|
|
|
-checksum, however, raising incompatibilities with Sun. That is to
|
|
|
-say, @GNUTAR{} has not been modified to
|
|
|
-@emph{produce} incorrect archives to be read by buggy @command{tar}'s.
|
|
|
-I've been told that more recent Sun @command{tar} now read standard
|
|
|
-archives, so maybe Sun did a similar patch, after all?
|
|
|
+When @command{tar} reads files, it updates their access times. To
|
|
|
+avoid this, use the @option{--atime-preserve[=METHOD]} option, which can either
|
|
|
+reset the access time retroactively or avoid changing it in the first
|
|
|
+place.
|
|
|
|
|
|
-The story seems to be that when Sun first imported @command{tar}
|
|
|
-sources on their system, they recompiled it without realizing that
|
|
|
-the checksums were computed differently, because of a change in
|
|
|
-the default signing of @code{char}'s in their compiler. So they
|
|
|
-started computing checksums wrongly. When they later realized their
|
|
|
-mistake, they merely decided to stay compatible with it, and with
|
|
|
-themselves afterwards. Presumably, but I do not really know, HP-UX
|
|
|
-has chosen that their @command{tar} archives to be compatible with Sun's.
|
|
|
-The current standards do not favor Sun @command{tar} format. In any
|
|
|
-case, it now falls on the shoulders of SunOS and HP-UX users to get
|
|
|
-a @command{tar} able to read the good archives they receive.
|
|
|
+Handling of file attributes
|
|
|
|
|
|
-@node Large or Negative Values
|
|
|
-@subsection Large or Negative Values
|
|
|
-@cindex large values
|
|
|
-@cindex future time stamps
|
|
|
-@cindex negative time stamps
|
|
|
-@UNREVISED{}
|
|
|
+@table @option
|
|
|
+@opindex atime-preserve
|
|
|
+@item --atime-preserve
|
|
|
+@itemx --atime-preserve=replace
|
|
|
+@itemx --atime-preserve=system
|
|
|
+Preserve the access times of files that are read. This works only for
|
|
|
+files that you own, unless you have superuser privileges.
|
|
|
|
|
|
-The above sections suggest to use @samp{oldest possible} archive
|
|
|
-format if in doubt. However, sometimes it is not possible. If you
|
|
|
-attempt to archive a file whose metadata cannot be represented using
|
|
|
-required format, @GNUTAR{} will print error message and ignore such a
|
|
|
-file. You will than have to switch to a format that is able to
|
|
|
-handle such values. The format summary table (@pxref{Formats}) will
|
|
|
-help you to do so.
|
|
|
+@option{--atime-preserve=replace} works on most systems, but it also
|
|
|
+restores the data modification time and updates the status change
|
|
|
+time. Hence it doesn't interact with incremental dumps nicely
|
|
|
+(@pxref{Incremental Dumps}), and it can set access or data modification times
|
|
|
+incorrectly if other programs access the file while @command{tar} is
|
|
|
+running.
|
|
|
|
|
|
-In particular, when trying to archive files larger than 8GB or with
|
|
|
-timestamps not in the range 1970-01-01 00:00:00 through 2242-03-16
|
|
|
-12:56:31 @sc{utc}, you will have to chose between @acronym{GNU} and
|
|
|
-@acronym{POSIX} archive formats. When considering which format to
|
|
|
-choose, bear in mind that the @acronym{GNU} format uses
|
|
|
-two's-complement base-256 notation to store values that do not fit
|
|
|
-into standard @acronym{ustar} range. Such archives can generally be
|
|
|
-read only by a @GNUTAR{} implementation. Moreover, they sometimes
|
|
|
-cannot be correctly restored on another hosts even by @GNUTAR{}. For
|
|
|
-example, using two's complement representation for negative time
|
|
|
-stamps that assumes a signed 32-bit @code{time_t} generates archives
|
|
|
-that are not portable to hosts with differing @code{time_t}
|
|
|
-representations.
|
|
|
+@option{--atime-preserve=system} avoids changing the access time in
|
|
|
+the first place, if the operating system supports this.
|
|
|
+Unfortunately, this may or may not work on any given operating system
|
|
|
+or file system. If @command{tar} knows for sure it won't work, it
|
|
|
+complains right away.
|
|
|
|
|
|
-On the other hand, @acronym{POSIX} archives, generally speaking, can
|
|
|
-be extracted by any tar implementation that understands older
|
|
|
-@acronym{ustar} format. The only exception are files larger than 8GB.
|
|
|
+Currently @option{--atime-preserve} with no operand defaults to
|
|
|
+@option{--atime-preserve=replace}, but this is intended to change to
|
|
|
+@option{--atime-preserve=system} when the latter is better-supported.
|
|
|
|
|
|
-@FIXME{Describe how @acronym{POSIX} archives are extracted by non
|
|
|
-POSIX-aware tars.}
|
|
|
+@opindex touch
|
|
|
+@item -m
|
|
|
+@itemx --touch
|
|
|
+Do not extract data modification time.
|
|
|
|
|
|
-@node Other Tars
|
|
|
-@subsection How to Extract GNU-Specific Data Using Other @command{tar} Implementations
|
|
|
+When this option is used, @command{tar} leaves the data modification times
|
|
|
+of the files it extracts as the times when the files were extracted,
|
|
|
+instead of setting them to the times recorded in the archive.
|
|
|
|
|
|
-In previous sections you became acquainted with various quircks
|
|
|
-necessary to make your archives portable. Sometimes you may need to
|
|
|
-extract archives containing GNU-specific members using some
|
|
|
-third-party @command{tar} implementation or an older version of
|
|
|
-@GNUTAR{}. Of course your best bet is to have @GNUTAR{} installed,
|
|
|
-but if it is for some reason impossible, this section will explain
|
|
|
-how to cope without it.
|
|
|
+This option is meaningless with @option{--list} (@option{-t}).
|
|
|
|
|
|
-When we speak about @dfn{GNU-specific} members we mean two classes of
|
|
|
-them: members split between the volumes of a multi-volume archive and
|
|
|
-sparse members. You will be able to always recover such members if
|
|
|
-the archive is in PAX format. In addition split members can be
|
|
|
-recovered from archives in old GNU format. The following subsections
|
|
|
-describe the required procedures in detail.
|
|
|
+@opindex same-owner
|
|
|
+@item --same-owner
|
|
|
+Create extracted files with the same ownership they have in the
|
|
|
+archive.
|
|
|
|
|
|
-@menu
|
|
|
-* Split Recovery:: Members Split Between Volumes
|
|
|
-* Sparse Recovery:: Sparse Members
|
|
|
-@end menu
|
|
|
+This is the default behavior for the superuser,
|
|
|
+so this option is meaningful only for non-root users, when @command{tar}
|
|
|
+is executed on those systems able to give files away. This is
|
|
|
+considered as a security flaw by many people, at least because it
|
|
|
+makes it quite difficult to correctly account users for the disk space
|
|
|
+they occupy. Also, the @code{suid} or @code{sgid} attributes of
|
|
|
+files are easily and silently lost when files are given away.
|
|
|
|
|
|
-@node Split Recovery
|
|
|
-@subsubsection Extracting Members Split Between Volumes
|
|
|
+When writing an archive, @command{tar} writes the user id and user name
|
|
|
+separately. If it can't find a user name (because the user id is not
|
|
|
+in @file{/etc/passwd}), then it does not write one. When restoring,
|
|
|
+it tries to look the name (if one was written) up in
|
|
|
+@file{/etc/passwd}. If it fails, then it uses the user id stored in
|
|
|
+the archive instead.
|
|
|
|
|
|
-If a member is split between several volumes of an old GNU format archive
|
|
|
-most third party @command{tar} implementation will fail to extract
|
|
|
-it. To extract it, use @command{tarcat} program (@pxref{Tarcat}).
|
|
|
-This program is available from
|
|
|
-@uref{http://www.gnu.org/@/software/@/tar/@/utils/@/tarcat, @GNUTAR{}
|
|
|
-home page}. It concatenates several archive volumes into a single
|
|
|
-valid archive. For example, if you have three volumes named from
|
|
|
-@file{vol-1.tar} to @file{vol-2.tar}, you can do the following to
|
|
|
-extract them using a third-party @command{tar}:
|
|
|
+@opindex no-same-owner
|
|
|
+@item --no-same-owner
|
|
|
+@itemx -o
|
|
|
+Do not attempt to restore ownership when extracting. This is the
|
|
|
+default behavior for ordinary users, so this option has an effect
|
|
|
+only for the superuser.
|
|
|
|
|
|
-@smallexample
|
|
|
-$ @kbd{tarcat vol-1.tar vol-2.tar vol-3.tar | tar xf -}
|
|
|
-@end smallexample
|
|
|
+@opindex numeric-owner
|
|
|
+@item --numeric-owner
|
|
|
+The @option{--numeric-owner} option allows (ANSI) archives to be written
|
|
|
+without user/group name information or such information to be ignored
|
|
|
+when extracting. It effectively disables the generation and/or use
|
|
|
+of user/group name information. This option forces extraction using
|
|
|
+the numeric ids from the archive, ignoring the names.
|
|
|
|
|
|
-You could use this approach for many (although not all) PAX
|
|
|
-format archives as well. However, extracting split members from a PAX
|
|
|
-archive is a much easier task, because PAX volumes are constructed in
|
|
|
-such a way that each part of a split member is extracted as a
|
|
|
-different file by @command{tar} implementations that are not aware of
|
|
|
-GNU extensions. More specifically, the very first part retains its
|
|
|
-original name, and all subsequent parts are named using the pattern:
|
|
|
+This is useful in certain circumstances, when restoring a backup from
|
|
|
+an emergency floppy with different passwd/group files for example.
|
|
|
+It is otherwise impossible to extract files with the right ownerships
|
|
|
+if the password file in use during the extraction does not match the
|
|
|
+one belonging to the file system(s) being extracted. This occurs,
|
|
|
+for example, if you are restoring your files after a major crash and
|
|
|
+had booted from an emergency floppy with no password file or put your
|
|
|
+disk into another machine to do the restore.
|
|
|
|
|
|
-@smallexample
|
|
|
-%d/GNUFileParts.%p/%f.%n
|
|
|
-@end smallexample
|
|
|
+The numeric ids are @emph{always} saved into @command{tar} archives.
|
|
|
+The identifying names are added at create time when provided by the
|
|
|
+system, unless @option{--old-archive} (@option{-o}) is used. Numeric ids could be
|
|
|
+used when moving archives between a collection of machines using
|
|
|
+a centralized management for attribution of numeric ids to users
|
|
|
+and groups. This is often done using the NIS capabilities.
|
|
|
|
|
|
-@noindent
|
|
|
-where symbols preceeded by @samp{%} are @dfn{macro characters} that
|
|
|
-have the following meaning:
|
|
|
+When making a @command{tar} file for distribution to other sites, it
|
|
|
+is sometimes cleaner to use a single owner for all files in the
|
|
|
+distribution, and nicer to specify the write permission bits of the
|
|
|
+files as stored in the archive independently of their actual value on
|
|
|
+the file system. The way to prepare a clean distribution is usually
|
|
|
+to have some Makefile rule creating a directory, copying all needed
|
|
|
+files in that directory, then setting ownership and permissions as
|
|
|
+wanted (there are a lot of possible schemes), and only then making a
|
|
|
+@command{tar} archive out of this directory, before cleaning
|
|
|
+everything out. Of course, we could add a lot of options to
|
|
|
+@GNUTAR{} for fine tuning permissions and ownership.
|
|
|
+This is not a good way, I think. @GNUTAR{} is
|
|
|
+already crowded with options and moreover, the approach just explained
|
|
|
+gives you a great deal of control already.
|
|
|
|
|
|
-@multitable @columnfractions .25 .55
|
|
|
-@headitem Meta-character @tab Replaced By
|
|
|
-@item %d @tab The directory name of the file, equivalent to the
|
|
|
-result of the @command{dirname} utility on its full name.
|
|
|
-@item %f @tab The file name of the file, equivalent to the result
|
|
|
-of the @command{basename} utility on its full name.
|
|
|
-@item %p @tab The process ID of the @command{tar} process that
|
|
|
-created the archive.
|
|
|
-@item %n @tab Ordinal number of this particular part.
|
|
|
-@end multitable
|
|
|
+@xopindex{same-permissions, short description}
|
|
|
+@xopindex{preserve-permissions, short description}
|
|
|
+@item -p
|
|
|
+@itemx --same-permissions
|
|
|
+@itemx --preserve-permissions
|
|
|
+Extract all protection information.
|
|
|
|
|
|
-For example, if, a file @file{var/longfile} was split during archive
|
|
|
-creation between three volumes, and the creator @command{tar} process
|
|
|
-had process ID @samp{27962}, then the member names will be:
|
|
|
+This option causes @command{tar} to set the modes (access permissions) of
|
|
|
+extracted files exactly as recorded in the archive. If this option
|
|
|
+is not used, the current @code{umask} setting limits the permissions
|
|
|
+on extracted files. This option is by default enabled when
|
|
|
+@command{tar} is executed by a superuser.
|
|
|
|
|
|
-@smallexample
|
|
|
-var/longfile
|
|
|
-var/GNUFileParts.27962/longfile.1
|
|
|
-var/GNUFileParts.27962/longfile.2
|
|
|
-@end smallexample
|
|
|
|
|
|
-When you extract your archive using a third-party @command{tar}, these
|
|
|
-files will be created on your disk, and the only thing you will need
|
|
|
-to do to restore your file in its original form is concatenate them in
|
|
|
-the proper order, for example:
|
|
|
+This option is meaningless with @option{--list} (@option{-t}).
|
|
|
|
|
|
-@smallexample
|
|
|
-@group
|
|
|
-$ @kbd{cd var}
|
|
|
-$ @kbd{cat GNUFileParts.27962/longfile.1 \
|
|
|
- GNUFileParts.27962/longfile.2 >> longfile}
|
|
|
-$ rm -f GNUFileParts.27962
|
|
|
-@end group
|
|
|
-@end smallexample
|
|
|
+@opindex preserve
|
|
|
+@item --preserve
|
|
|
+Same as both @option{--same-permissions} and @option{--same-order}.
|
|
|
|
|
|
-Notice, that if the @command{tar} implementation you use supports PAX
|
|
|
-format archives, it will probably emit warnings about unknown keywords
|
|
|
-during extraction. They will lool like this:
|
|
|
+The @option{--preserve} option has no equivalent short option name.
|
|
|
+It is equivalent to @option{--same-permissions} plus @option{--same-order}.
|
|
|
|
|
|
-@smallexample
|
|
|
-@group
|
|
|
-Tar file too small
|
|
|
-Unknown extended header keyword 'GNU.volume.filename' ignored.
|
|
|
-Unknown extended header keyword 'GNU.volume.size' ignored.
|
|
|
-Unknown extended header keyword 'GNU.volume.offset' ignored.
|
|
|
-@end group
|
|
|
-@end smallexample
|
|
|
+@FIXME{I do not see the purpose of such an option. (Neither I. FP.)
|
|
|
+Neither do I. --Sergey}
|
|
|
|
|
|
-@noindent
|
|
|
-You can safely ignore these warnings.
|
|
|
+@end table
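+
+For instance, to restore a backup on a rescue system whose password
+file differs from the original one, the options above might be
+combined as in the following sketch (the archive name
+@file{backup.tar} is only an assumed example, and the command is meant
+to be run by the superuser):
+
+@smallexample
+$ @kbd{tar --extract --numeric-owner --same-permissions --file=backup.tar}
+@end smallexample
+
+@noindent
+Here @option{--numeric-owner} makes @command{tar} use the numeric ids
+recorded in the archive, and @option{--same-permissions} keeps the
+recorded modes regardless of the current @code{umask}.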
|
|
|
|
|
|
-If your @command{tar} implementation is not PAX-aware, you will get
|
|
|
-more warnigns and more files generated on your disk, e.g.:
|
|
|
+@node Portability
|
|
|
+@section Making @command{tar} Archives More Portable
|
|
|
|
|
|
-@smallexample
|
|
|
-@group
|
|
|
-$ @kbd{tar xf vol-1.tar}
|
|
|
-var/PaxHeaders.27962/longfile: Unknown file type 'x', extracted as
|
|
|
-normal file
|
|
|
-Unexpected EOF in archive
|
|
|
-$ @kbd{tar xf vol-2.tar}
|
|
|
-tmp/GlobalHead.27962.1: Unknown file type 'g', extracted as normal file
|
|
|
-GNUFileParts.27962/PaxHeaders.27962/sparsefile.1: Unknown file type
|
|
|
-'x', extracted as normal file
|
|
|
-@end group
|
|
|
-@end smallexample
|
|
|
+Creating a @command{tar} archive on a particular system that is meant to be
|
|
|
+useful later on many other machines and with other versions of @command{tar}
|
|
|
+is more challenging than you might think. @command{tar} archive formats
|
|
|
+have been evolving since the first versions of Unix. Many such formats
|
|
|
+are around, and are not always compatible with each other. This section
|
|
|
+discusses a few problems, and gives some advice about making @command{tar}
|
|
|
+archives more portable.
|
|
|
|
|
|
-Ignore these warnings. The @file{PaxHeaders.*} directories created
|
|
|
-will contain files with @dfn{extended header keywords} describing the
|
|
|
-extracted files. You can delete them, unless they describe sparse
|
|
|
-members. Read further to learn more about them.
|
|
|
+One golden rule is simplicity. For example, limit your @command{tar}
|
|
|
+archives to contain only regular files and directories, avoiding
|
|
|
+other kinds of special files. Do not attempt to save sparse files or
|
|
|
+contiguous files as such. Let's discuss a few more problems, in turn.
|
|
|
|
|
|
-@node Sparse Recovery
|
|
|
-@subsubsection Extracting Sparse Members
|
|
|
+@FIXME{Discuss GNU extensions (incremental backups, multi-volume
|
|
|
+archives and archive labels) in GNU and PAX formats.}
|
|
|
|
|
|
-Any @command{tar} implementation will be able to extract sparse members from a
|
|
|
-PAX archive. However, the extracted files will be @dfn{condensed},
|
|
|
-i.e. any zero blocks will be removed from them. When we restore such
|
|
|
-a condensed file to its original form, by adding zero bloks (or
|
|
|
-@dfn{holes}) back to their original locations, we call this process
|
|
|
-@dfn{expanding} a compressed sparse file.
|
|
|
+@menu
|
|
|
+* Portable Names:: Portable Names
|
|
|
+* dereference:: Symbolic Links
|
|
|
+* old:: Old V7 Archives
|
|
|
+* ustar:: Ustar Archives
|
|
|
+* gnu:: GNU and old GNU format archives.
|
|
|
+* posix:: @acronym{POSIX} archives
|
|
|
+* Checksumming:: Checksumming Problems
|
|
|
+* Large or Negative Values:: Large files, negative time stamps, etc.
|
|
|
+* Other Tars:: How to Extract GNU-Specific Data Using
|
|
|
+ Other @command{tar} Implementations
|
|
|
+@end menu
|
|
|
|
|
|
-To expand a file, you will need a simple auxiliary program called
|
|
|
-@command{xsparse}. It is available in source form from
|
|
|
-@uref{http://www.gnu.org/@/software/@/tar/@/utils/@/xsparse, @GNUTAR{}
|
|
|
-home page}.
|
|
|
+@node Portable Names
|
|
|
+@subsection Portable Names
|
|
|
|
|
|
-Let's begin with archive members in @dfn{sparse format
|
|
|
-version 1.0}@footnote{@xref{PAX 1}.}, which are the easiest to expand.
|
|
|
-The condensed file will contain both file map and file data, so no
|
|
|
-additional data will be needed to restore it. If the original file
|
|
|
-name was @file{@var{dir}/@var{name}}, then the condensed file will be
|
|
|
-named @file{@var{dir}/@/GNUSparseFile.@var{n}/@/@var{name}}, where
|
|
|
-@var{n} is a decimal number@footnote{technically speaking, @var{n} is a
|
|
|
-@dfn{process ID} of the @command{tar} process which created the
|
|
|
-archive (@pxref{PAX keywords}).}.
|
|
|
+Use portable file and member names. A name is portable if it contains
|
|
|
+only ASCII letters and digits, @samp{/}, @samp{.}, @samp{_}, and
|
|
|
+@samp{-}; it cannot be empty, start with @samp{-} or @samp{//}, or
|
|
|
+contain @samp{/-}. Avoid deep directory nesting. For portability to
|
|
|
+old Unix hosts, limit your file name components to 14 characters or
|
|
|
+less.
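+
+As a rough check of the character rule only, one could list the names
+that contain anything outside the portable set. This is a sketch, not
+a feature of @command{tar}: @file{project} is an assumed directory
+name, and the command does not test the other conditions (a leading
+@samp{-}, @samp{//}, @samp{/-}, or the component length):
+
+@smallexample
+$ @kbd{find project | LC_ALL=C grep -v '^[A-Za-z0-9/._-]*$'}
+@end smallexample
+
+@noindent
+Any name it prints contains at least one non-portable character.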
|
|
|
|
|
|
-To expand a version 1.0 file, run @command{xsparse} as follows:
|
|
|
+If you intend your @command{tar} archives to be read under
|
|
|
+MSDOS, you should not rely on case distinction for file names, and you
|
|
|
+might use the @acronym{GNU} @command{doschk} program to help you
|
|
|
+further diagnose illegal MSDOS names, which are even more limited
|
|
|
+than System V's.
|
|
|
|
|
|
-@smallexample
|
|
|
-$ @kbd{xsparse @file{cond-file}}
|
|
|
-@end smallexample
|
|
|
+@node dereference
|
|
|
+@subsection Symbolic Links
|
|
|
+@cindex File names, using symbolic links
|
|
|
+@cindex Symbolic link as file name
|
|
|
|
|
|
-@noindent
|
|
|
-where @file{cond-file} is the name of the condensed file. The utility
|
|
|
-will deduce the name for the resulting expanded file using the
|
|
|
-following algorithm:
|
|
|
+@opindex dereference
|
|
|
+Normally, when @command{tar} archives a symbolic link, it writes a
|
|
|
+block to the archive naming the target of the link. In that way, the
|
|
|
+@command{tar} archive is a faithful record of the file system contents.
|
|
|
+@option{--dereference} (@option{-h}) is used with @option{--create} (@option{-c}), and causes
|
|
|
+@command{tar} to archive the files symbolic links point to, instead of
|
|
|
+the links themselves. With this option in effect, when @command{tar}
|
|
|
+encounters a symbolic link, it will archive the linked-to file,
|
|
|
+instead of simply recording the presence of a symbolic link.
|
|
|
|
|
|
-@enumerate 1
|
|
|
-@item If @file{cond-file} does not contain any directories,
|
|
|
-@file{../cond-file} will be used;
|
|
|
+The name under which the file is stored in the file system is not
|
|
|
+recorded in the archive. To record both the symbolic link name and
|
|
|
+the file name in the system, archive the file under both names. If
|
|
|
+all links were recorded automatically by @command{tar}, an extracted file
|
|
|
+might be linked to a file name that no longer exists in the file
|
|
|
+system.
|
|
|
|
|
|
-@item If @file{cond-file} has the form
|
|
|
-@file{@var{dir}/@var{t}/@var{name}}, where both @var{t} and @var{name}
|
|
|
-are simple names, with no @samp{/} characters in them, the output file
|
|
|
-name will be @file{@var{dir}/@var{name}}.
|
|
|
+If a linked-to file is encountered again by @command{tar} while creating
|
|
|
+the same archive, an entire second copy of it will be stored. (This
|
|
|
+@emph{might} be considered a bug.)
|
|
|
|
|
|
-@item Otherwise, if @file{cond-file} has the form
|
|
|
-@file{@var{dir}/@var{name}}, the output file name will be
|
|
|
-@file{@var{name}}.
|
|
|
-@end enumerate
|
|
|
+So, for portable archives, do not archive symbolic links as such,
|
|
|
+and use @option{--dereference} (@option{-h}): many systems do not support
|
|
|
+symbolic links, and moreover, your distribution might be unusable if
|
|
|
+it contains unresolved symbolic links.
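+
+For example, assuming a directory @file{project} (a hypothetical
+name) that contains symbolic links, the following sketch archives the
+files the links point to rather than the links themselves:
+
+@smallexample
+$ @kbd{tar --create --dereference --file=project.tar project}
+@end smallexample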
|
|
|
|
|
|
-In the unlikely case when this algorithm does not suite your needs,
|
|
|
-you can explicitely specify output file name as a second argument to
|
|
|
-the command:
|
|
|
+@node old
|
|
|
+@subsection Old V7 Archives
|
|
|
+@cindex Format, old style
|
|
|
+@cindex Old style format
|
|
|
+@cindex Old style archives
|
|
|
+@cindex v7 archive format
|
|
|
|
|
|
-@smallexample
|
|
|
-$ @kbd{xsparse @file{cond-file}}
|
|
|
-@end smallexample
|
|
|
+Certain old versions of @command{tar} cannot handle additional
|
|
|
+information recorded by newer @command{tar} programs. To create an
|
|
|
+archive in V7 format (not ANSI), which can be read by these old
|
|
|
+versions, specify the @option{--format=v7} option in
|
|
|
+conjunction with the @option{--create} (@option{-c}) option (@command{tar} also
|
|
|
+accepts @option{--portability} or @option{--old-archive} for this
|
|
|
+option). When you specify it,
|
|
|
+@command{tar} leaves out information about directories, pipes, fifos,
|
|
|
+contiguous files, and device files, and specifies file ownership by
|
|
|
+group and user IDs instead of group and user names.
|
|
|
|
|
|
-It is often a good idea to run @command{xsparse} in @dfn{dry run} mode
|
|
|
-first. In this mode, the command does not actually expand the file,
|
|
|
-but verbosely lists all actions it would be taking to do so. The dry
|
|
|
-run mode is enabled by @option{-n} command line argument:
|
|
|
+When updating an archive, do not use @option{--format=v7}
|
|
|
+unless the archive was created using this option.
|
|
|
|
|
|
-@smallexample
|
|
|
-@group
|
|
|
-$ @kbd{xsparse -n /home/gray/GNUSparseFile.6058/sparsefile}
|
|
|
-Reading v.1.0 sparse map
|
|
|
-Expanding file `/home/gray/GNUSparseFile.6058/sparsefile' to
|
|
|
-`/home/gray/sparsefile'
|
|
|
-Finished dry run
|
|
|
-@end group
|
|
|
-@end smallexample
|
|
|
+In most cases, a @emph{new} format archive can be read by an @emph{old}
|
|
|
+@command{tar} program without serious trouble, so this option should
|
|
|
+seldom be needed. On the other hand, most modern @command{tar}s are
|
|
|
+able to read old format archives, so it might be safer for you to
|
|
|
+always use @option{--format=v7} for your distributions. Notice,
|
|
|
+however, that @samp{ustar} format is a better alternative, as it is
|
|
|
+free from many of @samp{v7}'s drawbacks.
|
|
|
|
|
|
-To actually expand the file, you would run:
|
|
|
+@node ustar
|
|
|
+@subsection Ustar Archive Format
|
|
|
|
|
|
-@smallexample
|
|
|
-$ @kbd{xsparse /home/gray/GNUSparseFile.6058/sparsefile}
|
|
|
-@end smallexample
|
|
|
+@cindex ustar archive format
|
|
|
+The archive format defined by the @acronym{POSIX}.1-1988 specification is called
|
|
|
+@code{ustar}. Although it is more flexible than the V7 format, it
|
|
|
+still has many restrictions (@xref{Formats,ustar}, for the detailed
|
|
|
+description of @code{ustar} format). Along with V7 format,
|
|
|
+@code{ustar} format is a good choice for archives intended to be read
|
|
|
+with other implementations of @command{tar}.
|
|
|
|
|
|
-@noindent
|
|
|
-The program behaves the same way all UNIX utilities do: it will keep
|
|
|
-quiet unless it has simething important to tell you (e.g. an error
|
|
|
-condition or something). If you wish it to produce verbose output,
|
|
|
-similar to that from the dry run mode, give it @option{-v} option:
|
|
|
+To create an archive in @code{ustar} format, use the @option{--format=ustar}
|
|
|
+option in conjunction with the @option{--create} (@option{-c}).
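+
+For example (the archive and directory names here are assumed purely
+for illustration):
+
+@smallexample
+$ @kbd{tar --create --format=ustar --file=archive.tar project}
+@end smallexample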
|
|
|
|
|
|
-@smallexample
|
|
|
-@group
|
|
|
-$ @kbd{xsparse -v /home/gray/GNUSparseFile.6058/sparsefile}
|
|
|
-Reading v.1.0 sparse map
|
|
|
-Expanding file `/home/gray/GNUSparseFile.6058/sparsefile' to
|
|
|
-`/home/gray/sparsefile'
|
|
|
-Done
|
|
|
-@end group
|
|
|
-@end smallexample
|
|
|
+@node gnu
|
|
|
+@subsection @acronym{GNU} and old @GNUTAR{} format
|
|
|
|
|
|
-Additionally, if your @command{tar} implementation has extracted the
|
|
|
-@dfn{extended headers} for this file, you can instruct @command{xstar}
|
|
|
-to use them in order to verify the integrity of the expanded file.
|
|
|
-The option @option{-x} sets the name of the extended header file to
|
|
|
-use. Continuing our example:
|
|
|
+@cindex GNU archive format
|
|
|
+@cindex Old GNU archive format
|
|
|
+@GNUTAR{} was based on an early draft of the
|
|
|
+@acronym{POSIX} 1003.1 @code{ustar} standard. @acronym{GNU} extensions to
|
|
|
+@command{tar}, such as the support for file names longer than 100
|
|
|
+characters, use portions of the @command{tar} header record which were
|
|
|
+specified in that @acronym{POSIX} draft as unused. Subsequent changes in
|
|
|
+@acronym{POSIX} have allocated the same parts of the header record for
|
|
|
+other purposes. As a result, @GNUTAR{} format is
|
|
|
+incompatible with the current @acronym{POSIX} specification, and with
|
|
|
+@command{tar} programs that follow it.
|
|
|
|
|
|
-@smallexample
|
|
|
-@group
|
|
|
-$ @kbd{xsparse -v -x /home/gray/PaxHeaders.6058/sparsefile \
|
|
|
- /home/gray/GNUSparseFile.6058/sparsefile}
|
|
|
-Reading extended header file
|
|
|
-Found variable GNU.sparse.major = 1
|
|
|
-Found variable GNU.sparse.minor = 0
|
|
|
-Found variable GNU.sparse.name = sparsefile
|
|
|
-Found variable GNU.sparse.realsize = 217481216
|
|
|
-Reading v.1.0 sparse map
|
|
|
-Expanding file `/home/gray/GNUSparseFile.6058/sparsefile' to
|
|
|
-`/home/gray/sparsefile'
|
|
|
-Done
|
|
|
-@end group
|
|
|
-@end smallexample
|
|
|
+In the majority of cases, @command{tar} will be configured to create
|
|
|
+this format by default. This will change in future releases, since
|
|
|
+we plan to make @samp{POSIX} format the default.
|
|
|
|
|
|
-An @dfn{extended header} is a special @command{tar} archive header
|
|
|
-that precedes an archive member and contains a set of
|
|
|
-@dfn{variables}, describing the member properties that cannot be
|
|
|
-stored in the standard @code{ustar} header. While optional for
|
|
|
-expanding sparse version 1.0 members, use of extended headers is
|
|
|
-mandatory when expanding sparse members in older sparse formats: v.0.0
|
|
|
-and v.0.1 (The sparse formats are described in detail in @pxref{Sparse
|
|
|
-Formats}). So, for this format, the question is: how to obtain
|
|
|
-extended headers from the archive?
|
|
|
+To force creation of a @GNUTAR{} archive, use the option
|
|
|
+@option{--format=gnu}.
|
|
|
|
|
|
-If you use a @command{tar} implementation that does not support PAX
|
|
|
-format, extended headers for each member will be extracted as a
|
|
|
-separate file. If we represent the member name as
|
|
|
-@file{@var{dir}/@var{name}}, then the extended header file will be
|
|
|
-named @file{@var{dir}/@/PaxHeaders.@var{n}/@/@var{name}}, where
|
|
|
-@var{n} is an integer number.
|
|
|
+@node posix
|
|
|
+@subsection @GNUTAR{} and @acronym{POSIX} @command{tar}
|
|
|
|
|
|
-Things become more difficult if your @command{tar} implementation
|
|
|
-does support PAX headers, because in this case you will have to
|
|
|
-manually extract the headers. We recommend the following algorithm:
|
|
|
+@cindex POSIX archive format
|
|
|
+@cindex PAX archive format
|
|
|
+Starting from version 1.14 @GNUTAR{} features full support for
|
|
|
+@acronym{POSIX.1-2001} archives.
|
|
|
|
|
|
-@enumerate 1
|
|
|
-@item
|
|
|
-Consult the documentation for your @command{tar} implementation for an
|
|
|
-option that will print @dfn{block numbers} along with the archive
|
|
|
-listing (analogous to @GNUTAR{}'s @option{-R} option). For example,
|
|
|
-@command{star} has @option{-block-number}.
|
|
|
+A @acronym{POSIX} conformant archive will be created if @command{tar}
|
|
|
+is given the @option{--format=posix} (@option{--format=pax}) option. No
|
|
|
+special option is required to read and extract from a @acronym{POSIX}
|
|
|
+archive.
|
|
|
|
|
|
-@item
|
|
|
-Obtain the verbose listing using the @samp{block number} option, and
|
|
|
-find the position of the sparse member in question and the member
|
|
|
-immediately following it. For example, running @command{star} on our
|
|
|
-archive we obtain:
|
|
|
+@menu
|
|
|
+* PAX keywords:: Controlling Extended Header Keywords.
|
|
|
+@end menu
|
|
|
+
|
|
|
+@node PAX keywords
|
|
|
+@subsubsection Controlling Extended Header Keywords
|
|
|
+
|
|
|
+@table @option
|
|
|
+@opindex pax-option
|
|
|
+@item --pax-option=@var{keyword-list}
|
|
|
+Handle keywords in @acronym{PAX} extended headers. This option is
|
|
|
+equivalent to @option{-o} option of the @command{pax} utility.
|
|
|
+@end table
|
|
|
+
|
|
|
+@var{Keyword-list} is a comma-separated
|
|
|
+list of keyword options, each keyword option taking one of
|
|
|
+the following forms:
|
|
|
+
|
|
|
+@table @code
|
|
|
+@item delete=@var{pattern}
|
|
|
+When used with one of the archive-creation commands,
|
|
|
+this option instructs @command{tar} to omit from extended header records
|
|
|
+that it produces any keywords matching the string @var{pattern}.
|
|
|
+
|
|
|
+When used in extract or list mode, this option instructs @command{tar}
|
|
|
+to ignore any keywords matching the given @var{pattern} in the extended
|
|
|
+header records. In both cases, matching is performed using the pattern
|
|
|
+matching notation described in @acronym{POSIX 1003.2}, 3.13
|
|
|
+(@pxref{wildcards}). For example:
|
|
|
|
|
|
@smallexample
|
|
|
-@group
|
|
|
-$ @kbd{star -t -v -block-number -f arc.tar}
|
|
|
-@dots{}
|
|
|
-star: Unknown extended header keyword 'GNU.sparse.size' ignored.
|
|
|
-star: Unknown extended header keyword 'GNU.sparse.numblocks' ignored.
|
|
|
-star: Unknown extended header keyword 'GNU.sparse.name' ignored.
|
|
|
-star: Unknown extended header keyword 'GNU.sparse.map' ignored.
|
|
|
-block 56: 425984 -rw-r--r-- gray/users Jun 25 14:46 2006 GNUSparseFile.28124/sparsefile
|
|
|
-block 897: 65391 -rw-r--r-- gray/users Jun 24 20:06 2006 README
|
|
|
-@dots{}
|
|
|
-@end group
|
|
|
+--pax-option delete=security.*
|
|
|
@end smallexample
|
|
|
|
|
|
-@noindent
|
|
|
-(as usual, ignore the warnings about unknown keywords.)
|
|
|
+would suppress security-related information.
|
|
|
+
|
|
|
+@item exthdr.name=@var{string}
|
|
|
+
|
|
|
+This keyword allows user control over the name that is written into the
|
|
|
+ustar header blocks for the extended headers. The name is obtained
|
|
|
+from @var{string} after making the following substitutions:
|
|
|
+
|
|
|
+@multitable @columnfractions .25 .55
|
|
|
+@headitem Meta-character @tab Replaced By
|
|
|
+@item %d @tab The directory name of the file, equivalent to the
|
|
|
+result of the @command{dirname} utility on the translated pathname.
|
|
|
+@item %f @tab The filename of the file, equivalent to the result
|
|
|
+of the @command{basename} utility on the translated pathname.
|
|
|
+@item %p @tab The process ID of the @command{tar} process.
|
|
|
+@item %% @tab A @samp{%} character.
|
|
|
+@end multitable
|
|
|
|
|
|
-@item
|
|
|
-Let the size of the sparse member be @var{size}, its block number be
|
|
|
-@var{Bs} and the block number of the next member be @var{Bn}.
|
|
|
-Compute:
|
|
|
+Any other @samp{%} characters in @var{string} produce undefined
|
|
|
+results.
|
|
|
+
|
|
|
+If no option @samp{exthdr.name=string} is specified, @command{tar}
|
|
|
+will use the following default value:
|
|
|
|
|
|
@smallexample
|
|
|
-@var{N} = @var{Bs} - @var{Bn} - @var{size}/512 - 2
|
|
|
+%d/PaxHeaders.%p/%f
|
|
|
@end smallexample
|
|
|
|
|
|
-@noindent
|
|
|
-This number gives the size of the extended header part in tar @dfn{blocks}.
|
|
|
-In our example, this formula gives: @code{897 - 56 - 425984 / 512 - 2
|
|
|
-= 7}.
|
|
|
+@item globexthdr.name=@var{string}
|
|
|
+This keyword allows user control over the name that is written into
|
|
|
+the ustar header blocks for global extended header records. The name
|
|
|
+is obtained from the contents of @var{string}, after making
|
|
|
+the following substitutions:
|
|
|
|
|
|
-@item
|
|
|
-Use @command{dd} to extract the headers:
|
|
|
+@multitable @columnfractions .25 .55
|
|
|
+@headitem Meta-character @tab Replaced By
|
|
|
+@item %n @tab An integer that represents the
|
|
|
+sequence number of the global extended header record in the archive,
|
|
|
+starting at 1.
|
|
|
+@item %p @tab The process ID of the @command{tar} process.
|
|
|
+@item %% @tab A @samp{%} character.
|
|
|
+@end multitable
|
|
|
+
|
|
|
+Any other @samp{%} characters in @var{string} produce undefined results.
|
|
|
+
|
|
|
+If no option @samp{globexthdr.name=string} is specified, @command{tar}
|
|
|
+will use the following default value:
|
|
|
|
|
|
@smallexample
|
|
|
-@kbd{dd if=@var{archive} of=@var{hname} bs=512 skip=@var{Bs} count=@var{N}}
|
|
|
+$TMPDIR/GlobalHead.%p.%n
|
|
|
@end smallexample
|
|
|
|
|
|
@noindent
|
|
|
-where @var{archive} is the archive name, @var{hname} is a name of the
|
|
|
-file to store the extended header in, @var{Bs} and @var{N} are
|
|
|
-computed in previous steps.
|
|
|
+where @samp{$TMPDIR} represents the value of the @var{TMPDIR}
|
|
|
+environment variable. If @var{TMPDIR} is not set, @command{tar}
|
|
|
+uses @samp{/tmp}.
|
|
|
|
|
|
-In our example, this command will be
|
|
|
+@item @var{keyword}=@var{value}
|
|
|
+When used with one of the archive-creation commands, these keyword/value pairs
|
|
|
+will be included at the beginning of the archive in a global extended
|
|
|
+header record.  When used with one of the archive-reading commands,
|
|
|
+@command{tar} will behave as if it has encountered these keyword/value
|
|
|
+pairs at the beginning of the archive in a global extended header
|
|
|
+record.
|
|
|
|
|
|
-@smallexample
|
|
|
-$ @kbd{dd if=arc.tar of=xhdr bs=512 skip=56 count=7}
|
|
|
-@end smallexample
|
|
|
-@end enumerate
|
|
|
+@item @var{keyword}:=@var{value}
|
|
|
+When used with one of the archive-creation commands, these keyword/value pairs
|
|
|
+will be included as records at the beginning of an extended header for
|
|
|
+each file. This is effectively equivalent to @var{keyword}=@var{value}
|
|
|
+form except that it creates no global extended header records.
|
|
|
|
|
|
-Finally, you can expand the condensed file, using the obtained header:
|
|
|
+When used with one of the archive-reading commands, @command{tar} will
|
|
|
+behave as if these keyword/value pairs were included as records at the
|
|
|
+end of each extended header; thus, they will override any global or
|
|
|
+file-specific extended header record keywords of the same names.
|
|
|
+For example, in the command:
|
|
|
|
|
|
@smallexample
|
|
|
-@group
|
|
|
-$ @kbd{xsparse -v -x xhdr GNUSparseFile.6058/sparsefile}
|
|
|
-Reading extended header file
|
|
|
-Found variable GNU.sparse.size = 217481216
|
|
|
-Found variable GNU.sparse.numblocks = 208
|
|
|
-Found variable GNU.sparse.name = sparsefile
|
|
|
-Found variable GNU.sparse.map = 0,2048,1050624,2048,@dots{}
|
|
|
-Expanding file `GNUSparseFile.28124/sparsefile' to `sparsefile'
|
|
|
-Done
|
|
|
-@end group
|
|
|
+tar --format=posix --create \
|
|
|
+ --file archive --pax-option gname:=user .
|
|
|
@end smallexample
|
|
|
|
|
|
-@node Compression
|
|
|
-@section Using Less Space through Compression
|
|
|
+the group name will be forced to a new value for all files
|
|
|
+stored in the archive.
|
|
|
+@end table
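+
+The meta-character substitutions listed above can be illustrated with
+a small shell sketch.  It is not taken from the @command{tar} sources:
+the function name @code{expand_template} and its argument order are
+hypothetical, and the sequence number used for @samp{%n} is simply
+passed in by the caller.

```shell
# Hypothetical helper: expand a header-name template.
# usage: expand_template TEMPLATE PATH PID [NUM]
expand_template() {
    template=$1
    path=$2
    pid=$3
    num=${4:-1}
    out=
    while [ -n "$template" ]; do
        case $template in
            %d*) out="$out$(dirname "$path")";  template=${template#"%d"} ;;
            %f*) out="$out$(basename "$path")"; template=${template#"%f"} ;;
            %p*) out="$out$pid";                template=${template#"%p"} ;;
            %n*) out="$out$num";                template=${template#"%n"} ;;
            %%*) out="$out%";                   template=${template#"%%"} ;;
            *)   # Copy one ordinary character.  Any other use of '%' is
                 # undefined in the manual; this sketch copies it as is.
                 rest=${template#?}
                 out="$out${template%"$rest"}"
                 template=$rest ;;
        esac
    done
    printf '%s\n' "$out"
}

expand_template '%d/PaxHeaders.%p/%f' var/longfile 27962
```

+With the default @samp{exthdr.name} template, the call above prints
+@file{var/PaxHeaders.27962/longfile}.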
|
|
|
|
|
|
-@menu
|
|
|
-* gzip:: Creating and Reading Compressed Archives
|
|
|
-* sparse:: Archiving Sparse Files
|
|
|
-@end menu
|
|
|
+@node Checksumming
|
|
|
+@subsection Checksumming Problems
|
|
|
|
|
|
-@node gzip
|
|
|
-@subsection Creating and Reading Compressed Archives
|
|
|
-@cindex Compressed archives
|
|
|
-@cindex Storing archives in compressed format
|
|
|
+SunOS and HP-UX @command{tar} fail to accept archives created using
|
|
|
+@GNUTAR{} and containing non-ASCII file names, that
|
|
|
+is, file names having characters with the eighth bit set, because they
|
|
|
+use signed checksums, while @GNUTAR{} uses unsigned
|
|
|
+checksums while creating archives, as per @acronym{POSIX} standards. On
|
|
|
+reading, @GNUTAR{} computes both checksums and
|
|
|
+accepts either.  It is somewhat worrying that a lot of people may go
|
|
|
+around making backups of their files using faulty (or at least
|
|
|
+non-standard) software, not learning about it until it's time to
|
|
|
+restore their missing files with an incompatible file extractor, or
|
|
|
+vice versa.
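+
+The difference between the two checksum conventions can be seen with
+a short shell sketch.  It is a simplification: a real @command{tar}
+checksum covers the whole 512-byte header, whereas the hypothetical
+@code{sum_bytes} helper below just adds up the bytes of a short name.

```shell
# Sum the bytes on standard input, treating each byte as
# unsigned (type u1) or signed (type d1).
sum_bytes() {
    od -An -t "$1" |
        awk '{ for (i = 1; i <= NF; i++) s += $i } END { print s + 0 }'
}

# "caf" followed by octal byte 351 (decimal 233), which has the
# eighth bit set.
printf 'caf\351' | sum_bytes u1
printf 'caf\351' | sum_bytes d1
```

+The first command prints 531 and the second prints 275, because the
+byte 233 is counted as @minus{}23 when bytes are treated as signed.
+For pure ASCII names both sums agree.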
|
|
|
|
|
|
-@GNUTAR{} is able to create and read compressed archives. It supports
|
|
|
-@command{gzip} and @command{bzip2} compression programs. For backward
|
|
|
-compatibilty, it also supports @command{compress} command, although
|
|
|
-we strongly recommend against using it, since there is a patent
|
|
|
-covering the algorithm it uses and you could be sued for patent
|
|
|
-infringement merely by running @command{compress}! Besides, it is less
|
|
|
-effective than @command{gzip} and @command{bzip2}.
|
|
|
+@GNUTAR{} computes checksums both ways, and accepts
|
|
|
+either on read, so @GNUTAR{} can read Sun tapes even with their
|
|
|
+wrong checksums. @GNUTAR{} produces the standard
|
|
|
+checksum, however, raising incompatibilities with Sun. That is to
|
|
|
+say, @GNUTAR{} has not been modified to
|
|
|
+@emph{produce} incorrect archives to be read by buggy @command{tar} implementations.
|
|
|
+I've been told that more recent Sun @command{tar} now reads standard
|
|
|
+archives, so maybe Sun did a similar patch, after all?
|
|
|
|
|
|
-Creating a compressed archive is simple: you just specify a
|
|
|
-@dfn{compression option} along with the usual archive creation
|
|
|
-commands. The compression option is @option{-z} (@option{--gzip}) to
|
|
|
-create a @command{gzip} compressed archive, @option{-j}
|
|
|
-(@option{--bzip2}) to create a @command{bzip2} compressed archive, and
|
|
|
-@option{-Z} (@option{--compress}) to use @command{compress} program.
|
|
|
-For example:
|
|
|
+The story seems to be that when Sun first imported @command{tar}
|
|
|
+sources on their system, they recompiled it without realizing that
|
|
|
+the checksums were computed differently, because of a change in
|
|
|
+the default signing of @code{char}'s in their compiler. So they
|
|
|
+started computing checksums wrongly. When they later realized their
|
|
|
+mistake, they merely decided to stay compatible with it, and with
|
|
|
+themselves afterwards. Presumably, but I do not really know, HP-UX
|
|
|
+has chosen to make its @command{tar} archives compatible with Sun's.
|
|
|
+The current standards do not favor Sun @command{tar} format. In any
|
|
|
+case, it now falls on the shoulders of SunOS and HP-UX users to get
|
|
|
+a @command{tar} able to read the good archives they receive.
|
|
|
|
|
|
-@smallexample
|
|
|
-$ @kbd{tar cfz archive.tar.gz .}
|
|
|
-@end smallexample
|
|
|
+@node Large or Negative Values
|
|
|
+@subsection Large or Negative Values
|
|
|
+@cindex large values
|
|
|
+@cindex future time stamps
|
|
|
+@cindex negative time stamps
|
|
|
+@UNREVISED{}
|
|
|
|
|
|
-Reading compressed archive is even simpler: you don't need to specify
|
|
|
-any additional options as @GNUTAR{} recognizes its format
|
|
|
-automatically. Thus, the following commands will list and extract the
|
|
|
-archive created in previous example:
|
|
|
+The above sections suggest using the @samp{oldest possible} archive
|
|
|
+format if in doubt. However, sometimes it is not possible. If you
|
|
|
+attempt to archive a file whose metadata cannot be represented using
|
|
|
+the required format, @GNUTAR{} will print an error message and ignore such a
|
|
|
+file.  You will then have to switch to a format that is able to
|
|
|
+handle such values. The format summary table (@pxref{Formats}) will
|
|
|
+help you to do so.
|
|
|
|
|
|
-@smallexample
|
|
|
-# List the compressed archive
|
|
|
-$ @kbd{tar tf archive.tar.gz}
|
|
|
-# Extract the compressed archive
|
|
|
-$ @kbd{tar xf archive.tar.gz}
|
|
|
-@end smallexample
|
|
|
+In particular, when trying to archive files larger than 8GB or with
|
|
|
+timestamps not in the range 1970-01-01 00:00:00 through 2242-03-16
|
|
|
+12:56:31 @sc{utc}, you will have to choose between @acronym{GNU} and
|
|
|
+@acronym{POSIX} archive formats. When considering which format to
|
|
|
+choose, bear in mind that the @acronym{GNU} format uses
|
|
|
+two's-complement base-256 notation to store values that do not fit
|
|
|
+into standard @acronym{ustar} range. Such archives can generally be
|
|
|
+read only by a @GNUTAR{} implementation. Moreover, they sometimes
|
|
|
+cannot be correctly restored on other hosts even by @GNUTAR{}.  For
|
|
|
+example, using two's complement representation for negative time
|
|
|
+stamps that assumes a signed 32-bit @code{time_t} generates archives
|
|
|
+that are not portable to hosts with differing @code{time_t}
|
|
|
+representations.
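+
+The limits quoted above follow from simple arithmetic, sketched here
+in shell.  The sketch assumes that the @acronym{ustar} size and time
+stamp fields each hold 11 octal digits; the names @code{USTAR_MAX}
+and @code{fits_ustar} are hypothetical.

```shell
# Largest value that fits in an 11-digit octal field.
USTAR_MAX=$((077777777777))

# Hypothetical helper: succeed if VALUE can be stored in such a field.
fits_ustar() {
    [ "$1" -ge 0 ] && [ "$1" -le "$USTAR_MAX" ]
}

echo "$USTAR_MAX"
fits_ustar 8589934591 && echo "fits in ustar"
fits_ustar 8589934592 || echo "needs GNU or POSIX format"
```

+The value printed first, 8589934591, is one byte short of 8GB when
+read as a file size, and corresponds to the upper time stamp bound
+given above when read as a number of seconds since 1970-01-01
+00:00:00 @sc{utc}.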
|
|
|
|
|
|
-The only case when you have to specify a decompression option while
|
|
|
-reading the archive is when reading from a pipe or from a tape drive
|
|
|
-that does not support random access. However, in this case @GNUTAR{}
|
|
|
-will indicate which option you should use. For example:
|
|
|
+On the other hand, @acronym{POSIX} archives, generally speaking, can
|
|
|
+be extracted by any tar implementation that understands older
|
|
|
+@acronym{ustar} format.  The only exceptions are files larger than 8GB.
|
|
|
|
|
|
-@smallexample
|
|
|
-$ @kbd{cat archive.tar.gz | tar tf -}
|
|
|
-tar: Archive is compressed. Use -z option
|
|
|
-tar: Error is not recoverable: exiting now
|
|
|
-@end smallexample
|
|
|
+@FIXME{Describe how @acronym{POSIX} archives are extracted by non
|
|
|
+POSIX-aware tars.}
|
|
|
|
|
|
-If you see such diagnostics, just add the suggested option to the
|
|
|
-invocation of @GNUTAR{}:
|
|
|
+@node Other Tars
|
|
|
+@subsection How to Extract GNU-Specific Data Using Other @command{tar} Implementations
|
|
|
|
|
|
-@smallexample
|
|
|
-$ @kbd{cat archive.tar.gz | tar tfz -}
|
|
|
-@end smallexample
|
|
|
+In previous sections you became acquainted with various quirks
|
|
|
+necessary to make your archives portable. Sometimes you may need to
|
|
|
+extract archives containing GNU-specific members using some
|
|
|
+third-party @command{tar} implementation or an older version of
|
|
|
+@GNUTAR{}. Of course your best bet is to have @GNUTAR{} installed,
|
|
|
+but if it is for some reason impossible, this section will explain
|
|
|
+how to cope without it.
|
|
|
|
|
|
-Notice also, that there are several restrictions on operations on
|
|
|
-compressed archives. First of all, compressed archives cannot be
|
|
|
-modified, i.e., you cannot update (@option{--update} (@option{-u})) them or delete
|
|
|
-(@option{--delete}) members from them. Likewise, you cannot append
|
|
|
-another @command{tar} archive to a compressed archive using
|
|
|
-@option{--append} (@option{-r})). Secondly, multi-volume archives cannot be
|
|
|
-compressed.
|
|
|
+When we speak about @dfn{GNU-specific} members we mean two classes of
|
|
|
+them: members split between the volumes of a multi-volume archive and
|
|
|
+sparse members.  You will always be able to recover such members if
|
|
|
+the archive is in PAX format.  In addition, split members can be
|
|
|
+recovered from archives in old GNU format. The following subsections
|
|
|
+describe the required procedures in detail.
|
|
|
|
|
|
-The following table summarizes compression options used by @GNUTAR{}.
|
|
|
+@menu
|
|
|
+* Split Recovery:: Members Split Between Volumes
|
|
|
+* Sparse Recovery:: Sparse Members
|
|
|
+@end menu
|
|
|
|
|
|
-@table @option
|
|
|
-@opindex gzip
|
|
|
-@opindex ungzip
|
|
|
-@item -z
|
|
|
-@itemx --gzip
|
|
|
-@itemx --ungzip
|
|
|
-Filter the archive through @command{gzip}.
|
|
|
+@node Split Recovery
|
|
|
+@subsubsection Extracting Members Split Between Volumes
|
|
|
|
|
|
-You can use @option{--gzip} and @option{--gunzip} on physical devices
|
|
|
-(tape drives, etc.) and remote files as well as on normal files; data
|
|
|
-to or from such devices or remote files is reblocked by another copy
|
|
|
-of the @command{tar} program to enforce the specified (or default) record
|
|
|
-size. The default compression parameters are used; if you need to
|
|
|
-override them, set @env{GZIP} environment variable, e.g.:
|
|
|
+If a member is split between several volumes of an old GNU format archive
|
|
|
+most third-party @command{tar} implementations will fail to extract
|
|
|
+it.  To extract it, use the @command{tarcat} program (@pxref{Tarcat}).
|
|
|
+This program is available from
|
|
|
+@uref{http://www.gnu.org/@/software/@/tar/@/utils/@/tarcat, @GNUTAR{}
|
|
|
+home page}. It concatenates several archive volumes into a single
|
|
|
+valid archive. For example, if you have three volumes named from
|
|
|
+@file{vol-1.tar} to @file{vol-3.tar}, you can do the following to
|
|
|
+extract them using a third-party @command{tar}:
|
|
|
|
|
|
@smallexample
|
|
|
-$ @kbd{GZIP=--best tar cfz archive.tar.gz subdir}
|
|
|
+$ @kbd{tarcat vol-1.tar vol-2.tar vol-3.tar | tar xf -}
|
|
|
@end smallexample
|
|
|
|
|
|
-@noindent
|
|
|
-Another way would be to avoid the @option{--gzip} (@option{--gunzip}, @option{--ungzip}, @option{-z}) option and run
|
|
|
-@command{gzip} explicitly:
|
|
|
+You could use this approach for many (although not all) PAX
|
|
|
+format archives as well. However, extracting split members from a PAX
|
|
|
+archive is a much easier task, because PAX volumes are constructed in
|
|
|
+such a way that each part of a split member is extracted as a
|
|
|
+different file by @command{tar} implementations that are not aware of
|
|
|
+GNU extensions. More specifically, the very first part retains its
|
|
|
+original name, and all subsequent parts are named using the pattern:
|
|
|
|
|
|
@smallexample
|
|
|
-$ @kbd{tar cf - subdir | gzip --best -c - > archive.tar.gz}
|
|
|
+%d/GNUFileParts.%p/%f.%n
|
|
|
@end smallexample
|
|
|
|
|
|
-@cindex corrupted archives
|
|
|
-About corrupted compressed archives: @command{gzip}'ed files have no
|
|
|
-redundancy, for maximum compression. The adaptive nature of the
|
|
|
-compression scheme means that the compression tables are implicitly
|
|
|
-spread all over the archive. If you lose a few blocks, the dynamic
|
|
|
-construction of the compression tables becomes unsynchronized, and there
|
|
|
-is little chance that you could recover later in the archive.
|
|
|
-
|
|
|
-There are pending suggestions for having a per-volume or per-file
|
|
|
-compression in @GNUTAR{}. This would allow for viewing the
|
|
|
-contents without decompression, and for resynchronizing decompression at
|
|
|
-every volume or file, in case of corrupted archives. Doing so, we might
|
|
|
-lose some compressibility. But this would have make recovering easier.
|
|
|
-So, there are pros and cons. We'll see!
|
|
|
-
|
|
|
-@opindex bzip2
|
|
|
-@item -j
|
|
|
-@itemx --bzip2
|
|
|
-Filter the archive through @code{bzip2}. Otherwise like @option{--gzip}.
|
|
|
-
|
|
|
-@opindex compress
|
|
|
-@opindex uncompress
|
|
|
-@item -Z
|
|
|
-@itemx --compress
|
|
|
-@itemx --uncompress
|
|
|
-Filter the archive through @command{compress}. Otherwise like @option{--gzip}.
|
|
|
-
|
|
|
-The @acronym{GNU} Project recommends you not use
|
|
|
-@command{compress}, because there is a patent covering the algorithm it
|
|
|
-uses. You could be sued for patent infringement merely by running
|
|
|
-@command{compress}.
|
|
|
+@noindent
|
|
|
+where symbols preceded by @samp{%} are @dfn{macro characters} that
|
|
|
+have the following meaning:
|
|
|
|
|
|
-@opindex use-compress-program
|
|
|
-@item --use-compress-program=@var{prog}
|
|
|
-Use external compression program @var{prog}. Use this option if you
|
|
|
-have a compression program that @GNUTAR{} does not support. There
|
|
|
-are two requirements to which @var{prog} should comply:
|
|
|
+@multitable @columnfractions .25 .55
|
|
|
+@headitem Meta-character @tab Replaced By
|
|
|
+@item %d @tab The directory name of the file, equivalent to the
|
|
|
+result of the @command{dirname} utility on its full name.
|
|
|
+@item %f @tab The file name of the file, equivalent to the result
|
|
|
+of the @command{basename} utility on its full name.
|
|
|
+@item %p @tab The process ID of the @command{tar} process that
|
|
|
+created the archive.
|
|
|
+@item %n @tab Ordinal number of this particular part.
|
|
|
+@end multitable
|
|
|
|
|
|
-First, when called without options, it should read data from standard
|
|
|
-input, compress it and output it on standard output.
|
|
|
+For example, if a file @file{var/longfile} was split during archive
|
|
|
+creation between three volumes, and the creator @command{tar} process
|
|
|
+had process ID @samp{27962}, then the member names will be:
|
|
|
|
|
|
-Secondly, if called with @option{-d} argument, it should do exactly
|
|
|
-the opposite, i.e., read the compressed data from the standard input
|
|
|
-and produce uncompressed data on the standard output.
|
|
|
-@end table
|
|
|
+@smallexample
|
|
|
+var/longfile
|
|
|
+var/GNUFileParts.27962/longfile.1
|
|
|
+var/GNUFileParts.27962/longfile.2
|
|
|
+@end smallexample
|
|
|
|
|
|
-@cindex gpg, using with tar
|
|
|
-@cindex gnupg, using with tar
|
|
|
-@cindex Using encrypted archives
|
|
|
-The @option{--use-compress-program} option, in particular, lets you
|
|
|
-implement your own filters, not necessarily dealing with
|
|
|
-compression/decomression. For example, suppose you wish to implement
|
|
|
-PGP encryption on top of compression, using @command{gpg} (@pxref{Top,
|
|
|
-gpg, gpg ---- encryption and signing tool, gpg, GNU Privacy Guard
|
|
|
-Manual}). The following script does that:
|
|
|
+When you extract your archive using a third-party @command{tar}, these
|
|
|
+files will be created on your disk, and the only thing you will need
|
|
|
+to do to restore your file in its original form is concatenate them in
|
|
|
+the proper order, for example:
|
|
|
|
|
|
@smallexample
|
|
|
@group
|
|
|
-#! /bin/sh
|
|
|
-case $1 in
|
|
|
--d) gpg --decrypt - | gzip -d -c;;
|
|
|
-'') gzip -c | gpg -s ;;
|
|
|
-*) echo "Unknown option $1">&2; exit 1;;
|
|
|
-esac
|
|
|
+$ @kbd{cd var}
|
|
|
+$ @kbd{cat GNUFileParts.27962/longfile.1 \
|
|
|
+ GNUFileParts.27962/longfile.2 >> longfile}
|
|
|
+$ @kbd{rm -rf GNUFileParts.27962}
|
|
|
@end group
|
|
|
@end smallexample
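+
+When a member was split into many parts, listing them by hand becomes
+error-prone, and a shell wildcard would sort @file{longfile.10} before
+@file{longfile.2}.  The following sketch appends the parts in numeric
+order instead.  The function name @code{join_parts} is hypothetical,
+and the sketch assumes the parts are numbered consecutively from 1.

```shell
# Hypothetical helper: append NAME.1, NAME.2, ... from PARTSDIR to
# FIRST in numeric order, stopping at the first missing number.
# usage: join_parts FIRST PARTSDIR
join_parts() {
    first=$1
    partsdir=$2
    name=$(basename "$first")
    n=1
    while [ -f "$partsdir/$name.$n" ]; do
        cat "$partsdir/$name.$n" >> "$first" || return 1
        n=$((n + 1))
    done
}

# Example, run from the directory holding the first part:
#   join_parts longfile GNUFileParts.27962
```

+Once the file has been reassembled and checked, the
+@file{GNUFileParts.*} directory is no longer needed.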
|
|
|
|
|
|
-Suppose you name it @file{gpgz} and save it somewhere in your
|
|
|
-@env{PATH}. Then the following command will create a commpressed
|
|
|
-archive signed with your private key:
|
|
|
+Note that if the @command{tar} implementation you use supports PAX
|
|
|
+format archives, it will probably emit warnings about unknown keywords
|
|
|
+during extraction.  They will look like this:
|
|
|
|
|
|
@smallexample
|
|
|
-$ @kbd{tar -cf foo.tar.gpgz --use-compress=gpgz .}
|
|
|
+@group
|
|
|
+Tar file too small
|
|
|
+Unknown extended header keyword 'GNU.volume.filename' ignored.
|
|
|
+Unknown extended header keyword 'GNU.volume.size' ignored.
|
|
|
+Unknown extended header keyword 'GNU.volume.offset' ignored.
|
|
|
+@end group
|
|
|
@end smallexample
|
|
|
|
|
|
@noindent
|
|
|
-Likewise, the following command will list its contents:
|
|
|
+You can safely ignore these warnings.
|
|
|
+
|
|
|
+If your @command{tar} implementation is not PAX-aware, you will get
|
|
|
+more warnings and more files generated on your disk, e.g.:
|
|
|
|
|
|
@smallexample
|
|
|
-$ @kbd{tar -tf foo.tar.gpgz --use-compress=gpgz .}
|
|
|
+@group
|
|
|
+$ @kbd{tar xf vol-1.tar}
|
|
|
+var/PaxHeaders.27962/longfile: Unknown file type 'x', extracted as
|
|
|
+normal file
|
|
|
+Unexpected EOF in archive
|
|
|
+$ @kbd{tar xf vol-2.tar}
|
|
|
+tmp/GlobalHead.27962.1: Unknown file type 'g', extracted as normal file
|
|
|
+GNUFileParts.27962/PaxHeaders.27962/sparsefile.1: Unknown file type
|
|
|
+'x', extracted as normal file
|
|
|
+@end group
|
|
|
@end smallexample
|
|
|
|
|
|
-@ignore
|
|
|
-The above is based on the following discussion:
|
|
|
-
|
|
|
- I have one question, or maybe it's a suggestion if there isn't a way
|
|
|
- to do it now. I would like to use @option{--gzip}, but I'd also like
|
|
|
- the output to be fed through a program like @acronym{GNU}
|
|
|
- @command{ecc} (actually, right now that's @samp{exactly} what I'd like
|
|
|
- to use :-)), basically adding ECC protection on top of compression.
|
|
|
- It seems as if this should be quite easy to do, but I can't work out
|
|
|
- exactly how to go about it. Of course, I can pipe the standard output
|
|
|
- of @command{tar} through @command{ecc}, but then I lose (though I
|
|
|
- haven't started using it yet, I confess) the ability to have
|
|
|
- @command{tar} use @command{rmt} for it's I/O (I think).
|
|
|
-
|
|
|
- I think the most straightforward thing would be to let me specify a
|
|
|
- general set of filters outboard of compression (preferably ordered,
|
|
|
- so the order can be automatically reversed on input operations, and
|
|
|
- with the options they require specifiable), but beggars shouldn't be
|
|
|
- choosers and anything you decide on would be fine with me.
|
|
|
-
|
|
|
- By the way, I like @command{ecc} but if (as the comments say) it can't
|
|
|
- deal with loss of block sync, I'm tempted to throw some time at adding
|
|
|
- that capability. Supposing I were to actually do such a thing and
|
|
|
- get it (apparently) working, do you accept contributed changes to
|
|
|
- utilities like that? (Leigh Clayton @file{loc@@soliton.com}, May 1995).
|
|
|
-
|
|
|
- Isn't that exactly the role of the
|
|
|
- @option{--use-compress-prog=@var{program}} option?
|
|
|
- I never tried it myself, but I suspect you may want to write a
|
|
|
- @var{prog} script or program able to filter stdin to stdout to
|
|
|
- way you want. It should recognize the @option{-d} option, for when
|
|
|
- extraction is needed rather than creation.
|
|
|
-
|
|
|
- It has been reported that if one writes compressed data (through the
|
|
|
- @option{--gzip} or @option{--compress} options) to a DLT and tries to use
|
|
|
- the DLT compression mode, the data will actually get bigger and one will
|
|
|
- end up with less space on the tape.
|
|
|
-@end ignore
|
|
|
-
|
|
|
-@node sparse
|
|
|
-@subsection Archiving Sparse Files
|
|
|
-@cindex Sparse Files
|
|
|
-@UNREVISED
|
|
|
-
|
|
|
-@table @option
|
|
|
-@opindex sparse
|
|
|
-@item -S
|
|
|
-@itemx --sparse
|
|
|
-Handle sparse files efficiently.
|
|
|
-@end table
|
|
|
-
|
|
|
-This option causes all files to be put in the archive to be tested for
|
|
|
-sparseness, and handled specially if they are. The @option{--sparse}
|
|
|
-(@option{-S}) option is useful when many @code{dbm} files, for example, are being
|
|
|
-backed up. Using this option dramatically decreases the amount of
|
|
|
-space needed to store such a file.
|
|
|
-
|
|
|
-In later versions, this option may be removed, and the testing and
|
|
|
-treatment of sparse files may be done automatically with any special
|
|
|
-@acronym{GNU} options. For now, it is an option needing to be specified on
|
|
|
-the command line with the creation or updating of an archive.
|
|
|
+Ignore these warnings. The @file{PaxHeaders.*} directories created
|
|
|
+will contain files with @dfn{extended header keywords} describing the
|
|
|
+extracted files. You can delete them, unless they describe sparse
|
|
|
+members. Read further to learn more about them.
|
|
|
|
|
|
-Files in the file system occasionally have @dfn{holes}. A @dfn{hole} in a file
|
|
|
-is a section of the file's contents which was never written. The
|
|
|
-contents of a hole read as all zeros. On many operating systems,
|
|
|
-actual disk storage is not allocated for holes, but they are counted
|
|
|
-in the length of the file. If you archive such a file, @command{tar}
|
|
|
-could create an archive longer than the original. To have @command{tar}
|
|
|
-attempt to recognize the holes in a file, use @option{--sparse} (@option{-S}). When
|
|
|
-you use this option, then, for any file using less disk space than
|
|
|
-would be expected from its length, @command{tar} searches the file for
|
|
|
-consecutive stretches of zeros. It then records in the archive for
|
|
|
-the file where the consecutive stretches of zeros are, and only
|
|
|
-archives the ``real contents'' of the file. On extraction (using
|
|
|
-@option{--sparse} is not needed on extraction) any such
|
|
|
-files have holes created wherever the continuous stretches of zeros
|
|
|
-were found. Thus, if you use @option{--sparse}, @command{tar} archives
|
|
|
-won't take more space than the original.
|
|
|
+@node Sparse Recovery
|
|
|
+@subsubsection Extracting Sparse Members
|
|
|
|
|
|
-A file is sparse if it contains blocks of zeros whose existence is
|
|
|
-recorded, but that have no space allocated on disk. When you specify
|
|
|
-the @option{--sparse} option in conjunction with the @option{--create}
|
|
|
-(@option{-c}) operation, @command{tar} tests all files for sparseness
|
|
|
-while archiving. If @command{tar} finds a file to be sparse, it uses a
|
|
|
-sparse representation of the file in the archive. @xref{create}, for
|
|
|
-more information about creating archives.
|
|
|
+Any @command{tar} implementation will be able to extract sparse members from a
|
|
|
+PAX archive. However, the extracted files will be @dfn{condensed},
|
|
|
+i.e., any zero blocks will be removed from them.  When we restore such
|
|
|
+a condensed file to its original form, by adding zero blocks (or
|
|
|
+@dfn{holes}) back to their original locations, we call this process
|
|
|
+@dfn{expanding} a condensed sparse file.
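+
+To make the idea of expanding concrete, here is a minimal shell
+sketch.  It is not the @command{xsparse} program: the function name
+@code{expand_sparse} is hypothetical, and for simplicity the map of
+data regions is given on the command line as byte offset and length
+pairs rather than read from the archive.

```shell
# Hypothetical helper, for illustration only.
# usage: expand_sparse DATA OUT REALSIZE OFFSET LENGTH [OFFSET LENGTH ...]
# DATA holds the non-zero regions back to back; each OFFSET/LENGTH
# pair says where the next region belongs in the expanded file.
expand_sparse() {
    data=$1
    out=$2
    realsize=$3
    shift 3
    : > "$out"
    src=0
    while [ "$#" -ge 2 ]; do
        # bs=1 keeps the arithmetic simple; it is slow for large files.
        dd if="$data" of="$out" bs=1 skip="$src" seek="$1" count="$2" \
           conv=notrunc 2>/dev/null || return 1
        src=$((src + $2))
        shift 2
    done
    # Set the final length, leaving a trailing hole if needed.
    dd if=/dev/null of="$out" bs=1 seek="$realsize" 2>/dev/null
}
```

+For instance, expanding four bytes of data with the map
+@samp{0 2 6 2} and a real size of 10 places two data bytes at offset
+0, two at offset 6, and leaves zero bytes everywhere else.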
|
|
|
|
|
|
-@option{--sparse} is useful when archiving files, such as dbm files,
|
|
|
-likely to contain many nulls. This option dramatically
|
|
|
-decreases the amount of space needed to store such an archive.
|
|
|
+To expand a file, you will need a simple auxiliary program called
|
|
|
+@command{xsparse}. It is available in source form from
|
|
|
+@uref{http://www.gnu.org/@/software/@/tar/@/utils/@/xsparse, @GNUTAR{}
|
|
|
+home page}.
|
|
|
|
|
|
-@quotation
|
|
|
-@strong{Please Note:} Always use @option{--sparse} when performing file
|
|
|
-system backups, to avoid archiving the expanded forms of files stored
|
|
|
-sparsely in the system.
|
|
|
+Let's begin with archive members in @dfn{sparse format
|
|
|
+version 1.0}@footnote{@xref{PAX 1}.}, which are the easiest to expand.
|
|
|
+The condensed file will contain both file map and file data, so no
|
|
|
+additional data will be needed to restore it. If the original file
|
|
|
+name was @file{@var{dir}/@var{name}}, then the condensed file will be
|
|
|
+named @file{@var{dir}/@/GNUSparseFile.@var{n}/@/@var{name}}, where
|
|
|
+@var{n} is a decimal number@footnote{Technically speaking, @var{n} is the
|
|
|
+@dfn{process ID} of the @command{tar} process which created the
|
|
|
+archive (@pxref{PAX keywords}).}.
|
|
|
|
|
|
-Even if your system has no sparse files currently, some may be
|
|
|
-created in the future. If you use @option{--sparse} while making file
|
|
|
-system backups as a matter of course, you can be assured the archive
|
|
|
-will never take more space on the media than the files take on disk
|
|
|
-(otherwise, archiving a disk filled with sparse files might take
|
|
|
-hundreds of tapes). @xref{Incremental Dumps}.
|
|
|
-@end quotation
|
|
|
+To expand a version 1.0 file, run @command{xsparse} as follows:
|
|
|
|
|
|
-@command{tar} ignores the @option{--sparse} option when reading an archive.
|
|
|
+@smallexample
|
|
|
+$ @kbd{xsparse @file{cond-file}}
|
|
|
+@end smallexample
|
|
|
|
|
|
-@table @option
|
|
|
-@item --sparse
|
|
|
-@itemx -S
|
|
|
-Files stored sparsely in the file system are represented sparsely in
|
|
|
-the archive. Use in conjunction with write operations.
|
|
|
-@end table
|
|
|
+@noindent
|
|
|
+where @file{cond-file} is the name of the condensed file. The utility
|
|
|
+will deduce the name for the resulting expanded file using the
|
|
|
+following algorithm:
|
|
|
|
|
|
-However, users should be well aware that at archive creation time,
|
|
|
-@GNUTAR{} still has to read whole disk file to
|
|
|
-locate the @dfn{holes}, and so, even if sparse files use little space
|
|
|
-on disk and in the archive, they may sometimes require inordinate
|
|
|
-amount of time for reading and examining all-zero blocks of a file.
|
|
|
-Although it works, it's painfully slow for a large (sparse) file, even
|
|
|
-though the resulting tar archive may be small. (One user reports that
|
|
|
-dumping a @file{core} file of over 400 megabytes, but with only about
|
|
|
-3 megabytes of actual data, took about 9 minutes on a Sun Sparcstation
|
|
|
-ELC, with full CPU utilization.)
|
|
|
-
|
|
|
-This reading is required in all cases and is not related to the fact
|
|
|
-the @option{--sparse} option is used or not, so by merely @emph{not}
|
|
|
-using the option, you are not saving time@footnote{Well! We should say
|
|
|
-the whole truth, here. When @option{--sparse} is selected while creating
|
|
|
-an archive, the current @command{tar} algorithm requires sparse files to be
|
|
|
-read twice, not once. We hope to develop a new archive format for saving
|
|
|
-sparse files in which one pass will be sufficient.}.
|
|
|
+@enumerate 1
|
|
|
+@item If @file{cond-file} does not contain any directories,
|
|
|
+@file{../cond-file} will be used;
|
|
|
|
|
|
-Programs like @command{dump} do not have to read the entire file; by
|
|
|
-examining the file system directly, they can determine in advance
|
|
|
-exactly where the holes are and thus avoid reading through them. The
|
|
|
-only data it need read are the actual allocated data blocks.
|
|
|
-@GNUTAR{} uses a more portable and straightforward
|
|
|
-archiving approach, it would be fairly difficult that it does
|
|
|
-otherwise. Elizabeth Zwicky writes to @file{comp.unix.internals}, on
|
|
|
-1990-12-10:
|
|
|
+@item If @file{cond-file} has the form
|
|
|
+@file{@var{dir}/@var{t}/@var{name}}, where both @var{t} and @var{name}
|
|
|
+are simple names, with no @samp{/} characters in them, the output file
|
|
|
+name will be @file{@var{dir}/@var{name}}.
|
|
|
|
|
|
-@quotation
|
|
|
-What I did say is that you cannot tell the difference between a hole and an
|
|
|
-equivalent number of nulls without reading raw blocks. @code{st_blocks} at
|
|
|
-best tells you how many holes there are; it doesn't tell you @emph{where}.
|
|
|
-Just as programs may, conceivably, care what @code{st_blocks} is (care
|
|
|
-to name one that does?), they may also care where the holes are (I have
|
|
|
-no examples of this one either, but it's equally imaginable).
|
|
|
+@item Otherwise, if @file{cond-file} has the form
|
|
|
+@file{@var{dir}/@var{name}}, the output file name will be
|
|
|
+@file{@var{name}}.
|
|
|
+@end enumerate
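+
+The three rules above can be restated as a short shell function.
+This is only a sketch of the rules as described here, not the code
+used by @command{xsparse}; the name @code{guess_outname} is
+hypothetical.

```shell
# Hypothetical restatement of the naming rules; not xsparse itself.
guess_outname() {
    cond=$1
    case $cond in
        */*/*) # dir/t/name -> dir/name
               printf '%s/%s\n' "${cond%/*/*}" "${cond##*/}" ;;
        */*)   # dir/name -> name
               printf '%s\n' "${cond##*/}" ;;
        *)     # name -> ../name
               printf '../%s\n' "$cond" ;;
    esac
}

guess_outname /home/gray/GNUSparseFile.6058/sparsefile
```

+The call above prints @file{/home/gray/sparsefile}.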
|
|
|
|
|
|
-I conclude from this that good archivers are not portable. One can
|
|
|
-arguably conclude that if you want a portable program, you can in good
|
|
|
-conscience restore files with as many holes as possible, since you can't
|
|
|
-get it right.
|
|
|
-@end quotation
|
|
|
+In the unlikely case when this algorithm does not suit your needs,
|
|
|
+you can explicitly specify the output file name as a second argument to
|
|
|
+the command:
|
|
|
|
|
|
-@node Attributes
|
|
|
-@section Handling File Attributes
|
|
|
-@UNREVISED
|
|
|
+@smallexample
|
|
|
+$ @kbd{xsparse @file{cond-file} @file{out-file}}
|
|
|
+@end smallexample
|
|
|
|
|
|
-When @command{tar} reads files, it updates their access times. To
|
|
|
-avoid this, use the @option{--atime-preserve[=METHOD]} option, which can either
|
|
|
-reset the access time retroactively or avoid changing it in the first
|
|
|
-place.
|
|
|
+It is often a good idea to run @command{xsparse} in @dfn{dry run} mode
|
|
|
+first. In this mode, the command does not actually expand the file,
|
|
|
+but verbosely lists all actions it would be taking to do so. The dry
|
|
|
+run mode is enabled by the @option{-n} command line option:
|
|
|
|
|
|
-Handling of file attributes
|
|
|
+@smallexample
|
|
|
+@group
|
|
|
+$ @kbd{xsparse -n /home/gray/GNUSparseFile.6058/sparsefile}
|
|
|
+Reading v.1.0 sparse map
|
|
|
+Expanding file `/home/gray/GNUSparseFile.6058/sparsefile' to
|
|
|
+`/home/gray/sparsefile'
|
|
|
+Finished dry run
|
|
|
+@end group
|
|
|
+@end smallexample
|
|
|
|
|
|
-@table @option
|
|
|
-@opindex atime-preserve
|
|
|
-@item --atime-preserve
|
|
|
-@itemx --atime-preserve=replace
|
|
|
-@itemx --atime-preserve=system
|
|
|
-Preserve the access times of files that are read. This works only for
|
|
|
-files that you own, unless you have superuser privileges.
|
|
|
+To actually expand the file, you would run:
|
|
|
|
|
|
-@option{--atime-preserve=replace} works on most systems, but it also
|
|
|
-restores the data modification time and updates the status change
|
|
|
-time. Hence it doesn't interact with incremental dumps nicely
|
|
|
-(@pxref{Backups}), and it can set access or data modification times
|
|
|
-incorrectly if other programs access the file while @command{tar} is
|
|
|
-running.
|
|
|
+@smallexample
|
|
|
+$ @kbd{xsparse /home/gray/GNUSparseFile.6058/sparsefile}
|
|
|
+@end smallexample
|
|
|
|
|
|
-@option{--atime-preserve=system} avoids changing the access time in
|
|
|
-the first place, if the operating system supports this.
|
|
|
-Unfortunately, this may or may not work on any given operating system
|
|
|
-or file system. If @command{tar} knows for sure it won't work, it
|
|
|
-complains right away.
|
|
|
+@noindent
|
|
|
+The program behaves the same way all UNIX utilities do: it will keep
|
|
|
+quiet unless it has something important to tell you (e.g. an error
|
|
|
+condition). If you wish it to produce verbose output,
|
|
|
+similar to that from the dry run mode, give it the @option{-v} option:
|
|
|
|
|
|
-Currently @option{--atime-preserve} with no operand defaults to
|
|
|
-@option{--atime-preserve=replace}, but this is intended to change to
|
|
|
-@option{--atime-preserve=system} when the latter is better-supported.
|
|
|
+@smallexample
|
|
|
+@group
|
|
|
+$ @kbd{xsparse -v /home/gray/GNUSparseFile.6058/sparsefile}
|
|
|
+Reading v.1.0 sparse map
|
|
|
+Expanding file `/home/gray/GNUSparseFile.6058/sparsefile' to
|
|
|
+`/home/gray/sparsefile'
|
|
|
+Done
|
|
|
+@end group
|
|
|
+@end smallexample
|
|
|
|
|
|
-@opindex touch
|
|
|
-@item -m
|
|
|
-@itemx --touch
|
|
|
-Do not extract data modification time.
|
|
|
+Additionally, if your @command{tar} implementation has extracted the
|
|
|
+@dfn{extended headers} for this file, you can instruct @command{xsparse}
|
|
|
+to use them in order to verify the integrity of the expanded file.
|
|
|
+The option @option{-x} sets the name of the extended header file to
|
|
|
+use. Continuing our example:
|
|
|
|
|
|
-When this option is used, @command{tar} leaves the data modification times
|
|
|
-of the files it extracts as the times when the files were extracted,
|
|
|
-instead of setting it to the times recorded in the archive.
|
|
|
+@smallexample
|
|
|
+@group
|
|
|
+$ @kbd{xsparse -v -x /home/gray/PaxHeaders.6058/sparsefile \
|
|
|
+ /home/gray/GNUSparseFile.6058/sparsefile}
|
|
|
+Reading extended header file
|
|
|
+Found variable GNU.sparse.major = 1
|
|
|
+Found variable GNU.sparse.minor = 0
|
|
|
+Found variable GNU.sparse.name = sparsefile
|
|
|
+Found variable GNU.sparse.realsize = 217481216
|
|
|
+Reading v.1.0 sparse map
|
|
|
+Expanding file `/home/gray/GNUSparseFile.6058/sparsefile' to
|
|
|
+`/home/gray/sparsefile'
|
|
|
+Done
|
|
|
+@end group
|
|
|
+@end smallexample
|
|
|
|
|
|
-This option is meaningless with @option{--list} (@option{-t}).
|
|
|
+An @dfn{extended header} is a special @command{tar} archive header
|
|
|
+that precedes an archive member and contains a set of
|
|
|
+@dfn{variables}, describing the member properties that cannot be
|
|
|
+stored in the standard @code{ustar} header. While optional for
|
|
|
+expanding sparse version 1.0 members, use of extended headers is
|
|
|
+mandatory when expanding sparse members in older sparse formats: v.0.0
|
|
|
+and v.0.1 (the sparse formats are described in detail in @ref{Sparse
|
|
|
+Formats}). So, for these formats, the question is: how to obtain
|
|
|
+extended headers from the archive?
|
|
|
|
|
|
-@opindex same-owner
|
|
|
-@item --same-owner
|
|
|
-Create extracted files with the same ownership they have in the
|
|
|
-archive.
|
|
|
+If you use a @command{tar} implementation that does not support PAX
|
|
|
+format, extended headers for each member will be extracted as a
|
|
|
+separate file. If we represent the member name as
|
|
|
+@file{@var{dir}/@var{name}}, then the extended header file will be
|
|
|
+named @file{@var{dir}/@/PaxHeaders.@var{n}/@/@var{name}}, where
|
|
|
+@var{n} is an integer number.
|
|
|
|
|
|
-This is the default behavior for the superuser,
|
|
|
-so this option is meaningful only for non-root users, when @command{tar}
|
|
|
-is executed on those systems able to give files away. This is
|
|
|
-considered as a security flaw by many people, at least because it
|
|
|
-makes quite difficult to correctly account users for the disk space
|
|
|
-they occupy. Also, the @code{suid} or @code{sgid} attributes of
|
|
|
-files are easily and silently lost when files are given away.
|
|
|
+Things become more difficult if your @command{tar} implementation
|
|
|
+does support PAX headers, because in this case you will have to
|
|
|
+manually extract the headers. We recommend the following algorithm:
|
|
|
|
|
|
-When writing an archive, @command{tar} writes the user id and user name
|
|
|
-separately. If it can't find a user name (because the user id is not
|
|
|
-in @file{/etc/passwd}), then it does not write one. When restoring,
|
|
|
-it tries to look the name (if one was written) up in
|
|
|
-@file{/etc/passwd}. If it fails, then it uses the user id stored in
|
|
|
-the archive instead.
|
|
|
+@enumerate 1
|
|
|
+@item
|
|
|
+Consult the documentation for your @command{tar} implementation for an
|
|
|
+option that will print @dfn{block numbers} along with the archive
|
|
|
+listing (analogous to @GNUTAR{}'s @option{-R} option). For example,
|
|
|
+@command{star} has @option{-block-number}.
|
|
|
|
|
|
-@opindex no-same-owner
|
|
|
-@item --no-same-owner
|
|
|
-@itemx -o
|
|
|
-Do not attempt to restore ownership when extracting. This is the
|
|
|
-default behavior for ordinary users, so this option has an effect
|
|
|
-only for the superuser.
|
|
|
+@item
|
|
|
+Obtain the verbose listing using the @samp{block number} option, and
|
|
|
+find the position of the sparse member in question and the member
|
|
|
+immediately following it. For example, running @command{star} on our
|
|
|
+archive we obtain:
|
|
|
|
|
|
-@opindex numeric-owner
|
|
|
-@item --numeric-owner
|
|
|
-The @option{--numeric-owner} option allows (ANSI) archives to be written
|
|
|
-without user/group name information or such information to be ignored
|
|
|
-when extracting. It effectively disables the generation and/or use
|
|
|
-of user/group name information. This option forces extraction using
|
|
|
-the numeric ids from the archive, ignoring the names.
|
|
|
+@smallexample
|
|
|
+@group
|
|
|
+$ @kbd{star -t -v -block-number -f arc.tar}
|
|
|
+@dots{}
|
|
|
+star: Unknown extended header keyword 'GNU.sparse.size' ignored.
|
|
|
+star: Unknown extended header keyword 'GNU.sparse.numblocks' ignored.
|
|
|
+star: Unknown extended header keyword 'GNU.sparse.name' ignored.
|
|
|
+star: Unknown extended header keyword 'GNU.sparse.map' ignored.
|
|
|
+block 56: 425984 -rw-r--r-- gray/users Jun 25 14:46 2006 GNUSparseFile.28124/sparsefile
|
|
|
+block 897: 65391 -rw-r--r-- gray/users Jun 24 20:06 2006 README
|
|
|
+@dots{}
|
|
|
+@end group
|
|
|
+@end smallexample
|
|
|
|
|
|
-This is useful in certain circumstances, when restoring a backup from
|
|
|
-an emergency floppy with different passwd/group files for example.
|
|
|
-It is otherwise impossible to extract files with the right ownerships
|
|
|
-if the password file in use during the extraction does not match the
|
|
|
-one belonging to the file system(s) being extracted. This occurs,
|
|
|
-for example, if you are restoring your files after a major crash and
|
|
|
-had booted from an emergency floppy with no password file or put your
|
|
|
-disk into another machine to do the restore.
|
|
|
+@noindent
|
|
|
+(As usual, ignore the warnings about unknown keywords.)
|
|
|
|
|
|
-The numeric ids are @emph{always} saved into @command{tar} archives.
|
|
|
-The identifying names are added at create time when provided by the
|
|
|
-system, unless @option{--old-archive} (@option{-o}) is used. Numeric ids could be
|
|
|
-used when moving archives between a collection of machines using
|
|
|
-a centralized management for attribution of numeric ids to users
|
|
|
-and groups. This is often made through using the NIS capabilities.
|
|
|
+@item
|
|
|
+Let @var{size} be the size of the sparse member, @var{Bs} be its block number
|
|
|
+and @var{Bn} be the block number of the next member.
|
|
|
+Compute:
|
|
|
|
|
|
-When making a @command{tar} file for distribution to other sites, it
|
|
|
-is sometimes cleaner to use a single owner for all files in the
|
|
|
-distribution, and nicer to specify the write permission bits of the
|
|
|
-files as stored in the archive independently of their actual value on
|
|
|
-the file system. The way to prepare a clean distribution is usually
|
|
|
-to have some Makefile rule creating a directory, copying all needed
|
|
|
-files in that directory, then setting ownership and permissions as
|
|
|
-wanted (there are a lot of possible schemes), and only then making a
|
|
|
-@command{tar} archive out of this directory, before cleaning
|
|
|
-everything out. Of course, we could add a lot of options to
|
|
|
-@GNUTAR{} for fine tuning permissions and ownership.
|
|
|
-This is not the good way, I think. @GNUTAR{} is
|
|
|
-already crowded with options and moreover, the approach just explained
|
|
|
-gives you a great deal of control already.
|
|
|
+@smallexample
|
|
|
+@var{N} = @var{Bn} - @var{Bs} - @var{size}/512 - 2
|
|
|
+@end smallexample
|
|
|
|
|
|
-@xopindex{same-permissions, short description}
|
|
|
-@xopindex{preserve-permissions, short description}
|
|
|
-@item -p
|
|
|
-@itemx --same-permissions
|
|
|
-@itemx --preserve-permissions
|
|
|
-Extract all protection information.
|
|
|
+@noindent
|
|
|
+This number gives the size of the extended header part in tar @dfn{blocks}.
|
|
|
+In our example, this formula gives: @code{897 - 56 - 425984 / 512 - 2
|
|
|
+= 7}.
|
|
|
|
|
|
-This option causes @command{tar} to set the modes (access permissions) of
|
|
|
-extracted files exactly as recorded in the archive. If this option
|
|
|
-is not used, the current @code{umask} setting limits the permissions
|
|
|
-on extracted files. This option is by default enabled when
|
|
|
-@command{tar} is executed by a superuser.
|
|
|
+@item
|
|
|
+Use @command{dd} to extract the headers:
|
|
|
|
|
|
+@smallexample
|
|
|
+@kbd{dd if=@var{archive} of=@var{hname} bs=512 skip=@var{Bs} count=@var{N}}
|
|
|
+@end smallexample
|
|
|
|
|
|
-This option is meaningless with @option{--list} (@option{-t}).
|
|
|
+@noindent
|
|
|
+where @var{archive} is the archive name, @var{hname} is the name of the
|
|
|
+file to store the extended header in, and @var{Bs} and @var{N} are
|
|
|
+the values obtained in the previous steps.
|
|
|
|
|
|
-@opindex preserve
|
|
|
-@item --preserve
|
|
|
-Same as both @option{--same-permissions} and @option{--same-order}.
|
|
|
+In our example, this command will be:
|
|
|
|
|
|
-The @option{--preserve} option has no equivalent short option name.
|
|
|
-It is equivalent to @option{--same-permissions} plus @option{--same-order}.
|
|
|
+@smallexample
|
|
|
+$ @kbd{dd if=arc.tar of=xhdr bs=512 skip=56 count=7}
|
|
|
+@end smallexample
|
|
|
+@end enumerate
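The arithmetic of the worked example (897 - 56 - 425984/512 - 2) can also be sketched as a small shell helper. This is only an illustration: the function name `xhdr_blocks` is hypothetical and is not part of @command{tar} or @command{xsparse}, and the sketch assumes the member size is a multiple of 512 bytes, as it is in the example.

```shell
# Hypothetical helper (not part of tar or xsparse): prints the number
# of 512-byte blocks occupied by the extended header.
# Arguments:
#   $1  Bs    block number of the sparse member
#   $2  Bn    block number of the member that follows it
#   $3  size  size of the sparse member in bytes
#             (assumed here to be a multiple of 512)
xhdr_blocks() {
    echo $(( $2 - $1 - $3 / 512 - 2 ))
}

# Values taken from the star listing in the example.
xhdr_blocks 56 897 425984
```

With the values from the example this prints 7, which is the number passed to @command{dd} as @samp{count}.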
|
|
|
|
|
|
-@FIXME{I do not see the purpose of such an option. (Neither I. FP.)
|
|
|
-Neither do I. --Sergey}
|
|
|
+Finally, you can expand the condensed file, using the obtained header:
|
|
|
|
|
|
-@end table
|
|
|
+@smallexample
|
|
|
+@group
|
|
|
+$ @kbd{xsparse -v -x xhdr GNUSparseFile.28124/sparsefile}
|
|
|
+Reading extended header file
|
|
|
+Found variable GNU.sparse.size = 217481216
|
|
|
+Found variable GNU.sparse.numblocks = 208
|
|
|
+Found variable GNU.sparse.name = sparsefile
|
|
|
+Found variable GNU.sparse.map = 0,2048,1050624,2048,@dots{}
|
|
|
+Expanding file `GNUSparseFile.28124/sparsefile' to `sparsefile'
|
|
|
+Done
|
|
|
+@end group
|
|
|
+@end smallexample
|
|
|
|
|
|
@node cpio
|
|
|
@section Comparison of @command{tar} and @command{cpio}
|