author | kientzle <kientzle@FreeBSD.org> | 2004-02-09 23:22:54 +0000 |
---|---|---|
committer | kientzle <kientzle@FreeBSD.org> | 2004-02-09 23:22:54 +0000 |
commit | af9413b539628b0758b4af66ffbb879fba49cc6d (patch) | |
tree | 20e1d80fd0a1d288a08af1696ee258bcd08f41d0 /lib/libarchive | |
parent | 6b6e533f7e91602d4ccb3353351eb9f816c2daf2 (diff) | |
Initial import of libarchive.
What it is:
A library for reading and writing various streaming archive
formats, especially tar and cpio. Being a library, it should
be easy to incorporate into pkg_* tools, sysinstall, and any
other place that needs to read or write such archives.
Features:
* Full automatic detection of both compression and archive format.
* Extensible internal architecture to make it easy to add new formats.
* Support for "pax interchange format," a new POSIX-standard tar format
that eliminates essentially all of the restrictions of historic formats.
* BSD license
Thanks to: jkh for pushing me to start this work, gordon for
encouraging me to commit it, bde for answering endless style
questions, and many others for feedback and encouragement.
Status: Pretty good overall, though there are still a few rough edges and
the library could always use more testing. Feedback eagerly solicited.
Diffstat (limited to 'lib/libarchive')
45 files changed, 11942 insertions, 0 deletions
diff --git a/lib/libarchive/Makefile b/lib/libarchive/Makefile new file mode 100644 index 0000000..e59b243 --- /dev/null +++ b/lib/libarchive/Makefile @@ -0,0 +1,119 @@ +# Makefile for libarchive. +# +# $FreeBSD$ +# +LIB= archive +SRCS= archive_check_magic.c \ + archive_entry.c \ + archive_read.c \ + archive_read_data_into_buffer.c \ + archive_read_data_into_fd.c \ + archive_read_extract.c \ + archive_read_open_file.c \ + archive_read_support_compression_all.c \ + archive_read_support_compression_bzip2.c \ + archive_read_support_compression_gzip.c \ + archive_read_support_compression_none.c \ + archive_read_support_format_all.c \ + archive_read_support_format_cpio.c \ + archive_read_support_format_gnutar.c \ + archive_read_support_format_tar.c \ + archive_string.c \ + archive_string_sprintf.c \ + archive_util.c \ + archive_write.c \ + archive_write_open_file.c \ + archive_write_set_compression_bzip2.c \ + archive_write_set_compression_gzip.c \ + archive_write_set_compression_none.c \ + archive_write_set_format.c \ + archive_write_set_format_by_name.c \ + archive_write_set_format_cpio.c \ + archive_write_set_format_pax.c \ + archive_write_set_format_shar.c \ + archive_write_set_format_ustar.c + +MAN = archive_entry.3 \ + archive_read.3 \ + archive_util.3 \ + archive_write.3 \ + libarchive.3 \ + tar.5 + +MLINKS += archive_entry.3 archive_entry_clear.3 +MLINKS += archive_entry.3 archive_entry_clone.3 +MLINKS += archive_entry.3 archive_entry_copy_stat.3 +MLINKS += archive_entry.3 archive_entry_dup.3 +MLINKS += archive_entry.3 archive_entry_free.3 +MLINKS += archive_entry.3 archive_entry_gname.3 +MLINKS += archive_entry.3 archive_entry_hardlink.3 +MLINKS += archive_entry.3 archive_entry_new.3 +MLINKS += archive_entry.3 archive_entry_pathname.3 +MLINKS += archive_entry.3 archive_entry_set_devmajor.3 +MLINKS += archive_entry.3 archive_entry_set_devminor.3 +MLINKS += archive_entry.3 archive_entry_set_gid.3 +MLINKS += archive_entry.3 archive_entry_set_gname.3 +MLINKS += 
archive_entry.3 archive_entry_set_hardlink.3 +MLINKS += archive_entry.3 archive_entry_set_mode.3 +MLINKS += archive_entry.3 archive_entry_set_pathname.3 +MLINKS += archive_entry.3 archive_entry_set_symlink.3 +MLINKS += archive_entry.3 archive_entry_set_tartype.3 +MLINKS += archive_entry.3 archive_entry_set_uid.3 +MLINKS += archive_entry.3 archive_entry_set_uname.3 +MLINKS += archive_entry.3 archive_entry_size.3 +MLINKS += archive_entry.3 archive_entry_stat.3 +MLINKS += archive_entry.3 archive_entry_symlink.3 +MLINKS += archive_entry.3 archive_entry_tartype.3 +MLINKS += archive_entry.3 archive_entry_uname.3 +MLINKS += archive_read.3 archive_read_data.3 +MLINKS += archive_read.3 archive_read_data_into_buffer.3 +MLINKS += archive_read.3 archive_read_data_into_file.3 +MLINKS += archive_read.3 archive_read_data_skip.3 +MLINKS += archive_read.3 archive_read_extract.3 +MLINKS += archive_read.3 archive_read_finish.3 +MLINKS += archive_read.3 archive_read_new.3 +MLINKS += archive_read.3 archive_read_next_header.3 +MLINKS += archive_read.3 archive_read_open.3 +MLINKS += archive_read.3 archive_read_open_file.3 +MLINKS += archive_read.3 archive_read_open_tar.3 +MLINKS += archive_read.3 archive_read_set_bytes_per_block.3 +MLINKS += archive_read.3 archive_read_support_compression_all.3 +MLINKS += archive_read.3 archive_read_support_compression_bzip2.3 +MLINKS += archive_read.3 archive_read_support_compression_gzip.3 +MLINKS += archive_read.3 archive_read_support_compression_none.3 +MLINKS += archive_read.3 archive_read_support_format_all.3 +MLINKS += archive_read.3 archive_read_support_format_cpio.3 +MLINKS += archive_read.3 archive_read_support_format_gnutar.3 +MLINKS += archive_read.3 archive_read_support_format_tar.3 +MLINKS += archive_util.3 archive_compression.3 +MLINKS += archive_util.3 archive_compression_name.3 +MLINKS += archive_util.3 archive_errno.3 +MLINKS += archive_util.3 archive_error_string.3 +MLINKS += archive_util.3 archive_format.3 +MLINKS += archive_util.3 
archive_format_name.3 +MLINKS += archive_write.3 archive_write_data.3 +MLINKS += archive_write.3 archive_write_finish.3 +MLINKS += archive_write.3 archive_write_header.3 +MLINKS += archive_write.3 archive_write_new.3 +MLINKS += archive_write.3 archive_write_open.3 +MLINKS += archive_write.3 archive_write_open_file.3 +MLINKS += archive_write.3 archive_write_prepare.3 +MLINKS += archive_write.3 archive_write_set_bytes_per_block.3 +MLINKS += archive_write.3 archive_write_set_bytes_in_last_block.3 +MLINKS += archive_write.3 archive_write_set_callbacks.3 +MLINKS += archive_write.3 archive_write_set_compression_bzip2.3 +MLINKS += archive_write.3 archive_write_set_compression_gzip.3 +MLINKS += archive_write.3 archive_write_set_format_pax.3 +MLINKS += archive_write.3 archive_write_set_format_ustar.3 +MLINKS += libarchive.3 archive.3 + +INCS = archive.h archive_entry.h + +CFLAGS+=-DDEBUG -g +.if defined(DMALLOC) +CFLAGS+=-DDMALLOC -I/usr/local/include +LDFLAGS+=-L/usr/local/lib -ldmalloc +.endif +WARNS?= 10 + +.include <bsd.lib.mk> diff --git a/lib/libarchive/Makefile.freebsd b/lib/libarchive/Makefile.freebsd new file mode 100644 index 0000000..e59b243 --- /dev/null +++ b/lib/libarchive/Makefile.freebsd @@ -0,0 +1,119 @@ +# Makefile for libarchive. 
+# +# $FreeBSD$ +# +LIB= archive +SRCS= archive_check_magic.c \ + archive_entry.c \ + archive_read.c \ + archive_read_data_into_buffer.c \ + archive_read_data_into_fd.c \ + archive_read_extract.c \ + archive_read_open_file.c \ + archive_read_support_compression_all.c \ + archive_read_support_compression_bzip2.c \ + archive_read_support_compression_gzip.c \ + archive_read_support_compression_none.c \ + archive_read_support_format_all.c \ + archive_read_support_format_cpio.c \ + archive_read_support_format_gnutar.c \ + archive_read_support_format_tar.c \ + archive_string.c \ + archive_string_sprintf.c \ + archive_util.c \ + archive_write.c \ + archive_write_open_file.c \ + archive_write_set_compression_bzip2.c \ + archive_write_set_compression_gzip.c \ + archive_write_set_compression_none.c \ + archive_write_set_format.c \ + archive_write_set_format_by_name.c \ + archive_write_set_format_cpio.c \ + archive_write_set_format_pax.c \ + archive_write_set_format_shar.c \ + archive_write_set_format_ustar.c + +MAN = archive_entry.3 \ + archive_read.3 \ + archive_util.3 \ + archive_write.3 \ + libarchive.3 \ + tar.5 + +MLINKS += archive_entry.3 archive_entry_clear.3 +MLINKS += archive_entry.3 archive_entry_clone.3 +MLINKS += archive_entry.3 archive_entry_copy_stat.3 +MLINKS += archive_entry.3 archive_entry_dup.3 +MLINKS += archive_entry.3 archive_entry_free.3 +MLINKS += archive_entry.3 archive_entry_gname.3 +MLINKS += archive_entry.3 archive_entry_hardlink.3 +MLINKS += archive_entry.3 archive_entry_new.3 +MLINKS += archive_entry.3 archive_entry_pathname.3 +MLINKS += archive_entry.3 archive_entry_set_devmajor.3 +MLINKS += archive_entry.3 archive_entry_set_devminor.3 +MLINKS += archive_entry.3 archive_entry_set_gid.3 +MLINKS += archive_entry.3 archive_entry_set_gname.3 +MLINKS += archive_entry.3 archive_entry_set_hardlink.3 +MLINKS += archive_entry.3 archive_entry_set_mode.3 +MLINKS += archive_entry.3 archive_entry_set_pathname.3 +MLINKS += archive_entry.3 
archive_entry_set_symlink.3 +MLINKS += archive_entry.3 archive_entry_set_tartype.3 +MLINKS += archive_entry.3 archive_entry_set_uid.3 +MLINKS += archive_entry.3 archive_entry_set_uname.3 +MLINKS += archive_entry.3 archive_entry_size.3 +MLINKS += archive_entry.3 archive_entry_stat.3 +MLINKS += archive_entry.3 archive_entry_symlink.3 +MLINKS += archive_entry.3 archive_entry_tartype.3 +MLINKS += archive_entry.3 archive_entry_uname.3 +MLINKS += archive_read.3 archive_read_data.3 +MLINKS += archive_read.3 archive_read_data_into_buffer.3 +MLINKS += archive_read.3 archive_read_data_into_file.3 +MLINKS += archive_read.3 archive_read_data_skip.3 +MLINKS += archive_read.3 archive_read_extract.3 +MLINKS += archive_read.3 archive_read_finish.3 +MLINKS += archive_read.3 archive_read_new.3 +MLINKS += archive_read.3 archive_read_next_header.3 +MLINKS += archive_read.3 archive_read_open.3 +MLINKS += archive_read.3 archive_read_open_file.3 +MLINKS += archive_read.3 archive_read_open_tar.3 +MLINKS += archive_read.3 archive_read_set_bytes_per_block.3 +MLINKS += archive_read.3 archive_read_support_compression_all.3 +MLINKS += archive_read.3 archive_read_support_compression_bzip2.3 +MLINKS += archive_read.3 archive_read_support_compression_gzip.3 +MLINKS += archive_read.3 archive_read_support_compression_none.3 +MLINKS += archive_read.3 archive_read_support_format_all.3 +MLINKS += archive_read.3 archive_read_support_format_cpio.3 +MLINKS += archive_read.3 archive_read_support_format_gnutar.3 +MLINKS += archive_read.3 archive_read_support_format_tar.3 +MLINKS += archive_util.3 archive_compression.3 +MLINKS += archive_util.3 archive_compression_name.3 +MLINKS += archive_util.3 archive_errno.3 +MLINKS += archive_util.3 archive_error_string.3 +MLINKS += archive_util.3 archive_format.3 +MLINKS += archive_util.3 archive_format_name.3 +MLINKS += archive_write.3 archive_write_data.3 +MLINKS += archive_write.3 archive_write_finish.3 +MLINKS += archive_write.3 archive_write_header.3 +MLINKS += 
archive_write.3 archive_write_new.3 +MLINKS += archive_write.3 archive_write_open.3 +MLINKS += archive_write.3 archive_write_open_file.3 +MLINKS += archive_write.3 archive_write_prepare.3 +MLINKS += archive_write.3 archive_write_set_bytes_per_block.3 +MLINKS += archive_write.3 archive_write_set_bytes_in_last_block.3 +MLINKS += archive_write.3 archive_write_set_callbacks.3 +MLINKS += archive_write.3 archive_write_set_compression_bzip2.3 +MLINKS += archive_write.3 archive_write_set_compression_gzip.3 +MLINKS += archive_write.3 archive_write_set_format_pax.3 +MLINKS += archive_write.3 archive_write_set_format_ustar.3 +MLINKS += libarchive.3 archive.3 + +INCS = archive.h archive_entry.h + +CFLAGS+=-DDEBUG -g +.if defined(DMALLOC) +CFLAGS+=-DDMALLOC -I/usr/local/include +LDFLAGS+=-L/usr/local/lib -ldmalloc +.endif +WARNS?= 10 + +.include <bsd.lib.mk> diff --git a/lib/libarchive/README b/lib/libarchive/README new file mode 100644 index 0000000..6e55cdf --- /dev/null +++ b/lib/libarchive/README @@ -0,0 +1,90 @@ +$FreeBSD$ + +libarchive: a library for reading and writing streaming archives + +This is all under a BSD license. Use, enjoy, but don't blame me if it breaks! + +As of February, 2004, the library proper is fairly complete and compiles +cleanly on FreeBSD 5-CURRENT. The API should be stable now. + +Documentation: + * libarchive(3) gives an overview of the library as a whole + * archive_read(3) and archive_write(3) provide detailed calling + sequences for the read and write APIs + * archive_entry(3) details the "struct archive_entry" utility class + * tar(5) documents the "tar" file formats supported by the library + +You should also read the copious comments in "archive.h" and the source +code for the sample "bsdtar" program for more details. Please let me know +about any errors or omissions you find. (In particular, I no doubt missed +a few things when researching the tar(5) page.) + +Notes: + * This is a heavily stream-oriented system. 
There is no direct + support for in-place modification or random access and no intention + of ever adding such support. Adding such support would require + sacrificing a lot of other features, so don't bother asking. + + * The library is designed to be extended with new compression and + archive formats. The only requirement is that the format be + readable or writable as a stream and that each archive entry be + independent. For example, zip archives can't be written as a + stream because they require the compressed size of the data as part + of the file header. Similarly, some file attributes for zip + archives can't be extracted when streaming because those attributes + are only stored in the end-of-archive central directory and thus + aren't available when the corresponding entry is actually + extracted. + + * Under certain circumstances, you can append entries to an archive + by opening the file for reading, skimming to the end of the archive, + noting the file location, then opening it for write with a custom write + callback that seeks to the appropriate position before writing. Be + sure to not enable any compression support if you do this! + + * Compression and blocking are handled implicitly and, as far as + possible, transparently. All archive I/O is correctly blocked, even if + it's compressed. On read, the compression format is detected + automatically and the appropriate decompressor is invoked. + + * It should be easy to implement a system that reads one + archive and writes entries to another archive, omitting + or adding entries as appropriate along the way. This permits + "re-writing" of archive streams in lieu of in-place modification. + bsdtar has some code to demonstrate this. + + * The archive itself is read/written using callback functions. + You can read an archive directly from an in-memory buffer or + write it to a socket, if you wish. There are some utility + functions to provide easy-to-use "open file," etc, capabilities. 
+ + * The read/write APIs are designed to allow individual entries + to be read or written to any data source: You can create + a block of data in memory and add it to a tar archive without + first writing a temporary file. You can also read an entry from + an archive and write the data directly to a socket. If you want + to read/write entries to disk, there are convenience functions to + make this especially easy. + + * Read supports most common tar formats, including GNU tar, + POSIX-compliant "ustar interchange format", and the + shiny-and-improved POSIX "pax extended interchange format." The + pax format, in particular, eliminates most of the traditional tar + limitations in a standard way that is increasingly well supported. + (GNU tar notably does not support "pax interchange format"; the + GPL-licensed 'star' archiver does, however.) GNU format is only + incompletely supported at this time; if you really need GNU-format + sparse file support, volume headers, or GNU-format split archives, + let me know. + + There's also support for a grab-bag of non-tar formats, including + POSIX cpio and shar. + + * When writing tar formats, consider using "pax restricted" format + by default. This avoids the pax extensions whenever it can, enabling + them only on entries that cannot be correctly archived with ustar + format. Thus, you get the broad compatibility of ustar with the + safety of pax's support for very long filenames, etc. + + * Note: "pax interchange format" is really an extended tar format, + despite what the name says. diff --git a/lib/libarchive/archive.h b/lib/libarchive/archive.h new file mode 100644 index 0000000..54fafcb --- /dev/null +++ b/lib/libarchive/archive.h @@ -0,0 +1,266 @@ +/*- + * Copyright (c) 2003-2004 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. 
Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer + * in this position and unchanged. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + * + * $FreeBSD$ + */ + +#ifndef ARCHIVE_H_INCLUDED +#define ARCHIVE_H_INCLUDED + +#include <stdarg.h> +#include <stdint.h> +#include <unistd.h> + +#define ARCHIVE_BYTES_PER_RECORD 512 +#define ARCHIVE_DEFAULT_BYTES_PER_BLOCK 10240 + +/* Declare our basic types. */ +struct archive; +struct archive_entry; + +/* + * Error codes: Use archive_errno() and archive_error_string() + * to retrieve details. Unless specified otherwise, all functions + * that return 'int' use these codes. + */ +#define ARCHIVE_EOF 1 /* Found end of archive. */ +#define ARCHIVE_OK 0 /* Operation was successful. */ +#define ARCHIVE_WARN (-1) /* Sucess, but minor problem. */ +#define ARCHIVE_RETRY (-2) /* Retry might succeed. */ +#define ARCHIVE_FATAL (-3) /* No more operations are possible. */ + +/* + * Callbacks are invoked to automatically read/write/open/close the archive. 
+ * You can provide your own for complex tasks (like breaking archives + * across multiple tapes) or use standard ones built into the library. + */ + +/* Returns pointer and size of next block of data from archive. */ +typedef ssize_t archive_read_callback(struct archive *, void *_client_data, + const void **_buffer); +/* Returns size actually written, zero on EOF, -1 on error. */ +typedef ssize_t archive_write_callback(struct archive *, void *_client_data, + void *_buffer, size_t _length); +typedef int archive_open_callback(struct archive *, void *_client_data); +typedef int archive_close_callback(struct archive *, void *_client_data); + +/* + * Codes for archive_compression. + */ +#define ARCHIVE_COMPRESSION_NONE 0 +#define ARCHIVE_COMPRESSION_GZIP 1 +#define ARCHIVE_COMPRESSION_BZIP2 2 + +/* + * Codes returned by archive_format. + * + * Top 16 bits identifies the format family (e.g., "tar"); lower + * 16 bits indicate the variant. This is updated by read_next_header. + * Note that the lower 16 bits will often vary from entry to entry. + */ +#define ARCHIVE_FORMAT_BASE_MASK 0xff0000U +#define ARCHIVE_FORMAT_CPIO 0x10000 +#define ARCHIVE_FORMAT_CPIO_POSIX (ARCHIVE_FORMAT_CPIO | 1) +#define ARCHIVE_FORMAT_SHAR 0x20000 +#define ARCHIVE_FORMAT_SHAR_BASE (ARCHIVE_FORMAT_SHAR | 1) +#define ARCHIVE_FORMAT_SHAR_DUMP (ARCHIVE_FORMAT_SHAR | 2) +#define ARCHIVE_FORMAT_TAR 0x30000 +#define ARCHIVE_FORMAT_TAR_USTAR (ARCHIVE_FORMAT_TAR | 1) +#define ARCHIVE_FORMAT_TAR_PAX_INTERCHANGE (ARCHIVE_FORMAT_TAR | 2) +#define ARCHIVE_FORMAT_TAR_PAX_RESTRICTED (ARCHIVE_FORMAT_TAR | 3) +#define ARCHIVE_FORMAT_TAR_GNUTAR (ARCHIVE_FORMAT_TAR | 4) + +/*- + * Basic outline for reading an archive: + * 1) Ask archive_read_new for an archive reader object. + * 2) Update any global properties as appropriate. + * In particular, you'll certainly want to call appropriate + * archive_read_support_XXX functions. 
+ * 3) Call archive_read_open_XXX to open the archive + * 4) Repeatedly call archive_read_next_header to get information about + * successive archive entries. Call archive_read_data to extract + * data for entries of interest. + * 5) Call archive_read_finish to destroy the object. + */ +struct archive *archive_read_new(void); + +/* + * XXX Kill this function. The client callback is now responsible for + * read blocking. XXX + */ +/* + int archive_read_set_bytes_per_block(struct archive *, + int bytes_per_blk); +*/ + +/* + * The archive_read_support_XXX calls enable auto-detect for this + * archive handle. They also link in the necessary support code. + * For example, if you don't want bzlib linked in, don't invoke + * support_compression_bzip2(). The "all" functions provide the + * obvious shorthand. + */ +int archive_read_support_compression_all(struct archive *); +int archive_read_support_compression_bzip2(struct archive *); +int archive_read_support_compression_gzip(struct archive *); +int archive_read_support_compression_none(struct archive *); + +int archive_read_support_format_all(struct archive *); +int archive_read_support_format_cpio(struct archive *); +int archive_read_support_format_gnutar(struct archive *); +int archive_read_support_format_tar(struct archive *); + + +/* Open the archive using callbacks for archive I/O. */ +int archive_read_open(struct archive *, void *_client_data, + archive_open_callback *, archive_read_callback *, + archive_close_callback *); + +/* + * The archive_read_open_file function is a convenience function built + * on archive_read_open that uses a canned callback suitable for + * common situations. Note that a NULL filename indicates stdin. + */ +int archive_read_open_file(struct archive *, const char *_file, + size_t _block_size); + +/* Parses and returns next entry header. 
*/ +int archive_read_next_header(struct archive *, + struct archive_entry **); + +/* + * Retrieve the byte offset in UNCOMPRESSED data where last-read + * header started. + */ +int64_t archive_read_header_position(struct archive *); + +/* Read data from the body of an entry. Similar to read(2). */ +ssize_t archive_read_data(struct archive *, void *, size_t); + +/*- + * Some convenience functions that are built on archive_read_data: + * 'skip': skips entire entry + * 'into_buffer': writes data into memory buffer that you provide + * 'into_file': writes data to specified filedes + */ +int archive_read_data_skip(struct archive *); +ssize_t archive_read_data_into_buffer(struct archive *, void *buffer, + ssize_t len); +ssize_t archive_read_data_into_fd(struct archive *, int fd); + +/*- + * Convenience function to recreate the current entry (whose header + * has just been read) on disk. + * + * This does quite a bit more than just copy data to disk. It also: + * - Creates intermediate directories as required. + * - Manages directory permissions: non-writable directories will + * be initially created with write permission enabled; when the + * archive is closed, dir permissions are edited to the values specified + * in the archive. + * - Checks hardlinks: hardlinks will not be extracted unless the + * linked-to file was also extracted within the same session. (TODO) + */ + +/* The "flags" argument selects optional behavior, 'OR' the flags you want. */ +/* TODO: The 'Default' comments here are not quite correct; clean this up. 
*/ +#define ARCHIVE_EXTRACT_OWNER (1) /* Default: owner/group not restored */ +#define ARCHIVE_EXTRACT_PERM (2) /* Default: restore perm only for reg file*/ +#define ARCHIVE_EXTRACT_TIME (4) /* Default: mod time not restored */ +#define ARCHIVE_EXTRACT_NO_OVERWRITE (8) /* Default: Replace files on disk */ + +int archive_read_extract(struct archive *, struct archive_entry *, + int flags); + +/* Close the file, release any resources, and destroy the object. */ +void archive_read_finish(struct archive *); + +/*- + * To create an archive: + * 1) Ask archive_write_new for a archive writer object. + * 2) Set any global properties. In particular, you should register + * open/write/close callbacks. + * 3) Call archive_write_open to open the file + * 4) For each entry: + * - construct an appropriate struct archive_entry structure + * - archive_write_header to write the header + * - archive_write_data to write the entry data + * 5) archive_write_finish to close the output and cleanup the writer + */ +struct archive *archive_write_new(void); +int archive_write_set_bytes_per_block(struct archive *, + int bytes_per_block); +/* XXX This is badly misnamed; suggestions appreciated. XXX */ +int archive_write_set_bytes_in_last_block(struct archive *, + int bytes_in_last_block); + +int archive_write_set_compression_bzip2(struct archive *); +int archive_write_set_compression_gzip(struct archive *); +int archive_write_set_compression_none(struct archive *); +/* A convenience function to set the format based on the code or name. */ +int archive_write_set_format(struct archive *, int format_code); +int archive_write_set_format_by_name(struct archive *, + const char *name); +/* To minimize link pollution, use one or more of the following. 
*/ +int archive_write_set_format_cpio(struct archive *); +/* TODO: int archive_write_set_format_old_tar(struct archive *); */ +int archive_write_set_format_pax(struct archive *); +int archive_write_set_format_pax_restricted(struct archive *); +int archive_write_set_format_shar(struct archive *); +int archive_write_set_format_shar_dump(struct archive *); +int archive_write_set_format_ustar(struct archive *); +int archive_write_open(struct archive *, void *, + archive_open_callback *, archive_write_callback *, + archive_close_callback *); +int archive_write_open_file(struct archive *, const char *_file); +int archive_write_open_file_position(struct archive *, + const char *_filename, int64_t offset); +int archive_write_open_tar(struct archive *, const char *_file); + +/* + * Note that the library will truncate writes beyond the size provided + * to archive_write_header or pad if the provided data is short. + */ +int archive_write_header(struct archive *, + struct archive_entry *); +int archive_write_data(struct archive *, const void *, size_t); +void archive_write_finish(struct archive *); + +/* + * Accessor functions to read/set various information in + * the struct archive object: + */ +const char *archive_compression_name(struct archive *); +int archive_compression(struct archive *); +int archive_errno(struct archive *); +const char *archive_error_string(struct archive *); +const char *archive_format_name(struct archive *); +int archive_format(struct archive *); +/* void archive_set_errno(struct archive *, int); */ +/* void archive_error_printf(struct archive *, const char *fmt, ...); */ + +void archive_set_error(struct archive *, int _err, const char *fmt, ...); + +#endif /* !ARCHIVE_H_INCLUDED */ diff --git a/lib/libarchive/archive.h.in b/lib/libarchive/archive.h.in new file mode 100644 index 0000000..54fafcb --- /dev/null +++ b/lib/libarchive/archive.h.in @@ -0,0 +1,266 @@ +/*- + * Copyright (c) 2003-2004 Tim Kientzle + * All rights reserved. 
+ * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer + * in this position and unchanged. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + * + * $FreeBSD$ + */ + +#ifndef ARCHIVE_H_INCLUDED +#define ARCHIVE_H_INCLUDED + +#include <stdarg.h> +#include <stdint.h> +#include <unistd.h> + +#define ARCHIVE_BYTES_PER_RECORD 512 +#define ARCHIVE_DEFAULT_BYTES_PER_BLOCK 10240 + +/* Declare our basic types. */ +struct archive; +struct archive_entry; + +/* + * Error codes: Use archive_errno() and archive_error_string() + * to retrieve details. Unless specified otherwise, all functions + * that return 'int' use these codes. + */ +#define ARCHIVE_EOF 1 /* Found end of archive. */ +#define ARCHIVE_OK 0 /* Operation was successful. */ +#define ARCHIVE_WARN (-1) /* Sucess, but minor problem. 
*/ +#define ARCHIVE_RETRY (-2) /* Retry might succeed. */ +#define ARCHIVE_FATAL (-3) /* No more operations are possible. */ + +/* + * Callbacks are invoked to automatically read/write/open/close the archive. + * You can provide your own for complex tasks (like breaking archives + * across multiple tapes) or use standard ones built into the library. + */ + +/* Returns pointer and size of next block of data from archive. */ +typedef ssize_t archive_read_callback(struct archive *, void *_client_data, + const void **_buffer); +/* Returns size actually written, zero on EOF, -1 on error. */ +typedef ssize_t archive_write_callback(struct archive *, void *_client_data, + void *_buffer, size_t _length); +typedef int archive_open_callback(struct archive *, void *_client_data); +typedef int archive_close_callback(struct archive *, void *_client_data); + +/* + * Codes for archive_compression. + */ +#define ARCHIVE_COMPRESSION_NONE 0 +#define ARCHIVE_COMPRESSION_GZIP 1 +#define ARCHIVE_COMPRESSION_BZIP2 2 + +/* + * Codes returned by archive_format. + * + * Top 16 bits identifies the format family (e.g., "tar"); lower + * 16 bits indicate the variant. This is updated by read_next_header. + * Note that the lower 16 bits will often vary from entry to entry. + */ +#define ARCHIVE_FORMAT_BASE_MASK 0xff0000U +#define ARCHIVE_FORMAT_CPIO 0x10000 +#define ARCHIVE_FORMAT_CPIO_POSIX (ARCHIVE_FORMAT_CPIO | 1) +#define ARCHIVE_FORMAT_SHAR 0x20000 +#define ARCHIVE_FORMAT_SHAR_BASE (ARCHIVE_FORMAT_SHAR | 1) +#define ARCHIVE_FORMAT_SHAR_DUMP (ARCHIVE_FORMAT_SHAR | 2) +#define ARCHIVE_FORMAT_TAR 0x30000 +#define ARCHIVE_FORMAT_TAR_USTAR (ARCHIVE_FORMAT_TAR | 1) +#define ARCHIVE_FORMAT_TAR_PAX_INTERCHANGE (ARCHIVE_FORMAT_TAR | 2) +#define ARCHIVE_FORMAT_TAR_PAX_RESTRICTED (ARCHIVE_FORMAT_TAR | 3) +#define ARCHIVE_FORMAT_TAR_GNUTAR (ARCHIVE_FORMAT_TAR | 4) + +/*- + * Basic outline for reading an archive: + * 1) Ask archive_read_new for an archive reader object. 
+ * 2) Update any global properties as appropriate. + * In particular, you'll certainly want to call appropriate + * archive_read_support_XXX functions. + * 3) Call archive_read_open_XXX to open the archive + * 4) Repeatedly call archive_read_next_header to get information about + * successive archive entries. Call archive_read_data to extract + * data for entries of interest. + * 5) Call archive_read_finish to destroy the object. + */ +struct archive *archive_read_new(void); + +/* + * XXX Kill this function. The client callback is now responsible for + * read blocking. XXX + */ +/* + int archive_read_set_bytes_per_block(struct archive *, + int bytes_per_blk); +*/ + +/* + * The archive_read_support_XXX calls enable auto-detect for this + * archive handle. They also link in the necessary support code. + * For example, if you don't want bzlib linked in, don't invoke + * support_compression_bzip2(). The "all" functions provide the + * obvious shorthand. + */ +int archive_read_support_compression_all(struct archive *); +int archive_read_support_compression_bzip2(struct archive *); +int archive_read_support_compression_gzip(struct archive *); +int archive_read_support_compression_none(struct archive *); + +int archive_read_support_format_all(struct archive *); +int archive_read_support_format_cpio(struct archive *); +int archive_read_support_format_gnutar(struct archive *); +int archive_read_support_format_tar(struct archive *); + + +/* Open the archive using callbacks for archive I/O. */ +int archive_read_open(struct archive *, void *_client_data, + archive_open_callback *, archive_read_callback *, + archive_close_callback *); + +/* + * The archive_read_open_file function is a convenience function built + * on archive_read_open that uses a canned callback suitable for + * common situations. Note that a NULL filename indicates stdin. + */ +int archive_read_open_file(struct archive *, const char *_file, + size_t _block_size); + +/* Parses and returns next entry header. 
*/ +int archive_read_next_header(struct archive *, + struct archive_entry **); + +/* + * Retrieve the byte offset in UNCOMPRESSED data where last-read + * header started. + */ +int64_t archive_read_header_position(struct archive *); + +/* Read data from the body of an entry. Similar to read(2). */ +ssize_t archive_read_data(struct archive *, void *, size_t); + +/*- + * Some convenience functions that are built on archive_read_data: + * 'skip': skips entire entry + * 'into_buffer': writes data into memory buffer that you provide + * 'into_fd': writes data to the specified file descriptor + */ +int archive_read_data_skip(struct archive *); +ssize_t archive_read_data_into_buffer(struct archive *, void *buffer, + ssize_t len); +ssize_t archive_read_data_into_fd(struct archive *, int fd); + +/*- + * Convenience function to recreate the current entry (whose header + * has just been read) on disk. + * + * This does quite a bit more than just copy data to disk. It also: + * - Creates intermediate directories as required. + * - Manages directory permissions: non-writable directories will + * be initially created with write permission enabled; when the + * archive is closed, dir permissions are edited to the values specified + * in the archive. + * - Checks hardlinks: hardlinks will not be extracted unless the + * linked-to file was also extracted within the same session. (TODO) + */ + +/* The "flags" argument selects optional behavior; 'OR' together the flags you want. */ +/* TODO: The 'Default' comments here are not quite correct; clean this up. 
*/ +#define ARCHIVE_EXTRACT_OWNER (1) /* Default: owner/group not restored */ +#define ARCHIVE_EXTRACT_PERM (2) /* Default: restore perm only for reg file */ +#define ARCHIVE_EXTRACT_TIME (4) /* Default: mod time not restored */ +#define ARCHIVE_EXTRACT_NO_OVERWRITE (8) /* Default: Replace files on disk */ + +int archive_read_extract(struct archive *, struct archive_entry *, + int flags); + +/* Close the file, release any resources, and destroy the object. */ +void archive_read_finish(struct archive *); + +/*- + * To create an archive: + * 1) Ask archive_write_new for an archive writer object. + * 2) Set any global properties. In particular, you should register + * open/write/close callbacks. + * 3) Call archive_write_open to open the file + * 4) For each entry: + * - construct an appropriate struct archive_entry structure + * - archive_write_header to write the header + * - archive_write_data to write the entry data + * 5) archive_write_finish to close the output and clean up the writer + */ +struct archive *archive_write_new(void); +int archive_write_set_bytes_per_block(struct archive *, + int bytes_per_block); +/* XXX This is badly misnamed; suggestions appreciated. XXX */ +int archive_write_set_bytes_in_last_block(struct archive *, + int bytes_in_last_block); + +int archive_write_set_compression_bzip2(struct archive *); +int archive_write_set_compression_gzip(struct archive *); +int archive_write_set_compression_none(struct archive *); +/* A convenience function to set the format based on the code or name. */ +int archive_write_set_format(struct archive *, int format_code); +int archive_write_set_format_by_name(struct archive *, + const char *name); +/* To minimize link pollution, use one or more of the following. 
*/ +int archive_write_set_format_cpio(struct archive *); +/* TODO: int archive_write_set_format_old_tar(struct archive *); */ +int archive_write_set_format_pax(struct archive *); +int archive_write_set_format_pax_restricted(struct archive *); +int archive_write_set_format_shar(struct archive *); +int archive_write_set_format_shar_dump(struct archive *); +int archive_write_set_format_ustar(struct archive *); +int archive_write_open(struct archive *, void *, + archive_open_callback *, archive_write_callback *, + archive_close_callback *); +int archive_write_open_file(struct archive *, const char *_file); +int archive_write_open_file_position(struct archive *, + const char *_filename, int64_t offset); +int archive_write_open_tar(struct archive *, const char *_file); + +/* + * Note that the library will truncate writes beyond the size provided + * to archive_write_header or pad if the provided data is short. + */ +int archive_write_header(struct archive *, + struct archive_entry *); +int archive_write_data(struct archive *, const void *, size_t); +void archive_write_finish(struct archive *); + +/* + * Accessor functions to read/set various information in + * the struct archive object: + */ +const char *archive_compression_name(struct archive *); +int archive_compression(struct archive *); +int archive_errno(struct archive *); +const char *archive_error_string(struct archive *); +const char *archive_format_name(struct archive *); +int archive_format(struct archive *); +/* void archive_set_errno(struct archive *, int); */ +/* void archive_error_printf(struct archive *, const char *fmt, ...); */ + +void archive_set_error(struct archive *, int _err, const char *fmt, ...); + +#endif /* !ARCHIVE_H_INCLUDED */ diff --git a/lib/libarchive/archive_check_magic.c b/lib/libarchive/archive_check_magic.c new file mode 100644 index 0000000..719ba9e --- /dev/null +++ b/lib/libarchive/archive_check_magic.c @@ -0,0 +1,102 @@ +/*- + * Copyright (c) 2003-2004 Tim Kientzle + * All rights 
reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer + * in this position and unchanged. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#ifdef DMALLOC +#include <dmalloc.h> +#endif +#include <stdio.h> +#include <string.h> +#include <unistd.h> + +#include "archive_private.h" + +static void +diediedie(void) +{ + *(char *)0 = 1; /* Deliberately segfault and force a coredump. */ + _exit(1); /* If that didn't work, just exit with an error. 
*/ +} + +static const char * +state_name(unsigned s) +{ + switch (s) { + case ARCHIVE_STATE_NEW: return ("new"); + case ARCHIVE_STATE_HEADER: return ("header"); + case ARCHIVE_STATE_DATA: return ("data"); + case ARCHIVE_STATE_EOF: return ("eof"); + case ARCHIVE_STATE_CLOSED: return ("closed"); + case ARCHIVE_STATE_FATAL: return ("fatal"); + default: return ("??"); + } +} + + +static void +write_all_states(FILE *f, int states) +{ + unsigned lowbit; + + /* A trick for computing the lowest set bit. */ + while ((lowbit = states & (-states)) != 0) { + states &= ~lowbit; /* Clear the low bit. */ + fprintf(f, "%s%s", state_name(lowbit), + (states != 0) ? "/" : ""); + } +} + +/* + * Check magic value and current state; bail if it isn't valid. + * + * This is designed to catch serious programming errors that violate + * the libarchive API. + */ +void +__archive_check_magic(struct archive *a, unsigned magic, unsigned state, + const char *function) +{ + if (a->magic != magic) { + fprintf(stderr, "INTERNAL ERROR: Function %s invoked" + " with invalid struct archive structure.\n", function); + diediedie(); + } + + if (state == ARCHIVE_STATE_ANY) + return; + + if ((a->state & state) == 0) { + fprintf(stderr, "INTERNAL ERROR: Function '%s' invoked" + " with archive structure in state '", function); + write_all_states(stderr, a->state); + fprintf(stderr,"', should be in state '"); + write_all_states(stderr, state); + fprintf(stderr, "'\n"); + diediedie(); + } +} diff --git a/lib/libarchive/archive_entry.3 b/lib/libarchive/archive_entry.3 new file mode 100644 index 0000000..c247b23 --- /dev/null +++ b/lib/libarchive/archive_entry.3 @@ -0,0 +1,218 @@ +.\" Copyright (c) 2003-2004 Tim Kientzle +.\" All rights reserved. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. 
Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. 
+.\" +.\" $FreeBSD$ +.\" +.Dd December 15, 2003 +.Dt archive_entry 3 +.Os +.Sh NAME +.Nm archive_entry_clear +.Nm archive_entry_clone +.Nm archive_entry_copy_stat +.Nm archive_entry_dup +.Nm archive_entry_free +.Nm archive_entry_gname +.Nm archive_entry_hardlink +.Nm archive_entry_new +.Nm archive_entry_pathname +.Nm archive_entry_set_devmajor +.Nm archive_entry_set_devminor +.Nm archive_entry_set_gid +.Nm archive_entry_set_gname +.Nm archive_entry_set_hardlink +.Nm archive_entry_set_mode +.Nm archive_entry_set_pathname +.Nm archive_entry_set_symlink +.Nm archive_entry_set_tartype +.Nm archive_entry_set_uid +.Nm archive_entry_set_uname +.Nm archive_entry_size +.Nm archive_entry_stat +.Nm archive_entry_symlink +.Nm archive_entry_tartype +.Nm archive_entry_uname +.Nd functions for manipulating archive entry descriptions +.Sh SYNOPSIS +.In archive_entry.h +.Ft void +.Fn archive_entry_clear "struct archive_entry *" +.Ft struct archive_entry * +.Fn archive_entry_clone "struct archive_entry *" +.Ft void +.Fn archive_entry_copy_stat "struct archive_entry *" "struct stat *" +.Ft struct archive_entry * +.Fn archive_entry_dup "struct archive_entry *" +.Ft void +.Fn archive_entry_free "struct archive_entry *" +.Ft const char * +.Fn archive_entry_gname "struct archive_entry *" +.Ft const char * +.Fn archive_entry_hardlink "struct archive_entry *" +.Ft struct archive_entry * +.Fn archive_entry_new "void" +.Ft const char * +.Fn archive_entry_pathname "struct archive_entry *" +.Ft void +.Fn archive_entry_set_devmajor "struct archive_entry *" "dev_t" +.Ft void +.Fn archive_entry_set_devminor "struct archive_entry *" "dev_t" +.Ft void +.Fn archive_entry_set_gid "struct archive_entry *" "gid_t" +.Ft void +.Fn archive_entry_set_gname "struct archive_entry *" "const char *" +.Ft void +.Fn archive_entry_set_hardlink "struct archive_entry *" "const char *" +.Ft void +.Fn archive_entry_set_mode "struct archive_entry *" "mode_t" +.Ft void +.Fn archive_entry_set_pathname "struct 
archive_entry *" "const char *" +.Ft void +.Fn archive_entry_set_symlink "struct archive_entry *" "const char *" +.Ft void +.Fn archive_entry_set_tartype "struct archive_entry *" "int" +.Ft void +.Fn archive_entry_set_uid "struct archive_entry *" "uid_t" +.Ft void +.Fn archive_entry_set_uname "struct archive_entry *" "const char *" +.Ft int64_t +.Fn archive_entry_size "struct archive_entry *" +.Ft const struct stat * +.Fn archive_entry_stat "struct archive_entry *" +.Ft const char * +.Fn archive_entry_symlink "struct archive_entry *" +.Ft int +.Fn archive_entry_tartype "struct archive_entry *" +.Ft const char * +.Fn archive_entry_uname "struct archive_entry *" +.Sh DESCRIPTION +These functions create and manipulate data objects that +represent entries within an archive. +You can think of a +.Tn struct archive_entry +as a +.Tn struct stat +on steroids: it includes everything from +.Tn struct stat +plus associated pathname, textual group and user names, etc. +These objects are used by +.Xr libarchive 3 +to represent the metadata associated with a particular +entry in an archive. +.Bl -tag -compact -width indent +.It Fn archive_entry_clear +Erases the object, resetting all internal fields to the +same state as a newly-created object. +This is provided to allow you to quickly recycle objects +without thrashing the heap. +.It Fn archive_entry_clone +A deep copy operation; all text fields are duplicated. +.It Fn archive_entry_copy_stat +Copies the contents of the provided +.Tn struct stat +into the +.Tn struct archive_entry +object. +.It Fn archive_entry_dup +A shallow copy; text fields are not duplicated. +.It Fn archive_entry_free +Releases the +.Tn struct archive_entry +object. +.It Fn archive_entry_gname +Returns a pointer to the textual group name. +.It Fn archive_entry_hardlink +If this function returns non-NULL, then this object represents +a hardlink to another filesystem object. +The contents contain the pathname of the object. 
+.It Fn archive_entry_new +Allocate and return a blank +.Tn struct archive_entry +object. +.It Fn archive_entry_pathname +Returns a pointer to the pathname. +.It Fn archive_entry_set_devmajor +Sets the device major number (only valid for objects representing +block and character devices). +.It Fn archive_entry_set_devminor +Sets the device minor number (only valid for objects representing +block and character devices). +.It Fn archive_entry_set_gid +Sets the group ID for the object. +.It Fn archive_entry_set_gname +Sets a pointer to the textual group name. +Note that the name itself is not copied. +.It Fn archive_entry_set_hardlink +Sets the hardlink property; see +.Fn archive_entry_hardlink +above. +.It Fn archive_entry_set_mode +Sets the file mode. +.It Fn archive_entry_set_pathname +Sets a pointer to the pathname. +Note that the pathname text is not copied. +.It Fn archive_entry_set_symlink +Sets a pointer to the contents of a symbolic link. +Note that the pathname text is not copied. +.It Fn archive_entry_set_tartype +Sets the value to be used in a tar-format header +for this entry. +Client code should generally not set this; if it +is left unset, the library will automatically determine +an appropriate value. +.It Fn archive_entry_set_uid +Set the user ID for the object. +.It Fn archive_entry_set_uname +Sets a pointer to the textual user name. +Note that the name itself is not copied. +.It Fn archive_entry_size +Returns the size of the object on disk in bytes. +.It Fn archive_entry_stat +Returns a pointer to a populated +.Tn struct stat . +.It Fn archive_entry_symlink +Returns a pointer to the symlink contents. +.It Fn archive_entry_tartype +Returns the value used in a tar-format header. +Not generally useful to clients. +.It Fn archive_entry_uname +Returns a pointer to the textual user name. +.El +.\" .Sh EXAMPLE +.\" .Sh RETURN VALUES +.\" .Sh ERRORS +.Sh SEE ALSO +.Xr archive 3 +.Sh HISTORY +The +.Nm libarchive +library first appeared in +.Fx 5.3 . 
+.Sh AUTHORS +.An -nosplit +The +.Nm libarchive +library was written by +.An Tim Kientzle Aq kientzle@acm.org . +.Sh BUGS diff --git a/lib/libarchive/archive_entry.c b/lib/libarchive/archive_entry.c new file mode 100644 index 0000000..d48ab32 --- /dev/null +++ b/lib/libarchive/archive_entry.c @@ -0,0 +1,407 @@ +/*- + * Copyright (c) 2003-2004 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer + * in this position and unchanged. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <sys/stat.h> +#include <sys/types.h> +#ifdef DMALLOC +#include <dmalloc.h> +#endif +#include <stdlib.h> +#include <string.h> + +#include "archive_entry.h" + +/* + * Description of an archive entry. 
+ * + * Basically, this is a "struct stat" with a few text fields added in. + * + * TODO: Add "comment", "charset", "acl", and possibly other entries + * that are supported by "pax interchange" format. However, GNU, ustar, + * cpio, and other variants don't support these features, so they're not an + * excruciatingly high priority right now. + * + * TODO: "pax interchange" format allows essentially arbitrary + * key/value attributes to be attached to any entry. Supporting + * such extensions may make this library useful for special + * applications (e.g., a package manager could attach special + * package-management attributes to each entry). There are tricky + * API issues involved, so this is not going to happen until + * there's a real demand for it. + * + * TODO: Design a good API for handling sparse files. + */ +struct archive_entry { + /* + * Note that ae_stat.st_mode & S_IFMT can be 0! + * This occurs when the actual file type of the underlying object is + * not in the archive. For example, 'tar' archives store hardlinks + * without marking the type of the underlying object. + */ + struct stat ae_stat; + + /* I'm not happy with having this format-particular data here. */ + int ae_tartype; + + /* + * Note: If you add any more string fields, update + * archive_entry_clone accordingly. + */ + const char *ae_acl; /* ACL text */ + const char *ae_acl_default; /* default ACL */ + const char *ae_fflags; /* Text fflags per fflagstostr(3) */ + const char *ae_gname; /* Name of owning group */ + const char *ae_hardlink; /* Name of target for hardlink */ + const char *ae_pathname; /* Name of entry */ + const char *ae_symlink; /* symlink contents */ + const char *ae_uname; /* Name of owner */ + + char buff[1]; /* MUST BE AT END OF STRUCT!!! 
*/ +}; + +struct archive_entry * +archive_entry_clear(struct archive_entry *entry) +{ + memset(entry, 0, sizeof(*entry)); + entry->ae_tartype = -1; + return entry; +} + +struct archive_entry * +archive_entry_clone(struct archive_entry *entry) +{ + int size; + struct archive_entry *entry2; + char *p; + + size = sizeof(*entry2); + if (entry->ae_acl) + size += strlen(entry->ae_acl) + 1; + if (entry->ae_acl_default) + size += strlen(entry->ae_acl_default) + 1; + if (entry->ae_fflags) + size += strlen(entry->ae_fflags) + 1; + if (entry->ae_gname) + size += strlen(entry->ae_gname) + 1; + if (entry->ae_hardlink) + size += strlen(entry->ae_hardlink) + 1; + if (entry->ae_pathname) + size += strlen(entry->ae_pathname) + 1; + if (entry->ae_symlink) + size += strlen(entry->ae_symlink) + 1; + if (entry->ae_uname) + size += strlen(entry->ae_uname) + 1; + + entry2 = malloc(size); + *entry2 = *entry; + + /* Copy all of the strings from the original. */ + p = entry2->buff; + + if (entry->ae_acl) { + entry2->ae_acl = p; + strcpy(p, entry->ae_acl); + p += strlen(p) + 1; + } + + if (entry->ae_acl_default) { + entry2->ae_acl_default = p; + strcpy(p, entry->ae_acl_default); + p += strlen(p) + 1; + } + + if (entry->ae_fflags) { + entry2->ae_fflags = p; + strcpy(p, entry->ae_fflags); + p += strlen(p) + 1; + } + + if (entry->ae_gname) { + entry2->ae_gname = p; + strcpy(p, entry->ae_gname); + p += strlen(p) + 1; + } + + if (entry->ae_hardlink) { + entry2->ae_hardlink = p; + strcpy(p, entry->ae_hardlink); + p += strlen(p) + 1; + } + + if (entry->ae_pathname) { + entry2->ae_pathname = p; + strcpy(p, entry->ae_pathname); + p += strlen(p) + 1; + } + + if (entry->ae_symlink) { + entry2->ae_symlink = p; + strcpy(p, entry->ae_symlink); + p += strlen(p) + 1; + } + + if (entry->ae_uname) { + entry2->ae_uname = p; + strcpy(p, entry->ae_uname); + p += strlen(p) + 1; + } + + return (entry2); +} + +struct archive_entry * +archive_entry_dup(struct archive_entry *entry) +{ + struct archive_entry *entry2; 
+ + entry2 = malloc(sizeof(*entry2)); + *entry2 = *entry; + return (entry2); +} + +void +archive_entry_free(struct archive_entry *entry) +{ + free(entry); +} + +struct archive_entry * +archive_entry_new(void) +{ + struct archive_entry *entry; + + entry = malloc(sizeof(*entry)); + if(entry == NULL) + return (NULL); + archive_entry_clear(entry); + return (entry); +} + + +/* + * Functions for reading fields from an archive_entry. + */ + +const char * +archive_entry_acl(struct archive_entry *entry) +{ + return (entry->ae_acl); +} + + +const char * +archive_entry_acl_default(struct archive_entry *entry) +{ + return (entry->ae_acl_default); +} + +dev_t +archive_entry_devmajor(struct archive_entry *entry) +{ + return (major(entry->ae_stat.st_rdev)); +} + + +dev_t +archive_entry_devminor(struct archive_entry *entry) +{ + return (minor(entry->ae_stat.st_rdev)); +} + +const char * +archive_entry_fflags(struct archive_entry *entry) +{ + return (entry->ae_fflags); +} + +const char * +archive_entry_gname(struct archive_entry *entry) +{ + return (entry->ae_gname); +} + +const char * +archive_entry_hardlink(struct archive_entry *entry) +{ + return (entry->ae_hardlink); +} + +mode_t +archive_entry_mode(struct archive_entry *entry) +{ + return (entry->ae_stat.st_mode); +} + +const char * +archive_entry_pathname(struct archive_entry *entry) +{ + return (entry->ae_pathname); +} + +int64_t +archive_entry_size(struct archive_entry *entry) +{ + return (entry->ae_stat.st_size); +} + +const struct stat * +archive_entry_stat(struct archive_entry *entry) +{ + return (&entry->ae_stat); +} + +const char * +archive_entry_symlink(struct archive_entry *entry) +{ + return (entry->ae_symlink); +} + +int +archive_entry_tartype(struct archive_entry *entry) +{ + return (entry->ae_tartype); +} + +const char * +archive_entry_uname(struct archive_entry *entry) +{ + return (entry->ae_uname); +} + +/* + * Functions to set archive_entry properties. + */ + +/* + * Note "copy" not "set" here. 
The "set" functions that accept a pointer + * only store the pointer; they don't copy the underlying object. + */ +void +archive_entry_copy_stat(struct archive_entry *entry, const struct stat *st) +{ + entry->ae_stat = *st; +} + +void +archive_entry_set_acl(struct archive_entry *entry, const char *acl) +{ + entry->ae_acl = acl; +} + + +void +archive_entry_set_acl_default(struct archive_entry *entry, const char *acl) +{ + entry->ae_acl_default = acl; +} + +void +archive_entry_set_devmajor(struct archive_entry *entry, dev_t m) +{ + dev_t d; + + d = entry->ae_stat.st_rdev; + entry->ae_stat.st_rdev = makedev(m, minor(d)); +} + +void +archive_entry_set_devminor(struct archive_entry *entry, dev_t m) +{ + dev_t d; + + d = entry->ae_stat.st_rdev; + entry->ae_stat.st_rdev = makedev( major(d), m); +} + +void +archive_entry_set_fflags(struct archive_entry *entry, const char *flags) +{ + entry->ae_fflags = flags; +} + +void +archive_entry_set_gid(struct archive_entry *entry, gid_t g) +{ + entry->ae_stat.st_gid = g; +} + +void +archive_entry_set_gname(struct archive_entry *entry, const char *name) +{ + entry->ae_gname = name; +} + +void +archive_entry_set_hardlink(struct archive_entry *entry, const char *target) +{ + entry->ae_hardlink = target; +} + +void +archive_entry_set_mode(struct archive_entry *entry, mode_t m) +{ + entry->ae_stat.st_mode = m; +} + +void +archive_entry_set_pathname(struct archive_entry *entry, const char *name) +{ + entry->ae_pathname = name; +} + +void +archive_entry_set_size(struct archive_entry *entry, int64_t s) +{ + entry->ae_stat.st_size = s; +} + +void +archive_entry_set_symlink(struct archive_entry *entry, const char *link) +{ + entry->ae_symlink = link; +} + +void +archive_entry_set_tartype(struct archive_entry *entry, char t) +{ + entry->ae_tartype = t; +} + +void +archive_entry_set_uid(struct archive_entry *entry, uid_t u) +{ + entry->ae_stat.st_uid = u; +} + +void +archive_entry_set_uname(struct archive_entry *entry, const char *name) +{ + 
entry->ae_uname = name; +} + diff --git a/lib/libarchive/archive_entry.h b/lib/libarchive/archive_entry.h new file mode 100644 index 0000000..e210fa7 --- /dev/null +++ b/lib/libarchive/archive_entry.h @@ -0,0 +1,111 @@ +/*- + * Copyright (c) 2003-2004 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer + * in this position and unchanged. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + * + * $FreeBSD$ + */ + +#ifndef ARCHIVE_ENTRY_H_INCLUDED +#define ARCHIVE_ENTRY_H_INCLUDED + + +#include <sys/stat.h> +#include <sys/types.h> + +/* + * Description of an archive entry. + * + * Basically, a "struct stat" with a few text fields added in. + * + * TODO: Add "comment", "charset", and possibly other entries that are + * supported by "pax interchange" format. 
However, GNU, ustar, cpio, + * and other variants don't support these features, so they're not an + * excruciatingly high priority right now. + * + * TODO: "pax interchange" format allows essentially arbitrary + * key/value attributes to be attached to any entry. Supporting + * such extensions may make this library useful for special + * applications (e.g., a package manager could attach special + * package-management attributes to each entry). + * + * TODO: Design a good API for handling sparse files. + */ +struct archive_entry; + +/* + * Basic object manipulation + */ + +struct archive_entry *archive_entry_clear(struct archive_entry *); +/* The 'clone' function does a deep copy; all of the strings are copied too. */ +struct archive_entry *archive_entry_clone(struct archive_entry *); +/* The 'dup' function does a shallow copy; referenced strings aren't copied. */ +struct archive_entry *archive_entry_dup(struct archive_entry *); +void archive_entry_free(struct archive_entry *); +struct archive_entry *archive_entry_new(void); + +/* + * Retrieve fields from an archive_entry. + */ + +const char *archive_entry_acl(struct archive_entry *); +const char *archive_entry_acl_default(struct archive_entry *); +dev_t archive_entry_devmajor(struct archive_entry *); +dev_t archive_entry_devminor(struct archive_entry *); +const char *archive_entry_fflags(struct archive_entry *); +const char *archive_entry_gname(struct archive_entry *); +const char *archive_entry_hardlink(struct archive_entry *); +mode_t archive_entry_mode(struct archive_entry *); +const char *archive_entry_pathname(struct archive_entry *); +int64_t archive_entry_size(struct archive_entry *); +const struct stat *archive_entry_stat(struct archive_entry *); +const char *archive_entry_symlink(struct archive_entry *); +int archive_entry_tartype(struct archive_entry *); +const char *archive_entry_uname(struct archive_entry *); + +/* + * Set fields in an archive_entry. 
+ * + * Note that string 'set' functions do not copy the string, only the pointer. + * In contrast, 'copy_stat' does copy the full structure. + */ + +void archive_entry_copy_stat(struct archive_entry *, const struct stat *); +void archive_entry_set_acl(struct archive_entry *, const char *); +void archive_entry_set_acl_default(struct archive_entry *, const char *); +void archive_entry_set_fflags(struct archive_entry *, const char *); +void archive_entry_set_devmajor(struct archive_entry *, dev_t); +void archive_entry_set_devminor(struct archive_entry *, dev_t); +void archive_entry_set_gid(struct archive_entry *, gid_t); +void archive_entry_set_gname(struct archive_entry *, const char *); +void archive_entry_set_hardlink(struct archive_entry *, const char *); +void archive_entry_set_mode(struct archive_entry *, mode_t); +void archive_entry_set_pathname(struct archive_entry *, const char *); +void archive_entry_set_size(struct archive_entry *, int64_t); +void archive_entry_set_symlink(struct archive_entry *, const char *); +void archive_entry_set_tartype(struct archive_entry *, char); +void archive_entry_set_uid(struct archive_entry *, uid_t); +void archive_entry_set_uname(struct archive_entry *, const char *); + +#endif /* !ARCHIVE_ENTRY_H_INCLUDED */ diff --git a/lib/libarchive/archive_private.h b/lib/libarchive/archive_private.h new file mode 100644 index 0000000..58c093b --- /dev/null +++ b/lib/libarchive/archive_private.h @@ -0,0 +1,267 @@ +/*- + * Copyright (c) 2003-2004 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer + * in this position and unchanged. + * 2. 
Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + * + * $FreeBSD$ + */ + +#ifndef ARCHIVE_PRIVATE_H_INCLUDED +#define ARCHIVE_PRIVATE_H_INCLUDED + +#include <stdint.h> + +#include "archive.h" +#include "archive_string.h" + +#define ARCHIVE_WRITE_MAGIC (0xb0c5c0deU) +#define ARCHIVE_READ_MAGIC (0xdeb0c5U) + +/* + * This is used by archive_extract to keep track of non-writable + * directories so that they can be initially restored writable, then + * fixed up at end. This also handles mtime/atime fixups. + */ +struct archive_extract_dir_entry { + struct archive_extract_dir_entry *next; + mode_t mode; + int64_t mtime; + int64_t atime; + unsigned long mtime_nanos; + unsigned long atime_nanos; + /* Note: ctime cannot be restored, so don't bother */ + char *name; +}; + +struct archive { + /* + * The magic/state values are used to sanity-check the + * client's usage. If an API function is called at a + * ridiculous time, or the client passes us an invalid + * pointer, these values allow me to catch that. 
+ */ + unsigned magic; + unsigned state; + + struct archive_entry *entry; + + /* + * Space to store per-entry strings. Most header strings are + * copied here from the format-specific header, in order to + * guarantee null-termination. Maybe these should go into + * per-format storage? + */ + struct archive_string entry_name; + struct archive_string entry_linkname; + struct archive_string entry_uname; + struct archive_string entry_gname; + + /* Utility: Pointer to a block of nulls. */ + const char *nulls; + size_t null_length; + + /* + * Used to limit reads of entry data. Eventually, each reader + * will be able to register its own read_data routine and these + * will move into the per-format data for the formats that use them. + */ + uint64_t entry_bytes_remaining; + uint64_t entry_padding; /* Skip this much after entry data. */ + + uid_t user_uid; /* UID of current user. */ + + /* Callbacks to open/read/write/close archive stream. */ + archive_open_callback *client_opener; + archive_read_callback *client_reader; + archive_write_callback *client_writer; + archive_close_callback *client_closer; + void *client_data; + + /* + * Blocking information. Note that bytes_in_last_block is + * misleadingly named; I should find a better name. These + * control the final output from all compressors, including + * compression_none. + */ + int bytes_per_block; + int bytes_in_last_block; + + /* + * These control whether data within a gzip/bzip2 compressed + * stream gets padded or not. If pad_uncompressed is set, + * the data will be padded to a full block before being + * compressed. The pad_uncompressed_byte determines the value + * that will be used for padding. Note that these have no + * effect on compression "none." + */ + int pad_uncompressed; + int pad_uncompressed_byte; /* TODO: Support this. */ + + /* + * PAX extended header data. When reading, + * name/linkname/uname/gname fields may point into here. This + * should be moved into per-format data storage.
+ */ + struct archive_string pax_header; + + /* + * GNU header fields. These should be moved into format-specific + * storage. + */ + struct archive_string gnu_name; + struct archive_string gnu_linkname; + int gnu_header_recursion_depth; + + /* Position in UNCOMPRESSED data stream. */ + intmax_t file_position; + /* File offset of beginning of most recently-read header. */ + intmax_t header_position; + + /* + * Detection functions for decompression: bid functions are + * given a block of data from the beginning of the stream and + * can bid on whether or not they support the data stream. + * General guideline: bid the number of bits that you actually + * test, e.g., 16 if you test a 2-byte magic value. The + * highest bidder will have their init function invoked, which + * can set up pointers to specific handlers. + * + * On write, the client just invokes an archive_write_set function + * which sets up the data here directly. + */ + int compression_code; /* Currently active compression. */ + const char *compression_name; + struct { + int (*bid)(const void *buff, size_t); + int (*init)(struct archive *, const void *buff, size_t); + } decompressors[4]; + /* Read/write data stream (with compression). */ + void *compression_data; /* Data for (de)compressor. */ + int (*compression_init)(struct archive *); /* Initialize. */ + int (*compression_finish)(struct archive *); + ssize_t (*compression_write)(struct archive *, const void *, size_t); + /* + * Read uses a peek/consume I/O model: the decompression code + * returns a pointer to the requested block and advances the + * file position only when requested by a consume call. This + * reduces copying and also simplifies look-ahead for format + * detection. 
+ */ + ssize_t (*compression_read_ahead)(struct archive *, + const void **, size_t request); + ssize_t (*compression_read_consume)(struct archive *, size_t); + + /* + * Format detection is mostly the same as compression + * detection, with two significant differences: The bidders + * use the read_ahead calls above to examine the stream rather + * than having the supervisor hand them a block of data to + * examine, and the auction is repeated for every header. + * Winning bidders should set the archive_format and + * archive_format_name appropriately. Bid routines should + * check archive_format and decline to bid if the format of + * the last header was incompatible. + * + * Again, write support is considerably simpler because there's + * no need for an auction. + */ + int archive_format; + const char *archive_format_name; + + struct archive_format_descriptor { + int (*bid)(struct archive *); + int (*read_header)(struct archive *, struct archive_entry *); + int (*cleanup)(struct archive *); + void *format_data; /* Format-specific data for readers. */ + } formats[4]; + struct archive_format_descriptor *format; /* Active format. */ + + /* + * Storage for format-specific data. Note that there can be + * multiple format readers active at one time, so we need to + * allow for multiple format readers to have their data + * available. The pformat_data slot here is the solution: on + * read, it's set up in the bid phase and is guaranteed to + * always point to a void* variable that the format can use. + */ + void **pformat_data; /* Pointer to current format_data. */ + void *format_data; /* Used by writers. */ + + /* + * Pointers to format-specific functions. On read, these are + * initialized in the bid process. On write, they're initialized by + * archive_write_set_format_XXX() calls. + */ + int (*format_init)(struct archive *); /* Only used on write.
*/ + int (*format_finish)(struct archive *); + int (*format_finish_entry)(struct archive *); + int (*format_write_header)(struct archive *, + struct archive_entry *); + int (*format_write_data)(struct archive *, + const void *buff, size_t); + + /* + * Various information needed by archive_extract. + */ + struct archive_string extract_mkdirpath; + struct archive_extract_dir_entry *archive_extract_dir_list; + void (*cleanup_archive_extract)(struct archive *); + + int archive_error_number; + const char *error; + struct archive_string error_string; +}; + + +/* Utility function to format a USTAR header into a buffer. */ +int +__archive_write_format_header_ustar(struct archive *, char buff[512], + struct archive_entry *); + +#define ARCHIVE_STATE_ANY 0xFFFFU +#define ARCHIVE_STATE_NEW 1U +#define ARCHIVE_STATE_HEADER 2U +#define ARCHIVE_STATE_DATA 4U +#define ARCHIVE_STATE_EOF 8U +#define ARCHIVE_STATE_CLOSED 0x10U +#define ARCHIVE_STATE_FATAL 0x8000U + +/* Check magic value and state; exit if it isn't valid. */ +void +__archive_check_magic(struct archive *, unsigned magic, + unsigned state, const char *func); + +#define archive_check_magic(a,m,s) \ + __archive_check_magic((a), (m), (s), __func__) + +int __archive_read_register_format(struct archive *a, + void *format_data, + int (*bid)(struct archive *), + int (*read_header)(struct archive *, struct archive_entry *), + int (*cleanup)(struct archive *)); + +int __archive_read_register_compression(struct archive *a, + int (*bid)(const void *, size_t), + int (*init)(struct archive *, const void *, size_t)); + +#endif diff --git a/lib/libarchive/archive_read.3 b/lib/libarchive/archive_read.3 new file mode 100644 index 0000000..c2d2c3d --- /dev/null +++ b/lib/libarchive/archive_read.3 @@ -0,0 +1,346 @@ +.\" Copyright (c) 2003-2004 Tim Kientzle +.\" All rights reserved. 
+.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. 
+.\" +.\" $FreeBSD$ +.\" +.Dd October 1, 2003 +.Dt archive_read 3 +.Os +.Sh NAME +.Nm archive_read_new , +.Nm archive_read_set_bytes_per_block , +.Nm archive_read_support_compression_none , +.Nm archive_read_support_compression_gzip , +.Nm archive_read_support_compression_bzip2 , +.Nm archive_read_support_compression_all , +.Nm archive_read_support_format_tar , +.Nm archive_read_support_format_gnutar , +.Nm archive_read_support_format_cpio , +.Nm archive_read_support_format_all , +.Nm archive_read_open , +.Nm archive_read_open_file , +.Nm archive_read_next_header , +.Nm archive_read_data , +.Nm archive_read_data_skip , +.Nm archive_read_data_into_buffer , +.Nm archive_read_data_into_fd , +.Nm archive_read_extract , +.Nm archive_read_finish +.Nd functions for reading streaming archives +.Sh SYNOPSIS +.In archive.h +.Ft struct archive * +.Fn archive_read_new "void" +.Ft int +.Fn archive_read_set_bytes_per_block "struct archive *" "int" +.Ft int +.Fn archive_read_support_compression_none "struct archive *" +.Ft int +.Fn archive_read_support_compression_gzip "struct archive *" +.Ft int +.Fn archive_read_support_compression_bzip2 "struct archive *" +.Ft int +.Fn archive_read_support_compression_all "struct archive *" +.Ft int +.Fn archive_read_support_format_tar "struct archive *" +.Ft int +.Fn archive_read_support_format_gnutar "struct archive *" +.Ft int +.Fn archive_read_support_format_cpio "struct archive *" +.Ft int +.Fn archive_read_support_format_all "struct archive *" +.Ft int +.Fn archive_read_open "struct archive *" "void *client_data" "archive_open_callback *" "archive_read_callback *" "archive_close_callback *" +.Ft int +.Fn archive_read_open_file "struct archive *" "const char *filename" +.Ft int +.Fn archive_read_next_header "struct archive *" "struct archive_entry **" +.Ft ssize_t +.Fn archive_read_data "struct archive *" "void *buff" "size_t len" +.Ft int +.Fn archive_read_data_skip "struct archive *" +.Ft ssize_t +.Fn
archive_read_data_into_buffer "struct archive *" "void *" "ssize_t len" +.Ft ssize_t +.Fn archive_read_data_into_fd "struct archive *" "int fd" +.Ft int +.Fn archive_read_extract "struct archive *" "int flags" +.Ft void +.Fn archive_read_finish "struct archive *" +.Sh DESCRIPTION +These functions provide a complete API for reading streaming archives. +The general process is to first create the +.Tn struct archive +object, set options, initialize the reader, iterate over the archive +headers and associated data, then close the archive and release all +resources. +The following summary describes the functions in approximately the +order they would be used: +.Bl -tag -compact -width indent +.It Fn archive_read_new +Allocates and initializes a +.Tn struct archive +object suitable for reading from an archive. +.It Fn archive_read_set_bytes_per_block +Sets the block size used for reading the archive data. +This controls the size that will be used when invoking the read +callback function. +The default is 20 records or 10240 bytes for tar formats. +.It Fn archive_read_support_compression_XXX +Enables auto-detection code and decompression support for the +specified compression. +Note that +.Dq none +is always enabled by default. +For convenience, +.Fn archive_read_support_compression_all +enables all available decompression code. +.It Fn archive_read_support_format_XXX +Enables support---including auto-detection code---for the +specified archive format. +In particular, +.Fn archive_read_support_format_tar +enables support for a variety of standard tar formats, including old-style tar, +ustar, pax interchange format, and many common variants. +For convenience, +.Fn archive_read_support_format_all +enables support for all available formats. +Note that there is no default. +.It Fn archive_read_open +Freeze the settings, open the archive, and prepare for reading entries. +This is the most generic version of this call, which accepts +three callback functions.
+The library invokes these client-provided functions to obtain +raw bytes from the archive. +.It Fn archive_read_open_file +Like +.Fn archive_read_open , +except that it accepts a simple filename. +A NULL filename represents standard input. +.It Fn archive_read_next_header +Read the header for the next entry and return a pointer to +a +.Tn struct archive_entry . +.It Fn archive_read_data +Read data associated with the header just read. +.It Fn archive_read_data_skip +A convenience function that repeatedly calls +.Fn archive_read_data +to skip all of the data for this archive entry. +.It Fn archive_read_data_into_buffer +A convenience function that repeatedly calls +.Fn archive_read_data +to copy the entire entry into the client-supplied buffer. +Note that the client is responsible for sizing the buffer appropriately. +.It Fn archive_read_data_into_fd +A convenience function that repeatedly calls +.Fn archive_read_data +to copy the entire entry to the provided file descriptor. +.It Fn archive_read_extract +A convenience function that recreates the specified object on +disk and reads the entry data into that object. +The +.Va flags +argument modifies how the object is recreated. +It consists of a bitwise OR of one or more of the following values: +.Bl -tag -compact -width "indent" +.It Cm ARCHIVE_EXTRACT_OWNER +The user and group IDs should be set on the restored file. +By default, the user and group IDs are not restored. +.It Cm ARCHIVE_EXTRACT_PERM +The permissions (mode bits) should be restored for all objects. +By default, permissions are only restored for regular files. +.It Cm ARCHIVE_EXTRACT_TIME +The timestamps (mtime, ctime, and atime) should be restored. +By default, they are ignored. +Note that restoring of atime is not currently supported. +.It Cm ARCHIVE_EXTRACT_NO_OVERWRITE +Existing files on disk will not be overwritten. +By default, existing files are unlinked before the new entry is written.
+.El +.It Fn archive_read_finish +Complete the archive, invoke the close callback, and release +all resources. +.El +.Pp +Note that the library determines most of the relevant information about +the archive by inspection. +In particular, it automatically detects +.Xr gzip 1 +or +.Xr bzip2 1 +compression and transparently performs the appropriate decompression. +It also automatically detects the archive format. +.Pp +The callback functions must match the following prototypes: +.Bl -item -offset indent +.It +.Ft typedef ssize_t +.Fn archive_read_callback "struct archive *" "void *client_data" "const void **buffer" +.It +.Ft typedef int +.Fn archive_open_callback "struct archive *" "void *client_data" +.It +.Ft typedef int +.Fn archive_close_callback "struct archive *" "void *client_data" +.El +These callback functions are called whenever the library requires +raw bytes from the archive. +Note that it is the client's responsibility to correctly +block the input. +.Pp +A complete description of the +.Tn struct archive +and +.Tn struct archive_entry +objects can be found in the overview manual page for +.Xr libarchive 3 . +.Sh EXAMPLE +The following illustrates basic usage of the library. In this example, +the callback functions are simply wrappers around the standard +.Xr open 2 , +.Xr read 2 , +and +.Xr close 2 +system calls. 
+.Bd -literal -offset indent +void +list_archive(const char *name) +{ + struct mydata *mydata; + struct archive *a; + struct archive_entry *entry; + + mydata = malloc(sizeof(struct mydata)); + a = archive_read_new(); + mydata->name = name; + archive_read_support_compression_all(a); + archive_read_support_format_all(a); + archive_read_open(a, mydata, myopen, myread, myclose); + while (archive_read_next_header(a, &entry) == ARCHIVE_OK) { + printf("%s\\n", archive_entry_pathname(entry)); + archive_read_data_skip(a); + } + archive_read_finish(a); + free(mydata); +} + +ssize_t +myread(struct archive *a, void *client_data, const void **buff) +{ + struct mydata *mydata = client_data; + + *buff = mydata->buff; + return (read(mydata->fd, mydata->buff, 10240)); +} + +int +myopen(struct archive *a, void *client_data) +{ + struct mydata *mydata = client_data; + + mydata->fd = open(mydata->name, O_RDONLY); + return (mydata->fd >= 0 ? ARCHIVE_OK : ARCHIVE_FATAL); +} + +int +myclose(struct archive *a, void *client_data) +{ + struct mydata *mydata = client_data; + + if (mydata->fd >= 0) + close(mydata->fd); + return (0); +} +.Ed +.Sh RETURN VALUES +Most functions return zero on success, non-zero on error. +The possible return codes include: +.Cm ARCHIVE_OK +(the operation succeeded), +.Cm ARCHIVE_WARN +(the operation succeeded but a non-critical error was encountered), +.Cm ARCHIVE_EOF +(the operation failed because end-of-archive was encountered), +.Cm ARCHIVE_RETRY +(the operation failed but can be retried), +and +.Cm ARCHIVE_FATAL +(there was a fatal error; the archive should be closed immediately). +Detailed error codes and textual descriptions are available from the +.Fn archive_errno +and +.Fn archive_error_string +functions. +.Pp +.Fn archive_read_new +returns a pointer to a freshly allocated +.Tn struct archive +object. +It returns +.Dv NULL +on error. +.Pp +.Fn archive_read_data +returns a count of bytes actually read or zero at the end of the entry.
+On error, a value of +.Cm ARCHIVE_FATAL , +.Cm ARCHIVE_WARN , +or +.Cm ARCHIVE_RETRY +is returned and an error code and textual description can be retrieved from the +.Fn archive_errno +and +.Fn archive_error_string +functions. +.Pp +The library expects the client callbacks to behave similarly. +If there is an error, you can use +.Fn archive_set_error +to set an appropriate error code and description, +then return one of the non-zero values above. +(Note that the value eventually returned to the client may +not be the same; many errors that are not critical at the level +of basic I/O can prevent the archive from being properly read, +thus most errors eventually cause +.Cm ARCHIVE_FATAL +to be returned.) +.\" .Sh ERRORS +.Sh SEE ALSO +.Xr tar 1 , +.Xr libarchive 3 , +.Xr tar 5 . +.Sh HISTORY +The +.Nm libarchive +library first appeared in +.Fx 5.3 . +.Sh AUTHORS +.An -nosplit +The +.Nm libarchive +library was written by +.An Tim Kientzle Aq kientzle@acm.org . +.Sh BUGS +The support for GNU tar formats is somewhat limited and should be improved. diff --git a/lib/libarchive/archive_read.c b/lib/libarchive/archive_read.c new file mode 100644 index 0000000..1c24d2d --- /dev/null +++ b/lib/libarchive/archive_read.c @@ -0,0 +1,505 @@ +/*- + * Copyright (c) 2003-2004 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer + * in this position and unchanged. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution.
+ * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +/* + * This file contains the "essential" portions of the read API, that + * is, stuff that will probably always be used by any client that + * actually needs to read an archive. Optional pieces have been, as + * far as possible, separated out into separate files to avoid + * needlessly bloating statically-linked clients. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#ifdef DMALLOC +#include <dmalloc.h> +#endif +#include <err.h> +#include <errno.h> +#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include <unistd.h> + +#include "archive.h" +#include "archive_entry.h" +#include "archive_private.h" + +static int choose_decompressor(struct archive *, const void*, size_t); +static int choose_format(struct archive *); + +/* + * Allocate, initialize and return a struct archive object. 
+ */ +struct archive * +archive_read_new(void) +{ + struct archive *a; + char *nulls; + + a = malloc(sizeof(*a)); + memset(a, 0, sizeof(*a)); + + a->user_uid = geteuid(); + a->magic = ARCHIVE_READ_MAGIC; + a->bytes_per_block = ARCHIVE_DEFAULT_BYTES_PER_BLOCK; + + a->null_length = 1024; + nulls = malloc(a->null_length); + memset(nulls, 0, a->null_length); + a->nulls = nulls; + + a->state = ARCHIVE_STATE_NEW; + a->entry = archive_entry_new(); + + /* We always support uncompressed archives. */ + archive_read_support_compression_none((struct archive*)a); + + return (a); +} + +/* + * Set the block size. + */ +/* +int +archive_read_set_bytes_per_block(struct archive *a, int bytes_per_block) +{ + archive_check_magic(a, ARCHIVE_READ_MAGIC, ARCHIVE_STATE_NEW); + if (bytes_per_block < 1) + bytes_per_block = 1; + a->bytes_per_block = bytes_per_block; + return (0); +} +*/ + +/* + * Open the archive + */ +int +archive_read_open(struct archive *a, void *client_data, + archive_open_callback *opener, archive_read_callback *reader, + archive_close_callback *closer) +{ + const void *buffer; + size_t bytes_read; + int high_bidder; + int e; + + archive_check_magic(a, ARCHIVE_READ_MAGIC, ARCHIVE_STATE_NEW); + + if (reader == NULL) + errx(1, + "Fatal: No reader function provided to archive_read_open"); + + a->client_reader = reader; + a->client_opener = opener; + a->client_closer = closer; + a->client_data = client_data; + + a->state = ARCHIVE_STATE_HEADER; + + /* Open data source. */ + if (a->client_opener != NULL) { + e =(a->client_opener)(a, a->client_data); + if (e != 0) + return (e); + } + + /* Read first block now for format detection. */ + bytes_read = (a->client_reader)(a, a->client_data, &buffer); + + /* Select a decompression routine. */ + high_bidder = choose_decompressor(a, buffer, bytes_read); + if (high_bidder < 0) + return (ARCHIVE_FATAL); + + /* Initialize decompression routine with the first block of data. 
*/ + e = (a->decompressors[high_bidder].init)(a, buffer, bytes_read); + return (e); +} + +/* + * Allow each registered decompression routine to bid on whether it + * wants to handle this stream. Return index of winning bidder. + */ +static int +choose_decompressor(struct archive *a, const void *buffer, size_t bytes_read) +{ + int decompression_slots, i, bid, best_bid, best_bid_slot; + + decompression_slots = sizeof(a->decompressors) / + sizeof(a->decompressors[0]); + + best_bid = -1; + best_bid_slot = -1; + + for (i = 0; i < decompression_slots; i++) { + if (a->decompressors[i].bid) { + bid = (a->decompressors[i].bid)(buffer, bytes_read); + if ((bid > best_bid) || (best_bid_slot < 0)) { + best_bid = bid; + best_bid_slot = i; + } + } + } + + /* + * There were no bidders; this is a serious programmer error + * and demands a quick and definitive abort. + */ + if (best_bid_slot < 0) + errx(1, "Fatal: No decompressors were registered; you " + "must call at least one " + "archive_read_support_compression_XXX function in order " + "to successfully read an archive."); + + /* + * There were bidders, but no non-zero bids; this means we can't + * support this stream. + */ + if (best_bid < 1) { + archive_set_error(a, EFTYPE, "Unrecognized archive format"); + /* EFTYPE == "Inappropriate file type or format" */ + return (ARCHIVE_FATAL); + } + + return (best_bid_slot); +} + +/* + * Read header of next entry. + */ +int +archive_read_next_header(struct archive *a, struct archive_entry **entryp) +{ + struct archive_entry *entry; + int slot, ret; + + archive_check_magic(a, ARCHIVE_READ_MAGIC, + ARCHIVE_STATE_HEADER | ARCHIVE_STATE_DATA); + + *entryp = NULL; + entry = a->entry; + archive_entry_clear(entry); + + /* + * If client didn't consume entire data, skip any remainder + * (This is especially important for GNU incremental directories.) 
+ */ + if (a->state == ARCHIVE_STATE_DATA) { + ret = archive_read_data_skip(a); + if (ret == ARCHIVE_EOF) { + archive_set_error(a, EIO, "Premature end-of-file."); + a->state = ARCHIVE_STATE_FATAL; + return (ARCHIVE_FATAL); + } + } + + /* Record start-of-header. */ + a->header_position = a->file_position; + + slot = choose_format(a); + if (slot < 0) { + a->state = ARCHIVE_STATE_FATAL; + return (ARCHIVE_FATAL); + } + a->format = &(a->formats[slot]); + a->pformat_data = &(a->format->format_data); + ret = (a->format->read_header)(a, entry); + + /* + * EOF and FATAL are persistent at this layer. By + * modifying the state, we guarantee that future calls to + * read a header or read data will fail. + */ + switch (ret) { + case ARCHIVE_EOF: + a->state = ARCHIVE_STATE_EOF; + break; + case ARCHIVE_OK: + a->state = ARCHIVE_STATE_DATA; + break; + case ARCHIVE_WARN: + a->state = ARCHIVE_STATE_DATA; + break; + case ARCHIVE_RETRY: + break; + case ARCHIVE_FATAL: + a->state = ARCHIVE_STATE_FATAL; + break; + } + + *entryp = entry; + return (ret); +} + +/* + * Allow each registered format to bid on whether it wants to handle + * the next entry. Return index of winning bidder. + */ +static int +choose_format(struct archive *a) +{ + int slots; + int i; + int bid, best_bid; + int best_bid_slot; + + slots = sizeof(a->formats) / sizeof(a->formats[0]); + best_bid = -1; + best_bid_slot = -1; + + /* Set up a->format and a->pformat_data for convenience of bidders. */ + a->format = &(a->formats[0]); + for (i = 0; i < slots; i++, a->format++) { + if (a->format->bid) { + a->pformat_data = &(a->format->format_data); + bid = (a->format->bid)(a); + if (bid == ARCHIVE_FATAL) + return (ARCHIVE_FATAL); + if ((bid > best_bid) || (best_bid_slot < 0)) { + best_bid = bid; + best_bid_slot = i; + } + } + } + + /* + * There were no bidders; this is a serious programmer error + * and demands a quick and definitive abort.
+ */ + if (best_bid_slot < 0) + errx(1, "Fatal: No formats were registered; you must " + "invoke at least one archive_read_support_format_XXX " + "function in order to successfully read an archive."); + + /* + * There were bidders, but no non-zero bids; this means we + * can't support this stream. + */ + if (best_bid < 1) { + archive_set_error(a, EFTYPE, "Unrecognized archive format"); + return (ARCHIVE_FATAL); + } + + return (best_bid_slot); +} + +/* + * Return the file offset (within the uncompressed data stream) where + * the last header started. + */ +int64_t +archive_read_header_position(struct archive *a) +{ + return (a->header_position); +} + +/* + * Read data from an archive entry. + */ +ssize_t +archive_read_data(struct archive *a, void *buff, size_t s) +{ + const void *data; + ssize_t bytes_read; + + archive_check_magic(a, ARCHIVE_READ_MAGIC, ARCHIVE_STATE_DATA); + if (s > a->entry_bytes_remaining) + s = a->entry_bytes_remaining; + if (s > 0) { + bytes_read = (a->compression_read_ahead)(a, &data, 1); + if (bytes_read < 0) { + a->state = ARCHIVE_STATE_FATAL; + return (bytes_read); + } + if ((size_t)bytes_read > s) + bytes_read = s; + } else + bytes_read = 0; + + if (bytes_read > 0) { + memcpy(buff, data, bytes_read); + (a->compression_read_consume)(a, bytes_read); + } + + a->entry_bytes_remaining -= bytes_read; + return (bytes_read); +} + +/* + * Skip over all remaining data in this entry. + */ +int +archive_read_data_skip(struct archive *a) +{ + const void *buff; + ssize_t bytes_read, to_skip; + + archive_check_magic(a, ARCHIVE_READ_MAGIC, ARCHIVE_STATE_DATA); + + to_skip = a->entry_bytes_remaining + a->entry_padding; + a->entry_bytes_remaining = 0; + + for (; to_skip > 0; to_skip -= bytes_read) { + /* TODO: Optimize skip in compression layer. 
*/ + bytes_read = (a->compression_read_ahead)(a, &buff, to_skip); + if (bytes_read < 0) { + a->entry_padding = to_skip; + return (ARCHIVE_FATAL); + } + if (bytes_read == 0) { + archive_set_error(a, 0, + "Premature end of archive entry"); + return (ARCHIVE_FATAL); + } + if (bytes_read > to_skip) + bytes_read = to_skip; + (a->compression_read_consume)(a, bytes_read); + } + a->entry_padding = 0; + a->state = ARCHIVE_STATE_HEADER; + return (ARCHIVE_OK); +} + +/* + * Cleanup and free the archive object. + * + * Be careful: client might just call read_new and then read_finish. + * Don't assume we actually read anything or performed any non-trivial + * initialization. + */ +void +archive_read_finish(struct archive *a) +{ + archive_check_magic(a, ARCHIVE_READ_MAGIC, ARCHIVE_STATE_ANY); + a->state = ARCHIVE_STATE_CLOSED; + + /* Call cleanup functions registered by optional components. */ + if (a->cleanup_archive_extract != NULL) + (a->cleanup_archive_extract)(a); + + /* TODO: Finish the format processing. */ + + /* Close the input machinery. */ + if (a->compression_finish != NULL) + (a->compression_finish)(a); + + /*- + * Release allocated strings. + * + * TODO: Add a "cleanup" column to the "formats" array and + * use that to cleanup format-specific data. E.g., + * + * for (i=0; i< slots; i++) { + * if (a->formats[i].cleanup) + * (a->formats[i].cleanup)(a); + * } + */ + /* Casting a pointer to int allows us to remove 'const.' 
*/ + free((void *)(uintptr_t)(const void *)a->nulls); + if (a->entry_name.s != NULL) + free(a->entry_name.s); + if (a->entry_linkname.s != NULL) + free(a->entry_linkname.s); + if (a->entry_uname.s != NULL) + free(a->entry_uname.s); + if (a->entry_gname.s != NULL) + free(a->entry_gname.s); + if (a->pax_header.s != NULL) + free(a->pax_header.s); + if (a->gnu_name.s != NULL) + free(a->gnu_name.s); + if (a->gnu_linkname.s != NULL) + free(a->gnu_linkname.s); + if (a->extract_mkdirpath.s != NULL) + free(a->extract_mkdirpath.s); + if (a->entry) + archive_entry_free(a->entry); + free(a); +} + +/* + * Used internally by read format handlers to register their bid and + * initialization functions. + */ +int +__archive_read_register_format(struct archive *a, + void *format_data, + int (*bid)(struct archive *), + int (*read_header)(struct archive *, struct archive_entry *), + int (*cleanup)(struct archive *)) +{ + int i, number_slots; + + archive_check_magic(a, ARCHIVE_READ_MAGIC, ARCHIVE_STATE_NEW); + + number_slots = sizeof(a->formats) / sizeof(a->formats[0]); + + for (i = 0; i < number_slots; i++) { + if (a->formats[i].bid == bid) + return (0); /* We've already installed */ + if (a->formats[i].bid == NULL) { + a->formats[i].bid = bid; + a->formats[i].read_header = read_header; + a->formats[i].cleanup = cleanup; + a->formats[i].format_data = format_data; + return (ARCHIVE_OK); + } + } + + errx(1, "Fatal: Not enough slots for format registration"); +} + +/* + * Used internally by decompression routines to register their bid and + * initialization functions. 
+ */ +int +__archive_read_register_compression(struct archive *a, + int (*bid)(const void *, size_t), + int (*init)(struct archive *, const void *, size_t)) +{ + int i, number_slots; + + archive_check_magic(a, ARCHIVE_READ_MAGIC, ARCHIVE_STATE_NEW); + + number_slots = sizeof(a->decompressors) / sizeof(a->decompressors[0]); + + for (i = 0; i < number_slots; i++) { + if (a->decompressors[i].bid == bid) + return (ARCHIVE_OK); /* We've already installed */ + if (a->decompressors[i].bid == NULL) { + a->decompressors[i].bid = bid; + a->decompressors[i].init = init; + return (ARCHIVE_OK); + } + } + + errx(1, "Fatal: Not enough slots for compression registration"); +} diff --git a/lib/libarchive/archive_read_data_into_buffer.c b/lib/libarchive/archive_read_data_into_buffer.c new file mode 100644 index 0000000..988a2bf --- /dev/null +++ b/lib/libarchive/archive_read_data_into_buffer.c @@ -0,0 +1,52 @@ +/*- + * Copyright (c) 2003-2004 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer + * in this position and unchanged. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
+ * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#ifdef DMALLOC +#include <dmalloc.h> +#endif +#include <string.h> + +#include "archive.h" + +ssize_t +archive_read_data_into_buffer(struct archive *a, void *d, ssize_t len) +{ + char *dest; + ssize_t bytes_read, total_bytes; + + dest = d; + total_bytes = 0; + bytes_read = archive_read_data(a, dest, len); + while (bytes_read > 0) { + total_bytes += bytes_read; + bytes_read = archive_read_data(a, dest + total_bytes, + len - total_bytes); + } + return (total_bytes); +} diff --git a/lib/libarchive/archive_read_data_into_fd.c b/lib/libarchive/archive_read_data_into_fd.c new file mode 100644 index 0000000..7dad233 --- /dev/null +++ b/lib/libarchive/archive_read_data_into_fd.c @@ -0,0 +1,64 @@ +/*- + * Copyright (c) 2003-2004 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer + * in this position and unchanged. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. 
+ * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#ifdef DMALLOC +#include <dmalloc.h> +#endif +#include <unistd.h> + +#include "archive.h" +#include "archive_private.h" + +/* + * This implementation minimizes copying of data. + */ +ssize_t +archive_read_data_into_fd(struct archive *a, int fd) +{ + ssize_t bytes_read, bytes_written, total_written; + const void *buff; + + total_written = 0; + while (a->entry_bytes_remaining > 0) { + bytes_read = (a->compression_read_ahead)(a, &buff, + a->entry_bytes_remaining); + if (bytes_read < 0) + return (-1); + if ((size_t)bytes_read > a->entry_bytes_remaining) + bytes_read = (ssize_t)a->entry_bytes_remaining; + + bytes_written = write(fd, buff, bytes_read); + if (bytes_written < 0) + return (-1); + (a->compression_read_consume)(a, bytes_written); + total_written += bytes_written; + a->entry_bytes_remaining -= bytes_written; + } + return (total_written); +} diff --git a/lib/libarchive/archive_read_extract.c b/lib/libarchive/archive_read_extract.c new file mode 100644 index 0000000..328010b --- /dev/null +++ b/lib/libarchive/archive_read_extract.c @@ -0,0 +1,754 @@ +/*- + * Copyright (c) 2003-2004 Tim Kientzle + * All rights reserved. 
+ * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer + * in this position and unchanged. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <sys/stat.h> +#include <sys/acl.h> +#include <sys/time.h> +#include <sys/types.h> + +#ifdef DMALLOC +#include <dmalloc.h> +#endif +#include <errno.h> +#include <fcntl.h> +#include <grp.h> +#include <limits.h> +#include <pwd.h> +#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include <tar.h> +#include <unistd.h> + +#include "archive.h" +#include "archive_string.h" +#include "archive_entry.h" +#include "archive_private.h" + +static void archive_extract_cleanup(struct archive *); +static int archive_read_extract_block_device(struct archive *, + struct archive_entry *, int); +static int archive_read_extract_char_device(struct archive *, + struct archive_entry *, int); +static int archive_read_extract_device(struct archive *, + struct archive_entry *, int flags, mode_t mode); +static int archive_read_extract_dir(struct archive *, + struct archive_entry *, int); +static int archive_read_extract_dir_create(struct archive *, + const char *name, int mode, int flags); +static int archive_read_extract_fifo(struct archive *, + struct archive_entry *, int); +static int archive_read_extract_hard_link(struct archive *, + struct archive_entry *, int); +static int archive_read_extract_regular(struct archive *, + struct archive_entry *, int); +static int archive_read_extract_regular_open(struct archive *, + const char *name, int mode, int flags); +static int archive_read_extract_symbolic_link(struct archive *, + struct archive_entry *, int); +static int mkdirpath(struct archive *, const char *); +static int mkdirpath_recursive(char *path); +static int mksubdir(char *path); +static int set_acls(struct archive *, struct archive_entry *); +static int set_extended_perm(struct archive *, struct archive_entry *, + int flags); +static int set_fflags(struct archive *, struct archive_entry *); +static int set_ownership(struct archive *, struct archive_entry *, int); +static int set_perm(struct archive *, struct 
archive_entry *, int mode, int flags); +static int set_time(struct archive *, struct archive_entry *, int); + +/* + * Extract this entry to disk. + * + * TODO: Validate hardlinks. Is there any way to validate hardlinks + * without keeping a complete list of filenames from the entire archive?? Ugh. + * + */ +int +archive_read_extract(struct archive *a, struct archive_entry *entry, int flags) +{ + mode_t writable_mode; + struct archive_extract_dir_entry *le; + int ret; + int restore_pwd; + + restore_pwd = -1; + if (S_ISDIR(archive_entry_stat(entry)->st_mode)) { + /* + * TODO: Does this really work under all conditions? + * + * E.g., root restores a dir owned by someone else? + */ + writable_mode = archive_entry_stat(entry)->st_mode | 0700; + + /* + * If this dir isn't writable, restore it with write + * permissions and add it to the fixup list for later + * handling. + */ + if (archive_entry_stat(entry)->st_mode != writable_mode) { + le = malloc(sizeof(struct archive_extract_dir_entry)); + le->next = a->archive_extract_dir_list; + a->archive_extract_dir_list = le; + le->mode = archive_entry_stat(entry)->st_mode; + le->name = + malloc(strlen(archive_entry_pathname(entry)) + 1); + strcpy(le->name, archive_entry_pathname(entry)); + a->cleanup_archive_extract = archive_extract_cleanup; + /* Make sure I can write to this directory. */ + archive_entry_set_mode(entry, writable_mode); + } + } + + if (archive_entry_hardlink(entry) != NULL) + return (archive_read_extract_hard_link(a, entry, flags)); + + /* + * TODO: If pathname is longer than PATH_MAX, record starting + * directory and move to a suitable intermediate dir, which + * might require creating them! + */ + if (strlen(archive_entry_pathname(entry)) > PATH_MAX) { + restore_pwd = open(".", O_RDONLY); + /* XXX chdir() to a suitable intermediate dir XXX */ + /* XXX Update pathname in 'entry' XXX */ + } + + switch (archive_entry_stat(entry)->st_mode & S_IFMT) { + default: + /* Fall through, as required by POSIX. 
*/ + case S_IFREG: + ret = archive_read_extract_regular(a, entry, flags); + break; + case S_IFLNK: /* Symlink */ + ret = archive_read_extract_symbolic_link(a, entry, flags); + break; + case S_IFCHR: + ret = archive_read_extract_char_device(a, entry, flags); + break; + case S_IFBLK: + ret = archive_read_extract_block_device(a, entry, flags); + break; + case S_IFDIR: + ret = archive_read_extract_dir(a, entry, flags); + break; + case S_IFIFO: + ret = archive_read_extract_fifo(a, entry, flags); + break; + } + + /* If we changed directory above, restore it here. */ + if (restore_pwd >= 0) + fchdir(restore_pwd); + + return (ret); +} + +/* + * Cleanup function for archive_extract. Free name/mode list and + * restore permissions. + * + * TODO: Restore times here as well. + * + * Registering this function (rather than calling it explicitly by + * name from archive_read_finish) reduces link pollution, since + * applications that don't use this API won't get this file linked in. + */ +static +void archive_extract_cleanup(struct archive *a) +{ + struct archive_extract_dir_entry *lp; + + /* + * TODO: Does dir list need to be sorted so permissions are restored + * depth-first? + */ + while (a->archive_extract_dir_list) { + lp = a->archive_extract_dir_list->next; + chmod(a->archive_extract_dir_list->name, + a->archive_extract_dir_list->mode); + /* + * TODO: Consider using this hook to restore dir + * timestamps as well. However, dir timestamps don't + * really matter, and it would be a memory issue to + * record timestamps for every directory + * extracted... Ugh. 
+ */ + if (a->archive_extract_dir_list->name) + free(a->archive_extract_dir_list->name); + free(a->archive_extract_dir_list); + a->archive_extract_dir_list = lp; + } +} + +static int +archive_read_extract_regular(struct archive *a, struct archive_entry *entry, + int flags) +{ + int fd, r; + ssize_t s; + + r = ARCHIVE_OK; + fd = archive_read_extract_regular_open(a, + archive_entry_pathname(entry), archive_entry_stat(entry)->st_mode, + flags); + if (fd < 0) { + archive_set_error(a, errno, "Can't open"); + return (ARCHIVE_WARN); + } + s = archive_read_data_into_fd(a, fd); + if (s < archive_entry_size(entry)) { + /* Didn't read enough data? Complain but keep going. */ + archive_set_error(a, EIO, "Archive data truncated"); + r = ARCHIVE_WARN; + } + set_ownership(a, entry, flags); + set_time(a, entry, flags); + /* set_perm(a, entry, mode, flags); */ /* Handled implicitly by open.*/ + set_extended_perm(a, entry, flags); + close(fd); + return (r); +} + +/* + * Keep trying until we either open the file or run out of tricks. + * + * Note: the GNU tar 'unlink first' option seems redundant + * with this strategy, since we never actually write over an + * existing file. (If it already exists, we remove it.) + */ +static int +archive_read_extract_regular_open(struct archive *a, + const char *name, int mode, int flags) +{ + int fd; + + fd = open(name, O_WRONLY | O_CREAT | O_EXCL, mode); + if (fd >= 0) + return (fd); + + /* Try removing a pre-existing file. */ + if (!(flags & ARCHIVE_EXTRACT_NO_OVERWRITE)) { + unlink(name); + fd = open(name, O_WRONLY | O_CREAT | O_EXCL, mode); + if (fd >= 0) + return (fd); + } + + /* Might be a non-existent parent dir; try fixing that. 
*/ + mkdirpath(a, name); + fd = open(name, O_WRONLY | O_CREAT | O_EXCL, mode); + if (fd >= 0) + return (fd); + + return (-1); +} + +static int +archive_read_extract_dir(struct archive *a, struct archive_entry *entry, + int flags) +{ + int mode, ret, ret2; + + mode = archive_entry_stat(entry)->st_mode; + + if (archive_read_extract_dir_create(a, archive_entry_pathname(entry), + mode, flags)) { + /* Unable to create directory; just use the existing dir. */ + return (ARCHIVE_WARN); + } + + set_ownership(a, entry, flags); + /* + * There is no point in setting the time here. + * + * Note that future extracts into this directory will reset + * the times, so to get correct results, the client has to + * track timestamps for directories and update them at the end + * of the run anyway. + */ + /* set_time(t, flags); */ + + /* + * This next line may appear redundant, but it's not. If the + * directory already exists, it won't get re-created by + * mkdir(), so we have to manually set permissions to get + * everything right. + */ + ret = set_perm(a, entry, mode, flags); + ret2 = set_extended_perm(a, entry, flags); + + /* XXXX TODO: Fix this to work the right way. XXXX */ + if (ret == ARCHIVE_OK) + return (ret2); + else + return (ret); +} + +/* + * Create the directory: try until something works or we run out of magic. + */ +static int +archive_read_extract_dir_create(struct archive *a, const char *name, int mode, + int flags) +{ + /* Don't try to create '.' */ + if (name[0] == '.' && name[1] == 0) + return (ARCHIVE_OK); + if (mkdir(name, mode) == 0) + return (ARCHIVE_OK); + if (errno == ENOENT) { /* Missing parent directory. */ + mkdirpath(a, name); + if (mkdir(name, mode) == 0) + return (ARCHIVE_OK); + } + + if (errno != EEXIST) + return (ARCHIVE_WARN); + if ((flags & ARCHIVE_EXTRACT_NO_OVERWRITE)) { + archive_set_error(a, EEXIST, "Directory already exists"); + return (ARCHIVE_WARN); + } + + /* Could be a file; try unlinking. 
*/ + if (unlink(name) == 0 && + mkdir(name, mode) == 0) + return (ARCHIVE_OK); + + /* Unlink failed. It's okay if it failed because it's already a dir. */ + if (errno != EPERM) { + archive_set_error(a, errno, "Couldn't create dir"); + return (ARCHIVE_WARN); + } + + /* Try removing the directory and recreating it from scratch. */ + if (rmdir(name)) { + /* Failure to remove a non-empty directory is not a problem. */ + if (errno == ENOTEMPTY) + return (ARCHIVE_OK); + /* Any other failure is a problem. */ + archive_set_error(a, errno, + "Error attempting to remove existing directory"); + return (ARCHIVE_WARN); + } + + /* We successfully removed the directory; now recreate it. */ + if (mkdir(name, mode) == 0) + return (ARCHIVE_OK); + + archive_set_error(a, errno, "Failed to create dir"); + return (ARCHIVE_WARN); +} + +static int +archive_read_extract_hard_link(struct archive *a, struct archive_entry *entry, + int flags) +{ + /* Just remove any pre-existing file with this name. */ + if (!(flags & ARCHIVE_EXTRACT_NO_OVERWRITE)) + unlink(archive_entry_pathname(entry)); + + if (link(archive_entry_hardlink(entry), + archive_entry_pathname(entry))) { + archive_set_error(a, errno, "Can't restore hardlink"); + return (ARCHIVE_WARN); + } + + /* Set ownership, time, permission information. */ + set_ownership(a, entry, flags); + set_time(a, entry, flags); + set_perm(a, entry, archive_entry_stat(entry)->st_mode, flags); + set_extended_perm(a, entry, flags); + + return (ARCHIVE_OK); +} + +static int +archive_read_extract_symbolic_link(struct archive *a, + struct archive_entry *entry, int flags) +{ + /* Just remove any pre-existing file with this name. 
*/ + if (!(flags & ARCHIVE_EXTRACT_NO_OVERWRITE)) + unlink(archive_entry_pathname(entry)); + + if (symlink(archive_entry_symlink(entry), + archive_entry_pathname(entry))) { + /* XXX Better error message here XXX */ + archive_set_error(a, errno, "Can't restore symlink to '%s'", + archive_entry_symlink(entry)); + return (ARCHIVE_WARN); + } + + /* Set ownership, time, permission information. */ + set_ownership(a, entry, flags); + set_time(a, entry, flags); + set_perm(a, entry, archive_entry_stat(entry)->st_mode, flags); + set_extended_perm(a, entry, flags); + + return (ARCHIVE_OK); +} + +static int +archive_read_extract_device(struct archive *a, struct archive_entry *entry, + int flags, mode_t mode) +{ + int r; + + /* Just remove any pre-existing file with this name. */ + if (!(flags & ARCHIVE_EXTRACT_NO_OVERWRITE)) + unlink(archive_entry_pathname(entry)); + + r = mknod(archive_entry_pathname(entry), mode, + archive_entry_stat(entry)->st_rdev); + + /* Might be a non-existent parent dir; try fixing that. */ + if (r != 0 && errno == ENOENT) { + mkdirpath(a, archive_entry_pathname(entry)); + r = mknod(archive_entry_pathname(entry), mode, + archive_entry_stat(entry)->st_rdev); + } + + if (r != 0) { + archive_set_error(a, errno, "Can't recreate device node"); + return (ARCHIVE_WARN); + } + + /* Set ownership, time, permission information. 
*/ + set_ownership(a, entry, flags); + set_time(a, entry, flags); + set_perm(a, entry, archive_entry_stat(entry)->st_mode, flags); + set_extended_perm(a, entry, flags); + + return (ARCHIVE_OK); +} + +static int +archive_read_extract_char_device(struct archive *a, + struct archive_entry *entry, int flags) +{ + mode_t mode; + + mode = (archive_entry_stat(entry)->st_mode & ~S_IFMT) | S_IFCHR; + return (archive_read_extract_device(a, entry, flags, mode)); +} + +static int +archive_read_extract_block_device(struct archive *a, + struct archive_entry *entry, int flags) +{ + mode_t mode; + + mode = (archive_entry_stat(entry)->st_mode & ~S_IFMT) | S_IFBLK; + return (archive_read_extract_device(a, entry, flags, mode)); +} + +static int +archive_read_extract_fifo(struct archive *a, + struct archive_entry *entry, int flags) +{ + int r; + + /* Just remove any pre-existing file with this name. */ + if (!(flags & ARCHIVE_EXTRACT_NO_OVERWRITE)) + unlink(archive_entry_pathname(entry)); + + r = mkfifo(archive_entry_pathname(entry), + archive_entry_stat(entry)->st_mode); + + /* Might be a non-existent parent dir; try fixing that. */ + if (r != 0 && errno == ENOENT) { + mkdirpath(a, archive_entry_pathname(entry)); + r = mkfifo(archive_entry_pathname(entry), + archive_entry_stat(entry)->st_mode); + } + + if (r != 0) { + archive_set_error(a, errno, "Can't restore fifo"); + return (ARCHIVE_WARN); + } + + /* Set ownership, time, permission information. */ + set_ownership(a, entry, flags); + set_time(a, entry, flags); + /* Done by mkfifo. */ + /* set_perm(a, entry, archive_entry_stat(entry)->st_mode, flags); */ + set_extended_perm(a, entry, flags); + + return (ARCHIVE_OK); +} + +/* + * Returns 0 if it successfully created necessary directories. + * Otherwise, returns ARCHIVE_WARN. + */ + +static int +mkdirpath(struct archive *a, const char *path) +{ + char *p; + + /* Copy path to mutable storage, then call mkdirpath_recursive. 
*/ + archive_strcpy(&(a->extract_mkdirpath), path); + /* Prune a trailing '/' character. */ + p = a->extract_mkdirpath.s; + if (p[strlen(p)-1] == '/') + p[strlen(p)-1] = 0; + /* Recursively try to build the path. */ + return (mkdirpath_recursive(p)); +} + +/* + * For efficiency, just try creating longest path first (usually, + * archives walk through directories in a reasonable order). If that + * fails, prune the last element and recursively try again. + */ +static int +mkdirpath_recursive(char *path) +{ + char * p; + int r; + + p = strrchr(path, '/'); + if (!p) return (0); + + *p = 0; /* Terminate path name. */ + r = mksubdir(path); /* Try building path. */ + *p = '/'; /* Restore the '/' we just overwrote. */ + return (r); +} + +static int +mksubdir(char *path) +{ + int mode = 0755; + + if (mkdir(path, mode) == 0) return (0); + + if (errno == EEXIST) /* TODO: stat() here to verify it is dir */ + return (0); + if (mkdirpath_recursive(path)) + return (ARCHIVE_WARN); + if (mkdir(path, mode) == 0) + return (0); + return (ARCHIVE_WARN); /* Still failed. Harumph. */ +} + +/* + * Note that I only inspect entry->ae_uid and entry->ae_gid here; if + * the client wants POSIX compat, they'll need to do uname/gname + * lookups themselves. I don't do it here because of the potential + * performance issues: if uname/gname lookup is expensive, then the + * results should be aggressively cached; if they're cheap, then we + * shouldn't waste memory on cache tables. + * + * Returns 0 if UID/GID successfully restored; ARCHIVE_WARN otherwise. + */ +static int +set_ownership(struct archive *a, struct archive_entry *entry, int flags) +{ + /* If UID/GID are already correct, return 0. */ + /* TODO: Fix this; need to stat() to find on-disk GID <sigh> */ + if (a->user_uid == archive_entry_stat(entry)->st_uid) + return (0); + + /* Not changed. 
*/ + if ((flags & ARCHIVE_EXTRACT_OWNER) == 0) + return (ARCHIVE_WARN); + + /* + * Root can change owner/group; owner can change group; + * otherwise, bail out now. + */ + if ((a->user_uid != 0) + && (a->user_uid != archive_entry_stat(entry)->st_uid)) + return (ARCHIVE_WARN); + + if (lchown(archive_entry_pathname(entry), + archive_entry_stat(entry)->st_uid, + archive_entry_stat(entry)->st_gid)) { + archive_set_error(a, errno, + "Can't set user=%d/group=%d for %s", + archive_entry_stat(entry)->st_uid, + archive_entry_stat(entry)->st_gid, + archive_entry_pathname(entry)); + return (ARCHIVE_WARN); + } + return (ARCHIVE_OK); +} + +static int +set_time(struct archive *a, struct archive_entry *entry, int flags) +{ + const struct stat *st; + struct timeval times[2]; + + (void)a; /* UNUSED */ + st = archive_entry_stat(entry); + + if ((flags & ARCHIVE_EXTRACT_TIME) == 0) + return (ARCHIVE_OK); + + times[1].tv_sec = st->st_mtime; + times[1].tv_usec = st->st_mtimespec.tv_nsec / 1000; + + times[0].tv_sec = st->st_atime; + times[0].tv_usec = st->st_atimespec.tv_nsec / 1000; + + if (lutimes(archive_entry_pathname(entry), times) != 0) { + archive_set_error(a, errno, "Can't update time for %s", + archive_entry_pathname(entry)); + return (ARCHIVE_WARN); + } + + /* + * Note: POSIX does not provide a portable way to restore ctime. + * So, any restoration of ctime will necessarily be OS-specific. + */ + + /* TODO: Can FreeBSD restore ctime? 
*/ + + return (ARCHIVE_OK); +} + +static int +set_perm(struct archive *a, struct archive_entry *entry, int mode, int flags) +{ + const char *name; + + if ((flags & ARCHIVE_EXTRACT_PERM) == 0) + return (ARCHIVE_OK); + + name = archive_entry_pathname(entry); + if (lchmod(name, mode) != 0) { + archive_set_error(a, errno, "Can't set permissions"); + return (ARCHIVE_WARN); + } + return (0); +} + +static int +set_extended_perm(struct archive *a, struct archive_entry *entry, int flags) +{ + int ret; + + if ((flags & ARCHIVE_EXTRACT_PERM) == 0) + return (ARCHIVE_OK); + + ret = set_fflags(a, entry); + if (ret == ARCHIVE_OK) + ret = set_acls(a, entry); + return (ret); +} + +static int +set_fflags(struct archive *a, struct archive_entry *entry) +{ + char *fflags; + const char *fflagsc; + char *fflags_p; + const char *name; + int ret; + unsigned long set, clear; + struct stat st; + + name = archive_entry_pathname(entry); + + ret = ARCHIVE_OK; + fflagsc = archive_entry_fflags(entry); + if (fflagsc == NULL) + return (ARCHIVE_OK); + + fflags = strdup(fflagsc); + if (fflags == NULL) + return (ARCHIVE_WARN); + + fflags_p = fflags; + if (strtofflags(&fflags_p, &set, &clear) != 0 && + stat(name, &st) == 0) { + st.st_flags &= ~clear; + st.st_flags |= set; + if (chflags(name, st.st_flags) != 0) { + archive_set_error(a, errno, + "Failed to set file flags"); + ret = ARCHIVE_WARN; + } + } + free(fflags); + return (ret); +} + +/* + * XXX TODO: What about ACL types other than ACCESS and DEFAULT? 
+ */ +static int +set_acls(struct archive *a, struct archive_entry *entry) +{ + const char *acldesc; + acl_t acl; + const char *name; + int ret; + + ret = ARCHIVE_OK; + name = archive_entry_pathname(entry); + acldesc = archive_entry_acl(entry); + if (acldesc != NULL) { + acl = acl_from_text(acldesc); + if (acl == NULL) { + archive_set_error(a, errno, "Error parsing acl '%s'", + acldesc); + ret = ARCHIVE_WARN; + } else { + if (acl_set_file(name, ACL_TYPE_ACCESS, acl) != 0) { + archive_set_error(a, errno, + "Failed to set acl"); + ret = ARCHIVE_WARN; + } + acl_free(acl); + } + } + + acldesc = archive_entry_acl_default(entry); + if (acldesc != NULL) { + acl = acl_from_text(acldesc); + if (acl == NULL) { + archive_set_error(a, errno, "error parsing acl '%s'", + acldesc); + ret = ARCHIVE_WARN; + } else { + if (acl_set_file(name, ACL_TYPE_DEFAULT, acl) != 0) { + archive_set_error(a, errno, + "Failed to set acl"); + ret = ARCHIVE_WARN; + } + acl_free(acl); + } + } + + return (ret); +} diff --git a/lib/libarchive/archive_read_open_file.c b/lib/libarchive/archive_read_open_file.c new file mode 100644 index 0000000..f2fdc63 --- /dev/null +++ b/lib/libarchive/archive_read_open_file.c @@ -0,0 +1,109 @@ +/*- + * Copyright (c) 2003-2004 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer + * in this position and unchanged. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. 
+ * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#ifdef DMALLOC +#include <dmalloc.h> +#endif +#include <errno.h> +#include <fcntl.h> +#include <stdlib.h> +#include <string.h> +#include <unistd.h> + +#include "archive.h" + +struct read_file_data { + int fd; + size_t block_size; + void *buffer; + char filename[1]; +}; + +static int file_close(struct archive *, void *); +static int file_open(struct archive *, void *); +static ssize_t file_read(struct archive *, void *, const void **buff); + +int +archive_read_open_file(struct archive *a, const char *filename, + size_t block_size) +{ + struct read_file_data *mine; + + /* XXX detect and report malloc failure XXX */ + if (filename == NULL) { + mine = malloc(sizeof(*mine)); + mine->filename[0] = 0; + } else { + mine = malloc(sizeof(*mine) + strlen(filename)); + strcpy(mine->filename, filename); + } + mine->block_size = block_size; + mine->buffer = malloc(mine->block_size); + mine->fd = -1; + return (archive_read_open(a, mine, file_open, file_read, file_close)); +} + +static int +file_open(struct archive *a, void *client_data) +{ + struct read_file_data *mine = client_data; + + if (*mine->filename != 0) + mine->fd = open(mine->filename, O_RDONLY); + else + mine->fd = 0; + 
if (mine->fd < 0) { + archive_set_error(a, errno, "Failed to open '%s'", + mine->filename); + return (ARCHIVE_FATAL); + } + return (0); +} + +static ssize_t +file_read(struct archive *a, void *client_data, const void **buff) +{ + struct read_file_data *mine = client_data; + + (void)a; /* UNUSED */ + *buff = mine->buffer; + return (read(mine->fd, mine->buffer, mine->block_size)); +} + +static int +file_close(struct archive *a, void *client_data) +{ + struct read_file_data *mine = client_data; + + (void)a; /* UNUSED */ + if (mine->fd >= 0) + close(mine->fd); + free(mine); + return (ARCHIVE_OK); +} diff --git a/lib/libarchive/archive_read_support_compression_all.c b/lib/libarchive/archive_read_support_compression_all.c new file mode 100644 index 0000000..2d5efe1 --- /dev/null +++ b/lib/libarchive/archive_read_support_compression_all.c @@ -0,0 +1,42 @@ +/*- + * Copyright (c) 2003-2004 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer + * in this position and unchanged. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
+ * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#ifdef DMALLOC +#include <dmalloc.h> +#endif + +#include "archive.h" + +int +archive_read_support_compression_all(struct archive *a) +{ + archive_read_support_compression_bzip2(a); + archive_read_support_compression_gzip(a); + return (ARCHIVE_OK); +} diff --git a/lib/libarchive/archive_read_support_compression_bzip2.c b/lib/libarchive/archive_read_support_compression_bzip2.c new file mode 100644 index 0000000..c97160f --- /dev/null +++ b/lib/libarchive/archive_read_support_compression_bzip2.c @@ -0,0 +1,365 @@ +/*- + * Copyright (c) 2003-2004 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer + * in this position and unchanged. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
+ * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#ifdef DMALLOC +#include <dmalloc.h> +#endif +#include <err.h> +#include <errno.h> +#include <stdlib.h> +#include <string.h> +#include <unistd.h> +#include <bzlib.h> + +#include "archive.h" +#include "archive_private.h" + +struct private_data { + bz_stream stream; + unsigned char *uncompressed_buffer; + size_t uncompressed_buffer_size; + char *read_next; + int64_t total_out; +}; + +static int bid(const void *, size_t); +static int finish(struct archive *); +static int init(struct archive *, const void *, size_t); +static ssize_t read_ahead(struct archive *, const void **, size_t); +static ssize_t read_consume(struct archive *, size_t); +static int drive_decompressor(struct archive *a, struct private_data *); + +int +archive_read_support_compression_bzip2(struct archive *a) +{ + return (__archive_read_register_compression(a, bid, init)); +} + +/* + * Test whether we can handle this data. + * + * This logic returns zero if any part of the signature fails. It + * also tries to Do The Right Thing if a very short buffer prevents us + * from verifying as much as we would like. + */ +static int +bid(const void *buff, size_t len) +{ + const unsigned char *buffer; + int bits_checked; + + if (len < 1) + return (0); + + buffer = buff; + bits_checked = 0; + if (buffer[0] != 'B') /* Verify first ID byte. 
*/ + return (0); + bits_checked += 8; + if (len < 2) + return (bits_checked); + + if (buffer[1] != 'Z') /* Verify second ID byte. */ + return (0); + bits_checked += 8; + if (len < 3) + return (bits_checked); + + if (buffer[2] != 'h') /* Verify third ID byte. */ + return (0); + bits_checked += 8; + if (len < 4) + return (bits_checked); + + if (buffer[3] < '1' || buffer[3] > '9') + return (0); + bits_checked += 5; + + /* + * Research Question: Can we do any more to verify that this + * really is BZip2 format?? For 99.9% of the time, the above + * test is sufficient, but it would be nice to do a more + * thorough check. It's especially troubling that the BZip2 + * signature begins with all ASCII characters; a tar archive + * whose first filename begins with 'BZh3' would potentially + * fool this logic. (It may also be possible to guard against + * such anomalies in archive_read_support_compression_none.) + */ + + return (bits_checked); +} + +/* + * Setup the callbacks. + */ +static int +init(struct archive *a, const void *buff, size_t n) +{ + struct private_data *state; + int ret; + + a->compression_code = ARCHIVE_COMPRESSION_BZIP2; + a->compression_name = "bzip2"; + + state = malloc(sizeof(*state)); + if (state == NULL) { + archive_set_error(a, ENOMEM, + "Can't allocate data for %s decompression", + a->compression_name); + return (ARCHIVE_FATAL); + } + memset(state, 0, sizeof(*state)); + + state->uncompressed_buffer_size = 64 * 1024; + state->uncompressed_buffer = malloc(state->uncompressed_buffer_size); + state->stream.next_out = state->uncompressed_buffer; + state->read_next = state->uncompressed_buffer; + state->stream.avail_out = state->uncompressed_buffer_size; + + if (state->uncompressed_buffer == NULL) { + archive_set_error(a, ENOMEM, + "Can't allocate %s decompression buffers", + a->compression_name); + free(state); + return (ARCHIVE_FATAL); + } + + /* + * A bug in bzlib.h: stream.next_in should be marked 'const' + * but isn't (the library never alters data 
through the + * next_in pointer, only reads it). The result: this ugly + * cast to remove 'const'. + */ + state->stream.next_in = (void *)(uintptr_t)(const void *)buff; + state->stream.avail_in = n; + + a->compression_read_ahead = read_ahead; + a->compression_read_consume = read_consume; + a->compression_finish = finish; + + /* Initialize compression library. */ + ret = BZ2_bzDecompressInit(&(state->stream), + 0 /* library verbosity */, + 0 /* don't use slow low-mem algorithm */); + + /* If init fails, try using low-memory algorithm instead. */ + if (ret == BZ_MEM_ERROR) { + ret = BZ2_bzDecompressInit(&(state->stream), + 0 /* library verbosity */, + 1 /* do use slow low-mem algorithm */); + } + + if (ret == BZ_OK) { + a->compression_data = state; + return (ARCHIVE_OK); + } + + /* Library setup failed: Clean up. */ + archive_set_error(a, -1, "Internal error initializing %s library", + a->compression_name); + free(state->uncompressed_buffer); + free(state); + + /* Override the error message if we know what really went wrong. */ + switch (ret) { + case BZ_PARAM_ERROR: + archive_set_error(a, -1, + "Internal error initializing compression library: " + "invalid setup parameter"); + break; + case BZ_MEM_ERROR: + archive_set_error(a, -1, + "Internal error initializing compression library: " + "out of memory"); + break; + case BZ_CONFIG_ERROR: + archive_set_error(a, -1, + "Internal error initializing compression library: " + "mis-compiled library"); + break; + } + + return (ARCHIVE_FATAL); +} + +/* + * Return a block of data from the decompression buffer. Decompress more + * as necessary. + */ +static ssize_t +read_ahead(struct archive *a, const void **p, size_t min) +{ + struct private_data *state; + int read_avail, was_avail, ret; + + state = a->compression_data; + was_avail = -1; + if (!a->client_reader) { + archive_set_error(a, EINVAL, + "No read callback is registered? 
" + "This is probably an internal programming error."); + return (ARCHIVE_FATAL); + } + + read_avail = state->stream.next_out - state->read_next; + + if (read_avail + state->stream.avail_out < min) { + memmove(state->uncompressed_buffer, state->read_next, + read_avail); + state->read_next = state->uncompressed_buffer; + state->stream.next_out = state->read_next + read_avail; + state->stream.avail_out + = state->uncompressed_buffer_size - read_avail; + } + + while (was_avail < read_avail && /* Made some progress. */ + read_avail < (int)min && /* Haven't satisfied min. */ + read_avail < (int)state->uncompressed_buffer_size) { /* !full */ + if ((ret = drive_decompressor(a, state)) != ARCHIVE_OK) + return (ret); + was_avail = read_avail; + read_avail = state->stream.next_out - state->read_next; + } + + *p = state->read_next; + return (read_avail); +} + +/* + * Mark a previously-returned block of data as read. + */ +static ssize_t +read_consume(struct archive *a, size_t n) +{ + struct private_data *state; + + state = a->compression_data; + a->file_position += n; + state->read_next += n; + if (state->read_next > state->stream.next_out) + errx(1, "Internal error: Request to consume too many " + "bytes from %s decompressor.\n", + a->compression_name); + return (n); +} + +/* + * Clean up the decompressor. + */ +static int +finish(struct archive *a) +{ + struct private_data *state; + int ret; + + state = a->compression_data; + ret = ARCHIVE_OK; + switch (BZ2_bzDecompressEnd(&(state->stream))) { + case BZ_OK: + break; + default: + archive_set_error(a, -1, "Failed to clean up %s compressor", + a->compression_name); + ret = ARCHIVE_FATAL; + } + + free(state->uncompressed_buffer); + free(state); + + a->compression_data = NULL; + if (a->client_closer != NULL) + (a->client_closer)(a, a->client_data); + + return (ret); +} + +/* + * Utility function to pull data through decompressor, reading input + * blocks as necessary. 
+ */ +static int +drive_decompressor(struct archive *a, struct private_data *state) +{ + ssize_t ret; + int decompressed, total_decompressed; + char *output; + + total_decompressed = 0; + for (;;) { + if (state->stream.avail_in == 0) { + ret = (a->client_reader)(a, a->client_data, + (const void **)&state->stream.next_in); + if (ret < 0) { + /* + * TODO: Find a better way to handle + * this read failure. + */ + goto fatal; + } + if (ret == 0 && total_decompressed == 0) { + archive_set_error(a, -1, + "Premature end of %s compressed data", + a->compression_name); + return (ARCHIVE_FATAL); + } + state->stream.avail_in = ret; + } + + { + output = state->stream.next_out; + + /* Decompress some data. */ + ret = BZ2_bzDecompress(&(state->stream)); + decompressed = state->stream.next_out - output; + + /* Accumulate the total bytes of output. */ + state->total_out += decompressed; + total_decompressed += decompressed; + + switch (ret) { + case BZ_OK: /* Decompressor made some progress. */ + if (decompressed > 0) + return (ARCHIVE_OK); + break; + case BZ_STREAM_END: /* Found end of stream. */ + return (ARCHIVE_OK); + default: + /* Any other return value is an error. */ + goto fatal; + } + } + } + return (ARCHIVE_OK); + + /* Return a fatal error. */ +fatal: + archive_set_error(a, -1, "%s decompression failed", + a->compression_name); + return (ARCHIVE_FATAL); +} diff --git a/lib/libarchive/archive_read_support_compression_gzip.c b/lib/libarchive/archive_read_support_compression_gzip.c new file mode 100644 index 0000000..d0816b0 --- /dev/null +++ b/lib/libarchive/archive_read_support_compression_gzip.c @@ -0,0 +1,499 @@ +/*- + * Copyright (c) 2003-2004 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. 
Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer + * in this position and unchanged. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#ifdef DMALLOC +#include <dmalloc.h> +#endif +#include <err.h> +#include <errno.h> +#include <stdlib.h> +#include <string.h> +#include <unistd.h> +#include <zlib.h> + +#include "archive.h" +#include "archive_private.h" + +struct private_data { + z_stream stream; + unsigned char *uncompressed_buffer; + size_t uncompressed_buffer_size; + unsigned char *read_next; + int64_t total_out; + unsigned long crc; + char header_done; +}; + +static int bid(const void *, size_t); +static int finish(struct archive *); +static int init(struct archive *, const void *, size_t); +static ssize_t read_ahead(struct archive *, const void **, size_t); +static ssize_t read_consume(struct archive *, size_t); +static int drive_decompressor(struct archive *a, struct private_data *); + +int +archive_read_support_compression_gzip(struct archive *a) +{ + return (__archive_read_register_compression(a, bid, init)); +} + +/* + * Test whether we can handle this data. + * + * This logic returns zero if any part of the signature fails. It + * also tries to Do The Right Thing if a very short buffer prevents us + * from verifying as much as we would like. + */ +static int +bid(const void *buff, size_t len) +{ + const unsigned char *buffer; + int bits_checked; + + if (len < 1) + return (0); + + buffer = buff; + bits_checked = 0; + if (buffer[0] != 037) /* Verify first ID byte. */ + return (0); + bits_checked += 8; + if (len < 2) + return (bits_checked); + + if (buffer[1] != 0213) /* Verify second ID byte. */ + return (0); + bits_checked += 8; + if (len < 3) + return (bits_checked); + + if (buffer[2] != 8) /* Compression must be 'deflate'. */ + return (0); + bits_checked += 8; + if (len < 4) + return (bits_checked); + + if ((buffer[3] & 0xE0)!= 0) /* No reserved flags set. 
*/ + return (0); + bits_checked += 3; + if (len < 5) + return (bits_checked); + + /* + * TODO: Verify more; in particular, gzip has an optional + * header CRC, which would give us 16 more verified bits. We + * may also be able to verify certain constraints on other + * fields. + */ + + return (bits_checked); +} + +/* + * Setup the callbacks. + */ +static int +init(struct archive *a, const void *buff, size_t n) +{ + struct private_data *state; + int ret; + + a->compression_code = ARCHIVE_COMPRESSION_GZIP; + a->compression_name = "gzip"; + + state = malloc(sizeof(*state)); + if (state == NULL) { + archive_set_error(a, ENOMEM, + "Can't allocate data for %s decompression", + a->compression_name); + return (ARCHIVE_FATAL); + } + memset(state, 0, sizeof(*state)); + + state->crc = crc32(0L, NULL, 0); + state->header_done = 0; /* We've not yet begun to parse header... */ + + state->uncompressed_buffer_size = 64 * 1024; + state->uncompressed_buffer = malloc(state->uncompressed_buffer_size); + state->stream.next_out = state->uncompressed_buffer; + state->read_next = state->uncompressed_buffer; + state->stream.avail_out = state->uncompressed_buffer_size; + + if (state->uncompressed_buffer == NULL) { + archive_set_error(a, ENOMEM, + "Can't allocate %s decompression buffers", + a->compression_name); + free(state); + return (ARCHIVE_FATAL); + } + + /* + * A bug in zlib.h: stream.next_in should be marked 'const' + * but isn't (the library never alters data through the + * next_in pointer, only reads it). The result: this ugly + * cast to remove 'const'. + */ + state->stream.next_in = (void *)(uintptr_t)(const void *)buff; + state->stream.avail_in = n; + + a->compression_read_ahead = read_ahead; + a->compression_read_consume = read_consume; + a->compression_finish = finish; + + /* + * TODO: Do I need to parse the gzip header before calling + * inflateInit2()? 
In particular, one of the header bytes + * marks "best compression" or "fastest", which may be + * appropriate for setting the second parameter here. + * However, I think the only penalty for not setting it + * correctly is wasted memory. If this is necessary, it + * should probably go into drive_decompressor() below. + */ + + /* Initialize compression library. */ + ret = inflateInit2(&(state->stream), + -15 /* Don't check for zlib header */); + if (ret == Z_OK) { + a->compression_data = state; + return (ARCHIVE_OK); + } + + /* Library setup failed: Clean up. */ + archive_set_error(a, -1, "Internal error initializing %s library", + a->compression_name); + free(state->uncompressed_buffer); + free(state); + + /* Override the error message if we know what really went wrong. */ + switch (ret) { + case Z_STREAM_ERROR: + archive_set_error(a, -1, + "Internal error initializing compression library: " + "invalid setup parameter"); + break; + case Z_MEM_ERROR: + archive_set_error(a, -1, + "Internal error initializing compression library: " + "out of memory"); + break; + case Z_VERSION_ERROR: + archive_set_error(a, -1, + "Internal error initializing compression library: " + "invalid library version"); + break; + } + + return (ARCHIVE_FATAL); +} + +/* + * Return a block of data from the decompression buffer. Decompress more + * as necessary. + */ +static ssize_t +read_ahead(struct archive *a, const void **p, size_t min) +{ + struct private_data *state; + int read_avail, was_avail, ret; + + state = a->compression_data; + was_avail = -1; + if (!a->client_reader) { + archive_set_error(a, EINVAL, + "No read callback is registered? 
" + "This is probably an internal programming error."); + return (ARCHIVE_FATAL); + } + + read_avail = state->stream.next_out - state->read_next; + + if (read_avail + state->stream.avail_out < min) { + memmove(state->uncompressed_buffer, state->read_next, + read_avail); + state->read_next = state->uncompressed_buffer; + state->stream.next_out = state->read_next + read_avail; + state->stream.avail_out + = state->uncompressed_buffer_size - read_avail; + } + + while (was_avail < read_avail && /* Made some progress. */ + read_avail < (int)min && /* Haven't satisfied min. */ + read_avail < (int)state->uncompressed_buffer_size) { /* !full */ + if ((ret = drive_decompressor(a, state)) != ARCHIVE_OK) + return (ret); + was_avail = read_avail; + read_avail = state->stream.next_out - state->read_next; + } + + *p = state->read_next; + return (read_avail); +} + +/* + * Mark a previously-returned block of data as read. + */ +static ssize_t +read_consume(struct archive *a, size_t n) +{ + struct private_data *state; + + state = a->compression_data; + a->file_position += n; + state->read_next += n; + if (state->read_next > state->stream.next_out) + errx(1, "Internal error: Request to consume too many " + "bytes from %s decompressor.\n", + a->compression_name); + return (n); +} + +/* + * Clean up the decompressor. + */ +static int +finish(struct archive *a) +{ + struct private_data *state; + int ret; + + state = a->compression_data; + ret = ARCHIVE_OK; + switch (inflateEnd(&(state->stream))) { + case Z_OK: + break; + default: + archive_set_error(a, -1, "Failed to clean up %s compressor", + a->compression_name); + ret = ARCHIVE_FATAL; + } + + free(state->uncompressed_buffer); + free(state); + + a->compression_data = NULL; + if (a->client_closer != NULL) + (a->client_closer)(a, a->client_data); + + return (ret); +} + +/* + * Utility function to pull data through decompressor, reading input + * blocks as necessary. 
+ */ +static int +drive_decompressor(struct archive *a, struct private_data *state) +{ + ssize_t ret; + int decompressed, total_decompressed; + int count, flags, header_state; + unsigned char *output; + unsigned char b; + + flags = 0; + count = 0; + header_state = 0; + total_decompressed = 0; + for (;;) { + if (state->stream.avail_in == 0) { + ret = (a->client_reader)(a, a->client_data, + (const void **)&state->stream.next_in); + if (ret < 0) { + /* + * TODO: Find a better way to handle + * this read failure. + */ + goto fatal; + } + if (ret == 0 && total_decompressed == 0) { + archive_set_error(a, -1, + "Premature end of %s compressed data", + a->compression_name); + return (ARCHIVE_FATAL); + } + state->stream.avail_in = ret; + } + + if (!state->header_done) { + /* + * If still parsing the header, interpret the + * next byte. + */ + b = *(state->stream.next_in++); + state->stream.avail_in--; + + /* + * Yes, this is somewhat crude, but it works, + * GZip format isn't likely to change anytime + * in the near future, and header parsing is + * certainly not a performance issue, so + * there's little point in making this more + * elegant. Of course, if you see an easy way + * to make this more elegant, please let me + * know.. ;-) + */ + switch (header_state) { + case 0: /* First byte of signature. */ + if (b != 037) + goto fatal; + header_state = 1; + break; + case 1: /* Second byte of signature. */ + if (b != 0213) + goto fatal; + header_state = 2; + break; + case 2: /* Compression type must be 8. */ + if (b != 8) + goto fatal; + header_state = 3; + break; + case 3: /* GZip flags. */ + flags = b; + header_state = 4; + break; + case 4: case 5: case 6: case 7: /* Mod time. */ + header_state++; + break; + case 8: /* Deflate flags. */ + header_state = 9; + break; + case 9: /* OS. */ + header_state = 10; + break; + case 10: /* Optional Extra: First byte of Length. 
*/ + if ((flags & 4)) { + count = 255 & (int)b; + header_state = 11; + break; + } + /* + * Fall through if there is no + * Optional Extra field. + */ + case 11: /* Optional Extra: Second byte of Length. */ + if ((flags & 4)) { + count = (count << 8) | (255 & (int)b); + header_state = 12; + break; + } + /* + * Fall through if there is no + * Optional Extra field. + */ + case 12: /* Optional Extra Field: counted length. */ + if ((flags & 4)) { + --count; + if (count == 0) header_state = 13; + else header_state = 12; + break; + } + /* + * Fall through if there is no + * Optional Extra field. + */ + case 13: /* Optional Original Filename. */ + if ((flags & 8)) { + if (b == 0) header_state = 14; + else header_state = 13; + break; + } + /* + * Fall through if no Optional + * Original Filename. + */ + case 14: /* Optional Comment. */ + if ((flags & 16)) { + if (b == 0) header_state = 15; + else header_state = 14; + break; + } + /* Fall through if no Optional Comment. */ + case 15: /* Optional Header CRC: First byte. */ + if ((flags & 2)) { + header_state = 16; + break; + } + /* Fall through if no Optional Header CRC. */ + case 16: /* Optional Header CRC: Second byte. */ + if ((flags & 2)) { + header_state = 17; + break; + } + /* Fall through if no Optional Header CRC. */ + case 17: /* First byte of compressed data. */ + state->header_done = 1; /* done with header */ + state->stream.avail_in++; + state->stream.next_in--; + } + + /* + * TODO: Consider moving the inflateInit2 call + * here so it can include the compression type + * from the header? + */ + } else { + output = state->stream.next_out; + + /* Decompress some data. */ + ret = inflate(&(state->stream), 0); + decompressed = state->stream.next_out - output; + + /* Accumulate the CRC of the uncompressed data. */ + state->crc = crc32(state->crc, output, decompressed); + + /* Accumulate the total bytes of output. 
*/ + state->total_out += decompressed; + total_decompressed += decompressed; + + switch (ret) { + case Z_OK: /* Decompressor made some progress. */ + if (decompressed > 0) + return (ARCHIVE_OK); + break; + case Z_STREAM_END: /* Found end of stream. */ + /* + * TODO: Verify gzip trailer + * (uncompressed length and CRC). + */ + return (ARCHIVE_OK); + default: + /* Any other return value is an error. */ + goto fatal; + } + } + } + return (ARCHIVE_OK); + + /* Return a fatal error. */ +fatal: + archive_set_error(a, -1, "%s decompression failed", + a->compression_name); + return (ARCHIVE_FATAL); +} diff --git a/lib/libarchive/archive_read_support_compression_none.c b/lib/libarchive/archive_read_support_compression_none.c new file mode 100644 index 0000000..0f7e42f --- /dev/null +++ b/lib/libarchive/archive_read_support_compression_none.c @@ -0,0 +1,259 @@ +/*- + * Copyright (c) 2003-2004 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer + * in this position and unchanged. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
+ * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#ifdef DMALLOC +#include <dmalloc.h> +#endif +#include <errno.h> +#include <stdlib.h> +#include <string.h> +#include <unistd.h> + +#include "archive.h" +#include "archive_private.h" + +struct archive_decompress_none { + char *buffer; + size_t buffer_size; + char *next; /* Current read location. */ + size_t avail; /* Bytes in my buffer. */ + const char *client_buff; /* Client buffer information. */ + size_t client_total; + const char *client_next; + size_t client_avail; + char end_of_file; + char fatal; +}; + +/* + * Size of internal buffer used for combining short reads. This is + * also an upper limit on the size of a read request. Recall, + * however, that we can (and will!) return blocks of data larger than + * this. The read semantics are: you ask for a minimum, I give you a + * pointer to my best-effort match and tell you how much data is + * there. It could be less than you asked for, it could be much more. + * For example, a client might use mmap() to "read" the entire file as + * a single block. In that case, I will return that entire block to + * my clients. 
+ */ +#define BUFFER_SIZE 65536 + +static int archive_decompressor_none_bid(const void *, size_t); +static int archive_decompressor_none_finish(struct archive *); +static int archive_decompressor_none_init(struct archive *, + const void *, size_t); +static ssize_t archive_decompressor_none_read_ahead(struct archive *, + const void **, size_t); +static ssize_t archive_decompressor_none_read_consume(struct archive *, + size_t); + +int +archive_read_support_compression_none(struct archive *a) +{ + return (__archive_read_register_compression(a, + archive_decompressor_none_bid, + archive_decompressor_none_init)); +} + +/* + * Try to detect an "uncompressed" archive. + */ +static int +archive_decompressor_none_bid(const void *buff, size_t len) +{ + (void)buff; + (void)len; + + return (1); /* Default: We'll take it if no one else does. */ +} + +static int +archive_decompressor_none_init(struct archive *a, const void *buff, size_t n) +{ + struct archive_decompress_none *state; + + a->compression_code = ARCHIVE_COMPRESSION_NONE; + a->compression_name = "none"; + + state = (struct archive_decompress_none *)malloc(sizeof(*state)); + if (!state) { + archive_set_error(a, ENOMEM, "Can't allocate input data"); + return (ARCHIVE_FATAL); + } + memset(state, 0, sizeof(*state)); + + state->buffer_size = BUFFER_SIZE; + state->buffer = malloc(state->buffer_size); + state->next = state->buffer; + if (state->buffer == NULL) { + free(state); + archive_set_error(a, ENOMEM, "Can't allocate input buffer"); + return (ARCHIVE_FATAL); + } + + /* Save reference to first block of data. 
*/ + state->client_buff = buff; + state->client_total = n; + state->client_next = state->client_buff; + state->client_avail = state->client_total; + + a->compression_data = state; + a->compression_read_ahead = archive_decompressor_none_read_ahead; + a->compression_read_consume = archive_decompressor_none_read_consume; + a->compression_finish = archive_decompressor_none_finish; + + return (ARCHIVE_OK); +} + +/* + * We just pass through pointers to the client buffer if we can. + * If the client buffer is short, then we copy stuff to our internal + * buffer to combine reads. + */ +static ssize_t +archive_decompressor_none_read_ahead(struct archive *a, const void **buff, + size_t min) +{ + struct archive_decompress_none *state; + ssize_t bytes_read; + + state = a->compression_data; + if (state->fatal) + return (-1); + + if (min > state->buffer_size) + min = state->buffer_size; + + /* Keep reading until we have accumulated enough data. */ + while (state->avail + state->client_avail < min) { + if (state->next > state->buffer && + state->next + min > state->buffer + state->buffer_size && + state->avail > 0) { + memmove(state->buffer, state->next, state->avail); + state->next = state->buffer; + } + if (state->client_avail > 0) { + memcpy(state->next + state->avail, state->client_next, + state->client_avail); + state->client_next += state->client_avail; + state->avail += state->client_avail; + state->client_avail = 0; + } + /* + * It seems to me that const void ** and const char ** + * should be compatible, but they aren't, hence the cast. + */ + bytes_read = (a->client_reader)(a, a->client_data, + (const void **)&state->client_buff); + if (bytes_read < 0) { /* Read error. */ + state->client_total = state->client_avail = 0; + state->client_next = state->client_buff = NULL; + state->fatal = 1; + return (-1); + } + if (bytes_read == 0) { /* End-of-file. 
*/ + state->client_total = state->client_avail = 0; + state->client_next = state->client_buff = NULL; + state->end_of_file = 1; + break; + } + state->client_total = bytes_read; + state->client_avail = state->client_total; + state->client_next = state->client_buff; + } + + /* Common case: If client buffer suffices, use that. */ + if (state->avail == 0) { + *buff = state->client_next; + return (state->client_avail); + } + + /* Add in bytes from client buffer as necessary to meet the minimum. */ + if (min > state->avail + state->client_avail) + min = state->avail + state->client_avail; + if (state->avail < min) { + memcpy(state->next + state->avail, state->client_next, + min - state->avail); + state->client_next += min - state->avail; + state->client_avail -= min - state->avail; + state->avail = min; + } + + *buff = state->next; + return (state->avail); +} + +/* + * Mark the appropriate data as used. Note that the request here could + * be much smaller than the size of the previous read_ahead request, but + * typically it won't be. I make an attempt to go back to reading straight + * from the client buffer in case some end-of-block alignment mismatch forced + * me to combine writes above. + */ +static ssize_t +archive_decompressor_none_read_consume(struct archive *a, size_t request) +{ + struct archive_decompress_none *state; + + state = a->compression_data; + if (state->avail > 0) { + state->next += request; + state->avail -= request; + /* + * Rollback state->client_next if we can so that future + * reads come straight from the client buffer and we + * avoid copying more data into our buffer. 
+ */ + if (state->avail <= + (size_t)(state->client_next - state->client_buff)) { + state->client_next -= state->avail; + state->client_avail += state->avail; + state->avail = 0; + state->next = state->buffer; + } + } else { + state->client_next += request; + state->client_avail -= request; + } + return (request); +} + +static int +archive_decompressor_none_finish(struct archive *a) +{ + struct archive_decompress_none *state; + + state = a->compression_data; + free(state->buffer); + free(state); + a->compression_data = NULL; + if (a->client_closer != NULL) + return ((a->client_closer)(a, a->client_data)); + return (ARCHIVE_OK); +} diff --git a/lib/libarchive/archive_read_support_format_all.c b/lib/libarchive/archive_read_support_format_all.c new file mode 100644 index 0000000..ded39f8 --- /dev/null +++ b/lib/libarchive/archive_read_support_format_all.c @@ -0,0 +1,43 @@ +/*- + * Copyright (c) 2003-2004 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer + * in this position and unchanged. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
+ * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#ifdef DMALLOC +#include <dmalloc.h> +#endif + +#include "archive.h" + +int +archive_read_support_format_all(struct archive *a) +{ + archive_read_support_format_tar(a); + archive_read_support_format_gnutar(a); + archive_read_support_format_cpio(a); + return (ARCHIVE_OK); +} diff --git a/lib/libarchive/archive_read_support_format_cpio.c b/lib/libarchive/archive_read_support_format_cpio.c new file mode 100644 index 0000000..8fb0437 --- /dev/null +++ b/lib/libarchive/archive_read_support_format_cpio.c @@ -0,0 +1,187 @@ +/*- + * Copyright (c) 2003-2004 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer + * in this position and unchanged. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
+ * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#ifdef DMALLOC +#include <dmalloc.h> +#endif +#include <errno.h> +#include <stdint.h> +#include <string.h> +#include <unistd.h> + +#include "archive.h" +#include "archive_entry.h" +#include "archive_private.h" + +struct cpio_header { + char c_magic[6]; + char c_dev[6]; + char c_ino[6]; + char c_mode[6]; + char c_uid[6]; + char c_gid[6]; + char c_nlink[6]; + char c_rdev[6]; + char c_mtime[11]; + char c_namesize[6]; + char c_filesize[11]; +}; + +static int64_t atol8(const char *, unsigned); +static int archive_read_format_cpio_bid(struct archive *); +static int archive_read_format_cpio_read_header(struct archive *, + struct archive_entry *); + +int +archive_read_support_format_cpio(struct archive *a) +{ + return (__archive_read_register_format(a, + NULL, + archive_read_format_cpio_bid, + archive_read_format_cpio_read_header, + NULL)); +} + + +static int +archive_read_format_cpio_bid(struct archive *a) +{ + int bid, bytes_read; + const void *h; + const struct cpio_header *header; + + bid = 0; + bytes_read = + (a->compression_read_ahead)(a, &h, sizeof(struct cpio_header)); + if (bytes_read < (int)sizeof(struct cpio_header)) + return (-1); + + header = h; + + if (memcmp(header->c_magic, "070707", 6)) return 0; + bid += 48; + + /* TODO: Verify more of header: Can at least check that only octal + digits appear in appropriate header locations */ + + return (bid); +} + +static int 
+archive_read_format_cpio_read_header(struct archive *a, + struct archive_entry *entry) +{ + struct stat st; + size_t bytes; + const struct cpio_header *header; + const void *h; + size_t namelength; + + a->archive_format = ARCHIVE_FORMAT_CPIO; + a->archive_format_name = "POSIX octet-oriented cpio"; + + /* Read fixed-size portion of header. */ + bytes = (a->compression_read_ahead)(a, &h, sizeof(struct cpio_header)); + if (bytes < sizeof(struct cpio_header)) + return (ARCHIVE_FATAL); + (a->compression_read_consume)(a, sizeof(struct cpio_header)); + + /* Parse out octal fields into struct stat */ + memset(&st, 0, sizeof(st)); + header = h; + + st.st_dev = atol8(header->c_dev, sizeof(header->c_dev)); + st.st_ino = atol8(header->c_ino, sizeof(header->c_ino)); + st.st_mode = atol8(header->c_mode, sizeof(header->c_mode)); + st.st_uid = atol8(header->c_uid, sizeof(header->c_uid)); + st.st_gid = atol8(header->c_gid, sizeof(header->c_gid)); + st.st_nlink = atol8(header->c_nlink, sizeof(header->c_nlink)); + st.st_rdev = atol8(header->c_rdev, sizeof(header->c_rdev)); + st.st_mtime = atol8(header->c_mtime, sizeof(header->c_mtime)); + namelength = atol8(header->c_namesize, sizeof(header->c_namesize)); + + /* + * Note: entry_bytes_remaining is at least 64 bits and + * therefore guaranteed to be big enough for a 33-bit file + * size. struct stat.st_size may only be 32 bits, so + * assigning there first could lose information. + */ + a->entry_bytes_remaining = + atol8(header->c_filesize, sizeof(header->c_filesize)); + st.st_size = a->entry_bytes_remaining; + a->entry_padding = 0; + + /* Assign all of the 'stat' fields at once. */ + archive_entry_copy_stat(entry, &st); + + /* Read name from buffer. 
*/ + bytes = (a->compression_read_ahead)(a, &h, namelength); + if (bytes < namelength) + return (ARCHIVE_FATAL); + (a->compression_read_consume)(a, namelength); + archive_strncpy(&a->entry_name, h, bytes); + archive_entry_set_pathname(entry, a->entry_name.s); + + /* Compare name to "TRAILER!!!" to test for end-of-archive. */ + if (namelength == 11 && strcmp(h,"TRAILER!!!")==0) { + /* TODO: Store file location of start of block. */ + archive_set_error(a, 0, NULL); + return (ARCHIVE_EOF); + } + + return (ARCHIVE_OK); +} + + +/* Note that this implementation does not (and should not!) obey + * locale settings; you cannot simply substitute strtol here, since + * it does obey locale. + */ +static int64_t +atol8(const char *p, unsigned char_cnt) +{ + int64_t l; + int digit; + + static const int64_t limit = INT64_MAX / 8; + static const int base = 8; + static const char last_digit_limit = INT64_MAX % 8; + + l = 0; + digit = *p - '0'; + while (digit >= 0 && digit < base && char_cnt-- > 0) { + if (l > limit || (l == limit && digit > last_digit_limit)) { + l = INT64_MAX; /* Truncate on overflow */ + break; + } + l = (l * base) + digit; + digit = *++p - '0'; + } + return (l); +} diff --git a/lib/libarchive/archive_read_support_format_gnutar.c b/lib/libarchive/archive_read_support_format_gnutar.c new file mode 100644 index 0000000..fd1a9cc --- /dev/null +++ b/lib/libarchive/archive_read_support_format_gnutar.c @@ -0,0 +1,516 @@ +/*- + * Copyright (c) 2003-2004 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer + * in this position and unchanged. + * 2. 
Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <sys/stat.h> +#ifdef DMALLOC +#include <dmalloc.h> +#endif +#include <err.h> +#include <errno.h> +#include <stdint.h> +#include <string.h> +#include <unistd.h> + +#include "archive.h" +#include "archive_entry.h" +#include "archive_private.h" + +/* + * Structure of GNU tar header + */ +struct archive_entry_header_gnutar { + char name[100]; + char mode[8]; + char uid[8]; + char gid[8]; + char size[12]; + char mtime[12]; + char checksum[8]; + char typeflag[1]; + char linkname[100]; + char magic[8]; /* "ustar \0" (note blank/blank/null at end) */ + char uname[32]; + char gname[32]; + char devmajor[8]; + char devminor[8]; + char atime[12]; + char ctime[12]; + char offset[12]; + char longnames[4]; + char unused[1]; + struct { + char offset[12]; + char numbytes[12]; + } sparse[4]; + char isextended[1]; + char realsize[12]; + /* + * GNU doesn't use POSIX 'prefix' field; they use the 'L' (longname) + * entry instead. 
+ */ +}; + +static int archive_block_is_null(const unsigned char *p); +static int archive_header_gnu(struct archive *, struct archive_entry *, + const void *); +static int archive_read_format_gnutar_bid(struct archive *a); +static int archive_read_format_gnutar_read_header(struct archive *a, + struct archive_entry *); +static int checksum(struct archive *a, const void *h); +static int64_t tar_atol(const char *, unsigned); +static int64_t tar_atol8(const char *, unsigned); +static int64_t tar_atol256(const char *, unsigned); + +/* + * The ONLY publicly visible function in this file. + */ +int +archive_read_support_format_gnutar(struct archive *a) +{ + return (__archive_read_register_format(a, + NULL, + archive_read_format_gnutar_bid, + archive_read_format_gnutar_read_header, + NULL)); +} + +static int +archive_read_format_gnutar_bid(struct archive *a) +{ + int bid; + size_t bytes_read; + const void *h; + const struct archive_entry_header_gnutar *header; + + /* + * If we're already reading a non-tar file, don't + * bother to bid. + */ + if (a->archive_format != 0 && + (a->archive_format & ARCHIVE_FORMAT_BASE_MASK) != + ARCHIVE_FORMAT_TAR) + return (0); + + bid = 0; + + /* If last header was my preferred format, bid a bit more. */ + if (a->archive_format == ARCHIVE_FORMAT_TAR_GNUTAR) + bid += 10; + + bytes_read = (a->compression_read_ahead)(a, &h, 512); + if (bytes_read < 512) + return (-1); + + /* + * TODO: if checksum or header fail, scan ahead for + * next valid header. + */ + + /* Checksum field is eight 8-bit values: 64 bits of validation. 
*/ + if (!checksum(a, h)) + return (0); + bid += 64; + + header = (const struct archive_entry_header_gnutar *)h; + + /* This distinguishes GNU tar formats from POSIX formats */ + if (memcmp(header->magic, "ustar \0", 8) != 0) + return (0); + bid += 64; + + return (bid); +} + +static int +archive_read_format_gnutar_read_header(struct archive *a, + struct archive_entry *entry) +{ + const void *h; + ssize_t bytes; + int oldstate; + + a->archive_format = ARCHIVE_FORMAT_TAR_GNUTAR; + a->archive_format_name = "GNU tar"; + + /* Skip remains of previous entry. */ + oldstate = a->state; + a->state = ARCHIVE_STATE_DATA; + archive_read_data_skip(a); + a->state = oldstate; + + /* Read 512-byte header record */ + bytes = (a->compression_read_ahead)(a, &h, 512); + if (bytes < 512) + return (ARCHIVE_FATAL); + (a->compression_read_consume)(a, 512); + + /* + * If this is a block of nulls, return 0 (no more entries). + * Note the initial (*h)==0 test short-circuits the function call + * in the most common case. + */ + if (((*(const char *)h)==0) && archive_block_is_null(h)) { + /* TODO: Store file location of start of block in public area */ + archive_set_error(a, 0, NULL); + return (ARCHIVE_EOF); + } + + /* TODO: add support for scanning for next valid header */ + if (!checksum(a, h)) { + archive_set_error(a, EINVAL, "Damaged GNU tar archive"); + return (ARCHIVE_FATAL); /* Not a valid header. */ + } + + /* This function gets called recursively for long name headers, etc. */ + if (++a->gnu_header_recursion_depth > 32) + errx(EINVAL, + "*** Too many special headers for one entry; giving up. " + "(%s:%s@%d)\n", + __FUNCTION__, __FILE__, __LINE__); + + archive_header_gnu(a, entry, h); + a->gnu_header_recursion_depth--; + return (0); +} + +/* + * Return true if block checksum is correct. 
+ */ +static int +checksum(struct archive *a, const void *h) +{ + const unsigned char *bytes; + const struct archive_entry_header_gnutar *header; + int i, sum, signed_sum, unsigned_sum; + + (void)a; /* UNUSED */ + bytes = h; + header = h; + + /* Test checksum: POSIX specifies UNSIGNED for this calculation. */ + sum = tar_atol(header->checksum, sizeof(header->checksum)); + unsigned_sum = 0; + for (i = 0; i < 148; i++) + unsigned_sum += (unsigned char)bytes[i]; + for (; i < 156; i++) + unsigned_sum += 32; + for (; i < 512; i++) + unsigned_sum += (unsigned char)bytes[i]; + if (sum == unsigned_sum) + return (1); + + /* + * Repeat test with SIGNED bytes, just in case this archive + * was created by an old BSD, Solaris, or HP-UX tar with a broken + * checksum calculation. + */ + signed_sum = 0; + for (i = 0; i < 148; i++) + signed_sum += (signed char)bytes[i]; + for (; i < 156; i++) + signed_sum += 32; + for (; i < 512; i++) + signed_sum += (signed char)bytes[i]; + if (sum == signed_sum) + return (1); + + return (0); +} + +/* + * Return true if this block contains only nulls. + */ +static int +archive_block_is_null(const unsigned char *p) +{ + unsigned i; + + for (i = 0; i < ARCHIVE_BYTES_PER_RECORD / sizeof(*p); i++) { + if (*p++) + return (0); + } + return (1); +} + +/* + * Parse GNU tar header + */ +static int +archive_header_gnu(struct archive *a, struct archive_entry *entry, + const void *h) +{ + struct stat st; + const struct archive_entry_header_gnutar *header; + char tartype; + + /* Clear out entry structure */ + memset(&st, 0, sizeof(st)); + + /* + * GNU header is like POSIX, except 'prefix' is + * replaced with some other fields. This also means the + * filename is stored as in old-style archives. + */ + + /* Copy filename over (to ensure null termination). 
*/ + header = h; + archive_strncpy(&(a->entry_name), header->name, sizeof(header->name)); + archive_entry_set_pathname(entry, a->entry_name.s); + + /* Copy linkname over */ + if (header->linkname[0]) + archive_strncpy(&(a->entry_linkname), header->linkname, + sizeof(header->linkname)); + + /* Parse out the numeric fields (all are octal) */ + st.st_mode = tar_atol(header->mode, sizeof(header->mode)); + st.st_uid = tar_atol(header->uid, sizeof(header->uid)); + st.st_gid = tar_atol(header->gid, sizeof(header->gid)); + st.st_size = tar_atol(header->size, sizeof(header->size)); + st.st_mtime = tar_atol(header->mtime, sizeof(header->mtime)); + + /* Handle the tar type flag appropriately. */ + tartype = header->typeflag[0]; + archive_entry_set_tartype(entry, tartype); + st.st_mode &= ~S_IFMT; + + /* Fields common to ustar and GNU */ + archive_strncpy(&(a->entry_uname), + header->uname, sizeof(header->uname)); + archive_entry_set_uname(entry, a->entry_uname.s); + + archive_strncpy(&(a->entry_gname), + header->gname, sizeof(header->gname)); + archive_entry_set_gname(entry, a->entry_gname.s); + + /* Parse out device numbers only for char and block specials */ + if (header->typeflag[0] == '3' || header->typeflag[0] == '4') + st.st_rdev = makedev ( + tar_atol(header->devmajor, sizeof(header->devmajor)), + tar_atol(header->devminor, sizeof(header->devminor))); + else + st.st_rdev = 0; + + /* Grab additional GNU fields. */ + /* TODO: FILL THIS IN!!! 
*/ + st.st_atime = tar_atol(header->atime, sizeof(header->atime)); + st.st_ctime = tar_atol(header->ctime, sizeof(header->ctime)); + + /* Set internal counter for locating next header */ + a->entry_bytes_remaining = st.st_size; + a->entry_padding = 0x1ff & (-a->entry_bytes_remaining); + + /* Interpret entry type */ + switch (tartype) { + case '1': /* Hard link */ + archive_entry_set_hardlink(entry, a->entry_linkname.s); + /* + * Note: Technically, tar does not store the file type + * for a "hard link" entry, only the fact that it is a + * hard link. So, I leave the file type in st_mode + * zero here. + */ + archive_entry_copy_stat(entry, &st); + break; + case '2': /* Symlink */ + st.st_mode |= S_IFLNK; + st.st_size = 0; + archive_entry_set_symlink(entry, a->entry_linkname.s); + archive_entry_copy_stat(entry, &st); + break; + case '3': /* Character device */ + st.st_mode |= S_IFCHR; + st.st_size = 0; + archive_entry_copy_stat(entry, &st); + break; + case '4': /* Block device */ + st.st_mode |= S_IFBLK; + st.st_size = 0; + archive_entry_copy_stat(entry, &st); + break; + case '5': /* POSIX Dir */ + st.st_mode |= S_IFDIR; + st.st_size = 0; + archive_entry_copy_stat(entry, &st); + break; + case '6': /* FIFO device */ + st.st_mode |= S_IFIFO; + st.st_size = 0; + archive_entry_copy_stat(entry, &st); + break; + case 'D': /* GNU incremental directory type */ + /* + * No special handling is actually required here. + * It might be nice someday to preprocess the file list and + * provide it to the client, though. + */ + st.st_mode &= ~ S_IFMT; + st.st_mode |= S_IFDIR; + archive_entry_copy_stat(entry, &st); + break; + case 'K': /* GNU long linkname */ + /* Entry body is full name of link for next header. */ + archive_string_ensure(&(a->gnu_linkname), st.st_size+1); + archive_read_data_into_buffer(a, a->gnu_linkname.s, + st.st_size); + a->gnu_linkname.s[st.st_size] = 0; /* Null term name! 
*/ + /* + * This next call will usually overwrite + * a->entry_linkname, which is why we _must_ have a + * separate gnu_linkname field. + */ + archive_read_format_gnutar_read_header(a, entry); + if (archive_entry_tartype(entry) == '1') + archive_entry_set_hardlink(entry, a->gnu_linkname.s); + else if (archive_entry_tartype(entry) == '2') + archive_entry_set_symlink(entry, a->gnu_linkname.s); + /* TODO: else { ... } */ + break; + case 'L': /* GNU long filename */ + /* Entry body is full pathname for next header. */ + archive_string_ensure(&(a->gnu_name), st.st_size+1); + archive_read_data_into_buffer(a, a->gnu_name.s, + st.st_size); + a->gnu_name.s[st.st_size] = 0; /* Null terminate name! */ + /* This next call will typically overwrite a->entry_name, which + * is why we _must_ have a separate gnu_name field */ + archive_read_format_gnutar_read_header(a, entry); + archive_entry_set_pathname(entry, a->gnu_name.s); + break; + case 'M': /* GNU Multi-volume (remainder of file from last archive) */ + /* + * As far as I can tell, this is just like a regular file + * entry, except that the contents should be _appended_ to + * the indicated file at the indicated offset. This may + * require some API work to fully support. + */ + break; + case 'N': /* Old GNU long filename; this will never be supported */ + /* Essentially, body of this entry is a script for + * renaming previously-extracted entries. Ugh. */ + break; + case 'S': /* GNU Sparse files: These are really ugly, and unlikely + * to be supported anytime soon. */ + break; + case 'V': /* GNU volume header */ + /* Just skip it */ + return (archive_read_format_gnutar_read_header(a, entry)); + default: /* Regular file and non-standard types */ + /* Per POSIX: non-recognized types should always be + * treated as regular files. Of course, GNU + * extensions aren't compatible with this dictum. 
+ * <sigh> */ + st.st_mode |= S_IFREG; + archive_entry_copy_stat(entry, &st); + break; + } + + return (0); +} + +/* + * Convert text->integer. + * + * Traditional tar formats (including POSIX) specify base-8 for + * all of the standard numeric fields. GNU tar supports base-256 + * as well in many of the numeric fields. There is also an old + * and short-lived base-64 format, but I doubt I'll ever see + * an archive that uses it. (According to the changelog for GNU tar, + * that format was only implemented for a couple of weeks!) + */ +static int64_t +tar_atol(const char *p, unsigned char_cnt) +{ + if (*p & 0x80) + return (tar_atol256(p, char_cnt)); + return (tar_atol8(p, char_cnt)); +} + +/* + * Note that this implementation does not (and should not!) obey + * locale settings; you cannot simply substitute strtol here, since + * it does obey locale. + */ +static int64_t +tar_atol8(const char *p, unsigned char_cnt) +{ + int64_t l; + int digit, sign; + + static const int64_t limit = INT64_MAX / 8; + static const int base = 8; + static const char last_digit_limit = INT64_MAX % 8; + + while (*p == ' ' || *p == '\t') + p++; + if (*p == '-') { + sign = -1; + p++; + } else + sign = 1; + + l = 0; + digit = *p - '0'; + while (digit >= 0 && digit < base && char_cnt-- > 0) { + if (l>limit || (l == limit && digit > last_digit_limit)) { + l = INT64_MAX; /* Truncate on overflow */ + break; + } + l = ( l * base ) + digit; + digit = *++p - '0'; + } + return (sign < 0) ? -l : l; +} + +/* + * Parse a base-256 integer. + * + * TODO: This overflows very quickly for negative values; fix this. + */ +static int64_t +tar_atol256(const char *p, unsigned char_cnt) +{ + int64_t l; + int digit; + + const int64_t limit = INT64_MAX / 256; + + /* Ignore high bit of first byte (that's the base-256 flag). 
*/ + l = 0; + digit = 0x7f & *(const unsigned char *)p; + while (char_cnt-- > 0) { + if (l > limit) { + l = INT64_MAX; /* Truncate on overflow */ + break; + } + l = (l << 8) + digit; + digit = *(const unsigned char *)++p; + } + return (l); +} diff --git a/lib/libarchive/archive_read_support_format_tar.c b/lib/libarchive/archive_read_support_format_tar.c new file mode 100644 index 0000000..7210b73 --- /dev/null +++ b/lib/libarchive/archive_read_support_format_tar.c @@ -0,0 +1,934 @@ +/*- + * Copyright (c) 2003-2004 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer + * in this position and unchanged. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <sys/stat.h> +#ifdef DMALLOC +#include <dmalloc.h> +#endif +#include <errno.h> +#include <stdint.h> +#include <stdlib.h> +#include <string.h> +#include <unistd.h> + +#include "archive.h" +#include "archive_entry.h" +#include "archive_private.h" + +/* + * Layout of POSIX 'ustar' tar header. + */ +struct archive_entry_header_ustar { + char name[100]; + char mode[8]; + char uid[8]; + char gid[8]; + char size[12]; + char mtime[12]; + char checksum[8]; + char typeflag[1]; + char linkname[100]; /* "old format" header ends here */ + char magic[6]; /* For POSIX: "ustar\0" */ + char version[2]; /* For POSIX: "00" */ + char uname[32]; + char gname[32]; + char devmajor[8]; + char devminor[8]; + char prefix[155]; +}; + +static int archive_block_is_null(const unsigned char *p); +static int archive_header_common(struct archive *, struct archive_entry *, + struct stat *, const void *); +static int archive_header_old_tar(struct archive *, + struct archive_entry *, struct stat *, const void *); +static int archive_header_pax_extensions(struct archive *, + struct archive_entry *, struct stat *, const void *); +static int archive_header_pax_global(struct archive *, + struct archive_entry *, struct stat *, const void *h); +static int archive_header_ustar(struct archive *, struct archive_entry *, + struct stat *, const void *h); +static int archive_read_format_tar_bid(struct archive *); +static int archive_read_format_tar_read_header(struct archive *, + struct archive_entry *); +static int checksum(struct archive *, const void *); +static int pax_attribute(struct archive *, struct archive_entry *, + struct stat *, char *key, char *value); +static int pax_header(struct archive *, struct archive_entry *, + struct stat *, char *attr, uint64_t length); +static void pax_time(const char *, struct timespec *t); +static int64_t tar_atol(const char *, unsigned); +static int64_t tar_atol10(const char *, unsigned); +static int64_t 
tar_atol256(const char *, unsigned); +static int64_t tar_atol8(const char *, unsigned); + +int +archive_read_support_format_tar(struct archive *a) +{ + return (__archive_read_register_format(a, + NULL, + archive_read_format_tar_bid, + archive_read_format_tar_read_header, + NULL)); +} + + +static int +archive_read_format_tar_bid(struct archive *a) +{ + int bid; + ssize_t bytes_read; + const void *h; + const struct archive_entry_header_ustar *header; + + /* + * If we're already reading a non-tar file, don't + * bother to bid. + */ + if (a->archive_format != 0 && + (a->archive_format & ARCHIVE_FORMAT_BASE_MASK) != + ARCHIVE_FORMAT_TAR) + return (0); + bid = 0; + + /* + * If we're already reading a tar format, start the bid at 1 as + * a failsafe. + */ + if ((a->archive_format & ARCHIVE_FORMAT_BASE_MASK) == + ARCHIVE_FORMAT_TAR) + bid++; + + /* If last header was my preferred format, bid a bit more. */ + if (a->archive_format == ARCHIVE_FORMAT_TAR_USTAR || + a->archive_format == ARCHIVE_FORMAT_TAR_PAX_INTERCHANGE) + bid++; + + /* Now let's look at the actual header and see if it matches. */ + bytes_read = (a->compression_read_ahead)(a, &h, 512); + if (bytes_read < 512) + return (ARCHIVE_FATAL); + + /* If it's an end-of-archive mark, we can handle it. */ + if ((*(const char *)h) == 0 && archive_block_is_null(h)) + return (bid + 1); + + /* If it's not an end-of-archive mark, it must have a valid checksum.*/ + if (!checksum(a, h)) + return (0); + bid += 48; /* Checksum is usually 6 octal digits. */ + + header = h; + + /* This distinguishes POSIX formats from GNU tar formats. */ + if ((memcmp(header->magic, "ustar\0", 6) == 0) + &&(memcmp(header->version, "00", 2)==0)) + bid += 56; + + /* Type flag must be null, digit or A-Z, a-z. 
*/ + if (header->typeflag[0] != 0 && + !( header->typeflag[0] >= '0' && header->typeflag[0] <= '9') && + !( header->typeflag[0] >= 'A' && header->typeflag[0] <= 'Z') && + !( header->typeflag[0] >= 'a' && header->typeflag[0] <= 'z') ) + return (0); + + /* Sanity check: Look at first byte of mode field. */ + switch (255 & (unsigned)header->mode[0]) { + case 0: case 255: + /* Base-256 value: No further verification possible! */ + break; + case ' ': /* Not recommended, but not illegal, either. */ + break; + case '0': case '1': case '2': case '3': + case '4': case '5': case '6': case '7': + /* Octal Value. */ + /* TODO: Check format of remainder of this field. */ + break; + default: + /* Not a valid mode; bail out here. */ + return (0); + } + /* TODO: Sanity test uid/gid/size/mtime/devmajor/devminor fields. */ + + return (bid); +} + +static int +archive_read_format_tar_read_header(struct archive *a, + struct archive_entry *entry) +{ + struct stat st; + ssize_t bytes; + int err; + const void *h; + const struct archive_entry_header_ustar *header; + + memset(&st, 0, sizeof(st)); + + /* Read 512-byte header record */ + bytes = (a->compression_read_ahead)(a, &h, 512); + if (bytes < 512) { + /* TODO: Set error values */ + return (-1); + } + (a->compression_read_consume)(a, 512); + + /* Check for end-of-archive mark. */ + if (((*(const char *)h)==0) && archive_block_is_null(h)) { + /* TODO: Store file location of start of block */ + archive_set_error(a, 0, NULL); + return (ARCHIVE_EOF); + } + + /* + * Note: If the checksum fails and we return ARCHIVE_RETRY, + * then the client is likely to just retry. This is a very crude way + * to search for the next valid header! + * + * TODO: Improve this by implementing a real header scan. + */ + if (!checksum(a, h)) { + archive_set_error(a, EINVAL, "Damaged tar archive"); + return (ARCHIVE_RETRY); /* Retryable: Invalid header */ + } + + /* Determine the format variant. 
*/ + header = h; + if (memcmp(header->magic, "ustar", 5) != 0) + err = archive_header_old_tar(a, entry, &st, h); /* non-POSIX */ + else switch(header->typeflag[0]) { + case 'g': + a->archive_format = ARCHIVE_FORMAT_TAR_PAX_INTERCHANGE; + a->archive_format_name = "POSIX pax interchange format"; + err = archive_header_pax_global(a, entry, &st, h); + break; + case 'x': + a->archive_format = ARCHIVE_FORMAT_TAR_PAX_INTERCHANGE; + a->archive_format_name = "POSIX pax interchange format"; + err = archive_header_pax_extensions(a, entry, &st, h); + break; + case 'X': + a->archive_format = ARCHIVE_FORMAT_TAR_PAX_INTERCHANGE; + a->archive_format_name = + "POSIX pax interchange format (Sun variant)"; + err = archive_header_pax_extensions(a, entry, &st, h); + break; + default: + if (a->archive_format != ARCHIVE_FORMAT_TAR_PAX_INTERCHANGE) { + a->archive_format = ARCHIVE_FORMAT_TAR_USTAR; + a->archive_format_name = "POSIX ustar"; + } + err = archive_header_ustar(a, entry, &st, h); + } + archive_entry_copy_stat(entry, &st); + return (err); +} + + +/* + * Return true if block checksum is correct. + */ +static int +checksum(struct archive *a, const void *h) +{ + const unsigned char *bytes; + const struct archive_entry_header_ustar *header; + int check, i, sum; + + (void)a; /* UNUSED */ + bytes = h; + header = h; + + /* + * Test the checksum. Note that POSIX specifies _unsigned_ + * bytes for this calculation. + */ + sum = tar_atol(header->checksum, sizeof(header->checksum)); + check = 0; + for (i = 0; i < 148; i++) + check += (unsigned char)bytes[i]; + for (; i < 156; i++) + check += 32; + for (; i < 512; i++) + check += (unsigned char)bytes[i]; + if (sum == check) + return (1); + + /* + * Repeat test with _signed_ bytes, just in case this archive + * was created by an old BSD, Solaris, or HP-UX tar with a + * broken checksum calculation. 
+ */ + check = 0; + for (i = 0; i < 148; i++) + check += (signed char)bytes[i]; + for (; i < 156; i++) + check += 32; + for (; i < 512; i++) + check += (signed char)bytes[i]; + if (sum == check) + return (1); + + return (0); +} + + +/* + * Return true if this block contains only nulls. + */ +static int +archive_block_is_null(const unsigned char *p) +{ + unsigned i; + + for (i = 0; i < ARCHIVE_BYTES_PER_RECORD / sizeof(*p); i++) + if (*p++) + return (0); + return (1); +} + +/* + * Parse out common header elements. + * + * This would be the same as archive_header_old_tar, except that the + * filename is handled slightly differently for old and POSIX + * entries (POSIX entries support a 'prefix'). This factoring + * allows archive_header_old_tar and archive_header_ustar + * to handle filenames differently, while still putting most of the + * common parsing into one place. + */ +static int +archive_header_common(struct archive *a, struct archive_entry *entry, + struct stat *st, const void *h) +{ + const struct archive_entry_header_ustar *header; + char tartype; + + header = h; + if (header->linkname[0]) + archive_strncpy(&(a->entry_linkname), header->linkname, + sizeof(header->linkname)); + else + archive_string_empty(&(a->entry_linkname)); + + /* Parse out the numeric fields (all are octal) */ + st->st_mode = tar_atol(header->mode, sizeof(header->mode)); + st->st_uid = tar_atol(header->uid, sizeof(header->uid)); + st->st_gid = tar_atol(header->gid, sizeof(header->gid)); + st->st_size = tar_atol(header->size, sizeof(header->size)); + st->st_mtime = tar_atol(header->mtime, sizeof(header->mtime)); + + /* Handle the tar type flag appropriately. 
*/ + tartype = header->typeflag[0]; + archive_entry_set_tartype(entry, tartype); + st->st_mode &= ~S_IFMT; + + switch (tartype) { + case '1': /* Hard link */ + archive_entry_set_hardlink(entry, a->entry_linkname.s); + /* + * The following may seem odd, but: Technically, tar + * does not store the file type for a "hard link" + * entry, only the fact that it is a hard link. So, I + * leave the type zero normally. But, pax interchange + * format allows hard links to have data, which + * implies that the underlying entry is a regular + * file. + */ + if (st->st_size > 0) + st->st_mode |= S_IFREG; + break; + case '2': /* Symlink */ + st->st_mode |= S_IFLNK; + st->st_size = 0; + archive_entry_set_symlink(entry, a->entry_linkname.s); + break; + case '3': /* Character device */ + st->st_mode |= S_IFCHR; + st->st_size = 0; + break; + case '4': /* Block device */ + st->st_mode |= S_IFBLK; + st->st_size = 0; + break; + case '5': /* Dir */ + st->st_mode |= S_IFDIR; + st->st_size = 0; + break; + case '6': /* FIFO device */ + st->st_mode |= S_IFIFO; + st->st_size = 0; + break; + default: /* Regular file and non-standard types */ + /* + * Per POSIX: non-recognized types should always be + * treated as regular files. + */ + st->st_mode |= S_IFREG; + break; + } + return (0); +} + +/* + * Parse out header elements for "old-style" tar archives + */ +static int +archive_header_old_tar(struct archive *a, struct archive_entry *entry, + struct stat *st, const void *h) +{ + const struct archive_entry_header_ustar *header; + + a->archive_format = ARCHIVE_FORMAT_TAR; + a->archive_format_name = "tar (non-POSIX)"; + + /* Copy filename over (to ensure null termination). */ + header = h; + archive_strncpy(&(a->entry_name), header->name, sizeof(header->name)); + archive_entry_set_pathname(entry, a->entry_name.s); + + /* Grab rest of common fields */ + archive_header_common(a, entry, st, h); + + /* + * TODO: Decide whether the following special handling + * is needed for POSIX headers. 
Factor accordingly. + */ + + /* "Regular" entry with trailing '/' is really directory. */ + if (S_ISREG(st->st_mode) && + '/' == a->entry_name.s[strlen(a->entry_name.s) - 1]) { + st->st_mode &= ~S_IFMT; + st->st_mode |= S_IFDIR; + archive_entry_set_tartype(entry, '5'); + } + + a->entry_bytes_remaining = st->st_size; + a->entry_padding = 0x1ff & (-a->entry_bytes_remaining); + return (0); +} + + +/* + * Parse a file header for a pax extended archive entry. + */ +static int +archive_header_pax_global(struct archive *a, struct archive_entry *entry, + struct stat *st, const void *h) +{ + uint64_t extension_size; + size_t bytes; + int err; + char *global; + const struct archive_entry_header_ustar *header; + + header = h; + extension_size = tar_atol(header->size, sizeof(header->size)); + a->entry_bytes_remaining = extension_size; + a->entry_padding = 0x1ff & (-a->entry_bytes_remaining); + + global = malloc(extension_size + 1); + archive_read_data_into_buffer(a, global, extension_size); + global[extension_size] = 0; + + /* + * TODO: Store the global default options somewhere for future use. + * For now, just free the buffer and keep going. + */ + free(global); + + /* Skip the padding. */ + archive_read_data_skip(a); + + /* Read the next header. */ + bytes = (a->compression_read_ahead)(a, &h, 512); + if (bytes < 512) { + /* TODO: Set error values. 
*/ + return (-1); + } + (a->compression_read_consume)(a, 512); + + header = h; + switch(header->typeflag[0]) { + case 'x': + case 'X': + err = archive_header_pax_extensions(a, entry, st, h); + break; + default: + err = archive_header_ustar(a, entry, st, h); + } + + return (err); +} + +static int +archive_header_pax_extensions(struct archive *a, + struct archive_entry *entry, struct stat *st, const void *h) +{ + uint64_t extension_size; + size_t bytes; + int err; + const struct archive_entry_header_ustar *header; + int oldstate; + + header = h; + extension_size = tar_atol(header->size, sizeof(header->size)); + a->entry_bytes_remaining = extension_size; + a->entry_padding = 0x1ff & (-a->entry_bytes_remaining); + + archive_string_ensure(&(a->pax_header), extension_size + 1); + oldstate = a->state; + a->state = ARCHIVE_STATE_DATA; + archive_read_data_into_buffer(a, a->pax_header.s, extension_size); + a->pax_header.s[extension_size] = 0; + archive_read_data_skip(a); /* Skip any padding. */ + a->state = oldstate; + + /* Read the next header. */ + bytes = (a->compression_read_ahead)(a, &h, 512); + if (bytes < 512) { + /* TODO: Set error values */ + return (-1); + } + (a->compression_read_consume)(a, 512); + + /* Must be a regular POSIX ustar entry. */ + err = archive_header_ustar(a, entry, st, h); + + /* + * TODO: Parse global/default options into 'entry' struct here + * before handling file-specific options. + * + * This design (parse standard header, then overwrite with pax + * extended attribute data) usually works well, but isn't ideal; + * it would be better to parse the pax extended attributes first + * and then skip any fields in the standard header that were + * defined in the pax header. + */ + pax_header(a, entry, st, a->pax_header.s, extension_size); + a->entry_bytes_remaining = st->st_size; + a->entry_padding = 0x1ff & (-a->entry_bytes_remaining); + return (err); +} + + +/* + * Parse a file header for a Posix "ustar" archive entry. 
This also + * handles "pax" or "extended ustar" entries. + */ +static int +archive_header_ustar(struct archive *a, struct archive_entry *entry, + struct stat *st, const void *h) +{ + const struct archive_entry_header_ustar *header; + + header = h; + + /* Copy name into an internal buffer to ensure null-termination. */ + if (header->prefix[0]) { + archive_strncpy(&(a->entry_name), header->prefix, + sizeof(header->prefix)); + archive_strappend_char(&(a->entry_name), '/'); + archive_strncat(&(a->entry_name), header->name, + sizeof(header->name)); + } else + archive_strncpy(&(a->entry_name), header->name, + sizeof(header->name)); + + archive_entry_set_pathname(entry, a->entry_name.s); + + /* Handle rest of common fields. */ + archive_header_common(a, entry, st, h); + + /* Handle POSIX ustar fields. */ + archive_strncpy(&(a->entry_uname), header->uname, + sizeof(header->uname)); + archive_entry_set_uname(entry, a->entry_uname.s); + + archive_strncpy(&(a->entry_gname), header->gname, + sizeof(header->gname)); + archive_entry_set_gname(entry, a->entry_gname.s); + + /* Parse out device numbers only for char and block specials. */ + if (header->typeflag[0] == '3' || header->typeflag[0] == '4') { + st->st_rdev = makedev( + tar_atol(header->devmajor, sizeof(header->devmajor)), + tar_atol(header->devminor, sizeof(header->devminor))); + } + + a->entry_bytes_remaining = st->st_size; + a->entry_padding = 0x1ff & (-a->entry_bytes_remaining); + + return (0); +} + + +/* + * Parse the pax extended attributes record. + * + * Returns non-zero if there's an error in the data. + */ +static int +pax_header(struct archive *a, struct archive_entry *entry, struct stat *st, + char *attr, uint64_t attr_length) +{ + uint64_t l; + uint64_t line_length; + char *line, *key, *p, *value; + + while (attr_length > 0) { + /* Parse decimal length field at start of line. */ + line_length = 0; + l = attr_length; + line = p = attr; /* Record start of line. 
*/ + while (l>0) { + if (*p == ' ') { + p++; + l--; + break; + } + if (*p < '0' || *p > '9') + return (-1); + line_length *= 10; + line_length += *p - '0'; + if (line_length > 999999) + return (-1); + p++; + l--; + } + + if (line_length > attr_length) + return (0); + + /* Null-terminate 'key' value. */ + key = p; + p = strchr(key, '='); + if (p == NULL) + return (0); + if (p > line + line_length) + return (-1); + *p = 0; + if (strlen(key) < 1) + return (-1); + + /* Null-terminate 'value' portion. */ + value = p + 1; + p = strchr(value, '\n'); + if (p == NULL) + return (-1); + if (p > line + line_length) + return (-1); + *p = 0; + + if (pax_attribute(a, entry, st, key, value)) + return (-1); + + /* Skip to next line */ + attr += line_length; + attr_length -= line_length; + } + return (0); +} + + + +/* + * Parse a single key=value attribute. key/value pointers are + * assumed to point into reasonably long-lived storage. + * + * Note that POSIX reserves all-lowercase keywords. Vendor-specific + * extensions should always have keywords of the form "VENDOR.attribute" + * In particular, it's quite feasible to support many different + * vendor extensions here. I'm using "LIBARCHIVE" for extensions + * unique to this library (currently, there are none). + * + * Investigate other vendor-specific extensions, as well and see if + * any of them look useful. + */ +static int +pax_attribute(struct archive *a, struct archive_entry *entry, struct stat *st, + char *key, char *value) +{ + + (void)a; /* UNUSED */ + + switch (key[0]) { + case 'L': + /* Our extensions */ +/* TODO: Handle arbitrary extended attributes... 
*/ +/* + if (strcmp(key, "LIBARCHIVE.xxxxxxx")==0) + archive_entry_set_xxxxxx(entry, value); +*/ + break; + case 'S': + /* We support some keys used by the "star" archiver */ + if (strcmp(key, "SCHILY.acl.access")==0) + archive_entry_set_acl(entry, value); + else if (strcmp(key, "SCHILY.acl.default")==0) + archive_entry_set_acl_default(entry, value); + else if (strcmp(key, "SCHILY.devmajor")==0) + st->st_rdev = makedev(tar_atol10(value, strlen(value)), + minor(st->st_rdev)); + else if (strcmp(key, "SCHILY.devminor")==0) + st->st_rdev = makedev(major(st->st_rdev), + tar_atol10(value, strlen(value))); + else if (strcmp(key, "SCHILY.fflags")==0) + archive_entry_set_fflags(entry, value); + break; + case 'a': + if (strcmp(key, "atime")==0) + pax_time(value, &(st->st_atimespec)); + break; + case 'c': + if (strcmp(key, "ctime")==0) + pax_time(value, &(st->st_ctimespec)); + else if (strcmp(key, "charset")==0) { + /* TODO: Publish charset information in entry. */ + } else if (strcmp(key, "comment")==0) { + /* TODO: Publish comment in entry. */ + } + break; + case 'g': + if (strcmp(key, "gid")==0) + st->st_gid = tar_atol10(value, strlen(value)); + else if (strcmp(key, "gname")==0) + archive_entry_set_gname(entry, value); + break; + case 'l': + /* pax interchange doesn't distinguish hardlink vs. symlink. */ + if (strcmp(key, "linkpath")==0) { + if (archive_entry_hardlink(entry)) + archive_entry_set_hardlink(entry, value); + else + archive_entry_set_symlink(entry, value); + } + break; + case 'm': + if (strcmp(key, "mtime")==0) + pax_time(value, &(st->st_mtimespec)); + break; + case 'p': + if (strcmp(key, "path")==0) + archive_entry_set_pathname(entry, value); + break; + case 'r': + /* POSIX has reserved 'realtime.*' */ + break; + case 's': + /* POSIX has reserved 'security.*' */ + /* Someday: if (strcmp(key, "security.acl")==0) { ... 
} */ + if (strcmp(key, "size")==0) + st->st_size = tar_atol10(value, strlen(value)); + break; + case 'u': + if (strcmp(key, "uid")==0) + st->st_uid = tar_atol10(value, strlen(value)); + else if (strcmp(key, "uname")==0) + archive_entry_set_uname(entry, value); + break; + } + return (0); +} + + + +/* + * Parse a decimal time value, which may include a fractional portion. + */ +static void +pax_time(const char *p, struct timespec *t) +{ + char digit; + int64_t s; + unsigned long l; + int sign; + + static const int64_t limit64 = INT64_MAX / 10; + static const char last_digit_limit64 = INT64_MAX % 10; + + s = 0; + sign = 1; + if (*p == '-') { + sign = -1; + p++; + } + while (*p >= '0' && *p <= '9') { + digit = *p - '0'; + if (s > limit64 || + (s == limit64 && digit > last_digit_limit64)) { + s = INT64_MAX; /* Truncate on overflow */ + break; + } + s = (s * 10) + digit; + ++p; + } + + t->tv_sec = s * sign; + + /* Calculate nanoseconds. */ + t->tv_nsec = 0; + + if (*p != '.') + return; + + l = 100000000UL; + do { + ++p; + if (*p >= '0' && *p <= '9') + t->tv_nsec += (*p - '0') * l; + else + break; + } while (l /= 10); +} + +/*- + * Convert text->integer. + * + * Traditional tar formats (including POSIX) specify base-8 for + * all of the standard numeric fields. This is a significant limitation + * in practice: + * = file size is limited to 8GB + * = devmajor and devminor are limited to 21 bits + * = uid/gid are limited to 21 bits + * + * There are two workarounds for this: + * = pax extended headers, which use variable-length string fields + * = GNU tar and STAR both allow either base-8 or base-256 in + * most fields. The high bit is set to indicate base-256. + * + * On read, this implementation supports both extensions. + */ +static int64_t +tar_atol(const char *p, unsigned char_cnt) +{ + if (*p & 0x80) + return (tar_atol256(p, char_cnt)); + return (tar_atol8(p, char_cnt)); +} + +/* + * Note that this implementation does not (and should not!) 
obey + * locale settings; you cannot simply substitute strtol here, since + * it does obey locale. + */ +static int64_t +tar_atol8(const char *p, unsigned char_cnt) +{ + int64_t l; + int digit, sign; + + static const int64_t limit = INT64_MAX / 8; + static const int base = 8; + static const char last_digit_limit = INT64_MAX % 8; + + while (*p == ' ' || *p == '\t') + p++; + if (*p == '-') { + sign = -1; + p++; + } else + sign = 1; + + l = 0; + digit = *p - '0'; + while (digit >= 0 && digit < base && char_cnt-- > 0) { + if (l>limit || (l == limit && digit > last_digit_limit)) { + l = INT64_MAX; /* Truncate on overflow */ + break; + } + l = (l * base) + digit; + digit = *++p - '0'; + } + return (sign < 0) ? -l : l; +} + + +/* + * Note that this implementation does not (and should not!) obey + * locale settings; you cannot simply substitute strtol here, since + * it does obey locale. + */ +static int64_t +tar_atol10(const char *p, unsigned char_cnt) +{ + int64_t l; + int digit, sign; + + static const int64_t limit = INT64_MAX / 10; + static const int base = 10; + static const char last_digit_limit = INT64_MAX % 10; + + while (*p == ' ' || *p == '\t') + p++; + if (*p == '-') { + sign = -1; + p++; + } else + sign = 1; + + l = 0; + digit = *p - '0'; + while (digit >= 0 && digit < base && char_cnt-- > 0) { + if (l > limit || (l == limit && digit > last_digit_limit)) { + l = INT64_MAX; /* Truncate on overflow */ + break; + } + l = (l * base) + digit; + digit = *++p - '0'; + } + return (sign < 0) ? -l : l; +} + + + +/* + * Parse a base-256 integer. + */ +static int64_t +tar_atol256(const char *p, unsigned char_cnt) +{ + int64_t l; + int digit; + + const int64_t limit = INT64_MAX / 256; + + /* Ignore high bit of first byte (that's the base-256 flag). 
*/ + l = 0; + digit = 0x7f & *(const unsigned char *)p; + while (char_cnt-- > 0) { + if (l > limit) { + l = INT64_MAX; /* Truncate on overflow */ + break; + } + l = (l << 8) + digit; + digit = *(const unsigned char *)++p; + } + return (l); +} diff --git a/lib/libarchive/archive_string.c b/lib/libarchive/archive_string.c new file mode 100644 index 0000000..8f50a42 --- /dev/null +++ b/lib/libarchive/archive_string.c @@ -0,0 +1,146 @@ +/*- + * Copyright (c) 2003-2004 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer + * in this position and unchanged. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +/* + * Basic resizable string support, to simplify manipulating arbitrary-sized + * strings while minimizing heap activity. 
+ */ + +#ifdef DMALLOC +#include <dmalloc.h> +#endif +#include <err.h> +#include <stdlib.h> +#include <string.h> + +#include "archive_string.h" + +struct archive_string * +__archive_string_append(struct archive_string *as, const char *p, size_t s) +{ + __archive_string_ensure(as, as->length + s + 1); + memcpy(as->s + as->length, p, s); + as->s[as->length + s] = 0; + as->length += s; + return (as); +} + +void +__archive_string_free(struct archive_string *as) +{ + as->length = 0; + as->buffer_length = 0; + if (as->s != NULL) + free(as->s); +} + +struct archive_string * +__archive_string_ensure(struct archive_string *as, size_t s) +{ + if (as->s && (s <= as->buffer_length)) + return (as); + + if (as->buffer_length < 32) + as->buffer_length = 32; + while (as->buffer_length < s) + as->buffer_length *= 2; + as->s = realloc(as->s, as->buffer_length); + if (as->s == NULL) + errx(1,"Out of memory"); + return (as); +} + +struct archive_string * +__archive_strncat(struct archive_string *as, const char *p, size_t n) +{ + size_t s; + const char *pp; + + /* Like strlen(p), except won't examine positions beyond p[n]. 
*/ + s = 0; + pp = p; + while (*pp && s < n) { + pp++; + s++; + } + return (__archive_string_append(as, p, s)); +} + +struct archive_string * +__archive_strappend_char(struct archive_string *as, char c) +{ + return (__archive_string_append(as, &c, 1)); +} + +#if 0 +/* Append Unicode character to string using UTF8 encoding */ +struct archive_string * +__archive_strappend_char_UTF8(struct archive_string *as, int c) +{ + char buff[6]; + + if (c <= 0x7f) { + buff[0] = c; + return (__archive_string_append(as, buff, 1)); + } else if (c <= 0x7ff) { + buff[0] = 0xc0 | (c >> 6); + buff[1] = 0x80 | (c & 0x3f); + return (__archive_string_append(as, buff, 2)); + } else if (c <= 0xffff) { + buff[0] = 0xe0 | (c >> 12); + buff[1] = 0x80 | ((c >> 6) & 0x3f); + buff[2] = 0x80 | (c & 0x3f); + return (__archive_string_append(as, buff, 3)); + } else if (c <= 0x1fffff) { + buff[0] = 0xf0 | (c >> 18); + buff[1] = 0x80 | ((c >> 12) & 0x3f); + buff[2] = 0x80 | ((c >> 6) & 0x3f); + buff[3] = 0x80 | (c & 0x3f); + return (__archive_string_append(as, buff, 4)); + } else if (c <= 0x3ffffff) { + buff[0] = 0xf8 | (c >> 24); + buff[1] = 0x80 | ((c >> 18) & 0x3f); + buff[2] = 0x80 | ((c >> 12) & 0x3f); + buff[3] = 0x80 | ((c >> 6) & 0x3f); + buff[4] = 0x80 | (c & 0x3f); + return (__archive_string_append(as, buff, 5)); + } else if (c <= 0x7fffffff) { + buff[0] = 0xfc | (c >> 30); + buff[1] = 0x80 | ((c >> 24) & 0x3f); + buff[2] = 0x80 | ((c >> 18) & 0x3f); + buff[3] = 0x80 | ((c >> 12) & 0x3f); + buff[4] = 0x80 | ((c >> 6) & 0x3f); + buff[5] = 0x80 | (c & 0x3f); + return (__archive_string_append(as, buff, 6)); + } else { + /* TODO: Handle this error?? */ + return (as); + } +} +#endif diff --git a/lib/libarchive/archive_string.h b/lib/libarchive/archive_string.h new file mode 100644 index 0000000..f6076a4 --- /dev/null +++ b/lib/libarchive/archive_string.h @@ -0,0 +1,111 @@ +/*- + * Copyright (c) 2003-2004 Tim Kientzle + * All rights reserved. 
+ * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer + * in this position and unchanged. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + * + * $FreeBSD$ + * + */ + +#ifndef ARCHIVE_STRING_H_INCLUDED +#define ARCHIVE_STRING_H_INCLUDED + +#include <stdarg.h> +#include <string.h> + +/* + * Basic resizable/reusable string support a la Java's "StringBuffer." + * + * Unlike sbuf(9), the buffers here are fully reusable and track the + * length throughout. + * + * Note that all visible symbols here begin with "__archive" as they + * are internal symbols not intended for anyone outside of this library + * to see or use. 
+ */ + +struct archive_string { + char *s; /* Pointer to the storage */ + size_t length; /* Length of 's' */ + size_t buffer_length; /* Length of malloc-ed storage */ +}; + +#define EMPTY_ARCHIVE_STRING {0,0,0} + +/* Append a C char to an archive_string, resizing as necessary. */ +struct archive_string * +__archive_strappend_char(struct archive_string *, char); +#define archive_strappend_char __archive_strappend_char + + +/* Append a char to an archive_string using UTF8. */ +struct archive_string * +__archive_strappend_char_UTF8(struct archive_string *, int); +#define archive_strappend_char_UTF8 __archive_strappend_char_UTF8 + +/* Basic append operation. */ +struct archive_string * +__archive_string_append(struct archive_string *as, const char *p, size_t s); + +/* Ensure that the underlying buffer is at least as large as the request. */ +struct archive_string * +__archive_string_ensure(struct archive_string *, size_t); +#define archive_string_ensure __archive_string_ensure + +/* Append C string, which may lack trailing \0. */ +struct archive_string * +__archive_strncat(struct archive_string *, const char *, size_t); +#define archive_strncat __archive_strncat + +/* Append a C string to an archive_string, resizing as necessary. */ +#define archive_strcat(as,p) __archive_string_append((as),(p),strlen(p)) + +/* Copy a C string to an archive_string, resizing as necessary. */ +#define archive_strcpy(as,p) \ + ((as)->length = 0, __archive_string_append((as), (p), strlen(p))) + +/* Copy a C string to an archive_string with limit, resizing as necessary. */ +#define archive_strncpy(as,p,l) \ + ((as)->length=0,archive_strncat((as), (p), (l))) + +/* Return length of string. */ +#define archive_strlen(a) ((a)->length) + +/* Set string length to zero. */ +#define archive_string_empty(a) ((a)->length = 0) + +/* Release any allocated storage resources. 
*/ +void __archive_string_free(struct archive_string *); +#define archive_string_free __archive_string_free + +/* Like 'vsprintf', but resizes the underlying string as necessary. */ +void __archive_string_vsprintf(struct archive_string *, const char *, + va_list); +#define archive_string_vsprintf __archive_string_vsprintf + +/* Like 'sprintf', but resizes the underlying string as necessary. */ +void __archive_string_sprintf(struct archive_string *, const char *, ...); +#define archive_string_sprintf __archive_string_sprintf + + +#endif diff --git a/lib/libarchive/archive_string_sprintf.c b/lib/libarchive/archive_string_sprintf.c new file mode 100644 index 0000000..c24a747 --- /dev/null +++ b/lib/libarchive/archive_string_sprintf.c @@ -0,0 +1,79 @@ +/*- + * Copyright (c) 2003-2004 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer + * in this position and unchanged. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
+ * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +/* + * This uses 'printf' family functions, which can cause issues + * for size-critical applications. I've separated it out to make + * this issue clear. (Currently, it is called directly from within + * the core code, so it cannot easily be omitted.) + */ + +#ifdef DMALLOC +#include <dmalloc.h> +#endif +#include <err.h> +#include <stdio.h> + +#include "archive_string.h" + +/* + * Like 'vsprintf', but ensures the target is big enough, resizing if + * necessary. + */ +void +__archive_string_vsprintf(struct archive_string *as, const char *fmt, + va_list ap) +{ + size_t l; + + if (fmt == NULL) { + as->s[0] = 0; + return; + } + + l = vsnprintf(as->s, as->buffer_length, fmt, ap); + /* If output is bigger than the buffer, resize and try again. */ + if (l+1 >= as->buffer_length) { + __archive_string_ensure(as, l + 1); + l = vsnprintf(as->s, as->buffer_length, fmt, ap); + } +} + +/* + * Corresponding 'sprintf' interface. + */ +void +__archive_string_sprintf(struct archive_string *as, const char *fmt, ...) +{ + va_list ap; + + va_start(ap, fmt); + __archive_string_vsprintf(as, fmt, ap); + va_end(ap); +} diff --git a/lib/libarchive/archive_util.3 b/lib/libarchive/archive_util.3 new file mode 100644 index 0000000..eb6478d --- /dev/null +++ b/lib/libarchive/archive_util.3 @@ -0,0 +1,113 @@ +.\" Copyright (c) 2003-2004 Tim Kientzle +.\" All rights reserved. 
+.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. 
+.\" +.\" $FreeBSD$ +.\" +.Dd October 1, 2003 +.Dt archive_util 3 +.Os +.Sh NAME +.Nm archive_compression , +.Nm archive_compression_name , +.Nm archive_errno , +.Nm archive_error_string , +.Nm archive_format , +.Nm archive_format_name , +.Nm archive_set_error +.Nd libarchive utility functions +.Sh SYNOPSIS +.In archive.h +.Ft int +.Fn archive_compression "struct archive *" +.Ft const char * +.Fn archive_compression_name "struct archive *" +.Ft int +.Fn archive_errno "struct archive *" +.Ft const char * +.Fn archive_error_string "struct archive *" +.Ft int +.Fn archive_format "struct archive *" +.Ft const char * +.Fn archive_format_name "struct archive *" +.Ft void +.Fn archive_set_error "struct archive *" "int error_code" "const char *fmt" "..." +.Sh DESCRIPTION +These functions provide access to various information about the +.Tn struct archive +object used in the +.Xr libarchive 3 +library. +.Bl -tag -compact -width indent +.It Fn archive_compression +Returns a numeric code indicating the current compression. +This value is set by +.Fn archive_read_open . +.It Fn archive_compression_name +Returns a text description of the current compression suitable for display. +.It Fn archive_errno +Returns a numeric error code (see +.Xr errno 2 ) +indicating the reason for the most recent error return. +.It Fn archive_error_string +Returns a textual error message suitable for display. +The error message here is usually more specific than that +obtained from passing the result of +.Fn archive_errno +to +.Xr strerror 3 . +.It Fn archive_format +Returns a numeric code indicating the format of the current +archive entry. +This value is set by a successful call to +.Fn archive_read_next_header . +Note that it is common for this value to change from +entry to entry. +For example, a tar archive might have several entries that +utilize GNU tar extensions and several entries that do not. +These entries will have different format codes. 
+.It Fn archive_format_name +A textual description of the format of the current entry. +.It Fn archive_set_error +Sets the numeric error code and error description that will be returned +by +.Fn archive_errno +and +.Fn archive_error_string . +This function is sometimes useful within I/O callbacks. +.El +.Sh SEE ALSO +.Xr libarchive 3 , +.Xr archive_read 3 , +.Xr archive_write 3 +.Sh HISTORY +The +.Nm libarchive +library first appeared in +.Fx 5.3 . +.Sh AUTHORS +.An -nosplit +The +.Nm libarchive +library was written by +.An Tim Kientzle Aq kientzle@acm.org . diff --git a/lib/libarchive/archive_util.c b/lib/libarchive/archive_util.c new file mode 100644 index 0000000..a3b1a6a --- /dev/null +++ b/lib/libarchive/archive_util.c @@ -0,0 +1,101 @@ +/*- + * Copyright (c) 2003-2004 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer + * in this position and unchanged. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
+ * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#ifdef DMALLOC +#include <dmalloc.h> +#endif +#include <string.h> + +#include "archive.h" +#include "archive_private.h" + +int +archive_errno(struct archive *a) +{ + return (a->archive_error_number); +} + +const char * +archive_error_string(struct archive *a) +{ + + if (a->error != NULL && *a->error != '\0') + return (a->error); + else + return (NULL); +} + + +int +archive_format(struct archive *a) +{ + return (a->archive_format); +} + +const char * +archive_format_name(struct archive *a) +{ + return (a->archive_format_name); +} + + +int +archive_compression(struct archive *a) +{ + return (a->compression_code); +} + +const char * +archive_compression_name(struct archive *a) +{ + return (a->compression_name); +} + +void +archive_set_error(struct archive *a, int error_number, const char *fmt, ...) 
+{ + va_list ap; + char errbuff[512]; + + a->archive_error_number = error_number; + if (fmt == NULL) { + a->error = NULL; + return; + } + + va_start(ap, fmt); + archive_string_vsprintf(&(a->error_string), fmt, ap); + if(error_number > 0) { + archive_strcat(&(a->error_string), ": "); + strerror_r(error_number, errbuff, sizeof(errbuff)); + archive_strcat(&(a->error_string), errbuff); + } + a->error = a->error_string.s; + va_end(ap); +} diff --git a/lib/libarchive/archive_write.3 b/lib/libarchive/archive_write.3 new file mode 100644 index 0000000..2cebaab --- /dev/null +++ b/lib/libarchive/archive_write.3 @@ -0,0 +1,368 @@ +.\" Copyright (c) 2003-2004 Tim Kientzle +.\" All rights reserved. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. 
IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" $FreeBSD$ +.\" +.Dd October 1, 2003 +.Dt archive_write 3 +.Os +.Sh NAME +.Nm archive_write_new , +.Nm archive_write_set_format_cpio , +.Nm archive_write_set_format_pax , +.Nm archive_write_set_format_pax_restricted , +.Nm archive_write_set_format_shar , +.Nm archive_write_set_format_shar_binary , +.Nm archive_write_set_format_ustar , +.Nm archive_write_set_bytes_per_block , +.Nm archive_write_set_bytes_in_last_block , +.Nm archive_write_set_compression_gzip , +.Nm archive_write_set_compression_bzip2 , +.Nm archive_write_open , +.Nm archive_write_open_file , +.Nm archive_write_open_tar , +.Nm archive_write_prepare , +.Nm archive_write_header , +.Nm archive_write_data , +.Nm archive_write_finish +.Nd functions for creating archives +.Sh SYNOPSIS +.In archive.h +.Ft struct archive * +.Fn archive_write_new "void" +.Ft int +.Fn archive_write_set_bytes_per_block "struct archive *" "int bytes_per_block" +.Ft int +.Fn archive_write_set_bytes_in_last_block "struct archive *" "int" +.Ft int +.Fn archive_write_set_compression_gzip "struct archive *" +.Ft int +.Fn archive_write_set_compression_bzip2 "struct archive *" +.Ft int +.Fn archive_write_set_format_cpio "struct archive *" +.Ft int +.Fn archive_write_set_format_pax "struct archive *" +.Ft int +.Fn archive_write_set_format_pax_restricted "struct archive *" +.Ft int +.Fn archive_write_set_format_shar "struct archive *" +.Ft int +.Fn archive_write_set_format_shar_binary "struct archive *" +.Ft int 
+.Fn archive_write_set_format_ustar "struct archive *" +.Ft int +.Fn archive_write_open "struct archive *" "void *client_data" "archive_open_archive_callback *" "archive_write_archive_callback *" "archive_close_archive_callback *" +.Ft int +.Fn archive_write_open_file "struct archive *" "const char *filename" +.Ft int +.Fn archive_write_open_tar "struct archive *" "const char *archive_name" +.Ft int +.Fn archive_write_header "struct archive *" "struct archive_entry *" +.Ft int +.Fn archive_write_data "struct archive *" "const void *" "size_t" +.Ft void +.Fn archive_write_finish "struct archive *" +.Sh DESCRIPTION +These functions provide a complete API for creating streaming +archive files. +The general process is to first create the +.Tn struct archive +object, set any desired options, initialize the archive, append entries, then +close the archive and release all resources. +The following summary describes the functions in approximately +the order they are ordinarily used: +.Bl -tag -width indent +.It Fn archive_write_new +Allocates and initializes a +.Tn struct archive +object suitable for writing a tar archive. +.It Fn archive_write_set_bytes_per_block +Sets the block size used for writing the archive data. +Every call to the write callback function, except possibly the last one, will +use this value for the length. +The default is to use a block size of 10240 bytes and to pad the last block; +padding of the last block is controlled by +.Fn archive_write_set_bytes_in_last_block . +.It Fn archive_write_set_bytes_in_last_block +Sets the block size used for writing the last block. +If this value is zero, the last block will be padded to the same size +as the other blocks. +Otherwise, the final block will be padded to a multiple of this size. +In particular, setting it to 1 will cause the final block to not be padded. 
+For compressed output, any padding generated by this option +is applied only after the compression. +The uncompressed data is always unpadded. +The default is to pad the last block to the full block size (note that +.Fn archive_write_open_file +affects this). +Unlike the other +.Dq set +functions, this function can be called after the archive is opened. +.It Fn archive_write_set_format_cpio , Fn archive_write_set_format_pax , Fn archive_write_set_format_pax_restricted , Fn archive_write_set_format_shar , Fn archive_write_set_format_shar_binary , Fn archive_write_set_format_ustar +Sets the format that will be used for the archive. +The library can write +POSIX octet-oriented cpio format archives, +POSIX-standard +.Dq pax interchange +format archives, +traditional +.Dq shar +archives, +enhanced +.Dq binary +shar archives that store a variety of file attributes and handle binary files, +and +POSIX-standard +.Dq ustar +archives. +The pax interchange format is a backwards-compatible tar format that +adds key/value attributes to each entry and supports arbitrary +filenames, linknames, uids, sizes, etc. +.Dq Restricted pax interchange format +is the library default; this is the same as pax format, but suppresses +the pax extended header for most normal files. +In most cases, this will result in ordinary ustar archives. +.It Fn archive_write_set_compression_gzip , Fn archive_write_set_compression_bzip2 +The resulting archive will be compressed as specified. +Note that the compressed output is always properly blocked. +.It Fn archive_write_open +Freeze the settings, open the archive, and prepare for writing entries. +This is the most generic form of this function, which accepts +pointers to three callback functions which will be invoked by +the library to write the constructed archive. +.It Fn archive_write_open_file +A convenience form of +.Fn archive_write_open +that accepts a filename. 
+A NULL argument indicates that the output should be written to standard output; +an argument of +.Dq - +will open a file with that name. +If you have not invoked +.Fn archive_write_set_bytes_in_last_block , +then +.Fn archive_write_open_file +will adjust the last-block padding depending on the file: +it will enable padding when writing to standard output or +to a character or block device node; it will disable padding otherwise. +You can override this by manually invoking +.Fn archive_write_set_bytes_in_last_block +either before or after calling +.Fn archive_write_open . +.It Fn archive_write_open_tar +A convenience form of +.Fn archive_write_open +that accepts an archive name in the same formats accepted by +.Xr tar 1 . +In particular, a +.Pa - +argument indicates that the output should be written to standard output. +.It Fn archive_write_header +Build and write a header using the data in the provided +.Tn struct archive_entry +structure. +.It Fn archive_write_data +Write data corresponding to the header just written. +.It Fn archive_write_finish +Complete the archive, invoke the close callback, and release +all resources. +.El +.Pp +The callback functions are defined as follows: +.Bl -item -offset indent +.It +.Ft typedef ssize_t +.Fn archive_write_archive_callback "struct archive *" "void *client_data" "void *buffer" "size_t length" +.It +.Ft typedef int +.Fn archive_open_archive_callback "struct archive *" "void *client_data" +.It +.Ft typedef int +.Fn archive_close_archive_callback "struct archive *" "void *client_data" +.El +For correct blocking, each call to the write callback function +should translate into a single +.Xr write 2 +system call. +This is especially critical when writing tar archives to tape drives. +.Pp +More information about tar archive formats and blocking can be found +in the +.Xr tar 5 +manual page. 
+.Pp +More information about the +.Va struct archive +object and the overall design of the library can be found in the +.Xr libarchive 3 +overview. +.Sh IMPLEMENTATION +Compression support is built into libarchive, which uses zlib and bzlib +to handle gzip and bzip2 compression, respectively. +.Sh EXAMPLE +The following sketch illustrates basic usage of the library. In this example, +the callback functions are simply wrappers around the standard +.Xr open 2 , +.Xr write 2 , +and +.Xr close 2 +system calls. +.Bd -literal -offset indent +void +write_archive(const char *outname, const char **filename) +{ + struct mydata *mydata = malloc(sizeof(struct mydata)); + struct archive *a; + struct archive_entry *entry; + struct stat st; + char buff[8192]; + int fd; + int len; + + a = archive_write_new(); + mydata->name = outname; + archive_write_set_compression_gzip(a); + archive_write_set_format_ustar(a); + archive_write_open(a, mydata, myopen, mywrite, myclose); + while (*filename) { + stat(*filename, &st); + entry = archive_entry_new(); + archive_entry_copy_stat(entry, &st); + archive_entry_set_pathname(entry, *filename); + archive_write_header(a, entry); + fd = open(*filename, O_RDONLY); + len = read(fd, buff, sizeof(buff)); + /* read(2) returns 0 at end of file and -1 on error. */ + while ( len > 0 ) { + archive_write_data(a, buff, len); + len = read(fd, buff, sizeof(buff)); + } + close(fd); + archive_entry_free(entry); + filename++; + } + archive_write_finish(a); + free(mydata); +} + +int +myopen(struct archive *a, void *client_data) +{ + struct mydata *mydata = client_data; + + mydata->fd = open(mydata->name, O_WRONLY | O_CREAT, 0644); + /* The open callback returns zero on success. */ + return (mydata->fd >= 0 ? ARCHIVE_OK : ARCHIVE_FATAL); +} + +ssize_t +mywrite(struct archive *a, void *client_data, void *buff, size_t n) +{ + struct mydata *mydata = client_data; + + return (write(mydata->fd, buff, n)); +} + +int +myclose(struct archive *a, void *client_data) +{ + struct mydata *mydata = client_data; + + if (mydata->fd >= 0) + close(mydata->fd); + return (0); +} +.Ed +.Sh RETURN VALUES +Most functions return zero on success, non-zero on error. 
+The +.Fn archive_errno +and +.Fn archive_error_string +functions can be used to retrieve an appropriate error code and a +textual error message. +.Pp +.Fn archive_write_new +returns a pointer to a newly-allocated +.Tn struct archive +object. +.Pp +.Fn archive_write_data +returns a count of the number of bytes actually written. +On error, -1 is returned and the +.Fn archive_errno +and +.Fn archive_error_string +functions will return appropriate values. +Note that if the client-provided write callback function +returns -1, that error will be propagated back to the caller +through whatever API function resulted in that call, which +may include +.Fn archive_write_header , +.Fn archive_write_data , +or +.Fn archive_write_finish . +In such a case, the +.Fn archive_errno +or +.Fn archive_error_string +functions will not return useful information; you should use +client-private data to return error information +back to your mainline code. +.Sh SEE ALSO +.Xr tar 1 , +.Xr libarchive 3 , +.Xr tar 5 +.Sh HISTORY +The +.Nm libarchive +library first appeared in +.Fx 5.3 . +.Sh AUTHORS +.An -nosplit +The +.Nm libarchive +library was written by +.An Tim Kientzle Aq kientzle@acm.org . +.Sh BUGS +There are many peculiar bugs in historic tar implementations that may cause +certain programs to reject archives written by this library. +For example, several historic implementations calculated header checksums +incorrectly and will thus reject valid archives; GNU tar does not fully support +pax interchange format; some old tar implementations required specific +field terminations. +.Pp +The default pax interchange format eliminates most of the historic +tar limitations and provides a generic key/value attribute facility +for vendor-defined extensions. +One oversight in POSIX is the failure to provide a standard attribute +for large device numbers. 
+This library uses +.Dq SCHILY.devminor +and +.Dq SCHILY.devmajor +for device numbers that exceed the range supported by the backwards-compatible +ustar header. +These keys are compatible with Joerg Schilling's +.Nm star +archiver. +Other implementations may not recognize these keys and will thus be unable +to correctly restore large device numbers archived by this library. diff --git a/lib/libarchive/archive_write.c b/lib/libarchive/archive_write.c new file mode 100644 index 0000000..cd8434e --- /dev/null +++ b/lib/libarchive/archive_write.c @@ -0,0 +1,220 @@ +/*- + * Copyright (c) 2003-2004 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer + * in this position and unchanged. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +/* + * This file contains the "essential" portions of the write API, that + * is, stuff that will essentially always be used by any client that + * actually needs to write an archive. Optional pieces have been, as + * far as possible, separated out into separate files to avoid + * needlessly bloating statically-linked clients. + */ + +#include <sys/errno.h> +#include <sys/wait.h> +#ifdef DMALLOC +#include <dmalloc.h> +#endif +#include <limits.h> +#include <paths.h> +#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include <tar.h> +#include <time.h> +#include <unistd.h> + +#include "archive.h" +#include "archive_private.h" + +extern char **environ; + +/* + * Allocate, initialize and return an archive object. + */ +struct archive * +archive_write_new(void) +{ + struct archive *a; + char *nulls; + + a = malloc(sizeof(*a)); + if (a == NULL) + return (NULL); + memset(a, 0, sizeof(*a)); + a->magic = ARCHIVE_WRITE_MAGIC; + a->user_uid = geteuid(); + a->bytes_per_block = ARCHIVE_DEFAULT_BYTES_PER_BLOCK; + a->bytes_in_last_block = -1; /* Default */ + a->state = ARCHIVE_STATE_NEW; + a->pformat_data = &(a->format_data); + + /* Initialize a block of nulls for padding purposes. */ + a->null_length = 1024; + nulls = malloc(a->null_length); + if (nulls == NULL) { + free(a); + return (NULL); + } + memset(nulls, 0, a->null_length); + a->nulls = nulls; + /* + * Set default compression, but don't set a default format. + * Were we to set a default format here, we would force every + * client to link in support for that format, even if they didn't + * ever use it. + */ + archive_write_set_compression_none(a); + return (a); +} + + +/* + * Set the block size. Returns 0 if successful. 
+ */ +int +archive_write_set_bytes_per_block(struct archive *a, int bytes_per_block) +{ + archive_check_magic(a, ARCHIVE_WRITE_MAGIC, ARCHIVE_STATE_NEW); + a->bytes_per_block = bytes_per_block; + return (ARCHIVE_OK); +} + + +/* + * Set the size for the last block. + * Returns 0 if successful. + */ +int +archive_write_set_bytes_in_last_block(struct archive *a, int bytes) +{ + archive_check_magic(a, ARCHIVE_WRITE_MAGIC, ARCHIVE_STATE_ANY); + a->bytes_in_last_block = bytes; + return (ARCHIVE_OK); +} + + +/* + * Open the archive using the current settings. + */ +int +archive_write_open(struct archive *a, void *client_data, + archive_open_callback *opener, archive_write_callback *writer, + archive_close_callback *closer) +{ + int ret; + + ret = ARCHIVE_OK; + archive_check_magic(a, ARCHIVE_WRITE_MAGIC, ARCHIVE_STATE_NEW); + a->state = ARCHIVE_STATE_HEADER; + a->client_data = client_data; + a->client_writer = writer; + a->client_opener = opener; + a->client_closer = closer; + ret = (a->compression_init)(a); + if (a->format_init && ret == ARCHIVE_OK) + ret = (a->format_init)(a); + return (ret); +} + + +/* + * Cleanup and free the archive object. + * + * Be careful: user might just call write_new and then write_finish. + * Don't assume we actually wrote anything or performed any non-trivial + * initialization. + */ +void +archive_write_finish(struct archive *a) +{ + archive_check_magic(a, ARCHIVE_WRITE_MAGIC, ARCHIVE_STATE_ANY); + + /* Finish the last entry. */ + if (a->state & ARCHIVE_STATE_DATA) + ((a->format_finish_entry)(a)); + + /* Finish off the archive. */ + if (a->format_finish != NULL) + (a->format_finish)(a); + + /* Finish the compression and close the stream. */ + if (a->compression_finish != NULL) + (a->compression_finish)(a); + + /* Release various dynamic buffers. 
*/ + free((void *)(uintptr_t)(const void *)a->nulls); + if (a->entry_name.s != NULL) + free(a->entry_name.s); + if (a->entry_linkname.s != NULL) + free(a->entry_linkname.s); + if (a->entry_uname.s != NULL) + free(a->entry_uname.s); + if (a->entry_gname.s != NULL) + free(a->entry_gname.s); + if (a->gnu_name.s != NULL) + free(a->gnu_name.s); + if (a->gnu_linkname.s != NULL) + free(a->gnu_linkname.s); + if (a->extract_mkdirpath.s != NULL) + free(a->extract_mkdirpath.s); + free(a); +} + + +/* + * Write the appropriate header. + */ +int +archive_write_header(struct archive *a, struct archive_entry *entry) +{ + int ret; + + archive_check_magic(a, ARCHIVE_WRITE_MAGIC, + ARCHIVE_STATE_HEADER | ARCHIVE_STATE_DATA); + + /* Finish last entry. */ + if (a->state & ARCHIVE_STATE_DATA) + ((a->format_finish_entry)(a)); + + /* Format and write header. */ + ret = ((a->format_write_header)(a, entry)); + + a->state = ARCHIVE_STATE_DATA; + return (ret); +} + +/* + * Note that the compressor is responsible for blocking. + */ +int +archive_write_data(struct archive *a, const void *buff, size_t s) +{ + archive_check_magic(a, ARCHIVE_WRITE_MAGIC, ARCHIVE_STATE_DATA); + return (a->format_write_data(a, buff, s)); +} diff --git a/lib/libarchive/archive_write_open_file.c b/lib/libarchive/archive_write_open_file.c new file mode 100644 index 0000000..65d3d45 --- /dev/null +++ b/lib/libarchive/archive_write_open_file.c @@ -0,0 +1,149 @@ +/*- + * Copyright (c) 2003-2004 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer + * in this position and unchanged. + * 2. 
Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <sys/stat.h> +#ifdef DMALLOC +#include <dmalloc.h> +#endif +#include <errno.h> +#include <fcntl.h> +#include <stdlib.h> +#include <string.h> +#include <unistd.h> + +#include "archive.h" +#include "archive_private.h" + +struct write_file_data { + intmax_t offset; + int fd; + char filename[1]; +}; + +static int file_close(struct archive *, void *); +static int file_open(struct archive *, void *); +static ssize_t file_write(struct archive *, void *, void *buff, size_t); + +int +archive_write_open_file(struct archive *a, const char *filename) +{ + return (archive_write_open_file_position(a, filename, 0)); +} + +int +archive_write_open_file_position(struct archive *a, const char *filename, + int64_t offset) +{ + struct write_file_data *mine; + + if (filename == NULL) { + mine = malloc(sizeof(*mine)); + mine->filename[0] = 0; + } else { + mine = malloc(sizeof(*mine) + strlen(filename)); + strcpy(mine->filename, filename); + } + mine->offset = offset; + 
mine->fd = -1; + return (archive_write_open(a, mine, + file_open, file_write, file_close)); +} + +static int +file_open(struct archive *a, void *client_data) +{ + int flags; + struct write_file_data *mine; + struct stat st; + + mine = client_data; + if (mine->offset == 0) + flags = O_WRONLY | O_CREAT | O_TRUNC; + else + flags = O_WRONLY; + + if (*mine->filename != 0) { + mine->fd = open(mine->filename, flags, 0666); + + /* + * If client hasn't explicitly set the last block + * handling, then set it here: If the output is a + * block or character device, pad the last block, + * otherwise leave it unpadded. + */ + if (mine->fd >= 0 && a->bytes_in_last_block < 0) { + /* Last block will be fully padded. */ + fstat(mine->fd, &st); + if (S_ISCHR(st.st_mode) || S_ISBLK(st.st_mode) || + S_ISFIFO(st.st_mode)) + archive_write_set_bytes_in_last_block(a, 0); + else + archive_write_set_bytes_in_last_block(a, 1); + } + } else { + mine->fd = 1; + if (a->bytes_in_last_block < 0) /* Still default? */ + /* Last block will be fully padded. 
*/ + archive_write_set_bytes_in_last_block(a, 0); + } + + if (mine->fd < 0) { + archive_set_error(a, errno, "Failed to open"); + return -1; + } + + if (mine->offset > 0) { + lseek(mine->fd, mine->offset, SEEK_SET); + ftruncate(mine->fd, mine->offset); + } + + return (ARCHIVE_OK); +} + +static ssize_t +file_write(struct archive *a, void *client_data, void *buff, size_t length) +{ + struct write_file_data *mine; + + (void)a; /* UNUSED */ + mine = client_data; + return (write(mine->fd, buff, length)); +} + +static int +file_close(struct archive *a, void *client_data) +{ + struct write_file_data *mine = client_data; + + (void)a; /* UNUSED */ + if (mine->fd >= 0) + close(mine->fd); + free(mine); + return (ARCHIVE_OK); +} diff --git a/lib/libarchive/archive_write_open_filename.c b/lib/libarchive/archive_write_open_filename.c new file mode 100644 index 0000000..65d3d45 --- /dev/null +++ b/lib/libarchive/archive_write_open_filename.c @@ -0,0 +1,149 @@ +/*- + * Copyright (c) 2003-2004 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer + * in this position and unchanged. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
+ * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <sys/stat.h> +#ifdef DMALLOC +#include <dmalloc.h> +#endif +#include <errno.h> +#include <fcntl.h> +#include <stdlib.h> +#include <string.h> +#include <unistd.h> + +#include "archive.h" +#include "archive_private.h" + +struct write_file_data { + intmax_t offset; + int fd; + char filename[1]; +}; + +static int file_close(struct archive *, void *); +static int file_open(struct archive *, void *); +static ssize_t file_write(struct archive *, void *, void *buff, size_t); + +int +archive_write_open_file(struct archive *a, const char *filename) +{ + return (archive_write_open_file_position(a, filename, 0)); +} + +int +archive_write_open_file_position(struct archive *a, const char *filename, + int64_t offset) +{ + struct write_file_data *mine; + + if (filename == NULL) { + mine = malloc(sizeof(*mine)); + mine->filename[0] = 0; + } else { + mine = malloc(sizeof(*mine) + strlen(filename)); + strcpy(mine->filename, filename); + } + mine->offset = offset; + mine->fd = -1; + return (archive_write_open(a, mine, + file_open, file_write, file_close)); +} + +static int +file_open(struct archive *a, void *client_data) +{ + int flags; + struct write_file_data *mine; + struct stat st; + + mine = client_data; + if (mine->offset == 0) + flags = O_WRONLY | O_CREAT | O_TRUNC; + else + flags = O_WRONLY; + + if (*mine->filename != 0) { + mine->fd = open(mine->filename, flags, 0666); + + /* + * If client hasn't 
explicitly set the last block + * handling, then set it here: If the output is a + * block or character device, pad the last block, + * otherwise leave it unpadded. + */ + if (mine->fd >= 0 && a->bytes_in_last_block < 0) { + /* Last block will be fully padded. */ + fstat(mine->fd, &st); + if (S_ISCHR(st.st_mode) || S_ISBLK(st.st_mode) || + S_ISFIFO(st.st_mode)) + archive_write_set_bytes_in_last_block(a, 0); + else + archive_write_set_bytes_in_last_block(a, 1); + } + } else { + mine->fd = 1; + if (a->bytes_in_last_block < 0) /* Still default? */ + /* Last block will be fully padded. */ + archive_write_set_bytes_in_last_block(a, 0); + } + + if (mine->fd < 0) { + archive_set_error(a, errno, "Failed to open"); + return -1; + } + + if (mine->offset > 0) { + lseek(mine->fd, mine->offset, SEEK_SET); + ftruncate(mine->fd, mine->offset); + } + + return (ARCHIVE_OK); +} + +static ssize_t +file_write(struct archive *a, void *client_data, void *buff, size_t length) +{ + struct write_file_data *mine; + + (void)a; /* UNUSED */ + mine = client_data; + return (write(mine->fd, buff, length)); +} + +static int +file_close(struct archive *a, void *client_data) +{ + struct write_file_data *mine = client_data; + + (void)a; /* UNUSED */ + if (mine->fd >= 0) + close(mine->fd); + free(mine); + return (ARCHIVE_OK); +} diff --git a/lib/libarchive/archive_write_set_compression_bzip2.c b/lib/libarchive/archive_write_set_compression_bzip2.c new file mode 100644 index 0000000..23828a2 --- /dev/null +++ b/lib/libarchive/archive_write_set_compression_bzip2.c @@ -0,0 +1,326 @@ +/*- + * Copyright (c) 2003-2004 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer + * in this position and unchanged. + * 2. 
Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#ifdef DMALLOC +#include <dmalloc.h> +#endif +#include <errno.h> +#include <stdlib.h> +#include <string.h> +#include <bzlib.h> + +#include "archive.h" +#include "archive_private.h" + +struct private_data { + bz_stream stream; + int64_t total_in; + char *compressed; + size_t compressed_buffer_size; +}; + + +/* + * Yuck. bzlib.h is not const-correct, so I need this one bit + * of ugly hackery to convert a const * pointer to a non-const pointer. + */ +#define SET_NEXT_IN(st,src) \ + (st)->stream.next_in = (void *)(uintptr_t)(const void *)(src) + +static int archive_compressor_bzip2_finish(struct archive *); +static int archive_compressor_bzip2_init(struct archive *); +static ssize_t archive_compressor_bzip2_write(struct archive *, const void *, + size_t); +static int drive_compressor(struct archive *, struct private_data *, + int finishing); + +/* + * Select bzip2 compression for the archive being written. 
+ */ +int +archive_write_set_compression_bzip2(struct archive *a) +{ + archive_check_magic(a, ARCHIVE_WRITE_MAGIC, ARCHIVE_STATE_NEW); + a->compression_init = &archive_compressor_bzip2_init; + a->compression_code = ARCHIVE_COMPRESSION_BZIP2; + a->compression_name = "bzip2"; + return (ARCHIVE_OK); +} + +/* + * Setup callback. + */ +static int +archive_compressor_bzip2_init(struct archive *a) +{ + int ret; + struct private_data *state; + + a->compression_code = ARCHIVE_COMPRESSION_BZIP2; + a->compression_name = "bzip2"; + + if (a->client_opener != NULL) { + ret = (a->client_opener)(a, a->client_data); + if (ret != 0) + return (ret); + } + + state = malloc(sizeof(*state)); + if (state == NULL) { + archive_set_error(a, ENOMEM, + "Can't allocate data for compression"); + return (ARCHIVE_FATAL); + } + memset(state, 0, sizeof(*state)); + + state->compressed_buffer_size = a->bytes_per_block; + state->compressed = malloc(state->compressed_buffer_size); + + if (state->compressed == NULL) { + archive_set_error(a, ENOMEM, + "Can't allocate data for compression buffer"); + free(state); + return (ARCHIVE_FATAL); + } + + state->stream.next_out = state->compressed; + state->stream.avail_out = state->compressed_buffer_size; + a->compression_write = archive_compressor_bzip2_write; + a->compression_finish = archive_compressor_bzip2_finish; + + /* Initialize compression library */ + ret = BZ2_bzCompressInit(&(state->stream), 9, 0, 30); + if (ret == BZ_OK) { + a->compression_data = state; + return (ARCHIVE_OK); + } + + /* Library setup failed: clean up. */ + archive_set_error(a, -1, + "Internal error initializing compression library"); + free(state->compressed); + free(state); + + /* Override the error message if we know what really went wrong. 
*/ + switch (ret) { + case BZ_PARAM_ERROR: + archive_set_error(a, -1, + "Internal error initializing compression library: " + "invalid setup parameter"); + break; + case BZ_MEM_ERROR: + archive_set_error(a, -1, + "Internal error initializing compression library: " + "out of memory"); + break; + case BZ_CONFIG_ERROR: + archive_set_error(a, -1, + "Internal error initializing compression library: " + "mis-compiled library"); + break; + } + + return (ARCHIVE_FATAL); + +} + +/* + * Write data to the compressed stream. + */ +static ssize_t +archive_compressor_bzip2_write(struct archive *a, const void *buff, + size_t length) +{ + struct private_data *state; + + state = a->compression_data; + if (!a->client_writer) { + archive_set_error(a, EINVAL, + "No write callback is registered? " + "This is probably an internal programming error."); + return (ARCHIVE_FATAL); + } + + /* Update statistics */ + state->total_in += length; + + /* Compress input data to output buffer */ + SET_NEXT_IN(state, buff); + state->stream.avail_in = length; + if (drive_compressor(a, state, 0)) + return (-1); + return (length); +} + + +/* + * Finish the compression. + */ +static int +archive_compressor_bzip2_finish(struct archive *a) +{ + ssize_t block_length; + int ret; + struct private_data *state; + ssize_t target_block_length; + unsigned tocopy; + + state = a->compression_data; + ret = ARCHIVE_OK; + if (a->client_writer == NULL) { + archive_set_error(a, EINVAL, + "No write callback is registered?\n" + "This is probably an internal programming error."); + ret = ARCHIVE_FATAL; + goto cleanup; + } + + /* By default, always pad the uncompressed data. */ + if (a->pad_uncompressed) { + tocopy = a->bytes_per_block - + (state->total_in % a->bytes_per_block); + while (tocopy > 0 && tocopy < (unsigned)a->bytes_per_block) { + SET_NEXT_IN(state, a->nulls); + state->stream.avail_in = tocopy < a->null_length ? 
+ tocopy : a->null_length; + state->total_in += state->stream.avail_in; + tocopy -= state->stream.avail_in; + ret = drive_compressor(a, state, 0); + if (ret != ARCHIVE_OK) + goto cleanup; + } + } + + /* Finish compression cycle. */ + if ((ret = drive_compressor(a, state, 1))) + goto cleanup; + + /* Optionally, pad the final compressed block. */ + block_length = state->stream.next_out - state->compressed; + + + /* Tricky calculation to determine size of last block. */ + target_block_length = block_length; + if (a->bytes_in_last_block <= 0) + /* Default or Zero: pad to full block */ + target_block_length = a->bytes_per_block; + else + /* Round length to next multiple of bytes_in_last_block. */ + target_block_length = a->bytes_in_last_block * + ( (block_length + a->bytes_in_last_block - 1) / + a->bytes_in_last_block); + if (target_block_length > a->bytes_per_block) + target_block_length = a->bytes_per_block; + if (block_length < target_block_length) { + memset(state->stream.next_out, 0, + target_block_length - block_length); + block_length = target_block_length; + } + + /* Write the last block */ + ret = (a->client_writer)(a, a->client_data, state->compressed, + block_length); + + if (ret != 0) + goto cleanup; + + /* Cleanup: shut down compressor, release memory, etc. */ +cleanup: + switch (BZ2_bzCompressEnd(&(state->stream))) { + case BZ_OK: + break; + default: + archive_set_error(a, -1, "Failed to clean up compressor"); + ret = ARCHIVE_FATAL; + } + + free(state->compressed); + free(state); + + /* Close the output */ + if (a->client_closer != NULL) + (a->client_closer)(a, a->client_data); + + return (ret); +} + +/* + * Utility function to push input data through compressor, writing + * full output blocks as necessary. + * + * Note that this handles both the regular write case (finishing == + * false) and the end-of-archive case (finishing == true). 
+ */ +static int +drive_compressor(struct archive *a, struct private_data *state, int finishing) +{ + size_t ret; + + for (;;) { + if (state->stream.avail_out == 0) { + ret = (a->client_writer)(a, a->client_data, + state->compressed, state->compressed_buffer_size); + if (ret <= 0) { + /* TODO: Handle this write failure */ + return (ARCHIVE_FATAL); + } else if (ret < state->compressed_buffer_size) { + /* Short write: Move remainder to + * front and keep filling */ + memmove(state->compressed, + state->compressed + ret, + state->compressed_buffer_size - ret); + } + + state->stream.next_out = state->compressed + + state->compressed_buffer_size - ret; + state->stream.avail_out = ret; + } + + ret = BZ2_bzCompress(&(state->stream), + finishing ? BZ_FINISH : BZ_RUN); + + switch (ret) { + case BZ_RUN_OK: + /* In non-finishing case, did compressor + * consume everything? */ + if (!finishing && state->stream.avail_in == 0) + return (ARCHIVE_OK); + break; + case BZ_FINISH_OK: /* Finishing: There's more work to do */ + break; + case BZ_STREAM_END: /* Finishing: all done */ + /* Only occurs in finishing case */ + return (ARCHIVE_OK); + default: + /* Any other return value indicates an error */ + archive_set_error(a, -1, "Bzip2 compression failed"); + return (ARCHIVE_FATAL); + } + } +} diff --git a/lib/libarchive/archive_write_set_compression_gzip.c b/lib/libarchive/archive_write_set_compression_gzip.c new file mode 100644 index 0000000..7c9ade6 --- /dev/null +++ b/lib/libarchive/archive_write_set_compression_gzip.c @@ -0,0 +1,380 @@ +/*- + * Copyright (c) 2003-2004 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer + * in this position and unchanged. + * 2. 
Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#ifdef DMALLOC +#include <dmalloc.h> +#endif +#include <errno.h> +#include <stdlib.h> +#include <string.h> +#include <time.h> +#include <zlib.h> + +#include "archive.h" +#include "archive_private.h" + +struct private_data { + z_stream stream; + int64_t total_in; + unsigned char *compressed; + size_t compressed_buffer_size; + unsigned long crc; +}; + + +/* + * Yuck. zlib.h is not const-correct, so I need this one bit + * of ugly hackery to convert a const * pointer to a non-const pointer. + */ +#define SET_NEXT_IN(st,src) \ + (st)->stream.next_in = (void *)(uintptr_t)(const void *)(src) + +static int archive_compressor_gzip_finish(struct archive *); +static int archive_compressor_gzip_init(struct archive *); +static ssize_t archive_compressor_gzip_write(struct archive *, const void *, + size_t); +static int drive_compressor(struct archive *, struct private_data *, + int finishing); + + +/* + * Allocate, initialize and return an archive object. 
+ */ +int +archive_write_set_compression_gzip(struct archive *a) +{ + archive_check_magic(a, ARCHIVE_WRITE_MAGIC, ARCHIVE_STATE_NEW); + a->compression_init = &archive_compressor_gzip_init; + a->compression_code = ARCHIVE_COMPRESSION_GZIP; + a->compression_name = "gzip"; + return (ARCHIVE_OK); +} + +/* + * Setup callback. + */ +static int +archive_compressor_gzip_init(struct archive *a) +{ + int ret; + struct private_data *state; + time_t t; + + a->compression_code = ARCHIVE_COMPRESSION_GZIP; + a->compression_name = "gzip"; + + if (a->client_opener != NULL) { + ret = (a->client_opener)(a, a->client_data); + if (ret != ARCHIVE_OK) + return (ret); + } + + state = (struct private_data *)malloc(sizeof(*state)); + if (state == NULL) { + archive_set_error(a, ENOMEM, + "Can't allocate data for compression"); + return (ARCHIVE_FATAL); + } + memset(state, 0, sizeof(*state)); + + state->compressed_buffer_size = a->bytes_per_block; + state->compressed = malloc(state->compressed_buffer_size); + state->crc = crc32(0L, NULL, 0); + + if (state->compressed == NULL) { + archive_set_error(a, ENOMEM, + "Can't allocate data for compression buffer"); + free(state); + return (ARCHIVE_FATAL); + } + + state->stream.next_out = state->compressed; + state->stream.avail_out = state->compressed_buffer_size; + + /* Prime output buffer with a gzip header. 
*/ + t = time(NULL); + state->compressed[0] = 0x1f; /* GZip signature bytes */ + state->compressed[1] = 0x8b; + state->compressed[2] = 0x08; /* "Deflate" compression */ + state->compressed[3] = 0; /* No options */ + state->compressed[4] = (t)&0xff; /* Timestamp */ + state->compressed[5] = (t>>8)&0xff; + state->compressed[6] = (t>>16)&0xff; + state->compressed[7] = (t>>24)&0xff; + state->compressed[8] = 0; /* No deflate options */ + state->compressed[9] = 3; /* OS=Unix */ + state->stream.next_out += 10; + state->stream.avail_out -= 10; + + a->compression_write = archive_compressor_gzip_write; + a->compression_finish = archive_compressor_gzip_finish; + + /* Initialize compression library. */ + ret = deflateInit2(&(state->stream), + Z_DEFAULT_COMPRESSION, + Z_DEFLATED, + -15 /* < 0 to suppress zlib header */, + 8, + Z_DEFAULT_STRATEGY); + + if (ret == Z_OK) { + a->compression_data = state; + return (0); + } + + /* Library setup failed: clean up. */ + archive_set_error(a, -1, "Internal error " + "initializing compression library"); + free(state->compressed); + free(state); + + /* Override the error message if we know what really went wrong. */ + switch (ret) { + case Z_STREAM_ERROR: + archive_set_error(a, EINVAL, "Internal error initializing " + "compression library: invalid setup parameter"); + break; + case Z_MEM_ERROR: + archive_set_error(a, ENOMEM, "Internal error initializing " + "compression library"); + break; + case Z_VERSION_ERROR: + archive_set_error(a, -1, "Internal error initializing " + "compression library: invalid library version"); + break; + } + + return (ARCHIVE_FATAL); +} + +/* + * Write data to the compressed stream. + */ +static ssize_t +archive_compressor_gzip_write(struct archive *a, const void *buff, + size_t length) +{ + struct private_data *state; + int ret; + + state = a->compression_data; + if (!a->client_writer) { + archive_set_error(a, EDOOFUS, + "No write callback is registered? 
" + "This is probably an internal programming error."); + return (ARCHIVE_FATAL); + } + + /* Update statistics */ + state->crc = crc32(state->crc, buff, length); + state->total_in += length; + + /* Compress input data to output buffer */ + SET_NEXT_IN(state, buff); + state->stream.avail_in = length; + if ((ret = drive_compressor(a, state, 0)) != ARCHIVE_OK) + return (ret); + + return (length); +} + + +/* + * Finish the compression... + */ +static int +archive_compressor_gzip_finish(struct archive *a) +{ + ssize_t block_length, target_block_length; + int ret; + struct private_data *state; + unsigned tocopy; + unsigned char trailer[8]; + + state = a->compression_data; + ret = 0; + if (a->client_writer == NULL) { + archive_set_error(a, EDOOFUS, + "No write callback is registered? " + "This is probably an internal programming error."); + ret = ARCHIVE_FATAL; + goto cleanup; + } + + /* By default, always pad the uncompressed data. */ + if (a->pad_uncompressed) { + tocopy = a->bytes_per_block - + (state->total_in % a->bytes_per_block); + while (tocopy > 0 && tocopy < (unsigned)a->bytes_per_block) { + SET_NEXT_IN(state, a->nulls); + state->stream.avail_in = tocopy < a->null_length ? + tocopy : a->null_length; + state->crc = crc32(state->crc, a->nulls, + state->stream.avail_in); + state->total_in += state->stream.avail_in; + tocopy -= state->stream.avail_in; + ret = drive_compressor(a, state, 0); + if (ret != ARCHIVE_OK) + goto cleanup; + } + } + + /* Finish compression cycle */ + if (((ret = drive_compressor(a, state, 1))) != ARCHIVE_OK) + goto cleanup; + + /* Build trailer: 4-byte CRC and 4-byte length. */ + trailer[0] = (state->crc)&0xff; + trailer[1] = (state->crc >> 8)&0xff; + trailer[2] = (state->crc >> 16)&0xff; + trailer[3] = (state->crc >> 24)&0xff; + trailer[4] = (state->total_in)&0xff; + trailer[5] = (state->total_in >> 8)&0xff; + trailer[6] = (state->total_in >> 16)&0xff; + trailer[7] = (state->total_in >> 24)&0xff; + + /* Add trailer to current block. 
*/ + tocopy = 8; + if (tocopy > state->stream.avail_out) + tocopy = state->stream.avail_out; + memcpy(state->stream.next_out, trailer, tocopy); + state->stream.next_out += tocopy; + state->stream.avail_out -= tocopy; + + /* If it overflowed, flush and start a new block. */ + if (tocopy < 8) { + ret = (a->client_writer)(a, a->client_data, state->compressed, + state->compressed_buffer_size); + state->stream.next_out = state->compressed; + state->stream.avail_out = state->compressed_buffer_size; + memcpy(state->stream.next_out, trailer + tocopy, 8-tocopy); + state->stream.next_out += 8-tocopy; + state->stream.avail_out -= 8-tocopy; + } + + /* Optionally, pad the final compressed block. */ + block_length = state->stream.next_out - state->compressed; + + + /* Tricky calculation to determine size of last block. */ + target_block_length = block_length; + if (a->bytes_in_last_block <= 0) + /* Default or Zero: pad to full block */ + target_block_length = a->bytes_per_block; + else + /* Round length to next multiple of bytes_in_last_block. */ + target_block_length = a->bytes_in_last_block * + ( (block_length + a->bytes_in_last_block - 1) / + a->bytes_in_last_block); + if (target_block_length > a->bytes_per_block) + target_block_length = a->bytes_per_block; + if (block_length < target_block_length) { + memset(state->stream.next_out, 0, + target_block_length - block_length); + block_length = target_block_length; + } + + /* Write the last block */ + ret = (a->client_writer)(a, a->client_data, state->compressed, + block_length); + + /* Cleanup: shut down compressor, release memory, etc. 
*/ +cleanup: + switch (deflateEnd(&(state->stream))) { + case Z_OK: + break; + default: + archive_set_error(a, -1, "Failed to clean up compressor"); + ret = ARCHIVE_FATAL; + } + free(state->compressed); + free(state); + + /* Close the output */ + if (a->client_closer != NULL) + (a->client_closer)(a, a->client_data); + + return (ret); +} + +/* + * Utility function to push input data through compressor, + * writing full output blocks as necessary. + * + * Note that this handles both the regular write case (finishing == + * false) and the end-of-archive case (finishing == true). + */ +static int +drive_compressor(struct archive *a, struct private_data *state, int finishing) +{ + size_t ret; + + for (;;) { + if (state->stream.avail_out == 0) { + ret = (a->client_writer)(a, a->client_data, + state->compressed, state->compressed_buffer_size); + if (ret <= 0) { + /* TODO: Handle this write failure */ + return (ARCHIVE_FATAL); + } else if (ret < state->compressed_buffer_size) { + /* Short write: Move remaining to + * front of block and keep filling */ + memmove(state->compressed, + state->compressed + ret, + state->compressed_buffer_size - ret); + } + + state->stream.next_out + = state->compressed + + state->compressed_buffer_size - ret; + state->stream.avail_out = ret; + } + + ret = deflate(&(state->stream), + finishing ? Z_FINISH : Z_NO_FLUSH ); + + switch (ret) { + case Z_OK: + /* In non-finishing case, check if compressor + * consumed everything */ + if (!finishing && state->stream.avail_in == 0) + return (ARCHIVE_OK); + /* In finishing case, this return always means + * there's more work */ + break; + case Z_STREAM_END: + /* This return can only occur in finishing case. */ + return (ARCHIVE_OK); + default: + /* Any other return value indicates an error. 
*/ + archive_set_error(a, -1, "GZip compression failed"); + return (ARCHIVE_FATAL); + } + } +} diff --git a/lib/libarchive/archive_write_set_compression_none.c b/lib/libarchive/archive_write_set_compression_none.c new file mode 100644 index 0000000..b0ed24c --- /dev/null +++ b/lib/libarchive/archive_write_set_compression_none.c @@ -0,0 +1,210 @@ +/*- + * Copyright (c) 2003-2004 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer + * in this position and unchanged. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#ifdef DMALLOC +#include <dmalloc.h> +#endif +#include <errno.h> +#include <stdlib.h> +#include <string.h> + +#include "archive.h" +#include "archive_private.h" + +static int archive_compressor_none_finish(struct archive *a); +static int archive_compressor_none_init(struct archive *); +static ssize_t archive_compressor_none_write(struct archive *, const void *, + size_t); + +struct archive_none { + char *buffer; + ssize_t buffer_size; + char *next; /* Current insert location */ + ssize_t avail; /* Free space left in buffer */ +}; + +int +archive_write_set_compression_none(struct archive *a) +{ + archive_check_magic(a, ARCHIVE_WRITE_MAGIC, ARCHIVE_STATE_NEW); + a->compression_init = &archive_compressor_none_init; + a->compression_code = ARCHIVE_COMPRESSION_NONE; + a->compression_name = "none"; + return (0); +} + +/* + * Setup callback. + */ +static int +archive_compressor_none_init(struct archive *a) +{ + int ret; + struct archive_none *state; + + a->compression_code = ARCHIVE_COMPRESSION_NONE; + a->compression_name = "none"; + + if (a->client_opener != NULL) { + ret = (a->client_opener)(a, a->client_data); + if (ret != 0) + return (ret); + } + + state = (struct archive_none *)malloc(sizeof(*state)); + if (state == NULL) { + archive_set_error(a, ENOMEM, + "Can't allocate data for output buffering"); + return (ARCHIVE_FATAL); + } + memset(state, 0, sizeof(*state)); + + state->buffer_size = a->bytes_per_block; + state->buffer = malloc(state->buffer_size); + + if (state->buffer == NULL) { + archive_set_error(a, ENOMEM, + "Can't allocate output buffer"); + free(state); + return (ARCHIVE_FATAL); + } + + state->next = state->buffer; + state->avail = state->buffer_size; + + a->compression_data = state; + a->compression_write = archive_compressor_none_write; + a->compression_finish = archive_compressor_none_finish; + return (ARCHIVE_OK); +} + +/* + * Write data to the stream. 
+ */ +static ssize_t +archive_compressor_none_write(struct archive *a, const void *vbuff, + size_t length) +{ + const char *buff; + ssize_t remaining, to_copy; + int ret; + struct archive_none *state; + + state = a->compression_data; + buff = vbuff; + if (a->client_writer == NULL) { + archive_set_error(a, EDOOFUS, + "No write callback is registered? " + "This is probably an internal programming error."); + return (ARCHIVE_FATAL); + } + + remaining = length; + while (remaining > 0) { + /* + * If we have a full output block, write it and reset the + * output buffer. + */ + if (state->avail == 0) { + ret = (a->client_writer)(a, a->client_data, + state->buffer, state->buffer_size); + /* TODO: if ret < state->buffer_size */ + state->next = state->buffer; + state->avail = state->buffer_size; + } + + /* Now we have space in the buffer; copy new data into it. */ + to_copy = (remaining > state->avail) ? + state->avail : remaining; + memcpy(state->next, buff, to_copy); + state->next += to_copy; + state->avail -= to_copy; + buff += to_copy; + remaining -= to_copy; + } + return (length); +} + + +/* + * Finish the compression. + */ +static int +archive_compressor_none_finish(struct archive *a) +{ + ssize_t block_length; + ssize_t target_block_length; + int ret; + int ret2; + struct archive_none *state; + + state = a->compression_data; + ret = ret2 = ARCHIVE_OK; + if (a->client_writer == NULL) { + archive_set_error(a, EDOOFUS, + "No write callback is registered? " + "This is probably an internal programming error."); + return (ARCHIVE_FATAL); + } + + /* If there's pending data, pad and write the last block */ + if (state->next != state->buffer) { + block_length = state->buffer_size - state->avail; + + /* Tricky calculation to determine size of last block */ + target_block_length = block_length; + if (a->bytes_in_last_block <= 0) + /* Default or Zero: pad to full block */ + target_block_length = a->bytes_per_block; + else + /* Round to next multiple of bytes_in_last_block. 
*/ + target_block_length = a->bytes_in_last_block * + ( (block_length + a->bytes_in_last_block - 1) / + a->bytes_in_last_block); + if (target_block_length > a->bytes_per_block) + target_block_length = a->bytes_per_block; + if (block_length < target_block_length) { + memset(state->next, 0, + target_block_length - block_length); + block_length = target_block_length; + } + ret = (a->client_writer)(a, a->client_data, state->buffer, + block_length); + } + + /* Close the output */ + if (a->client_closer != NULL) + ret2 = (a->client_closer)(a, a->client_data); + + free(state->buffer); + free(state); + a->compression_data = NULL; + + return (ret != ARCHIVE_OK ? ret : ret2); +} diff --git a/lib/libarchive/archive_write_set_format.c b/lib/libarchive/archive_write_set_format.c new file mode 100644 index 0000000..711699c --- /dev/null +++ b/lib/libarchive/archive_write_set_format.c @@ -0,0 +1,62 @@ +/*- + * Copyright (c) 2003-2004 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer + * in this position and unchanged. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
+ * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include "archive.h" +#include "archive_private.h" + +/* A table that maps format codes to functions. */ +static +struct { int code; int (*setter)(struct archive *); } codes[] = +{ + { ARCHIVE_FORMAT_CPIO, archive_write_set_format_cpio }, + { ARCHIVE_FORMAT_CPIO_POSIX, archive_write_set_format_cpio }, + { ARCHIVE_FORMAT_SHAR, archive_write_set_format_shar }, + { ARCHIVE_FORMAT_SHAR_BASE, archive_write_set_format_shar }, + { ARCHIVE_FORMAT_SHAR_DUMP, archive_write_set_format_shar_dump }, + { ARCHIVE_FORMAT_TAR, archive_write_set_format_pax_restricted }, + { ARCHIVE_FORMAT_TAR_PAX_INTERCHANGE, archive_write_set_format_pax }, + { ARCHIVE_FORMAT_TAR_PAX_RESTRICTED, + archive_write_set_format_pax_restricted }, + { ARCHIVE_FORMAT_TAR_USTAR, archive_write_set_format_ustar }, + { 0, NULL } +}; + +int +archive_write_set_format(struct archive *a, int code) +{ + int i; + + for (i = 0; codes[i].code != 0; i++) { + if (code == codes[i].code) + return ((codes[i].setter)(a)); + } + + archive_set_error(a, -1, "No such format"); + return (ARCHIVE_FATAL); +} diff --git a/lib/libarchive/archive_write_set_format_by_name.c b/lib/libarchive/archive_write_set_format_by_name.c new file mode 100644 index 0000000..e313a98 --- /dev/null +++ b/lib/libarchive/archive_write_set_format_by_name.c @@ -0,0 +1,59 @@ +/*- + * Copyright (c) 2003-2004 Tim Kientzle + * All rights reserved. 
+ * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer + * in this position and unchanged. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <string.h> + +#include "archive.h" +#include "archive_private.h" + +/* A table that maps names to functions. 
*/ +static +struct { const char *name; int (*setter)(struct archive *); } names[] = +{ + { "cpio", archive_write_set_format_cpio }, + { "pax", archive_write_set_format_pax }, + { "shar", archive_write_set_format_shar }, + { "shardump", archive_write_set_format_shar_dump }, + { "ustar", archive_write_set_format_ustar }, + { NULL, NULL } +}; + +int +archive_write_set_format_by_name(struct archive *a, const char *name) +{ + int i; + + for (i = 0; names[i].name != NULL; i++) { + if (strcmp(name, names[i].name) == 0) + return ((names[i].setter)(a)); + } + + archive_set_error(a, -1, "No such format '%s'", name); + return (ARCHIVE_FATAL); +} diff --git a/lib/libarchive/archive_write_set_format_cpio.c b/lib/libarchive/archive_write_set_format_cpio.c new file mode 100644 index 0000000..f0df03e --- /dev/null +++ b/lib/libarchive/archive_write_set_format_cpio.c @@ -0,0 +1,244 @@ +/*- + * Copyright (c) 2003-2004 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer + * in this position and unchanged. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
+ * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <sys/stat.h> +#ifdef DMALLOC +#include <dmalloc.h> +#endif +#include <errno.h> +#include <stdio.h> +#include <stdlib.h> +#include <string.h> + +#include "archive.h" +#include "archive_entry.h" +#include "archive_private.h" + +static int archive_write_cpio_data(struct archive *, const void *buff, + size_t s); +static int archive_write_cpio_finish(struct archive *); +static int archive_write_cpio_finish_entry(struct archive *); +static int archive_write_cpio_header(struct archive *, + struct archive_entry *); +static int format_octal(int64_t, void *, int); +static int64_t format_octal_recursive(int64_t, char *, int); + +struct cpio { + uint64_t entry_bytes_remaining; +}; + +struct cpio_header { + char c_magic[6]; + char c_dev[6]; + char c_ino[6]; + char c_mode[6]; + char c_uid[6]; + char c_gid[6]; + char c_nlink[6]; + char c_rdev[6]; + char c_mtime[11]; + char c_namesize[6]; + char c_filesize[11]; +}; + +/* + * Set output format to 'cpio' format. + */ +int +archive_write_set_format_cpio(struct archive *a) +{ + struct cpio *cpio; + + /* If someone else was already registered, unregister them. 
*/ + if (a->format_finish != NULL) + (a->format_finish)(a); + + cpio = malloc(sizeof(*cpio)); + if (cpio == NULL) { + archive_set_error(a, ENOMEM, "Can't allocate cpio data"); + return (ARCHIVE_FATAL); + } + memset(cpio, 0, sizeof(*cpio)); + a->format_data = cpio; + + a->pad_uncompressed = 1; + a->format_write_header = archive_write_cpio_header; + a->format_write_data = archive_write_cpio_data; + a->format_finish_entry = archive_write_cpio_finish_entry; + a->format_finish = archive_write_cpio_finish; + a->archive_format = ARCHIVE_FORMAT_CPIO_POSIX; + a->archive_format_name = "POSIX cpio"; + return (ARCHIVE_OK); +} + +static int +archive_write_cpio_header(struct archive *a, struct archive_entry *entry) +{ + struct cpio *cpio; + const char *p, *path; + int pathlength, ret, written; + const struct stat *st; + struct cpio_header h; + + cpio = a->format_data; + ret = 0; + + path = archive_entry_pathname(entry); + pathlength = strlen(path) + 1; /* Include trailing null. */ + st = archive_entry_stat(entry); + + memset(&h, 0, sizeof(h)); + format_octal(070707, &h.c_magic, sizeof(h.c_magic)); + format_octal(st->st_dev, &h.c_dev, sizeof(h.c_dev)); + /* + * TODO: Generate artificial inode numbers rather than just + * re-using the ones off the disk. That way, the 18-bit c_ino + * field only limits the number of files in the archive. 
+ */ + if (st->st_ino > 0777777) { + archive_set_error(a, ERANGE, "large inode number truncated"); + ret = ARCHIVE_WARN; + } + + format_octal(st->st_ino & 0777777, &h.c_ino, sizeof(h.c_ino)); + format_octal(st->st_mode, &h.c_mode, sizeof(h.c_mode)); + format_octal(st->st_uid, &h.c_uid, sizeof(h.c_uid)); + format_octal(st->st_gid, &h.c_gid, sizeof(h.c_gid)); + format_octal(st->st_nlink, &h.c_nlink, sizeof(h.c_nlink)); + if(S_ISBLK(st->st_mode) || S_ISCHR(st->st_mode)) + format_octal(st->st_rdev, &h.c_rdev, sizeof(h.c_rdev)); + else + format_octal(0, &h.c_rdev, sizeof(h.c_rdev)); + format_octal(st->st_mtime, &h.c_mtime, sizeof(h.c_mtime)); + format_octal(pathlength, &h.c_namesize, sizeof(h.c_namesize)); + + /* Symlinks get the link written as the body of the entry. */ + p = archive_entry_symlink(entry); + if (p != NULL && *p != '\0') + format_octal(strlen(p), &h.c_filesize, sizeof(h.c_filesize)); + else + format_octal(st->st_size, &h.c_filesize, sizeof(h.c_filesize)); + + written = (a->compression_write)(a, &h, sizeof(h)); + if (written < (int)sizeof(h)) + return (ARCHIVE_FATAL); + + written = (a->compression_write)(a, path, pathlength); + if (written < (int)pathlength) + return (ARCHIVE_FATAL); + + cpio->entry_bytes_remaining = st->st_size; + + /* Write the symlink now. */ + if (p != NULL && *p != '\0') + (a->compression_write)(a, p, strlen(p)); + + return (ret); +} + +static int +archive_write_cpio_data(struct archive *a, const void *buff, size_t s) +{ + struct cpio *cpio; + int ret; + + cpio = a->format_data; + if (s > cpio->entry_bytes_remaining) + s = cpio->entry_bytes_remaining; + + ret = (a->compression_write)(a, buff, s); + cpio->entry_bytes_remaining -= s; + return (ret); +} + +/* + * Format a number into the specified field. 
+ */ +static int +format_octal(int64_t v, void *p, int digits) +{ + int64_t max; + int ret; + + max = (((int64_t)1) << (digits * 3)) - 1; + if (v >= 0 && v <= max) { + format_octal_recursive(v, p, digits); + ret = 0; + } else { + format_octal_recursive(max, p, digits); + ret = -1; + } + return (ret); +} + +static int64_t +format_octal_recursive(int64_t v, char *p, int s) +{ + if (s == 0) + return (v); + v = format_octal_recursive(v, p+1, s-1); + *p = '0' + (v & 7); + return (v >>= 3); +} + +static int +archive_write_cpio_finish(struct archive *a) +{ + struct stat st; + int er; + struct archive_entry *trailer; + + trailer = archive_entry_new(); + memset(&st, 0, sizeof(st)); + st.st_nlink = 1; + archive_entry_copy_stat(trailer, &st); + archive_entry_set_pathname(trailer, "TRAILER!!!"); + er = archive_write_cpio_header(a, trailer); + archive_entry_free(trailer); + return (er); +} + +static int +archive_write_cpio_finish_entry(struct archive *a) +{ + struct cpio *cpio; + int to_write, ret; + + cpio = a->format_data; + ret = 0; + while (cpio->entry_bytes_remaining > 0) { + to_write = cpio->entry_bytes_remaining < a->null_length ? + cpio->entry_bytes_remaining : a->null_length; + ret = (a->compression_write)(a, a->nulls, to_write); + if (ret < to_write) + return (-1); + cpio->entry_bytes_remaining -= to_write; + } + return (ret); +} diff --git a/lib/libarchive/archive_write_set_format_pax.c b/lib/libarchive/archive_write_set_format_pax.c new file mode 100644 index 0000000..bee95d3 --- /dev/null +++ b/lib/libarchive/archive_write_set_format_pax.c @@ -0,0 +1,694 @@ +/*- + * Copyright (c) 2003-2004 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer + * in this position and unchanged. + * 2. 
Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <sys/stat.h> +#include <errno.h> +#ifdef DMALLOC +#include <dmalloc.h> +#endif +#include <stdlib.h> +#include <string.h> +#include <unistd.h> + +#include "archive.h" +#include "archive_entry.h" +#include "archive_private.h" + +struct pax { + uint64_t entry_bytes_remaining; + uint64_t entry_padding; + struct archive_string pax_header; + char written; +}; + +static void add_pax_attr(struct archive_string *, const char *key, + const char *value); +static void add_pax_attr_int(struct archive_string *, + const char *key, int64_t value); +static void add_pax_attr_time(struct archive_string *, + const char *key, int64_t sec, + unsigned long nanos); +static int archive_write_pax_data(struct archive *, + const void *, size_t); +static int archive_write_pax_finish(struct archive *); +static int archive_write_pax_finish_entry(struct archive *); +static int archive_write_pax_header(struct archive *, + struct archive_entry *); +static char 
*build_pax_attribute_name(const char *abbreviated, + struct archive_string *work); +static char *build_ustar_entry_name(char *dest, const char *src); +static char *format_int(char *dest, int64_t); +static int write_nulls(struct archive *, size_t); + +/* + * Set output format to 'restricted pax' format. + * + * This is the same as normal 'pax', but tries to suppress + * the pax header whenever possible. This is the default for + * bsdtar, for instance. + */ +int +archive_write_set_format_pax_restricted(struct archive *a) +{ + int r; + r = archive_write_set_format_pax(a); + a->archive_format = ARCHIVE_FORMAT_TAR_PAX_RESTRICTED; + a->archive_format_name = "restricted POSIX pax interchange"; + return (r); +} + +/* + * Set output format to 'pax' format. + */ +int +archive_write_set_format_pax(struct archive *a) +{ + struct pax *pax; + + if (a->format_finish != NULL) + (a->format_finish)(a); + + pax = malloc(sizeof(*pax)); + if (pax == NULL) { + archive_set_error(a, ENOMEM, "Can't allocate pax data"); + return (ARCHIVE_FATAL); + } + memset(pax, 0, sizeof(*pax)); + a->format_data = pax; + + a->pad_uncompressed = 1; + a->format_write_header = archive_write_pax_header; + a->format_write_data = archive_write_pax_data; + a->format_finish = archive_write_pax_finish; + a->format_finish_entry = archive_write_pax_finish_entry; + a->archive_format = ARCHIVE_FORMAT_TAR_PAX_INTERCHANGE; + a->archive_format_name = "POSIX pax interchange"; + return (ARCHIVE_OK); +} + +/* + * Note: This code assumes that 'nanos' has the same sign as 'sec', + * which implies that sec=-1, nanos=200000000 represents -1.2 seconds + * and not -0.8 seconds. This is a pretty pedantic point, as we're + * unlikely to encounter many real files created before Jan 1, 1970, + * much less ones with timestamps recorded to sub-second resolution. 
+ */ +static void +add_pax_attr_time(struct archive_string *as, const char *key, + int64_t sec, unsigned long nanos) +{ + int digit, i; + char *t; + /* + * Note that each byte contributes fewer than 3 base-10 + * digits, so this will always be big enough. + */ + char tmp[1 + 3*sizeof(sec) + 1 + 3*sizeof(nanos)]; + + tmp[sizeof(tmp) - 1] = 0; + t = tmp + sizeof(tmp) - 1; + + /* Skip trailing zeros in the fractional part. */ + for(digit = 0, i = 10; i > 0 && digit == 0; i--) { + digit = nanos % 10; + nanos /= 10; + } + + /* Only format the fraction if it's non-zero. */ + if (i > 0) { + while (i > 0) { + *--t = "0123456789"[digit]; + digit = nanos % 10; + nanos /= 10; + i--; + } + *--t = '.'; + } + t = format_int(t, sec); + + add_pax_attr(as, key, t); +} + +static char * +format_int(char *t, int64_t i) +{ + int sign; + + if (i < 0) { + sign = -1; + i = -i; + } else + sign = 1; + + do { + *--t = "0123456789"[i % 10]; + } while (i /= 10); + if (sign < 0) + *--t = '-'; + return (t); +} + +static void +add_pax_attr_int(struct archive_string *as, const char *key, int64_t value) +{ + char tmp[1 + 3 * sizeof(value)]; + + tmp[sizeof(tmp) - 1] = 0; + add_pax_attr(as, key, format_int(tmp + sizeof(tmp) - 1, value)); +} + +/* + * Add a key/value attribute to the pax header. This function handles + * the length field and various other syntactic requirements. + */ +static void +add_pax_attr(struct archive_string *as, const char *key, const char *value) +{ + int digits, i, len, next_ten; + char tmp[1 + 3 * sizeof(int)]; /* < 3 base-10 digits per byte */ + + /*- + * PAX attributes have the following layout: + * <len> <space> <key> <=> <value> <nl> + */ + len = 1 + strlen(key) + 1 + strlen(value) + 1; + + /* + * The <len> field includes the length of the <len> field, so + * computing the correct length is tricky. I start by + * counting the number of base-10 digits in 'len' and + * computing the next higher power of 10. 
+ */ + next_ten = 1; + digits = 0; + i = len; + while (i > 0) { + i = i / 10; + digits++; + next_ten = next_ten * 10; + } + /* + * For example, if string without the length field is 99 + * chars, then adding the 2 digit length "99" will force the + * total length past 100, requiring an extra digit. The next + * statement adjusts for this effect. + */ + if (len + digits >= next_ten) + digits++; + + /* Now, we have the right length so we can build the line. */ + tmp[sizeof(tmp) - 1] = 0; /* Null-terminate the work area. */ + archive_strcat(as, format_int(tmp + sizeof(tmp) - 1, len + digits)); + archive_strappend_char(as, ' '); + archive_strcat(as, key); + archive_strappend_char(as, '='); + archive_strcat(as, value); + archive_strappend_char(as, '\n'); +} + +/* + * TODO: Consider adding 'comment' and 'charset' fields to + * archive_entry so that clients can specify them. Also, consider + * adding generic key/value tags so clients can add arbitrary + * key/value data. + */ +static int +archive_write_pax_header(struct archive *a, + struct archive_entry *entry_original) +{ + struct archive_entry *entry_main; + const char *linkname, *name_start, *p; + int need_extension, oldstate, r, ret; + struct pax *pax; + const struct stat *st_main, *st_original; + + struct archive_string pax_entry_name = EMPTY_ARCHIVE_STRING; + char paxbuff[512]; + char ustarbuff[512]; + char ustar_entry_name[256]; + + need_extension = 0; + pax = a->format_data; + pax->written = 1; + + st_original = archive_entry_stat(entry_original); + + /* Make sure this is a type of entry that we can handle here */ + if (!archive_entry_hardlink(entry_original)) { + switch (st_original->st_mode & S_IFMT) { + case S_IFREG: + case S_IFLNK: + case S_IFCHR: + case S_IFBLK: + case S_IFDIR: + case S_IFIFO: + break; + case S_IFSOCK: + archive_set_error(a, -1, + "tar format cannot archive socket"); + return (ARCHIVE_WARN); + default: + archive_set_error(a, -1, + "tar format cannot archive this"); + return (ARCHIVE_WARN); + 
} + } + + /* Copy entry so we can modify it as needed. */ + entry_main = archive_entry_dup(entry_original); + archive_string_empty(&(pax->pax_header)); /* Blank our work area. */ + st_main = archive_entry_stat(entry_main); + + /* + * Determining whether or not the name is too big is ugly + * because of the rules for dividing names between 'name' and + * 'prefix' fields. Here, I pick out the longest possible + * suffix, then test whether the remaining prefix is too long. + */ + p = archive_entry_pathname(entry_main); + if (strlen(p) <= 100) /* Short enough for just 'name' field */ + name_start = p; /* Record a zero-length prefix */ + else { + /* Find the largest suffix that fits in 'name' field. */ + name_start = strchr(p + strlen(p) - 100 - 1, '/'); + if (name_start == NULL) /* No feasible break point. */ + name_start = p + strlen(p); + } + + /* If name is too long, add 'path' to pax extended attrs. */ + if (name_start - p > 155) { + add_pax_attr(&(pax->pax_header), "path", p); + archive_entry_set_pathname(entry_main, + build_ustar_entry_name(ustar_entry_name, p)); + need_extension = 1; + } + + /* If link name is too long, add 'linkpath' to pax extended attrs. */ + linkname = archive_entry_hardlink(entry_main); + if (linkname == NULL) + linkname = archive_entry_symlink(entry_main); + + if (linkname != NULL && strlen(linkname) > 100) { + add_pax_attr(&(pax->pax_header), "linkpath", linkname); + if (archive_entry_hardlink(entry_main)) + archive_entry_set_hardlink(entry_main, + "././@LongHardLink"); + else + archive_entry_set_symlink(entry_main, + "././@LongSymLink"); + need_extension = 1; + } + + /* If file size is too large, add 'size' to pax extended attrs. */ + if (st_main->st_size >= (1 << 30)) { + add_pax_attr_int(&(pax->pax_header), "size", st_main->st_size); + need_extension = 1; + } + + /* If numeric GID is too large, add 'gid' to pax extended attrs. 
*/ + if (st_main->st_gid >= (1 << 20)) { + add_pax_attr_int(&(pax->pax_header), "gid", st_main->st_gid); + need_extension = 1; + } + + /* If group name is too large, add 'gname' to pax extended attrs. */ + /* TODO: If gname has non-ASCII characters, use pax attribute. */ + p = archive_entry_gname(entry_main); + if (p != NULL && strlen(p) > 31) { + add_pax_attr(&(pax->pax_header), "gname", p); + archive_entry_set_gname(entry_main, NULL); + need_extension = 1; + } + + /* If numeric UID is too large, add 'uid' to pax extended attrs. */ + if (st_main->st_uid >= (1 << 20)) { + add_pax_attr_int(&(pax->pax_header), "uid", st_main->st_uid); + need_extension = 1; + } + + /* If user name is too large, add 'uname' to pax extended attrs. */ + /* TODO: If uname has non-ASCII characters, use pax attribute. */ + p = archive_entry_uname(entry_main); + if (p != NULL && strlen(p) > 31) { + add_pax_attr(&(pax->pax_header), "uname", p); + archive_entry_set_uname(entry_main, NULL); + need_extension = 1; + } + + /* + * POSIX/SUSv3 doesn't provide a standard key for large device + * numbers. I use the same keys here that Joerg Schilling used for + * 'star.' No doubt, other implementations use other keys. Note that + * there's no reason we can't write the same information into a number + * of different keys. + * + * Of course, this is only needed for block or char device entries. + */ + if (S_ISBLK(st_main->st_mode) || + S_ISCHR(st_main->st_mode)) { + /* + * If devmajor is too large, add 'SCHILY.devmajor' to + * extended attributes. + */ + dev_t devmajor, devminor; + devmajor = major(st_main->st_rdev); + devminor = minor(st_main->st_rdev); + if (devmajor >= (1 << 18)) { + add_pax_attr_int(&(pax->pax_header), "SCHILY.devmajor", + devmajor); + archive_entry_set_devmajor(entry_main, (1 << 18) - 1); + need_extension = 1; + } + + /* + * If devminor is too large, add 'SCHILY.devminor' to + * extended attributes. 
+ */ + if (devminor >= (1 << 18)) { + add_pax_attr_int(&(pax->pax_header), "SCHILY.devminor", + devminor); + archive_entry_set_devminor(entry_main, (1 << 18) - 1); + need_extension = 1; + } + } + + /* + * Technically, the mtime field in the ustar header can + * support 33 bits, but many platforms use signed 32-bit time + * values. The cutoff of 0x7fffffff here is a compromise. + * Yes, this check is duplicated just below; this helps to + * avoid writing an mtime attribute just to handle a + * high-resolution timestamp in "restricted pax" mode. + */ + if ((st_main->st_mtime < 0) || (st_main->st_mtime >= 0x7fffffff)) + need_extension = 1; + + /* + * The following items are handled differently in "pax + * restricted" format. In particular, in "pax restricted" + * format they won't be added unless need_extension is + * already set (we've already generated an extended header, so + * we may as well include these). + */ + if (a->archive_format != ARCHIVE_FORMAT_TAR_PAX_RESTRICTED || + need_extension) { + + if (st_main->st_mtime < 0 || + st_main->st_mtime >= 0x7fffffff || + st_main->st_mtimespec.tv_nsec != 0) + add_pax_attr_time(&(pax->pax_header), "mtime", + st_main->st_mtime, st_main->st_mtimespec.tv_nsec); + + if (st_main->st_ctimespec.tv_nsec != 0 || + st_main->st_ctime != 0) + add_pax_attr_time(&(pax->pax_header), "ctime", + st_main->st_ctime, st_main->st_ctimespec.tv_nsec); + + if (st_main->st_atimespec.tv_nsec != 0 || + st_main->st_atime != 0) + add_pax_attr_time(&(pax->pax_header), "atime", + st_main->st_atime, st_main->st_atimespec.tv_nsec); + + /* I use a star-compatible file flag attribute. */ + p = archive_entry_fflags(entry_main); + if (p != NULL && *p != '\0') + add_pax_attr(&(pax->pax_header), "SCHILY.fflags", p); + + /* I use star-compatible ACL attributes.
*/ + p = archive_entry_acl(entry_main); + if (p != NULL && *p != '\0') + add_pax_attr(&(pax->pax_header), "SCHILY.acl.access", p); + p = archive_entry_acl_default(entry_main); + if (p != NULL && *p != '\0') + add_pax_attr(&(pax->pax_header), "SCHILY.acl.default", + p); + + /* Include star-compatible metadata info. */ + add_pax_attr_int(&(pax->pax_header), "SCHILY.dev", + st_main->st_dev); + add_pax_attr_int(&(pax->pax_header), "SCHILY.ino", + st_main->st_ino); + add_pax_attr_int(&(pax->pax_header), "SCHILY.nlink", + st_main->st_nlink); + } + + /* Format 'ustar' header for main entry. */ + /* We don't care if this returns an error. */ + __archive_write_format_header_ustar(a, ustarbuff, entry_main); + + /* If we built any extended attributes, write that entry first. */ + ret = 0; + if (archive_strlen(&(pax->pax_header)) > 0) { + struct stat st; + struct archive_entry *pax_attr_entry; + const char *pax_attr_name; + + memset(&st, 0, sizeof(st)); + pax_attr_entry = archive_entry_new(); + p = archive_entry_pathname(entry_main); + pax_attr_name = build_pax_attribute_name(p, &pax_entry_name); + + archive_entry_set_tartype(pax_attr_entry, 'x'); + archive_entry_set_pathname(pax_attr_entry, pax_attr_name); + st.st_size = archive_strlen(&(pax->pax_header)); + st.st_uid = st_main->st_uid; + st.st_gid = st_main->st_gid; + st.st_mode = st_main->st_mode; + archive_entry_copy_stat(pax_attr_entry, &st); + + archive_entry_set_uname(pax_attr_entry, + archive_entry_uname(entry_main)); + archive_entry_set_gname(pax_attr_entry, + archive_entry_gname(entry_main)); + + ret = __archive_write_format_header_ustar(a, paxbuff, + pax_attr_entry); + + archive_entry_free(pax_attr_entry); + free(pax_entry_name.s); + + /* Note that the 'x' header shouldn't ever fail to format */ + if (ret != 0) { + const char *msg = "archive_write_header_pax: " + "'x' header failed?! 
This can't happen.\n"; + write(2, msg, strlen(msg)); + exit(1); + } + r = (a->compression_write)(a, paxbuff, 512); + if (r < 512) { + pax->entry_bytes_remaining = 0; + pax->entry_padding = 0; + return (ARCHIVE_FATAL); + } + + pax->entry_bytes_remaining = archive_strlen(&(pax->pax_header)); + pax->entry_padding = 0x1ff & (- pax->entry_bytes_remaining); + + oldstate = a->state; + a->state = ARCHIVE_STATE_DATA; + r = archive_write_data(a, pax->pax_header.s, + archive_strlen(&(pax->pax_header))); + a->state = oldstate; + if (r < (int)archive_strlen(&(pax->pax_header))) { + /* If a write fails, we're pretty much toast. */ + return (ARCHIVE_FATAL); + } + + archive_write_pax_finish_entry(a); + } + + /* Write the header for main entry. */ + r = (a->compression_write)(a, ustarbuff, 512); + if (ret != ARCHIVE_OK) + ret = (r < 512) ? ARCHIVE_FATAL : ARCHIVE_OK; + + /* Only regular files have data. Note that pax, unlike ustar, + * does permit a hardlink to have data associated with it. */ + if (!S_ISREG(archive_entry_mode(entry_main))) + pax->entry_bytes_remaining = 0; + else + pax->entry_bytes_remaining = archive_entry_size(entry_main); + + pax->entry_padding = 0x1ff & (- pax->entry_bytes_remaining); + archive_entry_free(entry_main); + + return (ret); +} + +/* + * We need a valid name for the regular 'ustar' entry. This routine + * tries to hack something more-or-less reasonable. 
+ */ +static char * +build_ustar_entry_name(char *dest, const char *src) +{ + const char *basename, *break_point, *prefix; + int basename_length, dirname_length, prefix_length; + + prefix = src; + basename = strrchr(src, '/'); + if (basename == NULL) { + basename = src; + prefix_length = 0; + basename_length = strlen(basename); + if (basename_length > 100) + basename_length = 100; + } else { + basename_length = strlen(basename); + if (basename_length > 100) + basename_length = 100; + dirname_length = basename - src; + + break_point = + strchr(src + dirname_length + basename_length - 101, '/'); + prefix_length = break_point - prefix - 1; + while (prefix_length > 155) { + prefix = strchr(prefix, '/') + 1; /* Drop 1st dir. */ + prefix_length = break_point - prefix - 1; + } + } + + strlcpy(dest, prefix, basename - prefix + 1 + basename_length); + + return (dest); +} + +/* + * The ustar header for the pax extended attributes must have a + * reasonable name: SUSv3 suggests 'dirname'/PaxHeaders/'basename' + * + * Joerg Schilling has argued that this is unnecessary because, in practice, + * if the pax extended attributes get extracted as regular files, no one is + * going to bother reading those attributes to manually restore them. + * This is a tempting argument, but I'm not entirely convinced. + * + * Of course, adding "PaxHeaders/" might force the name to be too big. + * Here, I start from the (possibly already-trimmed) name used in the + * main ustar header and delete some additional early path elements to + * fit in the extra "PaxHeader/" part. + */ +static char * +build_pax_attribute_name(const char *abbreviated, /* ustar-compat name */ + struct archive_string *work) +{ + const char *basename, *break_point, *prefix; + int prefix_length, suffix_length; + + /* + * This is much simpler because I know that "abbreviated" is + * already small enough; I just need to determine if it needs + * any further trimming to fit the "PaxHeader/" portion.
+ */ + + /* Identify the final prefix and suffix portions. */ + prefix = abbreviated; /* First guess: prefix starts at beginning */ + if (strlen(abbreviated) > 100) { + break_point = strchr(prefix + strlen(prefix) - 101, '/'); + prefix_length = break_point - prefix - 1; + suffix_length = strlen(break_point + 1); + /* + * The next loop keeps trimming until "/PaxHeader/" can + * be added to either the prefix or the suffix. + */ + while (prefix_length > 144 && suffix_length > 89) { + prefix = strchr(prefix, '/') + 1; /* Drop 1st dir. */ + prefix_length = break_point - prefix - 1; + } + } + + archive_string_empty(work); + basename = strrchr(prefix, '/'); + if (basename == NULL) { + archive_strcpy(work, "PaxHeader/"); + archive_strcat(work, prefix); + } else { + basename++; + archive_strncpy(work, prefix, basename - prefix); + archive_strcat(work, "PaxHeader/"); + archive_strcat(work, basename); + } + + return (work->s); +} + +/* Write two null blocks for the end of archive */ +static int +archive_write_pax_finish(struct archive *a) +{ + struct pax *pax; + pax = a->format_data; + if (pax->written && a->compression_write != NULL) + return (write_nulls(a, 512 * 2)); + free(pax); + a->format_data = NULL; + return (ARCHIVE_OK); +} + +static int +archive_write_pax_finish_entry(struct archive *a) +{ + struct pax *pax; + int ret; + + pax = a->format_data; + ret = write_nulls(a, pax->entry_bytes_remaining + pax->entry_padding); + pax->entry_bytes_remaining = pax->entry_padding = 0; + return (ret); +} + +static int +write_nulls(struct archive *a, size_t padding) +{ + int ret, to_write; + + while (padding > 0) { + to_write = padding < a->null_length ? 
padding : a->null_length; + ret = (a->compression_write)(a, a->nulls, to_write); + if (ret <= 0) + return (ARCHIVE_FATAL); + padding -= ret; + } + return (ARCHIVE_OK); +} + +static int +archive_write_pax_data(struct archive *a, const void *buff, size_t s) +{ + struct pax *pax; + int ret; + + pax = a->format_data; + pax->written = 1; + if (s > pax->entry_bytes_remaining) + s = pax->entry_bytes_remaining; + + ret = (a->compression_write)(a, buff, s); + pax->entry_bytes_remaining -= s; + return (ret); +} diff --git a/lib/libarchive/archive_write_set_format_shar.c b/lib/libarchive/archive_write_set_format_shar.c new file mode 100644 index 0000000..52f5509 --- /dev/null +++ b/lib/libarchive/archive_write_set_format_shar.c @@ -0,0 +1,403 @@ +/*- + * Copyright (c) 2003-2004 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer + * in this position and unchanged. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
+ * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <sys/stat.h> +#ifdef DMALLOC +#include <dmalloc.h> +#endif +#include <errno.h> +#include <stdarg.h> +#include <stdio.h> +#include <stdlib.h> +#include <string.h> + +#include "archive.h" +#include "archive_entry.h" +#include "archive_private.h" + +static int archive_write_shar_finish(struct archive *); +static int archive_write_shar_header(struct archive *, + struct archive_entry *); +static int archive_write_shar_data_sed(struct archive *, + const void * buff, size_t); +static int archive_write_shar_data_uuencode(struct archive *, + const void * buff, size_t); +static int archive_write_shar_finish_entry(struct archive *); +static int shar_printf(struct archive *, const char *fmt, ...); + +struct shar { + int dump; + int end_of_line; + struct archive_entry *entry; + int has_data; + char outbuff[1024]; + size_t outbytes; + size_t outpos; + int uuavail; + char uubuffer[3]; + int wrote_header; +}; + +static int +shar_printf(struct archive *a, const char *fmt, ...) +{ + va_list ap; + char *p; + int ret; + + va_start(ap, fmt); + vasprintf(&p, fmt, ap); + ret = ((a->compression_write)(a, p, strlen(p))); + free(p); + va_end(ap); + return (ret); +} + +/* + * Set output format to 'shar' format. + */ +int +archive_write_set_format_shar(struct archive *a) +{ + struct shar *shar; + + /* If someone else was already registered, unregister them. 
+ */ + if (a->format_finish != NULL) + (a->format_finish)(a); + + shar = malloc(sizeof(*shar)); + if (shar == NULL) { + archive_set_error(a, ENOMEM, "Can't allocate shar data"); + return (ARCHIVE_FATAL); + } + memset(shar, 0, sizeof(*shar)); + a->format_data = shar; + + a->pad_uncompressed = 0; + a->format_write_header = archive_write_shar_header; + a->format_finish = archive_write_shar_finish; + a->format_write_data = archive_write_shar_data_sed; + a->format_finish_entry = archive_write_shar_finish_entry; + a->archive_format = ARCHIVE_FORMAT_SHAR_BASE; + a->archive_format_name = "shar"; + return (ARCHIVE_OK); +} + +/* + * An alternate 'shar' that uses uuencode instead of 'sed' to encode + * file contents and can therefore be used to archive binary files. + * In addition, this variant also attempts to restore ownership, file modes, + * and other extended file information. + */ +int +archive_write_set_format_shar_dump(struct archive *a) +{ + archive_write_set_format_shar(a); + a->format_write_data = archive_write_shar_data_uuencode; + a->archive_format = ARCHIVE_FORMAT_SHAR_DUMP; + a->archive_format_name = "shar dump"; + return (ARCHIVE_OK); +} + +static int +archive_write_shar_header(struct archive *a, struct archive_entry *entry) +{ + const char *linkname; + const char *name; + struct shar *shar; + const struct stat *st; + + shar = a->format_data; + if (!shar->wrote_header) { + shar_printf(a, "#!/bin/sh\n"); + shar_printf(a, "# This is a shar archive\n"); + shar->wrote_header = 1; + } + + /* Save the entry for the closing */ + if (shar->entry) + archive_entry_free(shar->entry); + shar->entry = archive_entry_clone(entry); + name = archive_entry_pathname(entry); + st = archive_entry_stat(entry); + + shar->has_data = 0; + if ((linkname = archive_entry_hardlink(entry)) != NULL) { + shar_printf(a, "echo x %s\n", name); + shar_printf(a, "ln -f %s %s\n", linkname, name); + } else if ((linkname = archive_entry_symlink(entry)) != NULL) { + shar_printf(a, "echo x %s\n", 
name); + shar_printf(a, "ln -fs %s %s\n", linkname, name); + } else { + switch(st->st_mode & S_IFMT) { + case S_IFREG: + shar_printf(a, "echo x %s\n", name); + if (archive_entry_size(entry) == 0) { + shar_printf(a, "touch %s\n", name); + shar->has_data = 0; + } else { + if (shar->dump) { + shar_printf(a, + "uudecode -o %s << 'SHAR_END'\n", + name); + shar_printf(a, "begin %o %s\n", + archive_entry_mode(entry) & 0777, + name); + } else + shar_printf(a, + "sed 's/^X//' > %s << 'SHAR_END'\n", + name); + shar->has_data = 1; + shar->end_of_line = 1; + shar->outpos = 0; + shar->outbytes = 0; + } + break; + case S_IFDIR: + shar_printf(a, "echo x %s\n", name); + shar_printf(a, "mkdir -p %s > /dev/null 2>&1\n", name); + /* + * TODO: Put dir name/mode on a list to be fixed + * up at end of archive. + */ + break; + case S_IFIFO: + shar_printf(a, "echo x %s\n", name); + shar_printf(a, "mkfifo %s\n", name); + break; + case S_IFCHR: + shar_printf(a, "echo x %s\n", name); + shar_printf(a, "mknod %s c %d %d\n", name, + archive_entry_devmajor(entry), + archive_entry_devminor(entry)); + break; + case S_IFBLK: + shar_printf(a, "echo x %s\n", name); + shar_printf(a, "mknod %s b %d %d\n", name, + archive_entry_devmajor(entry), + archive_entry_devminor(entry)); + break; + case S_IFSOCK: + archive_set_error(a, -1, + "shar format cannot archive socket"); + return (ARCHIVE_WARN); + default: + archive_set_error(a, -1, + "shar format cannot archive this"); + return (ARCHIVE_WARN); + } + } + + return (ARCHIVE_OK); +} + +static int +archive_write_shar_data_sed(struct archive *a, const void *buff, size_t length) +{ + struct shar *shar; + const char *src; + size_t n; + + shar = a->format_data; + if (!shar->has_data) + return (0); + + src = buff; + n = length; + shar->outpos = 0; + while (n-- > 0) { + if (shar->end_of_line) { + shar->outbuff[shar->outpos++] = 'X'; + shar->end_of_line = 0; + } + if (*src == '\n') + shar->end_of_line = 1; + shar->outbuff[shar->outpos++] = *src++; + + if 
(shar->outpos > sizeof(shar->outbuff) - 2) { + (a->compression_write)(a, shar->outbuff, shar->outpos); + shar->outpos = 0; + } + } + + if (shar->outpos > 0) + (a->compression_write)(a, shar->outbuff, shar->outpos); + return (length); +} + +#define UUENC(c) (((c)!=0) ? ((c) & 077) + ' ': '`') + +static void +uuencode_group(struct shar *shar) +{ + int t; + + t = 0; + if (shar->uuavail > 0) + t = 0xff0000 & (shar->uubuffer[0] << 16); + if (shar->uuavail > 1) + t |= 0x00ff00 & (shar->uubuffer[1] << 8); + if (shar->uuavail > 2) + t |= 0x0000ff & (shar->uubuffer[2]); + shar->outbuff[shar->outpos++] = UUENC( 0x3f & (t>>18) ); + shar->outbuff[shar->outpos++] = UUENC( 0x3f & (t>>12) ); + shar->outbuff[shar->outpos++] = UUENC( 0x3f & (t>>6) ); + shar->outbuff[shar->outpos++] = UUENC( 0x3f & (t) ); + shar->uuavail = 0; + shar->outbytes += shar->uuavail; + shar->outbuff[shar->outpos] = 0; +} + +static int +archive_write_shar_data_uuencode(struct archive *a, const void *buff, + size_t length) +{ + struct shar *shar; + const char *src; + size_t n; + + shar = a->format_data; + if (!shar->has_data) + return (ARCHIVE_OK); + src = buff; + n = length; + while (n-- > 0) { + if (shar->uuavail == 3) + uuencode_group(shar); + if (shar->outpos >= 60) { + shar_printf(a, "%c%s\n", UUENC(shar->outbytes), + shar->outbuff); + shar->outpos = 0; + shar->outbytes = 0; + } + + shar->uubuffer[shar->uuavail++] = *src++; + shar->outbytes++; + } + return (length); +} + +static int +archive_write_shar_finish_entry(struct archive *a) +{ + const char *g, *p, *u; + struct shar *shar; + + shar = a->format_data; + if (shar->entry == NULL) + return (0); + + if (shar->dump) { + /* Finish uuencoded data. 
*/ + if (shar->has_data) { + if (shar->uuavail > 0) + uuencode_group(shar); + if (shar->outpos > 0) { + shar_printf(a, "%c%s\n", UUENC(shar->outbytes), + shar->outbuff); + shar->outpos = 0; + shar->uuavail = 0; + shar->outbytes = 0; + } + shar_printf(a, "%c\n", UUENC(0)); + shar_printf(a, "end\n"); + shar_printf(a, "SHAR_END\n"); + } + /* Restore file mode, owner, flags. */ + /* + * TODO: Don't immediately restore mode for + * directories; defer that to end of script. + */ + shar_printf(a, "chmod %o %s\n", + archive_entry_mode(shar->entry) & 07777, + archive_entry_pathname(shar->entry)); + + u = archive_entry_uname(shar->entry); + g = archive_entry_gname(shar->entry); + if (u != NULL || g != NULL) { + shar_printf(a, "chown %s%s%s %s\n", + (u != NULL) ? u : "", + (g != NULL) ? ":" : "", (g != NULL) ? g : "", + archive_entry_pathname(shar->entry)); + } + + if ((p = archive_entry_fflags(shar->entry)) != NULL) { + shar_printf(a, "chflags %s %s\n", p, + archive_entry_pathname(shar->entry)); + } + + /* TODO: restore ACLs */ + + } else { + if (shar->has_data) { + /* Finish sed-encoded data: ensure last line ends. */ + if (!shar->end_of_line) + shar_printf(a, "\n"); + shar_printf(a, "SHAR_END\n"); + } + } + + archive_entry_free(shar->entry); + shar->entry = NULL; + return (0); +} + +static int +archive_write_shar_finish(struct archive *a) +{ + struct shar *shar; + + /* + * TODO: Accumulate list of directory names/modes and + * fix them all up at end-of-archive. + */ + + shar = a->format_data; + + /* + * Only write the end-of-archive markers if the archive was + * actually started. This avoids problems if someone sets + * shar format, then sets another format (which would invoke + * shar_finish to free the format-specific data). + */ + if (shar->wrote_header) { + shar_printf(a, "exit\n"); + /* Shar output is never padded.
*/ + archive_write_set_bytes_in_last_block(a, 1); + /* + * TODO: shar should also suppress padding of + * uncompressed data within gzip/bzip2 streams. + */ + } + if (shar->entry) + archive_entry_free(shar->entry); + free(shar); + a->format_data = NULL; + return (ARCHIVE_OK); +} diff --git a/lib/libarchive/archive_write_set_format_ustar.c b/lib/libarchive/archive_write_set_format_ustar.c new file mode 100644 index 0000000..1a67a1d --- /dev/null +++ b/lib/libarchive/archive_write_set_format_ustar.c @@ -0,0 +1,428 @@ +/*- + * Copyright (c) 2003-2004 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer + * in this position and unchanged. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <sys/stat.h> +#ifdef DMALLOC +#include <dmalloc.h> +#endif +#include <errno.h> +#include <stdio.h> +#include <stdlib.h> +#include <string.h> + +#include "archive.h" +#include "archive_entry.h" +#include "archive_private.h" + +/* Constants are chosen so that 'term & 15' is number of termination bytes */ +#define TERM_BYTES(n) ((n) & 15) +#define OCTAL_TERM_SPACE_NULL 0x12 +#define OCTAL_TERM_NULL_SPACE 0x22 +#define OCTAL_TERM_NULL 0x31 +#define OCTAL_TERM_SPACE 0x41 +#define OCTAL_TERM_NONE 0x50 + +struct ustar { + uint64_t entry_bytes_remaining; + uint64_t entry_padding; + char written; +}; + +/* + * Define structure of POSIX 'ustar' tar header. + */ +struct archive_entry_header_ustar { + char name[100]; + char mode[8]; + char uid[8]; + char gid[8]; + char size[12]; + char mtime[12]; + char checksum[8]; + char typeflag[1]; + char linkname[100]; + char magic[6]; /* For POSIX: "ustar\0" */ + char version[2]; /* For POSIX: "00" */ + char uname[32]; + char gname[32]; + char devmajor[8]; + char devminor[8]; + char prefix[155]; +}; + +static int archive_write_ustar_data(struct archive *a, const void *buff, + size_t s); +static int archive_write_ustar_finish(struct archive *); +static int archive_write_ustar_finish_entry(struct archive *); +static int archive_write_ustar_header(struct archive *, + struct archive_entry *entry); +static int format_octal(int64_t, char *, int, int term); +static int64_t format_octal_recursive(int64_t, char *, int); +static int write_nulls(struct archive *a, size_t); + +/* + * Set output format to 'ustar' format. + */ +int +archive_write_set_format_ustar(struct archive *a) +{ + struct ustar *ustar; + + /* If someone else was already registered, unregister them. 
*/ + if (a->format_finish != NULL) + (a->format_finish)(a); + + ustar = malloc(sizeof(*ustar)); + if (ustar == NULL) { + archive_set_error(a, ENOMEM, "Can't allocate ustar data"); + return (ARCHIVE_FATAL); + } + memset(ustar, 0, sizeof(*ustar)); + a->format_data = ustar; + + a->pad_uncompressed = 1; /* Mimic gtar in this respect. */ + a->format_write_header = archive_write_ustar_header; + a->format_write_data = archive_write_ustar_data; + a->format_finish = archive_write_ustar_finish; + a->format_finish_entry = archive_write_ustar_finish_entry; + a->archive_format = ARCHIVE_FORMAT_TAR_USTAR; + a->archive_format_name = "POSIX ustar"; + return (ARCHIVE_OK); +} + +static int +archive_write_ustar_header(struct archive *a, struct archive_entry *entry) +{ + char buff[512]; + int ret; + struct ustar *ustar; + + ustar = a->format_data; + ustar->written = 1; + ret = __archive_write_format_header_ustar(a, buff, entry); + if (ret != ARCHIVE_OK) + return (ret); + ret = (a->compression_write)(a, buff, 512); + if (ret < 512) + return (ARCHIVE_FATAL); + + /* Only regular files (not hardlinks) have data. */ + if (archive_entry_hardlink(entry) != NULL || + archive_entry_symlink(entry) != NULL || + !S_ISREG(archive_entry_mode(entry))) + ustar->entry_bytes_remaining = 0; + else + ustar->entry_bytes_remaining = archive_entry_size(entry); + + ustar->entry_padding = 0x1ff & (- ustar->entry_bytes_remaining); + return (ARCHIVE_OK); +} + +/* + * Format a basic 512-byte "ustar" header. + * + * Returns -1 if format failed (due to field overflow). + * Note that this always formats as much of the header as possible. + * + * This is exported so that other 'tar' formats can use it. 
+ */ +int +__archive_write_format_header_ustar(struct archive *a, char buff[512], + struct archive_entry *entry) +{ + unsigned int checksum; + struct archive_entry_header_ustar *h; + int i, ret; + const char *p, *pp; + const struct stat *st; + + ret = 0; + memset(buff, 0, 512); + h = (struct archive_entry_header_ustar *)buff; + + /* + * Because the block is already null-filled, and strings + * are allowed to exactly fill their destination (without null), + * I use memcpy(dest, src, strlen()) here a lot to copy strings. + */ + + pp = archive_entry_pathname(entry); + if (strlen(pp) <= sizeof(h->name)) + memcpy(h->name, pp, strlen(pp)); + else { + /* Store in two pieces, splitting at a '/'. */ + p = strchr(pp + strlen(pp) - sizeof(h->name) - 1, '/'); + /* + * If there is no path separator, or the prefix or + * remaining name are too large, return an error. + */ + if (!p) { + archive_set_error(a, -1, "Pathname too long"); + ret = ARCHIVE_WARN; + } else if (p > pp + sizeof(h->prefix)) { + archive_set_error(a, -1, "Pathname too long"); + ret = ARCHIVE_WARN; + } else { + /* Copy prefix and remainder to appropriate places */ + memcpy(h->prefix, pp, p - pp); + memcpy(h->name, p + 1, pp + strlen(pp) - p - 1); + } + } + + p = archive_entry_hardlink(entry); + if(p == NULL) + p = archive_entry_symlink(entry); + if (p != NULL && p[0] != '\0') { + if (strlen(p) > sizeof(h->linkname)) { + archive_set_error(a, -1, "Link contents too long"); + ret = ARCHIVE_WARN; + } else + memcpy(h->linkname, p, strlen(p)); + } + + p = archive_entry_uname(entry); + if (p != NULL && p[0] != '\0') { + if (strlen(p) > sizeof(h->uname)) { + archive_set_error(a, -1, "Username too long"); + ret = ARCHIVE_WARN; + } else + memcpy(h->uname, p, strlen(p)); + } + + p = archive_entry_gname(entry); + if (p != NULL && p[0] != '\0') { + if (strlen(p) > sizeof(h->gname)) { + archive_set_error(a, -1, "Group name too long"); + ret = ARCHIVE_WARN; + } else + memcpy(h->gname, p, strlen(p)); + } + + st = 
archive_entry_stat(entry); + + if (format_octal(st->st_mode & 07777, h->mode, sizeof(h->mode), + OCTAL_TERM_SPACE_NULL)) { + archive_set_error(a, ERANGE, "Numeric mode too large"); + ret = ARCHIVE_WARN; + } + + if (format_octal(st->st_uid, h->uid, sizeof(h->uid), + OCTAL_TERM_SPACE_NULL)) { + archive_set_error(a, ERANGE, "Numeric user ID too large"); + ret = ARCHIVE_WARN; + } + + if (format_octal(st->st_gid, h->gid, sizeof(h->gid), + OCTAL_TERM_SPACE_NULL)) { + archive_set_error(a, ERANGE, "Numeric group ID too large"); + ret = ARCHIVE_WARN; + } + + if (format_octal(st->st_size, h->size, + sizeof(h->size), OCTAL_TERM_SPACE)) { + archive_set_error(a, ERANGE, "File size too large"); + ret = ARCHIVE_WARN; + } + + if (format_octal(st->st_mtime, h->mtime, + sizeof(h->mtime), OCTAL_TERM_SPACE)) { + archive_set_error(a, ERANGE, + "File modification time too large"); + ret = ARCHIVE_WARN; + } + + if (S_ISBLK(st->st_mode) || S_ISCHR(st->st_mode)) { + if (format_octal(major(st->st_rdev), h->devmajor, + sizeof(h->devmajor), OCTAL_TERM_SPACE_NULL)) { + archive_set_error(a, ERANGE, + "Major device number too large"); + ret = ARCHIVE_WARN; + } + + if (format_octal(minor(st->st_rdev), h->devminor, + sizeof(h->devminor), OCTAL_TERM_SPACE_NULL)) { + archive_set_error(a, ERANGE, + "Minor device number too large"); + ret = ARCHIVE_WARN; + } + } else { + format_octal(0, h->devmajor, sizeof(h->devmajor), + OCTAL_TERM_SPACE_NULL); + format_octal(0, h->devminor, sizeof(h->devminor), + OCTAL_TERM_SPACE_NULL); + } + + if (archive_entry_tartype(entry) >= 0) { + h->typeflag[0] = archive_entry_tartype(entry); + } else if (archive_entry_hardlink(entry) != NULL) { + h->typeflag[0] = '1'; + } else { + switch (st->st_mode & S_IFMT) { + case S_IFREG: h->typeflag[0] = '0' ; break; + case S_IFLNK: h->typeflag[0] = '2' ; break; + case S_IFCHR: h->typeflag[0] = '3' ; break; + case S_IFBLK: h->typeflag[0] = '4' ; break; + case S_IFDIR: h->typeflag[0] = '5' ; break; + case S_IFIFO: h->typeflag[0] = '6' 
; break; + case S_IFSOCK: + archive_set_error(a, -1, + "tar format cannot archive socket"); + ret = ARCHIVE_WARN; + break; + default: + archive_set_error(a, -1, + "tar format cannot archive this"); + ret = ARCHIVE_WARN; + } + } + + memcpy(h->magic, "ustar\0", 6); + memcpy(h->version, "00", 2); + memcpy(h->checksum, " ", 8); + checksum = 0; + for (i = 0; i < 512; i++) + checksum += 255 & (unsigned int)buff[i]; + if (format_octal(checksum, h->checksum, + sizeof(h->checksum), OCTAL_TERM_NULL_SPACE)) { + archive_set_error(a, ERANGE, + "Checksum too large (Internal error; this can't happen)"); + ret = ARCHIVE_WARN; + } + + return (ret); +} + +/* + * Format a number into the specified field. + */ +static int +format_octal(int64_t v, char *p, int s, int term) +{ + /* POSIX specifies that all numeric values are unsigned. */ + int64_t max; + int digits, ret; + + digits = s - TERM_BYTES(term); + max = (((int64_t)1) << (digits * 3)) - 1; + + if (v >= 0 && v <= max) { + format_octal_recursive(v, p, digits); + ret = 0; + } else { + format_octal_recursive(max, p, digits); + ret = -1; + } + + switch (term) { + case OCTAL_TERM_SPACE_NULL: + p[s-2] = 0x20; + /* fall through */ + case OCTAL_TERM_NULL: + p[s-1] = 0; + break; + case OCTAL_TERM_NULL_SPACE: + p[s-2] = 0; + /* fall through */ + case OCTAL_TERM_SPACE: + p[s-1] = 0x20; + break; + } + return (ret); +} + +static int64_t +format_octal_recursive(int64_t v, char *p, int s) +{ + if (s == 0) + return (v); + + v = format_octal_recursive(v, p+1, s-1); + *p = '0' + (v & 7); + return (v >>= 3); +} + +static int +archive_write_ustar_finish(struct archive *a) +{ + struct ustar *ustar; + + ustar = a->format_data; + /* + * Suppress end-of-archive if nothing else was ever written. + * This fixes a problem where setting one format, then another ends up + * attempting to write a gratuitous end-of-archive marker. 
+ */ + if (ustar->written && a->compression_write != NULL) + if (write_nulls(a, 512*2) != 0) + return (ARCHIVE_FATAL); + free(a->format_data); + a->format_data = NULL; + return (ARCHIVE_OK); +} + +static int +archive_write_ustar_finish_entry(struct archive *a) +{ + struct ustar *ustar; + int ret; + + ustar = a->format_data; + ret = write_nulls(a, + ustar->entry_bytes_remaining + ustar->entry_padding); + ustar->entry_bytes_remaining = ustar->entry_padding = 0; + return (ret); +} + +static int +write_nulls(struct archive *a, size_t padding) +{ + int ret, to_write; + + while (padding > 0) { + to_write = padding < a->null_length ? padding : a->null_length; + ret = (a->compression_write)(a, a->nulls, to_write); + if (ret < to_write) return (-1); + padding -= to_write; + } + return (0); +} + +static int +archive_write_ustar_data(struct archive *a, const void *buff, size_t s) +{ + struct ustar *ustar; + int ret; + + ustar = a->format_data; + if (s > ustar->entry_bytes_remaining) + s = ustar->entry_bytes_remaining; + ret = (a->compression_write)(a, buff, s); + ustar->entry_bytes_remaining -= s; + return (ret); +} diff --git a/lib/libarchive/libarchive.3 b/lib/libarchive/libarchive.3 new file mode 100644 index 0000000..f6fdb5e --- /dev/null +++ b/lib/libarchive/libarchive.3 @@ -0,0 +1,325 @@ +.\" Copyright (c) 2003-2004 Tim Kientzle +.\" All rights reserved. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution.
+.\" +.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" $FreeBSD$ +.\" +.Dd October 1, 2003 +.Dt LIBARCHIVE 3 +.Os +.Sh NAME +.Nm libarchive +.Nd functions for reading and writing streaming archives +.Sh LIBRARY +.Lb libarchive +.Sh OVERVIEW +The +.Nm +library provides a flexible interface for reading and writing +streaming archive files such as tar and cpio. +The library is inherently stream-oriented; readers serially iterate through +the archive, writers serially add things to the archive. +In particular, note that there is no built-in support for +random access nor for in-place modification. +.Pp +When reading an archive, the library automatically detects the +format and the compression. +The library currently has read support for: +.Bl -bullet -compact +.It +old-style tar +.It +most variants of the POSIX +.Dq ustar +format, +.It +the POSIX +.Dq pax interchange +format, +.It +GNU-format tar archives, +.It +POSIX octet-oriented cpio archives. +.El +The library automatically detects archives compressed with +.Xr gzip 1 +or +.Xr bzip2 1 +and decompresses them transparently. +.Pp +When writing an archive, you can specify the compression +to be used and the format to use. 
+The library can write +.Bl -bullet -compact +.It +POSIX-standard +.Dq ustar +archives, +.It +POSIX +.Dq pax interchange format +archives, +.It +POSIX octet-oriented cpio archives. +.El +The default write format is the pax interchange +format. +Pax interchange format is an extension of the tar archive format that +eliminates essentially all of the limitations of historic tar formats +in a standard fashion that is supported +by POSIX-compliant +.Xr pax 1 +implementations on many systems as well as several newer implementations of +.Xr tar 1 . +.Pp +The read and write APIs are accessed through the +.Fn archive_read_XXX +functions and the +.Fn archive_write_XXX +functions, respectively, and either can be used independently +of the other. +.Pp +The rest of this manual page provides an overview of the library +operation. +More detailed information can be found in the individual manual +pages for each API or utility function. +.Sh READING AN ARCHIVE +To read an archive, you must first obtain an initialized +.Tn struct archive +object from +.Fn archive_read_new . +You can then modify this object for the desired operations with the +various +.Fn archive_read_set_XXX +and +.Fn archive_read_support_XXX +functions. +In particular, you will need to invoke appropriate +.Fn archive_read_support_XXX +functions to enable the corresponding compression and format +support. +Note that these latter functions perform two distinct operations: +they cause the corresponding support code to be linked into your +program, and they enable the corresponding auto-detect code. +Unless you have specific constraints, you will generally want +to invoke +.Fn archive_read_support_compression_all +and +.Fn archive_read_support_format_all +to enable auto-detect for all formats and compression types +currently supported by the library. +.Pp +Once you have prepared the +.Tn struct archive +object, you call +.Fn archive_read_open +to actually open the archive and prepare it for reading. 
+.Pp +Each archive entry consists of a header followed by a certain +amount of data. +You can obtain the next header with +.Fn archive_read_next_header , +which returns a pointer to a +.Tn struct archive_entry +structure with information about the current archive element. +If the entry is a regular file, then the header will be followed +by the file data. +You can use +.Fn archive_read_data +(which works much like the +.Xr read 2 +system call) +to read this data from the archive. +You may prefer to use the higher-level +.Fn archive_read_data_skip , +which reads and discards the data for this entry, +.Fn archive_read_data_into_buffer , +which reads the data into an in-memory buffer, +.Fn archive_read_data_into_fd , +which copies the data to the provided file descriptor, or +.Fn archive_read_extract , +which recreates the specified entry on disk and copies data +from the archive. +In particular, note that +.Fn archive_read_extract +uses the +.Tn struct archive_entry +structure that you provide it, which may differ from the +entry just read from the archive. +For example, many applications will want to override the +pathname, file permissions, or ownership. +.Pp +Once you have finished reading data from the archive, you +should call +.Fn archive_read_finish +to release all resources. +In particular, +.Fn archive_read_finish +closes the archive and frees any memory that was allocated by the library. +.Pp +The +.Xr archive_read 3 +manual page provides more detailed calling information for this API. +.Sh WRITING AN ARCHIVE +You use a similar process to write an archive. +The +.Fn archive_write_new +function creates an archive object useful for writing, +the various +.Fn archive_write_set_XXX +functions are used to set parameters for writing the archive, and +.Fn archive_write_open +completes the setup and opens the archive for writing.
+.Pp +Individual archive entries are written in a three-step +process: +You first initialize a +.Tn struct archive_entry +structure with information about the new entry. +At a minimum, you should set the pathname of the +entry and provide a +.Va struct stat +with a valid +.Va st_mode +field, which specifies the type of object, and +.Va st_size +field, which specifies the size of the data portion of the object. +The +.Fn archive_write_header +function actually writes the header data to the archive. +You can then use +.Fn archive_write_data +to write the actual data. +.Pp +After all entries have been written, use the +.Fn archive_write_finish +function to release all resources. +.Pp +The +.Xr archive_write 3 +manual page provides more detailed calling information for this API. +.Sh DESCRIPTION +Detailed descriptions of each function are provided by the +corresponding manual pages. +.Pp +All of the functions utilize an opaque +.Tn struct archive +datatype that provides access to the archive contents. +.Pp +The +.Tn struct archive_entry +structure contains a complete description of a single archive +entry. +It uses an opaque interface that is fully documented in +.Xr archive_entry 3 . +.Pp +Users familiar with historic formats should be aware that the newer +variants have eliminated most restrictions on the length of textual fields. +Clients should not assume that filenames, link names, user names, or +group names are limited in length. +In particular, pax interchange format can easily accommodate pathnames +that exceed +.Va PATH_MAX . +.Sh RETURN VALUES +Most functions return zero on success, non-zero on error. +On error, the +.Fn archive_errno +function can be used to retrieve a numeric error code (see +.Xr errno 2 ) . +The +.Fn archive_error_string +returns a textual error message suitable for display. +.Pp +.Fn archive_read_new +and +.Fn archive_write_new +return pointers to an allocated and initialized +.Tn struct archive +object.
+.Pp +.Fn archive_read_next_header +returns a pointer to an +.Tn struct archive_entry +structure or +.Dv NULL . +If +.Dv NULL +is returned, the value from +.Fn archive_errno +will be zero if the end of the archive was reached, +-1 if there was a recoverable error reading the archive, +or positive if there was a non-recoverable error reading the archive. +If there was a recoverable error, the client should retry the +operation. +.Pp +.Fn archive_read_data +and +.Fn archive_write_data +return a count of the number of bytes actually read or written. +A value of zero indicates the end of the data for this entry. +A negative value indicates an error, in which case the +.Fn archive_errno +and +.Fn archive_error_string +functions can be used to obtain more information. +.Sh ENVIRONMENT +The library currently obeys no environment variables. +.Sh SEE ALSO +.Xr tar 1 , +.Xr archive_entry 3 , +.Xr archive_read 3 , +.Xr archive_util 3 , +.Xr archive_write 3 , +.Xr tar 5 . +.Sh HISTORY +The +.Nm libarchive +library first appeared in +.Fx 5.3 . +.Sh AUTHORS +.An -nosplit +The +.Nm libarchive +library was written by +.An Tim Kientzle Aq kientzle@acm.org . +.Sh BUGS +Some archive formats support information that is not supported by +.Tn struct archive_entry +and cannot therefore be archived or restored using this library. +This includes, for example, comments, character sets, sparse +file information, or the arbitrary key/value pairs that can appear in +pax interchange format archives. +.Pp +Conversely, of course, not all of the information that can be +stored in an +.Tn struct archive_entry +is supported by all formats. +For example, cpio formats do not support nanosecond timestamps; +old tar formats do not support large device numbers. +.Pp +The library does not have write support for pre-POSIX tar archives. +The support for GNU tar format is incomplete. 
+.Pp +The library should obey the current locale and convert +UTF8 filenames stored by pax interchange format to and from the +currently-active character coding. diff --git a/lib/libarchive/tar.5 b/lib/libarchive/tar.5 new file mode 100644 index 0000000..5149454 --- /dev/null +++ b/lib/libarchive/tar.5 @@ -0,0 +1,626 @@ +.\" Copyright (c) 2003-2004 Tim Kientzle +.\" All rights reserved. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" $FreeBSD$ +.\" +.Dd October 1, 2003 +.Dt TAR 5 +.Os +.Sh NAME +.Nm tar +.Nd format of tape archive files +.Sh DESCRIPTION +The +.Nm +archive format collects any number of files, directories, and other +filesystem objects (symbolic links, device nodes, etc.) 
into a single +stream of bytes. +The format was originally designed to be used with +tape drives that operate with fixed-size blocks, but is widely used as +a general packaging mechanism. +.Ss General Format +A +.Nm +archive consists of a series of 512-byte records. +Each filesystem object requires a header record which stores basic metadata +(pathname, owner, permissions, etc.) and zero or more records containing any +file data. +The end of the archive is indicated by two records consisting +entirely of zero bytes. +.Pp +For compatibility with tape drives that use fixed block sizes, +programs that read or write tar files always read or write a fixed +number of records with each I/O operation. +These +.Dq blocks +are always a multiple of the record size. +The most common block size---and the maximum supported by historical +implementations---is 10240 bytes or 20 records. +(Note: the terms +.Dq block +and +.Dq record +here are not entirely standard; this document follows the +convention established by John Gilmore in documenting +.Nm pdtar . ) +.Ss Old-Style Archive Format +The original tar archive format has been extended many times to +include additional information that various implementors found +necessary. +This section describes a variant that is compatible with +most historic +.Nm +implementations. +.Pp +The header record for an old-style +.Nm +archive consists of the following: +.Bd -literal -offset indent +struct tarfile_header_old { + char name[100]; + char mode[8]; + char uid[8]; + char gid[8]; + char size[12]; + char mtime[12]; + char checksum[8]; +}; +.Ed +The remaining bytes in the header record are filled with nulls. +.Bl -tag -width indent +.It Va name +Pathname, stored as a null-terminated string. +Some very early implementations only supported regular files. +However, a common early convention added +a trailing "/" character to indicate a directory name, allowing +directory permissions and owner information to be archived and restored. 
+.It Va mode +File mode, stored as an octal number in ASCII. +.It Va uid , Va gid +User id and group id of owner, as octal number in ASCII. +.It Va size +Size of file, as octal number in ASCII. +.It Va mtime +Modification time of file, as an octal number in ASCII. +This indicates the number of seconds since the start of the epoch, +00:00:00 UTC January 1, 1970. +Note that negative values should be avoided +here, as they are handled inconsistently. +.It Va checksum +Header checksum, stored as an octal number in ASCII. +To compute the checksum, set the checksum field to all spaces, +then sum all bytes in the header using unsigned arithmetic. +This field should be stored as six octal digits followed by a null and a space +character. +Note that for many years, Sun tar used signed arithmetic +for the checksum field, which can cause interoperability problems +when transferring archives between systems. +This error was propagated to other implementations that used Sun +tar as a reference. +Modern robust readers compute the checksum both ways and accept the +header if either computation matches. +.El +.Pp +Early implementations of +.Nm +varied in how they terminated these fields. +Early BSD documentation specified the following: the pathname must +be null-terminated; the mode, uid, and gid fields must end in a space and a +null byte; the size and mtime fields must end in a space; the checksum is +terminated by a null and a space. +For best portability, writers of +.Nm +archives should fill the numeric fields with leading zeros. +.Ss Early Extensions +Very early +.Nm +implementations only supported regular files. +Two early extensions added support for directories, hard links, and +symbolic links. +.Pp +Early +.Nm +archives indicated directories by adding a trailing +.Pa / +to the name. +The size field was often used to indicate the total size of all files +in the directory. 
+This was intended to facilitate extraction on systems that pre-allocated +directory storage; most modern readers should simply ignore the +size field for directories. +.Pp +To support hard links and symbolic links, +.Va linktype +and +.Va linkname +fields were added: +.Bd -literal -offset indent +struct tarfile_entry_common { + char name[100]; + char mode[8]; + char uid[8]; + char gid[8]; + char size[12]; + char mtime[12]; + char checksum[8]; + char linktype[1]; + char linkname[100]; +}; +.Ed +.Pp +The +.Va linktype +field indicates the type of entry. +For backwards compatibility, a NULL +character here indicates a regular file or directory. +An ASCII "1" here indicates a hard link entry, ASCII "2" indicates +a symbolic link. +The +.Va linkname +field holds the name of the file linked to. +.Ss POSIX Standard Archives +POSIX 1003.1 defines a standard +.Nm +file format that is read and written +by POSIX-compliant implementations +of +.Xr pax 1 . +This format is often called the +.Dq ustar +format, after the magic value used +in the header. +(The name is an acronym for +.Dq Unix Standard TAR . ) +It extends the format above +with new fields: +.Bd -literal -offset indent +struct tarfile_entry_posix { + char name[100]; + char mode[8]; + char uid[8]; + char gid[8]; + char size[12]; + char mtime[12]; + char checksum[8]; + char typeflag[1]; + char linkname[100]; + char magic[6]; + char version[2]; + char uname[32]; + char gname[32]; + char devmajor[8]; + char devminor[8]; + char prefix[155]; +}; +.Ed +.Bl -tag -width indent +.It Va typeflag +Type of entry. POSIX adopted the BSD +.Va linktype +field and extended it with several new type values: +.Bl -tag -width indent -compact +.It Dq 0 +Regular file. NULL should be treated as a synonym, for compatibility purposes. +.It Dq 1 +Hard link. +.It Dq 2 +Symbolic link. +.It Dq 3 +Character device node. +.It Dq 4 +Block device node. +.It Dq 5 +Directory. +.It Dq 6 +FIFO node. +.It Dq 7 +Reserved. 
+.It Other +A POSIX-compliant implementation must treat any unrecognized typeflag value +as a regular file. +In particular, writers should ensure that all entries +have a valid filename so that they can be restored by readers that do not +support the corresponding extension. +Uppercase letters "A" through "Z" are reserved for custom extensions. +Note that sockets and whiteout entries are not archivable. +.El +.It Va magic +Contains the magic value +.Dq ustar +followed by a NULL byte to indicate that this is a POSIX standard archive. +Full compliance requires the uname and gname fields be properly set. +(Note that GNU tar archives use a trailing space rather than a trailing +NULL here and are therefore not POSIX standard archives.) +.It Va version +Version. This should be +.Dq 00 +(two copies of the ASCII digit zero) for POSIX standard archives. +(Note that GNU tar archives fill this with a space and a null.) +.It Va uname , Va gname +User and group names, as null-terminated ASCII strings. +These should be used in preference to the uid/gid values +when they are set and the corresponding names exist on +the system. +.It Va devmajor , Va devminor +Major and minor numbers for character device or block device entry. +.It Va prefix +First part of pathname. +If the pathname is too long to fit in the 100 bytes provided by the standard +format, it can be split at any +.Pa / +character with the first portion going here. +If the prefix field is not empty, the reader will prepend +the prefix value and a +.Pa / +character to the regular name field to obtain the full pathname. +.El +.Pp +Note that all unused bytes must be set to +.Dv NULL . +.Pp +Field termination is specified slightly differently by POSIX +than by previous implementations. +The +.Va magic , +.Va uname , +and +.Va gname +fields must have a trailing +.Dv NULL . +The +.Va name , +.Va linkname , +and +.Va prefix +fields must have a trailing +.Dv NULL +unless they fill the entire field. 
+(In particular, it is possible to store a 256-character pathname if it +happens to have a +.Pa / +as the 156th character.) +POSIX requires numeric fields to be zero-padded in the front, and allows +them to be terminated with either space or +.Dv NULL +characters. +.Ss Pax Interchange Format +There are many attributes that cannot be portably stored in a +POSIX ustar archive. +POSIX defined a +.Dq pax interchange format +that uses two new types of entries to hold text-formatted +metadata that applies to following entries. +Note that a pax interchange format archive is a ustar archive in every +respect. +The new data is stored in ustar-compatible archive entries that use the +.Dq x +or +.Dq g +typeflag. +In particular, older implementations that do not fully support these +extensions will extract the metadata into regular files, where the +metadata can be examined as necessary. +.Pp +An entry in a pax interchange format archive consists of one or +two standard entries, each with its own header and data. +The first optional entry stores the extended attributes +for the second entry. +This optional first entry has an "x" typeflag and a size field that +indicates the total size of the extended attributes. +The extended attributes themselves are stored as a series of text-format +lines encoded in the portable UTF-8 encoding. +Each line consists of a decimal number, a space, a key string, an equals +sign, a value string, and a new line. +The decimal number indicates the length of the entire line, including the +initial length field and the trailing newline. +Keys are always encoded in portable 7-bit ASCII. +Keys in all lowercase are reserved for future standardization. +Vendors can add their own keys by prefixing them with an all uppercase +vendor name and a period. +Note that, unlike the historic header, numeric values are stored using +decimal, not octal. +.Bl -tag -width indent +.It Cm atime , Cm ctime , Cm mtime +File access, inode change, and modification times. 
+These fields can be negative or include a decimal point and a fractional value. +.It Cm uname , Cm uid , Cm gname , Cm gid +User name, group name, and numeric UID and GID values. The user name +and group name stored here are encoded in UTF-8 and can thus include +non-ASCII characters. The UID and GID fields can be of arbitrary length. +.It Cm linkpath +The full path of the linked-to file. Note that this is encoded in UTF-8 +and can thus include non-ASCII characters. +.It Cm path +The full pathname of the entry. Note that this is encoded in UTF-8 +and can thus include non-ASCII characters. +.It Cm realtime.* , Cm security.* +These keys are reserved by SUSv3 and may be used for future standardization. +.It Cm size +The size of the file. Note that there is no length limit on this field, +allowing +.Nm +archives to store files much larger than the historic 8GB limit. +.It Cm SCHILY.* +Vendor-specific attributes used by Joerg Schilling's +.Nm star +implementation. +.It Cm SCHILY.acl.access , Cm SCHILY.acl.default +Stores the access and default ACLs as textual strings in a format +that's an extension of the format specified by POSIX XXXX draft 17. +In particular, each user or group access specification can include a fourth +field with the integer UID or GID. +This allows ACLs to be restored on systems that may not have complete +user or group information available (such as when NIS/YP or LDAP services +are temporarily unavailable). +.It Cm SCHILY.devminor , Cm SCHILY.devmajor +The full minor and major numbers for device nodes. +.It Cm SCHILY.ino +The inode number for the entry. +.It Cm VENDOR.* +XXX document other vendor-specific extensions XXX +.El +.Pp +Any values stored in an extended attribute override the corresponding +values in the regular tar header. +Note that compliant readers should ignore the regular fields when they +are overridden. 
+This is important, as existing archivers are known to store non-compliant +values in the standard header fields in this situation. +There are no limits on length for any of these fields. +In particular, numeric fields can be arbitrarily large. +All text fields are encoded in UTF-8. +Compliant writers should store only portable 7-bit ASCII characters in +the standard ustar header and use extended +attributes whenever a text value contains non-ASCII characters. +.Pp +In addition to the +.Cm x +entry described above, the pax interchange format +also supports a +.Cm g +entry. +The +.Cm g +entry is identical in format, but specifies attributes that serve as +defaults for all subsequent archive entries. +The +.Cm g +entry is not widely used. +.Ss GNU Tar Archives +The GNU tar program added new features by starting with an early draft +of POSIX and using three different extension mechanisms: They added +new fields to the empty space in the header (some of which was later +used by POSIX for conflicting purposes); +they allowed the header to +be continued over multiple records; +and they defined new entries +that modify following entries (similar in principle to the +.Cm x +entry described above, but each GNU special entry is single-purpose, +unlike the general-purpose +.Cm x +entry). +As a result, GNU tar archives are not POSIX compatible, although +more lenient POSIX-compliant readers can successfully extract most +GNU tar archives. 
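Since GNU and POSIX headers differ in their magic and version bytes, a reader might tell them apart with a check along the following lines. This is only a sketch: the function name, the enum, and the offset macros are hypothetical, and the offsets (257 and 263) are derived by adding up the field widths in the structs shown on this page.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Offsets assumed from the struct layout: 100+8+8+8+12+12+8+1+100 = 257. */
#define TAR_MAGIC_OFFSET	257
#define TAR_VERSION_OFFSET	263

/* Hypothetical classification used only for this illustration. */
enum tar_flavor { TAR_FLAVOR_OLD, TAR_FLAVOR_POSIX, TAR_FLAVOR_GNU };

/* Classify a 512-byte header record by its magic and version fields. */
static enum tar_flavor
tar_header_flavor(const unsigned char *hdr)
{
	const unsigned char *magic;
	const unsigned char *version;

	assert(hdr != NULL);
	magic = hdr + TAR_MAGIC_OFFSET;
	version = hdr + TAR_VERSION_OFFSET;

	/* POSIX ustar: "ustar" plus a null byte, then the two digits "00". */
	if (memcmp(magic, "ustar\0", 6) == 0 && memcmp(version, "00", 2) == 0)
		return (TAR_FLAVOR_POSIX);
	/* GNU tar: "ustar" plus a space, then a space and a null byte. */
	if (memcmp(magic, "ustar ", 6) == 0 && memcmp(version, " \0", 2) == 0)
		return (TAR_FLAVOR_GNU);
	/* Anything else is treated here as an old-style header. */
	return (TAR_FLAVOR_OLD);
}
```

A lenient reader would probably use such a result only as a hint and still fall back to the common fields when the magic is unrecognized.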
+.Bd -literal -offset indent +struct tarfile_entry_gnu { + char name[100]; + char mode[8]; + char uid[8]; + char gid[8]; + char size[12]; + char mtime[12]; + char checksum[8]; + char typeflag[1]; + char linkname[100]; + char magic[6]; + char version[2]; + char uname[32]; + char gname[32]; + char devmajor[8]; + char devminor[8]; + char atime[12]; + char ctime[12]; + char offset[12]; + char longnames[4]; + char unused[1]; + struct { + char offset[12]; + char numbytes[12]; + } sparse[4]; + char isextended[1]; + char realsize[12]; +}; +.Ed +.Bl -tag -width indent +.It Va typeflag +GNU tar uses the following special entry types. +.Bl -tag -width indent +.It "7" +GNU tar treats type "7" records identically to type "0" records, +except on one obscure RTOS where they are used to indicate the +pre-allocation of a contiguous file on disk. +.It "D" +This indicates a directory entry. Unlike the POSIX-standard "5" +typeflag, the header is followed by data records listing the names +of files in this directory. Each name is preceded by an ASCII "Y" +if the file is stored in this archive or "N" if the file is not +stored in this archive. Each name is terminated with a null, and +an extra null marks the end of the name list. The purpose of this +entry is to support incremental backups; a program restoring from +such an archive may wish to delete files on disk that did not exist +in the directory when the archive was made. +.Pp +Note that the "D" typeflag specifically violates POSIX, which requires +that unrecognized typeflags be restored as normal files. +In this case, restoring the "D" entry as a file could interfere +with subsequent creation of the like-named directory. +.It "K" +The data for this entry is a long linkname for the following regular entry. +.It "L" +The data for this entry is a long pathname for the following regular entry. +.It "M" +This is a continuation of the last file on the previous volume. 
+GNU multi-volume archives guarantee that each volume begins with a valid +entry header. +To ensure this, a file may be split, with part stored at the end of one volume, +and part stored at the beginning of the next volume. +The "M" typeflag indicates that this entry continues +an existing file. +Such entries can only occur as the first or second entry +in an archive (the latter only if the first entry is a volume label). +The +.Va size +field specifies the size of this entry. +The +.Va offset +field at bytes 369-380 specifies the offset where this file fragment +begins. +The +.Va realsize +field specifies the total size of the file (which must equal +.Va size +plus +.Va offset ) . +When extracting, GNU tar checks that the header file name is the one it is +expecting, that the header offset is in the correct sequence, and that +the sum of offset and size is equal to realsize. +FreeBSD's version of GNU tar does not handle the corner case of an +archive being continued in the middle of a long name or other +extension header. +.It "N" +Type "N" records are no longer generated by GNU tar. They contained a +list of files to be renamed or symlinked after extraction; this was +originally used to support long names. The contents of this record +are a text description of the operations to be done, in the form +.Dq Rename %s to %s\en +or +.Dq Symlink %s to %s\en ; +in either case, both +filenames are escaped using K&R C syntax. +.It "S" +This is a +.Dq sparse +regular file. +Sparse files are stored as a series of fragments. +The header contains a list of fragment offset/length pairs. +If more than four such entries are required, the header is +extended as necessary with +.Dq extra +header extensions (an older format that's no longer used), or +.Dq sparse +extensions. +.It "V" +The +.Va name +field should be interpreted as a tape/volume header name. +This entry should generally be ignored on extraction. 
+.El +.It Va magic +The magic field holds the five characters +.Dq ustar +followed by a space. +Note that POSIX ustar archives have a trailing null. +.It Va version +The version field holds a space character followed by a null. +Note that POSIX ustar archives use two copies of the ASCII digit +.Dq 0 . +.It Va atime , Va ctime +The time the file was last accessed and the time of +last change of file information, stored in octal as with +.Va mtime . +.It Va longnames +This field is apparently no longer used. +.It Sparse Va offset / Va numbytes +Each such structure specifies a single fragment of a sparse +file. +The two fields store values as octal numbers. +The fragments are each padded to a multiple of 512 bytes +in the archive. +On extraction, the list of fragments is collected from the +header (including any extension headers), and the data +is then read and written to the file at appropriate offsets. +.It Va isextended +If this is set to non-zero, the header will be followed by +additional +.Dq sparse header +records. +Each such record contains XXX more details needed XXX +.It Va realsize +A binary representation of the size, with a much larger range +than the POSIX file size. +.El +.Ss Other Extensions +One common extension, utilized by GNU tar, star, and other newer +.Nm +implementations, permits binary numbers in the standard numeric +fields. +This is flagged by setting the high bit of the first character. +This permits 95-bit values for the length and time fields +and 63-bit values for the uid, gid, and device numbers. +GNU tar supports this extension for the +length, mtime, ctime, and atime fields. +Joerg Schilling's star program supports this extension for +all numeric fields. +Note that this extension is largely obsoleted by the extended attribute +record provided by the pax interchange format. +.Pp +Another early GNU extension allowed base-64 values rather +than octal. +This extension was short-lived and such archives are almost never seen. 
+However, there is still code in GNU tar to support them; this code is +responsible for a very cryptic warning message that is sometimes seen when +GNU tar encounters a damaged archive. +.Sh SEE ALSO +.Xr ar 1 , +.Xr pax 1 , +.Xr tar 1 +.Sh STANDARDS +The +.Nm tar +utility is no longer a part of any official standard. +It last appeared in SUSv2. +It has been supplanted in subsequent standards by +.Xr pax 1 . +The ustar format is defined in +.St -p1003.1 +as part of the specification for the +.Xr pax 1 +utility. +The pax interchange file format is new with +.St -p1003.1-2001 . +.Sh HISTORY +A +.Nm tar +command appeared in Sixth Edition Unix. +John Gilmore's +.Nm pdtar +public-domain implementation (circa 1987) was highly influential +and formed the basis of GNU tar. +Joerg Schilling's +.Nm star +archiver is another open-source (GPL) archiver (originally developed +circa 1985) which features complete support for pax interchange +format. |
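As a small illustration of the checksum rule given under Old-Style Archive Format (treat the checksum field as spaces, sum every header byte, and accept either the unsigned or the signed total), a reader might proceed roughly as follows. This is a sketch under stated assumptions, not the code added by this commit: all function and macro names are hypothetical, and the checksum offset of 148 is derived from the field widths in the header structs above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define TAR_RECORD_SIZE		512
#define TAR_CHECKSUM_OFFSET	148	/* assumed: 100 + 8 + 8 + 8 + 12 + 12 */
#define TAR_CHECKSUM_SIZE	8

/* Parse an octal ASCII field that ends in a space, a null, or the field end. */
static long
tar_parse_octal(const unsigned char *field, size_t len)
{
	char buf[32];

	assert(len < sizeof(buf));
	memcpy(buf, field, len);
	buf[len] = '\0';
	return (strtol(buf, NULL, 8));
}

/* Sum the header both ways, counting the checksum field as all spaces. */
static void
tar_header_sums(const unsigned char *hdr, unsigned long *usum, long *ssum)
{
	size_t i;

	*usum = 0;
	*ssum = 0;
	for (i = 0; i < TAR_RECORD_SIZE; i++) {
		int in_field = i >= TAR_CHECKSUM_OFFSET &&
		    i < TAR_CHECKSUM_OFFSET + TAR_CHECKSUM_SIZE;
		unsigned char c = in_field ? ' ' : hdr[i];

		*usum += c;
		/*
		 * Converting values above 127 to signed char is
		 * implementation-defined in C; on common two's-complement
		 * platforms it gives the negative value a signed-arithmetic
		 * writer would have added.
		 */
		*ssum += (signed char)c;
	}
}

/* Return nonzero if the stored checksum matches either computation. */
static int
tar_checksum_ok(const unsigned char *hdr)
{
	unsigned long usum;
	long ssum, stored;

	stored = tar_parse_octal(hdr + TAR_CHECKSUM_OFFSET, TAR_CHECKSUM_SIZE);
	tar_header_sums(hdr, &usum, &ssum);
	return ((unsigned long)stored == usum || stored == ssum);
}
```

The two sums differ only when the header contains bytes with the high bit set, which is why archives with plain ASCII names were never affected by the signed-arithmetic variant.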