Last updated: March 18, 2024
Since its inception in January 1979, the tar tape archiver tool has expanded far beyond its original purpose. Over the years, it has branched out from the POSIX standard into two main versions: BSD tar and GNU tar.
While there are versions for embedded systems and custom implementations like star by Jörg Schilling, the ones above are arguably the most robust and widely used.
In this tutorial, we’ll explore how the GNU and BSD tar implementations differ. First, we look at BSD tar and its main branches. After that, we turn to GNU tar. Next, we go over star. Then, the different tar formats take the spotlight. Finally, we compare the main tar implementations in terms of their most important features.
We tested the code in this tutorial on Debian 11 (Bullseye) with GNU Bash 5.1.4, as well as OpenBSD 7.3 and FreeBSD 13.2. It should work in most POSIX-compliant environments unless otherwise specified.
BSD tar comes in two variants: the original implementation and a library-based one. Let’s briefly look at both.
The original BSD tar is still bundled with OpenBSD:
$ tar .
tar: unknown option .
usage: tar {crtux}[014578befHhjLmNOoPpqsvwXZz]
[blocking-factor | archive | replstr] [-C directory] [-I file]
[file ...]
tar {crtux}[014578befHhjLmNOoPpqsvwXZz] [-b blocking-factor]
[-C directory] [-f archive] [-I file] [-s replstr] [file ...]
In fact, its command name is just tar and it has no default alias or symbolic link like bsdtar. That’s mainly due to the security-focused, self-contained nature of OpenBSD.
For the same reasons, we don’t really have a way to get the current version of the tool in this case, apart from checking the OS release. Actually, this version of tar is the OpenBSD POSIX-strict implementation, part of the bundled toolkit source code, so finding it on another platform is rare.
On the other hand, FreeBSD ships with another version of tar, which we’ll identify as [bsd]tar. Also, it’s available as a package in major Linux distributions:
$ which tar
/usr/bin/tar
$ which bsdtar
/usr/bin/bsdtar
$ ls -l /usr/bin/tar
lrwxr-xr-x 1 root wheel 4 Apr 7 08:42 /usr/bin/tar -> bsdtar
$ tar --version
bsdtar 3.6.2 - libarchive 3.6.2 zlib/1.2.13 liblzma/5.4.1 bz2lib/1.0.8 libzstd/1.4.8
In this case, we use which along with ls and its [-l]ong listing format to verify the files and locations. Then, tar --version confirms that tar is in fact bsdtar.
This [bsd]tar version of tar is based on the libarchive library. Although perhaps not quite up to OpenBSD standards, the libarchive code is very comprehensive and feature-rich.
Because of this, the bsdtar command in recent versions of Ubuntu actually comes from the libarchive-tools package:
$ apt-get install bsdtar
[...]
Package bsdtar is not available, but is referred to by another package.
This may mean that the package is missing, has been obsoleted, or
is only available from another source
However the following packages replace it:
libarchive-tools
E: Package 'bsdtar' has no installation candidate
$ apt-get install libarchive-tools
[...]
On the other hand, the tar command represents another utility in Ubuntu and most major distributions.
As expected, there is also a GNU tar command, often called regular tar.
For many major Linux distributions, that’s the main tar implementation that either comes preinstalled or in the default tar package:
$ apt-get install tar
Let’s try it out by checking the version we have:
$ tar --version
tar (GNU tar) 1.34
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <https://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Written by John Gilmore and Jay Fenlason.
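The version banners shown so far are enough to tell the implementations apart in a script. The following is a minimal sketch rather than a standard tool: the classify_tar_banner function name and the labels it prints are made up for illustration, and it only inspects the first line of the --version output.

```shell
# Classify a tar implementation from the first line of its version banner.
# The function name and the printed labels are illustrative assumptions.
classify_tar_banner() {
    case "$1" in
        *"GNU tar"*) echo "gnu" ;;
        *bsdtar*)    echo "libarchive" ;;
        *)           echo "unknown" ;;  # e.g. OpenBSD tar, which has no --version
    esac
}

# Usage: inspect whatever `tar` resolves to on the current system.
classify_tar_banner "$(tar --version 2>/dev/null | head -n 1)"
```

Since the original OpenBSD tar rejects unknown options, it produces an empty banner here and falls through to the last branch.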
GNU tar supports the features of the original BSD tar implementation but also includes GNU-specific options and additions. Critically, GNU has its own archive format specification due to the limitations of older tar formats. In addition, the default archive format is set when GNU tar is built.
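Because the default format is a build-time choice, it can vary between systems. As a small sketch, assuming GNU tar is the tar on the PATH, the --show-defaults option prints the compiled-in defaults, and the format can be picked out of that line. The default_format variable name is only for illustration.

```shell
# Print the defaults compiled into this GNU tar build, including --format.
tar --show-defaults

# Extract just the format name from that line (GNU tar only).
default_format=$(tar --show-defaults | tr ' ' '\n' | sed -n 's/^--format=//p')
echo "default format: ${default_format}"
```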
Although it doesn’t directly branch from either of the main competitors, the star command has its own place among tar archivers thanks to its many exclusive features. Because of them, and because it fully supports POSIX 1003.1 and POSIX.1-2001, we add it here for completeness.
The star command integrates several features from other commands:
For some, even the syntax is similar enough for convenient usage.
Support for many aspects and types of metadata is included in star:
Due to the above, star can handle complex backup features like incremental backups.
Unlike other tar implementations, star can handle the archive and compression format detection automatically and decompress where needed. On top of that, any extractions will not replace more recent copies of the same data by default.
Using several features, star manages to outperform most tar implementations:
Actually, star is billed as the fastest known tar archiver.
The star tool has its own portable rmt server and mt client implementation. Thus, it provides full support for the remote magtape protocol.
In addition, star manages all these features while remaining independent from both the operating system and the filesystem.
Now, let’s explore some differences between the main versions of tar archives.
There are five main tar archive formats:
+--------+-------------+---------------+-----------------+--------------------+
| Format | Max UID/GID | Max File Size | Max Name Length | Device Number Bits |
+--------+-------------+---------------+-----------------+--------------------+
| gnu    | 1.8e19      | unlimited     | unlimited       | 63                 |
| oldgnu | 1.8e19      | unlimited     | unlimited       | 63                 |
| v7     | 2097151     | 8GB           | 99              | -                  |
| ustar  | 2097151     | 8GB           | 256             | 21                 |
| posix  | unlimited   | unlimited     | unlimited       | unlimited          |
+--------+-------------+---------------+-----------------+--------------------+
Let’s briefly explore each one.
The GNU tar archive format is based on an early draft of the POSIX tar format but adds many improvements. Its older version, oldgnu, was used in GNU tar versions before 1.12.
Critically, the GNU tar archive is incompatible with the original POSIX standard it extends due to header and limit changes.
The original pre-POSIX archive format from UNIX v7 is still relevant despite its many limitations:
In fact, versions of automake prior to 1.9 use v7 to produce Makefiles.
As the initial POSIX specification from 1988, ustar can store symbolic ownership and special files.
Still, it exhibits most other limitations above:
Even so, this format is a classic and may be around for a while, especially for older historical data.
Specific formats exist for some niche tar versions like star.
In fact, star is the fastest tar archival implementation in part due to its special format.
Since the first tar archive specification and the GNU tar format upgrade, POSIX has included a new version of its related standard: POSIX.1-2001.
Simply called posix, the new tar format within has no file size or name length restrictions. While it’s still fairly recent, posix aims to be compatible with ustar. In fact, it overcomes the limitations of the latter by storing overflowing names and other data in extended header records, which implementations that only understand ustar extract as separate files.
Finally, the posix format is slated to become the GNU tar default, although the default is chosen at build time and is commonly still gnu.
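To make the name length limits more concrete, here is a hedged sketch that tries to store a 120-character file name in both ustar and posix archives. It assumes GNU tar and its --format option. The temporary directory, the archive names, and the result variables are made up for the demo.

```shell
# Sketch: compare how ustar and posix cope with a 120-character file name.
# Assumes GNU tar; all names below are illustrative.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Build a 120-character name without a slash, so ustar cannot split it
# between its prefix and name header fields.
long_name=$(head -c 120 /dev/zero | tr '\0' 'a')
: > "$long_name"

# ustar: the name does not fit, so tar is expected to refuse the file.
if tar --create --format=ustar --file=ustar.tar "$long_name" 2>/dev/null; then
    ustar_result=stored
else
    ustar_result=rejected
fi

# posix: the long name goes into an extended header record instead.
if tar --create --format=posix --file=posix.tar "$long_name" 2>/dev/null; then
    posix_result=stored
else
    posix_result=rejected
fi

echo "ustar: $ustar_result, posix: $posix_result"
```

The same --format option accepts gnu, oldgnu, and v7, so the sketch can be adapted to probe the other rows of the limits table.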
The original BSD tar doesn’t really have much going for it apart from strictly adhering to the standard and ensuring stability and security. Thus, original BSD tar is perhaps the most reliable and portable.
Because of this, we take it as the common ground between [bsd]tar and GNU tar. So, let’s drop the original BSD tar from the comparison and size up libarchive [bsd]tar and GNU tar in terms of their custom features and implementations.
Format support is perhaps best visualized in a table:
+--------+--------+----------+------+
| Format | GNUtar | [bsd]tar | star |
+--------+--------+----------+------+
| gnu    | yes    | yes      | yes  |
| oldgnu | yes    | no       | yes  |
| v7     | yes    | yes      | yes  |
| ustar  | yes    | yes      | yes  |
| star   | no     | no       | yes  |
| posix  | yes    | yes      | yes  |
+--------+--------+----------+------+
In terms of formats, oldgnu isn’t readable by libarchive [bsd]tar. Yet, just like GNUtar, the latter can read all other major tar formats, including gnu and posix. In addition, [bsd]tar reads non-tar files like zip, 7zip, ISO 9660, and similar. Meanwhile, star can handle all of the above.
Still, GNUtar and [bsd]tar are comparable in terms of improving on the format support, despite the claims of libarchive to do it better.
On the one hand, libarchive [bsd]tar and star can both detect the compression type even when the data is coming from stdin. On the other hand, GNUtar needs additional hints to do the same.
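As a rough illustration of the difference, the sketch below lists a gzip-compressed archive that arrives through a pipe. It assumes GNU tar as tar and only runs the bsdtar part if that command happens to be installed. The file names are made up for the demo.

```shell
# Sketch: list a gzip-compressed archive that arrives on stdin.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
echo "hello" > note.txt
tar --create --gzip --file=note.tar.gz note.txt

# GNU tar: pass the compression hint (--gzip, -z) explicitly.
listing=$(cat note.tar.gz | tar --list --gzip --file=-)
echo "$listing"

# libarchive bsdtar, if installed: no hint, the compression is detected.
if command -v bsdtar >/dev/null 2>&1; then
    cat note.tar.gz | bsdtar -tf -
fi
```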
Much like storage with unused sectors, sparse files contain large empty or unused regions. Instead of taking up space with zeroes or random bits, they leverage metadata to represent those regions of their content.
Sparse file handling is different between GNU and libarchive. However, what we see below for [bsd]tar applies to star.
For example, libarchive [bsd]tar and star use metadata, while GNU tar, by default, processes every empty file region. To illustrate, let’s create the 10M sparse file sparse-file with the dd tool:
$ dd of=sparse-file bs=10M seek=1 count=0
Next, we --create three archive [--file]s that contain only that file. One uses GNUtar, another uses star, while the last one employs [bsd]tar:
$ tar --create --file=tar.tar sparse-file
$ star -cf star.tar sparse-file
$ bsdtar --create --file=bsdtar.tar sparse-file
Finally, we can compare the three:
$ ls -lh
total 11M
-rw-r--r-- 1 baeldung baeldung 3.0K May 5 05:00 bsdtar.tar
-rw-r--r-- 1 baeldung baeldung 10M May 5 05:00 sparse-file
-rw-r--r-- 1 baeldung baeldung 3.0K May 5 05:00 star.tar
-rw-r--r-- 1 baeldung baeldung 11M May 5 05:00 tar.tar
Notably, tar.tar is even bigger than the original file, while the sizes of bsdtar.tar and star.tar are close to the minimal format size.
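The result above reflects the GNU tar defaults. GNU tar also has a --sparse (-S) option that asks it to detect holes and record them compactly. The sketch below, assuming GNU tar and GNU dd, repeats the experiment with and without that option. The archive names and size variables are illustrative.

```shell
# Sketch: archive the same 10M sparse file with and without --sparse.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
dd if=/dev/null of=sparse-file bs=10M seek=1 count=0 2>/dev/null

tar --create --file=plain.tar sparse-file
tar --create --sparse --file=sparse.tar sparse-file

# Compare the archive sizes in bytes.
plain_size=$(wc -c < plain.tar)
sparse_size=$(wc -c < sparse.tar)
echo "plain: $plain_size bytes, sparse: $sparse_size bytes"
```

With --sparse, the archive should end up far smaller than the one created with the defaults.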
In general, there are three main backup types: full, incremental, and differential.
Of these, the full type is supported by any of the tar solutions by just creating an archive with all the data:
$ tar --create --file full.tar /data/*
However, both GNUtar and star can create an incremental backup. In the case of GNUtar, it’s via the --listed-incremental=<snapshot-file> (-g <snapshot-file>) or --incremental (-G) flags:
$ tar --create --file=inc-1.tar --listed-incremental=full.nfo /data/*
Snapshot files provide metadata for changes to avoid processing static files.
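To see the snapshot file at work, the following sketch runs a level 0 backup followed by a level 1 backup on a throwaway directory. It assumes GNU tar. The data directory, the data.snar snapshot name, and the archive names are made up for the demo.

```shell
# Sketch: a full (level 0) backup followed by an incremental (level 1) one.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir data
echo "one" > data/file1

# Level 0: the snapshot file does not exist yet, so everything is archived.
tar --create --file=full.tar --listed-incremental=data.snar data

# Make sure the next change has a clearly newer timestamp.
sleep 1
echo "two" > data/file2

# Level 1: only what changed since the snapshot was last updated.
tar --create --file=inc-1.tar --listed-incremental=data.snar data

inc_listing=$(tar --list --file=inc-1.tar)
echo "$inc_listing"
```

The second archive should list file2 but not the unchanged file1. When restoring, the GNU tar manual suggests extracting the full archive first and then each incremental one in order, passing --listed-incremental=/dev/null.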
While differential backups aren’t supported directly, we can use the --compare, --diff, or -d flag of GNUtar to compare files on the storage with files in the archive.
Neither of these two functions is supported by [bsd]tar.
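As a brief sketch of the comparison mode, assuming GNU tar, we can archive a directory, change a file, and then ask tar what differs. The directory, archive, and variable names are illustrative.

```shell
# Sketch: detect changes made after an archive was created.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir data
echo "one" > data/file1
tar --create --file=full.tar data

# Change the file after archiving it.
echo "one, edited" > data/file1

# --compare reports each difference and exits non-zero if it finds any.
if diff_report=$(tar --compare --file=full.tar 2>&1); then
    echo "archive matches the filesystem"
else
    echo "differences found:"
    echo "$diff_report"
fi
```

The list of differing files can then serve as input for a manual differential backup.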
Unlike [bsd]tar, GNUtar also supports the --delete flag for removing specific files from an archive without extracting and repacking. Still, the operation recreates the archive:
$ tar --list --file=tar.tar
file1
file2
fileD
file3
$ tar --delete --file=tar.tar fileD
$ tar --list --file=tar.tar
file1
file2
file3
After [--list]ing the files in the tar.tar [--file], we delete fileD and confirm it’s removed. For obvious reasons, --delete has no short form.
On the other hand, we can use the special @archive syntax of [bsd]tar to build a new archive that combines new files with the entries of an existing one:
$ cat old.tar | tar --create --file=new.tar fileN @-
In this case, we --create the new.tar archive from the fileN file and the old.tar entries, which come from stdin via @-.
In this article, we explored several tar implementations along with their differences and similarities.
In conclusion, libarchive [bsd]tar, GNUtar, and perhaps even star are the most feature-rich, but they have moved beyond the POSIX standard that the original BSD tar adheres to.