The official release is installed with hb upgrade. The preview release may have bug fixes and new features and is installed with hb upgrade -p.
this release does an automatic database upgrade to ensure the hb.db and hash.db files are "mated". If they are not, hash.db is rebuilt. At one site, the hash.db and hb.db files were copied around and became mismatched, causing dedup to fail silently though the backup was fine otherwise. This also gave a non-critical selftest error. Thanks Mehmet!
shards: HB now creates regular timestamped log files in each shard subdirectory as with non-sharded backups. Before this change, the log file in backupdir/logs was timestamped, but the timestamp was the time the logs were gathered, not the time files were backed up (for example). The sout logs are mostly redundant now, but will be kept for a while to make sure nothing falls through the cracks.
backup: HB previously did not store extended attributes very efficiently, because historically, they were small and unimportant. But recently Apple started adding large (up to 6MB) extended attributes to container directories, and a backup of ~/Library/Containers creates a 469 MB hb.db, with 99% or 466 MB of that being used to store extended attributes. With this db upgrade, extended attributes are stored more efficiently, about 3x smaller, and it’s 3x faster to back up the Containers directory. Thanks Kim!
stats: some new statistics are shown:
file bytes protected
dedup eligible blocks
percentage of dedup eligible blocks in the dedup table
dedup-mem setting required to store all eligible blocks in the dedup table
get: questions are sometimes asked, such as "File exists, okay to overwrite?", and these usually have default responses. With shards, the default responses were not being accepted and a shard would sometimes go into an infinite loop asking the question. This could be seen in the sout files, though on the terminal screen, it appeared that the restore was hung.
get: shards could exit with a fatal error if they didn’t contain a requested pathname, but sometimes this is expected and not an error. When restoring a single file, the exit code would be 1 even though the file was correctly restored. When multiple file pathnames in different shards were on the command line, not all files were restored because some shards exited prematurely with a "Pathname is not in backup" fatal error.
trim: trim works with shards to generate reports if the -i option is not used. The -i option (interactive trim) can be used with individual shards by adding /sN to the -c option. This will interactively trim shard N. The -i option cannot be used to trim multiple shards, and now instead of raising an error 'Inappropriate ioctl for device', the error advises to trim shards individually.
rm: when removing multiple paths, the exit status was only for the last path removed, so if the 1st remove failed and the 2nd worked, the exit status was zero even though there was a failure. Also, it is not necessarily an error if a pathname is not found in a particular shard, so it should not always affect the exit status.
export: shards are supported, either exporting the entire backup, creating an export file for each shard, or exporting a single shard by adding /sN to export’s -c option, where N is the shard number.
config: shard-output-days defaults to 7 (was 30) and instead of using the time the job was started to calculate its age, the last time the job wrote output is used so a long-running job does not have its output removed prematurely.
rekey: an interrupted rekey is automatically rolled back to the previous state on the next command. With shards, a successful rekey after a rollback had an incorrect key file.
backup: Mac filesystems, both HFS and APFS, can fail if the parent directory is deleted by another process during directory traversal. Backup would report an error like: Unable to backup [#3083 backup 2588 2593]: No such file or directory: <pathname>. This is true and not so bad, but it also broke traversal so badly that backup marked all files in all parent path directories as deleted. A workaround now prevents this. Thanks Arthur, this was a tough one!
compare: now works with shards. Also, the total number of files compared statistic was sometimes too big, because each component of each pathname was counted instead of just the pathname.
on filesystems where free space was between a multiple of 16TiB and that multiple plus 2 times arc-size-limit, HB stopped with a warning that there was not enough free space. Thanks Boris!
selftest: many warning messages "extra blockid in dedup table" were incorrectly shown on large backups with more than 2 billion blocks. This was a selftest bug: both the dedup table and backup were actually fine.
versions: the "File Space" statistic could incorrectly be shown as a negative number if hard links were removed from the backup either with the rm or retain commands. The file counts were too high when directories were removed.
selftest, backup: huge backups with more than 4 billion blocks could cause a selftest error "arc blockids overlap", with one of the blockids shown as 4294967295. This error is legitimate: the mistake actually occurred during backups. From internal testing.
selftest: if no local arc files and no destinations, set --iso to test backup in isolation without throwing errors for every arc file and every block
get: the --cache option now works with shards instead of only 1 shard working and the rest getting an error that the backup directory is locked.
ls: a new -i option enables case-insensitive (ignore case) pathname searches. The default for Apple is case-insensitive, the default for Linux is case-sensitive, like their filesystems. Thanks Arthur!
In related news:
the pathname component (slash) limit for fast ls searches is gone. Previously a pattern with more than 9 pathname components caused a sequential search of the backup, ie, 3 minutes vs 3 seconds for a backup with 10M files. Thanks DRH!
a traceback was fixed, "TypeError: not enough arguments for format string", when the ls search pattern contained a % sign. From internal testing.
a KeyError was fixed with -l if a file was deleted from a version, then the version was deleted because all files were removed. Thanks Kim!
ls: support has been dropped for character-class glob wildcard patterns (brackets). Most users don’t understand them, and it was difficult to search for filenames actually containing square brackets. For example, a pattern of abc[xyz-abc]def would not match a filename with this exact name but raised an error because z is greater than a. And a pattern like ' [BASIC] ', which should be an easy search, actually means "any filename containing the letters B, A, S, I, or C", which could be hundreds of thousands of files in a large backup. The * (multiple wild characters) and ? (single wild character) wildcards are still supported. Thanks Andy!
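The bracket behavior described above can be reproduced with Python's standard fnmatch and re modules. This is only an illustration of how character classes behave in general; it assumes nothing about HashBackup's own pattern matcher, and the filenames are made up.

```python
import fnmatch
import re

# A bracket expression matches ONE character from the set, so this pattern
# matches any name containing B, A, S, I, or C.
print(fnmatch.fnmatchcase("Invoice.txt", "*[BASIC]*"))         # True ('I' is in the set)

# The literal text "[BASIC]" is not matched, because "[" is not in the set.
print(fnmatch.fnmatchcase("[BASIC] manual.pdf", "[BASIC]*"))   # False

# Replacing each bracket with ? (any single character) does match it.
print(fnmatch.fnmatchcase("[BASIC] manual.pdf", "?BASIC?*"))   # True

# A reversed range such as z-a is rejected by the regex engine.
try:
    re.compile("abc[xyz-abc]def")
    bad_range = False
except re.error as exc:
    bad_range = True
    print("error:", exc)
```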
ls: add -h option (human readable) to show file sizes in KB, MB, GB
ls: a new -x option shows all paths in the backup, whether deleted or not. When followed by a regular expression, only matching pathnames are shown. -x is usually slower than glob (no -x) patterns, but can do more complex filtering with patterns that may cross pathname components.
For example, searching for any pathname with the word flower then tree, anywhere in a pathname, is not possible with an hb glob pattern, but is easy with -x (dot is the wildcard character in regular expressions, * means any number of these):
$ hb ls -c backupdir -x 'flower.*tree'
/User/jim/blueflowers/and/browntrees
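The same regular expression can be tried in Python to see why it crosses pathname components. This sketch assumes -x uses unanchored search semantics, as the example suggests; the second and third paths are invented for contrast.

```python
import re

paths = [
    "/User/jim/blueflowers/and/browntrees",
    "/User/jim/trees/then/flowers",
    "/User/jim/notes.txt",
]

# "." matches any character, including "/", so ".*" can span directories.
pattern = re.compile(r"flower.*tree")

matches = [p for p in paths if pattern.search(p)]
print(matches)   # ['/User/jim/blueflowers/and/browntrees']
```

The second path is not matched because "tree" appears before "flower", not after it.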
compare: compare acl and extended attributes, add -i options to ignore either of these when comparing. Also compares and notes symbolic link targets with "slink"; previously these showed up as ctime and mtime changes only. The output has changed slightly, using periods instead of asterisks for each field listed with -v2, to look a bit cleaner. Thanks Arthur!
rm: previously, when a path was removed, rm required the user to own all versions of the path, ie, they created all backups of that path. But if user A created version 1 and user B created version 2, neither user could remove the version of the path they created. Now, users can remove any path they create. From internal testing.
backup, selftest: ideally, when a file is saved, all of its parent directories are also saved. This is necessary for the ls command to "see" all files. Selftest now checks for this and issues a warning. There were two previous causes of this problem. The first is related to excluding directories but backing up some sub-contents on the command line. This occurred commonly with inex.conf, and has been fixed. The second is when a parent directory was not modified since the last backup, but its subcontents were changed. Backup would normally save the parent directories on its way out of the tree, but if it was interrupted before the parent was saved, the changed files underneath would not be visible to the ls command even though they were backed up; they were restorable by pathname. This new warning can be ignored, the affected files can be removed with the rm -r command (be sure to use the -r option!), or they will naturally expire from the backup with the retain command.
hb: memory usage has been reduced, without reduced performance. If you notice a significant performance decrease with this release, please let us know.
backup: previously, pathnames on the command line containing symbolic links were resolved to the ultimate target, and that’s how the pathname was saved. Now the pathnames are backed up as-is, except that on Mac systems, the pathname might be changed to the actual letter case as stored on disk, ie, backing up /users/jim will actually save /Users/jim to avoid confusion. Thanks Ware!
ls: if /a/b/c is backed up and /a/b is a symlink, ls was not showing /a/b/c. Thanks Ware!
get: if /a/b/c is backed up and /a/b is a symlink, get /a/b did not restore /a/b/c. The target of /a/b must be an existing directory at the time /a/b/c is restored. If /a/b points to a target that has not yet been restored or is not a directory, then /a/b/c cannot be restored either. Thanks Ware!
the FreeBSD and 32-bit Linux builds of HashBackup are being retired. There were only a handful of upgrades last quarter for 32-bit Linux, so it is hardly used. The FreeBSD build was used by about 2.5% of users, but has had compatibility issues with current FreeBSD releases for about a year.
stats: the maximum extended metadata length shown could be much smaller than the actual maximum. This also affected the --xmeta option, which shows the length of the largest metadata items: extended attributes, ACLs, and symbolic links. This is mainly on MacOS, where Apple sets extended attributes up to 6MB on some directories.
backup: if any -c path component was a symlink, and the local backup directory was beneath a directory being saved, the local backup directory was saved instead of being excluded as it should have been. Thanks Ware!
backup: with -v3, a notice is given for previously seen inodes
hb: some unusual system configurations with chroots and read-only partitions could cause a DatabaseError: Disk I/O Error exception. Thanks Sarah!
get: if one path is being restored and that path is not in the current backup version (it was previously deleted), a message shows the previously-saved version and asks whether to restore from that version. But the restore was actually using the latest version, so nothing was restored. Adding -r with the last-saved version would fix the problem, but that is no longer needed. Thanks Henrik!
on some database errors, selftest runs automatically to try to repair the error. If selftest triggers a database error, don’t run selftest to try to correct it since that causes an infinite loop and forks many processes. Thanks Ware!
ls: if the ls search pattern began with slash (absolute pathname) and used wildcards, ls might say "File not in backup" when the file was actually there. Thanks Andy!
ls: square brackets in the ls search pattern specify a character class (a single character). If an actual pathname contains [xyz-def], it is not possible to search for these 9 characters with ls using brackets. Also, this is an illegal character class because z-d is not a valid character range (though d-z is), causing ls to fail with "error: bad character range". Now, if the pattern contains square brackets and is not a valid regular expression or does not match anything, a hint to change brackets to question marks (match any 1 character) is given. Thanks Andy!
ls: if a search pattern is too complex, ie, more than 9 pathname components without a leading slash, ls did a sequential search of the backup pathnames instead of using database indexes. These scans were not always showing matching filenames. Sequential scans were also used if more than 20% of pathnames matched. This was removed because benchmarks showed that a sequential scan was slower even if 80% of pathnames matched.
compare: the compare command could show differences due to floating point rounding errors. Filesystems store times as two integer fields, seconds and nanoseconds, while Python returns times as floating point seconds, so there can be slight conversion errors between the two formats. The compare command also should not show a difference if ctime is different but mtime is the same. This has not yet been fixed.
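The conversion error between integer nanoseconds and floating-point seconds is easy to reproduce. The sketch below is an illustration only: the timestamp is arbitrary, and the 1 microsecond tolerance in same_time is an assumed value, not the rounding HashBackup uses.

```python
ns = 1_650_000_000_123_456_789          # a timestamp in integer nanoseconds
as_float = ns / 1e9                     # the same time as floating-point seconds

# A double carries about 15-17 significant digits, so the nanosecond digits
# do not survive the round trip.
round_trip = int(as_float * 1e9)
print(round_trip == ns)                 # False

def same_time(a_ns, b_ns, tolerance_ns=1_000):
    """Treat two nanosecond timestamps as equal within a small tolerance (assumed 1 microsecond)."""
    return abs(a_ns - b_ns) <= tolerance_ns

print(same_time(ns, round_trip))        # True
```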
dest: when a file is downloaded but the file size is not the expected size, the warning message now suggests running dest verify. This will force the file to be re-synced on the next backup if there is a good copy on another destination.
rm: if rm --secure was interrupted at just the right time, it could cause selftest errors on the last arc file packed. From internal testing.
rm: packing the last arc file in a version could reuse the arc filename when packing other arc files. Depending on the order files are packed, this could delete an arc file that should have been kept, cause an error "dest: can’t access file for put" when uploading the arc file, and cause a selftest error. This bug is unusual, happening 15 times in the past year. Thanks Bugz!
rm: when arc files are empty, a pack operation deletes them. If HB was interrupted before the database committed it could cause "required arc file is missing" selftest errors. From internal testing.
backup, dest sync: cache-size-limit was sometimes not being honored during sync operations. Thanks for the detailed notes Christian!
log: when hb log summarizes log files, it creates a .zip file for each month. For sites with huge logs, the .zip file might be more than 4GB, and this caused a traceback while summarizing the logs. Now .zip files larger than 4GB are created without errors. Thanks Bugz!
config: pack-percent-free previously had a lower limit of 10 (percent) but is now 1. Lowering pack-percent-free causes more frequent arc file packing, saving backup space. It should only be lowered if cache-size-limit is -1 (there is a local copy of all arc files), or the first destination in dest.conf does not have download fees. If you can pack files without triggering high download fees, it’s safe to pack more frequently, though with the low cost of object storage, it’s usually not worth it. Thanks David!
rename: if a directory was renamed to a subdirectory of itself, eg, hb rename /a /a/b, then /a would become inaccessible and selftest would either detect errors in the paths or get stuck. Rename now throws an error in this case. Thanks Rodrigo!
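A check of this kind can be sketched with pathlib. The helper name is_inside is hypothetical and this is not HashBackup's implementation; it only shows a component-wise test that is not fooled by paths that merely share a string prefix.

```python
from pathlib import PurePosixPath

def is_inside(src, dst):
    """Return True if dst is src itself or lies anywhere beneath src."""
    src_path = PurePosixPath(src)
    dst_path = PurePosixPath(dst)
    return dst_path == src_path or src_path in dst_path.parents

print(is_inside("/a", "/a/b"))     # True: the rename should be rejected
print(is_inside("/a", "/ab/c"))    # False: "/ab" merely shares a prefix
```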
rm: trying to remove an existing pathname that had no associated data caused a traceback rather than the correct error, "Path exists but no files found; run selftest --fix: pathname". Thanks BugZ!
Traceback (most recent call last):
  File "/hb.py", line 303, in <module>
  File "/rm.py", line 1127, in main
  File "/rm.py", line 1024, in rmpath
TypeError: raise: arg 3 must be a traceback or None
rm, selftest: if the last block of a backup is deleted with either rm or retain and enough data is deleted to trigger a database compression, then after the next backup, selftest could report an error like "Error: arc.1.0 blockids overlap arc.0.4: 458 ⇐ 486". Selftest was previously unable to correct this error with --fix; now it can. Database compression was changed so this error should not occur. Thanks Yannick!
ls: strip trailing slashes from pathname pattern to avoid "Path not in backup" errors. Thanks Neal!
trim: when items are deleted and deletes are committed, trim could hang if destinations are configured. It did all of its work and the backup was fine, but it got stuck closing an already-closed database. Thanks David!
trim: if a subdirectory and its parent were both marked for delete, trim failed with "Exception: No pathname for pathid xxx" and no trim changes occurred. Now, if /a and /a/b are both deleted, /a is deleted and /a/b gets deleted because it is a subdirectory of /a. Thanks Bugz!
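One way to handle a parent and child that are both marked is to reduce the set to its topmost paths before deleting. The sketch below is a hypothetical helper, not trim's actual code.

```python
from pathlib import PurePosixPath

def topmost_paths(marked):
    """Drop any path that has an ancestor in the same set."""
    paths = {PurePosixPath(p) for p in marked}
    keep = [p for p in paths if not any(parent in paths for parent in p.parents)]
    return sorted(str(p) for p in keep)

print(topmost_paths(["/a", "/a/b", "/c/d"]))   # ['/a', '/c/d']
```

Deleting /a then removes /a/b implicitly, so /a/b never needs a separate lookup.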
trim: if a path is marked exclude, it was only added to inex.conf if it was already excluded. Now excludes are always added, whether they were previously excluded or not. From internal testing.
log: if the backup directory is read-only, the logs directory doesn’t exist, and hb log is asked to summarize the logs, return 0 (no errors) rather than throwing a Permission denied error trying to create the logs directory. Thanks Bugz!
backup: a bad wildcard in inex.conf, for example [x-a], caused a traceback with no context and a bug report. Now a better error message is displayed before aborting, showing the bad wildcard. Thanks Bugz!
get: certain circumstances with cache-size-limit >= 0 and selective download could cause a race condition, leading to a traceback and restore failure. The traceback is below. Thanks Ian!
dest d1: Traceback (most recent call last):
  File "/basedest.py", line 438, in loop
  File "/basedest.py", line 579, in getrangecmd
  File "/basedest.py", line 660, in getspansfilename
OSError: [Errno 2] No such file or directory
Bugz: if a background thread failed, as opposed to the main thread, the automatic bug report could contain pathnames from the backup because background threads don’t have access to the database to allow mapping pathnames to numbers as is normally done. Now, for privacy in this situation, any log line containing a slash is truncated and <path> is added. Thanks Ian!
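The redaction rule can be pictured roughly as follows. The source does not say exactly where a line is cut, so truncating at the first slash is an assumption, and redact is a hypothetical name.

```python
def redact(line):
    """Cut a log line at its first slash and mark the removed part."""
    head, slash, _tail = line.partition("/")
    return head + "<path>" if slash else line

print(redact("can't open /home/jim/secret.txt"))   # can't open <path>
print(redact("dest d1 started"))                   # dest d1 started
```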
backup, get, mount: in #2829, file modified times were rounded before comparisons to compensate for floating-point inaccuracies. This rounding is more aggressive now since there were still some incorrect comparisons.
in #3035, a "simple" change to forward the siginfo signal was added at a customer’s request. This was released July 29th, but because of a version control build issue, was never released on Linux. The build issue was fixed in the #3038 release, bringing Linux back in sync, but also causing a new problem: the siginfo signal, triggered by control-t, is only available on BSD and Mac systems, and caused a traceback on Linux. Thanks Alexander (and hundreds of Bugz reports!)
backup: if variable-blocking is enabled with the block-size-ext config option on a multi-core system, small files could cause a traceback. Thanks Arthur (and Bugz)!
Traceback (most recent call last):
  File "/hb.py", line 225, in <module>
  File "/backup.py", line 3478, in main
  File "/backup.py", line 2542, in backupobj
  File "/backup.py", line 2009, in backupobj
Exception: Bug: varblock in mixed threads
recover: in #2957, recover was changed to monitor arc file downloads to predict how much time remained for downloads to finish. When an arc file finished and was renamed to remove the .tmp suffix, a race condition could cause a traceback. After the traceback, the arc file would be present in the backup directory. Thanks David!
Traceback (most recent call last):
  File "/hb.py", line 287, in <module>
  File "/recover.py", line 447, in main
OSError: [Errno 2] No such file or directory: '/tmp/recovery/arc.362.20.tmp.tmp'
db upgrade: the most recent db upgrade requires enough disk space for 3 copies of the hb.db database in the local backup directory. A disk space check has been added, and if there is not enough space, the upgrade fails early rather than halfway through the upgrade process, which can be rather long for very large backups.
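A pre-flight check like this can be approximated with the standard library. This is a minimal sketch under stated assumptions: the function name and parameters are hypothetical, and it simply compares free space against three times the current database size.

```python
import os
import shutil

def enough_space_for_upgrade(backup_dir, db_name="hb.db", copies=3):
    """Return True if free space in backup_dir can hold `copies` copies of the database."""
    db_size = os.path.getsize(os.path.join(backup_dir, db_name))
    free = shutil.disk_usage(backup_dir).free
    return free >= copies * db_size
```

Running the check before any work starts is what allows the upgrade to fail early instead of partway through.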
recover: fixed "no such table: destfiles" error. Also changed to halt if the selected destination did not start instead of trying to recover from it anyway. Thanks Bugz!
dest noparts: prevent traceback UnpicklingError: could not find MARK. Thanks Bugz!
backup: when cache-size-limit was >= 0, arc files were not always getting removed from the local backup directory cache after backup ended. From internal testing.
a security vulnerability was found via internal testing on 2022-02-21 and a fix released the same day, noting a full disclosure would be posted in 90 days. That is now published under Technical / Disclosures on the HashBackup web site.
destinations: this is the last release to support the maxsize dest.conf keyword (not related to S3 multipart uploading). Maxsize was implemented in 2013 and is used to divide arc and hb.db.N files into smaller subfiles before uploading to storage services. It was used mostly for small storage services like WebDAV and imap (email) that are often free but have very small and strict file size limits. Support is being removed because:
maxsize requires extra code in HashBackup that has to be maintained and complicates future changes
the cost of object storage is so cheap these days, it doesn’t make sense to complicate things to support free services for small backups. These backups will only cost pennies at any service.
because backups with maxsize are usually small, creating a new destination and letting HashBackup transfer the backup is easy
split files created with maxsize do not support selective download (fetching only part of an arc file)
uploading or downloading a split file doubles I/O in the backup directory to create and/or re-assemble the subfiles
split files can’t be sample tested with selftest
HashBackup will print a large warning if your backup contains files that have been split into subparts. Or, to see which files are split into parts, use the dest ls command with the -v option. This example shows a file that was split into 11 parts with maxsize 10M:
$ hb dest -c hb ls -v
Dest | File | Size | Modified Time | Time Stored
-----+---------+-----------+---------------------+--------------------
d1 | arc.0.0 | 100180336 | 2022-06-06 14:18:06 | 2022-06-06 14:18:07
Part 1: size: 10000000
Part 2: size: 10000000
Part 11: size: 180336
There are 2 ways to eliminate subparts created by maxsize:
create a new destination that doesn’t use the maxsize keyword, then either use the dest sync command to copy files to the new destination, or wait for the next backup to copy the files. Then use the dest clear command to clear the old destination. Delete the old destination from dest.conf right after dest clear.
eliminate subparts on the same destination with the dest noparts command. This will download the files that are in parts (if not already local), assemble them, remove the remote parts, and upload as single files. Before starting, remove the maxsize keyword from dest.conf. Then use this command:
$ hb dest -c backupdir noparts
if no mountpoint is given on the command line, mount will create one in /tmp, mount the backup there, and start a command shell in the mount directory. When the shell exits with ctrl-d, mount ends and the /tmp directory created is removed.
when mounting to a specified directory, mount can be interrupted with ctrl-c instead of ctrl-\, avoiding a core dump. Ctrl-\ still works for compatibility.
mount does an unmount if a dangling mount is leftover from a previous mount command and also when stopping, to avoid "Device not configured" and "Transport endpoint not connected" errors.
internal stress testing caused mount/fuse to fail, revealing a bug in the read code that usually didn’t matter, but sometimes did
recover: when a backup is recovered, a new file testonly.conf is created in the backup directory. While this file is present, downloads work but destinations cannot be modified (no upload or remove). Some customers do test recovers to verify their backup, which is a great idea. But test recovers created a situation where two different backup directories, the real one and the test, match the unique ID HashBackup uses to prevent accidental overwrites of remote data. An operation in the test directory that modifies the backup could unintentionally modify the remote data, potentially causing problems for the production backup. The testonly.conf file allows tests to be run without allowing remote data to be changed. For real recovers of the local backup directory, delete the testonly.conf file.
backup: when kill -term is used to stop a backup, it waits until a good stopping place and sends the database and dest.db files to remotes, but ctrl-c stopped the backup immediately. Now ctrl-c behaves like kill -term and gracefully shuts down the backup. The advantage is that while the remote backup may not be complete because not all arc files were transmitted, you can restore from the remote backup using any arc files that were transmitted. A 2nd ctrl-c will cause immediate termination like before. Backup now distinguishes between an interruption and a time-limit caused by --maxtime.
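The two-stage interrupt behavior can be sketched with Python's signal module. This is an illustration, not HashBackup's code: the StopController class is hypothetical, and treating a repeated SIGTERM the same as a second ctrl-c is an assumption.

```python
import signal

class StopController:
    """First interrupt requests a graceful stop; a second one aborts at once."""

    def __init__(self):
        self.stop_requested = False

    def handle(self, signum, frame):
        if self.stop_requested:
            raise KeyboardInterrupt("second interrupt: stopping immediately")
        self.stop_requested = True

    def install(self):
        # Route ctrl-c (SIGINT) and kill -term (SIGTERM) to the same handler.
        signal.signal(signal.SIGINT, self.handle)
        signal.signal(signal.SIGTERM, self.handle)
```

A backup loop would call install() once and then check stop_requested at each safe stopping place, finishing the current file and sending the database before exiting.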
backup: when USB drives are swapped, one offsite and one onsite, there is always a failed destination for the offsite drive. Previously HB was removing new arc files after copying to the onsite drive. To make this work correctly, a drive swap required both drives to be plugged in, a sync command, then the drives swapped. If the drives were just swapped, the backup became segregated between the 2 drives, so a restore would only work if both drives were plugged in. This is also bad reliability-wise, roughly doubling the probability of a drive failure wiping out the backup. Now, HB keeps new backup files in the local backup directory until they are copied to all destinations that try to start, even if they fail. cache-size-limit may need to be adjusted to accommodate this, depending on how often drives are swapped: there has to be enough space in the local backup directory to hold all incrementals until a drive swap.
dest rename: renames destinations associated with files uploaded in dest.db. This does not change dest.conf - use an editor to do that, as usual. Dest.conf can be changed before or after using dest rename. Thanks Preston!
selftest: an automatic selftest --fix runs in a few more situations
selftest, rm: when cache-size-limit was >= 0 (limited cache) and there were local arc files that had not yet been sent to destinations, rm did a sync if it was time to pack arc files. This is usually fine, but if a new destination was added and it was not yet in sync, the sync operation could take a very long time and rm waited for it to finish. Selftest was similar. This sync has been removed, leaving it for the next backup, with --maxtime and --maxwait options to control the sync time.
selftest: --fix corrects a few more errors that occurred on a backup hosted on failing & faulty hardware. Thanks Israel!
selftest, rm, dest sync: with a limited cache (cache-size-limit >= 0), there were some situations where HB could hang waiting for free cache space
get: paths to restore can start with ../, ./, or no slash. In these cases, the current directory is prepended to the pathname.
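The path handling described here resembles the following sketch. Whether get also collapses "./" and "../" components is not stated, so the normpath call is an assumption, and the function name is hypothetical.

```python
import posixpath

def absolute_restore_path(path, cwd):
    """Prepend cwd to paths that do not start with a slash."""
    if path.startswith("/"):
        return path
    # Assumption: "./" and "../" components are collapsed after joining.
    return posixpath.normpath(posixpath.join(cwd, path))

print(absolute_restore_path("./notes.txt", "/home/jim"))    # /home/jim/notes.txt
print(absolute_restore_path("../amy/a.txt", "/home/jim"))   # /home/amy/a.txt
print(absolute_restore_path("docs", "/home/jim"))           # /home/jim/docs
```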
get: when restoring a file that has been deleted, get tried to be helpful by showing the latest version of the file and asking whether to restore that version. However, if /a/b/c was restored and /a/b didn’t already exist, a traceback occurred. If /a/b did exist or -r was used, the restore worked. Now it all works.
recover: shows more detailed status during recover, including an estimate of time left to finish downloads. For example:
Queueing arc files from d1
Download size is 858 MB in 9 arc files
8 arcs 494 MB - 57% of 858 MB in 41s, 31s left
hb: if an arc file is missing and data can’t be fetched from another destination, HB may run selftest --fix to correct the database, depending on circumstances.
get/rm/retain/recover: if there is a problem downloading a file, try other destinations that have the file. For example, for rm/retain:
Packing archives
Getting arc.0.0 from d1
dest d1: file not on destination: arc.0.0: No such file or directory
(previously rm would have failed here)
dest d1: trying dest d2: arc.0.2
Packing arc.0.0 into arc.0.1
For get:
Restoring f5 to /
dest d1: file not on destination: arc.0.2: No such file or directory
(previously get would hang here waiting for the missing file)
dest d1: trying dest d2: arc.0.2
Restored /test50m/f5 to /f5
No errors
For recover:
Waiting for /hb/hb.db.1
dest d1: file not on destination: hb.db.1: No such file or directory
dest d1: trying dest d2: get hb.db.1
Loading hb.db.1
Verified hb.db.1 signature
Missing files can be corrected with selftest or dest verify + sync.
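The fallback behavior in the examples above amounts to trying each destination in turn. The sketch below is a simplified model, not HashBackup's downloader: fetch_with_fallback and the (name, fetch) pairs are hypothetical.

```python
def fetch_with_fallback(filename, destinations):
    """Try each destination in order; return (name, data) from the first that has the file.

    `destinations` is a list of (name, fetch) pairs, where fetch(filename)
    returns bytes or raises FileNotFoundError. Both are hypothetical.
    """
    for name, fetch in destinations:
        try:
            return name, fetch(filename)
        except FileNotFoundError:
            print(f"dest {name}: file not on destination: {filename}")
    raise FileNotFoundError(f"{filename} is not on any destination")
```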
dest verify: the verify command quickly checks destinations to make sure that the files HB thinks are stored are actually stored, without downloading any files. Any files that are missing are flagged so they can be uploaded again or copied from other destinations. Previously, after dest verify, a sync operation occurred to do uploads or transfers. But if a new destination is being setup, this sync may run for days, and there is no maxtime or maxwait limit like with backup. So the sync after verify has been removed. If you still want to do a sync, add hb dest sync following hb dest verify, or the next backup will do a sync.
recover: HB sends its database to destinations as hb.db.N files to allow backup recovery if the local database is lost (fire, theft, disk dies, etc). These files contain authentication hash codes for the hb.db.N file and for the original hb.db to verify that the reconstructed hb.db is identical to the original. In automated tests, a recover command failed with this error:
Warning: HMAC signature verification failed:
HMAC should be: b5e1c633a2a198129de533a584c8c7b104e5f7
HMAC of recovered db is: b5e1c633a2a198129de533a584c8c7b104e5f70b
Unable to regenerate hb.db
Recover kept the reconstructed hb.db and it’s clear from the mismatched hash codes that hb.db is actually correct. The bug is that recover was stripping whitespace from both ends of the original "should be" HMAC. In this case, the last character was 0b, a Vertical Tab character, and was stripped off. Since the strip function removes whitespace from either side and there are 6 whitespace characters, the probability of an incorrect failure is:
p(no whitespace at either end) = 250/256 * 250/256 = 0.954
p(failure) = 1 - p(no whitespace at either end) = 0.046
So there was a 4.6% chance of recover giving this incorrect warning about signature verification failure.
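The effect is straightforward to reproduce in Python using the HMAC shown in the warning. That recover used Python's bytes.strip() is consistent with the description above, but the actual code is not shown, so treat this as an illustration.

```python
digest = bytes.fromhex("b5e1c633a2a198129de533a584c8c7b104e5f70b")

# bytes.strip() with no argument removes ASCII whitespace from both ends,
# and 0x0b (vertical tab) counts as whitespace.
print(digest.strip().hex())               # b5e1c633a2a198129de533a584c8c7b104e5f7
print(len(digest), len(digest.strip()))   # 20 19

whitespace = b" \t\n\r\x0b\x0c"              # the 6 bytes that strip() removes
p_ok = ((256 - len(whitespace)) / 256) ** 2  # neither end is whitespace
print(round(p_ok, 4), round(1 - p_ok, 4))    # 0.9537 0.0463
```

Binary digests should be compared byte for byte, without any whitespace trimming.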
log: sometimes included lines that -X should have excluded. Thanks Michael!
selftest: --sample interfered with -r and --inc, causing the wrong archives to be tested.
selftest: previously selftest uploaded hb.db changes only if --fix was used. With -v4, selftest downloads and checks arc files, and while they are downloaded, it also checks whether they need packing, to save a later download. During packing selftest corrects blocks, even without --fix. hb.db is also changed if --inc is used, to record progress. So now, hb.db is uploaded by selftest whenever it changes - even without --fix.
backup: previously, with a limited cache, an arc file upload might start after the --maxtime stop time. That’s now fixed, though backup might still run over the stop time to finish an upload or to upload hb.db.N and dest.db files. The backup stop goal time (maxtime) was always shown; now the maxwait stop goal time is also shown after the backup finishes while uploads are in progress.
backup: when cache-size-limit is set, backup may have to wait for room in the cache. During the wait, it now displays completed transfers rather than displaying them all at once after the wait, which made it appear to be stuck.
backup: after backup has finished it syncs arc files to new destinations. Previously it did this before the backup started, delaying the backup, and --maxtime/maxwait were not enforced. The database is now unlocked while waiting, allowing config changes while files are being transferred.
backup: when a --maxtime or --maxwait time limit is reached, it’s likely that some arc files are not uploaded and the remote destination is not fully up-to-date. It’s fine if this is a temporary situation because of adding a new destination for example, but not good if it’s happening on every backup. These timeouts are now errors, to make sure someone is paying attention to them. Thanks Preston!
backup: previously the Wait: statistic at the end of a backup only reflected the time spent waiting on destinations after the backup finished. If backup was delayed during the backup waiting for cache space to become available, because cache-size-limit is set, that was not included in the wait time. Now the Wait: time includes delays during the backup.
backup: if the dedup table has to be rebuilt, backup was halting if block numbers were very large.
get: if a restore aborts early (the test case was trying to restore a large directory when a file with the same name already exists), get could hang if the cache was limited instead of exiting. From internal testing.
rm: previously when rm removed whole arc files it started a sync to delete the files. But if there are new arc files waiting to be sent, for example, a new destination is added, this sync could take a very long time. Now rm just deletes the arc files if cache-size-limit is -1 (no limit). When cache-size-limit is set, things are more complicated and this is an area for improvement.
config: since it only modifies the database, the config command doesn’t need to lock the backup directory, allowing it to be used when the backup directory is locked but the database is not.
config: if a backup already had an admin passphrase and config was used again to change the passphrase, the screen was confusing as shown in the first box. The 2nd box is correct. When an admin passphrase is set, a new note advises using enable/disable-commands to restrict dangerous commands (clear, rm, etc.) by requiring the admin passphrase.
Admin passphrase? New admin passphrase? Current config version: 2
backup, selftest: previously, time intervals for --maxtime (backup) and --inc (selftest) only accepted 1 time period: 3d, 5h, etc. To specify 1 1/2 hours, 90n had to be used. Now multiple time periods like 1h30n can be used for a time interval.
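For illustration, here is a rough Python sketch of how an interval with multiple time periods could be parsed. It is not HB's parser. The unit table is an assumption limited to the letters that appear in the note above (d, h, and n, with n taken to mean minutes since 90n is 1 1/2 hours), and parse_interval is a made-up name.

```python
import re

# Assumed unit table: only the letters used in the note above.
UNIT_SECONDS = {"d": 86400, "h": 3600, "n": 60}

# One time period: a number followed by a unit letter.
_PERIOD = re.compile(r"(\d+)([dhn])")


def parse_interval(text):
    """Return the total number of seconds for an interval such as '1h30n'."""
    pos = 0
    total = 0
    for match in _PERIOD.finditer(text):
        if match.start() != pos:
            # Something unparseable sits between two periods.
            raise ValueError(f"bad interval: {text!r}")
        total += int(match.group(1)) * UNIT_SECONDS[match.group(2)]
        pos = match.end()
    if pos == 0 or pos != len(text):
        # Empty string, or trailing characters that are not a period.
        raise ValueError(f"bad interval: {text!r}")
    return total


print(parse_interval("1h30n"), parse_interval("90n"))
```

With this table, 1h30n and 90n both come out as 5400 seconds.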
dest: the setid command does not require a destination name if there is only 1 destination configured.
hb: when the db can’t be opened it was causing the traceback below. Now there’s a better error message saying the db can’t be opened.
Warning: logging disabled
Warning: unable to audit command: [Errno 13] Permission denied: '/hb/key.conf'
Traceback (most recent call last):
  File "/hb.py", line 161, in <module>
  File "/misc.py", line 321, in checkcommand
AttributeError: 'NoneType' object has no attribute 'get'
restore with --delete could delete the HB backup directory that was being used to do the restore if it was under the directory being restored. Example: restoring root with /hb as the backup directory.
when merging onto an existing directory, restoring a file that already existed as a directory would fail. Now the existing directory is deleted, and if that fails, it is renamed with .hbdel
HB was using the external touch command to set modify times on symbolic links. This can cause problems if restoring an entire system because of shared libraries
error messages shown when deleting existing files for --delete did not have the OS reason for failure included in the error
mtimes are rounded before comparison, like backup and compare
if get was planning to use a local file but that local file changed or disappeared, get didn’t restore the file. Now it restores from the backup instead. This led to the next issue.
with a limited cache, if an arc file is not in the restore plan, get could hang waiting for cache space
if more than 10 errors occur while restoring extended attributes, it’s likely that the filesystem doesn’t support them so further errors are suppressed
selftest: hardware and filesystem problems can sometimes lead to malformed database errors. These are unusual, especially on enterprise-class hardware, but they prevent any backup operations. In #2722, automatic correction of hash mismatch errors, another symptom of hardware problems, was added to HashBackup. In this release, selftest --fix will try to correct a malformed database and is automatically triggered on these errors. Example:
Database error: database disk image is malformed
********************************************************************************
IMPORTANT: database integrity errors and malformed databases are usually caused by faulty hardware, filesystem bugs, and incorrect power failure handling. HashBackup cannot cause these errors.
Faulty hardware might be:
- USB drives (or hard drives in general)
- disk controller
- USB or disk controller cable
- RAM, especially non-ECC (error correcting) RAM
- network card, cables, or router if database is remote (not advised)
Be sure to check your hardware!
********************************************************************************
Attempting database correction
Rechecking database integrity
Checked database integrity
destinations: a type mismatch caused destinations to fail with this traceback. The backup and database are fine, it was an oversight related to the db upgrade. This is one of those cases where Bugz did not send a notification because HashBackup caught the error, so if you run into a traceback, it’s best to report it rather than assume there will be an automated report. Thanks Ware!
dest b2: stopping destination because of errors
dest b2: Traceback (most recent call last):
  File "/basedest.py", line 357, in loop
  File "/basedest.py", line 643, in getcmd
  File "/basedest.py", line 211, in getinfo
TypeError: loads() argument 1 must be string, not buffer
trim: the database scan to find unique, unshared data has improved accuracy though is a bit slower. An estimate of time remaining to complete the scan is now shown.
trim: the -n option limits the number of files trim examines. It defaults to 50%, with higher numbers examining more files but taking longer to get started. A new g command finds a pathname. A new i command shows detailed file info for all copies of a pathname: version, file system size, date saved, last accessed, and modified.
trim: as an extra precaution, trim checks to see if the userid running trim is the owner of the backup being modified. It gives a warning before scanning the backup, after scanning the backup, and checks each individual file before keeping or deleting it.
log: if HB was interrupted with ctrl-c, the log files showed either a failure or a still-running program, and then got forwarded in email when the logs were summarized with cron. Instead, an interrupt is now considered successful with a Keyboard Interrupt note added to the end of the log.
db upgrade: the database upgrade failed on some backups with a traceback. Thanks Oliver (and Bugz)!
AttributeError: 'NoneType' object has no attribute 'items'
ssh: the ssh and sftp destination types are similar, but ssh supports selective download (reading parts of remote files) via the dd command. If the dd command is not available, the ssh destination failed. Now ssh destinations test to see if dd is available and if not, disable selective download. To avoid this error, change the destination type from ssh to sftp.
Bugz has been reporting this error for a while, but it wasn’t until the recent change where the last part of the log was included that it was obvious what was causing the problem. Thanks Bugz!
this release does an automatic database upgrade to remove unused columns, add new columns, delete unused version records, and enable strict data type checking for added reliability. It also creates a logs directory if it does not exist. Removing the logs directory will disable logging, though that is discouraged since logs make troubleshooting easier.
backup: previously dedup was disabled by default since it requires considering how much RAM to use. This is not ideal since benchmarks are sometimes run with an "out of the box" configuration. Also, if a new user does their initial backup with dedup disabled then enables it afterward, none of their initial backup data is used for dedup in the future. Now, dedup is enabled by default using up to 100MB of RAM. Any modern system will have this much free RAM, and if the size of the dedup table is increased with -D or dedup-mem, HB will automatically resize and fill it with previously-saved blocks.
backup: recognize sparse files on Apple’s APFS filesystem
backup: character devices start with a c in ls -l and are not buffered, block devices start with a b and are buffered. OSX and Linux support both types of devices, BSD only has character devices. HB previously supported backup of block devices and now supports saving character devices like /dev/rdisk2 when listed on the command line. On OSX, these can be 2-4x faster to read than the block device, /dev/disk2. Below is a USB2 drive on OSX for comparison:
# dd if=/dev/disk2 of=/dev/null bs=1m
^C101+0 records in
101+0 records out
105906176 bytes transferred in 10.302304 secs (10279853 bytes/sec)
backup: --maxwait off disables all destinations for this one backup. This is useful for example for hourly backups of a home directory where you may not care if it’s backed up remotely every hour. The next backup without --maxwait off, probably the nightly backup, will send all held backup data to destinations. Thanks Arthur!
backup: when backup is finished it shows the 10 largest new and changed files using the most backup space. This is useful for tracking down a bloated backup.
backup: the most recent 10 errors are repeated at the end of the backup with a timestamp of when they occurred. All errors are in the backup log file in the logs directory.
backup: previously, if a new destination was added, backup tried to get it in sync at the start of the backup. This worked okay if cache-size-limit was -1, but with a limited cache, arc files have to be downloaded from an existing destination then sent to the new destination. This did not respect --maxtime or --maxwait, and it held up the backup until the sync was finished. For a large backup this is impractical.
Now the sync of files already cached starts at the beginning of the backup, the backup starts immediately, and the remote-to-remote sync happens after backup is finished and does respect maxtime and maxwait. This allows the sync of a new destination to take place over a period of days without interrupting backups.
trim: a new command to find and delete (with -i) large backup files to reduce the size of backups.
log: the log summary shows the starting and ending times and the formatting is easier to read: each command is on 1 line.
rm: when a pack doesn’t occur, rm said it removed 0 bytes. Now it tells how many user bytes were removed, even though the data hasn’t been removed from arc files yet.
compare: a db cursor was getting reused in a recursive query with -r causing an incorrect comparison. The biggest difference in the test case was that deletes were about 50% lower. From internal testing. A new -v1 default shows new, changed, and deleted files separately instead of added together. Compare always did this for -v1 or higher, but -v0 was the default.
tune: tunes paths, re-enables periodic commits for block tuning to save progress if interrupted, and greatly speeds up tunes on backups with hundreds of thousands of arc files.
the exit code was not correct when a sharded command finished
if the database is locked because a command is running, trying to run a 2nd command gave a warning that the command couldn’t be audited because the database was locked (correct), but then a traceback occurred. Thanks Bugz!
previously, HB waited 5 seconds if the database was locked, then failed. Now, if the HB command is running without a keyboard (cron for example), HB will wait up to 1 hour for the database to unlock before giving up. This is useful if two commands end up overlapping.
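The idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how HB does it: "running without a keyboard" is approximated by stdin not being a tty, the lock check is a caller-supplied function, and the names lock_wait_seconds and wait_for_unlock are hypothetical.

```python
import sys
import time


def lock_wait_seconds(interactive):
    """5 seconds with a keyboard, up to 1 hour without one."""
    return 5 if interactive else 3600


def wait_for_unlock(is_locked, interactive=None, poll=1.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll is_locked() until it returns False or the time limit passes.

    Returns True if the lock was released in time, False otherwise.
    """
    if interactive is None:
        # Assumption: "has a keyboard" means stdin is a tty.
        interactive = sys.stdin.isatty()
    deadline = clock() + lock_wait_seconds(interactive)
    while is_locked():
        if clock() >= deadline:
            return False
        sleep(poll)
    return True
```

The sleep and clock arguments only exist so the loop can be exercised without actually waiting an hour.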
backup, compare: if a file is backed up and restored, HB sets the mtime timestamp on the restored file. But on some systems, there may be a slight microsecond difference in the restored mtime and the backed-up mtime, and the compare command detected this. Now backup and compare round mtimes so the timestamps compare equal.
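As a rough illustration of comparing rounded timestamps, here is a Python sketch that works on integer nanosecond mtimes, as returned by os.stat().st_mtime_ns. The 1 millisecond rounding unit is an assumption picked for the example; the note above does not say what precision HB rounds to, and the function names are made up.

```python
def rounded_mtime(mtime_ns, unit_ns=1_000_000):
    """Round an integer nanosecond mtime to the nearest multiple of unit_ns."""
    return (mtime_ns + unit_ns // 2) // unit_ns * unit_ns


def same_mtime(a_ns, b_ns, unit_ns=1_000_000):
    """True if two mtimes are equal after rounding to unit_ns."""
    return rounded_mtime(a_ns, unit_ns) == rounded_mtime(b_ns, unit_ns)


saved = 1_643_563_416_123_456_000      # mtime recorded at backup time
restored = 1_643_563_416_123_457_000   # 1 microsecond later after restore
print(same_mtime(saved, restored))
```

Rounding cannot hide every small difference: two values that straddle a rounding boundary still compare unequal.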
versions: the -l option (show file counts) is still accepted but is now the default and will be removed next year. Formatting has been changed to add headings and look more like a table, user file space and backup space are shown for each version.
log: if logs were summarized while HB was running, a traceback occurred when the running job finished. Now HB does not summarize _RUN logs until they are at least 24 hours old. Thanks Bugz!
Traceback (most recent call last):
  File "/hb.py", line 204, in <module>
  File "/log.py", line 332, in main
OSError: [Errno 2] No such file or directory
Bugz: there have been several bug fixes made recently because of the online traceback reporting, but sometimes there are error messages before the traceback that explain what happened, making it easier to reproduce and fix a problem. Now when a traceback is reported, the last lines of the log file are also reported to give the traceback context. For privacy, pathnames are mapped to numbers ala export.
backup: the dedup table currently has a limit of 4 billion blocks or about 200TB at the old 32K block size. A traceback from Bugz reported a user exceeding this limit, causing backup to fail with an overflow error.
Traceback (most recent call last):
  File "/hb.py", line 171, in <module>
  File "/backup.py", line 3368, in main
  File "/backup.py", line 2000, in backupobj
  File "/backup.py", line 590, in addblock
  File "/backup.py", line 716, in findsha
  File "shacache.pyx", line 111, in shacache.Cache.put (./Modules/shacache.c:1651)
OverflowError: value too large to convert to unsigned int
Now, backup avoids this, though dedup is hampered. Running tune on the backup will also help. New backups are less likely to hit this limit because HB chooses better block sizes instead of always using 32K. Thanks Bugz!
if a read-only backup directory for an older backup is accessed by a recent version of HB, it tries to upgrade the db and fails when making a backup of the database; this is expected. But then a traceback occurs that is not expected. Thanks Bugz!
Traceback (most recent call last):
  File "/hb.py", line 150, in <module>
  File "/db.py", line 761, in closedb
ProgrammingError: Cannot operate on a closed database.
log: stdout buffering was left on intentionally for cron jobs, for better performance, but that causes large batches of identical line timestamps in cron job logfiles. Now each line is timestamped correctly. Thanks Preston!
rm: if a small file was removed with --secure, it was not always removed from the arc file as intended. The recipe to reconstruct the file was deleted so restore was not possible, but --secure is supposed to guarantee that all data has been physically removed. From internal testing.
selftest: when a backup has large blocks and small blocks (very common), the progress display would appear stuck while checking large blocks. Now the progress display is smooth over all blocks.
export: if a backup directory is on NFS (not recommended!), export could fail with a traceback after "Deleting export temp files" because files were deleted before they were closed. Thanks Bugz and Daniele!
backup: a string of backup commands with no retain or rm commands would accumulate old hb.db.N files instead of deleting them.
SECURITY ISSUE: a vulnerability has been found in HashBackup via internal testing and this release fixes it. A full disclosure will be in the release notes in 90 days to give sites time to upgrade.
compare: the recent addition of -r to compare caused a traceback with empty directories. Thanks Bugz!
Traceback (most recent call last):
  File "/hb.py", line 162, in <module>
  File "/compare.py", line 774, in main
  File "/compare.py", line 546, in backupobj
  File "/compare.py", line 536, in backupobj
  File "/compare.py", line 556, in backupobj
  File "/opt/lib/python2.7/stat.py", line 41, in S_ISDIR
  File "/opt/lib/python2.7/stat.py", line 25, in S_IFMT
TypeError: unsupported operand type(s) for &: 'NoneType' and 'int'
if permissions prevented accessing the backup directory, a misleading error said the key didn’t exist and advised to run hb init or hb recover. Now this is only shown when the key doesn’t exist.
selftest: in the release notes there are instructions for running selftest on an isolated backup database to test the new tune command. In development testing, selftest ignores missing arc files so it can be run on customer databases. The new --iso option allows anyone to run selftest on an isolated backup without errors about missing arc files.
mount: if the target mount point didn’t exist, mount offered to create it but only created the final directory, not intermediate directories, causing a traceback. Thanks Bugz!
Traceback (most recent call last):
  File "/hb.py", line 199, in <module>
  File "/mount.py", line 842, in main
OSError: [Errno 2] No such file or directory: '/a/b/c'
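The difference between creating only the final directory and creating the whole path is easy to show with the Python standard library. This is a Python 3 sketch of the general technique, not the code in mount.py; ensure_mount_point is a made-up name, and the a/b/c path mirrors the traceback above but lives under a temporary directory.

```python
import os
import tempfile


def ensure_mount_point(path):
    """Create path along with any missing intermediate directories."""
    os.makedirs(path, exist_ok=True)
    return path


with tempfile.TemporaryDirectory() as base:
    target = os.path.join(base, "a", "b", "c")
    try:
        # os.mkdir creates only the last component, so this fails
        # because base/a and base/a/b do not exist yet.
        os.mkdir(target)
    except FileNotFoundError as exc:
        print("mkdir failed:", exc.strerror)
    ensure_mount_point(target)
    print(os.path.isdir(target))
```

The exist_ok argument needs Python 3.2 or later; on Python 2.7, as in the traceback, the usual approach was to call os.makedirs and ignore an "already exists" error.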
log: if the backup directory was specified in one of the "non -c" ways, logging wasn’t being automatically enabled. Internal testing.
if a backup normally has halted destinations, for example, keeping a USB drive offsite, retain could fail with an error "destinations halted" when downloading arc files to be packed. Thanks Bugz!
stats: database space used by blocks and block references are shown separately instead of being lumped into block space
a new command, tune, optimizes the HashBackup database. For the current release, tune is a separate command that can be run periodically, maybe once a year. An --inc option may be added to do incremental tunes over time, similar to selftest, or tune may be run automatically in the future. Since it is new, consider this a beta release and run tune on an isolated copy of a backup first rather than on a production backup:
NOTE: the --iso selftest option is only in the preview release.
Tune does not modify any remote data and does not upload its changes. The next backup will upload the modified database when the backup finishes. Tune checkpoints its progress, is safe to interrupt, and can be restarted though doesn’t have to be restarted.
In tests on the 12-year-old HashBackup dev server backup, tune reduced the database size by about 10%, reduced selftest RAM usage by 78%, and reduced selftest runtime by 15%.
Another function of tune is to compact database keys to increase backup capacity. HashBackup has capacity limits that most sites will never see, but over a few years, some large sites may bump into them. Running tune is like a reset, making more key space available for future backups.
previews now have their own web page so the official release notes look more, uh, official. There’s a lot of noise with the preview notes that most people, especially if not running previews, don’t care about.
log: logs were not being sorted for summaries. For summaries after every backup it doesn’t matter, but for weekly or monthly summaries, the error summaries could be out of order.
log: hb log log tried to log itself while summarizing the log files, causing this traceback. Thanks Bugz!
Traceback (most recent call last):
  File "/hb.py", line 186, in <module>
  File "/log.py", line 330, in main
OSError: [Errno 2] No such file or directory
rm: Bugz forwarded a traceback that couldn’t be duplicated, but was easily fixed in case it happens again. Thanks Bugz!
Traceback (most recent call last):
  File "/hb.py", line 203, in <module>
  File "/retain.py", line 517, in main
  File "/rm.py", line 693, in finish
  File "/rm.py", line 357, in packcombine
KeyError: 0
log: create the logs directory at init time. If the logs directory is later removed or a user doesn’t have permission to write to it, logging is disabled and a warning is shown. Bugz sent a traceback for this, where retain -n (dry run) was being used with read-only permission to the backup directory. Now it works. Thanks Bugz!
/usr/local/bin/hb log retain -c /backup -s30d12m -x3m -n
Traceback (most recent call last):
  File "/hb.py", line 186, in <module>
  File "/log.py", line 234, in main
OSError: [Errno 13] Permission denied: '/backup/logs'
audit: with the recent changes relating to logging, commands were getting added to the audit log twice: once with the log keyword, once without.
rm: the feature to remove an entire version has been removed, along with the --force option. This was likely seldom used and had some bad side effects, eg, deleting a version deleted related statistics counters, causing the stats command to display misleading results. The feature was confusing because it behaved differently depending on whether --force was used. Removing the root path "/" from any version is now an error, but other pathnames can still be removed. These changes also eliminate holes in the version history.
rm: if a path was removed with -r2 but the path only existed in version 1, rm gave a confusing error message to run selftest. Now it says "Path not found in version 2".
the recent changes to log all commands inadvertently disabled all percentage progress bars. They’re back now.
dir dest: symlinks can only be used for arc and hb.db.N files, but recover was using them for dest.db
INCOMPATIBILITY NOTICE: previously, the log command had double duty: to log HB command output and to summarize log files. In this release, all commands are automatically logged and prefixing them with log (hb log backup …) is not necessary, though it is still supported for compatibility. The incompatible change is that using the log prefix used to suppress output if not running on a tty - a Unix term that sort of means "having a keyboard and a screen".
This output suppression worked when the log prefix was only used in cron jobs: it’s not a tty, output was suppressed, and no one noticed. But now that every command is logged, output cannot be suppressed this way because, for example, an ssh session without -t is not considered a tty so output would be suppressed there too.
What happens now is that stderr is redirected to stdout for logging and stdout is both logged and displayed normally. The upshot is that for cron jobs, >/dev/null needs to be added or each program’s output will be emailed. Check the Automation web page to see the recommended setup for cron jobs.
The other effect is that stderr cannot be redirected. This is necessary (for now) to allow timestamped logging of both stdout and stderr intermixed, as it would be on a screen.
INCOMPATIBILITY NOTICE: previously, hb log xyz … always returned an exit code of 0. This was because of a mistaken belief that non-zero exit codes caused cron output to be emailed, when cron actually emails any output, regardless of exit code. Now that all commands are logged, the command’s exit code is returned instead of zero. This only matters if you were relying on exit codes being zero when hb log xyz … was used.
log: the -s option is obsolete though still accepted. All commands are automatically logged in this release without a log prefix. When the log command is used by itself, it summarizes log files and -s isn’t needed.
HashBackup’s compatibility statement is that databases less than a year old are automatically upgraded. In fact, the previous release, #2677, could upgrade databases from August 8, 2016. However, it causes code bloat within HashBackup to go back this far so periodically the older db support has to be removed. The current version can upgrade dbrev 32 from Feb 2019. Upgrading older databases requires running #2677 first to get to rev 34, then running the current release.
a hash mismatch error indicates arc file corruption has occurred. Previously HB advised to run selftest --fix with the name of the bad arc file. Now it automatically runs selftest to correct this error, provided that:
it occurs in rm/retain pack (for now, expand later)
no destinations have failed
no destinations are turned off
As an example, this small backup with 1 destination has a local arc file intentionally corrupted, an rm command triggers the hash mismatch error, and selftest corrects the error using a remote copy of the arc file:
$ hb backup -c hb backup.py
Backup directory: /Users/jim/hb
Backup start: 2022-01-30 11:23:36
Using destinations in dest.conf
This is backup version: 0
Dedup not enabled; use -Dmemsize to enable
/Users
/Users/jim
/Users/jim
/Users/jim/backup.py
/Users/jim/hb
/Users/jim/hb/inex.conf
Copied arc.0.0 to d1 (42 KB 0s 14 MB/s)
Writing hb.db.0
Copied hb.db.0 to d1 (4.9 KB 0s 1.1 MB/s)
Copied dest.db to d1 (36 KB 0s 24 MB/s)
# corrupt the backup intentionally: write 3 zeros in the middle
$ dd if=/dev/zero of=hb/arc.0.0 conv=notrunc oseek=1000 bs=1 count=3
3+0 records in
3+0 records out
3 bytes transferred in 0.000027 secs (110376 bytes/sec)
# remove a file and force a pack with --secure
$ hb rm -c hb /Users/jim/backup.py --secure
Backup directory: /Users/jim/hb
Most recent backup version: 0
Using destinations in dest.conf
Dedup loaded, 0% of current size
Removing all versions of requested files
Removing path /Users/jim/backup.py
Packing archives
Packing arc.0.0 into arc.0.1
Traceback (most recent call last):
  File "/hb.py", line 211, in <module>
  File "/rm.py", line 1195, in main
  File "/rm.py", line 693, in finish
  File "/rm.py", line 602, in packcombine
  File "/rm.py", line 253, in combine
  File "/arc.py", line 431, in compress
  File "/blocks.py", line 528, in getblock
badblock: hash mismatch blockid 1 in file arc.0.0
Running command: hb selftest -c '/Users/jim/hb' --fix 1
Backup directory: /Users/jim/hb
Most recent backup version: 0
Using destinations in dest.conf
Removed arc.0.1 from d1
Dedup loaded, 0% of current size
Block 1 is in arc.0.0
Check arc: arc.0.0
Level -v2 check; higher -v levels check more backup data
Partial selftest because specific paths, arcs, or blocks requested
Checking all versions
Checking database readable
Checked database readable
Checking keys
Checked keys
Checking arcs I
Checked arcs I
Checking blocks I
Getting arc.0.0 from d1
Checking arc.0.0
Error: unable to get block 1 of arc.0.0 from (local): hash mismatch
Corrected block
Checked arc.0.0 from d1
Checked arc.0.0 from (local)
Checked 2 blocks I
Checking arcs II
Checked arcs II
Writing hb.db.1
Copied arc.0.1 to d1 (42 KB 0s 27 MB/s)
Copied hb.db.1 to d1 (4.1 KB 0s 892 KB/s)
Copied dest.db to d1 (36 KB 0s 31 MB/s)
Removed arc.0.0 from d1
1 errors
Run hb selftest -v2 --fix without arc filenames if corrections were made
Selftest exit code 1
Running command: hb selftest -c '/Users/jim/hb' --fix
Backup directory: /Users/jim/hb
Most recent backup version: 0
Using destinations in dest.conf
Dedup loaded, 0% of current size
Level -v2 check; higher -v levels check more backup data
Checking all versions
Checking database readable
Checked database readable
Checking database integrity
Checked database integrity
Checking dedup table
Dedup loaded, 0% of current size
Checked dedup table
Checking paths I
Checked paths I
Checking keys
Checked keys
Checking arcs I
Checked arcs I
Checking blocks I
Checked 2 blocks I
Checking refs I
Checked 1 refs I
Checking arcs II
Checked arcs II
Checking files
Checked 6 files
Checking paths II
Checked paths II
Checking blocks II
Checked blocks II
Writing hb.db.2
Copied hb.db.2 to d1 (1.6 KB 0s 368 KB/s)
Copied dest.db to d1 (36 KB 0s 20 MB/s)
No errors
Selftest exit code 0
Returning error exit code because of previous block error
log: the -x option controls how many lines of context to include around backup error messages, with a message of how many lines were skipped. If only 1 line is skipped, log just includes it. So this:
2022-01-19 Wed 21:44:00| /
... (1 lines)
2022-01-19 Wed 21:44:04| /Users/jim/.sh_history
is now summarized as:
2022-01-19 Wed 21:44:00| /
2022-01-19 Wed 21:44:00| /Users
2022-01-19 Wed 21:44:04| /Users/jim/.sh_history
This also fixes a bug: if -x0 was used, meaning no context, summaries did not show how many lines were skipped. With -x0, this:
2022-01-19 Wed 21:44:04| Unable to list directory: Operation not permitted: /Users/jim/Library/Application Support/CallHistoryTransactions
2022-01-19 Wed 21:44:05| Unable to list directory: Operation not permitted: /Users/jim/Library/Application Support/com.apple.TCC
is now summarized as:
2022-01-19 Wed 21:44:04| Unable to list directory: Operation not permitted: /Users/jim/Library/Application Support/CallHistoryTransactions
... (37 lines)
2022-01-19 Wed 21:44:05| Unable to list directory: Operation not permitted: /Users/jim/Library/Application Support/com.apple.TCC
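The summarizing rule described above can be sketched in Python roughly as follows. This is a guess at the general shape, not the log command's real code: how HB decides which lines are errors is not stated here, so is_error is a caller-supplied function, and summarize is a hypothetical name.

```python
def _collapse(skipped):
    """Turn a run of skipped lines into what the summary should show."""
    if not skipped:
        return []
    if len(skipped) == 1:
        # A single skipped line is simply included.
        return list(skipped)
    return [f"... ({len(skipped)} lines)"]


def summarize(lines, is_error, context=1):
    """Keep error lines plus `context` lines around each; collapse the rest."""
    keep = set()
    for i, line in enumerate(lines):
        if is_error(line):
            first = max(0, i - context)
            last = min(len(lines), i + context + 1)
            keep.update(range(first, last))

    out = []
    skipped = []
    for i, line in enumerate(lines):
        if i in keep:
            out.extend(_collapse(skipped))
            skipped = []
            out.append(line)
        else:
            skipped.append(line)
    out.extend(_collapse(skipped))
    return out


log = ["a", "ERR 1", "b", "c", "d", "ERR 2", "e"]
for line in summarize(log, lambda s: s.startswith("ERR"), context=0):
    print(line)
```

With context=0 the three lines between the two errors become a single "... (3 lines)" marker, while the lone lines before and after the errors are shown as-is.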
get/mount: if the -c backup directory is on an NFS mount (not recommended!), then after get finishes and says "No errors", it could generate the traceback below. HashBackup left files open in the spans.tmp directory, tried to delete it, and while this works with a local filesystem, it fails on NFS mounts because instead of deleting open files, NFS renames them. When HB tries to delete the spans.tmp directory, there are hidden files there, so the directory delete fails. After HB exits, the files are closed and NFS deletes the hidden files, so the directory is actually empty after the failure. Confusing! Now HB closes the files before the delete to prevent the traceback. Thanks Ware and Paul!
Traceback (most recent call last):
  File "/opt/lib/python2.7/atexit.py", line 24, in _run_exitfuncs
  File "/misc.py", line 1054, in rmx
  File "/misc.py", line 1043, in _rmtree
OSError: [Errno 39] Directory not empty: '.../spans.tmp'
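The close-then-delete ordering looks something like the Python sketch below. It is not the code in misc.py; remove_temp_dir and the span.0 filename are made up, and the demo runs on a local temporary directory, where either order works. The ordering only matters on NFS, where an open file that is deleted gets renamed to a hidden file instead.

```python
import os
import shutil
import tempfile


def remove_temp_dir(path, open_files):
    """Close every open file under path, then delete the directory tree."""
    for handle in open_files:
        handle.close()
    # With all handles closed, NFS has no reason to leave hidden
    # placeholder files behind, so the directory can be removed.
    shutil.rmtree(path)


base = tempfile.mkdtemp()
spans = os.path.join(base, "spans.tmp")
os.mkdir(spans)
span_file = open(os.path.join(spans, "span.0"), "wb")

remove_temp_dir(spans, [span_file])
print(os.path.exists(spans))
shutil.rmtree(base)
```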
then sync correctly said "Arc file is missing" when trying to sync the 1st destination, but entered an infinite wait loop when trying to sync the 2nd destination, with the message "Wait for arc.v.n". Now it gives an error for each destination and continues syncing other files. This was a confusing one! From internal testing.
stat: a build mistake caused a traceback on non-OSX systems in the stats command, related to the new statistics added in #2677. Thanks Evert!
Database error: no such table: dbstat
  File "/hb.py", line 215, in <module>
  File "/stats.py", line 281, in main
selftest: when a certain kind of block was deleted with --fix, HB’s internal accounting was not adjusted correctly, leading to 2 additional errors:
Error: d > s in (0,0)
Error: arc.0.0 accounting is wrong: 85181 > 42672
ls: it was an error to use ls -1 (one) without including -r to specify a version. Now if -r is omitted, the most recent backup is shown. Enabling this is one less line of code!
log: a few updates for the log command:
read output from the command being logged in blocks rather than one character at a time for better efficiency
the command being logged is recorded on the first line of the log
the clear command is logged
log: there have not been many problems with HashBackup, but every now and then a challenging one comes up. Log files can make troubleshooting much easier, so now they are enabled on every command; using hb log … still works, but is no longer necessary. If you have disk space concerns, add this to your crontab entry to automatically zip log files:
$ hb log -e -x1 >/dev/null
selftest: a preview release about a month ago caused selftest --inc to test all arc files instead of just a few. This was fixed, but an aftereffect is that a counter underflowed (went negative) and selftest stopped testing arc files for a long time until the counter went positive again. Now, a warning is shown if this counter underflows and it is zeroed, preventing this odd behavior.
count: if a directory is the first entry in its parent directory and a permission error occurs, it caused this traceback:
  File "/count.py", line 121, in dosubdirs
  File "/count.py", line 160, in walktree
UnboundLocalError: local variable 'path' referenced before assignment
Related to this, the pathname printed with the error message "error stating directory" was incorrect. Thanks Arthur!
dest edit: an empty file was causing "AttributeError: 'NoneType' object has no attribute 'strip'", and a non-empty file was not getting saved. Thanks Arthur!
backup: extended attributes, also called resource forks on OSX, are usually small bits of data attached to files. They are saved in the HB database, not in arc files, because they are usually small. However, they can be very large and this will cause the database to be unreasonably large. A notice is now displayed during backup for extended attributes or ACLs larger than 100KB. Thanks Arthur!
export: after exporting a backup, export could raise an error if the original backup had a passphrase protecting the key. The export worked fine; this error was related to updating the audit log after export finished, to set the status code.
Traceback (most recent call last):
  File "/hb.py", line 284, in <module>
  File "/db.py", line 383, in opendb
Unable to access HashBackup database: file is not a database
The rclone.py script was checking for the existence of hb.db in the backup directory, but it won’t exist yet for recover. The purpose of this check is to ensure that --backupdir really is an HB backup directory. Now instead of checking for hb.db, rclone.py checks for key.conf since that must be present in all situations. Since rclone.py is not automatically updated, users will need to download the new version from http://www.hashbackup.com/shells/rclone.py or edit their local copy. Thanks Ben!
selftest: when cache-size-limit is -1, all arc files should be kept locally. If cache-size-limit was previously >= 0 (arcs not all local) but is now -1 (arcs should be local), some arc files may not have local copies. Since selftest -v4 already downloads remote arc files to verify them, it will now save a copy if the local copy was missing. This allows sites to migrate arc files back locally with -v4 and optional --inc incremental testing.
stats: statistics have been added to show why the hb.db database can get "too big". This is almost always caused by large extended attributes (OSX) or backing up browser caches and other content-defined / hash pathnames. The new stats are:
5,303 extended metadata entries
278 MB database size
88 MB space used by files (31%)
47 MB space used by paths (17%)
133 MB space used by blocks (47%)
5.9 MB space used by extended metadata (2%)
495 KB largest extended metadata entry
226 MB database space utilization (81%)
stats: a new option --xmeta <size> shows files with extended metadata larger than <size> (usually extended attributes) to allow excluding or removing them from the backup if they are unimportant.
recover: adds more guidance about archive downloading before the recover starts, based on the -a and -n options. After recover, resets cache-size-limit if it conflicts with -a or -n.
compare: the -r option can be used to compare an older backup to the filesystem. This can be useful to see why an incremental was unexpectedly large. For example, if backup 939 was normal but backup 940 was very big, then hb compare -r939 can often be used to find out what changed in the live filesystem. For now this only works if the large files are still present in the filesystem. It would be nice to compare two backup versions without looking at the live filesystem. Thanks Krisztian!
B2 dest: by deleting and re-creating B2 buckets with the same name, renaming destinations in dest.conf, and using lots of setid commands, it’s possible to get a backup in a very weird state that causes tracebacks when files are deleted. After 10 retries, the destination shuts down:
dest b2: error #1 of 9 in rm arc.1.0: b2(b2): http status 400 (Bucket 589c09e5d773b10967850313 does not exist) deleting fileid 4_z589c09e5d773b10967850313_f11867765b241ab9b_d20220110_m173350_c001_v0001097_t0047: test/arc.1.0
Traceback (most recent call last):
  File "/basedest.py", line 90, in retry
  File "/b2dest.py", line 633, in rm
  File "/b2dest.py", line 311, in b2deletefile
Exception: b2(b2): http status 400 (Bucket 589c09e5d773b10967850313 does not exist) deleting fileid 4_z589c09e5d773b10967850313_f11867765b241ab9b_d20220110_m173350_c001_v0001097_t0047: test/arc.1.0
The problem is that files were stored in a B2 bucket, that bucket was deleted and re-created on the B2 web site (which assigns a new B2 bucket ID), and when HB tries to remove files it thinks are still there, it sends delete requests for the original bucket ID, which no longer exists. Now HB ignores this error as if the delete worked, allowing it to escape from this catch-22. Thanks Alexander!
dest clear: previously a destination had to be active before it could be cleared or a traceback would occur: "No active destination xyz in dest.conf". This can be a problem if the account is no longer accessible. Now, dest clear asks if you want to delete files anyway (unless --force is used) and removes them without any network I/O. Thanks Alexander!
S3: multipart get was added in #2641 using the multiprocess library. Some systems (FreeBSD 12, maybe others) generate a traceback during initialization and the destination is disabled:
OSError: [Errno 78] Function not implemented
Now HB tests for this during initialization and disables multipart operations with a message "multipart does not work on this system". Thanks Henrik!
the installer on the Download page checks the environment variable HB_UPGRADE_URL when fetching the real HashBackup binary. This enables private upgrade servers to also use the installer. The installer hashes shown on the Download page have been updated. Thanks Michael!
upgrade: previously, upgrade would ask before installing a new version. This made it necessary to have a --force option to avoid asking when running from a cron job. Since there is already a -n option to avoid installing the upgrade, the --force option is redundant and is now obsolete. Using --force will cause an extra message, so it should be removed from crontabs after upgrading to this release to avoid that.
upgrade: cron sends all job output, either stdout or stderr, to the job owner or MAILTO email address. Previously, hb upgrade wrote the version / copyright message to stdout but wrote the "You already have the latest version" to stderr. This is a problem because a daily upgrade cron job will send email every day.
Now upgrade writes errors, release notes, and upgrade messages to stderr and writes "You already have the latest …" to stdout. This allows redirecting stdout to /dev/null in crontab so only important mail is received.
recover: the -n (no arc download) and -o (overwrite arc files) options raise an error if used together.
INCOMPATIBILITY NOTICE: previously recover -a meant to overwrite existing files. This option has been renamed to -o and a new -a option has been added.
recover: -o means overwrite existing files (was -a), and -a means download all arc files, ignoring cache-size-limit
SECURITY NOTICE: HashBackup allows specifying destination credentials in either the dest.conf text file or by loading this text file into the encrypted hb.db database. Because of ransomware, it is highly recommended that the dest.conf text file not be used for remote credentials and that the dest load command be used to store these in hb.db instead. In a future release, the dest.conf text file will be forced into hb.db.
a new dest subcommand, edit, will edit the dest.conf file stored in the encrypted hb.db database without using the unload and load commands. It uses the EDITOR environment variable to choose an editor, or uses vi if EDITOR is not set. If there is no dest.conf loaded into the database, a new one can be created with the editor. Changes must be saved before quitting the editor.
S3: some S3 services (Storj) may have multiple hosts configured at the same DNS name. Connecting to these with SSL requires SNI (Server Name Indication), which is now supported.
upgrade: the retry loop is now 60 seconds vs 120 seconds. On a failure to download, a traceback is dumped to help diagnose problems.
S3: a new keyword "partsize" can be used to specify a fixed partsize for multipart S3 uploads and downloads. The default is 0, meaning that HB chooses a reasonable part size from 5MB (the smallest allowed) to 5GB (the largest allowed), based on the file size. When the new partsize keyword is used, HB uses this part size to determine the number of parts needed, then "levels" the part size across all parts. For example, when uploading a 990MB file with a partsize of 100M, HB will use 10 parts of 99M each. This option was added for the Storj network because it prefers part sizes that are a multiple of 64M (the Storj default segment size). The size can be specified as an integer number of bytes or with a suffix, like 100M or 100MB, but is always interpreted as MiB, ie, 100 * 1024 * 1024.
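The "leveling" arithmetic described above can be sketched in a few lines of Python. This is only an illustration of the calculation, not HashBackup's code: the function names parse_size and level_parts are hypothetical, and the suffix handling assumes only the M and MB forms mentioned above.

```python
MIB = 1024 * 1024

def parse_size(text):
    """Parse a size such as '100M' or '100MB' as MiB (hypothetical helper).

    A bare integer is treated as a byte count, as the note above describes.
    """
    t = text.strip().upper()
    for suffix in ("MB", "M"):
        if t.endswith(suffix):
            return int(t[:-len(suffix)]) * MIB
    return int(t)

def level_parts(file_size, partsize):
    """Return (number of parts, leveled part size in bytes).

    The requested partsize fixes the number of parts; the file is then
    divided evenly across that many parts, rounding up.
    """
    nparts = max(1, -(-file_size // partsize))   # ceiling division
    leveled = -(-file_size // nparts)            # ceiling division
    return nparts, leveled

# The example from the note: a 990 MiB file with partsize 100M
nparts, size = level_parts(990 * MIB, parse_size("100M"))
print(nparts, size // MIB)
```

With these inputs the sketch gives 10 parts of 99 MiB each, matching the example in the note.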
backup: if a file disappears after scanning a directory, backup suppressed this error if there was a previous backup and printed "pathname (deleted)", but for files not previously saved, it generated an error. Now it displays "pathname (deleted)" in both cases, suppressing the error. Thanks Michael!
log: a new -X option allows excluding lines when summarizing log files. All lines not starting with a slash (pathname) are normally included in log summaries. But some sites may want to exclude lines like "Copied arc.v.n to blah". Adding -X "Copied arc." would do that. The exclude list items are simple strings - no regular expressions. Thanks Michael!
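As a rough illustration of this filtering, here is a minimal Python sketch. It assumes the exclude strings are matched anywhere in the line (the note only says they are simple strings, not regular expressions), and the function name summarize is hypothetical rather than part of HB.

```python
def summarize(lines, excludes=()):
    """Keep lines for a log summary.

    Lines starting with a slash (pathnames) are dropped, as are lines
    containing any of the plain exclude strings. No regular expressions.
    """
    kept = []
    for line in lines:
        if line.startswith("/"):
            continue  # pathname line, never part of the summary
        if any(x in line for x in excludes):
            continue  # matched an exclude string (substring match is an assumption)
        kept.append(line)
    return kept

log = ["/etc/passwd", "Copied arc.3.1 to blah", "Time: 3s"]
print(summarize(log, ["Copied arc."]))
```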
upgrade: HashBackup uses RSA4096 signatures to verify that an upgraded version is authentic, so using SSL (secure http) to fetch upgrades is not necessary. However, SSL may be a policy requirement at some sites. To use SSL (https) with a private server, use:
# SSL_CERT_FILE=/etc/ssl/cert.pem hb upgrade https://my.server/path
If you aren’t sure where your root certificates are stored, use the pathname of the cacerts.crt file in a HashBackup backup directory. Thanks Michael!
config: previously the environment variable HB_ADMIN_PASSPHRASE could be set to avoid having to enter the admin passphrase. Now the admin passphrase can be set or changed with the environment variable HB_NEW_ADMIN_PASSPHRASE. To enter the new passphrase with the keyboard, use:
$ HB_ADMIN_PASSPHRASE=oldp hb config -c backupdir admin-passphrase
To enter the new passphrase with an environment variable, use:
$ HB_ADMIN_PASSPHRASE=oldp HB_NEW_ADMIN_PASSPHRASE=newp hb config -c backupdir admin-passphrase
WARNING: environment variables are useful for automation but may have security risks that could expose sensitive data to other locally running processes.
dest: a new "test" subcommand tests a single destination or all currently configured destinations. It performs 3 rounds of upload, download, and delete tests for many file sizes, displaying the performance of each and an average performance for each file size.
Test all destinations:       $ hb dest -c backupdir test
Test specific destinations:  $ hb dest -c backupdir test <dest1> <dest2> ...
Run specific tests:          $ hb dest -c backupdir test -t up del
Test sizes 1K and 128M:      $ hb dest -c backupdir test -s 1k 128m
Run 10 rounds instead of 3:  $ hb dest -c backupdir test -r 10
Delay 5 mins and repeat:     $ hb dest -c backupdir test -d 5m (or 5s or 5h)
Repeat test 12 times:        $ hb dest -c backupdir test -n 12
S3: multipart get was added to S3 destinations. This scales very well with the workers and partsize keywords in dest.conf. In tests with the Storj S3 Gateway, 1GB download performance increased from 20 MB/s to over 200 MB/s using multipart gets, and Amazon S3 scaled up to over 300 MB/s with 16 threads. Multipart uploads and downloads are enabled by default unless multipart false is used in dest.conf. Thanks Dominick!
NOTE: some S3-compatibles such as the Google Cloud Storage proxy do not support S3 multipart operations.
S3: if a file is copied to the MinIO object store with filesystem commands, everything works fine except that MinIO serves the file with an etag of 00000000000000000000000000000000-1. Instead of complaining and aborting the download, HB now ignores these etags. If there was a download error, it will be caught later during the restore with a block hash mismatch. Thanks Ian!
backup: in #1946, a SIGTERM handler was added to the backup command so that if the backup program was terminated, it finished the current file and then stopped cleanly. However, this does not play well with the new S3 multipart download feature, so the SIGTERM handler has been removed.
destinations: after testing with several object store services, the default number of workers has been changed from 2 to 4. This seems to be a sweet spot of increasing performance without adding too much overhead for managing threads.
shards: copy the HB program to the backup directory instead of each shard subdirectory
S3: check when uploading files larger than 5 GiB that multipart is enabled, and raise an error if not
rm: if the local backup directory disk is full, it’s important to be able to remove backup data to correct the disk full. Previously rm would fail trying to update the dedup table. Now it avoids this and can remove arc files to free up disk space. Thanks restic!
dir destination/backup: if all directories are full for a dir destination, it raised the error:
TypeError: %d format: a number is required, not NoneType
but now raises the correct error:
d1(dir): file not copied: /Users/jim/hb/arc.3.1
destinations: previously, a traceback was only printed after a destination exceeded all retries. Now a traceback is printed on every error. Instead of using "debug 99" to stop retries, use "retry 0" (debug 99 is now obsolete). Thanks Grant!
selftest: if a -v4 selftest used --inc, had a download limit specified with ",xxxMB", and would have exceeded the limit, selftest would say "Checking 0% of backup" instead of the actual percentage. Also, if only one version was being incrementally checked (-r used with --inc), the percentage checked is not of the whole backup but just of the version requested, so the message was changed to "Checking xx% of version r".
dest clear: when a destination has the only copy of a file, HB refuses to clear it. But it’s possible for an arc file to be unused, for example, it was removed while a destination was down. Now dest clear does not refuse on files that aren’t needed.
selftest: with -v4, if an arc file isn’t local and cache-size-limit is -1 (all arc files should be local), selftest would download the file twice. This situation can happen for example when switching from remote arc files (cache-size-limit >= 0) to local (the limit is -1) and some arc files are not yet migrated locally.
ls: use bracket wildcard patterns to match specific characters. [atJ] matches either a or t or J, [0-9] matches a single digit. This example now works whereas before it said "Path not in backup: /.fil[ef]":
$ hb ls -c /hbbackup '/.fil[ef]'
Backup directory: /hbbackup
Most recent backup version: 2952
Showing most recent version, use -ad for all
/.file
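Bracket patterns of this kind behave like the shell-style wildcards in Python's standard fnmatch module. The sketch below uses fnmatch only to show the matching rules; whether HB's own matcher is built on fnmatch is not stated here and is an assumption.

```python
from fnmatch import fnmatchcase

# [ef] matches exactly one character, either 'e' or 'f'
print(fnmatchcase("/.file", "/.fil[ef]"))   # matches
print(fnmatchcase("/.filx", "/.fil[ef]"))   # does not match

# [atJ] matches a, t, or J (case-sensitive); [0-9] matches one digit
print(fnmatchcase("J", "[atJ]"))
print(fnmatchcase("7", "[0-9]"))
```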
backup: when path /a/b was backed up and /a did not exist, backup would halt with a traceback before it got started:
Traceback (most recent call last):
  File "/hb.py", line 129, in <module>
  File "/backup.py", line 3033, in main
  File "/misc.py", line 545, in fixpath
OSError: [Errno 2] No such file or directory: '/a'
Now it will do the backup and show an error during the backup:
/a (deleted)
Unable to backup: No such file or directory: /a/b
a new variation of the dir destination with type "dirs" allows specifying multiple destination directories. This is useful with multi-drive external disk enclosures configured as JBOD (Just a Bunch Of Disks). It is possible to configure these units as one large RAID0 drive and use a "dir" destination, but that reduces reliability: if any disk fails, all data is typically lost. A RAID > 0 configuration is most reliable, but also reduces capacity. A JBOD configuration using the dirs destination with multiple directories is a middle ground: if a disk is lost, only the backup data on that drive is lost. This is an especially reasonable configuration if you have another copy of the backup data at another destination.
The dirs destination requires a dirs keyword specifying multiple target directory paths, separated by colons. When reading from the destination, HB will check all directories until it finds the file it needs. When writing arc files, HB will check available space in each directory and copy to a directory with enough space to hold the file. When copying non-arc files, HB makes copies in every directory that has room, to increase the chances of a recover in case the local backup directory is lost and one or more JBOD disks are also lost.
A new "copies" option specifies how many copies of each arc file to create, to increase redundancy. The default is 1.
A new "spread" option causes arc files to be distributed over all disks rather than filling up disk 1 before using disk 2.
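The directory selection described above can be pictured with a short Python sketch. It is an illustration under assumptions, not HB's implementation: the function choose_dirs is hypothetical, free space is passed in as a dict instead of being read from the disks, and "spread" is modeled as preferring the directory with the most free space, which is only one possible way to distribute files.

```python
def choose_dirs(free, size, copies=1, spread=False):
    """Pick target directories for one arc file.

    free:   dict mapping directory path -> free bytes, in dest.conf order
    size:   size of the arc file in bytes
    copies: how many copies to store
    spread: if True, prefer the emptiest directories (an assumption)
    """
    # Only directories with enough room for the file are candidates
    candidates = [d for d, avail in free.items() if avail >= size]
    if spread:
        # Most free space first, so files are distributed across disks
        candidates.sort(key=lambda d: free[d], reverse=True)
    return candidates[:copies]

free = {"/mnt/d1": 100, "/mnt/d2": 500, "/mnt/d3": 50}
print(choose_dirs(free, 80))                # fills disk 1 first
print(choose_dirs(free, 80, spread=True))   # prefers the emptiest disk
```

An empty result means no directory has room, which corresponds to the "file not copied" error case.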
config: it is possible to set enable/disable-commands without setting an admin-passphrase; only a warning is printed. But if enable-commands was set to 'backup' for example, then it was not possible to set an admin-passphrase because the config command was disabled. This created a backup where no config options could ever be changed again, which is a little strange and unintended.
Now, the config command with the admin-passphrase subcommand is always allowed. If you really want a backup that is unchangeable like before, you can set an admin-passphrase to some random string that you then forget.
config: the admin passphrase was sometimes requested twice, depending on whether the config command was enabled or disabled.
retain/rm: during packing (removing empty space from an arc file), a block sometimes needs to be decrypted & decompressed, and this also verifies the block’s hash. If the block is bad, a hash mismatch error occurs and retain/rm abort. Sometimes these steps are not necessary and are skipped as a performance optimization. However, if a block is bad, skipping the hash verification allows the bad block to propagate without error during packing. The first time the bad block would be noticed would be either a -v4 selftest of the arc file or a restore needing the bad block. With this release, all block hashes are verified while packing an arc file and retain/rm will abort if any blocks are bad. This may slow down packing somewhat; if it becomes a problem, please send an email.
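The idea of verifying every block while packing can be sketched as follows. The block layout (blockid, expected hash, data) and the use of SHA1 are assumptions made for the example; the function name pack is hypothetical and this is not HB's actual packing code.

```python
import hashlib

def pack(blocks, live_ids):
    """Copy live blocks to a new list, verifying each block's hash first.

    blocks:   iterable of (blockid, expected_hex_hash, data) tuples
    live_ids: set of block ids still in use; other blocks are empty space
    Raises ValueError on the first bad block instead of propagating it.
    """
    packed = []
    for blockid, expected, data in blocks:
        if blockid not in live_ids:
            continue  # unused block: this is the space being reclaimed
        if hashlib.sha1(data).hexdigest() != expected:
            raise ValueError("hash mismatch blockid %d" % blockid)
        packed.append((blockid, expected, data))
    return packed
```

Checking the hash before the copy is what stops a bad block from being carried silently into the packed arc file.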
retain/rm/selftest: improve advice given when bad blocks are detected. For example, instead of:
getblock: hash mismatch blockid 3 in arc.0.0
the message is now:
getblock: hash mismatch blockid 3; run selftest --fix arc.0.0
this release does an automatic database upgrade to remove unused columns from the database
backup: if a destination failed a certain way, it could cause one or more remote archives for that version to appear corrupted after subsequent backups, and also generate selftest errors:
Error: arc.V.N size mismatch on XXX destination: db says YYY, is ZZZ
A selftest -v4 would give hash mismatch errors on the interrupted arc files. If the arc files were stored correctly either locally or on another destination, selftest -v4 --fix could fix the errors. But sometimes the only correct version was local. If cache-size-limit was also set, the correct arc files could be deleted when the cache filled up, leading to permanent corruption.
backup: a bug in exclude processing caused some files to be backed up that should have been excluded
get: restores could fail with a "list index out of range" error in certain circumstances. Thanks Ian!
get: could fail with a traceback: NameError: global name 'errmsg' is not defined when scanning local files to find a match for a file to be restored. From internal testing.
get: fix "Unable to set extended attributes" error when restoring symlinks on Linux, related to the order things are restored, if the symlink target had extended attributes. Thanks Ian!
dir dest: fix hang if dir destination file is shorter than expected when using selective download. From internal testing.
dir dest: retry selective download on I/O error. From internal testing.
selftest: the message "Note: arc.V.N is correct size on …" was being displayed if an arc file had not yet been transferred to one or more destinations. It should only be displayed if an arc file’s size is incorrect somewhere.
selftest: with -v3, -v4, and/or an arc file on the command line, selftest verifies block hashes. If it sees a block with a bad hash, it tries to get the same block from another destination. If all destinations have the same bad block, selftest deletes the block. However, this delete was not happening, even with --fix, so a subsequent selftest would give the same errors. Now the delete is working properly. As usual, it may take 3-4 more selftests without -v3, -v4 and/or an arc filename to get rid of follow-on errors. Thanks Vincent!
selftest: if -v4 --sample is used on a backup where no remote destinations support sampling or destinations are not set up, add a hint to use -v3 --sample to sample just the local arc files.
selftest: explain that database integrity errors are usually hardware problems and cannot be bugs in HashBackup. These can sometimes be fixed with selftest --fix, but it’s important to find out what caused them (bad RAM, USB drives, cables, etc)
versions: prevent traceback error: TypeError: unsupported operand type(s) for -: 'NoneType' and 'int' if a backup was interrupted.
log: if hb log is used without a command, it caused an IndexError. Thanks Arthur!
this release does an automatic database upgrade when any HB command is used, fixing datesaved timestamps on hard-linked files that could cause retain to fail with an AssertionError (rare)
versions: two new columns, backup time and backup space, have been added. Some versions may have a blank backup space if all blocks have been consolidated into other versions because of packing.
backup: when saving large files (>500MB), show a percentage progress line during the save if output is to a screen
mount: reading large files saved with the new "auto" block size (#2490 on Mar 16, 2020) caused mount performance problems because of the larger block size.
dest: destination names are always lowercase inside HB even if they are mixed case in dest.conf. If uppercase letters were used with the dest setid or clear commands, they gave incorrect errors that a destination did not exist. Thanks Francisco!
backup: if backup tries to save a file but it’s been removed, it could cause a traceback. Thanks Bruce!
Traceback (most recent call last):
  File "backup.py", line 3474, in <module>
  File "backup.py", line 3317, in main
  File "backup.py", line 1392, in backupobj
  File "backup.py", line 339, in markgonelog
IndexError: No item with that key
dest: with a sharded backup, the dest clear and setid commands failed unless --force was used. Now they ask if it’s okay to proceed just once, then execute the command. Thanks Josh!
retain: if -r (rev) is out of range, a traceback occurred instead of an error message about -r. From internal testing.
backup: sometimes could hang when restarting a backup with --maxtime. Thanks Maciej and Dave!
backup: a rare race condition with hard-linked files changing during a backup could cause retain to fail with an AssertionError. This occurred with 2 files in a backup of 16M files. Thanks Daniele!
backup: progress stats were showing 0 bytes saved during backup
backup: saving a fifo could cause a broken pipe error and the fifo was only partially saved
backup: if a file was saved with an odd block size, like -B70000, mount could have trouble reading the file
though I wasn’t afraid of getting hit by a bus, having an expiration date on HashBackup during a pandemic seems like a really bad idea.
config: previously the default block size (config option block-size) was 32K variable. This will not change on existing backups. For newly-created backups, the default is now "auto". This will choose an efficient block size, either variable or fixed, for each file saved. This may reduce dedup somewhat (5% in tests), but greatly reduces block overhead in the main database (from 2.2M blocks to 780K in the same test), for faster selftests and restores. For more information see the block-size config option on the HashBackup web site. To get the old behavior on new backups, use hb config to set block-size to 32K or use -V32K on the backup command line. Use hb config to set block-size to "auto" on existing backups to get the new behavior. This will cause more data to be saved for modified files because of the block size change.
backup: if output is being sent to a screen, show files checked and saved every 10K files. This is useful on huge backups with mostly unmodified files, so users know files are being checked.
backup: using -V0 or -B0 (zero block sizes) should have failed, now they do
backup: excluding many files with inex.conf had some scalability problems. For testing, 200K pathnames were added to inex.conf, half with wildcards, half without. Then a directory of 10K small files was backed up on a Core 2 Duo system:
        Excludes   Backup init   Backup time   Backup rate (files/sec)
----------------------------------------------------------------------
OLD:    default    0 sec         3 sec         >3300
OLD:    200K       37 sec        2060 sec      5
NEW:    200K       24 sec        3 sec         >3300 (660x faster)
get: by default, get will copy local files if the size and time match the file to be restored. But it also needs to ensure the file type matches: if a file was saved, but at restore time the file is a symlink to a file with matching size and time, the symlink must be deleted and replaced by an actual file. Now hb get checks the file type before using a local file. From internal review.
get: a couple of changes this year made restores of large directories containing small files run 5x slower on OSX HFS+ filesystems. This was noticed during internal benchmarking and has been corrected.
export: now works if dest.db doesn’t exist (destinations never used)
export: fixed an error saying -c was required even when HASHBACKUP_DIR was set correctly. Thanks Maciej!
export: if a backup was created or rekeyed with -k to use a specific key, export worked but would give an error "Unable to access HashBackup database: file is not a database" when trying to update export’s exit status in the audit log. Thanks Ralph!
selftest: if no destinations were configured, --sample gave an error "No destinations support sampling" even if -v3 was used. Thanks Gabriel!
dest verify: there are 2 types of confusing remote file size errors detected with dest verify, and the final status message has been changed to clarify this:
HB uploaded a file, noted the uploaded size, but now dest verify (a remote "list") shows the file is a different size. The dest ls command shows the size HB uploaded, but the remote storage service says it is a different size. This can happen if a remote file is overwritten after the HB upload for example, and is a verify error.
HB knows an arc file should be a certain size, but a different size file was uploaded to the remote service. This is called a db size mismatch and is very unusual. It can happen if a local arc file is overwritten while HB is transmitting it for example.
config: as mentioned July 10 2019, #2377, the audit-commands config keyword is now obsolete. Previously this was used to select which commands were audited. All commands are audited as of #2377.
retain: as mentioned June 6, 2019, #2347, the --force option is now obsolete and the --dryrun option has been replaced by -n.
selftest: if an arc file is missing and deleted with --fix, selftest could fail later with a traceback:
TypeError: 'NoneType' object is not iterable
A 2nd selftest showed errors about extra blocks in the dedup table, a 3rd selftest ran clean. This traceback is now fixed and the 2nd selftest runs clean. Thanks Alex!
selftest: if cache-size-limit is set, the only copy of an arc file is missing, and hb verify is run, a subsequent hb selftest -v5 without --fix would fail with a traceback error in the prefetcher. This was happening because -v5 with a limited cache starts fetching arc files very early, before it knows that some arc files are missing. To fix this, the prefetch plan is computed just before it is needed rather than during initialization.
Related to this, if a missing arc file is not deleted with selftest, the check level is reduced from -v5 to -v2 to avoid the prefetcher traceback error and a warning is issued. Thanks again Alex!
when HB created new files, it sometimes left the execute (x) bit set in the file permissions because this is the default for Python programs that use os.open. An example of this is the log files. In other cases where HB did specify the permission, it used 0644, the permissions for a typical Unix setup with a default umask of 022. An example of this is arc files. However, if a site sets a umask of 02, the group id should also have write permission.
All uses of os.open were reviewed and when files are created, a default permission of 0666 is now used. The bits set in umask (usually 022) are then turned off, making the new file’s permissions 0644. With a umask of 2, new files will have permission 0664, and group users will have write access. This was not possible before this change, which affected 12 os.open calls. Thanks Francisco!
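The permission arithmetic above can be checked with a small Python sketch. The helper names effective_mode and create_file are hypothetical and only illustrate how a requested mode of 0666 combines with the umask; they are not HB functions.

```python
import os

def effective_mode(requested, umask):
    """Permission bits a new file ends up with: requested minus umask bits."""
    return requested & ~umask

def create_file(path, mode=0o666):
    """Create an empty file, passing an explicit mode to os.open.

    Without the mode argument, os.open defaults to 0o777, which is how the
    execute bit ended up set on some files.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, mode)
    os.close(fd)

print(oct(effective_mode(0o666, 0o022)))  # typical umask
print(oct(effective_mode(0o666, 0o002)))  # group-writable umask
print(oct(effective_mode(0o777, 0o022)))  # os.open default: x bits remain
```

The first two results are 0o644 and 0o664, the values given in the note; the third shows why the unspecified default left execute bits on new files.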
selftest: --inc (incremental testing) works for -v3 to -v5, but the optional download limit currently only works for -v4. A download limit doesn’t make sense for -v3 since it doesn’t download files. For -v5, a download limit is more complex to implement, because a single user file might require downloading many arc files. A note has been added to the selftest doc page and selftest will halt if a download limit is used with a check level other than -v4. Thanks Alex!
export: fixed a traceback that prevented an export from being generated. Thanks Frank!
selftest: remove arc .tmp files that may be left after a -v4 check. Thanks Kuryan!
dest verify: would not remove a file from HB’s database if it was the only copy, even if the file didn’t actually exist on the remote or was the wrong size. Now with --force, it will remove the file from HB’s database, allowing selftest --fix to make corrections.
dest ls: a new -v option shows associated data HB tracks for uploaded files. For example on B2, this shows the SHA1 hash and B2 fileid.
backup: halt on partial writes when creating arc files. Partial writes are very unlikely, but possible. Thanks Frank!
ssh: the ssh destination uses remote "dd" commands to download pieces of arc files (selective download). This is faster than downloading whole arc files, especially when restoring small files. Some 3rd-party providers of ssh services use jails or chroots to restrict the commands that can be used and dd is sometimes not available. Now HB gives advice to either use sftp instead of ssh or enable dd.
selftest: --fix will now truncate files if a hash mismatch is detected with -v5. This is needed when bad hardware caused errors in the backup, the hardware has been fixed, and the user wants a clean selftest.
IMPORTANT NOTE: if HB is used with defective hardware, especially non-ECC RAM that is defective, a clean selftest doesn’t mean the backup is fine. Some errors may not be detectable, for example, a bit flip in a database page in RAM, before being written to disk, could cause a file to be restored later with the wrong permissions. The best thing to do after a hardware problem is fixed is to start a new backup and keep the previous backup for historical reference if needed.
dir destination: remove "debug: fast copy" debug output when copying files to a dir destination with large iosize for NFS4.
dest.conf: the maxsize keyword is the maximum size of files uploaded to a destination. Any files over maxsize are split into parts before uploading, and reconstructed from parts when downloading. maxsize is used for small backups to email or WebDAV where there are often low limits on file sizes, like 25MB.
Previously the maxsize default was 5GB, because B2 has a hard limit of 5GB for uploads unless special procedures are used, and HB doesn’t use them. However, some sites increase arc-size-limit to create large arc sizes, like 10GB. If maxsize is not also changed, it causes inefficient splitting of arc files. This is often not recognized until the backup is well underway or finished, and it’s hard to correct split arc files once created.
The new default for maxsize is unlimited, ie, files are never split. The B2 driver now raises an error when trying to upload files > 5GB.
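To make the effect of maxsize concrete, here is a minimal sketch of how a file could be divided into parts. It is illustrative only, not HashBackup's code: the function name split_ranges and the (offset, length) representation are assumptions.

```python
def split_ranges(size, maxsize=None):
    """Return (offset, length) pairs covering a file of `size` bytes.

    maxsize=None models the new "unlimited" default: a single part,
    so the file is never split. Only files over maxsize are split.
    """
    if maxsize is None or size <= maxsize:
        return [(0, size)]
    return [(off, min(maxsize, size - off)) for off in range(0, size, maxsize)]


GB = 1000 ** 3
# A 10GB arc file with the old 5GB default is uploaded as two parts.
print(split_ranges(10 * GB, 5 * GB))
# With the new default it is uploaded whole.
print(split_ranges(10 * GB))
```

With a low limit such as 25MB for email or WebDAV, the same function yields many small parts, which is the case maxsize was designed for.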
mount: fix "error getting block xxx logid xxx ix 0: -xxxxxxxxxx" Thanks Warz!
recover: a couple of minor changes to hb.db.N files prevent (harmless) HMAC warnings during recover --check and speed it up.
if hb log selftest was interrupted with Ctrl-C during "Checking database integrity" with more than 5 seconds to go, it would cause a "Database is locked" error. A key aspect of this is that Ctrl-C is deferred during the database check, causing a long delay before the Ctrl-C is acknowledged by selftest. Thanks Chris!
audit: with "hb log backup …", two audit entries were created instead of one.
a bug generating hb.db.N files caused these files to flip between small and large sizes, when they should usually be on the small side. It also could cause recover to fail, though recover --check worked. This bug has been around for about 2 years and was found via internal testing.
recover: if --check was used and there were 2 destinations in dest.conf, hb.db.N files were downloaded and applied twice by mistake. This could also cause download problems if workers was > 1 in dest.conf because the same hb.db.N file could be downloaded concurrently by multiple workers.
config: a new option, db-history-days, keeps incremental database backup files (hb.db.N files) for the specified number of days before the latest backup. Combined with recover --check, this allows recovering earlier versions of the main backup database. The default is 3. HashBackup previously operated with 0. The new minimum value is 1 so there is always at least one older version of the database stored on remotes.
If the backup database is stored on networked or removable storage (not recommended!), or if using non-ECC RAM, lower values are not recommended because database damage is more of a possibility. Also, db-check-integrity should be set to "upload" in these configurations.
For remote storage such as Wasabi with delete penalties (90 days), setting db-history-days to 90 may lower total storage costs. Byte storage costs will actually increase, but delete penalty costs should decrease more. This is not a problem on Amazon S3 because HashBackup uses the regular storage class with no delete penalties for storing database increments, but Wasabi does not have storage without delete penalties. Also set pack-age-days to 90 with Wasabi.
Setting db-history-days higher may also have the effect of making each hb.db.N file smaller, though total storage for all hb.db.N files will be higher because more increments are kept. Higher values might be useful on slow upload links to decrease the size of hb.db.N files uploaded after each backup.
rekey: if rekey aborted, rekey had to be run again to do a rollback. Now the rollback happens automagically, either after the attempted rekey or on the next HB command.
selftest: for technical reasons, the dedup table was checked if --fix was not used but was not checked or corrected if --fix was used. Issues with the dedup table are not critical and will not harm the backup because of other checks before deduping a block. If selftest did notice problems, the dedup table had to be manually deleted. Now selftest also checks the dedup table with --fix and if any problems occur the dedup table is deleted and will be rebuilt on the next backup.
selftest: when sparse files were backed up with dedup enabled and they contained sequential identical blocks with a hole between them, selftest -v5 would say there was a file hash mismatch. The file restored correctly. This was actually a selftest bug and the backup was fine. Thanks Thomas!
recover: previously, if an error occurred while applying an incremental database backup file (hb.db.N) to re-create the main database (hb.db), recover aborted. With the new db-history-days option, HashBackup is retaining older increments longer. If a rekey occurs during the db-history-days time window, older increments could be using an obsolete key. A normal recover works fine in this situation, but a recover with the new --check option causes HMAC errors when the older increments are applied because the older increments were created with the previous key.conf file, before the rekey. Now the error is shown, the older increment is ignored, and HB continues restoring the database.
recover/rekey: following a rekey, new database increments (hb.db.N) and dest.db are uploaded to remotes, keyed with the new key. If the upload fails for some reason, the local backup uses the new key.conf but the remote backup files still use the old key, saved in key.conf.orig. This is fine and the upload will be retried on the next backup or sync. But if recover is attempted before the upload completes, there may be confusion about which key file to use. Recover now tries to be helpful in this odd situation by suggesting to use key.conf.orig if key.conf doesn’t work.
rekey: previously, rekey did not commit (completely finish) until after all files were uploaded to remotes. This can be a problem if some remotes finish and others don’t, because it’s harder to rollback a remote. Now rekey commits locally first, then uploads files to remotes. The next backup or sync will finish any incomplete uploads.
ls: a sha1 hash is shown for every file with -lv. For sparse files, this is not a true sha1 and is unique to HashBackup. A prefix like 7: or 11: is now displayed as a hint that these are not true sha1 hashes.
as mentioned a month ago, the Rackspace Cloud Files native driver has been throwing errors lately, no one has complained, and Cloud Files object storage service is 4x more expensive than S3, so support for it has been dropped. It should still be accessible using an rclone destination if necessary.
the ssh / sftp destination checked error message text to decide whether a file was missing on a remote. This is not reliable on non-English systems so a better method is now used.
dest verify: previously, a lot of success / fail messages were displayed for dest verify, the messages weren’t summarized, and the exit code was always zero. This made it hard to use dest verify with very large backups and in scripts and cron jobs. Now only error messages are displayed, with a summary of successful and failed file counts at the end. The exit code is non-zero on any verify failures, even if the errors are corrected.
when syncing, HB would halt with an error if an arc file was not found. Now it displays the error and continues syncing.
selftest: instead of showing the first database integrity error and stopping, selftest now shows all errors to allow better assessment of the type and extent of damage to a database. A customer had a damaged database that was easily fixed, but it was initially "scary" without knowing all of the specific integrity errors. Thanks Chris!
selftest: --fix now rebuilds damaged database indexes. This fixed the previous customer’s "malformed" database. Thanks Chris!
recover: if recover tried to download an arc file but it was missing on the destination, it halted with an error message. Now it continues to download the remaining arc files.
recover: a new --check option will recover the database in a slower way that does an integrity check after each hb.db.N is applied and saves the latest hb.db possible. This is useful if the local database has become damaged, to recover an older version of hb.db that is not damaged. Thanks Chris!
config: a new option db-check-integrity controls when database integrity is checked. The default is "selftest". This is how HashBackup has always operated and is fine when backups are run and stored on enterprise-class hardware: non-removable drives, ECC RAM, and hardwired networks.
Setting db-check-integrity to "upload" will do an integrity check before new hb.db.N files are generated and uploaded to remotes. This is recommended for consumer-level hardware using wireless networks, removable USB drives, or non-ECC memory, to prevent a possibly damaged local database from being transferred to remote destinations. Thanks Chris!
rekey: in the previous release, auditing was enabled for all commands. If hb.db is set to read-only, auditing cannot happen and an error about a read-only database occurs. If the command was rekey and ctrl-c is used to abort it, the next command will say "Incomplete rekey, run rekey". This usually does a rollback, putting things back in the state before rekey, but instead, HB said "run rekey" again. Now it does the rollback as expected. No data was lost with this, it’s just very confusing. Thanks Chris!
rekey: if a key had a passphrase and rekey -p was used to change it, it caused a "file is not a database" error and the rekey didn’t occur. The backup was fine. Thanks Chris!
rekey: previously if rekey -p aborted, say because the two new passwords were different, a backup & rollback was needed even though nothing had changed. Now a backup + rollback is only needed if rekey actually starts and doesn’t finish.
rekey: if rekey doesn’t complete for whatever reason, the next rekey restores the backup files (rollback). If the key had a passphrase, rekey would ask for the passphrase before doing the rollback. That’s confusing and the passphrase isn’t necessary to move original files back into place, so now it doesn’t ask for a passphrase.
rekey: previously when a blank passphrase was entered with -p ask in either init or rekey, an empty passphrase was placed on the key and HB would still prompt for a passphrase. Now if a blank passphrase is entered with init or rekey, the passphrase is removed completely, as if -p ask was not used. Existing keys with a blank passphrase will still prompt for it as before. Thanks Chris!
export: with shards enabled, a traceback could occur during export. The tar output filename was displayed incorrectly. Export is only applying the passphrase to shard #1 and is temporarily disabled for sharded backups.
config: all commands are now audited. The audit-commands config option will be retired in 2020 so scripts that set this should be adjusted before the end of 2019.
config: the hfs-compress option (OSX) is being retired. It was a clumsy implementation using the "ditto" command, and everything runs fine if previously-compressed files are restored uncompressed. Scripts that set this should be adjusted before the end of 2019.
ssh: if the rate keyword was used with an ssh destination, the -l option had a floating point value. Some versions of scp don’t like that, so now it’s an integer. Thanks Warz!
ssh: selective download always used a dd block size of 1K, which is inefficient for large transfers. Now it uses a range of block sizes up to 64K to optimize large downloads.
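As a rough illustration of why block size matters, the sketch below builds a dd command for a byte range and picks a larger block size for larger ranges. The selection rule (largest power of two between 1K and 64K that does not exceed the length) and the function name dd_plan are assumptions made for this example; HB's actual rule is not described here. Only the standard dd operands if=, bs=, skip= and count= are used.

```python
import shlex


def dd_plan(path, offset, length, min_bs=1024, max_bs=64 * 1024):
    """Return (command, trim) for reading `length` bytes at `offset`.

    dd reads whole blocks, so the command covers a block-aligned
    superset of the range; `trim` is how many leading bytes of dd's
    output to discard to reach `offset`.
    """
    bs = min_bs
    # Grow the block size while it stays within max_bs and the range length.
    while bs * 2 <= max_bs and bs * 2 <= length:
        bs *= 2
    skip = offset // bs                 # first block touched by the range
    end = offset + length
    count = -(-end // bs) - skip        # ceil(end / bs) - skip
    cmd = f"dd if={shlex.quote(path)} bs={bs} skip={skip} count={count}"
    return cmd, offset - skip * bs


print(dd_plan("arc.0.0", 5000, 100))
print(dd_plan("arc.0.0", 0, 1 << 20))
```

A 1 MB range needs 1024 blocks at a 1K block size but only 16 at 64K.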
backup: 32-bit systems with cache-size-limit -1 (all arc files are local - the default) stalled when the backup size was near 2GB.
there have been no downloads of the 32-bit FreeBSD build of HashBackup in the last 12 months so it has been removed.
mount: fixed intermittent error with cache-size-limit set:
mount: error getting block 39608 logid 949 ix 2003: free variable 'arcsize' referenced before assignment in enclosing scope
This could occur when reading a user file through the mount that crossed an arc file boundary in the backup, so it was more likely to occur on large files. Thanks Krisztian!
HashBackup has supported Rackspace Cloud Files since 2011. Recently it started throwing SSL Certificate Verification errors. No one has complained about this, so apparently there are very few if any HB customers still using Cloud Files (it’s 4x more expensive than S3). This driver also supports OpenSwift in theory. If anyone is using Cloud Files or OpenSwift with HashBackup, please send in an email. Otherwise, support for this driver will be removed in the next release. It should be possible to access these using rclone if necessary.
selftest detected a block hash mismatch in a customer’s dedup table, likely caused by a memory parity error. The dedup table is easily corrected by deleting it (hash.db), and backup will re-create it on the next backup. As a double check, blockids (integers) can now be listed on the selftest command line. HB will find the corresponding arc file and test all blocks in that arc file with a -v4 test.
retain: shows the overall retention period, ie, the sum of all the -s intervals after adding 1 if retain-extra-versions is set. Since retain intervals follow one after another, the overall retention time can be longer than expected, especially considering retain-extra-versions.
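The arithmetic behind the overall retention period can be sketched as follows. The day counts per unit (30 days per month, 90 per quarter, 365 per year) and the function names are assumptions for illustration; retain's own figures may differ.

```python
import re

# Assumed length of each interval unit in days (illustrative only).
UNIT_DAYS = {"d": 1, "w": 7, "m": 30, "q": 90, "y": 365}


def parse_schedule(spec):
    """Split a schedule such as '7d4w3m4q2y' into (count, unit) pairs."""
    pairs = re.findall(r"(\d+)([dwmqy])", spec)
    if "".join(n + u for n, u in pairs) != spec:
        raise ValueError(f"bad schedule: {spec!r}")
    return [(int(n), u) for n, u in pairs]


def overall_days(spec, extra_versions=True):
    """Sum every interval, adding 1 to each count if extra_versions is set."""
    bump = 1 if extra_versions else 0
    return sum((n + bump) * UNIT_DAYS[u] for n, u in parse_schedule(spec))


print(overall_days("7d3m", extra_versions=False))  # 7 + 90 = 97
print(overall_days("7d3m"))                        # 8 + 120 = 128
```

Because the intervals follow one after another, the total grows quickly once weekly, monthly, and yearly intervals are combined.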
retain: if a path was listed on the command line but not found in the backup, the final statistics line caused a divide by zero when figuring out percentages deleted and kept (all zeroes).
retain: a new -r option performs retention only on files at or below a specified backup version. This is for testing retain, but might be useful to thin out older files using a different retention policy.
retain/rm: packing runs more often (pack-age-days / 4) but still respects the pack-xxx config option limits.
cacerts.crt, the SSL CA bundle used by HashBackup, was updated. HB will update the cacerts.crt file in every backup directory on the next backup if it is writable.
upgrade: a 2-minute retry loop was added to handle upgrade server reboots
retain: if a file’s backup was 5 years old, -s3y was used, and a new copy was saved, retain would delete the 5-year-old copy. But it should have been kept so that restores as of 2 years ago would still have the file. From internal testing.
upgrade: accepts an optional http URL to a private HB upgrade server. Sites with more than 10 copies of HashBackup are encouraged to set up their own upgrade server.
IMPORTANT: only hb binaries at this release or later can upgrade from a private upgrade server. So first upgrade to this release, then create and start upgrading from your private server.
To create an HB upgrade server, run a cron job on your private server - no more than once a day please - to mirror the HashBackup release with rsync. Adjust the target for your http server. The source is an rsync module, so two colons are used:
rsync -a --delete-after --port 8010 upgrade.hashbackup.com::release /var/www/hbrelease
To mirror the preview release, use ::preview instead of ::release.
To upgrade using your local server:
hb upgrade [--force] http[s]://myserver.mydomain.com/hbrelease
This functions just like the official HB upgrade server, including using RSA4096 signature verification to ensure the binary is authentic. To archive past releases omit --delete-after.
upgrade: the new -p option upgrades to the preview release instead of the official release. The preview release is to try out the upcoming release and provide early feedback with non-production backups. -p cannot be used with an upgrade http address, but you can create both preview and release private server directories and upgrade from either using the http address.
dir destination: the new iosize dest.conf keyword specifies the size of reads and writes when copying files to the destination. The default iosize was 64K bytes, which can be slow if the target is an NFSv4 server where each write waits for physical disk I/O to complete on the server. The new default size is 512K bytes and this can be increased using iosize. For example, iosize 16M would use 16 MB transfers. Thanks Song for your extensive NFS testing!
retain: -s has been rewritten in this release to fix several bugs found with internal testing. Performance and memory usage are the same as before. Customers have not reported these long-standing problems, but since files are modified and backed up at random times, retention bugs may be hard to notice.
sometimes multiple versions of files outside the entire retention period but still "live" in the filesystem were being retained, when only the latest version should have been retained. For older backups, the first run of retain with this release may delete a lot of old files.
retain didn’t do time rounding: if a file was saved one day at 2:10 AM and then was saved the next day at 2:05 AM (not a full day apart), retain with -s5d (keep last 5 days) might incorrectly delete a backup.
simulating 1200 daily backups from 2016-05-25 to 2019-09-06, retain with a schedule of -s 7d4w3m4q2y worked okay (but not great) when it was run once after all 1200 backups.
(Columns are: backup number, date/time saved, and a note to show what each backup represents, even though the dates are not always as expected):
1200 2019-09-06 07:35:06 1d
1199 2019-09-05 07:51:03 2d
1198 2019-09-04 07:35:12 3d
1197 2019-09-03 07:28:11 4d
1196 2019-09-02 07:33:29 5d
1195 2019-09-01 07:33:54 6d
1194 2019-08-31 07:29:08 7d
On a live server backed up daily with -s30d12m it did better, but the older backup history is mostly missing and there are obvious gaps in the most recent 30 days (only 16 versions):
2821 2019-05-10 02:02:59 bsd732-s001.vmdk 1d
2820 2019-05-09 02:02:18 bsd732-s001.vmdk 2d
2815 2019-05-04 02:03:01 bsd732-s001.vmdk 7d
2813 2019-05-02 02:03:02 bsd732-s001.vmdk 9d
2812 2019-05-01 02:02:21 bsd732-s001.vmdk 10d
2810 2019-04-29 02:02:57 bsd732-s001.vmdk 12d
2808 2019-04-27 02:02:58 bsd732-s001.vmdk 14d
2807 2019-04-26 02:02:43 bsd732-s001.vmdk 15d
2803 2019-04-22 02:03:00 bsd732-s001.vmdk 19d
2802 2019-04-21 02:02:23 bsd732-s001.vmdk 20d
2800 2019-04-19 02:02:25 bsd732-s001.vmdk 22d
2798 2019-04-17 02:02:38 bsd732-s001.vmdk 24d
2794 2019-04-15 02:02:16 bsd732-s001.vmdk 26d
2792 2019-04-14 02:02:03 bsd732-s001.vmdk 27d
2788 2019-04-12 02:03:06 bsd732-s001.vmdk 29d
2787 2019-04-10 02:05:36 bsd732-s001.vmdk 31d
2780 2019-04-02 02:02:08 bsd732-s001.vmdk 2m
2479 2018-04-14 02:02:37 bsd732-s001.vmdk 13m
Now retain does much better at keeping the requested history. Here is the earlier simulation with the new retain:
1200 2019-09-06 07:35:06 1d
1199 2019-09-05 07:51:03 2d
1198 2019-09-04 07:35:12 3d
1197 2019-09-03 07:28:11 4d
1196 2019-09-02 07:33:29 5d
1195 2019-09-01 07:33:54 6d
1194 2019-08-31 07:29:08 7d
Backup file retention is a thorny problem and reasonable people will disagree on exactly how it should work. Retain -s uses time intervals without regard to midnight, time zones, or calendars, so the files it retains may appear arbitrary and confusing. It retains a reasonable backup history over a long period of time (or it should now) but retention is not based on days of the week, month, or year.
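The time rounding problem mentioned above (a file saved at 2:10 AM one day and 2:05 AM the next) comes down to how an elapsed interval is converted to whole days. This sketch assumes "time rounding" means rounding to the nearest day; the function names and dates are made up for the example.

```python
from datetime import datetime


def age_days_truncated(saved, now):
    """Whole days elapsed, discarding any partial day."""
    return (now - saved).days


def age_days_rounded(saved, now):
    """Elapsed time rounded to the nearest day."""
    return round((now - saved).total_seconds() / 86400)


earlier = datetime(2019, 9, 5, 2, 10)   # saved at 2:10 AM
later = datetime(2019, 9, 6, 2, 5)      # saved the next day at 2:05 AM

print(age_days_truncated(earlier, later))  # 0: the two saves look like the same day
print(age_days_rounded(earlier, later))    # 1: treated as one day apart
```

With truncation, two daily backups 23 hours 55 minutes apart fall into the same interval, so one of them can look redundant and be deleted.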
retain: display progress (total, kept, and deleted files) every 100K kept files or 10K deleted files if sending output to a screen
retain: previously the -v option would show a brief reason why files were deleted. This is still supported, but is replaced by -v1 to show deleted files and -v2 to show both deleted and kept files, with more details about why files are deleted or kept. Directories are now suppressed since they are lengthy and mostly uninteresting.
config: a new config option, retain-extra-versions, controls whether retain adds an extra period to each -s interval. The default is True to add a safety margin for backup file retention.
There are 2 mental models for file retention. Using -s1m can mean:
1. keep at MOST 30 days of backup history
2. keep at LEAST 30 days of backup history
The difference is subtle, but to accomplish #2, there must be a backup older than 30 days. The retain-extra-versions config option changes -s7d3m to -s8d4m to satisfy this. -s1m is changed to -s2m so there are 30-60 days of backup history, or "at LEAST 30 days".
For model #1, set retain-extra-versions to False, but keep in mind that with -s1m for example, you could have a current backup from today and a "monthly" backup from yesterday, with no earlier backup history. The monthly backup will age every day, from 1 to 30 days, depending on the day of the month of the latest backup, then will jump ahead 30 days and be next to the latest backup. With retain-extra-versions set to False, -s1m would keep 1 to 30 days of backup history, or "at MOST 30 days".
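The adjustment described above, where -s7d3m becomes -s8d4m and -s1m becomes -s2m, amounts to adding one period to every interval in the schedule. A minimal sketch, with a hypothetical function name:

```python
import re


def add_extra_versions(spec):
    """Add one period to every interval in a -s schedule string."""
    return re.sub(r"\d+", lambda m: str(int(m.group()) + 1), spec)


print(add_extra_versions("7d3m"))  # 8d4m
print(add_extra_versions("1m"))    # 2m
```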
files deleted more recently than -x are treated like a live file: the most recent copy is always saved and older copies are retained according to -s, -t, and -m.
files deleted longer ago than -x are removed entirely
if -x is not used, it is determined by adding all of the -s and -t intervals together. If neither -s nor -t are used (only -m), then deleted files are kept in the backup forever.
Another difference is that -x can now be larger than -s and -t. For example, -s30d without -x retains deleted files for 30 days. Adding -x1y keeps the latest backup of deleted files for a year.
retain: the # items kept/deleted was wrong when --dryrun was used, and it was especially noticeable if there were a lot of directories that a normal retain would have deleted. Also added -n as a synonym for --dryrun.
retain: the --dryrun option is being renamed to -n for consistency. This is unlikely to be in scripts, but should be changed soon if it is.
retain: the --force option is being retired in 2020 and should be removed from scripts & cron jobs. If the previous backup aborts, retain will operate correctly without --force.
dest: previously if a destination was unable to start, advice was given to use debug 99 to show a traceback. Now a traceback is always shown on a start failure so debug 99 isn’t needed.
selftest: if --sample was used with a list of arc filenames it would incorrectly stop with an error message "Use --sample with -v3, -v4, or a list of arc filenames to sample".
selftest: if --sample is used with an older format arc file listed on the command line (pre-2013), it can’t sample the arc file because older arc formats don’t support sampling. Now this causes an error to make it clear why sampling isn’t happening.
selftest: if an arc or pathname is on the selftest command line it performs an extensive test of the items listed. Previously this still performed a full -v2 selftest. Now when an arc or pathname is listed, some parts of selftest -v2 are skipped to speed up testing the items requested, and a partial test note is displayed.
selftest: a customer forcing a VM crash during a CIFS backup with the backup database on mergerfs (FUSE) had incorrect reference counts in the database. Selftest is designed to fix incorrect reference counts, but there were a couple of cases that it didn’t handle right. Now it fixes bad reference counts without removing files. Thanks Ben!
selftest: if an arc file is missing, selftest could say:
Error: block 1 requires arc.0.0
Deleted block 1
But it didn’t actually delete the block, so the error persisted. Now it deletes the block.
dir + ssh destination: if a backup aborts, it leaves .tmp files on the remotes for dir and ssh destinations. These are supposed to get removed on the next backup, but that wasn’t happening. It’s fixed now, but you may want to manually remove stray .tmp files from these destinations (ensure hb is not running).
rclone: with rclone 1.42 and higher, "deletefile" avoids directory listings and is much faster than "delete". With lower versions of rclone, "delete" doesn’t do directory listings. If you run rclone 1.42 or higher, change "delete" to "deletefile" or get a new copy of rclone.py from the HashBackup web site (bottom of Destinations). Thanks Ben!
upgrade: if an error occurs downloading any file, display more detailed error information. Also, upgrade now opens a temp file next to the current HB binary before doing any downloads to avoid the situation where the download works but the HB binary can’t be upgraded. If the upgrade is in a cron job and fails, maybe because of a permission problem, it can cause a download loop.
backup: the dedup table grows dynamically up to a limit set with the dedup-mem config option or the -D backup command line option. When dedup is enabled, backup displays a percentage of how full the dedup table is at both its current and max size. But other commands using the dedup table may not know its maximum size and displayed a confusing percentage. Now only backup displays both percentages.
get: exception handling was mistakenly left disabled for development and is now enabled again. When an exception occurs during restores, get is supposed to show an error message for the file affected and continue the restore, but instead it was halting. Thanks Ian!
get: when restoring files over 500M, get tries to report progress every 1%. For block sizes > 4M (a recent feature), this could cause a divide by zero exception. Thanks Ian!
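One plausible way a 1% progress interval turns into a divide by zero is shown below. This is a guess at the general shape of the problem, not HB's actual code; the function names are hypothetical.

```python
def step_unguarded(file_size, block_size):
    """Blocks between progress reports when reporting about every 1%."""
    return (file_size // 100) // block_size


def step_guarded(file_size, block_size):
    """Same, but never less than one block between reports."""
    return max(1, (file_size // 100) // block_size)


MB = 1024 * 1024
size, block = 600 * MB, 8 * MB

print(step_unguarded(size, block))  # 0: 1% of the file is smaller than one block
print(step_guarded(size, block))    # 1

# Using the unguarded step in "block_number % step" raises ZeroDivisionError.
```

When a block is larger than 1% of the file, the integer step is zero, so any modulo or division by it fails unless the step is clamped to at least 1.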
upgrade: the RSA key used to verify new versions of HashBackup has been upgraded to RSA 4096. For the rest of 2019, #2295 and below can still use the upgrade command and will use the old RSA key to verify the download, while #2298 and up will use the new RSA key.
previously, new releases of HB were posted to the hashbackup.com Download page and also to the upgrade server. To make automated deployments easier, an installer or "boot binary" that does not change from release to release is now posted on the Download page. The boot binary is run just like the regular HB command, but the initial run will do an hb upgrade, replacing the boot binary, then execute the original command. The boot binary has the public key built in and does RSA 4096 verification of the latest version downloaded from the upgrade server, just like a regular hb upgrade.
To verify the sha1 of the boot binary from a secure server, use: https://sites.google.com/site/hashbackup/download
mount: the inode cache was increased from 10K to 100K entries to handle large directories better. When this cache is full, mount uses around 275M of RAM. The size of this cache should probably be configurable with a mount command line option.
count: with -n, count doesn’t read file sizes to make the scan faster. But if --shards is used, count should still show how files are divided between shards - just not the sizes.
export: fix and speed up log file pathname mapping and try to map deleted pathnames to ??. Some deleted pathnames might stay visible in exported log files because they are hard to distinguish from regular text. Thanks Ben!
mount: in #2062 (released as #2077), a bug was fixed where deleted files could be accessed even though they did not appear in a directory listing. This fix had problems when -f was used: an ls inside a mount was sometimes missing files or displaying files deleted in an earlier version. This is fixed, and mount -f can now read large directories 9x faster (100K entry directory with 30 full versions). Thanks Dan!
s3: if more than one s3-like destination was configured and they were using multipart uploads, uploads in one destination would block uploads in another destination because a lock inside HB was configured incorrectly. Now multiple S3 destinations upload in parallel, as expected.
s3: if multipart uploads are enabled, HB tries to abort previous MP uploads during initialization. It was not using prefixes to list previous MP uploads, so if restricted S3 permissions were used, access had to be given to the entire bucket for the "list multipart uploads" S3 request. Now this permission can be limited to the prefix (dir) being used. Thanks Alexander!
backup: a large directory of tiny files was taking 3m30s to backup and using 950MB of memory because of a bad interaction with the memory manager. Using a very large block size made it worse. Now the same backup takes 34 seconds and uses 82MB of memory.
export: changed to improve privacy of exported databases:
if an admin passphrase is set, it is verified before export
all pathnames are mapped to numbers. An ls of an exported database looks like this:
Backup directory: /hb/export
Passphrase?
Loading /hb/export/hb.db.0
Verified signature
Verified hb.db signature
Most recent backup version: 0
Showing most recent version, use -ad for all
/ (parent, partial)
/1 (parent, partial)
/1/2 (parent, partial)
/1/2/3
/1/2/3/2580
/1/2/3/2581
/1/2/3/2582
/1/2/3/2583
free and partial pages are removed from the exported database since they may contain remnants of removed or changed pathnames
all log data is included with the export, with pathnames changed to numbers
the admin passphrase is removed on the exported database. It is stored as a salted hash, and on the customer support side, always had to be removed to view the backup config.
sharding: the partitioning method changed yet again, to divide files more evenly into shards. The goal is to have close to the same number of files in each shard; files can’t be spread evenly by size because file sizes constantly change. Partitioning by size would cause files to move between shards, making incremental backups impossible.
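The point about stable shard assignment can be illustrated with a hash of the pathname. This is a generic technique chosen for the example, not HB's partitioning method, which is not described here; shard_for is a hypothetical name.

```python
import hashlib
from collections import Counter


def shard_for(path, nshards):
    """Map a pathname to a shard number that never depends on file size."""
    digest = hashlib.sha1(path.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % nshards


# The same path always lands in the same shard, so incremental
# backups keep seeing a file in the shard that saved it before.
print(shard_for("/home/user/report.txt", 4))

# File counts per shard come out close to even for many paths.
counts = Counter(shard_for(f"/data/file{i}", 4) for i in range(10000))
print(sorted(counts.items()))
```

Because the mapping uses only the pathname, a file that grows or shrinks stays in its shard, which is the property size-based partitioning lacks.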
count: for statistic purposes, directories are now counted as a zero-size object. Several other minor changes.
rm: if a packing operation was interrupted at exactly the right time with either cache-size-limit set to -1 or an arc file that wasn’t on any destination yet, the backup data in an arc file could be lost. It’s not clear this ever actually occurred with a customer backup, but it was verified with a forced test case. Thanks Israel!
the "dir" destination type did not handle missing arc files well when selective download was used. It went through the retry cycle then stopped, which caused selftest to think that all subsequent arc files in the backup were bad, which caused it to delete them all if --fix was used. Now when an arc file is missing on a dir destination, --fix only deletes the blocks in that arc file. Thanks Israel!
ls: when looking for a file, the default is to check only the current version. For deleted files, nothing is displayed and it looks like the file is not present. The file is present, because otherwise ls says "Not found in backup", but that’s a subtle difference. Now a note is added to use -ad to check all versions; then deleted files are displayed.
backup: when a trailing slash was used on a command line pathname, backup did not always descend into the directory when -X was used. This is a recent bug caused by the changes to add -F, read pathnames from a file. Thanks Dan!
selftest: a recent change to make selftest work on read-only databases had the unintended side-effect of causing an UnboundVariable traceback if any error occurred in selftest. Thanks Ben!
The count command may be useful for setting backup options and estimating the minimum time an incremental backup will require. There are other Unix tools that do similar things but count’s real purpose is to serve as a proving ground for parallel directory scanning and other future optimizations to improve backup speed.
rm: in some situations, rm would pack arc files but instead of removing the original unpacked arc files from remote storage, it only flagged them for removal and they would be deleted in the next backup. Now the remote arc files are removed at the end of rm.
retain: previously either -t or -s could be used, but not both. Now they can be used together. -t retention occurs first, then files are retained according to -s. So for example:
-t30d -s7d4w12m
means to keep all files backed up in the last 30 days, then keep 7 daily backups after that, 4 weekly, and 12 monthly. -t has the effect of shifting or delaying the schedule, so in this case without -x, the deleted file time would be 13 months: 12 months from the -s schedule then delayed another 30 days by -t. Thanks Leo!
retain: previously -x was required if neither -t nor -s were used. But this prevented using -m by itself to limit the number of copies. Now -x is not required, and the default is to keep all deleted files. Thanks Alexander!
retain: with -v, retain was not showing the correct backup date when deleting empty directories.
retain: -x10000y halted with "-x must be <= -t" because of an overflow problem. Thanks Alexander!
with a Dir destination type, the dest.conf rate limit was not honored for files smaller than the rate. For example, if the rate was 10M, for 10MB/s, files less than 10MB were copied at full speed because of a rounding error.
this version of HB will upgrade databases at dbrev 23 or later to the current version, dbrev 32. Dbrev 23 is circa Nov 2016. If the database is older than that, it has to be upgraded with an older version of HB first, then again with the current release. This is now detected and explained when a very old database is accessed instead of giving an import error.
B2, WebDAV: a library used by HB was updated in #2247 and displayed a long warning on some systems about SNIMissing. This was an HB build problem and has been fixed.
Mac: on old versions of OSX like 10.6.8, Snow Leopard, HB has high CPU usage during file uploads. This is apparently an OS issue (a "busy wait") that doesn’t occur on newer versions of OSX like 10.9, Mavericks. To prevent this, add timeout None to dest.conf. However, doing this can cause HB to hang if there is a problem communicating with an unreliable remote, so it isn’t a perfect solution.
rm: when files are deleted from the backup, the "formula" for reconstructing the files is immediately deleted from the database, making it impossible to restore the files using HashBackup. However, the files' encrypted data blocks, stored in arc files, are not deleted until the next pack operation. This may make it difficult to say "yes, your confidential data has been removed". A new --secure option repacks all affected arc files to remove deleted blocks, ignoring the pack-xxx config settings and limits.
backup: a new -F option specifies sources of pathnames for files to backup. This can be used to avoid scanning through an entire filesystem to find new, modified, or deleted files. There can be more than 1 source after -F. Each source is:
pathname of a file containing pathnames to backup, 1 per line. Blank lines and lines beginning with # (comments) are ignored. Pathnames in the file can either be individual files or whole directories to save. If a pathname doesn’t exist in the filesystem, it is marked deleted in the backup.
pathname of a directory, where each file in the directory contains a list of pathnames to backup as above
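The file format described above (one pathname per line, blank lines and # comments ignored) can be sketched in Python as follows. The function name `read_pathname_list` is hypothetical, and whether HB trims surrounding whitespace from pathnames is not stated, so this sketch leaves each pathname untouched.

```python
def read_pathname_list(text):
    """Return the pathnames listed in the text of a -F style file.

    Blank lines and lines beginning with '#' are skipped.
    Assumption: pathnames are kept exactly as written on the line.
    """
    paths = []
    for line in text.splitlines():
        if not line.strip():
            continue  # blank line
        if line.startswith("#"):
            continue  # comment
        paths.append(line)
    return paths

example = "# nightly list\n/home/jim/notes.txt\n\n/var/www\n"
print(read_pathname_list(example))
```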
Previously, if a pathname did not exist in the filesystem, an error message was displayed that the file couldn’t be found but the file stayed in the backup. Now the file is marked deleted in the backup. This allows sites to add deleted pathnames to a -F file list to get them removed from the backup without scanning the filesystem.
backup: previously, command line pathnames that were files rather than directories were saved in every shard. For example, backing up *.c would save all of the C files in every shard. Now these are divided up among the shards so each file is only saved in 1 shard. Pathnames listed in -F files are also divided among shards.
Command line and -F pathnames are also sampled, so backing up *.c --sample 10 will only save 10% of the C files.
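One common way to divide pathnames so that each lands in exactly one shard is to hash the pathname. The sketch below assumes that approach; HB's actual selection method is not described in these notes, and `shard_for` is a hypothetical name.

```python
import hashlib

def shard_for(pathname, total_shards):
    """Map a pathname to a shard number in 1..total_shards.

    Assumption: a SHA-256 hash of the pathname decides the shard.
    This is a stand-in, not HB's real method.
    """
    digest = hashlib.sha256(pathname.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % total_shards + 1

for name in ["main.c", "util.c", "parse.c"]:
    print(name, "-> shard", shard_for(name, 3))
```

Because the mapping depends only on the pathname, every shard process can decide independently which files belong to it.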
selftest: if a backup is on a read-only filesystem or the database is set to read-only, selftest no longer fails at the end with a traceback when the incremental selftest markers are updated.
recover: now works with sharded backups. The recovery procedure is similar: copy key.conf and dest.conf to an empty directory, run recover. This recovers the main backup directory. Then run recover again (recover tells you this) to recover all shards in parallel.
compare: backup’s --sample option was added to compare so that compare works with sampled backups
backup: shards were mistakenly backing up each others' backup directories. This caused major performance problems if cache-size-limit was not set and all backup files were local.
dest: all of the dest subcommands (clear, load, setid, unload, etc) work with sharded backups.
get: restoring a Mac/OSX or BSD backup to a Linux system could cause a traceback when ACLs are used:
Traceback (most recent call last):
  File "/hb.py", line 149, in <module>
  File "/get.py", line 2080, in main
  File "/get.py", line 846, in restoreobj
  File "/get.py", line 616, in restoredir
  File "/get.py", line 932, in restoreobj
NameError: global name 'acl' is not defined
Now this fails in a better way: since ACLs are not portable across platforms, HashBackup issues this error on every file or directory with an ACL and then continues the restore:
Unable to set ACL: this system does not support ACLs: (pathname)
File data is still restored even if the ACL cannot be set.
when an older backup is accessed with a newer version of HashBackup, an automatic database upgrade might be necessary. If the key has a passphrase, it is now only queried once instead of once for every version upgrade.
Amazon S3: the class keyword previously required a value of either standard or ia. It still accepts those, but also any other value used here is passed directly to AWS, so you can use onezone_ia for example. The value is uppercased as required by AWS. Thanks Alex!
init: a new --shards N option will create backup subdirectories s1 to sN for each shard beneath the main backup directory. Sharding allows multiple backups to run in parallel, automatically dividing files between shards.
Shards are set up so that they share the same key.conf, inex.conf, dest.conf, and cacerts.crt files, located in the main backup directory. Modifying any copy of these files takes effect in all shards.
dest.conf is somewhat special because each shard needs a separate directory to store its backup, so whenever {shard} is seen in dest.conf, it is replaced by the shard number. See further down for more details.
A new directory "sout" is created for sharded backups. It saves the output from each sharded command and is displayed when each shard finishes.
Thanks Ian for inspiring sharded backups.
config: a new config option, shard-output-days, is the number of days to keep shard output in the sout directory. The default is 30 days. Set to 0 to keep all shard output.
shards: in this release, most commands on a sharded backup directory will automatically start a process for each shard. Output from each process is buffered in a file and displayed in order after each process finishes. The log prefix works (hb log backup …). There are still rough edges:
output should not be buffered in memory so that files in sout can be checked for job progress or problems
commands that need input may halt or ask for each shard
the prompts for input are not always displayed
mount & recover do not work, recover being a necessity
HB will start a process for each shard if the backup directory was created with the --shards option to init.
To manually start shard backups, eg, with GNU parallel, the shard is specified with -c backupdir/sN. For example, if the init command was -c backupdir --shards 5, then the backup command for shard #1 would use -c backupdir/s1, etc.
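A small Python sketch of building one backup command per shard, following the -c backupdir/sN convention above. The helper name `shard_backup_commands` and the example paths are hypothetical; the commands are only built here, not run.

```python
def shard_backup_commands(backupdir, shards, sources):
    """Return one argv list per shard, each using -c backupdir/sN."""
    return [
        ["hb", "backup", "-c", f"{backupdir}/s{n}", *sources]
        for n in range(1, shards + 1)
    ]

for argv in shard_backup_commands("backupdir", 5, ["/home"]):
    print(" ".join(argv))
```

The printed lines could be fed to a job runner such as GNU parallel, or started with `subprocess.Popen`.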
backup: when displaying the percentage of files being saved, backup would display 50% for the 1st shard then 100% for the 2nd shard in a 2-shard setup. It should have displayed 50% for each shard.
config: previously, new admin passphrases were entered on the command line. This is not secure since command lines can be observed. Now the new admin passphrase is asked for with getpass(), echo is turned off, and it is entered twice for verification.
HB will look in the environment variable HB_ADMIN_PASSPHRASE when an admin passphrase is needed. This should NOT be used like this:
$ HB_ADMIN_PASSPHRASE=jim hb rm ...
because that command is visible to other users. Instead, do this:
$ HB_ADMIN_PASSPHRASE=`cat pwfile` hb rm ...
where pwfile is secured and contains the password, or export the environment variable in .profile and make sure it is protected. Storing passwords is usually a bad idea, but if it’s necessary for automation, this is one of the less bad ways to handle it. On modern Unix systems, environment variables are not visible to other non-root processes.
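For automation driven from Python rather than a shell, the same idea can be sketched as below. The helper name `env_with_passphrase` is hypothetical. It reads the secured file and returns an environment for a child process, so the passphrase never appears on a command line.

```python
import os

def env_with_passphrase(pwfile, base_env=None):
    """Return a copy of the environment with HB_ADMIN_PASSPHRASE set from pwfile.

    Trailing newlines are dropped, matching what `cat pwfile` in backticks does.
    """
    env = dict(os.environ if base_env is None else base_env)
    with open(pwfile, encoding="utf-8") as f:
        env["HB_ADMIN_PASSPHRASE"] = f.read().rstrip("\n")
    return env

# Usage (not executed here):
#   import subprocess
#   subprocess.run(["hb", "rm", ...], env=env_with_passphrase("pwfile"))
```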
config: config changes made to the main backup directory are propagated to all shard subdirectories. Trying to change the config settings on an individual shard will throw an error.
dest.conf: for sharded backups, each shard must have its own private storage area similar to the way each host must have its own private storage area. This is usually done by using the Dir keyword with a destination. For sharded backups, the special string {shard} in dest.conf is replaced by the shard number. For example:
Dir hostname/s{shard}
will store backups in hostname/s1, hostname/s2, etc. {shard} is replaced anywhere it occurs, not just on Dir keywords, so it can be used with rsync, rclone, and other destinations that do not use the Dir keyword.
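The substitution amounts to a plain string replacement. A minimal sketch, with the hypothetical helper name `expand_shard`:

```python
def expand_shard(dest_conf_text, shard_number):
    """Replace every {shard} in dest.conf text with the shard number."""
    return dest_conf_text.replace("{shard}", str(shard_number))

for n in (1, 2):
    print(expand_shard("Dir hostname/s{shard}", n))
```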
compare: fixed a traceback (since #2197)
get: fixed a traceback when --delete was used (since #2197)
HashBackup is written in Python. This release updates the Python environment used to build HB. No performance changes have been noticed. The main reason for the Python update is to be able to use newer versions of OpenSSL.
HashBackup uses OpenSSL for secure network connections. Previously OpenSSL was dynamically loaded at runtime on Mac OSX and statically loaded on Linux and FreeBSD. Now it is statically loaded on all platforms and can be updated independently of OS updates, allowing HB to maintain security when running on older systems.
backup: if --maxtime is used and one ever-changing file took longer than maxtime to backup, HB would get stuck backing up this file instead of proceeding to the next, ie, it started on the checkpoint rather than after the checkpoint. From a thought experiment and verified with a test case.
backup: the inex.conf file is normally saved in every backup, but sometimes if --maxtime was exceeded, inex.conf was not saved.
backup: the --sample and --shard options select some files for backup and reject others. Rejecting files is now more efficient: for a test directory on SSD of 10K files, --sample 10 is twice as fast. The selection method changed again so this release will not sample or shard the same as the previous release. This should be the last time it changes.
Amazon S3: in #2139 the S3 endpoints were changed from IPv4 to the "dual stack" endpoints to enable both IPv4 and IPv6. For some regions this worked fine, but for others it caused a 400 Bad Request. There doesn’t seem to be a pattern of where it worked and where it failed. For now this change has been reverted to IPv4 endpoints. If you understand why some regions worked and some didn’t, please send an email. Thanks Scott!
backup: as an experiment, the --sample option had 2 forms: either a single percentage, meaning the percentage of files to sample, or a percentage range used to partition the backup. This option now only takes a single percentage and the range option has been removed.
There was also a bug in the sampling feature, where the sampling could sometimes be very uneven. For example, in a 100-file directory, backing up with --sample 1,33 should have saved around 33 files, but instead only saved 2. --sample 67,100 should also have saved around 33 files, but instead saved 66. In this release the sampling is more uniform.
backup: EXPERIMENTAL: a new --shard option replaces the sample range option. It takes 2 integers, "my shard id" and the total number of shards, separated by a slash. For example, --shard 1/3, --shard 2/3, or --shard 3/3. This allows partitioning a huge backup into independent sections and running the backups in parallel using separate backup directories. This option is only for experimentation and evaluation and is not likely to stay in this form as it becomes more integrated into HashBackup. Feedback is welcome from any experiments with sharding a large backup.
rm/retain: a couple of weeks ago, a fix was made to the packing code to avoid repacking the same set of small files over and over. This had an unintended side effect that caused a hang when packing large files. This new fix will hopefully accommodate both situations.
backup was incorrectly displaying a message "Sampling 1% of backup" when no --sample option was used. It was doing a regular backup so no backups were missed because of this bogus message.
backup: --sample 10,5 was displaying a negative sample percentage instead of displaying an error message.
get: in the previous release, get used whole local files to assist restores. This works very well for restoring a user directory with lots of files but didn’t help much with restoring a large file that had changed, such as a VM image.
With the new --splice option, get can combine data from parts of local files with remote backup data to restore files, sometimes called incremental restore. This can be done even if the local files are changing, for example, a running VM image. For very large restores, splicing can use temp space in the backup directory equal to the size of the restore, and it requires reading the local files. Splicing can reduce the amount of data downloaded significantly and is often faster than other non-spliced restores. Thanks Jacob and
inex.conf: /dev/fd are open file descriptors and should be excluded from the backup as they generate errors if used with -X (cross filesystems). Thanks Harry!
selftest: a new warning was recently added about sparse files with variable-sized blocks. However, OSX and ZFS sometimes compress files and it causes these warnings, even though the backup is fine, so the warnings have been removed for now. Thanks Alex & Robert!
backup: a new --sample option can be used to sample huge backups. This is especially useful with the simulated-backup config option to determine the best settings for a huge backup, though it also works with non-simulated backups. Directory records are never skipped, which may skew the results slightly.
There are two forms for the --sample option:
The first form, --sample P, is a percentage between 1 and 100 of how many files to backup. If 5% of files are saved first, then 10% are saved 2nd, the same set of files are saved as if 10% were saved the first time. In other words, each run is not a true random sample. This allows testing of incremental backups.
The second form, --sample L,H gives a range of files to sample, where L and H must be 1-100. For example, --sample 1,10 samples the first 10% of files, --sample 11,20 samples the next 10%, and so on, with --sample 91,100 sampling the last 10%. This allows very large backups to be partitioned and done in parallel with multiple copies of HB, each using its own backup directory. In the future, this concept may be expanded to appear to the user as a unified backup. Thanks Ian!
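Both properties described above (a 5% sample is a subset of a 10% sample, and ranges partition the files) follow if each file is given a fixed bucket from 1 to 100. The sketch below assumes the bucket comes from a hash of the pathname. That is an illustration only, not HB's documented method, and the function names are hypothetical.

```python
import hashlib

def bucket(pathname):
    """Assign a stable bucket 1..100 to a pathname (assumed: SHA-256 based)."""
    digest = hashlib.sha256(pathname.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100 + 1

def in_sample(pathname, percent):
    """Form 1: --sample P keeps files whose bucket is <= P."""
    return bucket(pathname) <= percent

def in_range(pathname, low, high):
    """Form 2: --sample L,H keeps files whose bucket is in L..H."""
    return low <= bucket(pathname) <= high

files = [f"/data/file{i}" for i in range(1000)]
five = {f for f in files if in_sample(f, 5)}
ten = {f for f in files if in_sample(f, 10)}
print(five <= ten)  # the 5% set is contained in the 10% set
```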
In this initial release, local data is matched for entire files. This may be improved in future releases, for example, to allow restoring a large VM image to an earlier state using both local files and downloaded data. To compare restore plans without doing a restore, use get --plan with and without the --no-local options. Thanks Ben!
-v3 displays messages when local files are not used for restore. This can happen because the file is missing or the timestamp, size, or hash does not match the file being restored
get: before using a local file for restore, HB checks that the local file’s size and mtime match the backed-up file’s size and mtime. This is not a 100% guarantee (but maybe 99%) that the local file data is identical to the backed-up file data because mtime can be set to arbitrary values with the utime/utimes system calls. The new --no-mtime option causes HB to compute a strong file hash for local files and compare that to the backup file’s saved hash before believing mtime and using a local file.
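The check can be sketched in Python as a cheap size + mtime comparison followed by an optional hash comparison. The function name `local_file_matches` is hypothetical, and SHA-256 is a stand-in since the hash HB uses is not named here.

```python
import hashlib
import os

def local_file_matches(path, size, mtime, saved_hash=None):
    """Return True if a local file looks identical to the backed-up file.

    size and mtime are always compared; if saved_hash is given (the
    --no-mtime style check), the file contents are hashed as well.
    Assumption: SHA-256 hex digest stands in for HB's file hash.
    """
    try:
        st = os.stat(path)
    except OSError:
        return False  # missing or unreadable local file
    if st.st_size != size or int(st.st_mtime) != int(mtime):
        return False
    if saved_hash is None:
        return True
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest() == saved_hash
```

Hashing requires reading the whole local file, which is why it is an opt-in check rather than the default.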
get: if a restore is restarted, get will restart much faster and does not download data it has already restored if files are restored to the same location. This also uses the mtime + size check for identical files. The --no-mtime option will verify the file hash if there is a possibility mtime has been altered. This is unlikely, but may be necessary for extremely high security restores.
get: a new --no-local option disables the use of local files. This can be used to compare restores with/without using local files and also can be useful if local files might change during a restore. Changing local files could cause a restore using local files to fail with a hash error. If that happens, the restore can be repeated.
get: download information is now displayed with the --plan option. Previously the -i option (confirm restore) had to be used.
get: if get detects file errors during a restore, it adds a .hberror suffix to the restored filename to make it clear there is a problem. This is better than deleting the restored file because partial data is often better than none.
get: a poor database query could cause directory restores to be slow with --orig, especially on large backups.
rm: a bug in #2139, released a few weeks ago, could sometimes cause the same small files to get packed repeatedly
this release does an automatic database upgrade when any HB command is used. The upgrade prevents access by earlier HB releases since they would not recognize new variable block sizes.
backup: #2139, released about a week ago, fixed a traceback with too-small dedup table size but also enabled dedup even with -D0. If dedup was previously disabled, backup might show this warning:
Backup block size change, full backup: <pathname>
This occurred with versions #2139-#2146 because dedup was mistakenly enabled. It may occur again with this version because the bug is fixed and dedup is again disabled.
backup: HB splits files into either fixed or variable-sized blocks during backup. Smaller block sizes dedup better but generate more blocks so the block accounting overhead in hb.db is higher.
The -B option forces HB backup to split files into fixed-size blocks. Previously only a handful of block sizes were supported, doubling from 4K to 16M. Now any block size >= 128 bytes can be used for fixed-block backup. This is useful for proprietary file formats that may have an unusual fixed block size (a Prime minicomputer disk image uses 2080-byte blocks for example), and to backup huge files with large block sizes, reducing the amount of block-tracking overhead in hb.db and making it faster to remove files and plan restores. Large block sizes > 4MB are usually a bit slower for backup because there is less concurrency.
NOTE 1: for now there is a limit of 2GB on the block size. Ensure there is enough RAM for huge block sizes. A reasonable estimate is (n+2)*blocksize, where n is the number of CPUs (or -p option). Using -p0 to disable multi-threading will use less memory and less CPU but is slower.
NOTE 2: blocks are never split across arc files. If the block size is greater than config option arc-size-limit, arc files may be larger than expected.
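The RAM estimate from NOTE 1, (n+2)*blocksize, is easy to check before choosing a very large block size. A minimal sketch with a hypothetical function name:

```python
def estimated_ram(block_size, workers):
    """Estimate RAM in bytes as (n + 2) * blocksize, per NOTE 1.

    workers is the number of CPUs or the -p value.
    """
    return (workers + 2) * block_size

MIB = 1024 * 1024
print(estimated_ram(64 * MIB, 4) // MIB, "MiB for 64 MiB blocks on 4 CPUs")
```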
config: if HB decides to backup a file with variable-sized blocks, it previously used a variable block size of 32K, with an average size of 48K. Now this is configurable with the config option block-size. The possible values are: 16K 32K 64K 128K 256K 512K 1M.
When the block size is increased, dedup becomes less effective. The advantage of larger blocks is that there is less block tracking metadata: hb.db and the dedup table are both smaller. Backup may run slightly faster with larger variable block sizes, though not always and not by a lot - maybe 8%.
Larger blocks are good for backing up lots of huge files, or if there is a concern of exceeding 4 billion 48K blocks (211TB). The dedup table currently is limited to 4B blocks for efficiency (though that limit is easy to change). Using -V64K would double the backup size limit to 422TB, -V128K doubles it again to 844TB, and the max of -V1M would have a limit of 6.7PB.
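The size limits quoted above can be reproduced with a little arithmetic. This sketch assumes "4 billion" means 2^32 blocks and that the average block is 1.5 times the nominal size (48K for the 32K setting), which matches the figures in the text.

```python
MAX_BLOCKS = 2 ** 32  # assumed meaning of "4 billion blocks"

def backup_size_limit(nominal_block_size):
    """Approximate backup size limit in bytes for a variable block size.

    Assumption: average block size = 1.5 * nominal size.
    """
    average = nominal_block_size * 3 // 2
    return MAX_BLOCKS * average

K = 1024
for label, size in [("32K", 32 * K), ("64K", 64 * K), ("128K", 128 * K), ("1M", 1024 * K)]:
    print(label, round(backup_size_limit(size) / 1e12, 1), "TB")
```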
Different block sizes can be used with the same backup directory by using several backup commands with different block sizes.
NOTE 1: if the default block size is changed, the next backup will save changed files without dedup if the new block size doesn’t match the file’s previous block size. A warning is displayed.
NOTE 2: If you are already using -B, changing this config option will have no effect.
backup: a new -V option specifies the variable block size. This overrides the block-size config option for a single backup.
config: the block-size-ext config option sets the backup block size for files based on the file extension. This now accepts both -B and -V. For example, the value:
-B4m .mov .avi -V64K .sql -B23K .xyz
sets a large fixed block size for large video files that will not dedup well (except if the file is duplicated), a 64K variable block size for .sql files, and a 23K fixed block size for .xyz files.
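To make the value's structure concrete, here is a Python sketch that parses it into a per-extension table. The function name and the returned layout are hypothetical, and suffixes are treated as multiples of 1024 since this option is a block size.

```python
import re

MULT = {"": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}

def parse_block_size_ext(value):
    """Parse '-B4m .mov .avi -V64K .sql' into {ext: (kind, bytes)}.

    Each -B (fixed) or -V (variable) option applies to the extensions
    that follow it, until the next option.
    """
    rules = {}
    current = None
    for token in value.split():
        m = re.fullmatch(r"-([BV])(\d+)([KMG]?)", token, re.IGNORECASE)
        if m:
            kind = "fixed" if m.group(1).upper() == "B" else "variable"
            current = (kind, int(m.group(2)) * MULT[m.group(3).upper()])
        elif current is None:
            raise ValueError(f"extension {token!r} appears before any -B or -V")
        else:
            rules[token] = current
    return rules

print(parse_block_size_ext("-B4m .mov .avi -V64K .sql -B23K .xyz"))
```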
get: when cache-size-limit is set, some archive files must be downloaded for a restore. Creating a plan for the optimum download of remote data is rather complex and can be on the slow side, especially for a very large restore with a lot of small blocks, for example, a large VM image saved with 4K blocks. In this release, HB saves restore plans so that if there is a problem in the restore and it has to be tried again, the restore plan can be read in a few seconds rather than computed from scratch. Keep in mind that a restore plan is dependent on the files being restored, so a plan can only be reused if the exact same files are being restored.
get: a new --plan option creates a restore plan for a list of files but does not actually restore anything. The restore plan is saved in the backup directory. Using get with --plan can be done after the daily backup for example, so that if a restore is needed, the plan is already available. --plan is especially useful for very large restores as in a disaster recovery situation or complete restore of a very large VM image saved with a small block size. Another advantage is that reading a saved plan uses around 40% less RAM than creating the plan from scratch.
get: a new option --no-sd disables selective downloads during a restore, ie, entire arc files are downloaded. Selective download (SD) allows HashBackup to read parts of arc files and is very useful for small restores. SD also minimizes the amount of data downloaded and the local cache required for a restore. However, HB needs more time and RAM to compute a restore plan with SD. If you have fast and cheap access to remote archives and plenty of local disk space for the cache, it may be faster to use --no-sd to disable the more complex selective download plan.
rm/retain: the Downloaded: figure, showing how many bytes were downloaded for packing, was wrong if all archives were local. It was loosely tied to the number of new archives created, so it was usually a very small value like 2 when it should have been 0. Now it is 0 in this case, and the Downloaded: message isn’t displayed.
rm/retain: added a message when retain is scanning files and when packing is started.
config: when an error occurred, such as trying to assign a value 'abc' to cache-size-limit, the error message was sometimes vague:
int value is required
Now it is much more detailed:
Error converting config keyword cache-size-limit: could not convert string to float: abc
rm: recent changes to the packing algorithm require knowing how much will have to be downloaded to estimate the cost of packing. This estimate was wrong for destinations that don’t support selective download (rclone and rsync).
retain: fix traceback that could occur when not packing:
TypeError: 'NoneType' object is not iterable
Thanks to the automated email bug reporter.
previously, when HB displayed a number with a suffix like KB or MB, it meant *1000 or *1000000. If a file was 123456 bytes, HB would display the size as 123K. But when a suffix was typed in, either as a config option, command line argument, or dest.conf keyword, HB used the older, computer geeky multiplier of 1024, so 123K meant 123*1024 or 125952 bytes (also called 123KiB). If displayed, that same number would be shown as 125KB, not 123KB.
Obviously that was confusing, so now KB, MB, etc all mean *1000 for both input and output, with these exceptions that remain *1024:
the -B backup command line option; this is a disk block size so -B4K still means 4096 bytes (usually the size of a disk block)
the block-size-ext config option, for the same reason
the -D backup command line option; this is the size of the dedup table. If the units were changed, it would cause large dedup tables to rebuild. And since it is specifying an amount of RAM to use, *1024 makes more sense.
the dedup-mem config option, for the same reason
Be aware that if you previously had an arc-size-limit of 100MB (the default), that meant 100*1024*1024 or 104857600 bytes. That’s why HB often displayed arc file sizes as 104MB. Now the limit will change to 100000000 bytes, so new arc files might display as 99MB.
If you set a pack-download-limit of 100MB, you may get warnings on old arc files because they are larger than the download limit. To avoid this, set the download limit to 105-110MB. Or, the error might go away on its own as more space is freed up in the arc file.
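The difference between the two multipliers is easy to see with a few lines of Python. The helper name `to_bytes` is hypothetical and only handles the simple suffix forms used in the examples above.

```python
def to_bytes(text, base):
    """Convert '123K' or '100MB' to bytes using base 1000 or 1024."""
    powers = {"K": 1, "M": 2, "G": 3, "T": 4}
    text = text.upper().rstrip("B")
    if text and text[-1] in powers:
        return int(text[:-1]) * base ** powers[text[-1]]
    return int(text)

print(to_bytes("123K", 1024))    # old input meaning
print(to_bytes("123K", 1000))    # new input meaning
print(to_bytes("100MB", 1024))   # old arc-size-limit default in bytes
print(to_bytes("100MB", 1000))   # new arc-size-limit default in bytes
```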
if the B2 application key is restricted to a specific bucket but that bucket no longer exists, an informative error is displayed rather than a 401 error
the bucket keyword in dest.conf is optional if the B2 application key is restricted to a bucket
if the B2 application key is restricted to a specific bucket and that bucket isn’t the one listed in dest.conf, an informative error is displayed instead of a 401
if a B2 application key is restricted to a specific prefix and the dir keyword is not present in dest.conf, the B2 prefix is used as the dir keyword
if a B2 application key is restricted to a specific prefix, it must be an initial substring of the "dir" keyword in dest.conf. For example, if the B2 prefix is foo (this will give a warning about a missing slash), the dir keyword must start with foo, or no accesses will work. Previously this would cause a 401 error, but now an informative error is displayed
if a bucket doesn’t have lifecycle rules, HashBackup tries to set a rule "keep only the latest version". Previously it was a fatal error if this failed, but now HB first verifies that it has permission to set rules before even trying
if a B2 application key doesn’t have deleteFiles permission, HB will hide files instead of deleting them. Note: un-hiding files in the future requires deleteFiles permission.
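The two prefix rules above (a missing dir keyword defaults to the key's prefix, and otherwise the prefix must be an initial substring of dir) can be sketched as a small check. The function name `effective_dir` and the error text are hypothetical, not HB's.

```python
def effective_dir(b2_prefix, dir_keyword=None):
    """Return the dir value to use with a prefix-restricted B2 key.

    If dir is absent, the B2 prefix is used. Otherwise dir must start
    with the prefix, or no accesses will work.
    """
    if dir_keyword is None:
        return b2_prefix
    if not dir_keyword.startswith(b2_prefix):
        raise ValueError(
            f"dir {dir_keyword!r} must start with key prefix {b2_prefix!r}"
        )
    return dir_keyword

print(effective_dir("foo/"))
print(effective_dir("foo/", "foo/hostname"))
```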
B2: there is always concern about a backup being deleted, encrypted, or corrupted by an attacker, ex-employee, or by accident. Several new features address this concern with the B2 storage service:
a B2 application key should be used that does not have the deleteFiles permission. Then even if the B2 credentials are compromised, an attacker cannot delete backup files.
the lifecycle rules on the B2 bucket should be set to preserve a number of days of history - enough to make sure that a backup problem is noticed before the history is deleted by the B2 service.
to revert to a previous version of the backup:
use the B2 web site to delete the history of the DESTID and dest.db files back to the desired recovery point
create a new local backup directory, with key.conf and dest.conf copied from a safe copy
run hb recover -c newbackupdir to recover the local backup directory
you can now continue regular backups and restores
While your backup has been recovered, it is in an unusual state because the current version of HB files are still hidden on the B2 web site. This needs to be addressed in a future update by rolling back the entire B2 history to the recovery point.
config: a new config option pack-download-limit limits the amount of data downloaded for packing in a single run of rm or retain. The default is 950MB. Several users asked for this option, especially B2 users since Backblaze gives a 1GB free daily download allowance. Thanks Vincent!
NOTE: in HB config, MB means MiB or 1024*1024, so 950MB is really 950 * 1024 * 1024 = 996147200 bytes. The limit can be set to 1000000000 for exactly 1 billion bytes, but there should be a little slop in the download limit to account for downloading DESTID (32 bytes), etc. (Update: since #2143, config inputs are multiplied by 1000, so this is no longer true.)
The pack-remote-archives True/False config option has been removed. To prevent all downloading of remote arc files for packing, set pack-download-limit to 0. For "unlimited" downloading for packing, set pack-download-limit to a high limit like 1TB.
Use hb config -c backupdir pack-download-limit 5GB to set the limit to 5GB. Packing of remote arc files stops when the download limit is reached and will continue on the next run of rm or retain.
It is recommended that pack-download-limit is not set to zero since over time, this can cause slower restore times as "holes" are created in older remote arc files by rm and retain. Instead, raise pack-percent-free to something like 95, meaning an arc file must be 95% free before it is packed. This will prevent nearly all downloading except very small arc files (less than pack-combine-min) and very inefficient arc files.
rm & retain: the packing algorithm has been improved and now packs the "best" arc files first, taking into account both the amount of data deleted in the arc file and the download bandwidth required to retrieve the file. It also combines arc files more often while packing, reducing the total number of arc files and improving restore times by avoiding tiny retrieval requests to remote backup storage.
rm & retain: during packing, rm & retain combine very small arc files (less than pack-combine-min, default 1MB) into larger arc files. When this was first implemented, HashBackup didn’t support selective download and had to download entire arc files to retrieve small blocks of data. The config setting pack-combine-max was used to limit how large the new combined arc files could be.
Now that HashBackup supports selective download, it is not as important to limit the size of a combined arc file and the pack-combine-max config option has been retired. Instead, the arc-size-limit config option controls the maximum size of a combined arc file as it does with backups.
previously the minimum local arc cache (cache-size-limit) was:
2 * (arc-size-limit + 10MB)
or roughly, 2 arc files. But if arc-size-limit was set higher in the past, say 1GB, arc files were created, and then arc-size-limit is set lower to 100MB, this could cause problems during remote packing if cache-size-limit was set very low because the larger arc files would not fit into the cache. Now instead, the minimum cache size is:
2 * max(arc-size-limit + 10MB, largest existing arc)
This might require a bit more local cache space if the backup contains arc files larger than arc-size-limit.
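The new minimum can be computed directly from the formula above. In this sketch all sizes are in bytes, and decimal megabytes are used for the constants purely for readability, which is an assumption.

```python
MB = 1000 * 1000  # assumed unit for readability

def min_cache_size(arc_size_limit, largest_existing_arc=0):
    """Minimum cache-size-limit: 2 * max(arc-size-limit + 10MB, largest existing arc)."""
    return 2 * max(arc_size_limit + 10 * MB, largest_existing_arc)

# arc-size-limit lowered to 100MB while a 1GB arc file still exists
print(min_cache_size(100 * MB, 1000 * MB) // MB, "MB")
# no oversized arc files: same result as the old formula
print(min_cache_size(100 * MB) // MB, "MB")
```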
retain: --dryrun could sometimes modify the database slightly, causing a database upload. It also could pack arc files, which should not happen with --dryrun.
rclone.py: if -v or --verbose was used with --args, rclone failed because the rclone.py script also adds -q and the two can’t be used together. Now rclone.py doesn’t add -q if either -v or --verbose is used. Thanks Chris!
NOTE: get rclone.py from http://www.hashbackup.com/destinations/rclone
could cause rm/retain to hang while trying to download the arc files to pack. Running another backup (or dest sync) would usually correct the problem, but it’s now fixed in rm/retain. Thanks Frank!
stats: the "file bytes currently stored" statistic could become negative if hard-linked or sparse files are removed from the backup. This statistic is no longer displayed until it can be corrected. Thanks Frank!
mount: when reading from the backup mount via a destination that doesn’t support selective download (like rclone or rsync), entire arc files have to be downloaded. There was a race condition causing the same arc file to be downloaded more than once. If the destination had multiple workers, they could "step on each other" while trying to download the same file, causing various errors. Thanks Ben!
S3: both IPv4 and IPv6 S3 endpoints are supported, as well as some new S3 regions: ap-northeast-3, cn-northwest-1, and eu-north-1.
dest verify: on systems with small RAM and swap (the test system had 512MB of RAM and no swap), a dest verify command with an rclone destination could display an endless loop of errors like this, where "drop" is the destination name:
drop[26482]: error reading ls output from <backupdir>/drop.lsout.tmp:
drop[26489]: error reading ls output from <backupdir>/drop.lsout.tmp:
drop[26482]: error reading ls output from <backupdir>/drop.lsout.tmp:
drop[26489]: error reading ls output from <backupdir>/drop.lsout.tmp:
drop[26482]: error reading ls output from <backupdir>/drop.lsout.tmp:
The problem was an inability to allocate 1GB of RAM for a read buffer that didn’t need to be nearly that large.
Box does support FTP access for enterprise accounts, but they say in bold type that they do not recommend using this on a regular basis.
If you have been using HashBackup with Box, you may want to migrate your backup to another storage service or reconfigure dest.conf to use the rclone destination. HB does not support the native Box API, so after the deadline, the only way to access Box storage would be to use HB’s rclone destination.
See https://rclone.org/box to configure rclone for Box.
See http://www.hashbackup.com/destinations/rclone to configure HashBackup with rclone.
dest.conf: if the destname keyword had no value, the error AttributeError: 'NoneType' object has no attribute 'lower' was displayed. A better error message is now displayed: Destination name required at line x in dest.conf. Thanks to the automated email bug reporter.
backup: when doing a multi-threaded backup, specific I/O error codes were handled correctly, but unexpected I/O errors caused the backup to abort. NTFS-3g on Linux sometimes raises error 75 "Value too large for defined data type" on compressed NTFS files that have garbage at the end, causing HB to abort. A Linux "cp" command gets the same error, and it happened with multiple files, so this is likely a bug in NTFS-3g.
Now when a read error occurs, an error message is displayed for that file, the file is flagged as partially backed up, and the backup continues. The next backup will again try to save the file, probably get the same error, etc. To permanently correct the NTFS file you can make a copy and rename it to the original though it’s not clear whether this results in data loss.
Backblaze B2 recently launched application keys, a feature that allows one B2 account to have multiple keys, each with its own permissions. Previously a B2 account had only an account id and master key, which HB supported with the accountid and appkey keywords in dest.conf.
Unfortunately, calling these new keys "application keys" is a bit confusing since that is also used earlier to describe the master application key.
In dest.conf, B2 storage accounts can now be accessed in two ways:
Use the accountid keyword with the B2 account id, and the master application key with the appkey keyword. The account id is a 12-digit hex string.
Use the B2 website or B2 Python utility to create an application key and set the permissions (capabilities) of the key. You will get back a key id and application key string. Use the keyid and appkey keywords in dest.conf with these values. The keyid is a 25-digit hex string. The appkey is 31 symbols that (for now) seem to start with K.
IMPORTANT: when restricting a B2 application key to a certain prefix for use with HashBackup, make sure that the prefix you use when creating the application key has a trailing slash. If a trailing slash is not used, then for example, a prefix of "a" would allow access to any files or pseudo-directories in the bucket that begin with the letter a, but you may have intended to restrict to a single pseudo-directory named "a". In this case HashBackup will display a warning:
b2(b2): warning: B2 restricted key prefix matches dir keyword but without trailing slash: (prefix)
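The trailing-slash issue comes down to plain string-prefix matching on file names. The sketch below models that with str.startswith; the file names and the allowed() helper are made up for illustration and are not part of B2 or HashBackup.

```python
def allowed(key_prefix, file_name):
    # A restricted key grants access to any name that starts with its prefix.
    return file_name.startswith(key_prefix)

# Hypothetical bucket contents
names = ["a/arc.0.0", "archive/arc.0.0", "abc.txt"]

without_slash = [n for n in names if allowed("a", n)]
with_slash = [n for n in names if allowed("a/", n)]

print(without_slash)  # every name beginning with the letter a
print(with_slash)     # only names inside the pseudo-directory a/
```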
B2: when the debug keyword is used, HashBackup now dumps the JSON response on every request for easier troubleshooting. Previously it only dumped the JSON response in some situations.
the ssh destination does one connect per worker instead of one connect per file transfer. With a local ssh server (3ms ping time), this doubles the speed of remove and short file transfers since connection overhead is eliminated. With remote ssh servers, the speed improvement is even more visible because remote ssh connections can sometimes take up to a few seconds to finish. Thanks Tadas!
rm: in yesterday’s version, rm wasn’t using the correct cache size if cache-size-limit was less than 2 * arc-size-limit, eg, zero. This limited combining arc files more than necessary.
ssh/sftp: previously, the ssh destination could have a type keyword (in dest.conf) of either ssh or sftp, and they were exactly the same: both used only the sftp command to talk to the remote storage server.
In this release, a type of sftp is the same as before, using only sftp. This is sometimes desirable for security purposes, if the ssh server is configured to only allow certain commands like sftp.
A type of ssh now has 2 new features not supported with sftp:
files are sent using the local scp command instead of sftp, so upload bandwidth can be limited with the rate keyword in dest.conf. The sftp command does not support upload rate limiting. scp may also be slightly faster because it doesn’t do buffer acking.
selective download is supported, allowing HB to download only the data it needs from remote arc files instead of downloading entire arc files. This is a much more efficient use of network bandwidth and local cache space, especially when large arc files are used and a restore is comparatively small.
backup: a new config option, block-size-ext, sets the backup block size for specific filename extensions (suffixes). For example:
hb config -c backupdir block-size-ext '-B4M mov,avi -B1M mp3'
Commas and dots are optional. Thanks Jacob!
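To make the option format concrete, here is one way such a value could be read: each -B token sets the block size for the extensions that follow it. The grammar is inferred only from the example above, and parse_block_size_ext is a hypothetical helper, not HashBackup's actual parser.

```python
import re

def parse_block_size_ext(value):
    """Map each extension to the block size of the preceding -B token."""
    mapping = {}
    current = None
    # Extensions may be separated by commas and/or whitespace.
    for token in re.split(r"[,\s]+", value.strip()):
        if not token:
            continue
        if token.startswith("-B"):
            current = token[2:]
        elif current is None:
            raise ValueError("extension %r appears before any -B option" % token)
        else:
            mapping[token.lstrip(".")] = current  # leading dot is optional
    return mapping

print(parse_block_size_ext("-B4M mov,avi -B1M mp3"))
print(parse_block_size_ext("-B4M .mov .avi -B1M .mp3"))
```

Both calls produce the same mapping, which is what "commas and dots are optional" suggests.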
the disk full check that was recently removed (Feb) is back, but with a slight twist so that it doesn’t interfere with draining the cache like it formerly did. HashBackup will halt when a new arc file is created if there is less than 2 x arc-size-limit bytes available in the local backup directory. This allows enough room for 2 arc files: one transmitting while another is created.
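The check described here amounts to comparing free disk space against twice arc-size-limit. A minimal sketch, assuming a hypothetical helper name and using Python's shutil.disk_usage to read free space:

```python
import shutil

def enough_room_for_new_arc(backup_dir, arc_size_limit):
    """True if the disk holding backup_dir has room for 2 arc files."""
    free = shutil.disk_usage(backup_dir).free
    return free >= 2 * arc_size_limit

# Example: check the current directory against a 100MB arc-size-limit
print(enough_room_for_new_arc(".", 100 * 1024 * 1024))
```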
get: if cache-size-limit is set, an arc file needed for a restore is not local, the arc file is >= 4GB, and the portion of the arc file needed for the restore is after 4GB, get would fail like this:
error: 'I' format requires 0 <= number <= 4294967295
The backup is fine; this was a bug in the planner. Thanks Jacob!
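The error text resembles what Python's struct module reports when a value does not fit in an unsigned 32-bit field. The sketch below reproduces that class of failure with an offset past 4GB. It is an illustration only: how the planner actually stored offsets is an assumption.

```python
import struct

GB = 1024 ** 3
offset = 4 * GB + 512  # a byte offset just past the 4GB mark

try:
    struct.pack("<I", offset)  # unsigned 32-bit: maximum is 4294967295
except struct.error as exc:
    print("32-bit field failed:", exc)

packed = struct.pack("<Q", offset)  # unsigned 64-bit holds the offset
print(struct.unpack("<Q", packed)[0] == offset)
```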
get: when cache-size-limit is set, get could sometimes leave a few arc files in the cache that should have been removed
dest.conf: if the workers keyword was used with no value it caused an uninformative traceback error:
TypeError: int() argument must be a string or a number, not 'NoneType'
Now it gives an informative error message:
int() argument must be a string or a number, not 'NoneType'; Expected integer for dest.conf keyword: workers
This bug was reported by the automated email system.
config: backup sometimes will create arc files slightly over arc-size-limit. If a user sets arc-size-limit to 4GB, it’s very likely that at least one arc file will be slightly over 4GB. If any arc file is over 4GB and cache-size-limit is set, the restore planner needs twice as much memory to create a plan. To prevent this RAM doubling, HB will adjust arc-size-limit down slightly when arc-size-limit is 4GB so that arc files are never bigger than 4GB.
rm/retain: when arc files are packed to remove empty space, rm may combine multiple archives into one larger archive. Downloading multiple arc files may exceed cache-size-limit, causing rm to hang. Now rm will combine a smaller group of arc files that does fit into the cache rather than locking up. Thanks Ben!
config: pack-combine-min can be set to zero to disable combining small arc files into larger arc files.
if cache-size-limit was set to 10GB, the local backup directory contained 10GB of cached arc files, and the cache-size-limit was then lowered to 1GB, the next backup would leave the cache at 10GB for the duration of the backup and only trim it down to 1GB when the backup finished. This was an intentional design decision: if the cache is already at 10GB, why not use it all and trim later?
But this was confusing for users: if the local backup directory disk was nearly full (hence cache-size-limit was lowered), it’s reasonable to expect HB to lower the cache size as soon as possible rather than wait until a completed backup. The cache also was not trimmed down to the lower size if the backup was interrupted.
Now, HB will lower the cache size when the next backup starts rather than when it finishes. Of course, if there are arc files in the local backup directory that need to be sent to remotes, they will delay the cache trim.
if cache-size-limit was set, a new destination was added, and the new destination’s transfer rate was slower than the old destination’s transfer rate, backup (and sync) were not respecting the cache size limit. Arc files were downloaded as fast as possible from the source destination, possibly filling the backup directory’s disk. Now cache-size-limit is respected during a remote-to-remote copy. This bug was recently introduced when the cache handling was rewritten to allow caching arc files for mount.
backup: if cache-size-limit was set to 5GB then reduced to 2GB, backup failed to reduce the cache size to 2GB. Thanks Tadas!
selftest: if the local copy of an arc file is the wrong length, selftest is supposed to print some diagnostics. But a recent change added a length test at a lower level and caused this traceback, preventing selftest from executing:
Traceback (most recent call last):
  File "/hb.py", line 191, in <module>
  File "/selftest.py", line 469, in main
  File "/rm.py", line 134, in proginit
  File "/arcs.py", line 314, in init
Exception: /media/backup/arc.253.0 should be 103378368 bytes, is 101257216 bytes
This check has been removed so that selftest can display:
Checking arcs I
Error: arc.253.0 size mismatch on local file: db says 103378368, is 101257216
NOTE: arc.253.0 is correct size on <destination>
1 errors
Checked arcs I
backup: previously HB checked to see if there was room for at least 2 arc files before opening a new arc file and raised an error if the disk was nearly full:
Backup directory is nearly full, disk has only 177 MB available
This causes problems when the backup disk is nearly full and the cache has been resized down, so the test was removed. Now HB fails with a write error when the disk is full.
selftest: if an arc file is missing, display its size to aid troubleshooting, for example:
Error: arc.4.22 1121232 bytes is not local nor on any active destination
S3: HB uses the boto library to access S3. GCE (Google Compute Engine) VM environments have a special /etc/boto.cfg that loads Google-customized Python code into boto (and hence HB). This fails in various ways, often causing a seg fault or hang. Now, HB only loads boto.cfg from the user’s home directory. Normally it will not exist, but if you want to make changes to boto config settings, you can create a boto.cfg file in your home directory. Thanks Jose!
B2: when a file isn’t found, display the B2 fileid, expected SHA1, and expected size to aid troubleshooting. Thanks William!
backup: if a path listed on the command line changed from a directory to a non-directory, it could cause this traceback:
Traceback (most recent call last):
  File "/hb.py", line 109, in <module>
  File "/backup.py", line 3077, in main
  File "/backup.py", line 773, in addallpaths
  File "/backup.py", line 2152, in backupobj
  File "/backup.py", line 362, in addlog
IndexError: No item with that key
The backup is fine, just re-run it. Thanks Jacob!
S3: if a bucket contains "subdirectory" backups that use the Dir keyword and also has a backup at the bucket’s root that does not use the Dir keyword, a dest verify on the root backup listed the entire bucket contents instead of just the root contents. It still worked, just not as efficiently.
DAV: on WebDAV destinations with copy-executable set to True (the default is False), the hb#xxxx executable was being copied as just "hb" because the pathname was not url quoted. Ie, only the latest version was stored on the remote. Related bug: when verify checked the hb#xxxx files, it would say they were all verified when actually there was only one "hb" file on the remote. Thanks Mehmet!
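The reason an unquoted name collapses to "hb" is that # starts the fragment part of a URL. The sketch below shows the effect with Python's urllib.parse; the host name and the version number in the file name are hypothetical.

```python
from urllib.parse import quote, urlsplit

name = "hb#2082"                         # hypothetical name in the hb#xxxx form
base = "http://dav.example.com/backup/"  # hypothetical WebDAV URL

unquoted = urlsplit(base + name)
quoted = urlsplit(base + quote(name))

print(unquoted.path, unquoted.fragment)  # /backup/hb 2082
print(quoted.path)                       # /backup/hb%232082
```

Without quoting, everything after the # is treated as a fragment, so the path sent to the server is just /backup/hb.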
dest verify: a bug was introduced in #2082: if more than one destination was configured only the last was being verified.
backup: when --maxwait or --maxtime were used and the wait time limit was exceeded, this message was displayed:
Warning: maxwait exceeded before all archives were copied
but then a traceback would occur: AttributeError: 'module' object has no attribute 'close'. The backup is fine, though one file is not sent to the remote and will be sent on the next backup or sync.
with newer versions of OSX, a traceback could occur when upgrading the database of an older backup:
Traceback (most recent call last):
  File "/hb.py", line 109, in <module>
  File "/backup.py", line 2595, in main
  File "/db.py", line 286, in opendb
  File "/db.py", line 556, in upgradedb
  File "/up29/dbup.py", line 61, in upgrade
  File "/up29/db.py", line 149, in opendb
  File "/up29/misc.py", line 409, in maxmem
KeyError: 'pages_free'
S3: HashBackup uses multipart upload to allow multiple worker threads to upload a single large file for better performance. This creates temporary files on S3 during the upload and S3 deletes them when the upload finishes. However, if the upload does not complete for some reason, these temporary files stay in your S3 account, you get charged for them, they don’t ever go away, and they don’t show up anywhere - not even in a listing of your S3 bucket.
To keep your costs down, HashBackup cleans up these temporary files the next time backup runs. But when multiple backups are stored in one S3 bucket using the Dir keyword and are run concurrently, a backup going to dir 1 could mistakenly abort an active multi-part upload to dir 2. Now HB is more careful about cleaning up only multipart uploads belonging to the current backup. Thanks Scott!
S3: if Dir / was used in dest.conf, meaning to put files at the bucket’s top-level as if there were no Dir keyword, it worked for backup but dest verify would fail: it said files didn’t exist and re-uploaded all of the backup files.
ssh, ftp, rsync dest verify: previously files were verified one-by-one with these destinations, but are now verified as a group. With 8 workers and 5500 arc files, ssh verify was taking about 3 minutes over a local LAN connection, maxing out an 8-core box and causing a heavy load on the ssh server. Now the same verify takes 3 seconds. Rsync and ftp destinations have similar performance improvements. Backups on servers over "real" Internet connections will also take just a few seconds to verify now, whereas before, this same test with 5500 arc files could have easily taken 30 minutes.
ssh: removing files from ssh destinations is 2x faster
ssh, ftp: previously the ssh and ftp destinations would only create a single subdirectory from the Dir keyword because of limitations in those tools' built-in mkdir command. Now HB issues extra commands if necessary so that a full directory path can be created (a/b/c).
dest verify: when a backup is interrupted, some arc files may have been sent to destinations but are not "committed", ie, they’re not really part of the backup yet - this is normal. The dest verify command was verifying these uncommitted arc files, then removing them all. Nothing wrong with that, but it’s very confusing. Now the verify command doesn’t verify uncommitted arc files.
clear: if any destinations fail to start, the clear command warns that halted destinations won’t be cleared.
Happy New Year and good luck to everyone in 2018! Send an email if you have a great HashBackup success story to post on the Customer web page.
mount: prefetching is enabled again. This only has an effect when cache-size-limit is set, ie, backup data has to be downloaded.
On a small 512M single-CPU test VPS, S3 performance reading a 200MB file with dd using an HB mount (8 workers in dest.conf) increased from 2.1 MB/s to 35 MB/s - about 17x faster. Google Storage is about 11x faster (3.8 MB/s to 41 MB/s), Backblaze B2 is 2-3x faster (4 MB/s to 10 MB/s).
The mount prefetcher stays inactive on random reads to keep download costs low.
The mount -B option can be used to increase the block size when downloading. This usually increases performance but can have the potential negative effect of increasing download traffic and costs for random reads. Be sure to test whether -B increases performance and/or costs for your usage.
For B2 in particular, using -B4M increased mount sequential download performance from 10 MB/s (8 workers) to 22 MB/s (4 workers). Using -B32M increased performance even further, to 40 MB/s (4 workers). B2 has higher latency (requests take longer to process), so making each request larger is more beneficial for B2 than the other storage services. The downside of a large -B is it’s likely wasteful, slow and expensive to download 40 MB on every random read. Keep in mind that sequentially accessing a file deduped across many versions via an HB mount might actually cause a lot of random reads behind the scenes, so you don’t want -B too high.
S3: added Amazon’s new Paris region, eu-west-3, to HashBackup
mount: as retain removes old files from the backup, it’s possible that a version becomes completely empty. These empty versions should probably be deleted automatically by rm/retain. In the meantime mount will ignore them instead of having a lot of empty top-level directories in the mount.
mount: if a file existed in r2 but was deleted in r3, it was correctly missing from a r3 directory list. But a cat command from r3 would display the r2 file instead of giving a "No such file" error.
get: if a file was saved in r2 and deleted in r3, get -r2 gave an incorrect error that the path wasn’t in r2, then showed that it was in r2, like this:
Path is not in version 2: (pathname)
Versions of this file:
  2017-12-18 19:16:12 in version 1
  2017-12-18 19:16:43 in version 2-2
  2017-12-18 19:17:30 in version 4
The backup is fine, this was a get bug. The error is correct for get -r3, except version 2-2 should just say 2, and now it does.
rekey: -k ask and -k env now cause errors, because it’s very likely that -p ask or -p env is what you really want
a recent change could cause a "Backup directory is locked" error during db upgrades
mount: prefetching was added in the previous release but could hang in certain configurations so has been disabled for now. Use the mount -B option to improve mount performance by reading larger blocks.
selftest/rm/retain: for these commands selective download is optimized to lower cost rather than maximize performance
get/mount: the --cache option can be used when cache-size-limit is >= 0 (backup data is not local) to specify a different directory to be used for holding downloaded backup data instead of the backup directory. The --cache option cannot be used if cache-size-limit is -1 (the default) because no data is downloaded: it is referenced directly from the backup directory.
Use --cache when:
a) there is not enough room in the backup directory to hold a large cache. For example, the backup directory might be on a small SSD but a large cache is needed for a large restore, or you want to maintain a very large mount cache to improve performance.
b) you can’t or don’t want to lock the backup directory for a restore, either because a backup is running or you want concurrent restores
The option is --cache cachedir[,lock] where cachedir is the directory for the cache and ,lock is optional. If ,lock is omitted, the main backup directory is not locked. If ,lock is added, the backup directory is locked.
Adding ,lock allows arc files already in the backup directory to be used in the restore or mount without downloading again. If ,lock is not used, existing arc files can be used only if the cache directory is on the same disk as the backup directory. When the backup directory is not locked, arc files can disappear at any time and get & mount can’t handle that.
The cachedir directory is always locked. For concurrent restores or mounts, specify a unique cache directory for each, maybe using in the cache directory name ( is replaced by the process id).
If cachedir already exists, it will remain after running HB. If it doesn’t exist, HB will create it then delete it when finished.
mount: a new --cachesize option will expand the mount cache beyond the normal cache-size-limit, for example --cachesize 1.5G. The default cache size is the larger of cache-size-limit or 2 x (arc-size-limit + 10MB).
the span.<pid> temporary directory created for get & mount when cache-size-limit is set is now span.tmp and is deleted whenever the backup directory is locked if it was not previously deleted.
mount: when reading files from an HB mount, backup data is downloaded in the background before it is needed (readahead) to improve performance 3-4x in some cases. Keep in mind that restoring a large file with the get command is still 25% faster than copying it from an HB mount with dd, cp, or rsync, and restoring a directory of small files is 3x faster with get. Mount cannot predict which files will be accessed in the future so readahead does not help much with many small files.
backup: two larger block sizes -B8M and -B16M are available for special backup situations, like huge backups of huge files
get: a race condition could randomly cause this error message just before exiting:
Unhandled exception in thread started by
sys.excepthook is missing
lost sys.stderr
mount: selective download is now used for destinations that support it. For more details about selective download, see #2035’s changelog entry. Readahead is not implemented yet so performance will be slower for large sequential access. The mount command is easy and convenient, but for large restores when not all arc files are local (cache-size-limit is set), the get command will be faster.
mount: a new option -B specifies the minimum download block size. Higher values will increase throughput but may also increase latency and download fees because extra data may be downloaded that is not needed. The default is 64K, which tends to minimize download fees.
mount: previously mount downloaded whole arc files without respecting the limit on the size of the local arc cache, cache-size-limit. Now it will try to respect the limit though it will temporarily increase the limit if any single download is bigger than cache-size-limit. When the cache is full, the oldest items are discarded. If they are referenced again, they will be downloaded again, so it’s important that cache-size-limit is set to a reasonably large value if you use mount a lot.
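The cache behavior described above (discard the oldest items when full, and let one oversized download temporarily exceed the limit) can be sketched with an ordered dictionary. The class and its names are hypothetical, and it only tracks sizes rather than real data, so it shows the policy rather than HB's implementation.

```python
from collections import OrderedDict

class DownloadCache:
    """Size-limited cache that discards the oldest downloads first."""

    def __init__(self, limit):
        self.limit = limit
        self.items = OrderedDict()  # key -> size, oldest first
        self.used = 0

    def add(self, key, size):
        if key in self.items:
            return
        # Discard oldest entries until the new item fits. An item larger
        # than the limit is still stored alone (limit temporarily exceeded).
        while self.items and self.used + size > self.limit:
            _, old_size = self.items.popitem(last=False)
            self.used -= old_size
        self.items[key] = size
        self.used += size

cache = DownloadCache(limit=100)
for key, size in [("a", 40), ("b", 40), ("c", 40), ("big", 150)]:
    cache.add(key, size)
    print(key, list(cache.items), cache.used)
```

A discarded item that is referenced again simply has to be downloaded and added again, which is why a small limit leads to repeated downloads.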
get: selective download is now used during restores for destinations that support it: S3 & compatibles, B2, Rackspace Cloud Files / Open Stack, WebDAV, Dir, and FTP. Previously when HB needed data from a remote arc file for a restore, it downloaded the entire arc file then picked out the data it needed. With selective download HB only downloads the data it needs. This saves local restore cache space, decreases restore time, and decreases download data and fees.
Selective download is most noticeable on:
- backups with a lot of versions
- backups using large arc files
- files / directories with a high change rate
- files / directories with a lot of dedup between versions
- restoring just a handful of files / directories
IMPORTANT: expect an increase in the number of storage requests / operations during a restore. This extra cost is more than offset by the decrease in cost because less data is downloaded. The new restore will never cost more than the old restore.
EXAMPLE 1: an active HB dev directory that is 2 years old, with 4600 files (200MB), has daily changes saved in 315 versions using 1GB arc files on a USB2 drive on a 2010 Core 2 Duo system. The backup is packed regularly, so many of the arc files are smaller than 1GB. "cache" is the local disk space needed to perform the restore.
Restore with old version (#1983): download: 7.4 GB cache: 4.4 GB time: 3m 19s
These examples are "downloading" from a USB2 drive at 31MB/s, similar to a 250MBit/s Internet connection. Internet download rates are usually a lot slower, so selective download will make an even bigger difference with restores from cloud storage services.
During restores, a temporary cache directory spans.<pid> is created in the backup directory. HB deletes this after the restore.
This release is compatible with #1977 or later so you can run your own restore comparisons between versions of HB.
stress tests with 100K arc files revealed slow O(n^2) performance in several places:
the "Checking arcs I" phase of selftest. It was taking 4 hours and now takes 20 seconds - 720x faster.
creating a restore plan when cache-size-limit is >= 0: similar performance improvement
backup, rm, and retain when cache-size-limit is set and there are many arc files in the local cache. In a test backup with 20K local arc files and 100K remote arc files, a 2nd backup of one small file took 55 minutes and now takes 9 seconds.
creating 100K small arc files by saving a 2GB file with a 4k block size and cache-size-limit set to 0 took over 2 1/2 hours with #1983 before being killed. It took 11 minutes to finish with this release.
O(n^2) means n items need n times n (n squared) operations. With 1000 arc files, 1M operations are needed. It isn’t horrible for small n, but for larger n it gets ridiculous. 100K arc files is 100x more files, but required 10B operations - 10,000x slower.
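A common source of this kind of slowdown is searching a list once per item. The sketch below contrasts that pattern with a set lookup and checks the operation counts quoted above. It is a generic illustration with made-up arc file names; the changelog does not say how HB's code was actually changed.

```python
def missing_quadratic(wanted, have):
    # List membership scans 'have' for every wanted name: O(n^2) overall.
    have_list = list(have)
    return [name for name in wanted if name not in have_list]

def missing_linear(wanted, have):
    # A set gives constant-time lookups on average: O(n) overall.
    have_set = set(have)
    return [name for name in wanted if name not in have_set]

remote = ["arc.0.%d" % i for i in range(2000)]
local = remote[::2]  # pretend half of the arc files are cached locally

assert missing_quadratic(remote, local) == missing_linear(remote, local)

# Operation counts from the text: n * n for 1000 and 100K arc files
small, large = 1000 ** 2, 100000 ** 2
print(small, large, large // small)
```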
get: with -i, a Continue? question is asked after the restore plan is displayed but before any downloading starts. This gives a chance to review cache sizes to ensure the restore will succeed.
S3: always use the cacerts.crt file in the backup directory to validate SSL connections
get: when cache-size-limit is set, a restore error on file A could cause a restore error on file B if they shared a block and file A was the block’s first reference. Bug found in internal testing with forced errors.
selftest: if cache-size-limit is set, selftest -v5 <pathname> sometimes caused an error "list index out of range". It continued after that, but arc files were not prefetched. Also, instead of testing just the path specified, it tested the entire backup.
backup: btrfs, a newish Linux filesystem, can create snapshots. But all snapshots have a common inode number, so HB only backed up the main snapshot directory, not the directories containing data.
log: if an error occurred while trying to start the hb command being logged, this error was displayed instead of the actual error: LookupError: unknown encoding: string-escape. From the built-in traceback exception reporter.
OpenStack destinations require an authurl. If the authurl included a port number, HashBackup was not handling it properly and generated an error:
dest xxx: unable to start: DNS error getting IP address for api.nz-por-1.catalystcloud.io:8443: [gaierror] [Errno -2] Name or service not known
the dedup hash function was improved for dedup tables 10GB and up. There was a bias in the function causing lower positions of the hash table to be used more often than upper positions. Everything still worked correctly but it could affect dedup performance.
backup has always expanded the dedup table as it fills. Now it resizes the dedup table both up and down within the limit set by -D or dedup-mem.
backup: if the dedup table was missing, backup rebuilt it using a size just large enough for existing entries. This could cause another resize in a short time.
in #1846, it became a fatal error instead of a warning if a command could not be audited, but it was often hard to determine the reason the command couldn’t be audited. Now a traceback is printed.
The auditing error is back to a warning now because if rekey is interrupted, it says "re-run rekey" but then won’t let rekey run because of the auditing error. Impossible loop to escape. Thanks
when auditing was enabled and an error occurred, it could cause a 2nd traceback:
Traceback (most recent call last):
  File "hb.py", line 255, in <module>
cPickle.PicklingError: Can't pickle <type 'traceback'>: attribute lookup __builtin__.traceback failed
this release does an automatic database upgrade when any HB command is used. The upgrade prevents access by earlier HB releases since they do not recognize the encrypted backup keys created by the new readkey command (below). Most commands failed harmlessly and had the nice benefit of showing that backup keys actually were encrypted. But running an older version of selftest -v4 --fix on a readkey backup would have deleted every file because no data blocks would be readable.
readkey: this new command enables RSA public key encryption, also called asymmetric encryption, to provide write-only backups. When readkey is enabled, a new file readkey.conf is created in the backup directory. When readkey.conf is present, HashBackup acts as before, with no extra security, but does display this warning:
WARNING: readkey is enabled and readkey.conf is present
When readkey.conf is NOT present:
Readkey can be turned on or off at any time. A typical setup would be to enable readkey, copy key.conf, readkey.conf, and dest.conf to secure locations, then delete readkey.conf. Automated backups will still work but no data can be restored until readkey.conf is copied to the backup directory.
The default RSA key length, 2048 bits, is recommended by the NIST through the year 2030. The -b option can set a key length from 1024 to 4096 bits, but keep in mind longer keys make restores slower. For example, it takes .05 seconds to decrypt a key with RSA-2048 and .3 seconds (6x longer) with RSA-4096. This happens once per version during restores. On backups with many versions such as a 1-year old hourly backup (8700 versions), restores could take 45 minutes longer with 4096-bit keys but only 7 minutes longer with 2048-bit keys.
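The restore-time estimates follow directly from one key decryption per version. A quick check of the arithmetic, using only the figures quoted above (the function name is just for illustration):

```python
def extra_restore_minutes(versions, seconds_per_key):
    """Extra restore time from decrypting one key per version."""
    return versions * seconds_per_key / 60.0

versions = 8700  # about one year of hourly backups, per the text

rsa2048 = extra_restore_minutes(versions, 0.05)
rsa4096 = extra_restore_minutes(versions, 0.3)
print("RSA-2048: about %.0f minutes" % rsa2048)
print("RSA-4096: about %.0f minutes" % rsa4096)
```

This gives roughly 7 minutes for 2048-bit keys and a little under 45 minutes for 4096-bit keys, consistent with the rounded numbers in the text.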
The -p ask/env option adds a passphrase to the readkey. For -p env, the readkey passphrase environment variable is HBREADPASS.
The readkey command with no on/off command displays the current readkey status and if readkey.conf is present, verifies the key. See the web site for more details. Thanks Tobias!
IMPORTANT:
a new readkey.conf is generated every time readkey is enabled so it’s important to copy the new readkey.conf to secure locations
readkey.conf is not deleted when readkey is disabled in case there is a copy of the backup somewhere with readkey enabled
readkey does not prevent deleting backups since this requires cooperation from remote storage servers. Use hb config options admin-password, enable-commands and disable-commands, and hb dest load to mitigate this risk.
backups created with HB#655 or lower (before Sep 10, 2012) do not gain extra protection because these early versions used convergent encryption keys rather than random session keys to encrypt data.
get: when cache-size-limit is set, the get command has to download arc files to do a restore. A large restore might require more disk space than is available in the backup directory. For example, the backup directory might be on a small SSD with cache-size-limit set to 10G, but the backup itself could be many terabytes, and restoring it could require downloading hundreds of GB of arc files that won’t fit in the backup directory.
A new get option --cache specifies another directory to use as temp space for doing the restore, and can be on a different disk with more space than the backup directory. If the directory doesn’t exist, it is created and will be deleted after the restore. If it already exists and contains files from the same backup, they will be used again without needing another download, so create the directory beforehand if you want to keep the cache. Thanks Jacob!
init, export: -p ask only asked for the passphrase once, which could lead to problems if there was a typo. Now it asks twice and verifies they match. Thanks Tobias!
clear: if dest.conf is loaded into the database it survives clear
backup: after the last file has been saved (inex.conf), an error could occur when trying to cd back to the starting directory on Ubuntu 16.04:
OSError: [Errno 2] No such file or directory: '(unreachable)/
The cd was removed since it isn’t necessary. Thanks Auke!
backup: previously when a filesystem was copied, eg to a larger disk, SSD, or new system, HashBackup did a good job of not creating new backup data if dedup was enabled, but the database size could increase quite a bit. Now this is handled more efficiently if pathnames stay the same and file attributes are copied, for example:
filesystem is copied with cp -rp
filesystem is copied with rsync -a
filesystem is copied with a cloning utility
filesystem or directory is restored with HashBackup
Thanks Daniele!
S3: HB’s S3 interface has not used SSL because the S3 protocol is resistant to attacks and the data is encrypted. But some customers want to use SSL anyway, so the "secure" keyword has been added as an option to S3 destinations. It takes a true/false value or if used without a value it enables SSL. Thanks Alex!
ls: handles leading . in the pathname to list:
hb ls .        list the current directory, like ls `pwd`
hb ls ./       same as ls .
hb ls './*'    list the current directory recursively (note quotes!)
hb ls ./xyz    list `pwd`/xyz
hb ls xyz      list any pathname with xyz as component
Thanks Kriston!
get: using ./filename or a simple filename adds the current directory to construct a full pathname, ie, `pwd`/filename
rm: using ./filename means `pwd`/filename, like get. A simple filename is not supported for rm because removing files from the backup is not reversible, so there is more danger of a mistake.
log: a traceback occurred when used with clear because clear deletes the logs. Now it’s an error to use hb log clear. From email traceback.
log: the log command is designed for use with cron jobs but can be used when typing HB commands on the keyboard. With a keyboard, if a question was asked, eg when clear asks "Are you sure?", the question was never displayed and HB seemed hung. This works now, though in the log file, the question is combined with the next line of output and the response isn’t logged. Not perfect, but better.
this release does an automatic database upgrade when any HB command is used, to correct statistics for incomplete backups. The upgrade may take a few minutes, depending on the number of incomplete backups and the number of files they contain.
backup, stats: if a backup was interrupted, it was not included in some statistics, but was included in others. This could lead to confusing and misleading numbers from hb stats. Thanks Jacob!
dest clear is supposed to refuse to clear a destination if it contains the only copy of a file. But it was clearing it anyway because of a subtle bug that’s now fixed. Thanks Jacob!
if a destination is supposed to have a file but it "goes missing", ie, is deleted outside HB, then when HB tries to fetch the file it creates a zero-length file in the backup directory. But a subsequent run of HB got confused by this zero-length file. Now it is deleted before it causes confusion. Thanks Jacob!
the retry keyword in dest.conf controls how often HB retries errors. It is up to 3 integers: # of retries, initial delay, delay factor. This is called "exponential backoff", where each retry waits longer than the previous attempt. The defaults were 2,5,2 so:
try the first time
wait 5 seconds
first retry
wait 10 seconds
2nd and final retry (15 seconds total)
Most customers want their backup to "just work", and since backup is usually run at night, would prefer it ran a little longer when a storage service is having problems rather than bomb out after only 15 seconds of downtime / retries.
The second problem is that exponential backoff works fine with small numbers, but causes ridiculous delays fairly soon. For example, after 14 retries, the retry delay would be 1 day, then 2 days, etc.
retry never delays more than 20 minutes, so if you use retry 10 in dest.conf, the first 8 retries will take 20 minutes, then the next 2 will each delay 20 minutes. The effect is that after the 8th retry, HB will retry every 20 minutes.
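The retry schedule described above can be sketched as follows. This assumes an initial delay of 5 seconds and a delay factor of 2 (the old defaults quoted above; the new defaults are not stated here) and treats the 20-minute limit as a cap on each individual wait. The function name is hypothetical and not part of HB:

```python
def retry_delays(retries, initial=5, factor=2, cap=None):
    """Seconds to wait before each retry; `cap` limits any single wait."""
    delays = []
    delay = initial
    for _ in range(retries):
        delays.append(delay if cap is None else min(delay, cap))
        delay *= factor
    return delays


old_default = retry_delays(2)            # uncapped: waits of 5 and 10 seconds
capped = retry_delays(10, cap=20 * 60)   # "retry 10" with the 20-minute cap

print(old_default, sum(old_default))
print(capped)
print(sum(capped[:8]) / 60, "minutes for the first 8 retries")
```

Under these assumptions the first 8 waits add up to 1275 seconds (about 21 minutes) and every later wait is 1200 seconds. Without the cap, the 15th wait alone would be 81920 seconds, close to a full day.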
These changes make it easier to figure out the total retry delay. If you want to keep retrying for 5 hours, it would be:
upgrade: before downloading and installing a new version, upgrade asks for confirmation. When run from a script or cron job where there is no keyboard, upgrade cannot get confirmation and halted with an error and traceback. Now it requires --force in this situation to avoid the traceback.
S3: the port keyword on an S3 destination was not being used, so attempts to use an S3-compatible service like Minio running on other than port 80 would result in an error:
unable to start: [gaierror] [Errno 8] nodename nor servname provided, or not known
S3: Amazon S3 uses bucket names as part of the host name, but S3-compatibles often don’t support that since it requires DNS support. A new true/false keyword "subdomain" can be added to control whether bucket names are added to the host name (subdomain true, the default) or the bucket name is added to the URL (subdomain false). Thanks Alex!
the log command has been updated and documented. This command is used as a prefix to other commands in cron jobs to timestamp and log HB output. For example: hb log backup -c backupdir … It also will summarize and archive log files. See the Log web page under Commands for details.
backup: if kill -TERM (or kill -15) is used to kill a backup, it will finish the current file and then end normally. It makes a checkpoint where it stopped and if --maxtime is used on the next backup with the same pathnames, it will restart where it left off.
small arc files are combined into larger arc files by the retain and rm commands. The settings were fixed at 1MB and 5MB, meaning arc files under 1MB were combined until they were at least 5MB. These two settings are now config options, pack-combine-min (default 1MB) and pack-combine-max (default 5MB). For the more daring, these can be used to repack an entire backup into larger arc files.
destination info is on www.hashbackup.com under Destinations and has been removed from doc/dest.conf.examples
if a destination is turned off or fails, the next backup would sometimes send a complete set of hb.db.N files to all destinations. Now this update is much smaller. This is useful for some users who regularly turn off or have missing destinations, for example, rotating 2 USB backup drives, or backing up to USB during the day and enabling offsite backup only at night. Thanks Jacob!
Unix allows // in pathnames, treating it as /. Now all HB commands do this too. Thanks Lukasz!
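A minimal sketch of this kind of normalization, collapsing runs of slashes into one. This is an illustration only, not HB's code; note that POSIX leaves the meaning of exactly two leading slashes implementation-defined, and this sketch collapses them as well, matching the behavior described above:

```python
import re


def collapse_slashes(path):
    """Replace every run of two or more '/' characters with a single '/'."""
    return re.sub(r"/{2,}", "/", path)


print(collapse_slashes("/home//user///file.txt"))
```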
On a backup with multiple destinations, one of which was not available, a get command would fail with this error even though other destinations contained the file(s) needed:
Unable to download archive arc.0.0: Exception('destinations halted',)
Now when onfail ignore is used on destinations that may not be available, like external USB drives, the restore will complete using the other destinations. Thanks Jacob!
If you had a remote destination, added a new destination after April 15th, and have a limited local cache (cache-size-limit >= 0), please read the following. Otherwise, it does not apply to you.
Selective download is a feature of B2, S3 & compatibles, FTP, WebDAV, Cloudfile, and Dir destinations since April 15th. Internal testing has found an arc file corruption problem that can occur during destination to destination copies. This is when you have destinations, don't have a local copy of the backup, add a new destination, and HB must copy data from an old destination to the new destination to populate it. Here are the necessary conditions for this problem to occur:
the source destination (copying from this) supports selective download (one of the destination types listed above). The target destination (copying to this) can be any type.
the sync occurred after April 15th
the source destination had more than 1 worker (default is 2)
during the sync, the source destination failed. An interruption like Ctrl-C, killing HB, or a system crash is okay - no problem. It has to be a failure where the destination stops: HB displays an error message saying the destination stopped after several error retries.
If all these conditions are true, it's possible for a corrupt arc file to be created in the local backup directory when the source destination fails. A serious side-effect is that the corrupt local arc file could be copied to the new target destination during destination to destination synchronization. This means the new destination has a bad copy of the arc file. The dest verify command cannot detect this because the file is there, it's the right size, and the checksum matches because the arc file was actually corrupted when downloaded from the source destination.
If you have done a destination to destination sync with HB after April 15th and it satisfies all of the conditions above, you should take one of these actions:
Option 1: run the checksync command. This will analyze your backup and try to determine if any arc files may have been affected by this sync bug. This is the fastest option. Checksync can be interrupted without harming the backup. Checksync may recommend that you run selftest if it finds any bad arc files. If you run checksync more than once, it will always report that the files need checking, even if you run selftest and everything is fine; all it is doing is selecting files based on timestamps. The checksync command will be removed in 3 months since it is designed for this particular problem.
Option 2: if you still have both the source and target destinations configured, you can redo the copy operation by clearing the target destination with:
hb dest -c backupdir clear <target dest name>
The next backup will do another destination to destination sync. This is cheaper than running a full selftest -v4 because the data only has to be downloaded from the source destination, whereas with selftest -v4, data is downloaded from both the source and target destinations (and any other active destinations).
Option 3: the slowest but most thorough option is to run selftest -v4 --fix to verify your destination data is correct. This will download all data from all destinations and verify every block in the backup. For huge backups, you can use --sample (without --fix) to do a rough check of all arc files, but since not all data is checked there is some risk of undetected corruption on the new (target) sync destination.
Example sync scenarios:
Had a complete local backup (cache-size-limit is -1, the default), added a new destination, HB copied all the backup files to the new destination. This is fine because it is not a destination to destination sync (cond #1).
Had a cache-size-limit set (not all backup files are in the backup directory), had an SSH destination, and added an S3 destination. Files were synced from the SSH to S3 destination. This is fine because the source destination (SSH) does not support selective download (cond #2)
Had cache-size-limit set, had an S3 destination configured, added a B2 destination and did the sync in March. This is fine because the sync occurred before Apr 15th & selective download was not available before then. (cond #3)
Same as previous, but the initial sync to the new destination occurred after Apr 15th. This could potentially have caused a corrupted arc file on the target destination, but only if:
during the sync, the source destination failed and stopped; this is not as likely (cond #5)
If you have questions or need advice about whether your backup might be affected, please send an email.
backup: if an error occurred either reading or opening a directory listed on the command line, backup would stop abruptly with an unhelpful traceback that didn’t include the directory name. Now it displays an error message with the directory name, skips backing up that directory, and when finished, exits with an error code.
get: when restoring a deleted file without -r, ie, restore the latest version, get would say:
Most recent backup version: 2104
Restoring from current version
Path deleted in version 2097, last saved in version 2079: pathname
Now it asks whether to restore the last saved version:
Most recent backup version: 2104
Restoring from current version
Path is not in current version; last saved in version 2079, deleted in version 2097: pathname
Restore pathname from version 2079?
When restoring a deleted file with -r but the wrong version is used, get now lists all versions of the file. This is too confusing for a yes/no answer, so another get with -r is needed to restore the file:
Path is not in version 1809: pathname
Versions of this file:
2016-07-04 09:02:36 in version 1781-1808
2016-08-09 18:10:57 in version 1810
2016-10-05 12:51:06 in version 1867-1904
2017-04-11 15:06:36 in version 2054
2017-05-07 16:42:51 in version 2079-2096
2017-05-07 16:42:51 in version 2105
Path not restored: pathname
B2: HashBackup does not need B2’s automatic file versioning and has always used B2 API calls to delete previous versions immediately. Now when HB creates a new bucket, it sets the lifecycle to "keep 1 version", ie, disable versioning. It will also set this lifecycle if an existing bucket has the default lifecycle of "keep all versions". HashBackup still deletes previous versions immediately because it’s usually cheaper than paying storage costs until B2’s lifecycle rules delete old versions.
B2: when maxsize is used in dest.conf, files > maxsize are split by HashBackup before uploading. Since the default for maxsize is 5GB, splitting doesn’t happen often, but when it does, these split files caused one extra B2 API call for every upload and two extra calls for every download or remove.
selftest -v4 on a destination that supports selective download (B2, S3, WebDAV, FTP, Dir, CF) would fail with an error "file not on destination: arc.v.n" if the file was split on upload because it exceeded the destination’s maxsize. The file is fine: this was a selftest download bug.
get: this set of conditions:
backup with a cache size limit >= 0
not all arc files are present locally
change cache-size-limit to -1
try to restore a file with the get command
the arc file needed is not present locally
the arc file was split because of maxsize
the arc file has deleted data ("holes")
destination supports selective download
more than 1 destination worker
could cause this error:
getblock: hash mismatch blockid N in arc.v.n
The backup is fine, but selective download doesn't work with split files and caused this misleading error message. It seems unlikely to occur, yet an email traceback did come in with this error and it was reproduced. It's fixed now.
rm: in unusual cases, rm could leave an unused path in the backup. It didn’t hurt anything other than causing a selftest warning.
rm: when -r is used to remove an entire version, it has two behaviors depending on --force:
with --force, rm deletes all files in the version
without --force, rm deletes files in the version if there is a newer copy in the next backup version. If there isn’t, files are moved into the next version.
However, if the most recent version was removed with -r, rm would delete all files as if --force was used. Now it gives an error and requires --force.
rm: when removing a version without --force, rm has to check every user file to see if it has been superseded. This check is now 5x faster.
rm: removing a path is 20% faster
rm: when removing a path, the user running rm must own all of the backups containing the path. Previously rm would start deleting paths, detect an ownership error, then revert the deletes and fail. Now rm verifies ownership first so it can fail quickly.
dest unload was writing dest.conf as it should but was not removing dest.conf from the database. This bug was introduced Apr 29.
selftest: the database upgrade on May 5th had a small bug related to file versioning that selftest didn’t notice. It didn’t affect the backup or cause data loss, but now selftest will catch this problem.
selftest: the error "logid X removed in version V but saved in that version as logid Y" can now be corrected with --fix. No backup data is lost.
clear: ctrl-c at the right time could reset the config, even without the --reset option
clear: ctrl-c at the right time could cause "destination mismatch" errors on the next command
dest sync: a recent change could cause remote-to-remote synchronization to be slower when the source is a B2, S3 or compatible, WebDAV, FTP, Rackspace Cloud Files, or Dir destination that supports selective downloads.
this release does an automatic database upgrade when any HB command is used, to correct "multiple C records" problem introduced in #1890
backup: when a hard-linked file was previously backed up and then changed, it could cause a selftest error "multiple C records". This bug was introduced in #1886 in the May 3rd release. The database upgrade in this release corrects the problem. From internal testing.
selftest: in the "Checking files" section, the progress percentages were off, sometimes by a lot, especially on older backups where a lot of files had been deleted over time. From internal testing.
selftest: if all files were removed from a backup with hb rm /, selftest would fail with a traceback: UnboundLocalError: local variable 'lastblockid' referenced before assignment From internal testing.
selftest: a bug introduced a few days ago (the 30% speedup) could cause sequence errors on sparse files:
Checking refs I
Error: logid 49 seq 1025 blockid 30 expected seq -1023
1 files have errors
this release does an automatic database upgrade when any HB command is used, to correct "orphaned files". (See next bug)
in an unusual situation, files could become "orphaned" and never removed by retain. There were around 1800 files like this in the HB build server backup containing 2.2M files, so less than 0.1%. This bug didn’t affect HB operations, other than keeping some very old files in the backup that retain should have removed long ago. The bug causing this has been fixed and retain will now be able to remove them, so you might see more deleted files than usual in the first retain with this release. From internal testing.
selftest is over 30% faster with backups that have a lot of block references, for example, VM image backups with a 4K block size or any backup of large files. The HashBackup build server backup has 137M block references and selftest now runs in 7.5 minutes vs 11 minutes previously.
get: if cache-size-limit is set and a backup was created with --no-ino, restoring could cause a traceback. Thanks Jacob!
Traceback (most recent call last):
  File "/hb.py", line 142, in <module>
  File "/get.py", line 1172, in main
  File "/get.py", line 888, in plan
  File "/get.py", line 948, in prefetch
UnboundLocalError: local variable 'hlkey' referenced before assignment
config: the admin-passphrase is a hash code and so not normally displayed with the config command. But if it was listed on the command line, the binary characters displayed could mess up the terminal settings. Now it displays (hidden).
selftest: a new test was giving an incorrect error "logid x has data blocks but null file hash" on partially backed up files
this release does an automatic database upgrade when any HB command is used: directories use a few bytes less storage. The update makes three passes over the database, so the update could take a while for very large backups.
HashBackup has had four arc file formats since its initial release in 2009 and is still compatible with all of these. But the new selective download feature (downloading parts of an arc file) only works with the latest arc file format, so a test was added to make sure HB doesn’t try to do selective downloads on old-style arc files created before 2013; it downloads the whole file instead.
the admin-passphrase config option can be used to restrict access to certain commands, including the config command, from users that have access to the backup directory. To increase local security, it is now stretched with pbkdf2 and a random salt.
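For readers unfamiliar with passphrase stretching, here is a minimal sketch of PBKDF2 with a random salt using Python's standard library. The hash function, salt length, and iteration count are assumptions chosen for illustration; the note above does not say which parameters HB uses:

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor, for illustration only


def stretch(passphrase, salt=None):
    """Return (salt, derived_key) for a passphrase using PBKDF2-HMAC-SHA256."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt for each stored passphrase
    key = hashlib.pbkdf2_hmac("sha256", passphrase.encode("utf-8"), salt, ITERATIONS)
    return salt, key


def verify(passphrase, salt, expected_key):
    """Re-derive the key with the stored salt and compare in constant time."""
    _, key = stretch(passphrase, salt)
    return hmac.compare_digest(key, expected_key)


salt, key = stretch("correct horse")
print(verify("correct horse", salt, key), verify("wrong guess", salt, key))
```

The salt makes identical passphrases produce different stored values, and the iteration count makes each guess expensive for anyone who obtains the stored hash.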
the dest verify command quickly checks remote destinations to ensure all files HashBackup has sent are actually there, are the right size, and have the correct file hash - without downloading any file data. Now, for B2, S3, and S3 compatibles, dest verify is 3-20x faster and up to 100x cheaper for large backups. It verifies 500-1000 arc / hb.db.N files per second at half a cent per 1M files.
selftest has a new option, --sample N, indicating N blocks should be tested in arc files instead of testing every block. --sample is used with -v3 or -v4 arc file testing and gives a higher degree of confidence than the "dest verify" command that arc files are correct, without having to download entire arc files. --sample can only be used with destinations that support selective downloading since it needs to download the individual blocks being tested.
--sample is very useful with large backups to cut testing time and download fees. It can be used with or without --inc. When used without --inc, --sample tests N samples from all arc files. With --inc, --sample does sampled testing over a period of time and respects the --inc download limit. Since it is only sampling, more arc files can be tested than before with the same download limit.
Any combination of -v3/4, --sample, -r, and --inc options can be used and they are tracked separately. So for example, you can use --inc 1d/30d --sample 3 -v4 after every backup to do a random sample of 3 blocks from every arc file every 30 days, and on the weekend use --inc 1w/3m -v4 to do a full verify of every arc file every 3 months. Sampling is not implemented for -v5.
selftest: specific arc files can be tested by adding arc filenames, arc.5.35 for example, to the command line. But if -v4 was used with a list of arc filenames, selftest tested everything; you had to leave off -v4 to test specific arc files. Now, only the arc files listed are tested. This is useful if there is one bad arc file that needs to be repaired using copies on other destinations, but the backup is too large to download everything with a full -v4 selftest.
backup: at the end of the backup, a Mem: line shows the maximum memory (RAM) used during the backup. On OSX it was correct, but on Linux and BSD it was 1024x too small, ie, showing 135KB meant 135MB.
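A unit mismatch like this commonly arises because getrusage() reports peak memory in bytes on macOS but in kilobytes on Linux and the BSDs. Whether HB reads its Mem: figure this way is not stated here, so the following is only a sketch of the general pitfall (the function name is hypothetical):

```python
import resource
import sys


def peak_rss_bytes():
    """Peak resident set size of this process, normalized to bytes."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # macOS reports ru_maxrss in bytes; Linux and the BSDs report kilobytes.
    return peak if sys.platform == "darwin" else peak * 1024


print(f"peak memory: {peak_rss_bytes() / (1024 * 1024):.1f} MB")
```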
mount: fixed a traceback found in testing: Getting arc.0.0 TypeError('splitspans() takes exactly 2 arguments (1 given)
backup: some filesystems like CIFS (SMB, Samba), FUSE (sshfs, UnRAID) do not have stable inode numbers like a normal Unix filesystem. Inode numbers are used by HashBackup to detect hard links. HB previously issued an unstable inode warning on incremental backups. This is now a fatal error, advising to clear the backup and use --no-ino. This prevents accumulating a lot of backup history with random inode numbers. If inode numbers are unstable and --no-ino is not used, it can cause incorrect hard linking at restore time. Thanks Steve!
IMPORTANT: if you have a backup containing unstable inode numbers, --no-ino must be used with "get" to prevent incorrect hard linking.
backup: when --no-ino is used, backup saves data so that restores are correct even if --no-ino is not used with get.
IMPORTANT: this only works for future backup data. If you already have a backup of unstable inodes created without --no-ino, you can either clear it and start over, or use the --no-ino option on every future backup and restore.
backup: no-backup-tag is a list of filenames that, if present in a directory, indicate the directory should not be backed up. If directory p1 was tagged as "don’t backup" and path p1/p2 was given on the command line, path p1/p2 was saved on every backup whether it was changed or not and caused selftest errors:
Error: logid 78670297 removed in version 1008 but saved in that version as logid 78686742: pathname [r1007]
If you have this problem in your backup, use hb rm -rN pathname to remove the pathnames in error. Whew - this backup has saved over 78M files. Thanks Daniele!
selftest: if backup notices that a file changes while it is being saved, it sets a "partial" flag on the file to indicate that the file wasn’t completely saved. Depending on the backup timing, this race condition could cause selftest to show a bogus error "hash should be null". Thanks Daniele!
the HashBackup security document was reviewed and a few changes were made for clarity - nothing major.
the selftest -v4 and mount commands may need to download arc files. To improve download speeds, these commands can now utilize multiple workers when downloading one arc file from a destination. For example, a 100MB arc file may use 5 workers to simultaneously download 5 x 20MB sections. This is only available on destinations supporting selective downloading: S3 and compatibles (Google Storage, DreamObjects, SoftLayer, etc), Backblaze B2, Rackspace Cloud Files, WebDAV, and FTP. The get, recover, and selftest -v5 commands might download files, but they already download multiple files in parallel. Now all HashBackup downloads are parallelized.
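Splitting one file into sections for parallel download, as in the 100MB example above, comes down to computing byte ranges. A minimal sketch, assuming equal-sized contiguous sections; how HB actually chooses section sizes and worker counts is not described here, and the function name is hypothetical:

```python
def split_ranges(size, workers):
    """Split `size` bytes into contiguous (start, end) ranges, end exclusive."""
    workers = max(1, min(workers, size))
    base, extra = divmod(size, workers)
    ranges, start = [], 0
    for i in range(workers):
        # Spread any remainder over the first `extra` ranges.
        length = base + (1 if i < extra else 0)
        ranges.append((start, start + length))
        start += length
    return ranges


MB = 1_000_000
for start, end in split_ranges(100 * MB, 5):
    print(f"bytes {start}-{end - 1}")
```

Each range could then be fetched by a separate worker with a ranged request and written at its own offset in the local file.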
Dir destinations store backup data on mounted storage, which might be a local directory or a remote directory. Dir destinations now support selective download. This is mainly for testing since Dir destinations can also use the dest.conf symlink option to avoid copying remote files.
get: the new --no-ino option is used when restoring files that were backed up with --no-ino. This is a temporary workaround to correct a few restore problems with unstable inodes, eg, when backing up CIFS filesystems. When this option is used, no hard links are created.
the default network communication timeout is now 5 minutes instead of 30 minutes. This timeout is for making connections and doing small data transfers, not for file transfers. The timeout period can be changed if necessary with the timeout keyword in dest.conf
the network timeout for S3 destinations was previously 30 seconds, but that’s a little short and it is now 5 minutes.
S3: when an S3 bucket is configured to automatically transition files from S3 to Glacier, HashBackup previously issued "restore" commands to S3, causing the Glacier files to be temporarily stored on S3 where HB can get to them. This was leftover from when HB supported Glacier. But there is no restore pacing, so these restores from Glacier to S3 could be quite expensive. To prevent a surprise AWS bill, HB no longer restores files from Glacier to S3.
S3: the euca destination type (Eucalyptus Systems "Walrus" S3 clone) has been removed. S3-compatible destinations use type S3 with host and port keywords.
download fees for storage companies are typically 10-15x higher than storage fees, so it’s important to minimize downloaded data.
HashBackup downloads files for:
selftest -v4 or -v5
recover
get (restore)
rm/retain to "pack" arc files (remove deleted data)
mount
HB previously downloaded entire files. But arc files often contain deleted data ("holes") created by rm and retain, especially if remote packing is disabled. Now HB skips deleted data during downloads, saving data transfer fees. The destination types that support selective download are: Amazon S3 and compatibles (Google Storage, DreamHost, Softlayer, etc), Backblaze B2, Rackspace Cloud Files, WebDAV, and FTP.
Selective downloading is not yet fully optimized in this release, so it might be faster or slower than whole-file downloads depending on how much deleted data is in an arc file and the speed of your network connection.
Selective downloading will usually be cheaper than whole-file downloads and is never more expensive. Request fees will be higher, but data transfer fees will be lower by more, so overall costs go down.
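Skipping deleted data during a download amounts to fetching only the byte spans between the holes. A minimal sketch, assuming the holes are known as a list of (start, end) byte offsets; how HB records holes inside an arc file is not described here, so that representation and the function name are hypothetical:

```python
def live_spans(size, holes):
    """Return (start, end) spans of a file not covered by holes, end exclusive.

    `holes` is a list of (start, end) spans of deleted data, end exclusive.
    """
    spans, pos = [], 0
    for start, end in sorted(holes):
        if start > pos:
            spans.append((pos, start))
        pos = max(pos, end)  # also handles overlapping or adjacent holes
    if pos < size:
        spans.append((pos, size))
    return spans


spans = live_spans(100, [(10, 30), (50, 60)])
downloaded = sum(end - start for start, end in spans)
print(spans, downloaded)
```

More holes mean more, smaller ranged requests, which is why request fees rise while transferred bytes fall.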
previously, HB did a preliminary access check on the backup directory, to avoid having obscure error messages for insufficient permissions. But commands that are usually read-only, sometimes are not read-only in certain situations, making it difficult to predict what kind of access to the backup directory will be needed. Now HB doesn’t do an access check before starting, so a command might fail if write access is needed but only read access is available. Thanks
the hb log command creates log file names with a timestamp. If the same command is run twice in the same second, there was a filename collision and traceback. Now if a collision occurs, HB delays and retries until a unique log file can be created. Reported by traceback email.
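One way to get the delay-and-retry behavior described above is to create the file exclusively and wait whenever the name is already taken. This is a sketch only; the filename format, delay, and function name are hypothetical and not taken from HB:

```python
import os
import time


def create_unique_log(directory, prefix="hb", delay=0.1, attempts=50):
    """Create a new timestamped log file, waiting and retrying on a collision."""
    for _ in range(attempts):
        name = time.strftime(f"{prefix}-%Y-%m-%d-%H%M%S.log")
        path = os.path.join(directory, name)
        try:
            # O_EXCL makes creation fail if the file already exists.
            fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
        except FileExistsError:
            time.sleep(delay)  # wait for the timestamp to move on
            continue
        return path, fd
    raise RuntimeError("could not create a unique log file")
```

Because the name has one-second resolution, a second call within the same second simply waits until the clock ticks over.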
B2: the default for keepalive is true rather than false. If dest.conf has keepalive false, that should be removed. If removing it causes problems, please send an email.
if audit logging is enabled but fails for some reason, HB displayed a warning that the audit log failed but executed the command anyway. Now if audit logging fails, it’s a fatal error and the command doesn’t execute.
some commands are mainly read-only: audit, get, ls, mount, versions; but HB was always checking for rwx access to the backup directory. Now it only checks for rx access for these commands. However, they can still fail with a permission error if:
the previous command aborted unexpectedly and the database has to be rolled back
cache-size-limit is >= 0 and the command (get or mount) requires downloading arc files from a remote destination
IMPORTANT WARNING: HashBackup does not do individual file permission checking, so anyone with read-only access to the backup directory has access to anything that was backed up, even if they don't have access to the same data in the live filesystem.
B2: log files said the download rate was X MB/s, when actually it was showing KB/s, ie, the number was 1000x too big
B2: during downloads, files were loaded into RAM. This was not intended and could lead to significant RAM usage, especially with large arc files and/or many worker threads.
IMPORTANT NOTICE: Eucalyptus Systems had one of the first S3-compatible object stores, "Walrus", and HashBackup supported an S3 destination type of "euca". This was necessary because when initially launched, Walrus used a weird pathname to access buckets and HB didn’t support host and port keywords for s3 destinations.
Now, a host and port keyword can be added to the s3 destination type, making the euca destination type obsolete. The euca destination type will be removed very soon. If that's a problem, please send an email.
this release does an automatic database upgrade when any HB command is used. Some procedures have changed and the upgrade prevents access by earlier HB releases.
every HashBackup command that modifies the database has to create an hb.db.N file to send changes to the remotes. This is up to 3x faster in this version.
hb.db.N files are no longer kept in the local backup directory, saving local disk space. Running this release will remove local hb.db.N files.
hb.db.N files use less remote storage space and HB tries to avoid the "delete penalty" for these files on Amazon Infrequent Access. This will shorten recover time and lower storage and download costs.
HB only manages the storage class when the class keyword is used in dest.conf. If the class keyword is not used, all objects are stored in the bucket's default storage class. This will not be optimal on Google Nearline / Coldline because of the delete penalty.
The first backup with this release will send one or more hb.db.N files and then will delete all of the old ones; don't panic - this is expected. If you notice any changes that are not beneficial, please send an email with details (ls -l backupdir/hb.db* and hb dest ls).
Unfortunately, HashBackup isn't able to dynamically set the storage class with Google Storage like it does with Amazon S3, maybe because HB is using the S3 interface. Instead, the storage class is determined by the default storage class on the bucket. The downside is that Nearline and Coldline have 30/90 day "delete penalties" that HashBackup can't avoid like it often can for Amazon S3 by managing the storage class on individual files.
The delete penalty can be fairly severe with Google Coldline storage. With Coldline, you get a 65% discount from regional prices, or 30% discount from Nearline, but all files are charged for at least 90 days of storage. If a file is added one day and deleted the next, you're charged 90x more than the one day of storage. Therefore Coldline is not recommended for use with HashBackup unless you retain all files for a minimum of 90 days. It can be very difficult to compare pricing with delete penalties, and the only way to know for sure is to run parallel backups to separate buckets for a month or so to see which is more cost effective for your backup data's access patterns.
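The delete penalty arithmetic can be sketched as follows. This is only an illustration using the relative figures quoted above (a 65% discount from regional prices and a 90-day minimum); the function names are hypothetical, and retrieval and operation fees are ignored.

```python
# Relative prices from the text: Coldline is a 65% discount from regional
# storage, but every file is charged for at least 90 days.
REGIONAL_PRICE = 1.00    # relative cost per day, regional storage
COLDLINE_PRICE = 0.35    # 65% discount from regional
COLDLINE_MIN_DAYS = 90   # minimum billed storage duration


def billed_days(days_stored, minimum_days):
    """Days actually charged when a storage class has a minimum duration."""
    return max(days_stored, minimum_days)


def regional_cost(days_stored):
    """Relative cost of keeping one file in regional storage."""
    return REGIONAL_PRICE * days_stored


def coldline_cost(days_stored):
    """Relative cost of keeping one file in Coldline, including the penalty."""
    return COLDLINE_PRICE * billed_days(days_stored, COLDLINE_MIN_DAYS)


for days in (1, 31, 32, 90, 365):
    print(days, regional_cost(days), coldline_cost(days))
```

Under these assumptions a file deleted after one day is billed for 90 days, and Coldline only becomes cheaper than regional storage for files kept longer than roughly a month.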
rekey aborts if any destinations halt, to prevent a failed destination having files associated with the old key.
rekey was not restoring the old key file on errors.
recover: following a recover, the next backup would send a large db update that was unnecessary.
recover: reconstructing the database is faster: around 35% faster for the HashBackup build server backup.
recover: the Glacier download pacing code - half of the code in recover - was removed since Glacier is no longer supported.
recover: add retry on dest.db download errors
recover: the -c option is now required. If the backup directory doesn’t exist, recover offers to create it.
an empty key file caused a traceback but now displays an error message.
recover warned about overwriting an existing database only when the -a option was used. But recover always overwrites the database, so this warning should always be issued.
recover: display a progress message while downloading arc files
backup is 5-6% faster when the backup is primarily small files
backup: a new Mem: statistic shows peak memory used. This includes the two largest RAM uses: the dedup table and database cache.
backup: if a sparse file hole was greater than 2GB, HB could fail with a traceback: Exception in shaq_loop. Thanks Mark!
backup: if a sparse file was only a hole with no data, it was backed up normally and could take a long time instead of just a few seconds. Thanks David!
the config option cache-size-limit controls the amount of arc data kept in the local backup directory. Setting it to zero means you don’t want any arc data stored locally, but that can cause backup performance problems. The cache size is now raised to at least 2 * arc-size-limit (2 arc files) while HashBackup is running, then trimmed lower if necessary when it’s finished. This makes a cache limit of zero work much better.
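The rule above amounts to a floor on the working cache size. A minimal sketch, assuming a non-negative cache-size-limit and sizes in bytes; effective_cache_limit is a hypothetical name, not an HB function.

```python
MB = 1000 * 1000


def effective_cache_limit(cache_size_limit, arc_size_limit):
    """Cache size used while a command runs: at least room for 2 arc files.

    Assumes cache_size_limit is non-negative (a limit is actually set).
    """
    return max(cache_size_limit, 2 * arc_size_limit)


# With a cache limit of zero and 100MB arc files, the working cache is 200MB.
print(effective_cache_limit(0, 100 * MB))
```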
Amazon S3: the region list was updated, adding us-east-2, eu-west-2, ap-south-1, and ap-northeast-2
S3: the boto library used by HashBackup to access S3 was updated to the latest version
S3: for large files, HB uses multipart uploads: instead of uploading a 100MB file with 1 worker, it might be uploaded by 4 workers in 25MB sections. But if a multipart upload is interrupted, the partial uploads hang around indefinitely on S3 unless you have a "lifecycle policy" for them, and you are billed every month for storage costs. These files don’t show up in the online S3 Management Console. Now HB deletes any incomplete uploads every time it starts. There may be quite a few of these the first time.
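The 100MB / 4 workers example can be sketched as a simple range split. This is an illustration of the idea only: split_parts is a hypothetical helper, and the part sizes HB and boto actually choose are not described here.

```python
def split_parts(file_size, workers):
    """Return (offset, length) pairs dividing file_size bytes among workers."""
    base, extra = divmod(file_size, workers)
    parts = []
    offset = 0
    for i in range(workers):
        # The first `extra` parts take one additional byte each.
        length = base + (1 if i < extra else 0)
        if length == 0:
            break  # fewer bytes than workers: no empty parts
        parts.append((offset, length))
        offset += length
    return parts


MB = 1000 * 1000
for offset, length in split_parts(100 * MB, 4):
    print(offset, length)
```

If an upload is interrupted, any parts already sent stay on the server until the multipart upload is aborted, which is why the cleanup at startup matters.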
S3: the IBM / SoftLayer Standard Cross Region storage service has an S3-compatible API that works with HashBackup. This is just a documentation change, not a code change. Thanks Jonathan for testing and sharing your dest.conf info!
get: if a file is only partially restored (disk full for example), the .hberror extension is added to the filename to make it obvious that the file is incomplete
dest verify/sync: if dest.db was missing on a remote, it was not transferred. Now it is always transmitted on a dest sync or verify.
dest.conf: if a required keyword did not have a value, a traceback would occur (TypeError: not all arguments converted during string formatting) rather than the correct error message, "keyword requires a value: (keyword)". Reported via email by HB traceback reporter.
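For reference, this TypeError is what Python raises when the % operator is given an argument but the format string has no placeholder for it. The sketch below shows one common way that happens; it is a guess at the class of bug, not HB's actual code, and both function names are hypothetical.

```python
def format_error_buggy(keyword):
    # No %s in the format string, so the % operator cannot consume its
    # argument and raises TypeError.
    return "keyword requires a value: " % keyword


def format_error_fixed(keyword):
    # With a placeholder, the keyword is substituted into the message.
    return "keyword requires a value: %s" % keyword


try:
    format_error_buggy("host")
except TypeError as exc:
    print("TypeError:", exc)

print(format_error_fixed("host"))
```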
rclone has a few issues that require workarounds in HB and have been filed on GitHub.
hb dest verify does one ls command now instead of one per remote file and is 10-15x faster. The output for rclone is more detailed now, with separate errors for "file not found" and "file size is wrong". Checksums are not supported on shell destinations.
the updated rclone.py script must be installed manually by copying it from doc/dest.conf.examples to where the run command in dest.conf needs it to be.
the hb "dest verify" command verifies the presence and size of files on a destination without downloading the file. This is now supported by the rclone shell destination. For this to work, the HB program has to be upgraded with hb upgrade and the rclone.py script has to be updated manually. The rclone.py script is in doc/dest.conf.examples/rclone.py in the tar file on the HB website.
get: reset symlink modification time to the value stored in the backup rather than the time it was restored. Some older OSs cannot set symlink modification times; in that case, symlinks will have the time they were restored. Thanks Roy!
get: displays progress while creating a restore plan when cache-size-limit is set. Thanks to Roy for the suggestion.
get: when cache-size-limit is set and the restore size is greater than cache-size-limit, a race condition could cause the restore to hang if multiple download workers were active. Thanks to Roy exporting his 3TB backup for testing!
get: if errors occurred while restoring files, it could cause a hang. Now it works, even with 50% injected random errors on a 3TB restore.
the rclone.py script wasn’t working with the mount command. It is located in doc/dest.conf.examples of the tar file on www.hashbackup.com and has to be updated manually.
get: previously, concurrent downloads were disabled to avoid splitting bandwidth resources across multiple files. This works well for low-latency storage services, but is not so great for high-latency services where it takes a while to get a download started. For now, concurrent downloads are re-enabled. To do this right, HB needs to adjust downloads dynamically.
backup: with some ssh servers, ssh destinations had trouble initializing, displaying 3 errors about DESTID and then disabling the destination.
rclone: HashBackup can use rclone to communicate with storage services not directly supported by HB. The rclone.py script to enable this was updated to use the rclone copyto command. This makes downloads more efficient because copyto doesn’t do remote directory listings. The script was also changed to do unconditional transfers because rclone sometimes thinks a remote file doesn’t need updating when it actually does (same size file on a remote that doesn’t support checksums, like Dropbox). Because of this change, the --verify option had to be eliminated to avoid transfer loops related to "eventual consistency" on many remote storage services (files are not necessarily immediately available after an upload).
The new rclone.py script is in doc/dest.conf.examples and must be installed manually. Rclone will be built-in to HashBackup in the next release to avoid having to do a manual script update: it will get updated with hb upgrade just like the rest of HB.
backup: HashBackup will run in an LX zone on Illumos, a descendant of Solaris. LX zones emulate Linux under Illumos. When the dedup table is resized during a backup, getcwd() is called. There is a bug in LX zones causing getcwd() to fail with "No such file or directory" and backup can’t finish. As a workaround, the getcwd() call was removed.
backup: in very rare circumstances that happened to be triggered by the previous Illumos bug, a race condition combined with a nested exception could cause a traceback:
Exception in thread shaq_loop: unsupported operand type(s) for +: 'int' and 'NoneType'
Exception in thread shaq_loop: an integer is required
backup: multi-threaded backup of a series of medium-sized files, especially if already compressed, was sometimes ignoring the signal to start a new arc file. This could create arc files much larger than arc-size-limit, like 200MB-8GB instead of 100MB. This bug started in August. Now arc sizes should be much more controlled.
backup: multi-threaded performance has improved 10-15% for some workloads
b2: the Backblaze B2 driver has a small internal retry loop in addition to the outer retry controlled by the retry dest.conf keyword. The internal retry loop now tries 7 times instead of 3.
selftest: in very rare cases, selftest could display an error: "Error: block xxx arcdel yyy blen zzz". This was a bug in selftest; the backup is fine. Thanks Evan!
B2: fix "connection reset by peer" errors, for real this time. The file size limit on B2 is 5,000,000,000 bytes, not 5GiB. This error only occurs on large initial backups, like 6TB.
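The difference between the two limits is easy to underestimate. A small check of the arithmetic, using the 5,000,000,000-byte figure stated above; fits_b2_limit is a hypothetical helper.

```python
GB = 1000 ** 3    # decimal gigabyte
GiB = 1024 ** 3   # binary gibibyte

B2_FILE_LIMIT = 5 * GB   # 5,000,000,000 bytes, per the text


def fits_b2_limit(size_bytes):
    """True if a single file of this size is within the stated limit."""
    return size_bytes <= B2_FILE_LIMIT


print(5 * GiB)            # size of a "5GiB" file in bytes
print(5 * GiB - 5 * GB)   # bytes by which 5GiB exceeds the limit
```

A file sized against a 5GiB limit can therefore exceed the real limit by about 369 MB.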
B2: the Content-Length header is documented as required, so now HashBackup sends it (even though it seems to work fine without it).
b2: beginning with #1715 around Nov 7th, HashBackup could not connect to B2 from BSD systems because of this error:
[SSLError] [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:581)
In the backup directory, cacerts.crt is the root certificate file HashBackup uses to verify SSL connections. On Nov 7th it was updated from mkcert.org and this update broke SSL connections to B2, but only on BSD; Linux and MacOS worked fine. Apparently a limit on the number of certificates was exceeded. Thanks Mark for reporting the problem.
backup: when the new compression code was launched in August, all data was verified to decompress correctly before being written to the backup. It has been nearly 4 months without an error traceback, so verification has been turned off for a 10-15% performance gain.
backup: if all backup data was removed with rm, a traceback could occur on the next backup (from internal testing): TypeError: unsupported operand type(s) for +: 'NoneType' and 'int'
rm: when a lot of data is removed from a backup, rm compresses the database. This was rewritten a couple of months ago and as a precaution, a backup of the original database was saved as hb.db.orig. This file can be fairly large and no problems have been reported, so in this release, the backup file is kept during compression and then deleted.
backup: --maxtime sets a time limit for the backup. It’s not very accurate for technical reasons and tends to run over the limit. In this release the overrun is less than in previous releases.
backup: when inex.conf was added to every backup a few months ago, it broke the --maxtime restart feature in some cases. One backup would work, the next would be empty, the next would work (but wasn’t a true restart), etc.
backup: with -p0, backup would sometimes create huge arc files. One arc file in a customer’s backup was 21GB, even though arc-size-limit was 100MB.
The backup is fine, but if you have this situation it is recommended that you either remove backups in reverse order starting with the last (most recent) backup and going back to the first backup where these huge arc files occur; or start a new backup and keep this backup until you no longer need it for retention purposes.
If necessary, you can continue to use an existing backup with huge arc files, but it will be inefficient because HashBackup will not want to pack the huge arc files until 50% of the data is removed by retain. Apologies for this one, and thanks Lourens for finding it!
ls: with the -a option, a symlink or LV snapshot backup with multiple versions displayed an additional → symbol on each version. Thanks Emanuele!
dest verify: some rsync servers caused an error "unexpected rsync output" during a dest verify command. Thanks Soren!
backup: if --maxtime was specified and the time ran out, the backup stopped (correct) but lots of pending uploads continued to be transmitted (incorrect). Now backup completes uploads that are in progress and stops much quicker. Thanks Robert!
backup: if the backup block size changes from the previous backup, a full backup warning message is displayed for affected files.
if dest verify encountered any kind of error while trying to verify a file, it marked the file as "not transmitted", forcing it to be sent again. This isn’t correct behavior if a destination was only temporarily unavailable but did have the files. Now verify will only mark files for retransmission when it gets a response from the remote indicating files are not there or are the wrong size.
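The new rule can be summarized as: only a definitive answer from the remote triggers a re-upload. A minimal sketch with hypothetical names (VerifyResult, needs_retransmit); HB's internal representation is not described here.

```python
from enum import Enum


class VerifyResult(Enum):
    OK = "ok"                  # remote has the file at the expected size
    MISSING = "missing"        # remote answered: file is not there
    WRONG_SIZE = "wrong size"  # remote answered: size does not match
    ERROR = "error"            # no definitive answer (timeout, outage, ...)


def needs_retransmit(result):
    """Mark a file for re-upload only when the remote gave a definitive 'bad' answer."""
    return result in (VerifyResult.MISSING, VerifyResult.WRONG_SIZE)


for result in VerifyResult:
    print(result.name, needs_retransmit(result))
```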
WebDAV: some WebDAV servers (4shared) return a status of 204 when checking if a file exists. From internal testing.
WebDAV: when the secure keyword is omitted, try digest authentication before regular authentication.
selftest: if an arc file goes missing, for example, it is deleted by accident, selftest printed an error that the file did not exist locally nor on any destination. Now, if --fix is used and selftest is running interactively, it will ask whether to remove the missing arc file from the backup. This will remove all blocks in the arc file, then all files referencing those blocks.
dest verify can now verify the file size on rsync destinations even when cache-size-limit is set and an arc file isn’t present locally. Previously it could only verify that the remote file existed.
this release will do an automatic database upgrade to dbrev 23 when any HB command is used. Once the database has been upgraded, previous versions of HB cannot be used with it. This database upgrade has two purposes:
1) delete zero-length arc files created by a bug in dest verify for rsync destinations with cache-size-limit set (see below)
after every command that modifies the backup, HB has to send a database update to all remotes. When there are lots of changes, creating this database update is up to 45% faster.
b2: if a file could not be found when downloading, the filename wasn’t included in the error message.
b2: if the Dir keyword is changed on a destination, the next access causes a warning about ID mismatches. Running dest setid fixes the error. But if backups were already made to the old Dir and new backups are then made to the new Dir, the backup files are split across 2 directories on B2, which is very confusing. Now, if you try to download files in this state, HB will display a warning that the file is stored in the wrong directory. It will still retrieve the file in case it’s the only copy. Thanks Robert!
Related to this, dest verify would verify all files, even if stored in an unexpected directory. Now it will complain about files that are in the wrong directory and upload them to the right directory if there is another copy available. Thanks Robert!
ssh: the DESTID file uniquely identifies remote storage areas. If an ssh protocol error occurred while fetching the DESTID file, HB would complain that the destination ID did not match - a misleading error message. Now the ssh protocol error message is displayed instead. Thanks Robert!
ssh: temporary files ending in .stdin and .stdout are created by the ssh destination. Usually they are deleted, but if something goes wrong, they hang around. Now they have a .tmp suffix so HB will delete them on the next command rather than letting them accumulate.
dest verify was trying to verify files stored on inactive destinations, causing a traceback. Internal testing.
dest verify on rsync destinations with cache-size-limit set >= 0 would create zero-length arc files in the main backup directory for any arc files that were only remote, then give a "no such file or directory" error. It didn’t hurt anything, but also didn’t verify the remote file and left empty arc files in the backup directory. Thanks Marcin!
if HB had a problem cleaning up temporary files, it could abort with a traceback: OSError: [Errno 1] Operation not permitted: 'backupdir/hb-xxx.tmp'. Now it displays an error message and continues.
HashBackup assumes that once it stores a file on a remote destination, the file stays there until HashBackup deletes it. This is usually true, but it can happen that files are manually deleted on a destination, and HashBackup gets no notice of this.
Selftest -v4 verifies backup data by downloading it from all destinations, decrypting, decompressing, and verifying the hash of every block, and has an incremental option for huge backups that cannot be verified in a single run. It's a very thorough check, but also very time-consuming.
Some remote storage like S3, Google Storage, and B2 are able to do all of these checks, while other destination types may only perform one or two checks. All destinations support the new verify command except the "shell" destination (for user scripting).
If a file fails to verify on a destination and there are copies elsewhere, it is marked missing on that destination and will be re-uploaded during the verify command.
the dest.conf keyword "maxsize" is used to limit the size of files uploaded to remotes. If a file to upload is larger than maxsize, HB splits it into parts before uploading them. This is different than multipart or "large file" uploads - features supported by some remotes.
Running recover to retrieve files that had been split would sometimes cause bizarre errors like checksum errors (S3/Google Storage), hash mismatches (B2), file size mismatches, arc version errors, and probably others. The remote backup data was fine.
This recover bug has been fixed. The only time it would be seen in practice is for a large backup that created an hb.db.0 file larger than 5GB, meaning hb.db is probably bigger than 10GB. This would be a backup of 30M+ files. The error would only occur in the recover command. From internal testing.
b2: when the Backblaze B2 website displays bucket names it displays funky "dash-like" characters. Visually they look fine, but if cut and pasted into a dest.conf file, the dash is not a real dash and HashBackup complained that the bucket name is invalid. Now HB changes the funky dash into a real dash. A bug was filed with Backblaze. From email traceback.
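A sketch of this kind of normalization is shown below. Which lookalike character the B2 website emits is not stated above, so the set of characters here is an assumption, and normalize_dashes is a hypothetical name.

```python
# Assumption: a handful of common Unicode dash lookalikes
# (hyphen, non-breaking hyphen, figure dash, en dash, em dash, minus sign).
DASH_LOOKALIKES = "\u2010\u2011\u2012\u2013\u2014\u2212"

# Translation table mapping each lookalike to an ASCII hyphen-minus.
_DASH_TABLE = str.maketrans({ch: "-" for ch in DASH_LOOKALIKES})


def normalize_dashes(bucket_name):
    """Replace dash lookalikes with a plain ASCII '-'."""
    return bucket_name.translate(_DASH_TABLE)


print(normalize_dashes("my\u2011backup\u2013bucket"))
```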
IMPORTANT SECURITY CHANGE: ls tried to emulate Unix permission checking before listing files in the backup. There were edge cases where it was either too permissive or too restrictive, it didn’t handle ACLs at all, and it’s possible that permission checking varies slightly from one platform to another. Rather than having a false sense of security, ALL of the permission checking in ls has been removed. Now, anyone with read access to your HashBackup backup directory can list the metadata (permissions, filenames, file sizes, …) for all files in the backup. To secure your backup, secure your HashBackup directory with Unix permissions or ACLs.
IMPORTANT SECURITY CHANGE: get tried to emulate Unix permissions, like ls, and that has also been removed in this version (see above). Now, anyone with read access to your HashBackup backup directory can restore any file in the backup, even if they do not have read permission to the file in the live filesystem. To secure your backup, secure your HashBackup directory with Unix permissions or ACLs.
ls: the -d and -1 (one) options are replacing the --alldirs and --onerev options. This could break scripts, but it’s unlikely the hb ls command is used with a script and seems a low risk.
ls: ls accepted more than 1 pathname on the command line. ls always requires that wildcards be quoted, to prevent Unix from expanding them before ls executes, and it’s extremely confusing if an unquoted wildcard is used with ls. Now, ls only accepts 1 pathname on the command line - probably the most common usage - and if an unquoted wildcard is used and the Unix shell expands it, ls can display an error message about unrecognized arguments.
ls: ls added a * wildcard to the beginning and end of the match pattern on the command line. This caused ls to match any path in the backup containing the match pattern. For example, the pattern def would match /def, /abcdef, /defghi, and /abcdefghi. This made it impossible to match a specific filename. Now, ls does not add wildcards to the pattern, so pattern def matches /def, /abc/def, or /abc/def/ghi, but not /abc/defxxx. If you want the old behavior, use the pattern '*def*' instead. Remember that all wildcard patterns must be quoted with ls to prevent the Unix shell from expanding them; this hasn’t changed.
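The difference between the old and new matching can be sketched with Python's fnmatch module. Treating the new rule as matching each pathname component separately is an assumption drawn from the examples above; old_match and new_match are hypothetical names, not HB functions.

```python
from fnmatch import fnmatchcase


def old_match(path, pattern):
    """Old behavior: a * wildcard was added to both ends of the pattern."""
    return fnmatchcase(path, "*" + pattern + "*")


def new_match(path, pattern):
    """New behavior (as modeled here): the pattern must match a whole component."""
    return any(fnmatchcase(part, pattern) for part in path.split("/") if part)


for path in ("/def", "/abcdef", "/abc/def", "/abc/def/ghi", "/abc/defxxx"):
    print(path, old_match(path, "def"), new_match(path, "def"))
```

With the new rule, an explicit wildcard pattern gives the old result for a path such as /abcdef.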
ls: ls is up to 27% faster if the pathname pattern matches a lot of files in the backup.
ls: with a very long pathname (more than 8 pathname components), ls would sometimes do a sequential search. Now if the pathname begins with /, ls is much less likely to do a sequential search.
ls: -d (was --alldirs) means to show all versions of directory entries. This requires -a. An error was displayed if -a wasn’t used, but now -a is implied.
ls: ls behaved more like the Unix find command, in that it listed all files underneath directories. This is useful when you’re looking for a file but don’t know where it is in the filesystem, but it was impossible to list only the first-level files of a directory. Now ls behaves more like the Unix ls command and only lists 1 level of a directory by default. If you want to see all files under a directory, add /* to the pathname pattern and be sure to quote it!
selftest: 10% faster and peak memory use reduced by half
selftest: with -v4, selftest could sometimes pack an arc file and send it to destinations on every run.
selftest: --inc could sometimes check all archives. From internal testing.
For the past 10 months there have been notices about withdrawing Glacier support, with a transition plan to other storage services such as Backblaze B2, Google Nearline, and Amazon S3 Infrequent Access, all priced similarly to Glacier but with simple retrieval policies and costs. Glacier is removed in this release.
as mentioned since January 2016, Amazon Glacier is no longer supported in HashBackup. The code is still there but will be removed soon. If you still have HashBackup data stored in Glacier, save this release so you can access your data in the future.
For details see: http://www.hashbackup.com/technical/glacier-eol
one of HashBackup’s caches was a fixed size that was reasonable for backups of several million files. For larger backups, HB relied on OS filesystem buffering. Usually that’s fine, but sometimes it’s not if the system is under memory pressure. Now this cache scales with the size of the backup: smaller backups use less memory than before and very large backups use more.
Example: a customer with a backup of 12M files totalling 7TB had never run retain in 6 years of daily backups. The hb.db file was around 9GB. On a 16GB system, the previous version of HB retain took 131 minutes to remove 7M files from the backup. The new version takes 94 minutes - a 28% improvement - and uses about 1GB of memory. 30 minutes of the time in both cases is to compress the database because so many files were removed (this is now reduced to 20 minutes thanks to the next change). The next retain removed no files, took 5 minutes, and used 287MB of memory because the database had shrunk from 9GB to 4GB.
rm: if enough data is removed from a backup, HashBackup will compress the hb.db file. This is now 33% faster.
get: if a write error occurred while restoring files, the error was: "Tried to write X bytes, could only write Y bytes". Now it also displays how much free space the filesystem has since a full disk is usually the real problem.
get: after displaying a disk write error message, the get command would hang. Now it aborts the restore and sets the exit code.
get: on OSX and BSD, symbolic links can have flags. When restored, flags were being set on the target rather than the symlink. Related to this, if the target did not exist (a "dangling" symlink), a restore error occurred: NameError: global name 'FS_SECRM_FL' is not defined. The restore continued but flags were not set for the symlink. From internal testing.
B2: changed the domain name for B2 access to avoid redirects
selftest: if a backup had no data stored, for example, only /dev/null or empty files were backed up or rm / was executed, selftest -v4 --inc failed with a traceback. Reported by email traceback.
S3: when the host and/or port keywords are used but the host is not running an S3 service on that port, HB would fail with a traceback:
Traceback (most recent call last):
  File "/hb.py", line 100, in <module>
  File "/backup.py", line 2651, in main
  File "/dest.py", line 331, in init
  File "/dest.py", line 253, in initdest
  File "/dest.py", line 186, in startdest
  File "/s3dest.py", line 251, in init1
AttributeError: 'gaierror' object has no attribute 'status'
AttributeError: 'error' object has no attribute 'status'
Now it fails with the correct error: [gaierror] [Errno 8] nodename nor servname provided, or not known. Reported by email traceback.
to support new features, this release will do an automatic database upgrade to dbrev 22 when any HB command is used. Once the database has been upgraded, previous versions of HB cannot be used. Since this release has quite a few changes, you may want to try the new release with a small backup first before using it with large production backups.
backup: the HashBackup executable is copied to the backup directory as before, but now only the latest 3 versions are kept. Older versions are automatically removed during the next backup.
over time, a "forever incremental" backup strategy can lead to small arc (backup) files as older data blocks are replaced by data from newer backups and old arc files are packed to remove empty space.
In this release, rm & retain combine small arc files into larger arc files. This can drastically reduce the number of arc files in the backup directory, especially for backups with many versions. For example, HashBackup's 7-year-old build server backup had 1841 versions and around 1890 arc files. Combining eliminated 2/3rds of the arc files, leaving 690, and also gave a 4-5% restore performance increase when restoring a very large directory.
For now the parameters on this feature are fixed. To combine arcs:
- arc files must be local, OR
- pack-remote-archives must be true for remote arc files
- arc files must be at least pack-age-days old
- some small arc files cannot be combined for technical reasons
- the rm and retain commands both do arc combining
Deduplication is a great tool for reducing backup sizes, especially for certain kinds of files like VM images, log and other "append-mostly" files, SQL database dumps, general "office" docs, and files moving around in the filesystem. But compression also plays a big part in saving backup space.
The goal of this release is to get better compression when possible, without losing performance. In most cases, this release achieves better compression AND better performance.
backup: -Z0 to -Z9 are hints to HashBackup of how much compression to apply. For this release:
-Z0 = no compression (usually slower)
-Z1, -Z2 = fastest compression
-Z2 to -Z9 = better compression
In this release, -Z6 is the default and -Z7-Z9 are identical to -Z6, so the only use for -Z is to use lower levels to get slightly faster but less compression. Of course the new version is compatible with and can restore all existing backups. Most backups will get a speed boost and/or compress better without any command line changes.
one core on a single-core system (just the main thread)
two cores on a dual-core system
two cores for -Z7 or less (required -p to use more)
all cores for -Z8 or -Z9
It probably does not make sense to use -p4 or higher with the new compression technologies because they are much faster and can usually only keep 3 cores busy. You can use the new %CPU statistic to see if more cores will increase performance. For example, if %CPU is over 210% with -p2, try -p3. If %CPU is 220% with -p3 (less than 300%), it likely means that more cores (-p4) are not going to increase performance because HB is not keeping 3 cores busy.
some of the compression technologies in HashBackup are rather new. To ensure that all data can be restored, compression verification is enabled. During backups, HB will expand compressed data and verify it against the original before saving it to your backup. This is only for the newer compression technologies (lz4, zstd, brotli, lzma). It’s a temporary safety measure that will slow down backups a bit for now, but will be removed in the future.
to help evaluate performance, backup displays a new statistic: Efficiency. It is "MB reduced per CPU second", so higher numbers mean better efficiency. This makes it very easy to compare backup options by showing how much extra CPU time is being spent to reduce backup data size.
The efficiency rating measures overall HashBackup efficiency, so it can also be used to compare compression, block sizes and dedup options. It's best to change one option at a time when comparing different efficiencies, to understand how that option affects things.
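A minimal sketch of the Efficiency figure follows. It assumes "MB reduced" means the input size minus the stored size, which is an interpretation rather than something stated above; the function name and the sample numbers are hypothetical.

```python
def efficiency(input_mb, stored_mb, cpu_seconds):
    """MB reduced per CPU second; higher means less CPU spent per MB saved.

    Assumes "MB reduced" = input size - stored backup size.
    """
    if cpu_seconds <= 0:
        raise ValueError("cpu_seconds must be positive")
    return (input_mb - stored_mb) / cpu_seconds


# Two hypothetical runs over the same 1000 MB of input: the second spends
# twice the CPU time for the same size reduction, so it scores half as well.
print(efficiency(1000, 400, 60))
print(efficiency(1000, 400, 120))
```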
no discussion of compression would be complete without benchmarks. Here are a few, comparing the old and new version of HashBackup with different kinds of backups. All of these tests were run on a 2010 Macbook Pro (Intel Core2Duo, 2.5GHz) with a solid state drive.
Test #1: backup /Applications (212K files) with dedup, default -Z option
Run                  Time        CPU         Size
old HB, -Z1          73s         126s        955 MB
old HB, default -Z   78s         135s        943 MB
old HB, -Z9          339s        665s        976 MB

new HB, -Z1          45s (38%)   70s (44%)   1.0 GB (-5%)
new HB, default -Z   47s (40%)   76s (44%)   858 MB (9%)
new HB vs old -Z9    47s (86%)   76s (88%)   858 MB (12%)
Run                  Time   CPU    Size
old HB, -B1M -Z1     49s    88s    935 MB
new HB, -B1M -Z1     23s    36s    1.1 GB
new HB, -B1M no -Z   26s    45s    903 MB
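The percentages in the Test #1 results appear to be the reduction relative to the old version. A short sketch that re-derives a few of them from the raw numbers; percent_saved is a hypothetical helper, and the published figures may differ by a point because of rounding.

```python
def percent_saved(old, new):
    """Reduction from old to new, as a percentage of old."""
    return (old - new) / old * 100


# (label, old value, new value) taken from the Test #1 numbers
rows = [
    ("-Z1 time (s)", 73, 45),
    ("-Z1 CPU (s)", 126, 70),
    ("default -Z time (s)", 78, 47),
    ("default -Z size (MB)", 943, 858),
    ("new vs old -Z9 time (s)", 339, 47),
    ("new vs old -Z9 size (MB)", 976, 858),
]

for label, old, new in rows:
    print(label, round(percent_saved(old, new), 1))
```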
selftest: the basic selftest check is 15% faster on a backup with 13M blocks and 110M block references - lots of blocks and lots of dedup. Making selftest more efficient is not about speed; it’s an ongoing process to scale HashBackup to handle ever-larger backups.
ls: the basic ls command to list all pathnames in a backup is 9x faster.
backup: to avoid filling the backup directory disk, backup will abort if there is not enough space for 2 arc files of size arc-size-limit (hb config option).
B2 has a 5GB upload limit and connection resets occur when exceeded. Large backups can generate an hb.db.0 file that is bigger than 5GB. To prevent connection reset errors, the default for maxsize (a dest.conf keyword) is now 5GB. This forces HashBackup to segment very large files.
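Segmenting by maxsize can be pictured as cutting a file into pieces no larger than the limit. A minimal sketch, assuming decimal gigabytes; segment_sizes is a hypothetical name, and how HB names or orders the pieces is not described here.

```python
GB = 1000 ** 3


def segment_sizes(file_size, maxsize):
    """Sizes of the pieces when a file is cut into parts of at most maxsize bytes."""
    if maxsize <= 0:
        raise ValueError("maxsize must be positive")
    full_parts, remainder = divmod(file_size, maxsize)
    sizes = [maxsize] * full_parts
    if remainder:
        sizes.append(remainder)
    return sizes


# A 12GB hb.db.0 with maxsize 5GB would be sent as three pieces.
print(segment_sizes(12 * GB, 5 * GB))
```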
selftest: selftest displays a progress indicator when it is run interactively. In certain circumstances with -v3 and -v4, the percentage would go over 100%. From internal testing.
stats: because of a previous bug in rm, the stats command could display a negative number of files. This is fixed. Thanks Soren!
rm: if a problem occurred while removing a data block, rm noted the error and continued but it could cause a selftest error later. From internal testing.
backup: the inex.conf file is now included (encrypted) in every backup. Previously, after a recover command, the inex.conf file was missing. The next backup would create a new default inex, but any site changes would be lost. Recover does not automatically restore inex.conf, but displays a get command to do it if it’s appropriate.
get: if cache-size-limit was set >= 0, one CPU could get stuck in a loop. It didn’t affect the restore, but it did run slower. This would have been especially noticeable on 1 or 2 core systems. From internal testing.
get: another case of the race condition (multi-thread bug) that was fixed in #1481, was fixed in this release. If an error occurred during the restore, the get command might do any of these:
give a Bad file descriptor error
give a size mismatch error if restoring multiple files
This rare bug was reported by HB’s email traceback on July 18th, though the bug has been present since February. Thanks HB!
selftest: after a full -v3/4 selftest, incremental selftest (--inc) always did a full selftest, downloading every arc file. From internal testing.
get: if cache-size-limit is set, the restore planner could fail with an error when a sparse file was being restored: TypeError: unsupported operand type(s) for >>: 'NoneType' and 'int'
rm/stats: when rm removed a file, it was not adjusting a counter correctly, causing the stats command to sometimes display a negative number of files in the backup. This fixes the cause, but the stats display won’t be fixed until the database upgrade of the compression release.
rm: if -v 2 was used with retain, it caused a traceback. Retain -v doesn’t accept an integer after it, so the 2 was interpreted as a pathname, and pathnames for retain have to start with a slash. The error should have been "All pathnames must start with /: 2", but because of a bug in the error handler, it caused a traceback instead.
the compression enhancements are going well, but since it is near the end of a quarter, they’re being released after July 15th instead of before. To try them out, upgrade again after July 15th.
rm/retain could sometimes cause an unused pathname selftest warning. If you are getting unused pathname warnings from selftest, run selftest --fix to correct them.
backup: some filesystems don’t have stable inode numbers: FUSE (sshfs, UnRaid), Samba/CIFS, and probably others. HashBackup checks for inode number changes during backup, but because these filesystems have random inode numbers, it can cause unpredictable full backups.
The new option --no-ino can be used with these filesystems to bypass the inode number check. A new warning is displayed periodically when this option might be necessary:
A negative side-effect of using --no-ino is that hard links cannot be detected because the definition of a hard link is two files with the same inode number.
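To illustrate why hard link detection depends on inode numbers, here is a minimal Python sketch. It is not HashBackup's code; it assumes a POSIX filesystem with stable inode numbers, and the helper names `same_inode` and `demo` are hypothetical.

```python
import os
import tempfile

def same_inode(path_a, path_b):
    """Return True if both paths refer to the same inode on the same device."""
    a, b = os.stat(path_a), os.stat(path_b)
    return (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)

def demo():
    """Create a file, a hard link to it, and a separate copy, then compare."""
    with tempfile.TemporaryDirectory() as tmp:
        original = os.path.join(tmp, "original")
        hard_link = os.path.join(tmp, "hardlink")
        copy = os.path.join(tmp, "copy")
        with open(original, "w") as f:
            f.write("data")
        os.link(original, hard_link)   # second name for the same inode
        with open(copy, "w") as f:
            f.write("data")            # same content, but a different inode
        return same_inode(original, hard_link), same_inode(original, copy)
```

On a filesystem that reports random inode numbers, a comparison like `same_inode` is unreliable, which is why bypassing the inode check also gives up hard link detection.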
a new command, "rename", can be used to change the pathnames of files and directories in a backup. This can be useful when filesystem paths change for some reason, and you want pathnames in an existing backup to match so that the renamed files aren’t all backed up again. If dedup is activated, this is not a big concern. But sometimes a renamed path contains so much data that you want to avoid reading it all again just to find out it hasn’t changed. See hb help rename or the rename web page for more details.
backup: wav files are no longer in the list of uncompressible file extensions since they can sometimes be compressed 10-20%, though other times only a few percent, depending on the file content.
S3: when multipart upload is enabled (it is by default), HashBackup will do retries on each part, then if that fails, will do retries on the whole file. This makes multipart error recovery more efficient and fixes a problem where an S3 timeout or broken pipe (S3 closed the connection) would cause backup to abort. If configured for the default 3 attempts, HB will now try 9 times before giving up.
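The "3 attempts becomes 9 tries" arithmetic can be sketched as a nested retry loop. This is only an illustration under assumptions: `upload_with_retries` and `send_part` are hypothetical names, and this simplified version resends every part when the whole file is retried, which may not match what HB actually does.

```python
def upload_with_retries(send_part, parts, attempts=3):
    """Retry each part up to `attempts` times; if a part still fails,
    retry the whole file, also up to `attempts` times."""
    for _file_try in range(attempts):
        for part in parts:
            for _part_try in range(attempts):
                if send_part(part):
                    break          # this part succeeded, move to the next one
            else:
                break              # part used up its retries: restart the file
        else:
            return True            # every part succeeded
    return False                   # gave up after attempts * attempts tries
```

With the default of 3 attempts, a part that never succeeds is tried 3 x 3 = 9 times before the upload is abandoned.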
this release will do an automatic database upgrade to dbrev 20 on any HB command, to support saving and restoring creation date.
on BSD and OSX, files and directories have a date created timestamp that HB didn’t save or restore. Now it does. On Linux, some filesystems have a creation timestamp but there is no standard OS interface to get or set it, so it is still not supported by HB. Note: "ctime" is not date created, but is the last time the inode was changed (changed time, not creation time).
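The difference between creation time and ctime can be seen from Python's `os.stat`. This is a small sketch, not HB's implementation; the function names are hypothetical, and it relies on `st_birthtime` being present only on platforms that expose a creation timestamp (such as BSD and Mac OSX).

```python
import os

def creation_time(path):
    """Return the creation (birth) time if the platform exposes it, else None."""
    st = os.stat(path)
    return getattr(st, "st_birthtime", None)

def inode_change_time(path):
    """On Unix, st_ctime is the last inode change time, not the creation time."""
    return os.stat(path).st_ctime
```

On a typical Linux system `creation_time` returns None, which matches the note above that there is no standard interface there.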
Mac OSX uses the HFS+ filesystem. HFS has the ability to store files compressed. For example, /Applications/Address Book contains many compressed files. HashBackup has always restored compressed files as uncompressed. It works fine, but previously-compressed files would use more disk space. Now, a new True/False config option "hfs-compress" can be set True to re-compress these files on restore. hfs-compress defaults to False because unfortunately, restoring compressed files takes 10x longer. The only standard software supplied by Apple to compress a file is the external ditto program, and running that for each compressed file is slow. If you don’t care about the slowdown and want compressed files to stay compressed on restore, set hfs-compress to True with hb config.
get: on some filesystems, files and directories can be marked invisible. HB was saving this but not restoring it. Now it does. Found by Backup Bouncer. Thanks Roy!
Backup Bouncer is a backup test program that checks a lot of things on a Mac HFS+ filesystem. With the two changes above, HashBackup passes the Backup Bouncer test. The one caveat is that BB sets the nodump bit on files, so HB doesn’t save them (that’s what the nodump bit means), but BB expects the backup program to save them anyway. A full Backup Bouncer report is posted on the website.
a new dest.conf keyword, onfail, adds flexible handling of destination failures, requested by several customers.
If the onfail keyword is not used on a destination and it fails, HB will continue the backup as it does today. What is different is that the destination failure will count as a backup error and the exit code will be non-zero. HB should have done this all along.
With "onfail ignore", HB will behave exactly as it does today: the backup will continue, no backup error is reported, and the exit status is not affected. This will be necessary for example if you backup to 2 USB drives in rotation. If you have cache-size-limit set >= 0 and the cache fills up because a destination has failed, backup will have to stop, just like today, even with onfail ignore. To be exactly compatible with previous releases, add onfail ignore.
Backblaze B2: using Dir / in dest.conf causes an error: http status 400 (File names must not start with '/') listing file: /DESTID Thanks Niels!
Backblaze B2: add Backblaze-recommended error handling for status 403 (account limit), 429 (too many requests), Retry-After, broken pipe, and timeouts, so HashBackup can be self-certified.
Backblaze B2: when the debug keyword is used, a B2 traffic log is written to the backup directory. Previously the log contained B2 credentials used to sign-in to B2. Now credentials are replaced by 'xxx' so that the log can be shared for troubleshooting purposes. Only Authorization and authorizationToken are replaced; accountId, fileName, fileId, bucketId, and uploadUrl are not replaced since they are not usable without authorization credentials and would make debugging more difficult.
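A credential-masking step like the one described might look roughly like this. The log line formats here are assumptions (an HTTP-style `Authorization:` header and a JSON `"authorizationToken"` field), and `redact` is a hypothetical helper, not HB's code.

```python
import re

# Assumed formats: 'Authorization: <value>' and '"authorizationToken": "<value>"'
_HEADER = re.compile(r'(\bAuthorization:\s*).*')
_JSON_TOKEN = re.compile(r'("authorizationToken"\s*:\s*")[^"]*(")')

def redact(line):
    """Replace credential values with 'xxx'; leave every other field untouched."""
    line = _JSON_TOKEN.sub(r'\1xxx\2', line)
    return _HEADER.sub(r'\1xxx', line)
```

Fields such as accountId, fileName, and bucketId pass through unchanged, so the log stays useful for debugging.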
Backblaze B2: log lines begin with a time and thread number to help trace multiple workers' activity. It’s still recommended to set workers to 1 for debugging.
init: hb init -c /xyz/backupdir would cause a traceback if the directory /xyz did not exist. Now it says parent directory doesn’t exist and exits cleanly. From traceback email.
backup: backing up large sparse files did not always work correctly. It worked with -p0 (single-threaded) but not multi-threaded. Sometimes a warning message about a size mismatch would occur during backup. A selftest -v5 displayed a hash mismatch error.
This release will do a database upgrade to mark all sparse files as partial backups. The next backup will re-save the sparse files. After a good copy is saved, the next retain will remove all of the partial files.
in #1498, an ls command was added after every rclone transfer to verify the send worked. However, some storage systems like Amazon Cloud Drive may have a slight delay after an upload before the file is available. When this happens, ls returns an error, HB retries the send, the file is usually there and is not sent again. But a bogus error was displayed.
Now, the ls after send is only done if --verify is used on the run command line in dest.conf. This is highly recommended if you have cache-size-limit set: if something goes wrong, perhaps because of a bug in HB, rclone, or the storage system, and HB believes the file was transferred when it was not, HB will delete the local copy. Since there is no remote copy either, it breaks the backup. A selftest -v4 --fix will correct this, but it will also remove files from the backup.
apologies to the rclone developer: the 256 exit code was confusion about how Python worked, not an rclone problem. Sorry!
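A likely explanation for the 256, offered here as an assumption rather than a description of rclone.py: on Unix, Python's `os.system` returns a wait status, not the exit code, so a command that exits with code 1 yields 256. A minimal sketch of the conversion (the function name `exit_code` is hypothetical):

```python
import os

def exit_code(wait_status):
    """Convert a POSIX wait status (as returned by os.system) to an exit code.
    Returns None if the process was terminated by a signal instead of exiting."""
    if os.WIFEXITED(wait_status):
        return os.WEXITSTATUS(wait_status)
    return None
```

Using `subprocess.run(...).returncode` avoids the issue, since it reports the exit code directly.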
clear: if the clear command was interrupted, files on remotes could sometimes not be deleted by running clear again, because HB thought the files were already deleted.
rclone sometimes returns exit codes 256 for error conditions, but exit codes have to be 0-255. This version of rclone.py checks and adjusts the exit code, and also does an ls command after every send to verify that the files are actually on the remote.
HB has native support for many of these services, and that should be used when possible. With this update, Rclone can be used to access services that HB doesn't support natively, such as Amazon Cloud Drive. See doc/dest.conf.examples/dest.conf.rclone and rclone.py for more details. Thanks Ziv for the suggestion of using Rclone!
IMPORTANT: Only Hubic and Backblaze were tested to develop the rclone.py script. ACD has been tested by a user, but you are "on your own" when using HB with Rclone. Questions are fine but support is limited for now, so this method is not recommended for production or critical backups.
developing HashBackup on 5 different Unix platforms has become too much of a challenge, so going forward, HB will be released on these 12 Windows platforms instead: Windows XP, Windows XP Pro, Windows Vista, Windows 7, Windows 8, Windows 8.1, Windows 10, Windows Server 2008, Windows Server 2008 R2, Windows Server 2012, Windows Server 2012 R2, and Windows Server 2016. This will greatly simplify HB development. The change takes effect today, April Fool’s Day.
selftest wasn’t following cache protocols correctly, so if cache-size-limit was set and arc files needed to be downloaded, selftest could hang because it thought the cache was full
backup: read errors are unusual these days, but they do happen. Backup didn’t handle them well: it printed an I/O exception traceback and hung. Now it shows an I/O error message with the pathname, stops backing up the file, marks it as a partial backup, and continues the backup. From emailed exception report.
backup: if a file changes size during backup, backup printed a warning message. Now it re-checks the file to get the current size and if it matches what was saved, no warning is printed. If it doesn’t match what was saved, the file will be saved again on the next backup. From internal testing.
init: display a warning if -k ask or -k env are used, since these are for the -p option, not -k. From internal testing.
backup: when --maxtime is used, backup creates a restart checkpoint if the time limit is exceeded. If the next backup did not use --maxtime, it could still use the checkpoint data (a bug), and would set the checkpoint. Now, only backups with --maxtime look at or set the checkpoint. This allows running a nightly cron backup with --maxtime, and running other ad-hoc backups without --maxtime will not reset the checkpoint. From internal testing.
backup: related to --maxtime, if a backup aborted, even if --maxtime was never used, the next backup might not save anything. It was incorrectly restarting a checkpoint that wasn’t set up properly. Thanks to Daniele at EURAC.
selftest: with the --inc option (incremental selftest), depending on the options used and the size of arc files, selftest might not always cycle through all of the archives before cycling back to the first. This bug was introduced in #1454. From internal testing.
selftest: because of round-off error, selftest --inc would sometimes require an extra run to test all archives. From internal testing.
backup: when a small fifo or raw device was backed up, the Saved: number displayed at the end could be negative
get: fixed a bug in the new restore code that caused this:
    File size mismatch, should be 288, is 0: <pathname>
    Exception in thread writeq_loop: [Errno 9] Bad file descriptor
    File "/shared.py", line 85, in start_thread
    File "/get.py", line 168, in writeq_loop
Thanks to Juha Pyy for reporting this.
get: in #1473, the get command for all restores was about 2x slower because of the new sparse file handling. Now it’s back to its previous performance levels for most situations, and restoring files with a small block size (VM images) is about 30% faster than ever. Restores using bzip2 compression (-Z8 or -Z9) are still slower than before #1473. Compression improvements are coming…
the recrypt and sha256 commands have been removed. HB no longer uses the SHA256 hash. The recrypt command was seldom used and while running, left the backup in a precarious "half old key, half new key" state. The rekey command, used to change the backup key, was not removed.
if the previous HB command aborted and left an open transaction, upgrade would fail with a traceback: Exception: Can’t upgrade database with a transaction active Thanks to Frank Riley.
backup: if an OS doesn’t support hole-skipping, error messages are no longer displayed. The file is still backed up normally, just without skipping the holes. On a restore, the holes of a sparse file are always created. The Sparse: line at the end of the backup tells how many hole bytes were skipped (none if no Sparse: line).
backup: some Linux filesystems return a partial sparse map under some circumstances, causing a bad sparse file backup. There are now checks to detect this, but if you made critical sparse file backups with #1473, it’s probably best to redo them. This seems to happen mostly with very large (>4GB) files. From internal testing.
this rev will do an automatic database upgrade to dbrev 18 when any HB command is used. The database isn’t actually modified, other than being stamped dbrev 18 to prevent older versions of HB from accessing the database. There are new data structures created in this release for sparse files that older versions wouldn’t understand.
backup: backup can skip "holes" (unallocated disk space) in sparse files rather than backing them up. This is OS version dependent, as well as filesystem dependent, so it will work when it can and fall back to a regular backup when it can’t. Sparse files are mostly used for "thin provisioned" VM disk image files.
Restoring large sparse files can be a bit slow and tedious because it may require many lseek() calls to re-create the original holes. Sparse files saved with this release will restore faster than with older versions of HB if the holes are large.
    $ hb backup -c hb -D1g -B4k sparsefile
    Backup directory: /home/jim/hb
    Copied HB program to /home/jim/hb/hb#1463
    This is backup version: 0
    Dedup enabled, 0% of current, 0% of max
    /home/jim/sparsefile

    $ time hb get -c hb `pwd`/sparsefile
    Backup directory: /home/jim/hb
    Most recent backup version: 0
    Restoring most recent version
    Restoring sparsefile to /home/jim
    /home/jim/sparsefile
    Restored /home/jim/sparsefile to /home/jim/sparsefile
    No errors
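For readers unfamiliar with sparse files, the following Python sketch creates one by seeking past unwritten space and then compares the apparent size with the space actually allocated. It is an illustration only, with hypothetical helper names; whether a hole is really left unallocated depends on the OS and filesystem, as noted above.

```python
import os
import tempfile

def make_sparse_file(path, hole_bytes, tail=b"end"):
    """Create a file whose first `hole_bytes` bytes are never written."""
    with open(path, "wb") as f:
        f.seek(hole_bytes)   # skip ahead without writing; may leave a hole
        f.write(tail)

def sizes(path):
    """Return (apparent size, bytes actually allocated on disk)."""
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512   # st_blocks counts 512-byte units

def demo():
    """Build a 10 MB sparse file in a temp directory and report its sizes."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sparsefile")
        make_sparse_file(path, 10 * 1024 * 1024)
        return sizes(path)
```

On a filesystem that supports holes, the allocated size returned by `demo()` is usually far smaller than the apparent size.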
get/selftest: since #1454, if cache-size-limit was >= 0, get and selftest could stall in some situations. Thanks to Max Norton at Aria Networks for reporting and helping with this.
export: sets all block and file hashes in the exported database to spaces. The purpose of export is to debug hard-to-reproduce HB problems. These hashes aren’t useful for debugging and the goal of export is to remove as much sensitive information as possible from the exported data while still being able to use it for bug hunting.
backup: on Linux, VMWare vmfs filesystems could give an invalid argument error when trying to open files. Thanks to Ron Joffe.
init: display an error message if unable to set owner-only permissions for the backup directory rather than halting with a traceback
B2: if the copy-executable config option was set to True, HB would try to copy the program to the B2 storage service. But because of the # in the filename, it would cause a broken pipe error.
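As background on why a # in a filename is troublesome in a URL: # starts the fragment part, so it has to be percent-encoded when used in a path. The sketch below shows the standard-library encoding; it is not a description of how HB fixed the problem, and `url_safe_name` is a hypothetical helper.

```python
from urllib.parse import quote

def url_safe_name(filename):
    """Percent-encode characters such as '#' that are special in URLs,
    keeping '/' so that path separators survive."""
    return quote(filename, safe="/")
```

For example, `url_safe_name("hb#1463")` gives `hb%231463`.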
compare: for hard links, compare -f (verify file hashes) might show a file’s data had changed when it really had not. From internal testing.
compare: always ignore link and size changes for directories since that doesn’t mean much. On Linux, if you create 10K files in a directory then delete them all, the directory will have a large size even though it is still empty. From internal testing.
backup: if linux-backup-attrs is set to True with hb config (the default is False), an error could occur: NameError: global name 'pathname' is not defined
An email has been sent to any HashBackup customers that have written in explaining why Glacier support is ending and how to do the migration from Glacier to another service. This information is also available in the doc/dest.conf.examples/dest.conf.glac file from the Download section of the HashBackup site (expand the tar file).
NOTE: this rev will do an automatic database upgrade to dbrev 17 when any HB command is used. This upgrade modifies a database index and doesn’t take too long.
Because there have been several recent DB changes, HB may do consecutive database upgrades if your version is before #1405. Don't worry, that was designed in from the beginning and isn't anything new. It looks like this (versions command):
    Current database rev: 14
    Upgrading database to rev: 16
    Copying /testhb/hb.db to /testhb/hb.db.orig before upgrade
    Copying /testhb/dest.db to /testhb/dest.db.orig before upgrade
    Upgrade to rev 15...
    Alter database
    Rename hb.NNNN programs to hb#NNNN in backup directory
    Remove dedup table; backup will rebuild it
    Upgrade to rev 16...
    Alter database
    Database upgraded to rev 16
    Showing recent versions
    0 501(jim) 2016-01-01 13:49:58 - 2016-01-01 13:49:58 #1335Restored dest.db
    See traceback below or in stderr redirected file
    Traceback (most recent call last):
    File "versions.py", line 208, in <module>
    File "versions.py", line 144, in main
    File "db.py", line 173, in opendb
    File "db.py", line 434, in upgradedb
    Exception: some error message
backup: incremental backups spend most of the time scanning the filesystem for modified files. This scan is now 5-10% more efficient. Because the scan is mostly IO-bound and very "seeky", this could show up as either lower CPU usage or faster backups, especially on large backups with a lot of history and directories.
backup: real and simulated raw device and VM image (.vmdk) backups are about 10% faster
a new S3 dest.conf keyword, "class", can be set to either standard or ia. When set to ia, backup files will be stored in Amazon S3’s Infrequent Access storage class IF it will be cheaper than the standard S3 storage class. For small files and files that are expected to be stored less than 13 days, standard storage turns out to be cheaper than IA because:
IA charges for 128K if files are smaller than 128K
IA charges for 30 days if files are deleted before 30 days
Amazon also charges 1 cent/GB extra to download from IA.
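The break-even between the two classes can be worked through with a small calculation. This is a sketch under stated assumptions: the IA price of 1.25 cents/GB/month comes from these notes, but the standard price of 3 cents/GB/month is an assumed figure chosen for illustration, the function names are hypothetical, and the extra IA download charge is ignored.

```python
GB = 1024 ** 3

def storage_cost_cents(size_bytes, days, price_per_gb_month,
                       min_bytes=0, min_days=0):
    """Storage cost in cents, applying minimum billable size and duration."""
    billed_bytes = max(size_bytes, min_bytes)
    billed_days = max(days, min_days)
    return billed_bytes / GB * price_per_gb_month * billed_days / 30

def cheaper_class(size_bytes, days, standard_price=3.0, ia_price=1.25):
    """Return 'ia' or 'standard', whichever is cheaper for this file.
    standard_price is an assumption, not a figure from these notes."""
    standard = storage_cost_cents(size_bytes, days, standard_price)
    ia = storage_cost_cents(size_bytes, days, ia_price,
                            min_bytes=128 * 1024, min_days=30)
    return "ia" if ia < standard else "standard"
```

With these prices a 1 GB file kept 12 days is cheaper in standard storage, while the same file kept 13 days is cheaper in IA; a 10K file stays cheaper in standard storage because IA bills it as 128K.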
Amazon S3 Infrequent Access (1.2 cents/GB/mo), Google Storage Nearline (1 cent/GB/mo), and Backblaze B2 (.5 cents/GB/mo) are all good options for migrating off Amazon Glacier (.7 cents/GB/mo), since HashBackup support for Glacier will be ending in mid-2016.
The class keyword only works with type s3 destinations. Other S3-compatible destinations like gs (Google Storage) set the storage class on the bucket rather than on individual files, so use their website to set the bucket storage class.
S3: Ben Emmons reported a sporadic problem with broken pipe errors on S3 for buckets in any region other than us-east-1 (aka US). Ben did some research to explain the cause: HB was sending all requests through the US standard region, which is not recommended. Now the location keyword is used to communicate directly with the correct region. If location is anything other than US or us-east-1, it must match the region the bucket was created in. Thanks Ben!
selftest: with --inc (incremental selftest), small backups were tested too often. Selftest always wanted to test at least 1 arc file every run, so if the backup had only 1 arc file, that file would be tested every time, even if the goal was 1d/30d ("selftest runs every day, verify files every 30 days"). Now selftest will test the one file (or small backup) once every 30 days.
backup: during a backup, HB splits a file into blocks, hashes each block with the SHA1 cryptographic hash, and uses these hashes to find duplicate data. The block hash is verified during restores to ensure that each block’s data has remained the same through all of its travels through HB itself and to and from remote destinations.
As an extra safeguard, HB also stores one hash for each file backed up. This ensures that after a restore, the correct blocks were restored, in the correct order, and provides extra reassurance that no hash collisions occurred during dedup. (Hash collisions are nearly impossible with SHA1, but they still get a lot of attention.)
Very early on, the SHA256 hash was chosen for the whole file hash. In hindsight, this was a poor decision, because it is the slowest of all the SHA family of hashes - even slower than SHA512 - and maxes out at around 100MB/s on common computers.
Going forward, the SHA1 hash will be used for the whole file hash. SHA1 is about 3x faster than SHA256 and still provides the extra layer of error checking to ensure that restored files are identical to when they were saved. All HB commands have been modified to handle both the old and new file hashes.
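The two layers of hashing described above can be sketched in a few lines of Python. This is a simplified illustration, not HB's implementation: it uses fixed-size blocks and an in-memory dict as the block store, and the function names are hypothetical.

```python
import hashlib

def backup_blocks(data, block_size, store):
    """Split data into fixed-size blocks, store each unique block once,
    and return (list of block hashes, whole-file hash)."""
    file_hash = hashlib.sha1()
    block_hashes = []
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        digest = hashlib.sha1(block).hexdigest()
        store.setdefault(digest, block)   # dedup: keep only the first copy
        block_hashes.append(digest)
        file_hash.update(block)
    return block_hashes, file_hash.hexdigest()

def restore(block_hashes, expected_file_hash, store):
    """Rebuild the file, verifying every block hash and the whole-file hash."""
    file_hash = hashlib.sha1()
    out = bytearray()
    for digest in block_hashes:
        block = store[digest]
        if hashlib.sha1(block).hexdigest() != digest:
            raise ValueError("block hash mismatch")
        file_hash.update(block)
        out += block
    if file_hash.hexdigest() != expected_file_hash:
        raise ValueError("file hash mismatch")
    return bytes(out)
```

The block hashes catch corrupted block data, while the whole-file hash catches blocks that are individually valid but restored in the wrong order.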
get: if cache-size-limit is set >= 0, it means some backup data is not stored locally. So get (and selftest -v5) create restore plans to figure out the best way to retrieve arc files to use the least amount of disk space and not download any arc file more than once. But if a raw device was restored, the plan would say "1 item, 0 bytes", and then during the restore, would fetch the arc files one by one as needed, and not delete any until after the restore finished. In other words, there was no plan. This has been fixed. A typical plan will look like this:
    Planning cache...done
    Archives: 4
    Blocks: 46
    Download size: 44 MB
    Peak cache size: 11 MB
    Disk free space: 75 GB, 30%
    Items: 1
    Data bytes: 52 MB
get: when cache-size-limit is >= 0, get was not respecting the cache size limit. A restore of a 500GB VM image said it would download 230GB and need 14GB of space in the backup directory. But if the data could be loaded from the destination faster than the restore, the backup directory would go over the 14GB limit and cause a disk full error.
backup: in certain circumstances, a file could be skipped instead of backed up, but the next backup would catch it
get: a null pathname on the command line ('') caused a traceback
mount: if the FUSE library is not installed, mount raised an EnvironmentError, which is not so user friendly. Now, mount displays 5-10 lines of information about what FUSE is, where it’s located, and tips on how it can be installed for the mount command
selftest: added a new test for -v2 and above
NOTE: this rev will do an automatic database upgrade to dbrev 15 when any HB command is used. This upgrade:
modifies a database index
renames backup directory programs hb.NNNN to hb#NNNN
removes "hb" (very old program version) from backup directory
deletes the dedup table; next backup will re-create it
backup: uses half as much RAM to dedup the same number of blocks. The first backup with this version will rebuild the dedup table, so may take longer.
backup: single-core variable block dedup is 5% faster (-D -p0, no -B). Initial multi-core disk image backups (.vmdk, raw, etc) are 12% faster.
backup: --maxtime and --maxwait were added in #862 to control the backup time for huge backups. See the #862 changelog for details. The initial backup for a multi-TB filesystem with millions of files could require several days. Using --maxtime 6h lets HB backup for 6 hours every night until all data is saved. Then incrementals tend to be much faster and have no trouble finishing within the backup timeframe.
The enhancement in this version is that restarts after the time limit are much faster, allowing HB to completely skip huge portions of the filesystem already backed up. This only occurs when --maxtime is used, although if you want this shortcut restart behavior all the time, use --maxtime 1y.
backup: simulated backups are up to 25% faster: 155 seconds now vs 210 seconds with #1371 on a 4.5GB VM image (.vmdk files)
rm: two new config options control archive packing:
pack-age-days: specifies the minimum archive age in days.
Many storage services are adding delete penalties for files removed before N days. For Google Nearline and Amazon Infrequent Access storage, it is 30 days. HB needs to be aware of this because otherwise, it could pack an archive several times in a month if enough data was removed, which would cost more than leaving the data alone. The new HB default for this is 30 days. To disable this option and preserve existing behavior, set it to 0.
This option prevents repeated packing of small files. For example, if an arc file is 50K and 25K is deleted, it would be packed if pack-percent-free is 50 (the default). But, it's not worth the trouble for such a small savings. The default for this option is 1MB. To preserve existing behavior, set it to 4K, the minimum.
IMPORTANT NOTE for the pack-percent-free config option: most storage services are charging high download rates compared to storage rates. For example, it costs 8x more to download a file from S3 Infrequent Access (10 cents/GB) than to store the file for a month (1.25 cents/GB). Stated another way, it costs the same to store a file for 8 months or download it just once. If pack-remote-archives is set to True (the default is False), and cache-size-limit is >= 0 (not all archives are stored locally), consider bumping pack-percent-free much higher than 50 to limit packing downloads.
If cache-size-limit is -1, meaning a local copy is kept of all archive files, packing does not require a download, so this config option is not as important for cost control.
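How the packing thresholds combine can be sketched as a single decision function. This is a reading of the notes above, not HB's actual logic: `should_pack` is a hypothetical name, `min_free_bytes` stands in for the minimum-size option whose name is not given here, and treating that option as a threshold on freed bytes is an assumption.

```python
def should_pack(arc_bytes, free_bytes, age_days,
                pack_percent_free=50, pack_age_days=30,
                min_free_bytes=1024 * 1024):
    """Decide whether an archive is worth packing.
    min_free_bytes is a hypothetical name for the minimum-size option."""
    if age_days < pack_age_days:
        return False     # too young: packing could trigger a delete penalty
    if free_bytes < min_free_bytes:
        return False     # savings too small to be worth the trouble
    return free_bytes * 100 >= arc_bytes * pack_percent_free
```

With the defaults, the 50K archive with 25K deleted from the example above is left alone; setting the minimum to 4K restores the old behavior and it is packed.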
backup: the default arc-size-limit for new backups is now 100MB instead of 1GB. If you want to change the arc size for existing backups, use: hb config -c backupdir arc-size-limit 100mb. This will increase the number of arc files used, but there are very large sites running in this configuration with 40-50K arc files, without problems. Smaller archives make it more likely that HB can manage remote storage with delete commands instead of downloading, packing, and uploading.
selftest: the --inc freq/goal option is used to do incremental selftests, where a portion of the backup is checked every day. The freq and goal specify the percentage of the backup space to be checked (freq/goal).
Many storage services have a free allowance for downloads, for example, Backblaze B2 allows 1GB/day, and charges after that. To ensure incremental selftest doesn't go over the free allowance, a new limit option can be added. For example, --inc 1d/30d,500m means:
Selftest may still go over the limit if a single arc file is bigger than the limit. When the limit is triggered, selftest will display a message. In this case, your goal cannot be met, so a selftest of the complete backup will take longer than your specified goal.
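The effect of a download limit on the freq/goal schedule can be estimated with some simple arithmetic. This is a rough sketch with hypothetical function names; it assumes 1024-based units and ignores the arc-file granularity mentioned above.

```python
import math

def bytes_per_run(total_bytes, freq_days, goal_days, limit_bytes=None):
    """Bytes to check per selftest run: the freq/goal share of the backup,
    capped by the optional download limit."""
    target = math.ceil(total_bytes * freq_days / goal_days)
    if limit_bytes is not None:
        target = min(target, limit_bytes)
    return target

def days_to_cover(total_bytes, freq_days, goal_days, limit_bytes=None):
    """Days needed to check the whole backup at that per-run rate."""
    per_run = bytes_per_run(total_bytes, freq_days, goal_days, limit_bytes)
    if per_run == 0:
        return 0
    return math.ceil(total_bytes / per_run) * freq_days
```

For a 30 GB backup with 1d/30d, each run would check 1 GB; with a 500 MB limit the per-run amount drops to 500 MB and a full pass takes about 62 days instead of 30.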
the compare command compares a backup with a live filesystem and indicates new, changed, and deleted files. For changed files, the compare command shows which attributes changed. The new -v2 option shows the backup and filesystem values for each changed attribute.
HB reports all unhandled exceptions (tracebacks) via email. This email includes the HB version number, command line, traceback, and a short system description (Linux, Mac OSX, or BSD)
get: raw device restore fixed
get: show progress for large files only if displaying output
selftest: before, would go one arc file over the limit with --inc instead of staying under the limit. For GB-sized arc files, it makes a difference.
backup: a simulated backup could end with a traceback: AttributeError: Arc instance has no attribute 'iobuf'
selftest: an incorrect error was displayed: Error: for logid 1786855, hlogid 1786854 is invalid: /bin/ln [r836] This was a bug in selftest, not a problem with the database.
backup: OSX system files were sometimes incorrectly tagged sparse
ls: add a note for sparse and raw (device) files with the -l option
backup: gave a warning about slow backups & restores for -Z5 and higher, but should have only been for -Z8 and higher.
the rate keyword in dest.conf is now supported for Backblaze B2 and WebDAV destinations. See doc/dest.conf.examples/README for more details about the rate keyword. Basically it is an upload rate limit in bytes per second and allows suffixes like 512k to mean 512K (512 x 1024 = 524288) bytes per second.
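The suffix arithmetic can be illustrated with a small parser. Only the k suffix is confirmed by the text above; the m and g suffixes, the case-insensitive matching, and the name `parse_rate` are assumptions made for this sketch.

```python
# 'k' follows the notes above; 'm' and 'g' are assumed by analogy
_SUFFIXES = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}

def parse_rate(text):
    """Convert a rate such as '512k' into bytes per second."""
    text = text.strip().lower()
    if text and text[-1] in _SUFFIXES:
        return int(text[:-1]) * _SUFFIXES[text[-1]]
    return int(text)
```

Here `parse_rate("512k")` returns 524288, matching the 512 x 1024 example.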
"subdir" is a new keyword added to WebDAV destinations. This allows storing multiple backups in the same WebDAV area. It was possible to do this before if the subdirectories were created before using HB. Using the subdir keyword, HB will create these directories.
the example WebDAV file, doc/dest.conf.examples/dest.conf.dav, has more explanations about how to use HB with WebDAV. WebDAV servers are often configured differently and can be picky about their setup.
backup: a bug was introduced in the Nov 22 release, #1363, that caused backup to hang after creating only a few arc files. This bug was not related to a particular destination type.
    Traceback (most recent call last):
    File "/hb.py", line 104, in <module>
    File "/backup.py", line 2047, in main
    File "/dest.py", line 302, in init
    File "/dest.py", line 248, in initdest
    File "/dest.py", line 177, in startdest
    File "/glacdest.py", line 229, in __init__
    File "/s3dest.py", line 123, in __init__
    File "/basedest.py", line 90, in baseinit
Backblaze B2 is supported. B2 is still in the invite-only beta stage, so please observe Backblaze beta guidelines. See doc/dest.conf.examples/dest.conf.b2 for B2 dest.conf keywords.
if the "secure" keyword is used with a WebDAV destination like box.net, the SSL certificate received from the server is verified
the HB ls and get commands will not check file permissions stored in the backup if the user running HB is the owner of key.conf.
For example, if root does nightly backups and a user is given sufficient OS permissions to access the backup files, hb get checks permissions saved with backed-up files to see if the user has read access to the files being restored. If not, hb get issues a "No read permission" error and will not do the restore.
But in a disaster recovery situation with shared hosting, userids may change, the userid used for backup and/or owning the HB backup files may not be the same as the userid doing the restores, and the person running HB may not have any control over userids in a shared hosting situation. Now, if the userid running hb is the owner of key.conf, get and ls will proceed.
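The ownership rule can be expressed as a short check. This is a sketch of the idea only: the function names are hypothetical, and using the effective userid (rather than the real userid) is an assumption.

```python
import os

def runs_as_key_owner(key_conf_path):
    """True if the current (effective) user owns key.conf."""
    return os.stat(key_conf_path).st_uid == os.geteuid()

def may_restore(key_conf_path, has_read_permission):
    """Allow the restore if saved permissions grant read access,
    or if the user running the command owns key.conf."""
    return has_read_permission or runs_as_key_owner(key_conf_path)
```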
dest clear: if there were files flagged for removal, dest clear could complain about them being the only copy and refuse to delete them.
backup: backup now skips empty directories listed on the command line. This is useful when backing up a mounted file system, eg NFS, that isn’t currently mounted. Previously backup would mark all files as deleted if the file system wasn’t mounted.
retain: -m used by itself caused a traceback. Now, a message is displayed that the -x option is required if -s and -t are omitted.
rm: if the highest backup version is removed with -rN, it could create a confusing situation for files that were previously backed up, deleted in rev N, then backed up again, for example:
    $ hb ls -c hb -a
    Backup directory: xxx
    Most recent backup version: 1
    Showing all versions
    0 / (parent, partial)
    0 /Users (parent, partial)
    0 /Users/jon
    0 /Users/jon/x (deleted in version 1)
    1 /Users/jon/x
Now when the highest version is removed, any files marked deleted in that version will be undeleted, as if the deleted backup never occurred.
to reflect its more stable status and to avoid a release update during December holidays, HashBackup’s release schedule is changing. The new expiration schedule for the backup feature is:
January 15th -- April 15th -- July 15th -- October 15th
a user reported this traceback. The cause is now fixed. If you have this problem with a backup, delete the hb.sig file.
    Traceback (most recent call last):
    File "/hb.py", line 104, in <module>
    File "/backup.py", line 2191, in main
    File "/dest.py", line 371, in put
    File "/db.py", line 833, in genincdb
    ZeroDivisionError: integer division or modulo by zero
in unusual circumstances, HB may create a partial backup of a file, for example, when there is an I/O error backing it up, or when selftest has to truncate a file because it detects an error. These partial files were not handled correctly by retain, sometimes causing good files to be removed while the partial file was retained. Now, retain will delete partially backed up files if there is a later, complete backup of the file.
The next release of HashBackup will have many changes. Rather than releasing these changes now, forcing everyone to accept all of them before #1330 expires, the next major release is being held back until after Sep 15th.
If you prefer to have a very stable version, update to #1332 before Sep 15th. The only change is a bump in the expiration date. Update after Sep 15th to get the latest features of the new release.
if an EOF occurs when hb asks a yes/no question, hb aborts instead of asking the question again. This was a problem if a keyboard is unavailable, eg, hb running in the background.
cache-size-limit (config keyword) is honored when syncing files to a new destination, to avoid filling the local backup directory.
fixed unhelpful error message if an integer is expected for a keyword in dest.conf and something else is used
imap destinations previously did not support the timeout keyword, and could hang for long periods of time. Now, the default timeout is 5 minutes and can be changed with the timeout keyword in dest.conf. If a timeout occurs, hb will use its normal retry loop to recover.
selftest: a recent change verifies that hashes have the correct datatype in the hb database, but also could cause this error:
Traceback (most recent call last):
  File "/hb.py", line 178, in <module>
  File "/selftest.py", line 1166, in main
UnboundLocalError: local variable 'logid' referenced before assignment
backup: in #1316, variable-block dedup backups (-D without -B) created the wrong datatype for hashes, causing the previous error. This is now corrected, but you should run selftest -v2 --fix to check your backup. It may display errors like this:
Checking blocks I
Error: block 256, hash is type str
Corrected hash type
Checked 256 blocks I
selftest: a recent change enabled selftest to pack arc files, but a partial selftest could cause this traceback:
Traceback (most recent call last):
  File "/hb.py", line 178, in <module>
  File "/selftest.py", line 1198, in main
ValueError: need more than 2 values to unpack
for all of April, the focus will be on testing, test scripts, and bug fixes. After April, two days per week will be devoted to testing to ensure HB’s quality remains high.
selftest: -v4 --fix could cause arc size errors if it was interrupted. This is fixed and -v4 --fix is okay to use now. If anyone had this problem, selftest --fix will now correct it.
selftest: with -v4, if archives have to be downloaded to be checked, they will also be packed and uploaded if their free space is greater than pack-free-percent. This happens even if pack-remote-archives is False, because during selftest, the arc files are already local. To completely eliminate packing (probably not a good idea) set pack-free-percent to 100. Then empty arc files are deleted but none are packed.
selftest: if destinations were not in sync, selftest might display this message that isn’t actually an error:
Error: arc.637.0 wrong size on atmos destination: should be 134223056, is 0
selftest: if -v4 --fix encountered a bad block, deleted it, and the file containing it was the last file in an archive, selftest -v2 would need to be run to correct this residual error:
Error: arc.0.0 is not referenced
Marked for deletion
backup: with a simulated backup, if no backup data was generated because a zero-length file was saved or no files were modified, an error could occur after the backup when the empty arc file was deleted.
get: sometimes failed with these errors, often with .dmg or other disk image / large files with repeated blocks. It worked with -p0.
Block hash mismatch, blockid 1727226: pathname
File size mismatch, should be 17896386, is 14750658: pathname
the database lock problem seems to be solved by increasing the database timeout. The problem was random, but occurred more often on systems with:
a large buffer cache (lots of RAM)
disk I/O restrictions (VMs and shared hosts)
systems with lots of write activity
HB running under nice and/or ionice
XFS: it flushes every 30 seconds while ext4 flushes every 5, so more "dirty" buffers accumulate in the buffer cache with XFS
These all contribute to long sync times. HB databases use synchronous I/O for fault tolerance, and database commits require flushing the system buffer cache. On some systems, this flush could take longer than 15 seconds - the previous database timeout.
get: sometimes get would end with an error message even though the file restore was successful:
Unhandled exception in thread started by <bound method dir.loop of <dirdest.dir instance at 0x100f26fc8>>
Traceback (most recent call last):
  File "/basedest.py", line 331, in loop
backup: if the dedup table is >= 2GB, the next resize to 4+GB could fail. On Linux, the error was:
Error writing dedup table: wrote 2147479552 of 4147483640 bytes
On OSX, the error was:
OSError: [Errno 22] Invalid argument
$ hb backup -c hb mnt/dir-2/db mnt/dir/backups
Backup directory: /Users/jim/hb
This is backup version: 0
Dedup is not enabled
/Users/jim/mnt/dir-2/db
/Users/jim/mnt/dir-2/db/abc
Unable to stat file: No such file or directory: /Users/jim/mnt/dir/
Unable to backup: No such file or directory: /Users/jim/mnt/dir/backups
selftest: arc file healing is where an arc file has a bad block and is reconstructed from good blocks found in another copy of the arc file. This release corrects a bug found in internal testing.
in #1256, a default timeout of 30 seconds was added for Amazon S3, because of long hangs at a (Linux) customer site. But this change caused failures on Mac OSX:
dest s3: error #1 of 3 in send arc.1.0: [error] [Errno 35] Resource temporarily unavailable
dest glacier: error #1 of 3 in send arc.1001.5: [UnicodeDecodeError] 'ascii' codec can't decode byte 0xd0 in position 21: ordinal not in range(128)
retain previously operated on the entire backup. Now, pathnames can be used with retain so that it operates on just these files or directories. This is useful when you want to retain fewer copies of certain parts of the backup.
For example, if you backup /home with several users, each user could have a different retention policy by running retain several times with /home/user1, then /home/user2, etc.
As another example, you may have a log directory where you only want to keep the last 30 days, even though for the rest of the backup, you want to keep 1 year of backups.
In addition to the specific retains, it's a good idea to continue running retain on the entire backup, without a pathname, unless you are sure that the specific retains will cover all parts of the backup.
rm/retain: a simulated backup (config keyword simulated-backup = True) is used to model a backup with live data, to learn by experiment the best backup method and HB options to use, without requiring a lot of disk space. One key option is packing archive files. Packing always happens with local archive files when the free space exceeds the pack-percent-free config keyword. It happens with remote archives only if pack-remote-archives is True.
If pack-remote-archives was set on a simulated backup, it caused:
Error opening archive: Archive does not exist: /hbdev/hb/arc.0.0
Since it is important to be able to model remote archive packing, to see how it affects backup space utilization, this is now supported with simulated backups. When the config keyword pack-remote-archives is True (default is False), rm and retain will pack the simulated arc files and hb stats will show this.
selftest: -v2 can now be used with simulated backups. Any higher level will cause an error and stop.
rm/retain: when hard links were removed, it sometimes caused a selftest error "hash should not be null" on other files that were hard-linked to the removed file. This release fixes that bug. To correct existing "hash should not be null" errors in your backup, run selftest -v2 --fix until there are no errors.
selftest: when a hard-linked file is truncated, mark all related hard links as partial so they are saved on the next backup
Glacier: a recent upgrade to a new version of the Amazon library supporting Glacier could cause a traceback: IOError: [Errno 2] No such file or directory: 'endpoints.json'
selftest: there is a bug in HB’s hard link handling that can cause "hash should not be null" errors. The cause of the problem has not yet been fixed, but now selftest -v2 --fix will truncate the file to eliminate the errors. Then backup will save the file again. Still working on finding & fixing the hard link bug.
log errors: logs ending in _RUN are now included in the error summary. Sometimes if HB can’t get started properly, empty _RUN logs are created. Lots of these is a sign that "something bad" is happening.
selftest: in a recent change, selftest prints progress percentages in some long sections. Don’t print these if output is being sent to a file.
backup: if a named pipe, aka fifo, is listed on the command line, the fifo is opened and all data is saved. This can be used for example with a database dumper, to back up a database dump without having to create a huge dump file first. Here is a simple example:
mkfifo fi
cat somefile > fi &
hb backup -c backupdir fi
Instead of using cat, any program can be used, and its output will be backed up. It is not possible to put hb in a simple pipeline without a fifo, because it would not have a filename to associate with the data saved, so this does not work:
IMPORTANT: it is easy to have multiple processes writing to a fifo at the same time by mistake (I did it during testing.) When that happens, the fifo is getting data from two places and the backup is a mixture of the two. Or, said another way, it is trash.
Glacier: when downloading a file, HB tried destinations in the order they are listed in dest.conf. Now, Glacier is always a last resort even if it is listed first in dest.conf, to avoid 4-hour restore delays and potentially high retrieval charges.
Traceback (most recent call last):
  File "/hb.py", line 178, in <module>
  File "/selftest.py", line 1612, in main
UnboundLocalError: local variable 'maxvernum' referenced before assignment
s3 destination was causing hb to hang at the end because of a close added to help with the database lock problem
yes, the dreaded (and random-ish) "Database is locked" error is still lurking. Since it is not easily reproduced except at customer sites, here is another change that may help. Or not.
selftest: a new option, --inc, can be used to do more thorough testing of a backup over a period of time. For example, there are hb backups with over 30K arc files, none stored locally, so it’s not reasonable to do a full selftest all at once. Even with smaller backups, there may not be enough time in the backup window to do a full selftest.
The --inc option is: --inc f/g, where f is the frequency selftest is typically run, and g is the goal for completing the selftest. For example, --inc 1d/30d means selftest is run every day and a complete selftest should occur every 30 days. Or, --inc 1h/1q means a selftest runs every hour, and a complete selftest should occur every quarter (90 days).
--inc can be used with -v3, -v4, or -v5, and there is a separate checkpoint for each level. This allows incremental local arc file tests (-v3), local + remote arc tests (-v4), or file hash verification (-v5) to have different schedules.
selftest: previously, the default selftest level could be almost anything, depending on whether arc files are stored locally and other factors. Now, the default level is -v2, or -v1 for simulated backups. A note is displayed that higher levels will do more thorough checking. Higher levels of checking may involve downloading lots of arc files, so it seems reasonable to request that rather than having it be a default action.
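As a rough illustration of the --inc f/g form described above, the sketch below splits a value such as 1d/30d and estimates what share of the backup each run would have to cover to finish by the goal. This is not HB's code: the helper names are hypothetical, only the units that appear in the text (h, d, q) are modelled, and the idea that each run covers an equal share is an assumption.

```python
# Hypothetical helpers, not HB code. Only the units used in the text
# above are modelled: h = hour, d = day, q = quarter (90 days).
HOURS_PER_UNIT = {"h": 1, "d": 24, "q": 90 * 24}

def parse_period(text):
    """Convert a period such as '30d' to a number of hours."""
    count, unit = int(text[:-1]), text[-1]
    return count * HOURS_PER_UNIT[unit]

def fraction_per_run(inc):
    """Share of the backup one run must test to meet the goal (assumed even split)."""
    freq, goal = inc.split("/")
    return parse_period(freq) / parse_period(goal)

print(fraction_per_run("1d/30d"))  # one thirtieth of the backup per daily run
print(fraction_per_run("1h/1q"))   # one 2160th of the backup per hourly run
```

Under these assumptions, a backup with 30K arc files checked with --inc 1d/30d would need roughly 1000 arc files tested per daily run.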
For thorough backup checking, the recommended level for selftest is -v4, to check all arc file data. -v5 can be useful, to verify user file sha256 hashes, but it does not test every arc file like -v4.
Traceback (most recent call last):
  File "/hb.py", line 178, in <module>
  File "/selftest.py", line 897, in main
ValueError: need more than 0 values to unpack
selftest: if an expected file was missing at a destination, selftest would fail or hang indefinitely waiting for it to be downloaded. Now, the missing file will generate errors for each block, but if there are other copies of the arc file at other destinations, selftest will upload a copy to the destination where it was missing.
selftest: related to above, if the local copy of an arc file is missing, and it should be present because cache-size-limit is -1, selftest will leave a copy in the backup directory if there are destinations that have the file. If an arc file is truly missing and not present locally or at any destination, all of its blocks are deleted and the files using those blocks are truncated when --fix is used.
selftest: with the recent selftest changes, a race condition could cause this error:
Checking blocks I
Downloading arc.0.0
Checking arc.0.0
Exception in thread _writethread: [Errno 9] Bad file descriptor
  File "/shared.py", line 68, in start_thread
  File "/arc.py", line 65, in _writethread
selftest: don’t print 'Downloading arc.v.n' when there are no remote copies, and show the list of destinations when remote copies are downloading.
selftest: add a note to run selftest -v2 --fix if there were corrections
a new command, log, runs another HB command with logging. This command is experimental; feedback is welcome.
selftest: levels -v3 & -v4 check remote arc files
selftest: new -r (version) option
selftest: progress display
selftest: option to test specific arc files or pathnames
selftest: healing arc files using multiple destinations
debug keyword for Amazon S3, default S3 timeout
bug fixes
a new command, log, makes it easier to run HB commands with a log file. Any HB command can be run with logging. The usage is:
# hb log backup -c backupdir -D1g -B1m /Users/jim /etc
The hb log errors command displays log files from all commands that failed, zips all files to a monthly .zip file, and prints a summary of successful and failed commands. This info is written to stderr and the exit code is always 1, so putting hb log errors in a cron file will cause the output to be emailed.
When run from a regular terminal, command output is sent to the terminal and to the log file. When run from a cron job, or anytime stdout is redirected, output is only sent to the log file. Regular output (stdout) and error output (stderr) are interleaved, and every line is timestamped in the log file:
2015-02-17 Tue 21:48:16| Backup directory: /Users/jim/hb
2015-02-17 Tue 21:48:16| Copied HB program to /Users/jim/hb/hb.1236
2015-02-17 Tue 21:48:16| This is backup version: 0
2015-02-17 Tue 21:48:16| Dedup is not enabled
2015-02-17 Tue 21:48:16| /Users/jim/bcopy
2015-02-17 Tue 21:48:16| Writing archive 0.0
2015-02-17 Tue 21:48:16|
2015-02-17 Tue 21:48:16| Time: 0.5s
2015-02-17 Tue 21:48:16| Checked: 4 paths, 5542 bytes, 5.5 KB
2015-02-17 Tue 21:48:16| Saved: 4 paths, 5542 bytes, 5.5 KB
2015-02-17 Tue 21:48:16| Excluded: 0
2015-02-17 Tue 21:48:16| Dupbytes: 0
2015-02-17 Tue 21:48:16| Compression: 91%, 11.2:1
2015-02-17 Tue 21:48:16| Space: 496 B, 139 KB total
2015-02-17 Tue 21:48:16| No errors
2015-02-17 Tue 21:48:16| Exit 0: Success
Logs are written to the logs directory under the backup directory. The log filename is a timestamp and the command name. While a command is running, the log filename will have RUN at the end. If a command is suddenly aborted (kill -9), the RUN suffix will stay there. If the command fails, RUN is changed to FAIL. If the command succeeds, there is no status at the end.
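For anyone scripting around the logs directory, the suffix rules above could be checked along these lines. This is only a sketch: the text says the name ends in RUN or FAIL but does not give the exact filename layout, so the sample names below are made up for illustration.

```python
def log_status(filename):
    """Classify a log by its trailing status marker.

    Assumes only what the text states: a name ending in RUN is still
    running (or was killed), FAIL means the command failed, and no
    marker means success. The exact filename layout is not assumed.
    """
    if filename.endswith("RUN"):
        return "running or aborted"
    if filename.endswith("FAIL"):
        return "failed"
    return "succeeded"

# Hypothetical filenames, for illustration only.
for name in ("sample-backup_RUN", "sample-backup_FAIL", "sample-backup"):
    print(name, "->", log_status(name))
```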
selftest has these -v options that have not changed:
-v0: check database is readable
-v1: check database integrity
-v2: check database high-level structure
-v5: check user file hashes (sha256)
(No -v option means hb selects level based on several factors)
The -v3 and -v4 options have changed:
-v3: check all local archives
-v4: check all local and all remote archives
With -v4, arc files are downloaded from ALL destinations now, whereas previously, HB downloaded each arc file from only 1 destination - usually the first destination listed - and then, only if necessary. If you ran with cache-size-limit set to -1, selftest did not verify the remote arc files.
Another improvement is that -v4 uses local disk space equal to the number of destinations times one arc file size, so it can be run on huge backups that previously would have required a lot of local disk space. A cache plan is no longer needed for -v4, but it is still used with -v5.
IMPORTANT NOTE: selftest can't download and check arc files stored at Amazon Glacier. The 4-hour retrieval delay is unmanageable for selftest.
selftest has a new option, -r, to indicate the version you want to test. Selftest always runs all of the usual database checks. With -r, the -v3 and -v4 options only test the arc files in that version. For -v5, the -r option restricts full file verification to the user files in that version.
The default is to test all versions, as before. To test the most recent version, use -r-1.
selftest has a progress display in the block & ref tests, since these can be quite long and may appear "stuck"
selftest can test specific arc files or pathnames by listing them on the command line. If arc files are listed, all copies are tested as if -v4 was used. If pathnames are listed, their sha256 file hash is verified as if -v5 was used. If any pathname is a directory, all files underneath the directory are also checked. For listed pathnames, all versions are tested. A warning is printed if specific arc names are combined with -v4, or pathnames are combined with -v5; neither of these makes sense: -v4 tests all arc files and -v5 tests all pathnames.
selftest: with -v4 and multiple destinations, arc files can be corrected. All good blocks are merged into a new arc file that replaces any arc files with problems. selftest does not yet handle completely missing remote files and will usually halt or hang.
s3: if the debug keyword is added to an S3 destination, a log file destname.log is created in the backup directory. Add debug 99 to dest.conf. This will also usually cause HB to fail when exceptions occur rather than catching them and doing retries.
s3: a 30-second timeout has been added to help prevent hangs on downloads
recover: failed with a traceback related to adding simulated backups:
Traceback (most recent call last):
  File "/hb.py", line 156, in <module>
  File "/recover.py", line 191, in main
  File "/dest.py", line 143, in init
AttributeError: 'NoneType' object has no attribute 'get'
debug 99 keyword on a destination will display a detailed traceback if an error occurs during startup
Glacier: fix typo in yesterday’s release for this error: dest glac: error #1 of 3 in send arc.0.0: [NameError] global name 'location' is not defined
retain: add a default -x if none is specified. NOTE: this may remove lots of deleted files from your backup that should have been removed earlier, but weren’t
export: include dest.db in export.tar
pre-2014 database upgrade code removed
bug fixes
config: a new keyword, simulated-backup, can be set to True before the initial backup. When set True, no arc files are created by the backup command. This allows modeling backup options such as -B (block size) and -D (dedup table size), even for very large backups, without using a lot of disk space. Simulated backups also run faster because there is less I/O. Incremental backups work correctly, and the stats command can be used to view statistics, space used, etc.
Summary of differences for simulated-backup:
- must be set before the initial backup
- cannot be changed after the initial backup
- no arc files are created (not a real backup)
- no files are sent to remote destinations
- selftest is limited to -v1
- get & mount will fail with "No archive" errors
backup: if interrupted with a ctrl-c, backup will display the file it was saving. This can be useful with -v0/-v1 since file names are not printed
backup: if a file gets bigger during the backup, save whatever data is there. Previously, backup saved the file up to its size at the start of the backup, plus sometimes 1 extra block.
selftest: --fix will truncate files that contain a bad block
backup: (Mac OSX) HB is a case-sensitive program, so /users is not the same as /Users. Most Unix filesystems are also case-sensitive. HFS, the Mac OSX filesystem, is usually case-insensitive, so /users is the same as /Users. If HB did nothing, then backing up /users and /Users would result in saving the same directory under 2 different names. It’s not a problem, but it’s confusing.
To solve this, HB tries to figure out the correct "case" of all pathnames on the command line; then everything works correctly. To do this, HB needs read access to the parent directory of all pathnames on the command line, so to map /users to /Users, HB needs read access to / (root). Usually this is fine, but on some systems, HB may not have read access to a parent directory, and backup would fail with a traceback. Now it will display a new error message "Cannot verify filename case", and use the filename as-is.
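The case lookup described above can be sketched as follows. This is not HB's implementation; true_case is a hypothetical helper that lists each parent directory to find the on-disk spelling of a component, and keeps the component as given when the parent cannot be read, which mirrors the "use the filename as-is" fallback.

```python
import os

def true_case(path):
    """Rebuild path using the spelling found in each parent directory listing.

    Hypothetical sketch for POSIX-style paths. If a parent directory
    cannot be listed, the component is kept exactly as the caller wrote it.
    """
    parts = [p for p in os.path.abspath(path).split(os.sep) if p]
    resolved = os.sep
    for part in parts:
        try:
            entries = os.listdir(resolved)  # needs read access to the parent
        except OSError:
            entries = []                    # cannot verify case: use as-is
        if part in entries:
            match = part                    # exact spelling exists
        else:
            # first entry that differs only by case, else keep the input
            match = next((e for e in entries if e.lower() == part.lower()), part)
        resolved = os.path.join(resolved, match)
    return resolved
```

On a case-insensitive filesystem, calling true_case("/users") would return "/Users" if that is how the directory is spelled on disk, provided / is readable.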
backup: a new True/False config keyword, backup-linux-attrs, controls whether HB backs up Linux file attributes. Linux file attributes are set with the chattr command and displayed with lsattr. They are little used and poorly implemented on Linux, requiring an open file descriptor and an ioctl call. This can cause permission problems, especially in shared hosting environments. Now HB will only read and store file attributes on Linux if backup-linux-attrs is set True with the hb config command. The default is False.
NOTE: File attributes are not the same as extended attributes, also called xattrs. Extended attributes are always backed up if present. Xattrs are handled by the attr, getfattr, and setfattr commands on Linux, as well as the Linux ACL commands.
rm/retain: previously, HB would not pack version 0 or 1 arc files, created before Sep 10, 2012. Now it will. To see if your arc files need packing, check these lines in hb stats:
48 GB archive space
36 GB active archive bytes - 75.08%
After packing old archives on this backup, hb stats says:
39 GB archive space
36 GB active archive bytes - 93.06%
If cache-size-limit is -1 (the default), copies of arc files are kept locally and packing is enabled. If cache-size-limit is set, not all arc files are kept locally; they are only downloaded, packed, and uploaded if the config keyword pack-remote-archives is True.
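The percentage on the "active archive bytes" line above appears to be active bytes divided by total archive space; that reading is an assumption based on the figures shown, not a statement of how hb stats computes it. A quick check with the rounded GB values:

```python
def active_percent(archive_space, active_bytes):
    """Percentage of archive space holding active data (assumed formula)."""
    return 100.0 * active_bytes / archive_space

# The GB figures shown by hb stats are rounded, so these results only
# approximate the 75.08% and 93.06% in the text.
print(round(active_percent(48, 36), 2))  # 75.0
print(round(active_percent(39, 36), 2))  # 92.31
```

The remaining percentage is free space inside arc files, which is what packing reclaims.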
rm/retain: when arc files are deleted or compressed, rm & retain display the amount of backup space saved.
rm/retain: when cache-size-limit is set and pack-remote-archives is True, rm and retain now try to respect the cache size limit while compressing archives. This keeps HB from using too much local disk space. Packing might run slower if it has to wait for packed archives to be transmitted, to respect cache-size-limit.
retain: without the -x option, files that had been deleted from the filesystem were staying in the backup forever. This was fixed for -t (see below) but is still a problem for -s. Now, a default -x is always used. The default -x is the same as -t or the last time period of -s. So for -s30d12m, the default would be -x12m, ie, keep history of deleted files for 12 months after they are deleted.
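The default -x rule for -s can be illustrated with a small sketch. It assumes an -s value is a run of count-plus-unit periods, as in the -s30d12m example; default_x is a hypothetical name, and this is not HB's parser.

```python
import re

def default_x(s_value):
    """Return the last count+unit period of an -s value, e.g. '30d12m' -> '12m'.

    Assumes the value is a sequence of digits followed by one unit letter,
    as in the example in the text.
    """
    periods = re.findall(r"\d+[a-z]", s_value)
    if not periods:
        raise ValueError("no periods found in %r" % s_value)
    return periods[-1]

print("-x" + default_x("30d12m"))  # -x12m
```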
export: the dest.db file and a stub dest.conf file with just destination names are now included in export.tar to help debug problems that involve remote destinations. dest.db contains HB filenames (arc files, etc) and the names of destinations containing them. It does not contain user data, keys, or passwords.
retain: with -t retention, HB would not remove a deleted file if it was the only copy, no matter how old it was. Now it is removed when it has been deleted longer than the -t retention time.
Example:
- a file is saved January 1st
- the retention period is -t30: keep 30 days of file history
- the file is deleted January 31st
- for 30 days from Jan 31st, the deleted file must be restorable
- 31 days after it was deleted from the filesystem, it can be removed from the backup
retain: the -x option allows removing history for deleted files sooner than it would normally be removed. For example, with -t30d, the history of a deleted file is kept for 30 days after it is deleted. With -t30d -x15d, history of active files is kept for 30 days, but files that have been deleted (from the filesystem) have their history removed 15 days earlier. That was the design intent.
What actually happened (ie, the bug) is that retain with -x15d was removing deleted files *saved* more than 15 days ago, so it deleted files too soon. It should have been checking the date the file was deleted, not the date it was saved.
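The difference between the buggy and the intended check can be shown with dates. This is a sketch of the rule as described, not HB's code; the helper name and the year are chosen arbitrarily.

```python
from datetime import date

def expired(today, reference_day, keep_days):
    """True once more than keep_days have passed since reference_day."""
    return (today - reference_day).days > keep_days

saved = date(2015, 1, 1)     # year is arbitrary, for illustration only
deleted = date(2015, 1, 31)
today = date(2015, 2, 5)     # 5 days after the file was deleted

buggy = expired(today, saved, 15)    # measured from the save date
fixed = expired(today, deleted, 15)  # measured from the deletion date
print(buggy, fixed)  # True False
```

With -x15d, the buggy check removes the file only 5 days after deletion because it was saved 35 days ago, while the corrected check keeps it until more than 15 days have passed since deletion.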
on Linux, when HB asked a yes/no question it sometimes would not display the question, yet waited for a response. Related to a recent buffering change for stdout.
on Glacier, HB was not ignoring "not found" errors when trying to remove files, causing these kinds of error messages:
glac(glac): unable to remove arc.0.0 with archive id (long id) Expected 204, got (404, code=ResourceNotFoundException, message=The archive ID was not found: (long id)
This can happen if part of a backup is saved in one region, the Location keyword (region) is changed in dest.conf without changing destname, then you backup in a different region.
backup: when a backup has old, inactive destinations that were used in the past, HB would sometimes try to remove files from them. This could cause errors like below with active Glacier destinations. Now HB ignores inactive destinations.
glac(glac): no archive id available to remove this file: arc.0.13
#1200 is identical to #1199 except that HB’s database software has been upgraded to a new release. Everything should function the same, though performance may be slightly different. In testing, a long selftest, which has lots of database operations, took about 7% less time than with the previous version. If performance is much worse for any operation, please send an email.
rm: when a hard-linked file is removed, sometimes its data blocks were not being deleted. This could cause a selftest error:
Error: unknown logid referenced: 5 [r0] IndexError: list index out of range
when remote destinations are set up in dest.conf, HB creates incremental versions of hb.db (the main HB database) in the local backup directory. These are sent to remote destinations and are used by hb recover to re-create hb.db if the local backup directory is lost. In some cases, you may be working directly with a remote backup directory. For example, the remote backup directory might be on an external USB drive and you bring it back to the office to do a complete restore when a disk dies.
The "normal" way to do this would be to create a local recovery directory, add key.conf and dest.conf, and run recover. This would download (or copy from the backup drive) all of the files from the remote backup directory and re-create hb.db.
Now, you can put a key.conf file in the backup directory itself and the next HB command will re-create hb.db from the hb.db.N files. It saves a copy step and allows you to copy the HB program and key.conf to a remote destination and run selftest, for example. But be aware that you are operating directly on the backup and any modifications will likely corrupt it.
stats: the display precision on a few statistics was increased for multi-TB backups. Some new statistics were added to hb stats, for example, an estimation of the backup space saved by dedup.
backup: if a directory /abc/def exists and the path /abc/def-ghi (either a file or directory) is excluded in inex.conf, backup would usually save /abc/def-ghi anyway. This bug could occur with many punctuation characters other than dash.
audit: if a program was still running, audit would show Finished: as the current time. Now it displays a blank space.
stats: if a backup was in progress, stats would sometimes display an error
selftest: it’s possible for a file’s size to change while it is being backed up. Usually the file is growing, but it can also be truncated. If a file is truncated to zero bytes between the time that HB reads the file size and begins the file backup, selftest was incorrectly reporting this error:
Error: logid 39528488 does not reference blocks: (pathname)
every version of the HB program used in a backup is now copied to the local backup directory, named hb.N, where N is the build number (not to be confused with hb.db.N files, which are database-related). This local copy is always created now, whereas before it was only created if remote destinations were configured. If the copy-executable config keyword is True, each version is also copied to remote destinations. You can delete old copies of hb.N manually, and they will also be deleted from remotes.
IMPORTANT: do not delete hb.db.N files by mistake!
backup: if file system flags cannot be read for a file or directory, backup displays an error message as before, but continues as if the flags were zero rather than giving up on the file or directory. File system flags are set with the chattr (Linux) and chflags (BSD/OSX) commands and are not widely used. Related to this change, backup unnecessarily opened directories for reading on BSD/OSX systems, which could cause errors if there was only x access to a parent directory.
selftest: more new tests - yay!
selftest: if cache-size-limit is set, selftest -v3 or higher is used, and you answer No to the Continue? question, the arc cache was not cleaned up (lots of extra arc files left there).
stats: bombed with a traceback if the initial backup was still running, but had done at least 1 commit. Now it doesn’t bomb, but the statistics are still not quite accurate because stats uses completed backups for some of its numbers, but all backups for other numbers.
the first backup with this new version may upload many archives. The sync code was not always uploading archives that were already present on the destination if the size changed because of packing following rm/retain in older versions of HB. This mainly occurred if packing was interrupted or a destination was down during packing. One of the new tests in selftest revealed this.
audit: if auditing is enabled and an HB command is running, audit would display the current date & time for the running HB process under "Finished". Now it displays a blank.
get/selftest: creating a plan to manage the local arc cache (when cache-size-limit is set) is 3-5x faster
prior to #1090 (June 2014), an unusual race condition during backup could cause this selftest error:
Error: for logid 3241850, pathid 2923104 is invalid.
This could also cause selftest to abort, with an error message to run selftest. Doh! Now, selftest --fix corrects these errors by manufacturing a pathname based on the prior pathname backed up. For example, if /data/file1 was the file backed up before the missing path, the missing path would be called /data/file1#path-N#hberror, where N is a unique number.
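The naming pattern for these manufactured paths can be written out as below. The function name is hypothetical and how selftest picks N is not described here, so N is simply passed in.

```python
def placeholder_path(prior_path, n):
    """Build a stand-in pathname in the form described above.

    prior_path is the file backed up just before the missing path;
    n is a unique number (how HB chooses it is not specified here).
    """
    return "%s#path-%d#hberror" % (prior_path, n)

print(placeholder_path("/data/file1", 1))  # /data/file1#path-1#hberror
```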
get/selftest: if cache-size-limit is set, archives are being packed, and rm/retain are used, get and selftest (with -v3 or higher) could fail with an error like this:
Planning cache...
Traceback (most recent call last):
  File "/hb.py", line 172, in <module>
  File "/selftest.py", line 857, in main
  File "/selftest.py", line 298, in plan
  File "/blocks.py", line 169, in iterprefetch
  File "/blocks.py", line 50, in getvernum
Exception: Blockid not found: 919621
a new command, export, creates a tar file of the HB database for customer support. This contains file metadata such as filenames, info listed by ls -l, and HB metadata, but no user file contents.
The advantage of export is:
a passphrase can be set, so if your export is intercepted, even with the key, the passphrase is still required
it clears backup keys stored in the database
it clears destination info if stored in the database
export’s tar file is much smaller than tar alone can create
selftest: more tests added
hb init sets the key file, key.conf, to be read-only to prevent accidental deletion. If the backup directory is actually on a Synology NAS mounted over AFP (AppleTalk), an error occurs because a read-only file can’t be renamed in this setup. HB was changed to reset the permissions before the rename.
NOTE: rekey still does not work in this configuration. A better way to set up a NAS is to use a local backup directory with -c, set up a Dir destination to the NAS in dest.conf, and set cache-size-limit with hb config to avoid having a local copy of the backup.
backup: apparently the backup expiration date did not get bumped back in October, though it says it was in the change log. Apologies to everyone for such a silly mistake.
selftest code has been reorganized and streamlined in this version, to better accommodate all the new tests
when HB output is redirected to files, for example:
hb backup -c xxx / >hb.log 2>&1
normal output was buffered differently than error output, so the log file didn't look right: error output and regular output weren't interleaved correctly. This also caused weirdness if output was sent to syslog, because output was dumped all at once at the end of the program. Now log files should look like terminal sessions. NOTE: this may be a bit slower if a lot of output is sent to a file on a remote file system.
This caused the error count to always be at least 2, which made backup monitoring more difficult. Now, backup will display only one warning:
retain with -x and -v would sometimes stop with an error:
No pathname for pathid N; run selftest
Selftest would complete without errors. The bug was that retain had deleted a pathname because of -x, then wanted to display the pathname because of -v. The simple fix was to display the pathname first, then delete it.
backup: dedup tables >= 2GB could cause problems when written to disk, with a message (Linux) like:
Error writing dedup table: wrote 2147479552 of 2147483640 bytes
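The byte count in this message, 2147479552, is 0x7FFFF000, which is the most a single write() call transfers on Linux. The usual remedy for short writes is to loop until everything is written. The sketch below shows that general technique; it is an assumption about the kind of fix involved, not HB's actual code, and write_all is a hypothetical name.

```python
import os


def write_all(fd: int, data: bytes, max_chunk: int = 0x7FFFF000) -> int:
    """Write all of data to fd, retrying after short writes.

    max_chunk defaults to the Linux per-call limit; a smaller value
    can be passed to exercise the loop.
    """
    view = memoryview(data)
    total = 0
    while total < len(view):
        written = os.write(fd, view[total:total + max_chunk])
        if written == 0:
            raise OSError("write returned 0 bytes")
        total += written
    return total
```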
backup: on July 29th, 2009, the -n option was added to HB backup so that if retain was running next, the entire HB database was not sent twice: once from backup, then again from retain. For a long time now, HB has been sending much smaller incremental database updates instead of a complete dump, so the justification for -n is gone. If -n is used with backup, and retain decides not to delete any files, the database does not get sent at all - clearly not the intent. In this rev, -n issues a warning and is ignored. Early in 2015, the -n option will be removed altogether and backups will fail if it is still used, because it can be a dangerous option.
selftest: a few more tests have been added for -v2, for very large backups. Selftest also had a bug where it would not check files that had extended attributes + ACL in common with more than 64K other files.
IMPORTANT: all customers should run selftest -v2 to check their backup.
Getting arc.34.0
dest rsync: stopping because of errors
dest rsync: Traceback (most recent call last):
  File "/basedest.py", line 310, in loop
  File "/basedest.py", line 469, in getcmd
  File "/basedest.py", line 443, in getparts
  File "/basedest.py", line 67, in retry
  File "/rsyncdest.py", line 116, in getfile
nosuchfile
Traceback (most recent call last):
  File "/hb.py", line 146, in <module>
  File "/retain.py", line 322, in main
  File "/rm.py", line 181, in finish
  File "/arc.py", line 92, in compress
  File "/arc.py", line 156, in __init__
  File "/arc.py", line 189, in _open
  File "/arcs.py", line 713, in fetch
NoArchive: Archive does not exist:
rm/retain: with small VMs, rm/retain might have "out of memory" errors during startup with rsync destinations, because HB is unable to fork rsync while the dedup table is loaded. It is recommended to use workers 1 in dest.conf in low memory environments with rsync destinations.
when packing arc files because of a rm or retain, HB stores new, packed archives on remote destinations, then after all new archives are stored, removes the obsolete archives. This order is necessary to preserve the integrity of the remote backup, but it also uses more remote disk space. For users who are tight on remote disk space or have filled a remote, it may be necessary to remove old data first, then add new data. The new config keyword "remote-update" can be set to either "normal" (the default), or "unsafe" to control the order of operations.
IMPORTANT: if you set remote-update to unsafe and an operation is interrupted, the remote backup area may be in a temporary "bad" state, and doing a recover will fail. The local backup directory is fine, and the next complete HB run should correct the remote backup area.
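The difference between the two remote-update settings is only the order of operations. A minimal sketch, assuming archives are identified by name; plan_remote_update and the ("send"/"remove", name) tuples are hypothetical and only illustrate the ordering:

```python
def plan_remote_update(new_archives, obsolete_archives, mode="normal"):
    """Return the ordered remote operations for a packing run.

    normal: send new archives first, then remove obsolete ones
            (remote stays usable, but needs more space temporarily).
    unsafe: remove obsolete archives first, then send new ones
            (saves space, but an interruption leaves the remote broken).
    """
    if mode not in ("normal", "unsafe"):
        raise ValueError(f"unknown remote-update mode: {mode}")
    sends = [("send", name) for name in new_archives]
    removes = [("remove", name) for name in obsolete_archives]
    return sends + removes if mode == "normal" else removes + sends


print(plan_remote_update(["arc.5.1"], ["arc.5.0"], "unsafe"))
```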
if HB has been only creating a local backup, then a dest.conf is created, then hb dest -c xxx sync is done, hb.db.0 was not sent to the remote because it didn’t exist. The backup command normally creates these, but did not since there was no dest.conf. The remote backup is not usable without hb.db.N files. An interrupted backup would cause a somewhat similar situation. Now, hb dest sync will create a new hb.db.N and send it to all remotes.
the hb dest -c xxx ls command sorted files by destination and filename. But filenames like arc.10.0 would sort before arc.2.0, which gets very confusing. Now filenames are sorted as expected: all arc.0.N files, then arc.1.N, arc.2.N, …, arc.10.N, etc.
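The ordering problem comes from comparing digits as text. One common way to get the expected order is a sort key that compares the numeric parts as integers. This is a generic sketch of that technique, not HB's code; natural_key is a hypothetical name.

```python
import re


def natural_key(name: str):
    """Split a filename into text and integer parts so that
    arc.2.0 sorts before arc.10.0."""
    return [int(part) if part.isdigit() else part
            for part in re.split(r"(\d+)", name)]


names = ["arc.10.0", "arc.2.0", "arc.0.1", "arc.1.0"]
print(sorted(names))                   # plain text sort: arc.10.0 before arc.2.0
print(sorted(names, key=natural_key))  # numeric-aware sort
```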
rm/retain: if a disk full occurred on a remote while packing archives, the next HB command could get stuck on "Getting arc.X.Y" (the old archive that was packed) with an error that it didn’t exist. This bug was introduced in the Sep 10th release. Running selftest -v2 --fix will correct this.
rm/retain: previously rm & retain did not do a local→remote sync like backup does when it starts. Now they do a sync when finished.
selftest: a few new tests were added for -v2 and above. -v2 is an important selftest level because it can be used on very large backups without having to download archives if they are not local. One new test makes sure that all arc files are either local or on at least 1 remote destination. Note that HB does not know whether files are really on remotes, but only that it successfully sent them in the past and they should still be there.
backup: previously, the backup command would remove partial or uncommitted archive files when it was interrupted, for example, with a ctrl-c. This can cause race conditions and destination errors, because the files being removed may already be queued or open for transmit. Now, backup will just exit, and extra archive files will be removed at the end of the next backup, rm, or retain.
display a message when creating hb.db.N files. It can take some time to create these files if there is a lot of backup history, even if only a single file is backed up, and it’s not obvious what is happening during the delay.
when HB syncs destinations, it does a better job of removing .tmp files from remotes (affects dir, ftp, and ssh destinations)
when a specific destination is cleared with hb dest -c xxx clear ddd, HB checked to make sure it was not deleting the only copy of a file, but it would delete files if they were on other destinations. Now, HB will not delete any files from a destination if it contains the only copy of any file.
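The new rule is all-or-nothing: if the destination holds the only copy of even one file, nothing is deleted from it. A minimal sketch, assuming a mapping from filename to the set of places that hold a copy ("local" or a destination name); the function and data layout are hypothetical:

```python
def files_safe_to_clear(dest, locations):
    """Return the files that may be deleted from dest.

    locations maps filename -> set of places holding a copy.
    If dest holds the sole copy of any file, return an empty list.
    """
    on_dest = [name for name, places in locations.items() if dest in places]
    if any(locations[name] == {dest} for name in on_dest):
        return []
    return sorted(on_dest)


locations = {
    "arc.0.0": {"local", "s3", "ftp"},
    "arc.1.0": {"ftp"},  # only copy is on ftp
}
print(files_safe_to_clear("ftp", locations))
print(files_safe_to_clear("s3", locations))
```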
security doc updated: dest.db sent directly to remotes
webdav: added digest authorization. Currently HB uses Basic authentication first, then Digest authentication if Basic fails. This is not ideal, because it defeats the "security" enhancements of Digest authentication. If security is a concern, use the secure keyword to enable ssl.
webdav: some dav servers only allow use of credentials for a certain amount of time before throwing an error. HB will now recover from these credential timeout errors, though it will still show up as a retry. Ideally, these should be handled more gracefully by HB so they don’t look like errors.
rm/retain: release dedup table early to help prevent running out of virtual memory in small VM environments with multiple rsync workers
rm/retain: "packing" is the recovery of deleted space within archive files and is performed automatically based on config settings during rm and retain. Previously, HB would overwrite arc files during packing and then send them to remote destinations. But between sending a packed archive and sending hb.db.n, the remote backup was in a transient, broken state. If rm or retain was interrupted during this time, the remote backup would be broken (ie, a recover would fail) until the next backup, rm, or retain automatically corrected it.
Now, packed archive files are created with new names and the old, unpacked archives are removed afterwards. This prevents the bad transient state if HB is interrupted. The downsides of this new method are that more space will be temporarily required on the remote side, equal to the size of all packed archives, and rsync will no longer be able to use its fast "delta transmission" to upload a packed archive.
There were a couple of other cases of potentially bad transient states on remote destinations that have also been corrected. An interrupted recrypt still leaves the remote backup in a bad state until the recrypt is finished, but this is documented in recrypt.
previously, dest.db was compressed & signed before sending to remotes. Since dest.db is always encrypted, there was little security benefit to this and it was confusing that a remote dest.db file was very different from a local dest.db file with the same name. Now, dest.db is copied without modification and is directly usable as a dest.db file if you have the correct key.
NOTE: the local and remote dest.db may not match if HB removes files after dest.db is copied. This is normal and expected.
WebDAV remote destinations are now supported. The destination type is either dav or webdav. There is a new doc file in doc/dest.conf.examples to explain the options for DAV destinations.
recover: a new option, -n, makes restoring a few files faster in a disaster recovery situation. If the local backup directory is lost, recover is used to download files from a remote destination. But recover downloaded all archive files, unless arc-cache-limit was set. For large backups, this could take a very long time. Now, when the -n option is used, recover does not download any archives. The get command can then be used to restore the required files, and only the archives needed will be downloaded. Later, recover can be used again without -n to recover all archive files.
Dir destination: previously, HB would try to do a symlink to "fetch" files from a remote Dir destination. This is useful when cache-size-limit is set and Dir destinations are actually remote drives, like Google Drive, Dropbox, etc., because instead of downloading the whole arc file from the remote service, HB can fetch only the blocks it needs for a get command. But it can also be confusing if you don’t realize what is happening. So a new symlink keyword has been added. If symlink is not present or is False, no symlinks will be used. If symlink is present with no value or is True, symlink will be attempted. If symlink fails, HB falls back to a regular copy.
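The symlink keyword amounts to "try a symlink if asked, otherwise copy". A minimal sketch of that fallback, assuming plain file paths; fetch_from_dir is a hypothetical name and this is not HB's code:

```python
import os
import shutil


def fetch_from_dir(src: str, dst: str, symlink: bool = False) -> str:
    """Make src available at dst; return how it was done.

    With symlink=True a symlink is attempted first; if that fails
    (or symlink is False), fall back to a regular copy.
    """
    if symlink:
        try:
            os.symlink(src, dst)
            return "symlink"
        except OSError:
            pass  # fall back to a regular copy below
    shutil.copyfile(src, dst)
    return "copy"
```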
recover: dest.db is always downloaded, even if it already exists in the recovery directory. A customer reported problems with recover, but the real problem was that recover had been run earlier, a backup was run (deleting files from the remote directory), and recover was run a 2nd time in the recovery directory using a stale dest.db. This caused a hang problem when nonexistent hb.db.* files were retrieved.
recover: hb.db is always rebuilt, even if it already exists in the recover directory.
recover: hb.db is rebuilt from hb.db.N files. If recover is interrupted and restarted, it now will not download hb.db.N files that already exist and are the correct size.
recover: previously, recover would rename existing files with a .old suffix. Now, existing files that are going to be overwritten are deleted instead.
recover: customer reported hash mismatches after a recover. When recovering into a directory already containing arc files, verify that any existing arc files are the correct size, and if not, download the file again.
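Both the hb.db.N and arc file changes above rely on the same check: an existing file is reused only if its size matches what is expected. A minimal sketch of that check; needs_download is a hypothetical name, and where the expected size comes from is left to the caller:

```python
import os


def needs_download(path: str, expected_size: int) -> bool:
    """Return True if path is missing or is not the expected size."""
    try:
        return os.path.getsize(path) != expected_size
    except OSError:
        return True  # missing or unreadable: download again
```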
customer reported selftest error (-v2 or higher):
Error: for logid 3241850, pathid 2923104 is invalid
backup directory d again:
5a. hb reads the directory to get a list of files
5b. file d/x is deleted by another process
5c. hb tries to backup file d/x and gets a stat error
Afterwards, selftest will throw the error about an invalid path. This backup bug is now fixed.
related to above error, if the same thing happens with a nested directory x instead of a file, it caused this selftest error:
Traceback (most recent call last):
  File "/hb.py", line 151, in <module>
  File "/selftest.py", line 619, in main
  File "/selftest.py", line 251, in showpath
  File "/paths.py", line 112, in getpathname
Exception: No pathname for pathid 5; run selftest
a note was added to the S3 example dest.conf for Google Storage, explaining that Developer Keys have to be generated to use the S3-compatible API with Google Storage.
the security document was updated to mention loading dest.conf into hb.db to avoid having plaintext passwords in dest.conf, and describe modifications to the key generation procedure when the entropy pool is exhausted (Linux only).
customer reported error:
- rebooted during a backup
- this leaves hb.db-journal file (transaction log)
- run hb upgrade to get latest version of hb
- this version required a database upgrade
- database upgrade would not work because journal existed
$ hb versions -c hb
Current database rev: 13
Upgrading database to rev: 14
Warning: unable to audit command: Can't upgrade database with a transaction active
Backup directory: /Users/jim/hbdev-1035/hb
Traceback (most recent call last):
  File "/hb.py", line 171, in <module>
  File "/versions.py", line 139, in main
  File "/db.py", line 138, in opendb
  File "/db.py", line 379, in upgradedb
Exception: Can't upgrade database with a transaction active
recover: with Glacier destinations, if a recover is in progress, the recover is aborted, then it is restarted, HB tries to display the archives already in progress and when they started retrieval. This print message caused an error on Linux, either an import error for _strptime, or a seg fault.
mount displays a better error message when a mountpoint is already in use, a better message when the backup has been mounted, and explains how to abort the mount using Ctrl-\
stats command was printing:
4:1 reduction ratio of backed up files for last %d backups
instead of:
4:1 reduction ratio of backed up files for last 5 backups
beginning with #1032, a fatal exception was raised when a destination had trouble starting, for example, a Dir destination was unavailable because a removable USB disk wasn’t inserted. This fatal error was not intended, and the effect is that you couldn’t have destinations that were temporarily missing.
in #1035, a feature was added to detect overwriting a remote backup area by accident. But when doing an initial backup, this caused error messages to be displayed, usually 3, because HB kept trying to download the DESTID file when it did not (and should not) exist. The feature still works, but now the confusing error messages are not displayed.
backup: in #1059, saving / and /mnt did not save /mnt if it was a separate filesystem and -X wasn’t used. /mnt should have been saved because it was specifically mentioned on the command line. A similar thing could happen with excluded files that were on the command line.
NOTE: this rev will do an automatic database upgrade to dbrev 14 when any HB command is used. Extra statistics are maintained to speed up the stats command so it scales better for huge backups with millions of files. The existing database does have to be scanned during the upgrade to initialize these new statistics, so the upgrade could take some time to complete - about the same time as the old stats command took to run once
hb stats runs faster for huge backups and the "industry dedup ratio" will be more accurate for new backups
backup: prints the number of files and bytes checked in addition to the number actually saved (because they were modified). These numbers now include saved directories. When /abc/def is the only file backed up, what actually gets saved is /, /abc, and /abc/def. Backup will say 3 paths were checked and saved whereas before it said 1 file was saved.
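The new count includes every directory leading to a saved file. A small sketch of how /abc/def expands to three paths, assuming absolute POSIX pathnames; paths_saved is a hypothetical name:

```python
import posixpath


def paths_saved(path: str):
    """List a path together with all of its parent directories,
    starting from the root."""
    result = []
    current = posixpath.normpath(path)
    while True:
        result.append(current)
        parent = posixpath.dirname(current)
        if parent == current:  # reached the top
            break
        current = parent
    return result[::-1]


print(paths_saved("/abc/def"))
```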
mount: if an empty backup directory is mounted, it caused a traceback. Now, an error message 'No backups yet!' is displayed.
hb: if an invalid command is used, like hb xyz, hb could complain that the backup directory doesn’t exist, when the real error is that the command is not recognized
the error message displayed when the backup directory doesn’t exist is more specific, advising to use the -c option if it wasn’t used, or to use a different directory with the -c option. It was confusing when the -c option was omitted by accident.
when the backup database is newer than the hb program can handle, hb no longer recommends using the clear command. Auditing and command restrictions (disable-commands config option) prevent using clear in this situation.
backup: if a symlink to a mounted block device was used on the command line, backup would not check to see if the block device was mounted and display the appropriate warning.
help command could cause a traceback:
Traceback (most recent call last):
  File "/hb.py", line 51, in <module>
  File "/misc.py", line 556, in confdir
misc.err: Backup directory doesn't exist, use hb init command: /root/hashbackup
if a database upgrade fails with an error, the original database is re-installed. But if the upgrade was aborted with Ctrl-c, the original database was not re-installed.
in #1035, a new feature was added to prevent sending two backups to the same destination. But if a destination is flaky (imap in this case), hb could report that the destination ID’s did not match and you may be overwriting another backup, when the real problem is that the remote service had an issue and did not return the DESTID file. Error retries have been added to fix this.
when there is a destination ID mismatch, the local and remote IDs are displayed to help determine the problem
compare: could cause a traceback on ZFS, because ZFS ACLs are not yet supported. Now it displays a warning message like backup does.
with Amazon Glacier, HB creates an associated bucket in S3. But the location names for S3 and Glacier are sometimes not identical, and a traceback could occur:
Traceback (most recent call last):
  File "/hb.py", line 76, in <module>
  File "/backup.py", line 1973, in main
  File "/dest.py", line 180, in init
  File "/dest.py", line 126, in initdest
  File "/dest.py", line 62, in startdest
  File "/glacdest.py", line 172, in init1
  File "/s3dest.py", line 185, in init1
  File "/boto/s3/connection.py", line 500, in create_bucket
S3ResponseError: S3ResponseError: 400 Bad Request
<?xml version="1.0" encoding="UTF-8"?>
<Error>
<Code>InvalidLocationConstraint</Code>
<Message>The specified location-constraint is not valid</Message>
<LocationConstraint>us-east-1</LocationConstraint>
# hb upgrade
Traceback (most recent call last):
  File "/hb.py", line 51, in <module>
  File "/misc.py", line 556, in confdir
misc.err: Backup directory doesn't exist, use hb init command: /root/hashbackup
The upgrade command cannot be audited since it is not associated with a backup directory. To work around this problem, create a ~/hashbackup backup directory if you don't already have one, as in this example, first showing a failed upgrade, then success:
[jim@mb ~]$ hb upgrade
Traceback (most recent call last):
  File "/hb.py", line 51, in <module>
  File "/misc.py", line 556, in confdir
misc.err: Backup directory doesn't exist, use hb init command: /Users/jim/hashbackup
[jim@mb ~]$ hb init -c ~/hashbackup
Backup directory: /Users/jim/hashbackup
Permissions set for owner access only
Created key file /Users/jim/hashbackup/key.conf
Key file set to read-only
Setting include/exclude defaults: /Users/jim/hashbackup/inex.conf
VERY IMPORTANT: your backup is encrypted and can only be accessed with the encryption key, stored in the file:
/Users/jim/hashbackup/key.conf
You MUST make copies of this file and store them in a secure location, separate from your computer and backup data. If your hard drive fails, you will need this key to restore your files. If you setup any remote destinations in dest.conf, that file should be copied too.
a new file, DESTID, is stored on each destination to prevent accidentally overwriting a remote backup area. If this file does not match the backup, an error message is displayed and the destination is disabled (for this run of HB):
dest s3: destination ID mismatch - you may be overwriting another backup! Verify destination is correct; use hb dest setid s3 to disable this warning.
This error will not occur during normal operation, so if you see it, pay close attention to make sure you aren't overwriting a backup. The error will occur for example if:
HB may be slightly slower to start because the remote DESTID file has to be checked for every destination. An error message might be displayed on your first backup since DESTID will not exist yet.
a new dest subcommand, setid, sets DESTID on a remote destination(s) to match the current backup. Before doing this, make sure you are not overwriting an active backup.
audit was not working when -c wasn’t used, for example, when the HASHBACKUP_DIR environment (shell) variable was set or one of the default backup directories /var/hashbackup or ~/hashbackup was used.
dest: display unrecognized subcommand in error message
get: a new option, --delete, deletes existing files in restored directories that are not in the backup. This is similar to rsync’s --delete option, allowing HB to "sync" a directory like rsync rather than just add to it. With -v2 or higher (the default is -v2), the names of the deleted files and directories are printed.
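The effect of --delete can be pictured as a set difference: anything present in the restored directory but absent from the backup is removed. The sketch below only lists the candidates and deletes nothing. It assumes the backup contents are available as a set of paths relative to the restored directory; extras_to_delete is a hypothetical name.

```python
import os


def extras_to_delete(restored_dir: str, backed_up: set):
    """Return relative paths under restored_dir that are not in
    backed_up (the set of relative paths present in the backup)."""
    extras = []
    for root, dirs, files in os.walk(restored_dir):
        for name in dirs + files:
            rel = os.path.relpath(os.path.join(root, name), restored_dir)
            if rel not in backed_up:
                extras.append(rel)
    return sorted(extras)
```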
get: previously, get restored into a temporary file or directory, deleted the original file if it existed, and renamed the temp file. When restoring directories, often it makes more sense to merge the restored directory with an existing directory (that’s what tar does). Especially with ZFS, when restoring BSD jails, a user’s home directory might contain several different filesystems.
Now, get restores directly to the target file or directory. If the target already exists and is a directory, get will merge the restored contents with the existing contents, overwriting any existing files.
Traceback (most recent call last):
  File "/hb.py", line 99, in <module>
  File "/destcmd.py", line 162, in main
TypeError: cannot concatenate 'str' and 'int' objects
the dest subcommand erase has been eliminated. To move dest.conf from the database to a text file, use the new unload subcommand (see next change)
a new dest subcommand, unload, writes the dest.conf stored in the database to a text file and removes it from the database. If no pathname is given, the file is written to dest.conf in the backup directory. If the output file exists, HB prompts whether it’s okay to remove it. dest load and dest unload are now opposites.
a new dest subcommand, clear, removes all files from destinations specified, usually because you are no longer using that destination and are going to remove it from dest.conf after deleting files stored there. This command will not remove a file if it is the only copy available, ie, there is no local copy and no other remote copy.
a new dest subcommand, sync, ensures that all remote destinations are in sync with the local backup directory. This always happens at the start of each backup, but this dest command can force it at other times. Previously, backing up a small or dummy file like /dev/null would force a sync.
a new dest subcommand, ls, displays a listing of all files stored at each destination
a change in #1015 was for destination threads to shut down before the main program. The purpose of this was to avoid a race condition where a thread woke up while the main program was dying. This is a hard condition to duplicate. The symptom is that when the main program is exiting, something like this is displayed:
No errors
Unhandled exception in thread started by
Error in sys.excepthook:
Since this just happened to me with #1018, the fix in #1015 didn't work. And, the #1015 fix also made recover not fetch arc files, which obviously isn't a good thing. This release backs out #1015 and fixes the recover bug.
get: on Linux, disk partitions are often symbolic links to the actual block device. When the symbolic link name is used for backup, HB also saves the actual block device contents as a separate path. With get’s --todev option, the symbolic link could not be used since it is not a block device. Now, get will accept a symbolic link with --todev if the symlink is pointing to a block device. A warning message is displayed showing the symlink target.
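The check described above is "follow the symlink, then test whether the target is a block device". A minimal sketch using the standard library; todev_target is a hypothetical name and this is not HB's code:

```python
import os
import stat


def todev_target(path: str):
    """Resolve path (following symlinks) and return the real path
    if it is a block device, else None."""
    real = os.path.realpath(path)
    try:
        mode = os.stat(real).st_mode
    except OSError:
        return None  # target does not exist or is not accessible
    return real if stat.S_ISBLK(mode) else None
```

A caller could print the returned path as the warning that shows the symlink target.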
get: related to the above, if a symlink name is used on the get command line, and at backup time, this symlink pointed to a raw device, HB will restore the raw device. If the symlink does not already exist or has a different value than it did at the time of the backup, HB will only restore the symlink. If the get is repeated, HB will restore the block device. A warning message is displayed about restoring a block device instead of the symlink.
temporarily disabled ftp idle connection handling, because it sometimes causes spurious Python errors when the backup program stops. Backups are fine; the errors are caused by ftp idle timers running while the main program is dying.
ftp destinations have a new True/False keyword, "restart". If not present, the default is True and ftp will try to restart failed uploads and downloads. If the restart has trouble, HB will print an error message like:
size mismatch after restart: file is 4976624 bytes, restarted at 1048576 bytes, uploaded file is 3928048 bytes; disabling restart
If you see this message, restarts should probably be disabled by adding "restart False" to your dest.conf. Use debug 1 to see the ftp conversation to help troubleshoot restart problems.
ftp: if the Dir keyword is present, a cd occurs to this directory. Otherwise, backups are sent to the initial login directory.
ftp keepalive: routers, firewalls, VMs, and other intermediate devices sometimes drop the control (command) connection while a long file transfer is in progress. Then after the file transfer completes, a timeout occurs and HB thinks the file was not sent. Then a restart (usually) occurs to complete the file transfer. Setting keepalive may help prevent this, though whether it works depends on your operating system’s keepalive settings (how often keepalive packets are sent) and how long it takes the intermediate device to timeout the connection.
ftp destinations try to keep connections to a server open for a while after each operation to save making another connection. The idle keyword specifies how long in seconds the connection can be idle before HB closes it. The default is 15 seconds.
ftp: restart didn’t work with the bsd ftp server
if multiple destinations were set up and an old file needed to be sent to n destinations, it was sent n times to each destination
on Linux, block devices (logical volumes) are often symbolic links that are user-defined names for a "real" block device. When these symlinks are used on the backup command line, for example, /dev/mylv, the symlink is saved and also the actual block device, eg, /dev/dm-1. When backups occur for several logical volumes, an ls listing becomes confusing because it’s hard to tell which symlink goes with which actual device. Using ls -l shows the symlink target. In this release, the symlink target is always displayed if the pathname starts with /dev/ (and the user has cd access to the directory; this is a permission requirement to display any -l info).
ftp destinations now restart uploads if the ftp server software implements the REST and SIZE commands; most ftp servers implement these. Before restarting an upload, HB verifies that the first 1K of the local and remote files are equal, the file size must be 100K or greater, and the partial upload must be 100K or greater.
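The three restart conditions can be written as one predicate. This sketch assumes "100K" means 100 * 1024 bytes and "1K" means 1024 bytes, and that the caller has already fetched the remote size and the first 1K of the remote file; can_restart is a hypothetical name.

```python
MIN_BYTES = 100 * 1024   # assumed meaning of "100K"
PREFIX_BYTES = 1024      # assumed meaning of "1K"


def can_restart(local_data: bytes, remote_prefix: bytes, remote_size: int) -> bool:
    """Return True if an interrupted upload may be restarted.

    local_data:    full contents of the local file
    remote_prefix: first 1K of the partially uploaded remote file
    remote_size:   size of the partial upload in bytes
    """
    return (
        len(local_data) >= MIN_BYTES
        and remote_size >= MIN_BYTES
        and local_data[:PREFIX_BYTES] == remote_prefix[:PREFIX_BYTES]
    )
```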
a new destination keyword, "off", means to disable the destination. It can be re-enabled by deleting the "off" line, or changing it to "#off", which is a comment. This is useful for testing and travelling, when you may want to temporarily disable a destination. It’s easier than commenting out all the lines describing a destination. A disabled destination prints a warning message.
INCOMPATIBILITY NOTE: in previous releases, it was okay to repeat a keyword in a destination; the last keyword was used. Now, it is a fatal error to use the same keyword more than once in a destination. Repeating a keyword can cause subtle errors, ie, you think you are sending files one place, but actually, they are going somewhere else because of a repeated Host keyword. You can leave repeated keywords in the dest.conf file, with all but one commented out with a # mark.
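The repeated-keyword check can be sketched as a small parser. The file layout here is an assumption for illustration only: one "keyword value" pair per line, each destination starting with a destname line, keywords compared case-insensitively, and lines starting with # treated as comments. parse_dest_conf is a hypothetical name, not HB's parser.

```python
def parse_dest_conf(text: str):
    """Parse destinations, rejecting a keyword repeated within one."""
    dests = []
    current = None
    for lineno, raw in enumerate(text.splitlines(), 1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment (e.g. "#off")
        keyword, _, value = line.partition(" ")
        keyword = keyword.lower()
        if keyword == "destname":
            current = {}
            dests.append(current)
        elif current is None:
            raise ValueError(f"line {lineno}: keyword before destname")
        if keyword in current:
            raise ValueError(f"line {lineno}: keyword '{keyword}' repeated")
        current[keyword] = value.strip()
    return dests


conf = """\
destname nas
type ftp
#host old.example.com
host new.example.com
off
"""
print(parse_dest_conf(conf))
```

In this sketch a destination is disabled when the "off" key is present in its dictionary, and commenting out the extra Host line is what keeps the file valid.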
a new destination type, ftps, supports FTP-TLS (Transport Layer Security), also called FTP over SSL. This is not FTP over an ssh tunnel, also called Secure FTP. And it’s not sftp, which is a separate file transfer protocol built in to ssh. The names are a bit confusing…
FTP uses a control connection to send authentication and commands to the remote FTP server. A data connection is opened during file transfers. HB's ftps destination uses SSL for the command connection so your userid, password, and commands are encrypted while talking with the FTP server. HB does not use SSL for the data connection since the backup files being transferred are already encrypted.
The ftp destination type (without the s) is still available for internal servers, or servers that do not support TLS. If the ftps type is used with an FTP server that doesn't support TLS, this message is displayed and the destination halts:
ftp: the debug keyword with a positive number displays the FTP conversation. Higher numbers display more output, but usually debug 1 shows enough.
ftp: rather than making a new connection for each file, the ftp connection is left open and reused, reconnecting only when necessary.
when maxsize is used on a destination to limit file sizes, the permissions on the files created on the remote were rwxr-xr-x, but should have been rw-r--r--.
S3 destinations have a new 'multipart' keyword, True or False; no value means True. Multipart uploads are enabled by default for Amazon S3 and Dreamhost Dreamobjects. With multipart, instead of 3 workers sending 3 different files to S3, they all work on the same file in parallel. This helps minimize cache stalls during backup when cache-size-limit is used, and tests show more consistent and predictable upload rates to S3.
added fsync calls in a few places to prevent XFS creating zero-length files if the system crashes
imap: reduced memory usage 50%, though imap destinations still require twice the file size during send or receive to encode the file. Use a low number of workers for imap as they will use a lot of memory. In tests with gmx.com, 2 imap workers upload as fast as 6 workers, and 2.5x faster than just 1 worker can upload.
if an archive was packed but dest.db could not be written because of an unusual permission problem, rm / retain would stop with an error (correct). But when the permission problem was corrected, the packed archive was not sent to destinations during the next backup (incorrect). This is fixed.
backup: sync could fail with an error "No destinations in dest.conf contain arc.x.x", even though a destination did have the file. To trigger this, multiple destinations are setup, backup to them, delete one of the destinations, add a new destination, then do a backup, causing a sync, and triggering the error.
selftest: with small backups that are in memory, ie, don’t have to read from disk, a race condition could cause this traceback:
  File "/hb.py", line 144, in <module>
  File "/selftest.py", line 682, in main
  File "/selftest.py", line 201, in checkallblocks
NameError: global name 'shaq_seq' is not defined
destination initialization has changed. Some initialization, like checking a hostname with DNS, was being done in every worker thread instead of just once, and a fatal error would occur in every worker instead of just one
clear: could cause a traceback when destinations are configured because of a race condition while deleting dest.db:
  File "/basedest.py", line 231, in loop
  File "/basedest.py", line 408, in rmcmd
  File "/basedest.py", line 161, in getinfo
  File "/dest.py", line 214, in __init__
  File "/dest.py", line 230, in opendb
OSError: [Errno 2] No such file or directory: 'dest.db'
recover: recent improvements in the Cloud Files driver caused recover to fail with this traceback:
Traceback (most recent call last):
  File "/recover.py", line 626, in <module>
  File "/recover.py", line 279, in main
  File "/cfdest.py", line 107, in getfile
AttributeError: cf instance has no attribute 'container'
mount: because of ongoing issues with mount putting itself in the background, mount now runs in the foreground by default and the --debug option has been removed. To run mount in the background, use this to suppress all output:
$ hb mount -c backupdir mnt >/dev/null 2>&1 &
display file size, transfer time, and transfer rate after files are copied to a destination
Rackspace Cloud Files: new destination keyword 'servicenet', True or False, accesses Cloud Files over the local Rackspace network. (faster, no download charges)
Rackspace Cloud Files: random BadStatusLine errors and/or Broken pipe errors are less likely
Cloud Files, OpenStack: HB was requesting a 30 second timeout, but the Python Cloud Files library was not passing this correctly and the authentication timeout was actually 5 seconds. If the Rackspace authentication servers got busy or your connection was busy doing other things, this 5 second timeout could easily be exceeded and authentication would fail, leading to unnecessary retries.
Cloud Files, OpenStack: when certain errors occur, such as a timeout, hb was reusing the socket in the retry loop instead of opening a new one. It could be argued that the Cloud Files library should clean up when a socket error occurs after an HTTP request. Since it doesn’t, the hb retry loop was not working for these types of errors. The traceback would show CannotSendRequest instead of the real error, which was a timeout.
Rackspace Cloud Files (destination type 'cf'): add 'location' keyword, with values of either us or uk. If not specified, the default is us. This is Rackspace specific, only applies when the destination type is 'cf', and does not apply to other OpenStack services.
OpenStack (destination type 'os'): add REQUIRED 'authurl' keyword to specify the authentication endpoint for non-Rackspace OpenStack object stores. The version 1.0 API is used, so the url looks like (these are Rackspace’s endpoints):
https://auth.api.rackspacecloud.com/v1.0
https://lon.auth.api.rackspacecloud.com/v1.0
retain has a new option, -v, to display the files being deleted. It shows the file backup time, filename, version, and retain option that caused the file to be deleted. For example:
recover: if dest.db couldn’t be fetched, recover was giving this traceback instead of the reason dest.db couldn’t be fetched:
Traceback (most recent call last):
  File "/hb.py", line 124, in <module>
  File "/recover.py", line 282, in main
NameError: global name 'destdb' is not defined
Dir destinations: when recovering files, dir destinations try to use symbolic links because they are much faster than copying files. But some filesystems don’t support symlinks and a traceback occurred:
Traceback (most recent call last):
  File "/hb.py", line 124, in <module>
  File "/recover.py", line 279, in main
  File "/dirdest.py", line 36, in getfile
OSError: [Errno 95] Operation not supported
Amazon Glacier uploads failed with the error: dest glac: error #1 of 3 in send arc.18.1: 'Layer2' object has no attribute 'close'
backup: if cache-size-limit was set, this traceback could occur:
Traceback (most recent call last):
  File "/hb.py", line 75, in <module>
  File "/backup.py", line 1973, in main
  File "/dest.py", line 684, in sync
  File "/arcs.py", line 376, in initcache
UnboundLocalError: local variable 'arcbytes' referenced before assignment
get: if cache-size-limit was set, a directory was being restored, and -r was used to restore an older version, this traceback could occur:
Traceback (most recent call last):
  File "/hb.py", line 103, in <module>
  File "/get.py", line 1087, in main
  File "/get.py", line 868, in plan
  File "/get.py", line 922, in prefetch
TypeError: not all arguments converted during string formatting
when an error occurred with a capitalized command line option, eg, -D with no size, the error message would be: Argument -d: expected one argument instead of: Argument -D: expected one argument
mount: when cache-size-limit was set, mount was run in the background (without --debug), backup file data was referenced through the mount, and a remote archive had to be downloaded, the background hb mount process could die and the file access would then fail with an input/output error
when cache-size-limit was set to a small number, like 3, it means 3x the arc-size-limit. But the limit was actually higher because a 4MB fudge factor was added. For large archives, this doesn’t matter, but for small caches and smallish archives, it is more noticeable so the fudge factor has been removed.
when VMWare shared folders are used as the -c backup directory, the timestamps (mtime) of files hb creates in the backup directory can change during a transfer to a remote destination. This isn’t supposed to happen - it’s some weirdness in VMWare’s hgfs - and hb complains about it:
dest rsync: stopping because of errors
dest rsync: Traceback (most recent call last):
  File "/basedest.py", line 198, in loop
  File "/basedest.py", line 281, in sendcmd
Exception: file changed during transfer: 1363391280.0 != 1363391281.23
Now this comparison isn't done at all. Instead, inode number and size are compared before and after the transfer. This is to catch the odd case of someone sending the backup on top of itself, which has actually happened by mistake.
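The inode-and-size check described above can be sketched roughly as follows. This is only an illustration, not HashBackup's code: `snapshot`, `send_checked`, and the `transfer` callable are hypothetical names standing in for the real upload path.

```python
import os


def snapshot(path):
    """Return (inode, size) for path; mtime is deliberately ignored."""
    st = os.stat(path)
    return st.st_ino, st.st_size


def send_checked(path, transfer):
    """Run transfer(path) and raise if the file's inode or size changed.

    `transfer` is a hypothetical callable standing in for the real upload.
    """
    before = snapshot(path)
    transfer(path)
    after = snapshot(path)
    if before != after:
        raise RuntimeError(
            "file changed during transfer: %r != %r" % (before, after))
```

Comparing inode and size rather than mtime sidesteps filesystems that adjust timestamps on their own, while still catching a transfer that overwrites its own source.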
clear: this command resets all config options to default values. It has always done this, but now a note is printed. The clear command should be replaced by init. To just remove all backup data, you can use rm /, though this is much slower than clear.
rm is 2-3x faster, especially with large files with a lot of dedup blocks. This also speeds up retain when it is removing old versions of files.
on 32-bit systems, stats could fail with a traceback:
Traceback (most recent call last):
  File "/hb.py", line 156, in <module>
  File "/stats.py", line 106, in main
OverflowError: Python int too large to convert to C long

Traceback (most recent call last):
  File "/hb.py", line 156, in <module>
  File "/stats.py", line 187, in main
TypeError: 'NoneType' object is not iterable
release #918 had a backup bug that could cause warnings about files changing size during the backup. This warning is misleading: what actually happened is that the file wasn’t backed up correctly because of the new feature in #918 where variable-block dedup was enabled for single-core backups. This warning only occurred on multi-core systems.
The backup for these files with warnings is no good, and will also cause selftest errors on these files. Because it is a thread timing / coordination problem, you may or may not have errors. One test system had the problem, another did not. It's recommended to completely remove any backups made with #918 with hb rm:
Your -r version number(s) will be different of course. Use hb versions to see which backups were created with #918 and remove all of them. Your next backup with #919 will re-save these files correctly.
backup with -p0 or on a single CPU system runs in a single process (thread). -p0 can be useful on multi-CPU systems to restrict hb’s system load. Previously, running in a single process also disabled variable-block dedup. Now backup supports VB dedup with single CPUs and multiple CPUs. If you had been running with -p0 or on a single CPU system, your next backup may be larger because files that had been saved with a fixed block size will be saved with a variable block size. Because of the block size change, files will not dedup well on this next backup, but will after that. To prevent this dedup gap, use -p0 -B1m, to disable variable-block dedup.
NOTE: variable-block dedup with -p0 (or on a single CPU) usually takes twice as long as fixed-block dedup. Add -B1m to disable variable-block dedup with a single CPU if you are more concerned about performance than VB dedup.
Amazon Glacier requires two fetches separated by 4 hours to retrieve files. HB’s recover command was doing this, but get and selftest were not: if your backup data was only located on Glacier, then restoring a few files required an hb get (which failed with an error), waiting 4 hours, then retrying the get (which would work this time).
In this release, if you are using Glacier as your first destination and running with cache-size-limit set so that your archive files are mostly remote, hb get will figure out which archives it needs to do your restore, request them all once, delay 4 hours, then start your restore, requesting archives again as they are needed.
IMPORTANT: the HB recover command has lots of options for doing paced retrievals, but get and selftest do not have those yet; everything is retrieved in 4 hours. This can be very expensive for very large restores, so if you need to restore everything, it may be cheaper to set cache-size-limit to -1, use recover to fetch all of your archives from Glacier to your backup directory with whatever pacing you need to meet your cost objectives, then do your get command to restore from the local backup directory.
cache-size-limit >= 0: some archives may be on remotes
A special case of #1 is when an archive file is missing, ie, it was manually deleted. HB will download the missing archive when it is needed. For Amazon Glacier, this requires two retrievals, separated by 4 hours. That wasn’t happening, but now it is. Please note, this is a very inefficient way to run HB with Glacier, because every individual archive needed will require a 4 hour delay. It’s much better to set a cache-size-limit; then HB will request all archives needed for a restore, wait 4 hours, then request the archives again and do the restore.
the recover --dl option did accept KB/s and Kb/s for bytes per second and bits per second (upper or lower case B), but expected the K, M, or G prefix to be uppercase and the /s, /m, /h suffix to be lower case. This is too confusing, so now, the --dl option is changed to all lowercase, and it’s no longer possible to specify a rate in bits per second.
if --dl 1 was used (rate of 1 byte per second), recover would loop forever because there wasn’t enough bandwidth to retrieve the largest arc file. Now it will display a message and ignore very low rates since they are equivalent to Option 4, the "cheap" download option. A separate change triggers a fatal error if this looping situation occurs in the future.
a bug was introduced with the new maxsize destination keyword that would generate a traceback similar to this:
Getting arc.0.7 from drobo-vaio-rsync
Unable to download archive arc.0.7: AttributeError("rsync instance has no attribute 'getparts'",)
Traceback (most recent call last):
  File "/hb.py", line 75, in <module>
  File "/backup.py", line 1941, in main
  File "/dest.py", line 779, in sync
  File "/dest.py", line 408, in condput
  File "/arc.py", line 141, in __init__
  File "/arc.py", line 174, in _open
  File "/arcs.py", line 676, in fetch
NoArchive: Archive does not exist: /root/vaio-drobo/arc.0.7
backup: a new option -X means to cross mount points, also called descending into other filesystems. The default is not to cross mount points, as before. Be careful, because -X does not discriminate between local filesystems and remote filesystems, so you can end up saving an entire NFS server for example.
the stats command is much faster when there are millions of blocks. A backup with 11 million unique blocks was taking 5 minutes for stats, but now takes around 2.5 minutes.
the --no-compress backup option was made obsolete when -Z0 was introduced in April 2012. --no-compress has been removed in this update.
the mount command would often give bad address errors when accessing files. This bug was introduced a couple of weeks ago, in #871.
a new keyword for destinations, 'maxsize', can be used to limit the size of files uploaded to a destination. For example, many imap servers have small limits like 25MB per file. While the arc-size-limit config keyword can be used to limit the size of archive files, there is no way to limit the size of the hb.db.n files. Now, using maxsize, a destination can impose a hard size limit. If a file to be uploaded is larger than maxsize, hb will split the file into pieces, each maxsize in length (except the last piece), and upload the pieces separately. Pieces have a .pN suffix added. When the file is retrieved from the destination, each piece is retrieved separately and the original file is reassembled.
Maxsize should be at least 1MB larger than arc-size-limit. Arc files can exceed arc-size-limit by up to 1 block (backup's -B parameter). If you use -B4M for 4MB blocks, Maxsize should be 4MB larger than arc-size-limit. If maxsize is equal to arc-size-limit, things will still work, but an archive slightly over arc-size-limit will have to be split on upload. This causes more I/O during upload, creates more files on the remote, and the 2nd piece is less than a block size, which is inefficient.
Remote error recovery has been rewritten so that each piece will be retried according to the destination's retry settings. But if one piece exceeds the error retries, the entire file will have to be retried.
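As a rough illustration of the maxsize behavior described above, the sketch below splits a file into pieces of at most maxsize bytes and reassembles them. It is not HashBackup's implementation: the function names are hypothetical, and numbering pieces from 0 is an assumption, since the notes only say that a .pN suffix is added.

```python
def split_file(path, maxsize):
    """Split path into pieces of at most maxsize bytes; return piece paths.

    The '.pN' suffix follows the release note; starting N at 0 is an
    assumption. Each piece is held in memory while it is written.
    """
    pieces = []
    with open(path, "rb") as src:
        n = 0
        while True:
            chunk = src.read(maxsize)
            if not chunk:
                break
            piece = "%s.p%d" % (path, n)
            with open(piece, "wb") as dst:
                dst.write(chunk)
            pieces.append(piece)
            n += 1
    return pieces


def join_pieces(pieces, outpath):
    """Reassemble the original file by concatenating pieces in order."""
    with open(outpath, "wb") as dst:
        for piece in pieces:
            with open(piece, "rb") as src:
                dst.write(src.read())
```

With this scheme every piece except possibly the last is exactly maxsize bytes, which is why an archive only slightly over the limit produces a tiny second piece.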
a new destination keyword, randfail, can be used to simulate remote failures. The value is an integer 0-100 representing the percentage of requests that should fail. So 25 means 1 out of 4 requests will fail, 50 means 1 of 2 will fail, 75 means 3 of 4 will fail, 100 means every request will fail. Simulated failures do not generate any remote traffic. Destination threads will stop when all requests fail for one file. Randfail is for testing hb’s error recovery and of course should not be used in normal operation (DUH!)
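One way a percentage-based failure injector like randfail might work is sketched below. The function name and the use of Python's random module are assumptions for illustration; the notes do not say how hb decides which requests fail.

```python
import random


def should_fail(randfail, rng=random):
    """Return True for roughly `randfail` percent of calls (0-100).

    rng.random() is in [0, 1), so 0 never fails and 100 always fails.
    """
    return rng.random() * 100 < randfail
```

A caller would check `should_fail(...)` before contacting the remote and raise a simulated error instead of sending anything, which matches the note that simulated failures generate no remote traffic.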
some new statistics have been added to the stats command, specifically the "industry standard dedup ratio" used by many other backup programs. This ratio is computed as sum(bytesin) / sum(bytesout) and assumes that every backup was a full backup. I didn’t say it made sense… Right now, HB has to compute these figures and it takes a while, but soon they will be recorded by the backup program so the stats command will not have to do so much work. It takes around 5 minutes for 700K files, so could take quite a while if there are many millions of files in the backup.
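The ratio above can be worked through with a small sketch. The function name and the sample figures are made up for illustration; bytesin is taken to be the full size of the data seen by each backup, and bytesout the bytes actually stored.

```python
def industry_dedup_ratio(backups):
    """backups: list of (bytesin, bytesout) pairs, one per backup version.

    Returns sum(bytesin) / sum(bytesout), or None if nothing was stored.
    """
    total_in = sum(b[0] for b in backups)
    total_out = sum(b[1] for b in backups)
    if total_out == 0:
        return None
    return total_in / total_out


# Hypothetical example: three backups of a 100 GB data set, where the
# first stores everything and the next two store only 5 GB of changes.
GB = 10**9
ratio = industry_dedup_ratio([(100 * GB, 100 * GB),
                              (100 * GB, 5 * GB),
                              (100 * GB, 5 * GB)])
```

Because every backup is counted as if it were a full backup, the ratio grows with the number of versions even when little data changes, which is why the figure should be read with care.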
a new option -v has been added to the stats command. After each line of statistics, a paragraph is displayed describing (to the best of my ability!) how the statistic is generated, what it means, and why it might be useful.
in testing with 10,000 archive files, the backup sync procedure was taking 50 seconds to figure out which archives needed to be copied to remotes, even if none did. Now it takes less than 1 second. This was a potential scalability issue for sites using smaller archive files (config variable arc-size-limit).
archive files default to 1GB (config variable arc-size-limit), but some sites want to use smaller archives, either because their storage system requires it or because it allows HB to manage deleted files better. With smaller archives, HB can delete entire archives more often when rm or retain delete files from the backup, rather than going through a download, pack, upload cycle. But smaller archives also have more overhead and can slow the backup down. Changes in this release reduce this overhead. A test backup that creates 5000 small arc files is now 65% faster: 40 seconds vs 114 seconds.
the stats command is 2x faster if a backup has many versions (mine has 600)
with Amazon Glacier, the dest.conf has a location keyword to specify the region for Glacier storage. This was not working, and all Glacier transfers were going to us-east-1, the default Glacier region. Now, the location keyword is honored, and the region is also checked to make sure it’s a valid Glacier region; currently there are fewer Glacier regions than for other AWS services.
Related to this change, HB creates an associated S3 bucket to be used with Glacier. This bucket contains the database files with backup metadata. The actual backup data - the bulk of data - is stored in Glacier. This S3 bucket was named:
<access-key>-hashbackup-glac-vaults
But, if your Glacier data is getting stored in us-west-1, you probably want your associated S3 bucket stored there too. So now, the associated S3 bucket name for regions other than us-east-1 is:
<access-key>-hashbackup-glac-vaults-<region>
if a backup contained only empty directories, the new stats command would fail with a traceback
if only dest.db or hb.db.n were being recovered from Glacier, recover had an unnecessary sleep after the message: Download size: 0 Files: 0
a new config option, no-backup-tag, can be set to a list of filenames. If a directory contains any of these files, the directory contents are not backed up. The directory itself and the tag file are the only items backed up. For example:
$ hb config -c backupdir -no-backup-tag .nobackup,CACHEDIR.TAG
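The tag-file rule can be sketched as a directory walk. This is an illustration under assumptions, not hb's scanner: `files_to_backup` is a hypothetical name, and tags are matched only against regular file names in each directory.

```python
import os


def files_to_backup(root, tags):
    """Yield paths under root, honoring no-backup tag files.

    If a directory contains any tag file, only the directory itself and
    the tag file(s) are yielded, and its subdirectories are skipped.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        present = [t for t in tags if t in filenames]
        yield dirpath
        if present:
            dirnames[:] = []  # stop os.walk from descending further
            for t in present:
                yield os.path.join(dirpath, t)
            continue
        for name in filenames:
            yield os.path.join(dirpath, name)
```

Clearing `dirnames` in place is the standard way to prune an `os.walk` traversal, so nothing below a tagged directory is visited at all.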
a new command, stats, displays statistics about a backup. More stats will be added in the future, specifically about dedup ratios.
the backup command now displays statistics after the backup, even with -v0. Some sites with huge backups - millions of files - used -v0 to prevent displaying any pathnames, but this also suppressed statistics
a new destination type, shell, allows customized programs and scripts to transfer files to and from remote destinations on behalf of hb. exdirshell.py in the doc subdirectory is an example shell implementation of the built-in Dir destination. Excluding comments and blank lines, this script has only 26 Python statements: shell destinations can be quite easy to write. There is a limitation that no state information generated by the remote, such as an object id, can be stored.
on hardened Linux kernels (grsecurity), hb required paxctl -m or paxctl-ng -m to allow anonymous mmap with execute privilege. This was only needed for the mount command, to load ctypes and libfuse libraries. But all hb commands would fail with an error in the logs like: denied RWX mmap of <anonymous mapping>, and a traceback like:
$ hb init -c vinci
Traceback (most recent call last):
  File "/hb.py", line 20, in <module>
  File "/arc.py", line 13, in <module>
  File "/arc550.py", line 19, in <module>
  File "/db.py", line 35, in <module>
  File "/conf.py", line 90, in <module>
  File "/help.py", line 19, in <module>
  File "/mount.py", line 75, in <module>
  File "/fuse.py", line 17, in <module>
ValueError: bad marshal data (unknown type code)
Now, only the mount command will fail on hardened kernels; all other hb commands will work normally, even without marking the hb binary. To use the mount command on hardened kernels, use paxctl-ng -m hb to allow anonymous mmap w/execute.
after each backup, rm, or retain, hb.db.n is sent to remote destinations. There is a trade-off between how much data is sent in each hb.db.n vs the total size of all hb.db.n files stored on the remote. The smaller each hb.db.n file, the less data transmitted after each backup, but the larger the total size of all hb.db.n files on the remote. In this release, hb.db.n files are slightly bigger but there will be fewer stored on the remote. In tests, this saves significant space on remote destinations.
the new dest command was not working in the release build of hb
when backups are sent to remote servers, the credentials for these servers - userids, passwords, access keys, etc. - are stored in dest.conf, unencrypted. A new command, dest, can be used to load this file into the encrypted backup database; then it can be deleted and hb will read the credentials from the database. Examples:
hb dest load - load dest.conf into the database
hb dest show - show dest.conf from the database
hb dest erase - erase dest.conf from the database
The dest command requires that admin-passphrase be set. Otherwise, anyone could use dest show to display the stored credentials in dest.conf. It is possible but not recommended to remove the admin-passphrase after loading dest.conf into the database.
backup: new option --maxwait <time> specifies the maximum time to wait for archives to upload to all destinations after the backup completes. The time can be specified as a number (seconds), 6h, 1d, etc. This is useful in these situations, maybe others:
for initial backups, the backup runs faster than most destinations can accept data. A backup that takes only a few hours to create may take days or even weeks to upload over an Internet connection. But you may only want your connection used at night, and you still want daily incrementals even if the initial upload has not finished. Starting the backup at midnight with --maxwait 6h is a way to handle this.
a new destination is added to a large existing backup. It may take days to get all of the existing backup data transmitted to the new destination, which is okay, but you don’t want to lock the backup directory for the entire time as this prevents future backups until the copy to the new destination has finished
IMPORTANT: be careful with --maxwait: if it is too short, your backup may never get fully synched to remote destinations and the remote data would always be incomplete.
backup: new option --maxtime <time> specifies the maximum time to spend actually saving files. When this time is exceeded, the backup stops and waits for uploads to finish using --maxwait, which is adjusted based on how long the backup took. Examples:
--maxtime 1h means to backup for up to 1 hour then wait the remainder of the hour to upload the data, ie, total time is limited to 1 hour
--maxtime 1h --maxwait 1h means to backup for up to 1 hour, then wait 1 hour + the remaining backup time for uploads to finish, ie, the total time is limited to 2 hours
This is useful for the initial and full backups, which usually take much longer than incremental backups, and allows you to spread the full backups over many days. It also prevents incrementals from running into production time when a large amount of data changes for some reason.
IMPORTANT: be careful with --maxtime: if it is shorter than the time required for your average incremental backup and upload, your backup may never finish, some files may never get backed up, and/or your remotes may never be fully uploaded. Backup should start where it left off when maxtime is set, but it doesn't do that yet.
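The interaction between --maxtime and --maxwait in the two examples above can be expressed as a small calculation. This is a sketch of the arithmetic only; the function name is hypothetical, times are in seconds, and treating an unspecified --maxwait as 0 is an assumption drawn from the first example.

```python
def upload_wait(maxtime, backup_elapsed, maxwait=0):
    """Seconds to wait for uploads after the backup phase ends.

    The unused part of maxtime is added to maxwait, so the total run
    time is bounded by maxtime + maxwait.
    """
    elapsed = min(backup_elapsed, maxtime)
    return maxwait + (maxtime - elapsed)


HOUR = 3600
# --maxtime 1h, backup finished after 20 minutes: 40 minutes left to upload
wait_a = upload_wait(1 * HOUR, 20 * 60)
# --maxtime 1h --maxwait 1h, same backup: 1 hour plus the unused 40 minutes
wait_b = upload_wait(1 * HOUR, 20 * 60, maxwait=1 * HOUR)
```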
backup: instead of giving an error when mounted block devices are backed up, backup will only display a warning. If a partition is mounted readonly for example, it’s fine to back it up even if mounted. Read/write partitions should not be backed up as they will be inconsistent when restored. You may still get an error on some OS’s, for example, on OSX:
Warning: backing up mounted block device: /dev/disk4s1
This is backup version: 0
Unable to open file: Resource busy: /dev/disk4s1
many options that only accepted integers now accept decimal points, for example, backup option -D1.5g, config option arc-size-limit 1.5g
clear: when a Dir destination is setup in dest.conf, files are symlinked to the backup directory instead of copied. The clear command deletes the destination files first, then the backup directory files. But the symlinks were pointing to non-existing files, so clear would not remove them. Then, backup would try to sync these dangling files, causing errors like:
Exception: dest: can't access file for put: /Users/jim/hbdev/hb/hb.db.25
backup: hb has special handling for huge directories to reduce memory usage. (Use ls -ld to see the directory size.) But if a huge directory was actually empty, it would cause an error:
Traceback (most recent call last):
  File "backup.py", line 2092, in <module>
  File "backup.py", line 1953, in main
  File "backup.py", line 1569, in backupobj
StopIteration
backup: in some environments, backup would halt with "database is locked" errors. This seemed more likely when multiple workers were used on destinations, and since the default is now 3 workers, this has become more of a problem. This error is sometimes hard to reproduce (that means I couldn’t reproduce it). Changes have been made in this release that should fix this.
backup: if the initial backup ran long enough to create 1 archive file then was interrupted, mounting the backup would not work. Another backup, even of a single file, that ran to completion would fix the problem.
backup: backup files without dest.conf, create dest.conf, repeat the same backup immediately; no files are modified, but hb.db.0 still needs to be transmitted and wasn’t. This is an unlikely bug because any non-null backup would fix the problem, but it confused me so it’s likely to confuse others.
This option is --dl now. It's usually the most expensive option, though if there is only 1 archive to retrieve, this is the only way. Today hb will not segment one archive to retrieve it over a longer period with ranged retrievals. Because of Glacier's unusual pricing, it's best not to use large archives (over 1GB).
Option 2: retrieve over N hours or days
---------------------------------------
This option is --dl 8h for hours, or --dl 8d for days. Using these allows pacing the retrieval over the time period specified, with files evenly divided into 4-hour download groups.

Option 3: retrieve using specified bandwidth
--------------------------------------------
This option specifies the average bandwidth to use for the retrieval, for example, --dl 1MB would be 1 MByte/sec, --dl 1Mb would be 1 Mbit/sec. You can also use G and K suffixes. The actual downloads are not throttled to this download rate; the bandwidth number is only used to decide how much data to retrieve in each 4-hour download block to give this average rate.

Option 4: retrieve based on archive sizes
-----------------------------------------
This option, --dl cheap, minimizes your peak retrieval rate and is usually the least expensive option. The only time it will not be least expensive is if the retrieval crosses from one month into the next. In that case, retrieval costs can double and it may be better to keep the retrieval all in the current month using option 2 or 3. This is the default option.
Using one of these options, recover will pace the retrieval of archives stored in Amazon Glacier in 4-hour download groups. Retrieval pricing in Glacier is complicated and can be expensive, so you should review it before doing a large Glacier retrieval.
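For Option 3, the average bandwidth only determines how much data goes into each 4-hour download group. The arithmetic can be sketched as below; the function names are hypothetical, 1 MB is assumed to mean 1,000,000 bytes, and the sketch ignores the fact that whole archives cannot be split across groups.

```python
GROUP_SECONDS = 4 * 60 * 60  # one 4-hour download group


def bytes_per_group(bytes_per_second):
    """Data to request in each 4-hour group to average the given rate."""
    return bytes_per_second * GROUP_SECONDS


def groups_needed(total_bytes, bytes_per_second):
    """Number of 4-hour groups needed to retrieve total_bytes."""
    per_group = bytes_per_group(bytes_per_second)
    return -(-total_bytes // per_group)  # ceiling division on integers


# 1 MByte/sec average rate: 14.4 GB per group, so 100 GB takes 7 groups
per_group = bytes_per_group(10**6)
groups = groups_needed(100 * 10**9, 10**6)
```

Seven groups at 4 hours each is 28 hours, which gives a feel for how a modest average rate stretches a large retrieval.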
Amazon Glacier: when archives are cached locally, get, selftest, and mount expect archives to be in the backup directory. If they aren’t there for some reason (manually deleted) they will be fetched from a remote destination as needed. None of this is new.
When Amazon Glacier is the remote in this situation, new retrieval jobs would be started and get/selftest/mount would fail with error messages, because it takes 4 hours to get files from Glacier. The bug here is that if you waited 4 hours and retried the same operation, it should have fetched the files from Glacier. Instead, it was starting new retrieval jobs. This has been fixed, though having to run programs twice is not ideal.
added an args keyword to the ssh destination. This allows specifying sftp options (use man ssh_config to see the options). For example:
args -oIdentityFile=~/.ssh/otherid -oLogLevel=debug
rsync: added Port example in dest.conf.rsync example. The rsync Args keyword now parses quotes like a shell would, allowing quoted strings. For example, to use an alternate ssh port and userid: Args -e "ssh -p 8002 -l sshuser"
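Shell-style quote parsing of the Args value can be illustrated with Python's standard shlex module. Whether hb uses shlex internally is not stated in these notes; this only shows the kind of splitting the entry describes.

```python
import shlex

# The quoted string stays together as a single argument to rsync's -e option.
args = shlex.split('-e "ssh -p 8002 -l sshuser"')
print(args)  # ['-e', 'ssh -p 8002 -l sshuser']
```

Without quote handling, a plain whitespace split would produce six separate arguments and rsync would misread the ssh options as its own.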
selftest: if a raw block device backup was interrupted after it had written at least 1 archive file, selftest would display an error: Error: for logid 3, hash should not be null: /dev/disk4 [r0]
selftest: if a raw block device was backed up and cache-size-limit was set (not all archives are local), selftest would display a warning message (the numbers will be different): Oops: planned read count 7 != actual read count 0; diff = 7
selftest: was not running its normal file data integrity checks on raw block device backups
backup: sometimes when doing a multi-thread backup of a very large file with a small block size (as with VM images), backup could use excessive memory. This was not related to dedup, but was more an issue with how the OS scheduled threads and is similar to the selftest problem fixed in #800. It seemed to occur more often on Linux, but could have happened anywhere. Backup’s memory usage is now much more controlled and stable.
Amazon Glacier support is available with a new destination type, glac. Glacier is an inexpensive (1 cent/GB/mo) archive system that works well for backups, providing you don’t need to restore very often. Since HashBackup supports multiple destinations, Glacier can be used as cheap offsite insurance for an onsite backup.
Example dest.conf entry for Glacier:
The accesskey and secretkey are the same as used by other Amazon Web Services such as S3. Unlike S3 buckets, the vault name does not have to be globally unique: it only has to be unique for your account. Using the Dir keyword, it is possible to store multiple backups in a single vault.
Glacier has some unusual trade-offs for cheap storage:
* retrieval time is around 4 hours
* there are retrieval costs in addition to bandwidth charges
* you get a free allowance for small retrievals
* you have to spread large retrievals over 600 days to be 100% free
* fast retrievals can be very expensive
* the option to ship data on a disk is available, but you still pay retrieval fees plus the fee for shipping the disk

Glacier retrievals occur in 2 stages:
* first, request a file; then wait 4 hours
* second, download the file from Glacier
* retrieved files are available for about a day
To handle this situation, HashBackup stores its databases in a helper S3 bucket named <accesskey>-hashbackup-glac-vaults that is created the first time you use Glacier.
1a. first, databases are downloaded immediately from S3
1b. retrieval jobs are created for archive files, messages are:
Started retrieval job <long job id> status InProgress for arc.0.0
After this, you must wait around 4 hours for Glacier to make your archive files available. Then run recover again, and assuming all files are available, your archives will be downloaded to the local backup directory.
If hb recover is run before all files are retrieved, there will be messages like:
Queueing arc.0.0 from glac
Retrieval job <long job id> status InProgress for arc.0.0
If this happens, you must wait longer and run recover again. Eventually, you will get all your archive files back.
Glacier works best as a "backup of last resort", where you have another copy of the backup locally. It can be used as the only copy of the backup (if cache-size-limit is set), but large retrievals are expensive and may be difficult to manage.
If you make a backup to Glacier then decide you don't want it, it's important to hb clear the backup before deleting the backup directory. The reason is that, unlike S3, Glacier will hang on to your files, even if you make a new backup with the same vault and dir keywords.
recover: handles S3 files that have migrated to Glacier by requesting a restore and holding restored files in S3 for 5 days. Because each Glacier restore takes 4 hours, recover has to be run several times:
first, dest.db is requested (4 hours)
then hb.db.n files are requested to recreate hb.db (4 hours)
then arc.v.n files are requested (4 hours) NOTE: this step is omitted when cache-size-limit is set
then the final recover can fetch archives from S3
Each recover run except the last will display messages for files that are still transitioning from Glacier back to S3, for example:
Loading /Users/jim/hbdev/hb/hb.db.0
Traceback (most recent call last):
  File "recover.py", line 269, in <module>
  File "recover.py", line 218, in main
  File "/Users/jim/hbdev/dest.py", line 486, in get
  File "/Users/jim/hbdev/db.py", line 895, in applyincdb
IOError: [Errno 2] No such file or directory: '/Users/jim/hbdev/hb/hb.db.0'
recover: if configured in dest.conf (Workers keyword, default is 3), use multiple workers to download files simultaneously, reducing recovery time
validate s3-compatible bucket names before creating new buckets: must be 3-63 characters, start and end with a-z or 0-9, and may contain dashes. Existing buckets with names violating these more strict rules are still accessible.
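The naming rule above can be expressed as a single pattern. This is only a sketch of the rule as stated in this note (3-63 characters, first and last are a-z or 0-9, dashes allowed in between); the function name is made up for illustration and this is not HashBackup's actual code.

```python
import re

# First and last characters: a-z or 0-9.
# Middle (1 to 61 characters): a-z, 0-9, or dash.
# Total length is therefore 3 to 63 characters.
_BUCKET_NAME = re.compile(r"[a-z0-9][a-z0-9-]{1,61}[a-z0-9]")


def is_valid_bucket_name(name: str) -> bool:
    """Return True if name satisfies the rule described in the note above."""
    return _BUCKET_NAME.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ["my-backup-1", "ab", "-abc", "abc-", "MyBucket"]:
        print(candidate, is_valid_bucket_name(candidate))
```

As the note says, the check would apply only when creating a new bucket; existing buckets with looser names stay accessible.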
with multiple workers there was a race condition in s3-compatible destinations when a bucket was created: several workers could try to create the bucket at once, causing errors. This is fixed.
S3-compatible destinations now store an MD5 checksum in the database when a file is transmitted, and compare this to the server-generated MD5 checksum when a file is retrieved.
S3-compatible destinations have a new Dir keyword, so that many backups can be stored in one bucket, and backup data can be segregated from other data in a bucket.
Rackspace Cloud Files destinations have a new Dir keyword, so that many backups can be stored in one container, and backup data can be segregated from other data in a container.
there was a bug in the previous db upgrade process with old archive files. The error during upgrade was: Unable to upgrade your database to rev 13: write() takes exactly 5 arguments (4 given)
for longtime beta testers using dest.conf, an old hb.db file could be stored on a remote. In the Dec 3rd release, the sync code was rewritten. This new code would see the old hb.db on the remote, delete it, which is correct, but also delete the local copy, which is not correct. This bug has been fixed. The workaround for the lost hb.db is to rename hb.db.orig to hb.db and run the upgrade again, or use recover to rebuild hb.db from the remote copy
when generating keys, get as many bytes as possible from /dev/random, without blocking; get the rest from /dev/urandom
OSX Snow Leopard and Lion have a bug where an unaligned disk write that would fill a disk does a partial write without returning an error. In testing, this OSX bug could corrupt the dedup table. Code was added to detect this condition.
if a disk full condition occurred at a particular point during backup, it could cause selftest errors later because of the partially saved file. Now if a critical error occurs during backup, backup will display a "Fatal error:" message and stop immediately.
related to the disk full problem, a new selftest option, --fix, has been added. With this option, selftest will make corrections for simple errors and will not stop at 100 errors as it usually does. Corrections occur with -v2 (or higher).
added support for DreamObjects (destination type is do), an S3-compatible service by Dreamhost with good prices: 7 cents per GB for both storage and outgoing bandwidth. For more information, see: http://dreamhost.com/cloud/dreamobjects
selftest -v4 could fail with this error:
Traceback (most recent call last):
  File "/hb.py", line 140, in <module>
  File "/selftest.py", line 674, in main
  File "/selftest.py", line 199, in checkallblocks
UnboundLocalError: local variable 'row' referenced before assignment
[#797] selftest: added --debug option to print logid, pathname, and version for files as selftest runs
[#798] init was failing if backup directory didn’t already exist
[#799] error messages are clearer when recover is used with the wrong key; removed temporary files created so that if recover is used again, it displays the same error messages
[#800] on 32-bit Ubuntu Linux 12.04.1 LTS, with a 128GB file of random data, selftest would continually consume memory until it failed with an error like this:
    Error: Error reading archive: , logid 5, blockid 2050709: /mnt/snapshot/data/com bo.bin [r0]
    Exception in thread decrompq_loop:
      File "/shared.py", line 60, in start_thread
      File "/selftest.py", line 97, in decrompq_loop
or it was killed by the OS OOM (Out Of Memory) handler.
[#801] mount: on Linux, du was always reporting zero for directory sizes. Also changed top-level directory names from just version number to YYYY-MM-DD-HHMM-rV, to be more accessible. The latest backup is now called 'latest' instead of 'c'. To cd using just version numbers, use cd *r5 for version 5.
[#802] mount: the Linux stat command would sometimes print different values for st_blocks on a real file and an hb mount containing the same file, with the difference being less than 8.
[#803] compare was displaying socket files as if they were new, because backup never saves socket files. Now compare ignores socket files, like backup.
[#808] after a database upgrade by the 7xx and 8xx series of releases, version 6xx of hb was still able to access the backup database. Now, after #808 or later is used, the correct error message is displayed when an old version of hb is used on an upgraded database: you need a newer version of HashBackup to access this backup.
[#809] /dev/urandom is now used instead of /dev/random for key generation. These are only different on Linux. Recent versions of Linux running on a single-user VM frequently do not have enough entropy in the random pool to generate even a 128-byte key, so hb init can block for a very long time. With this switch to /dev/urandom, hb init will not stall.
[#810] update dest.conf README file to explain why changing destname after files are backed up is a bad idea and causes breakage
[#811] backup: on Linux, filesystems that don’t support flags, eg, FUSE, would sometimes return garbage flags. If the nodump bit was set in the bogus flags, the file would not be backed up. This has been fixed.
NOTE: this rev will do an automatic database upgrade to dbrev 13 when any HB command is used, to support remote-only archives and a limited archive cache. All archive files must be read for this upgrade, so the upgrade could take some time to complete
IMPORTANT: do not enable any new features in this release until after your database has been upgraded. For example, don't add the new 'rate' keyword to any destinations until after your next backup using this new release.
HB versions all config changes so that config -rN displays the configuration settings of backup N. With the old config setup, if an option is changed, a backup is taken, and that backup is later removed with rm -rN, the config option change also disappeared. Sometimes this is okay / desired, but usually it is unexpected, especially with options like admin-passphrase: users expect that if they put an admin-passphrase on a backup, it stays there. A separate problem is that if backup N caused a database upgrade, then all backups N and later were removed with rm, the next operation would want to do a database upgrade; this failed because the upgrade had already occurred.
To fix these issues, HB now has one config that is "current". After each backup, the current config is saved. But if backup versions are removed, it doesn't affect the current config settings. Also, --revert was removed from config as it was seldom used in practice.
HB supports remote-only archives using the new cache-size-limit config option. This option defaults to -1, meaning there is no local cache limit and archive files will stay in the local backup directory as before.
Keeping a local copy of all archive files and leaving cache-size-limit set to -1 has several benefits:
restoring from a local backup directory is much faster than restoring from a remote, where archives have to be downloaded
disk space is cheap, and most systems have room for a local copy of the backup. Setting a cache size limit will save space during backups, but you may still need lots of local disk space during restore to download required archives
with local archives, backup never stalls waiting for remotes to accept archives
the backup directory does not have to be locked for read-only programs, such as mount, allowing backups to run while mount is running. When the cache is limited, only one program can access the backup directory because archives might be coming and going
errors on a remote, such as it being down or having a full disk, do not affect a backup with local archives: the archives will be sent the next time the remote is available. With a limited cache, the backup will halt if the cache becomes full and any remote stops accepting data
But, there are environments where keeping a complete local copy of archives may not make sense:
If cache-size-limit is set >= 0 with hb config, the backup program may remove local archives after they have been transmitted to all remotes, to stay under the cache size limit. Cache-size-limit zero means "no local archives". This causes backup to stall after each archive until it is transmitted to all remotes. A better option is to set cache-size-limit to 1-1000. These small numbers mean "multiply by the max archive size". So cache-size-limit 5 with the default arc-size-limit of 1gb means that 5GB of archive data can be kept locally after it is sent to remotes. A specific size can be set too, for example, cache-size-limit 10gb. There is no minimum cache size: HB will function correctly no matter what size is used, though it may need to temporarily go beyond the limit while executing a command.
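The different meanings of cache-size-limit can be summarized in a small function. This is a sketch under assumptions, not HB's code: the function name is hypothetical, "1gb" is assumed to mean 1024**3 bytes, and any value above 1000 is assumed to be a size in bytes.

```python
GB = 1024 ** 3  # assumption: "gb" means 1024**3 bytes


def effective_cache_bytes(cache_size_limit, arc_size_limit=GB):
    """Translate a cache-size-limit setting into a byte budget.

    Returns None when there is no limit (the default, -1).
    """
    if cache_size_limit < 0:
        return None                      # -1: keep all archives locally
    if cache_size_limit <= 1000:
        # 0 means "no local archives"; 1-1000 means
        # "this many archives of the maximum archive size"
        return cache_size_limit * arc_size_limit
    return cache_size_limit              # assumed: an explicit size in bytes


if __name__ == "__main__":
    print(effective_cache_bytes(-1))       # no limit
    print(effective_cache_bytes(5))        # 5 x default 1gb arc-size-limit
    print(effective_cache_bytes(10 * GB))  # explicit size, like "10gb"
```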
If a new destination is added to dest.conf and some archives are only stored remotely, hb has to download these archives first and then upload them to the new destination(s). In this release, this remote to remote synchronizing blocks other operations and backup waits until all destinations are synchronized before starting the backup. With a local backup (no cache-size-limit), this doesn't happen because downloads (remote to local) aren't necessary.
Another effect of a limited local cache is that backups may be delayed if there are slow remotes. For example, if the cache-size-limit is 2GB and you have 16GB to backup, the backup program may have to delay while archives are uploaded. This doesn't happen with a local backup (no cache-size-limit).
One guideline for setting cache-size-limit is to use at least the average size of your typical backup. This will allow backup to finish without waiting for remotes to accept data.
a new dest.conf keyword, 'workers', can be added to any destination. This is the number of concurrent connections to a destination. The default is 3, allowing uploading or downloading 3 files at once. Set workers to 1 if you want to minimize the impact of hb on your network connection. (You can also use the new 'rate' keyword to limit the network impact.)
a new dest.conf keyword, 'retries', can be added to any destination. If omitted, destinations will retry 2 times on errors (3 times altogether), delay 5 seconds the first time, then multiply the delay by 2 for each retry. This is equivalent to retry 2, 5, 2. Up to 3 integers can be used with the retry command, with defaults used for missing values. So retry 1 means to do only 1 retry with a 5 second delay.
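To make the retry defaults concrete, here is a sketch that lists the delay before each retry. The function and parameter names are made up for illustration; only the three numbers (retries, first delay, multiplier) and their defaults come from the note above.

```python
def retry_delays(retries=2, first_delay=5, multiplier=2):
    """Return the delay in seconds before each retry, in order."""
    return [first_delay * multiplier ** i for i in range(retries)]


if __name__ == "__main__":
    print(retry_delays())    # defaults 2, 5, 2 -> [5, 10]
    print(retry_delays(1))   # a single retry -> [5]
```

With the defaults, an operation is attempted 3 times altogether, with waits of 5 and then 10 seconds between attempts.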
get: creates a cache plan when cache-size-limit is set, and downloads archives in the background in the order they are needed. While get is running, the archive cache might exceed its limit. This may be necessary to avoid downloading the same archive more than once. After get finishes, it will trim the cache back to cache-size-limit.
selftest: creates a cache plan when cache-size-limit is set, like get. Selftest’s -v option controls the level of testing. If cache-size-limit is set, selftest defaults to -v2 to prevent all archives from being downloaded. If -v3 or higher is requested with cache-size-limit set, selftest will show how many archives need to be downloaded, how much space will be needed, and then ask for confirmation. If cache-size-limit is not set (the default, meaning all archives are kept locally too), selftest uses -v9 as before.
when fetching files from a Dir destination, a symlink is created instead of copying the archive to the cache. This allows reading directly from the Dir destination file. This is useful with mounted remote storage such as Google Drive, WebDAV, etc., because they support reads without needing to download an entire archive. For example:
your backup directory is /hb, on the local filesystem
setup dest.conf with a Dir destination for remote mounted space
use hb config cache-size-limit 5 for a small local cache
your hb.db file is in fast, local storage in /hb
your archives are on slower remote mounted space
restoring a file will access remote storage directly
The database upgrade process was rewritten for this release. The old upgrade system worked well for a single-rev upgrade, but sometimes failed on older databases. This release can update backups created with #339 (Oct 2010) or later, and is more reliable for future upgrades.
backup: a new option, --full, forces a full backup. This adds redundancy to the backup and can make restores faster too by reducing fragmentation in the backup. When cache-size-limit is set, reducing fragmentation will usually reduce restore times by reducing the number of remote archives that need to be downloaded. -D can still be used with --full to enable dedup, but no data from previous backups is reused.
Another way to achieve backup redundancy vs --full is to simply start a new backup directory. The advantage here is that everything is redundant, including the backup database. The disadvantage is that it is harder to manage and configure, because each backup directory has to have unique remote destination directories, and retain cannot be used across multiple backups.
a new config option, audit-commands, enables audit logging for the commands listed, or if 'all' is used, auditing is enabled for all commands. To display the audit log, use the new hb audit command. Audit logs cannot be removed from the database. For more secure audit logging, admin-passphrase should be set and config should be listed in disable-commands; this prevents someone from turning off audit logging without the admin passphrase. Example audit log:
[jim@mb hbdev]$ hb audit -c hb
Backup directory: /Users/jim/hbdev/hb
Showing recent history
rekey now asks for a new passphrase twice, and verifies that they are the same. Before, a typo in the passphrase could make the backup database inaccessible. To abort a rekey, enter mismatched passphrases.
Amazon S3: performance may be increased for large S3 transfers because of better use of Amazon’s load balancing, and S3 error recovery may be better when an Amazon S3 server is having issues.
a new destination keyword, 'rate', allows limiting upload bandwidth for Amazon S3, ftp, imap, dir, Rackspace Cloud Files, and rsync destinations. The value is the outgoing transfer limit in bytes per second, for example, rate 100k would mean 102400 bytes/sec. This is the upload limit for each worker on a destination, and since the default is to have 3 workers, the aggregate limit would be 300k for this example if all the workers were busy. If rate limiting is used, you may want to add the workers keyword with a value of 1, to limit bandwidth further. A rate limit less than 1024 raises an error - it’s probably a typo.
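The arithmetic for the rate example can be sketched as follows. The function names are hypothetical, and only the 'k' suffix is handled because it is the only one shown in this note; other suffixes may exist but are not assumed here.

```python
def parse_rate(text):
    """Convert a rate value such as '100k' or '204800' to bytes per second."""
    text = text.strip().lower()
    multiplier = 1
    if text.endswith("k"):
        multiplier = 1024          # 100k -> 102400, as in the note above
        text = text[:-1]
    rate = int(text) * multiplier
    if rate < 1024:
        # the note says a limit below 1024 is treated as an error
        raise ValueError("rate %d is less than 1024: probably a typo" % rate)
    return rate


def aggregate_limit(rate, workers=3):
    """Worst-case upload rate when every worker on a destination is busy."""
    return rate * workers


if __name__ == "__main__":
    per_worker = parse_rate("100k")
    print(per_worker)                       # 102400
    print(aggregate_limit(per_worker))      # 307200, i.e. 300k
    print(aggregate_limit(per_worker, 1))   # 102400 with workers set to 1
```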
after removing files from the backup, either with the rm or retain commands, some archives may have empty space. If there is enough empty space (25%), HB would compress archives to save local disk space. If any archives shrank 50%, hb would retransmit them to remote destinations to free up remote disk space. This works well when archives are local. But when archives are only stored remotely because cache-size-limit is set, this packing operation requires a download first.
So a new config option has been added, pack-percent-free. This takes a number, which is a percent. If an archive has this much free space or more, it will be packed to save disk space. The default is 50, so archives are packed when 50% or more space is free. Another useful value is 100: archives are never packed, but are deleted when they are completely empty. It may or may not be cost effective to set this, depending on whether your cache is limited and the rates you pay for outgoing bandwidth, incoming bandwidth, and storage costs.
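One way to read the pack-percent-free rule is as a three-way decision per archive. This is a sketch of that reading only; the function name and the "keep" / "pack" / "delete" labels are made up for illustration and HB's real logic may differ.

```python
def pack_action(total_bytes, free_bytes, pack_percent_free=50):
    """Decide what to do with an archive after rm or retain.

    Returns "delete" for a completely empty archive, "pack" when the
    free space reaches the threshold, otherwise "keep".
    """
    if free_bytes >= total_bytes:
        return "delete"                      # nothing left worth keeping
    percent_free = 100.0 * free_bytes / total_bytes
    if percent_free >= pack_percent_free:
        return "pack"
    return "keep"


if __name__ == "__main__":
    print(pack_action(1000, 600))        # 60% free, default threshold 50
    print(pack_action(1000, 400))        # 40% free
    print(pack_action(1000, 990, 100))   # threshold 100: never packed
    print(pack_action(1000, 1000, 100))  # completely empty
```

With a threshold of 100, a non-empty archive can never reach the threshold, which matches the "never packed, but deleted when completely empty" behavior described above.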
a new yes/no config option, 'pack-remote-archives', specifies whether to pack archives that are not stored locally. The default is no. This option exists because many cloud storage vendors charge for download bandwidth, so it may cost more to download an archive for packing than it is worth: it might be cheaper to just pay for the storage. This option only makes sense when cache-size-limit is also set.
the CRC on archive blocks was removed. This CRC was intended to allow remotes to do some simple archive validation, but that’s not possible with the other changes in this release. Each block still has a full, encrypted SHA1 hash for data verification during restores, but remotes can’t access it: they don’t have the key. And each file still has a full SHA256 as a double check to verify restored files.
mount: because there is no way to predict which archives might be accessed, the mount command cannot prefetch remote archives. This may lead to slow file data access while remote archives are downloaded. The cache size limit is ignored while mount is running, to avoid downloading the same archive more than once. When the HB backup filesystem is unmounted, the cache is trimmed back to cache-size-limit.
in previous releases, read-only programs like mount did not lock the backup directory. But when the cache size is limited, all programs must lock the backup directory since archives may be moving in and out of the cache.
the config option arc-size-limit sets the maximum size of archive files created by backups. Previously this was limited to 8gb; now there is no upper limit. The default is still 1gb. To facilitate testing, the lower limit has been changed from 1mb to 10000 bytes, but small sizes like this should not be used for real backups.
config: when a config option is changed, both the new and old value are displayed
selftest -v2 -p0 (one core) at a customer site was 7x faster than -v2 -p4 (4 cores). This performance issue, which also affected -v3 and -v4, has been fixed.
multi-core selftest could report an error with a file, then "Verified x files with 0 errors", then at the end, "1 error"
the Userid keyword was added back to the ssh destination. This was accidentally removed when hb switched to using sftp. Also, this destination always tried to create the target directory on the ssh host. Now, it will only do this when files are sent to the host - not on a remove or fetch.
before, the inex.conf rule ex /tmp/ meant the same thing as ex /tmp/ . The problem is it also prevents requesting a backup of /tmp/jim; it saves the directory but not the contents, and it’s very confusing: "Why won’t it work!?" Hey, if it confuses me, which it did, it’s confusing! Now, ex /tmp/ works as before when backing up /, and will not save the contents of /tmp. But you can request a backup of /tmp/jim and it will work as expected. To get the old behavior, use a rule ex /tmp/ . Then when requesting a backup of /tmp/jim, the directory itself is saved since you explicitly requested it, but the contents are excluded by inex.conf. Also, a new warning is displayed when a requested directory’s contents are excluded by inex.conf.
imap destinations would sometimes reuse a connection when an error occurred. But sometimes these errors are fatal, for example, if the imap server resets a connection. Now when an error occurs, the old connection is closed and a new connection is created. hb retries 3 times when a destination encounters problems, and imap retries up to 20 minutes with an exponential backoff. So altogether, imap will retry for around an hour before giving up.
imap performance is slightly improved by sending one less imap command per file sent, received, or removed
IMPORTANT IMAP BUG FIX: in all previous versions of hb, rm and retain could delete too many archives on the imap server, rendering the remote backup incomplete. The symptom of this problem is that when an archive arc.v.n is removed, all archives with arc.v.n as a prefix are removed from the imap server. For example, if retain removes arc.0.3, arc.0.30 and arc.0.31 are also removed incorrectly. A similar problem occurred with hb.db.n files. This only occurs on imap destinations.
To correct this, the database upgrade procedure will tag files that need to be resent to imap destinations. After the next backup completes, the remote imap destinations should have the missing files.
To verify your remote imap backup is correct, wait until your next backup completes with this new version. Then create a new temporary backup directory, copy your key.conf and dest.conf files there, and run hb recover -c tempdir. This will download all of your remote backup files from your imap server. Run hb selftest -c tempdir to verify there are no errors. Then tempdir can be deleted.
If you still have selftest errors because of missing archives, edit your real dest.conf file in your production -c backup directory and change the destname of your imap destination. For example, if it now says destname imapjim, change it to imapjim2. On your next backup, all backup files will be uploaded to the imap server. Then repeat the selftest in the previous paragraph to verify your remote imap backup has been fixed.
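The symptom of the imap bug described above (removing arc.0.3 also removes arc.0.30 and arc.0.31) is what happens when names are matched by prefix instead of exactly. The sketch below only illustrates that difference on a list of names; it is not the actual imap code, and both function names are hypothetical.

```python
def remove_by_prefix(names, target):
    """Buggy behavior: drops every name that merely starts with target."""
    return [n for n in names if not n.startswith(target)]


def remove_exact(names, target):
    """Intended behavior: drops only the name equal to target."""
    return [n for n in names if n != target]


if __name__ == "__main__":
    archives = ["arc.0.3", "arc.0.30", "arc.0.31", "arc.0.4"]
    print(remove_by_prefix(archives, "arc.0.3"))  # ['arc.0.4']
    print(remove_exact(archives, "arc.0.3"))      # ['arc.0.30', 'arc.0.31', 'arc.0.4']
```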
imap: would sometimes fail to parse server responses correctly if additional information was included, causing unnecessary retries.
imap: added a debug keyword. debug > 0 will cause imap FETCH responses to be dumped, which can help diagnose parse errors. debug >= 4 will dump the entire imap conversation.
Dedup statistics were inconsistent. For example, backup might say:
    Dedup enabled, 28% of current, 7% of max
But then retain or rm would say:
    Dedup enabled, 28% of current, 0% of max
Rm and retain never expand the dedup table, so they don’t have the -D option to specify a maximum dedup table size. Without knowing a maximum size, it was assumed to be huge; so the 2nd stat was always 0%. To avoid confusion, only backup prints dedup statistics now.
if hb.db was deleted, running backup or recrypt would create a new (empty) hb.db. When this empty hb.db was sent to remotes, it would cause all hb.db.n files to be deleted. Now, backup and recrypt will issue an error message that hb.db doesn’t exist. The usual remedy would be to run recover to regenerate hb.db.
recover checks that hb.db recovered from a remote is bit identical to the original hb.db. In a specific, unusual situation, recover would report that all signatures matched on each hb.db.n file, but then would report 'Database HMAC mismatch' on the final database and stop. The recovered database was equivalent to the original, but it was not identical. This bug has been fixed and the error message is now a warning; the database is available if you choose to use it.
get: setuid, setgid, and the sticky bit were being saved but not restored on regular files. This has been fixed. These bits are also now restored if the numeric userid running get is the same as the numeric userid of the restored file.
ls: setuid, setgid, and the sticky bit are displayed with ls -l
when the backup directory is locked and can’t be accessed, the error message displays the process id owning the lock
get: if a directory was restored with --orig and the parent directory didn’t exist, get would fail with an error message "ValueError: too many values to unpack"
get: if two hard links to the same file were listed on the command line, the first file existed before the restore, and the restore replaced it, the restore of the 2nd hard link would fail with an error like: Unable to hardlink: No such file or directory: followed by two pathnames …/hb-540858.tmp → …/hb-297485.tmp
clear: if hb.db didn’t exist, clear stopped with an error; it should have cleared the rest of the directory
clear: would sometimes print the error message "database schema has changed"
versions: could fail with the error message KeyError: 'getpwuid(): uid not found: nnn' if a backup from one system was transferred to another system with different userid → name mappings
add a test to all programs that the -c backup directory is actually a directory
when reading passphrases, a warning was displayed and characters still echoed on the screen
backup: when the dedup table was full, it was resized (to the same size) at the beginning of the backup
large dedup tables could cause MemoryExceptions and other problems when a program started. A 1GB dedup table was actually temporarily requiring 4GB of memory. As a side effect of fixing this, loading a large dedup table is now much faster.
mount, backup, selftest: if certain files are backed up with dedup enabled, fixed block sizes in one backup version, and variable block sizes in a later backup version, reading them via mount may cause a Bad address error. This is extremely data dependent: on my backup of 800K files, this occurred with 4 files. The cause of the problem is a bug in the backup program, and that has been fixed. selftest has been changed to detect this problem (requires -v9). The database upgrade will correct files with this problem.
bogus keywords in a destination (typos) were silently ignored. Unrecognized keywords now generate a fatal error.
the new rsync Args keyword was not getting inserted into the rsync command line.
previously, passphrases were read from stdin and displayed during keyboard entry. Now echoing is suppressed and passwords are read from /dev/tty.
rsync destinations have an Args keyword, and any options listed here will be inserted into the rsync command line. For example: Args --bwlimit=64
three new config keywords were added:
admin-passphrase: defaults to ''. If set to something else, this passphrase has to be entered to view or change the config.
disable-commands: (default ''). A comma-separated list of hb commands that require the admin passphrase. For example:
$ hb config -c hb disable-commands clear,recover,rekey,recrypt,retain,rm
enable-commands: (default ''). A comma-separated list of hb commands that do NOT require the admin passphrase to be executed. All other commands will require the admin passphrase. For example:
$ hb config -c hb enable-commands backup,help,ls
It is an error to set both enable-commands and disable-commands. It is more secure to set enable-commands because if new commands are added to hb they will automatically be disabled. It's not possible to disable upgrade because it is not associated with an hb database, and that's where the config information is stored.
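The interaction of enable-commands and disable-commands can be sketched as a single check. This is an illustration of the rules as written above, with a hypothetical function name; it is not HB's implementation.

```python
def needs_admin_passphrase(command, enable_commands="", disable_commands=""):
    """Return True if command would require the admin passphrase."""
    enabled = {c.strip() for c in enable_commands.split(",") if c.strip()}
    disabled = {c.strip() for c in disable_commands.split(",") if c.strip()}
    if enabled and disabled:
        raise ValueError("cannot set both enable-commands and disable-commands")
    if command == "upgrade":
        return False                  # upgrade cannot be disabled
    if enabled:
        return command not in enabled  # whitelist: everything else is locked
    return command in disabled         # blacklist: only listed commands locked


if __name__ == "__main__":
    locked = "clear,recover,rekey,recrypt,retain,rm"
    print(needs_admin_passphrase("rm", disable_commands=locked))      # True
    print(needs_admin_passphrase("backup", disable_commands=locked))  # False
    print(needs_admin_passphrase("get", enable_commands="backup,help,ls"))  # True
```

The whitelist form shows why enable-commands is the safer choice: a command that did not exist when the list was written is locked by default.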
backup: if an hb.db.n file from a previous backup needed to be sent to a destination, an error could occur:
Traceback (most recent call last):
  File "/hb.py", line 41, in <module>
  File "/backup.py", line 1835, in main
  File "/dest.py", line 607, in sync
NameError: global name 'HBDB' is not defined
on Linux, if the random number entropy pool is low, reading from /dev/random is supposed to block. This may return a short read, causing problems such as:
Exception in thread mashq_loop: AES key must be either 16, 24, or 32 bytes long
  File "/shared.py", line 60, in start_thread
  File "/backup.py", line 742, in mashq_loop
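A short read means a single read call returned fewer bytes than requested. The usual defense is to keep reading until the full count arrives. The note does not say how HB handled it, so the following is a generic sketch of that technique; read_exact and TrickleStream are hypothetical names, and TrickleStream only simulates a source that returns a few bytes at a time.

```python
def read_exact(stream, n):
    """Read exactly n bytes from stream, looping over short reads."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:
            raise EOFError("stream ended with %d bytes still needed" % remaining)
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


class TrickleStream:
    """Simulated source that returns at most max_chunk bytes per read."""

    def __init__(self, data, max_chunk):
        self._data = data
        self._max_chunk = max_chunk

    def read(self, n):
        n = min(n, self._max_chunk)
        out, self._data = self._data[:n], self._data[n:]
        return out


if __name__ == "__main__":
    source = TrickleStream(bytes(range(32)), max_chunk=5)
    key = read_exact(source, 32)
    print(len(key))   # 32, even though each read returned at most 5 bytes
```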
NOTE: this rev will do an automatic database upgrade to dbrev 12 when any HB command is used, to support the new encryption system
security: HashBackup’s encryption system is enhanced in this release to prevent a certain kind of information leak that could allow someone to determine if your backup contained data in common with their backup. Existing backups remain accessible and future backups will use the new encryption system.
NOTE: to exploit this, someone must have copies of your encrypted backup files and *unencrypted* copies of files to test in common.
security: local and remote copies of dest.db are now encrypted. There is little value in hacking this file, which is why it wasn’t encrypted before, but putting any data on a remote server unencrypted is not a good practice.
security: remote copies of dest.db now have an HMAC (Hash-based Message Authentication Code) signature. This signature is generated on upload and verified on recovery.
NOTE: an HMAC signature is similar to a regular hash, like SHA1, MD5, etc., but combined with a secret key. Without knowing the key, the HMAC cannot be regenerated by an outsider, whereas other hashes can. HMACs provide a stronger guarantee against tampering than regular hashes.
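As an illustration of the difference between a plain hash and an HMAC, here is a sketch using Python's standard hmac module. The choice of SHA-256 and the function names are assumptions for the example; the note does not say which hash HB uses for its HMAC.

```python
import hashlib
import hmac


def sign(key, data):
    """Return an HMAC signature of data under a secret key (SHA-256 assumed)."""
    return hmac.new(key, data, hashlib.sha256).digest()


def verify(key, data, signature):
    """Check a signature using a constant-time comparison."""
    return hmac.compare_digest(sign(key, data), signature)


if __name__ == "__main__":
    key = b"secret key known only to the backup owner"
    payload = b"contents of a file stored on a remote"
    sig = sign(key, payload)                  # generated on upload
    print(verify(key, payload, sig))          # True: untouched data
    print(verify(key, payload + b"x", sig))   # False: data was altered
    print(verify(b"wrong key", payload, sig)) # False: outsider cannot re-sign
```

Anyone can recompute a plain SHA1 or MD5 after altering a file, but recomputing the HMAC requires the key.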
security: previously, database files were encrypted with AES-128 and arc files used AES-256. Now, AES-128 is used everywhere, because:
-- this was advised after a security review of HashBackup by a well-known, published security expert: AES-256 has key schedule weaknesses that might make it less secure than AES-128.
-- if you have data that is so valuable that AES-128 is not enough protection, it's more likely that the data will be obtained through other means such as stealing your computer and/or physical coercion rather than obtaining your backup and breaking the encryption
-- a quote from the 2nd reference link below: "Recovering a key is no five minute job ... the number of steps required to crack AES-128 is an 8 followed by 37 zeroes. 'To put this into perspective: on a trillion machines, that each could test a billion keys per second, it would take more than two billion years to recover an AES-128 key,' the Leuven University researcher added."
-- References:
   http://www.schneier.com/blog/archives/2009/07/another_new_aes.html
   http://www.theinquirer.net/inquirer/news/2102435/aes-encryption-cracked
   http://research.microsoft.com/en-us/projects/cryptanalysis/aesbc.pdf
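The figures in the quote above can be checked with a few lines of arithmetic. The inputs are taken directly from the quote; the year length of 365.25 days is the only added assumption.

```python
steps = 8 * 10 ** 37          # "an 8 followed by 37 zeroes"
machines = 10 ** 12           # a trillion machines
keys_per_second = 10 ** 9     # a billion keys per second each

seconds = steps // (machines * keys_per_second)
seconds_per_year = 365.25 * 24 * 3600
years = seconds / seconds_per_year

if __name__ == "__main__":
    print("%.2e seconds" % seconds)
    print("%.2e years" % years)
```

The result is about 2.5 billion years, consistent with the quoted "more than two billion years".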
init: normally init generates a random key automatically, and stores the key in the key.conf file. But in some situations, others may have access to the key.conf file. Examples are hosted virtual private servers, managed servers where someone has root access, and Google Drive and other "remote drive" services.
A new -p option has been added to init and rekey to protect the key with a passphrase. hb init -p ask will get the passphrase from the keyboard for every hb command. Even someone with access to key.conf cannot access your backup without also knowing this passphrase.
If you write your backup directly to a remote drive like Google Drive, the key will also be stored there. To protect your backup, you MUST use a passphrase with the -p option to init.
to increase security, pbkdf2 key stretching is used. This may introduce a short delay (1 or 2 seconds) on every hb command. Key stretching slows down an outsider’s attempts to guess your key or passphrase by running both through thousands of one-way hashes. Any attempt to guess a key / passphrase has to repeat all this hashing work for each guess.
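For readers unfamiliar with key stretching, here is a sketch using the PBKDF2 function in Python's standard library. The hash (SHA-256), iteration count, salt handling, and 16-byte output are assumptions chosen for the example, not HB's actual parameters, and the function name is hypothetical.

```python
import hashlib
import os


def stretch(passphrase, salt, iterations=100000, length=16):
    """Derive a fixed-length key from a passphrase with PBKDF2-HMAC."""
    return hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, iterations, dklen=length
    )


if __name__ == "__main__":
    salt = os.urandom(16)   # random salt, stored alongside the protected key
    key = stretch("the fat green martian landed his shiny silver spacecraft", salt)
    print(len(key))         # 16 bytes, suitable for AES-128
```

Each guess at the passphrase has to repeat all of the iterations, which is where the short delay on every command comes from and what slows down an attacker.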
All of HashBackup's security comes from your key. This is why hb init creates a random key by default: it is next to impossible for someone to guess a long random key. Here are some suggestions for creating a strong passphrase to further protect your key:
make up a sentence that you will remember and use this as your passphrase. A sentence is easier to type than a password like Wjd0$p2^! and is stronger because it is longer. Length wins over weird, hard to type, hard to remember passwords. Example: the fat green martian landed his shiny silver spacecraft
make up a sentence that you will remember and use the first letter of each word as your passphrase. For the first sentence in this paragraph, that would be: muastywrautfloewayp
adding special symbols (other than spaces) will increase the passphrase strength. One easy way to do this is to use special symbols before, after, and/or between words, for example: .,.this.,.is.,.a.,.decent.,.passphrase.,.! But make up your own special symbol rule. Even adding just one special symbol will increase your passphrase strength.
adding a number, especially in the middle, will increase your passphrase strength
use a password manager program. These store lists of passwords and passphrases in an encrypted file, protected by a master passphrase. They often have password generators built in and you can cut and paste a passphrase when needed.
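The first-letter suggestion above is easy to reproduce. This tiny sketch (the function name first_letters is made up for the example) rebuilds the sample passphrase from its sentence:

```python
def first_letters(sentence: str) -> str:
    """Join the first character of every whitespace-separated word."""
    return "".join(word[0] for word in sentence.split())


sentence = ("make up a sentence that you will remember and use the first "
            "letter of each word as your passphrase")
passphrase = first_letters(sentence)
```

Do not use this sample as a real passphrase, since it is published here.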
To learn more about the importance and methods of choosing a good passphrase, do a search for:
init: similar to -p ask, -p env can be used. You first set the shell environment variable HBPASS to your passphrase. For example, using the bash shell, you would say: export HBPASS='secret phrase', then run hb init -p env. To make the environment variable permanent, export it in .profile in your home directory (for bash). Using -p env is less secure than -p ask, because every program you run has access to HBPASS. But -p env is more convenient since you only have to set the environment variable once in your login session. Setting HBPASS in .profile is probably less secure than typing the export HBPASS=mysecret command after you log in. If you do store your passphrase in .profile, restrict the file to owner read-only: $ chmod 400 ~/.profile The comments above about -p ask also apply to -p env.
init/rekey: it is possible to use -p ask/env with -k ''. This is less secure, because only the passphrase is used for encryption. But it is more convenient, because the key.conf file can be re-created more easily if it is lost.
rekey: supports -p ask and -p env. When using env variables for rekey, leave HBPASS set to the old passphrase before giving the rekey command; then update HBPASS with the new passphrase after rekey.
rekey -k will now create a key.conf file if it is missing; however, no rekey occurs in this unusual situation. This may be useful when recovering, to create a key file in an empty directory without using an editor program.
rekey now recovers from interruptions. If rekey is interrupted, no hb commands will work until the rekey is retried and completes.
security: add a time delay if the key is incorrect. User-generated keys and passphrases are not as strong as random keys and this delay may help slow down attempts to guess ("brute force") a key. To prevent using the delay as a signal that the key is wrong, HB will also randomly delay even when the key is correct.
backup: creates a new random backup key for every 64GB of backup data to avoid using the same backup key "too long". Backup keys are managed by HashBackup and are not in the key.conf file. In this release, a file bigger than 64GB will use only one key.
recrypt: this is a new command that will re-encrypt backup data with the new encryption system. Recrypt is different than rekey:
-- rekey creates a new key in key.conf and re-encrypts hb.db; no archive files (these contain your backup data) are modified. This is used if the key.conf file may have been compromised.
-- recrypt re-encrypts archive files containing backup data. Specifically, recrypt operates on archive files that were created before the last rekey command.
recover: a public file signature (hash) is added to hb.db.n files sent to remote destinations. This allows integrity checks on the remote side without the key.conf file, and it’s also checked during recovery.
recover: a private file signature (HMAC) is added to hb.db.n files. Recover verifies the HMAC to ensure that the file has not been changed during transmission or while it was on the remote. Even before this release, tampering with encrypted hb.db.n files would have very likely caused a program fault during or after recovery; HMAC provides a stronger guarantee against undetected tampering.
recover: a 2nd private file signature (HMAC) is added to hb.db.n files to ensure that the hb.db file created by recover is identical to the original. This can detect more errors, for example, if an hb.db.n file is not applied, not applied correctly because of a software bug, or a valid hb.db.n file is copied over a different hb.db.n file on the remote side.
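The difference between a public hash and a private HMAC can be sketched as follows. SHA-256, the key value, and the function names are assumptions for illustration only; the notes above do not say which algorithms hb uses.

```python
import hashlib
import hmac


def public_hash(data: bytes) -> str:
    """Anyone can recompute this, so the remote side can check integrity."""
    return hashlib.sha256(data).hexdigest()


def private_tag(data: bytes, key: bytes) -> bytes:
    """Only a holder of the key can produce or check this tag."""
    return hmac.new(key, data, hashlib.sha256).digest()


def verify(data: bytes, key: bytes, tag: bytes) -> bool:
    """Constant-time comparison avoids leaking how many bytes matched."""
    return hmac.compare_digest(private_tag(data, key), tag)


key = b"example key, not a real hb key"
original = b"contents of an incremental database file"
tag = private_tag(original, key)
tampered = original + b"!"
```

Someone who alters the file on the remote can recompute the public hash to match, but cannot produce a matching HMAC without the key.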
get: multiple CPU cores may be utilized during a restore. This is mainly beneficial for bzip2 compression, though restores with normal gzip compression are somewhat faster too. A -p option is added to get, like backup; -p0 will use only 1 core. The default is to use all cores.
selftest: multiple CPU cores may be used, and the -p option was added. The default is to use all cores.
backup: when a fatal error occurred, backup would sometimes freeze after displaying the error message. This has been improved, though may not be completely fixed. It is a race condition while shutting down and hard to manage when an exception occurs.
backup: the HB build number used to create each backup is now stored and displayed by the versions command. It is not displayed for old backups.
selftest: renaming hard links a certain way could cause a selftest error, "for logid x, hlogid y is also hard-linked". The backup data is actually okay - this was a bug in selftest.
get: the previous hard link renaming scenario could cause restoring a renamed hard link to fail with a file size mismatch or a file hash mismatch.
mount: the previous hard link renaming scenario could cause access to a renamed hard link to fail with an error "Bad address".
backup: hb can’t yet handle ACLs on zfs, nfsv4, or Windows filesystems since their ACLs are not Posix compatible. FreeBSD returns an error "Invalid argument" when these filesystems are backed up with hb. Instead of printing this error on every file, hb will only print it once, and print a note that ACLs on this filesystem aren’t supported.
backup: on Linux, backing up sshfs filesystems caused a fatal error because sshfs returns the wrong error code when reading flags.
selftest: could fail with an error on line 351: TypeError: not enough arguments for format string
ls: display symlink targets with -l, like ls -l
ls: display hard links with -lv
get: if there was a problem reading a symlink target, get would abort; it should have printed an error and continued the restore
raw block device / partition backup is supported. Before, hb only saved the block device information you would see with ls -l. Now, a block device pathname on the command line causes the contents of the block device to be saved. For example:
hb backup -c backupdir /dev/sda1
would backup the first partition on disk /dev/sda. You could also backup /dev/sda, which would save all partitions on the physical disk sda.
Before doing a block device backup, make sure the device is not mounted. hb does a basic check for this and won't backup anything that is displayed by df -l. If you do backup a mounted block device, you'll likely have a corrupt device on restore. Logical volumes can also be backed up using this method. To get a clean backup without unmounting a LV, make a snapshot LV first and backup the snapshot rather than the actual LV. To create a snapshot LV, there must be free space available on the same physical volume containing the LV (Linux).
hb is not yet smart enough to backup only the used blocks in a partition. It is safer (and easier!) to backup all blocks rather than reading the filesystem structures on the device to find used blocks, because hb doesn't need to know about filesystem details. But it can be slower if there is a lot of free space in the filesystem. Image backups are very space-efficient if dedup is enabled.
Dedup with a block size of -B4K works well with most filesystems, but the flipside is that this small block size does not compress very well. Depending on the compressibility of your data and the amount of data changed, a larger block size like -B1M may work better. Experiment with your actual data to decide which options are best.
If you are backing up several block devices and want dedup to work across all of them, for example, each block device has a similar Linux VM, you will have to use -B4K, because even when different block devices contain the same data, the block placement is not identical.
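To see why block placement matters, here is a small sketch of fixed-block dedup. It is not hb's implementation: the hashing scheme and all names are assumptions, and the two "devices" are just byte strings holding the same sixteen 4K blocks in a different order.

```python
import hashlib


def unique_blocks(data: bytes, block_size: int) -> set:
    """Hash every fixed-size block and return the set of distinct hashes."""
    return {
        hashlib.sha256(data[i:i + block_size]).digest()
        for i in range(0, len(data), block_size)
    }


def make_block(n: int) -> bytes:
    """Build a distinct, deterministic 4096-byte block (32 bytes * 128)."""
    return hashlib.sha256(str(n).encode()).digest() * 128


blocks = [make_block(n) for n in range(16)]
device_a = b"".join(blocks)
device_b = b"".join(reversed(blocks))   # same data, different placement


def shared(block_size: int) -> int:
    """Count block hashes that both devices have in common."""
    return len(unique_blocks(device_a, block_size)
               & unique_blocks(device_b, block_size))
```

With 4K blocks every block of device_b is already known from device_a; with 16K blocks nothing matches, because each larger block covers a different run of data.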
hb get will also restore entire block devices. The block device path must be given on the get command line for a full image restore. If --orig is used, the data is restored to the same device backed up. To restore to a different device, use:
The target block device must not be mounted. If it is mounted, your restore will almost certainly trash any file system there, and the restore will also be bad because an active filesystem is writing to the same device.
If neither --orig nor --todev are used, a file image is created from the backup as if you had done a dd from the device to a file named sda1 in the current directory.
On Mac OSX, the raw read block size is apparently fixed at 4K, so reading from a raw block device is slower than reading from the normal filesystem.
backup: to avoid confusion, display a message if dedup is not enabled, and display dedup utilization statistics when it is enabled. The statistics show whether -D would benefit from an increase in the dedup table size.
get: when a file with special flags was restored, get would display an error: 'module' object has no attribute 'chflags' on BSD and OSX.
selftest: could display an error message: Error: blockid n has version v where v is a version that was deleted. The backup is fine and would restore correctly; this was a bug in selftest.
backup: performance improved ~8% for small block sizes (VM images).
ls: if -r was used to display a specific backup version and a directory was not backed up in that version but was backed up in an earlier version, ls would sometimes not display the directory at all. Also, if a file in one backup was replaced with a directory by the same name in a later backup, ls -a would display the directory, but not the earlier file backup. It should have displayed both.
backup: a new -Z option controls compression. The default is -Z3, which is equivalent to the old behavior. The possible values are:
-Z0 = disable compression (replaces --no-compress option)
-Z1-7 = gzip level 2-8
-Z8 = bzip2 level 1
-Z9 = bzip2 level 3*
Dedup is not affected by the type or level of compression used, and different kinds/levels of compression can be intermixed in the same -c backup directory without problems.
-- on a multi-core system, HB will use all cores with bzip2. With gzip, HB will only use 2-4 cores by default. You can raise this with -p but should probably experiment with your system first.
-- disabling compression with -Z0 may be slower than -Z1 on files that are compressible, because more backup data is written; only use -Z0 if you are very sure that your files will not compress
-Zgz = use gzip at hb's recommended level
-Zgz,n = use gzip at level n (1-9)
-Zbz = use bzip2 at hb's recommended level
-Zbz,n = use bzip2 at level n (1-9)
In tests, higher bzip2 levels use more memory and take more time, but don’t seem to improve compression ratios very much over level 1; gzip level 9 takes longer but is rarely better than level 8. So don’t use -Zbz,9 thinking you will get the best results; often it will just make your backup and restore take longer. Compression is always data dependent, so testing with your actual data is the only good way to select a compression type & level.
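One way to test with your own data is a quick comparison like the sketch below. It uses Python's zlib (the deflate algorithm used by gzip) and bz2 modules rather than hb itself, so the sizes only approximate what a backup would produce; the function name and the sample data are made up for the example.

```python
import bz2
import zlib


def compare(data: bytes) -> dict:
    """Return compressed sizes for a few deflate and bzip2 levels."""
    results = {}
    for level in (1, 6, 9):
        results[f"deflate level {level}"] = len(zlib.compress(data, level))
    for level in (1, 9):
        results[f"bzip2 level {level}"] = len(bz2.compress(data, compresslevel=level))
    return results


# Highly repetitive sample; replace with bytes read from your real files.
sample = b"hashbackup release notes " * 4000
sizes = compare(sample)

if __name__ == "__main__":
    for name, size in sizes.items():
        print(f"{name}: {size} bytes ({100 * size / len(sample):.1f}%)")
```

Timing each call as well (for example with time.perf_counter) shows the other half of the trade-off.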
the --no-compress option is obsolete and should be changed to -Z0. It will still be honored for a few months to give everyone time to change cron jobs, scripts, etc. -Z has priority over --no-compress.
backup: compression was often disabled when backing up files on single CPU systems or when -p0 was used. This bug was just noticed but has been crawling around since #339.
backup: if dedup is not initially used, but is used in later backups, a scan of all blocks was occurring to update the dedup table even though nothing would change. This scan is now avoided.
get: permissions were set after a file was restored. For large files that might take a while to restore, the permissions were lax during the restore. Now the permissions bits are correct during the restore.
get: similar to above, directory permissions were set after a directory and all its contents were restored, and were too lax during the restore. Unlike a file, directory permissions cannot be set correctly while the directory is being restored. For example, if restoring a directory with r-x permission, it would not be possible to restore the directory contents because w access is needed. So during the restore, a directory’s permissions will be set to rwx for the owner, none for others. Then after the restore is complete, the correct permissions are set.
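The two-step approach for directories can be sketched like this on a POSIX system. This is an illustration of the idea only, not hb's restore code; restore_dir and its arguments are hypothetical.

```python
import os
import stat
import tempfile


def restore_dir(path: str, final_mode: int, files: dict) -> None:
    """Create a directory privately, fill it, then apply the saved mode."""
    os.mkdir(path, 0o700)
    os.chmod(path, 0o700)              # owner rwx only while restoring
    for name, content in files.items():
        with open(os.path.join(path, name), "wb") as f:
            f.write(content)
    os.chmod(path, final_mode)         # correct permissions set last


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "restored")
    restore_dir(target, 0o555, {"a.txt": b"hello"})
    final = stat.S_IMODE(os.stat(target).st_mode)
    restored_names = os.listdir(target)
    os.chmod(target, 0o700)            # make it writable again for cleanup
```

Setting r-x (0o555) first would make the file writes fail for a non-root user, which is why the final mode has to wait until the contents are in place.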
during a sync operation, where archive files from a previous operation needed to be sent to remotes, files were not sent in order.
mount: reading files through an hb mount point was very slow for large files backed up with fixed block sizes. Here are performance comparisons for a 10GB VM image saved with 4K blocks (smaller block sizes benefit more than larger block sizes, but both are faster):
Read at offset 0:
was: 258215424 bytes transferred in 30.039307 secs (8.595918 Mbytes/sec)
now: 550768128 bytes transferred in 30.244884 secs (18.210291 Mbytes/sec)
Read at offset 512M:
was: 22285824 bytes transferred in 32.114154 secs (693.956 Kbytes/sec)
now: 693599744 bytes transferred in 30.717192 secs (22.580181 Mbytes/sec)
Read at offset 1G:
was: 6557184 bytes transferred in 31.744395 secs (206.562 Kbytes/sec)
now: 577826304 bytes transferred in 31.432984 secs (18.382801 Mbytes/sec)
mount: reading a file sequentially is twice as fast for files backed up with -D. This is on top of the improvements in #537, where an n^2 algorithm was replaced by a nlogn algorithm.
backup: the config keywords no-dedup-ext, no-compress-ext, and no-backup-ext specify file extension of files that you don’t want to dedup, compress, or backup. The expected way of specifying these was: hb config -c backupdir no-dedup-ext 'jpg,jpeg'. Matching of suffixes is case-independent. The change in this release is that extensions can also be specified with spaces and/or with leading periods, so this is valid: hb config -c backupdir no-dedup-ext 'jpg jpeg, deb .iso'
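The accepted formats can be pictured with a small parser. This is a sketch of the rules described above, not hb's code; the function names are invented for the example.

```python
import re


def parse_extensions(value: str) -> set:
    """Split on commas and/or whitespace, drop leading periods, lowercase."""
    parts = re.split(r"[,\s]+", value.strip())
    return {p.lstrip(".").lower() for p in parts if p.strip(".")}


def has_listed_extension(path: str, extensions: set) -> bool:
    """Case-independent suffix match against a parsed extension set."""
    _, dot, suffix = path.rpartition(".")
    return bool(dot) and suffix.lower() in extensions


old_style = parse_extensions("jpg,jpeg")
new_style = parse_extensions("jpg jpeg, deb .iso")
```

Both 'jpg,jpeg' and 'jpg jpeg, deb .iso' reduce to a plain set of lowercase suffixes, so Photo.JPG and disk.iso match while README does not.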
backup: if a backup is terminated with kill -9, it doesn’t get a chance to clean up archive files; the next backup could print a negative number for the space used if the successful backup was smaller than the terminated backup
hb would not upgrade rev 9 databases to rev 11; now it will
security: hb would display a specific error message if a padding error was detected during decryption. This can often be used in a "padding oracle" attack. It doesn’t exactly apply to hb because an attacker must be able to request decryption of chosen ciphertext, and hb will not perform decryption without a correct key file. But as a precaution:
-- hb will no longer distinguish between padding errors, decompression errors, and hash mismatches; all of these will cause a hash mismatch
-- padding now uses random bytes
if a directory destination (Type dir) in dest.conf didn’t exist, it caused a confusing error message like:
dest hb2: error in send arc.0.0: [Errno 2] No such file or directory: '/Users/jim/hbdir/arc.0.0.tmp'
dest hb2: Traceback (most recent call last): File "/basedest.py", line 75, in loop File "/dirdest.py", line 22, in loopinit err: dir(hb2): directory doesn't exist: /Users/jim/hbdir
if an error occurred while initializing a destination, the destination name was not always included in the error message
mount: when a large file was backed up with -D, reading the file via an hb mount point would take a long time
backup: on BSD and OSX, hb indirectly used sysctl (a system command) to determine the number of CPU cores. But sysctl may not be available to cron jobs with the default path, so PATH had to be changed in the crontab file. A different method is used now that doesn’t require setting PATH. NOTE: this change was supposed to go in release #512
backup: display an error message rather than a traceback when non-integer values are used for keywords in dest.conf that are supposed to have integer values, such as a port number.
get, ls: using -rX where X is a deleted version would cause a traceback. Now it displays an error that the version doesn’t exist.
IMPORTANT NOTE: this release contains critical fixes to the recover feature. Recover is needed when some or all of the local backup directory is lost, to recover data from a remote backup. Everyone should apply this upgrade if using a remote backup (in dest.conf file)
recover: sometimes an error could occur during recovery: Exception: Incremental db missing? 787521536 > 785825792: /hbtest/hb.db.163 The backup data is all fine, but there was an incorrect test in the recovery code that is now fixed.
recover: in an unusual situation where backups are running but a destination is unavailable (so no files are being transferred), then the destination becomes available and you recover from the destination before backup is able to sync the destination, it could cause the error: OSError: [Errno 2] No such file or directory: '/hbbackup/hb.db.229' This has been corrected.
backup: socket pathnames caused a "Pathid xxx unused" error in selftest. This is harmless and selftest would remove the unused paths, but the cause is now fixed
selftest: in the previous release, selftest level 4 verified that every file block could be decrypted and uncompressed, and that the block and file checksums were correct. For backups with a lot of dedup, VM backups for example, the same data block might need to be decrypted, uncompressed, and hashed many times to compute the file checksum. Now, -v4 will only process each block once and will not verify whole file checksums. -v5 will verify file-level checksums and is equivalent to -v4 in previous releases. Running selftest without -v is the same as -v5, the highest level of checking.
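The saving from checking each block once can be illustrated with a toy model. The data layout, the use of SHA-1, and the function names below are assumptions for the sketch and do not reflect hb's internal structures.

```python
import hashlib


def check_blocks_once(files: dict, blocks: dict, block_hashes: dict) -> int:
    """Verify each distinct block a single time; return blocks hashed."""
    verified = set()
    for block_ids in files.values():
        for block_id in block_ids:
            if block_id in verified:
                continue
            if hashlib.sha1(blocks[block_id]).hexdigest() != block_hashes[block_id]:
                raise ValueError(f"block {block_id} is damaged")
            verified.add(block_id)
    return len(verified)


def check_files(files: dict, blocks: dict, file_hashes: dict) -> int:
    """Verify whole-file checksums; shared blocks are hashed repeatedly."""
    hashed = 0
    for name, block_ids in files.items():
        h = hashlib.sha1()
        for block_id in block_ids:
            h.update(blocks[block_id])
            hashed += 1
        if h.hexdigest() != file_hashes[name]:
            raise ValueError(f"file {name} is damaged")
    return hashed


blocks = {1: b"A" * 4096, 2: b"B" * 4096}
files = {"vm1.img": [1, 2, 1, 1], "vm2.img": [1, 1, 2]}
block_hashes = {i: hashlib.sha1(b).hexdigest() for i, b in blocks.items()}
file_hashes = {
    name: hashlib.sha1(b"".join(blocks[i] for i in ids)).hexdigest()
    for name, ids in files.items()
}
```

In this toy backup the block-level pass hashes 2 blocks while the file-level pass hashes 7, and the gap grows with the amount of dedup.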
backup: HB is more efficient about storing hb.db.nnn files. The first backup after installing this release may delete more hb.db.nnn files than usual. Keep in mind that the hb.db.nnn file sequence numbers do not necessarily correspond to the backup version, ie, backup #5 may create hb.db.7. This change will also make recovery times shorter.
NOTE: this rev will do an automatic database upgrade to dbrev 11. Any previously backed up socket files are deleted during the upgrade, since these cannot be restored by hb get and generated errors during a restore.
still working on the option of remote-only backups
backup: backups on BSD failed with an error message: object has no attribute 'setbackup'
get: in a large restore of thousands of files, an error could sometimes occur on a few files:
Warning: partially restored file: (filename shown here) Exception: free variable 'cur' referenced before assignment in enclosing scope Continuing restore
backup: on BSD and OSX, hb indirectly used sysctl (a system command) to determine the number of CPU cores. But sysctl may not be available to cron jobs with the default path, so PATH had to be changed in the crontab file. A different method is used now that doesn’t require setting PATH.
in #505, a 30-second timeout was added to rsync destinations. On a very slow target (an unslung NSLUG2 NAS), rsync might repeatedly time out when sending an updated archive after a retain or rm operation. Now, the default timeout is 3600 seconds (1 hour), but it can be changed with the Timeout keyword in dest.conf for rsync destinations.
ssh destinations sometimes failed with authentication errors when OSX or BSD tried to connect to a remote ssh server running CentOS 5.5. To improve compatibility, hb now uses the system sftp program instead of connecting directly to the remote ssh server.
socket files were backed up in previous versions of hb, but these can never be restored and hb get would generate an error when it tried to restore sockets. Sockets are no longer backed up.
development of the "remote only" backup option is not quite ready, so this very minor update is being issued to extend the beta expiration date to September
a few doc files were updated
backup -c <dir> to an empty directory (no hb init) creates hb.lock, but then hb init refused to run because the directory was not empty. Now, hb.lock is ignored by init
NOTE: this rev will do an automatic database upgrade to dbrev 10. Some items were moved from archives to the main database, so many archives may have items removed and your hb.db file may grow slightly.
IMPORTANT USAGE NOTE: if a new destination is added to an existing backup, the new destination is not fully synchronized until after the next successful backup. This has not changed, just making everyone aware
COMPATIBILITY NOTE: in previous versions, the hb executable was copied to all remote destinations if it changed. Now, the hb executable is only copied if the config variable copy-executable is True. The default in this version is False, which is a change in behavior. To re-enable this, use: hb config copy-executable True
Google Storage for Developers, an Amazon S3-like service currently in beta and available by Google invitation only, is now supported. The destination type is gs, the dest.conf config variables are accesskey, secretkey, and bucket, as with S3 destinations, and the environment vars are GS_ACCESS_KEY_ID and GS_SECRET_ACCESS_KEY; environment vars are only used if accesskey and secretkey are not specified in dest.conf. For more information, see http://code.google.com/apis/storage/docs/overview.html
NOTE: Google Storage for Developers is not the same as Google Storage for Docs. HashBackup does not yet support using Google Storage for Docs as backup space.
S3-compatible services are supported with the new Host and Port config variables on an S3 destination. This can be used with Eucalyptus' Walrus S3-like service for example. Eucalyptus / Walrus is an open source S3 clone that provides an S3-like service. For more information see http://www.eucalyptus.com
NOTE: the Host and Port keywords are not necessary with destination types s3 or gs, and will default to the correct Amazon or Google values. For other S3-like services, Host and/or Port are required.
Rackspace Cloud Files storage is supported in this version. The destination type is cf, Userid and Accesskey keywords are required, and Container is where you want your backup stored. Unlike S3 and Google Storage, Cloud Files containers are per-userid, so there is no need to find a globally unique name. Rackspace charges for incoming bandwidth are about half of Amazon S3 and Google Storage. For more information see www.rackspace.com
rsync destinations will time out after 30 seconds if the remote is unresponsive. Before, it took a very long time for an rsync destination to time out
if a local archive file is missing during a get (restore), hb automatically downloads it as needed; this is not new. Destinations in dest.conf should be listed fastest download speed first, and the fastest destination with the most up-to-date version of a file is the one that will be selected for downloading. Before this release, the destination chosen was unpredictable. This does not apply to recover (get all files) since recover uses only the destination you specify.
if a destination had a failure, hb would display an error and process the next request for that destination. Now hb will try requests up to 3 times, and if they all fail, the destination is stopped and no more requests are sent to it. hb will "catch up" the destination on the next backup.
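The retry-then-stop behavior might look roughly like the sketch below. The Destination class, its attributes, and the use of OSError as the failure signal are all hypothetical; only the "up to 3 tries, then stop and catch up later" logic comes from the note above.

```python
class Destination:
    """Toy model of a destination that is stopped after repeated failures."""

    MAX_ATTEMPTS = 3

    def __init__(self, name, send):
        self.name = name
        self.send = send        # callable that transfers one item
        self.stopped = False
        self.pending = []       # items to "catch up" on the next backup

    def request(self, item) -> bool:
        if self.stopped:
            self.pending.append(item)   # no more requests are sent
            return False
        for _ in range(self.MAX_ATTEMPTS):
            try:
                self.send(item)
                return True
            except OSError:
                continue                # try again
        self.stopped = True
        self.pending.append(item)
        return False


calls = []


def always_fails(item):
    calls.append(item)
    raise OSError("remote unavailable")


dest = Destination("hb2", always_fails)
first = dest.request("arc.0.0")
second = dest.request("arc.0.1")
```

After the first request fails three times, the second request never reaches the remote; both items remain pending for a later sync.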
when multiple destinations were configured and an error occurred on one destination, the other destinations continued to work correctly; but once the working destinations were finished, a backup would sometimes "hang" waiting for the failed destination to finish (it never would finish since it failed)
selftest -v3 sometimes displayed the backup size larger than the actual size.
selftest: data structures created during selftest are more memory-efficient
when backup is waiting for destinations to finish, it prints a list of destinations that are still busy with transfers, and updates this list as destinations finish their work
display a message when the hb executable program is being copied to destinations; this can sometimes take a while, and it isn’t always obvious why there is a longish delay
the major new feature in this release is incremental transmission of hb.db, the main HashBackup™ database. The major new feature of the next release is making optional the local copy of the backup.
COMPATIBILITY NOTE (backup): the inex.conf file previously allowed excluding files from the backup and overriding those excludes with includes. There were several bugs and points of confusion, so this was simplified by eliminating the include keyword: now, files can only be excluded. To include a file that would be excluded by an inex.conf rule, list the pathname on the backup command line. Include processing may be added again later, depending on user feedback. As a side benefit, the incremental backup file scan to find modified files is 10% faster.
COMPATIBILITY NOTE (backup): in previous releases, data from a prior backup was often used for dedup even if the -D option wasn’t specified. Now, the -D option is required to enable dedup. Backup data created without -D is not used to dedup future backups, in most cases.
config: a new config parameter, no-backup-ext, has been added. This is a list of filename extensions that should not be backed up. For example, you might use: hb config -c /hb no-backup-ext avi,mov,o to skip backup of files ending with .o, .avi, and .mov. This is faster than using exclude patterns like ex *.o in inex.conf.
config: a new config parameter, dedup-mem, sets the default amount of memory to use for dedup operations. This can be overridden by backup’s -D option. The default is zero, for no dedup. In this release it’s also possible to have dedup-mem set to some value you usually use, like 1gb, and use the -D0 backup command line option to disable dedup just for that backup. See doc/dedup.info for more information about the amount of memory to use for dedup.
a new command, hb init, is required to initialize the -c backup directory before the first backup. This allows modifying exclude rules before the first backup, and allows setting the encryption key to a user-specified value using the -k option rather than hb choosing a random key.
SECURITY NOTE: for higher security, it is recommended that you let hb init choose a random key string, as with previous releases. Or, you can choose a long phrase that is easy to remember for your key, for example: my cat chases my dog. For less security, you can use -k '', which specifies encryption with a blank key. With a blank key, there's no need to store the key securely. Spaces are removed from the key, so key abc def is the same as abcdef.
a new command, hb rekey, can be used to change the database encryption key. Usage is similar to hb init. After the database is rekeyed, it is transmitted to any remote destinations you may have set up in dest.conf. If you want some security but don’t want to fiddle with storing encryption keys separately, you could rekey to an easily remembered phrase like 'my dogs name is spot'. For less security, you could rekey to the blank key ''.
backup: the Freq keyword is no longer supported. This was used to defer transmits offsite, for instance, once per week. The only time or bandwidth savings was for transferring the hb.db file, and this is now sped up in a different way.
sending incremental backups offsite is much faster, especially if not many files were modified. For rsync destinations, the first backup after the upgrade will be much slower than usual, while other methods (ftp, etc) will be about the same. After that, it will be faster than usual for all destination types.
clear: files stored on destinations are also removed; a new warning about this is displayed unless --force is used
mount: when accessing the mounted HB filesystem, data near the end of a file would sometimes not be returned correctly. The backup itself was fine - only accessing it via mount was affected.
backup: dir destinations were creating the target directory, even though dest.conf.example said the directory had to exist
backup: if a pathname on the command line is a symbolic link, backup saves both the symlink and its target. For example, on BSD systems, /home is a symlink to /usr/home, so a backup of /home saves both the /home symlink and the complete /usr/home tree. The new behavior in this release is that if a backup pathname contains a symlink, the pathname is resolved and the target pathname is used instead. So for example, if /home/jim is saved, a message is displayed that the pathname was changed to /usr/home/jim, and this tree is saved. No /home/jim pathnames will appear in the backup, since they don’t actually exist in the filesystem.
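Resolving a symlink inside a pathname can be demonstrated with os.path.realpath. This is a sketch of the idea rather than hb's code: resolve_backup_path is a made-up name, and a temporary directory stands in for the BSD /home and /usr/home layout.

```python
import os
import tempfile


def resolve_backup_path(path: str):
    """Return the symlink-free path and whether it differs from the input."""
    resolved = os.path.realpath(path)
    changed = resolved != os.path.abspath(path)
    return resolved, changed


with tempfile.TemporaryDirectory() as tmp:
    real_home = os.path.join(tmp, "usr", "home")
    os.makedirs(os.path.join(real_home, "jim"))
    link = os.path.join(tmp, "home")
    os.symlink(real_home, link)         # home -> usr/home, as on BSD

    resolved, changed = resolve_backup_path(os.path.join(link, "jim"))
    expected = os.path.realpath(os.path.join(real_home, "jim"))
```

The path given through the home symlink resolves to the usr/home tree, which is the pathname that would then appear in the backup.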
ls would not display the contents beneath a symlink to a directory. For example, on BSD systems, /home is a symlink to /usr/home. If /home/jim was given to the backup program, all of /home/jim would be saved, but ls would only print / and /home. This situation cannot occur going forward, because in this release, the backup program resolves the symlink in the pathname (see previous note).
ls displays "(parent, partial)" when a directory is backed up because it is the parent of a file that was requested. For example, if /Users/jim/x is backed up, /, /Users, and /Users/jim are also listed in the backup, all marked "(parent, partial)"
get: if a file or directory being restored already exists, get prints a warning. If the user interrupted the restore, deleted or renamed the existing file, and then continued the restore, get would fail when trying to delete the old object
backup: sometimes a hard-linked file would be saved on every backup
backup: create hash.db without execute permission bits
rm & retain: if an archive was removed while some destinations were offline or inaccessible, it was not removed later when the destinations were accessible
rm & retain: if one file such as /Users/jim/x is backed up and then removed, a small archive was left in the backup directory if there were extended attributes or ACLs on any of the parent directories
error messages were sometimes being sent to stdout instead of stderr, which can be an issue for scripting hb
ls: failed when a file or pathname was listed on the command line. This bug first occurred in #408.
versions: with no options, display the most recent 5 backups vs 1
in #416, the upgrade command was changed to set the owner of the hb executable as it was before the upgrade. But a typo caused the error: AttributeError: 'module' object has no attribute 'state' After this error, there will be an hb.tmp file in the same directory as the old hb executable. To finish the upgrade, do: mv hb.tmp hb
backup: if a single zero-length file was saved, the empty archive file created should have been deleted
delay transmitting archives after a retain or remove if they don’t shrink very much. This was unintentionally removed at #408
in release #408, the help command didn’t work if the fuse library was missing. Now, only the mount command will fail when fuse isn’t installed
COMPATIBILITY NOTE: the undocumented --no-dupcheck option to backup has been removed, as no dedup is now the default and has been for several releases. To enable dedup, use the -D option, for example, -D1g will dedup using 1GB of memory.
this version of HashBackup has a new archive format that uses up to 9% less space to backup the same amount of data from VM images, and is also faster to create and access. Old format archives are still supported and are converted to the new format when they are accessed. You can leave old format archives on remotes, and when recovered, hb will convert them. If you want to force all local archives to be converted immediately, use hb selftest -v3 after installing this new version. The converted archives will not be uploaded to destinations unless they shrink much smaller than the remote archive. If you want to force all converted archives to be uploaded, run selftest as described, then delete dest.db from the backup directory. All archives will be uploaded during your next backup.
NOTE: during conversion, new format archives are created in a temp file and then replace the original. Because it would double your backup space requirements, no copies of the original archives are kept. If you want a backup of your old archive files, copy them before running this version of hb.
a new command "config" will display HashBackup’s config settings. These can also be changed with the config command to modify the operation of hb. In this release there are several config settings:
arc-size-limit: this controls how large individual archive files can grow before a new archive file is started. The default is 1gb, as before. The minimum value is 1mb and the maximum value is 8gb. Be careful not to set the size larger than your remote destinations can handle, for example, Amazon S3 is limited to 5gb file sizes, so the archive size limit should probably be 4gb at most. GMail has a size limit of 25mb, so a limit of 20mb should be used.
no-dedup-ext: files with a suffix listed here will not be deduped. Any suffixes listed here will add to the built-in list that backup uses. Suffixes should be listed with commas but without periods, for example: hb config no-dedup-ext bz2,bz
Config settings are versioned and match the corresponding backup. This allows you to see the config settings used for any prior backup, and to revert to the config settings of a prior backup.
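As a rough illustration of the arc-size-limit bounds described above, the following Python sketch parses a size string and checks it against the 1mb minimum, the 8gb maximum, and an optional destination limit. The function names, the 1024-based unit multipliers, and the dest_max parameter are assumptions made for this example; they are not taken from HashBackup itself.

```python
import re

# Assumption: units are binary multiples; the changelog does not say.
UNITS = {"kb": 1024, "mb": 1024 ** 2, "gb": 1024 ** 3}
MIN_LIMIT = 1 * UNITS["mb"]  # documented minimum: 1mb
MAX_LIMIT = 8 * UNITS["gb"]  # documented maximum: 8gb


def parse_size(text):
    """Convert a string such as '20mb' or '4gb' to a byte count."""
    match = re.fullmatch(r"(\d+)(kb|mb|gb)", text.strip().lower())
    if match is None:
        raise ValueError("unrecognized size: %r" % text)
    return int(match.group(1)) * UNITS[match.group(2)]


def check_arc_size_limit(text, dest_max=None):
    """Return the limit in bytes, or raise ValueError if it is out of range.

    dest_max is a hypothetical per-destination cap in bytes, for example
    the largest file a remote will accept.
    """
    size = parse_size(text)
    if not MIN_LIMIT <= size <= MAX_LIMIT:
        raise ValueError("arc-size-limit must be between 1mb and 8gb")
    if dest_max is not None and size > dest_max:
        raise ValueError("arc-size-limit exceeds the destination's file size limit")
    return size
```

With these assumptions, check_arc_size_limit("20mb", dest_max=parse_size("25mb")) is accepted, while "9gb" is rejected as out of range.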
a new command "compare" will compare a filesystem path to a backup and display any differences. Compare only supports comparison to the current backup version, though in the future it may be extended to compare to older backups ("What has changed since -r10?")
the built-in help has been cleaned up
if recover -a is used to fetch files from remote destinations, and .old files exist in the local backup directory, recover would warn that archive files exist and would be renamed, even if they didn’t really exist; the .old files were confusing recover. This could happen if recover was executed twice for example. Everything worked, just the warning was sometimes incorrect.
when using a GMail account to store backups, hb.dbz and dest.dbz would keep appending to their email conversation, instead of replacing the old files. HB was never designed to work this way, and didn’t work this way with other email providers. Now, there will only be 1 hb.dbz and 1 dest.dbz, to save space in the GMail account. Be sure to set arc-size-limit (see above) when using an email account to store backups.
for future upgrades, hb upgrade will display only the changes since the version of hb that is already installed. So for example, if you are at #406, miss the upgrade to #407, and do the upgrade to #408, it will display only the changes in #407 and #408 - not the whole change log.
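The filtering described above can be sketched in a few lines of Python. The changelog dictionary and the function name are hypothetical; how hb upgrade actually stores and selects change notes is not described here.

```python
def changes_since(installed_build, changelog):
    """Return (build, notes) pairs for builds newer than installed_build.

    changelog maps build numbers to their release notes; this structure
    is an assumption made for the example.
    """
    return [(build, changelog[build])
            for build in sorted(changelog)
            if build > installed_build]


# Example from the text: installed #406, upgrading to #408.
notes = {406: "notes for #406", 407: "notes for #407", 408: "notes for #408"}
for build, text in changes_since(406, notes):
    print("#%d: %s" % (build, text))
```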
with release #339, the upgrade command sometimes had trouble finding and replacing the hb executable. As a workaround, run the upgrade using the full pathname of the executable: /path/to/hb upgrade
This was fixed in #346, and upgrade should work fine with that release.
the October build was created on newer versions of Linux and BSD 8. Unfortunately, on Linux it required glibc 2.7, but CentOS, RHEL, and other versions of Linux do not have this newer C library. This version of HashBackup was built on CentOS 5.5 and should run on both older and newer versions of Linux. The BSD version was built on BSD 7, and should run on both BSD 7 and 8.
new command "upgrade" upgrades your hb executable to the latest version. This should be run from a userid with sufficient permission to replace the hb command. For example, if hb is in /usr/local/bin, you may need to run hb upgrade as root. The old hb executable will be renamed to hb.bak after a successful upgrade.
backup: on Mac/OSX, files would sometimes be backed up that hadn’t changed. OSX changes ctime on files, and this is what hb used to detect changes. This is fixed with a more detailed test using both ctime and mtime.
backup: related to the previous issue, some sites may not want to trust mtime, because it can be set by user programs. For example, a file’s data can be changed and then mtime reset to its previous value; this makes it appear that the file data hasn’t been changed. It’s usually fine to trust mtime, but if your site does not, use the --no-mtime backup option and hb will compare the entire file to the previous backup when mtime is the same, to ensure the file data is also still the same. --no-mtime will make your backup run somewhat slower.
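To make the --no-mtime trade-off concrete, here is a minimal Python sketch of the idea. It is not HashBackup's implementation: the Saved record, the function names, and the use of a SHA-256 comparison in place of a full file comparison are assumptions, and the ctime part of hb's test is left out for brevity.

```python
import hashlib
import os
from collections import namedtuple

# Hypothetical record of what was stored for a file at the previous backup.
Saved = namedtuple("Saved", "size mtime_ns sha256")


def file_sha256(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot(path):
    """Record the size, mtime, and content hash of a file."""
    st = os.stat(path)
    return Saved(st.st_size, st.st_mtime_ns, file_sha256(path))


def needs_backup(path, saved, trust_mtime=True):
    """Decide whether a file has changed since the saved snapshot."""
    st = os.stat(path)
    if st.st_size != saved.size or st.st_mtime_ns != saved.mtime_ns:
        return True   # size or mtime differs: treat as changed
    if trust_mtime:
        return False  # same size and mtime: assume the data is unchanged
    # mtime is not trusted: read the whole file and compare its hash
    return file_sha256(path) != saved.sha256
```

With trust_mtime=False, every file whose mtime is unchanged is read in full, which is why an option like --no-mtime makes a backup slower.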
Variable block dedup is not used:
- for files smaller than 128K
- when -p0 is used to disable multicore backup
- on single-core systems
backup: with rev #321, an error message: OperationalError: database arc is locked was sometimes displayed on incremental backups larger than 1GB.
add the Port option to ftp destinations; userid and password are now optional, for anonymous ftp access
a new hb command sha256 computes the sha256 hash for a file. This is handy if your OS doesn’t have a built-in sha256 command.
ls accepts a new -v option. When combined with -l (long display), -v also displays the inode, ctime, and sha256 hash
backup: on Mac/OSX, if backing up / and the current directory was not /, an incorrect error message was displayed, like:
Pathname changed: / ⇒ /Users/jim/
This bug was introduced in r288
backup: if a single zero-length file was saved in version 0, or all backup data was removed with hb rm /, the next backup would immediately fail with an error
NOTE: this rev will do an automatic database upgrade to dbrev 7. If you use -D, your first backup after this upgrade may be slower to get started while it rebuilds the dedup database.
a 64-bit build of hb is now available for Linux and BSD; the Intel OSX build is already 64-bit. The advantage is that 64-bit builds allow dedup (-D) tables larger than 2GB, and 32-bit compatible system libraries don’t have to be installed.
backups with dedup (-D) are a little faster because of less I/O
improved error handling for multi-core backups
a new destination type, sftp, is added. This is equivalent to the old ssh destination type, except now either will work with ssh servers that allow sftp access but may not allow terminal sessions
get: redundant pathnames are an error, not a warning. Better safe than sorry
if HB was waiting for a yes/no question to be answered, ^z was used to suspend the program, then fg was used to start it again, an error about an interrupted system call would be displayed. Now, the question will be repeated.
NOTE: this rev will do an automatic database upgrade to dbrev 6. The hb.db file may shrink up to 50% after this upgrade, because some data is moved into a separate database, hash.db. The advantage is that when hb.db is sent offsite, it may be much smaller; hash.db is not sent offsite and is generated from hb.db when necessary
backup: archive files were sometimes being transmitted twice
backup: a new option, -p, uses multiple CPU cores to backup large (>128K) files. This can speed up the backup of large files by 30% or more. If the -p option is not used, HB will automatically use multiple cores when run on a multi-core system. The -p option allows finer control of this feature:
-p0 = do not use multiple CPU cores for backing up
-pn = use n extra tasks
no -p = use 2 extra backup processes on a multi-core computer
You might be tempted to use -p8 on an 8-core system, but it could actually make your backup slower. If you have very fast disks, you may want to try increasing -p to 3 or 4.
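The -p rules above can be summarized as a small Python function. This is only a sketch: the function name is hypothetical, and returning 0 extra tasks on a single-core system when -p is absent is an assumption, since the text only states the multi-core default.

```python
import os


def extra_backup_tasks(p_option=None, cpu_count=None):
    """Number of extra backup tasks for a given -p value.

    p_option is None when -p is not given on the command line.
    """
    if cpu_count is None:
        cpu_count = os.cpu_count() or 1
    if p_option is None:
        # no -p: 2 extra processes on multi-core (from the text);
        # 0 on single-core is an assumption
        return 2 if cpu_count > 1 else 0
    if p_option < 0:
        raise ValueError("-p value must not be negative")
    return p_option  # -p0 disables multi-core backup, -pn uses n extra tasks
```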
backup: a new option, -D, controls data dedup. This version of HashBackup uses a new dedup method that is much more scalable. The value after -D is the amount of memory you want to use for dedup information. The -D option may be tweaked in the next few releases. For detailed information, see doc/dedup.info
NOTE: in previous releases, dedup was enabled by default; now, it is disabled by default, because it requires some thought about the amount of memory to use for dedup. HashBackup may still do some dedup on incremental backups, even if the -D option isn't used, especially with VM images, logs, mailboxes, and databases.
backup: a new option, -B, controls the block size. This can be 1K, 2K, 4K, 8K, 16K, 32K, 64K (default), 128K, 256K, 512K, 1M, 2M, or 4M. Tests show that a large block size doesn’t usually speed up I/O, but it allows HB to scale to larger backups when very large files are being saved. A trade-off is that the dedup mechanism may not find as much duplicate data with larger block sizes. With small block sizes, dedup is better, but overhead is higher and the backup may be slower. The default block size is either 64K or 4K bytes, depending on the file type.
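The trade-off between block size and dedup effectiveness can be seen with a toy example. The Python sketch below uses simple fixed-size blocks and SHA-256 hashes; it is an illustration of the general idea only and does not reflect how HashBackup actually splits or indexes data.

```python
import hashlib


def dedup_stats(data, block_size):
    """Split data into fixed-size blocks and count what dedup would store.

    Returns (total_blocks, unique_blocks, stored_bytes).
    """
    seen = set()
    total_blocks = 0
    stored_bytes = 0
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        total_blocks += 1
        digest = hashlib.sha256(block).digest()
        if digest not in seen:  # first time this block content is seen
            seen.add(digest)
            stored_bytes += len(block)
    return total_blocks, len(seen), stored_bytes


# Four 4K regions; the first and third are identical.
data = b"A" * 4096 + b"B" * 4096 + b"A" * 4096 + b"C" * 4096
print(dedup_stats(data, 4096))  # the repeated 4K region is found
print(dedup_stats(data, 8192))  # 8K blocks hide the repeated region
```

In this example the 4K block size stores 12288 bytes in 4 blocks, while the 8K block size stores all 16384 bytes in 2 blocks: fewer blocks to track, but no duplicate found.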
backup: print backup directory space used just for this backup and in total
backup: print compression statistics for this backup as both a percentage and compression factor
backup: a new option, -m/--max-file-size n, will skip files larger than n bytes
backup: display an error message if a directory destination is the backup directory. This copied the backup onto itself and caused warning messages like:
dir(destname): file changed during transfer: /hb/arc.0.77
The backup remained intact, but there would be a lot of unnecessary disk I/O.
backup: if -c backup target is a FAT filesystem, such as most flash drives, the initial backup would fail with an error message like:
OSError: [Errno 1] Operation not permitted: '/mnt/hb/key.conf'
HashBackup was trying to make the key file read-only, but this is not possible on a FAT filesystem. Trying the backup again would succeed.
backup: on Mac/OSX, strip trailing slashes on command line pathnames to prevent errors like:
Pathname changed: /Users/jim/backup/rest ⇒ /Users/jim/backup/rest/
Unable to stat file: No such file or directory: /Users/jim/backup/rest/
get: if the same pathname was listed on the command line twice, or redundant pathnames such as /abc and /abc/def were used (because restoring /abc would also restore /abc/def), get would fail with an error on OSX like:
Unable to hardlink: Operation not permitted: /Users/jim/hb-551625.tmp -> /Users/jim/xxx
Not restored: /Users/jim/xxx
On Linux, get would fail with an error about using bind --mount. The get command incorrectly believed the directories should have been hardlinked. Now, a warning is printed about skipping the redundant pathname.
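Detecting redundant pathnames such as /abc and /abc/def can be sketched as follows. This Python example is a hypothetical illustration, not hb's code: the function names are invented, and paths are compared as plain strings after stripping trailing slashes.

```python
def normalize(path):
    """Strip trailing slashes, but keep the root path as '/'."""
    stripped = path.rstrip("/")
    return stripped or "/"


def is_under(child, parent):
    """True if child lies strictly inside the directory parent."""
    if parent == "/":
        return child != "/"
    return child.startswith(parent + "/")


def split_redundant(paths):
    """Return (kept, redundant) lists of normalized pathnames.

    A path is redundant if it repeats a kept path or lies inside one,
    because restoring the parent also restores everything under it.
    """
    kept, redundant = [], []
    for raw in paths:
        path = normalize(raw)
        if any(path == k or is_under(path, k) for k in kept):
            redundant.append(path)
            continue
        # A new parent makes previously kept children redundant.
        redundant.extend(k for k in kept if is_under(k, path))
        kept = [k for k in kept if not is_under(k, path)] + [path]
    return kept, redundant
```

Note that /abc and /abcd are not treated as related: the check appends a slash to the parent before comparing prefixes.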
get: when a directory specified on the command line already exists, hb would correctly do the restore into a temp directory; but when it tried to remove the old existing directory, the remove could fail with a "Not a directory" error if the old directory contained a symbolic link to a directory. ("directory" 6x - Ugh!)
get: don’t complain about existing symbolic links if they point to the correct destination. This doesn’t change hb’s restore behavior, but avoids unnecessary error message displays.
get: setuid and setgid mode bits were not being restored (these are for privileged commands like "mount"; see chmod command)
get: as a security precaution, setuid, setgid, and the sticky bit are only restored when running as root. Otherwise, a warning is printed for files with these bits set; see chmod command.
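The precaution above amounts to masking three mode bits when not running as root. A minimal Python sketch using the standard stat module follows; the function name and return shape are assumptions for illustration only.

```python
import stat

# setuid, setgid, and sticky bits
SPECIAL_BITS = stat.S_ISUID | stat.S_ISGID | stat.S_ISVTX


def mode_to_restore(saved_mode, is_root):
    """Return (mode, warn) for a restored file.

    mode is the permission bits to apply; warn is True when special
    bits were dropped because the restore is not running as root.
    """
    perms = stat.S_IMODE(saved_mode)  # keep only the permission bits
    if perms & SPECIAL_BITS and not is_root:
        return perms & ~SPECIAL_BITS, True
    return perms, False


# A setuid program saved with mode 4755, restored by a normal user:
mode, warn = mode_to_restore(0o104755, is_root=False)
print(oct(mode), warn)
```

In a real restore, is_root would typically come from a check such as os.geteuid() == 0, and the resulting mode would be passed to os.chmod.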
get: if an error occurred during restore of one path, get would (incorrectly) say that there were errors in all subsequent paths, even though the paths were restored without errors
recover: non-rsync destinations could fail with the error: object has no attribute 'cursor'
recover: could display the warning message: Unable to set mtime on hb.db: Cannot operate on a closed database.
recover: now displays a summary warning if any files were not downloaded
get: refused to restore / to a non-empty directory. But this is sometimes necessary for a system rescue, so now this error is a warning. As always, be careful when restoring files from a backup!
get: if / was restored to some directory (not /), without using --orig, the pathname displayed for restored files was incorrect and symbolic and hard linking did not work correctly. This situation typically occurs when booting from a CD to restore a crashed root filesystem that is temporarily mounted under /mnt.
recover: there was a bug in the new feature to remove unreferenced blocks from downloaded archives. If you ever ran recover with version #249, run selftest -v3 to verify your backup’s integrity. IMPORTANT NOTE: this is a critical bug in #249 and everyone is urged to upgrade
when this build is installed, archives are scanned to remove any stale file data. This is necessary because of a bug in version #249 retain (see next item). This scan will take some time for large backups. If you have to interrupt it, it will restart the next time you use hb; ie, you won’t trash your backup if it’s interrupted. The hb.db file itself is not modified, other than to change the rev. The archive scan is removing orphaned data blocks and possibly compressing archives. If any archives are sufficiently compressed, they will be transmitted on the next backup to your remote destinations.
retain: sometimes the message: Unable to remove block xxxx: no such table: rmlist was displayed. Retain still removed older versions of files, but the file data itself wasn’t being removed from the archive files. The effect of this bug is that archives do not shrink when they should, but all backup data is intact.
HashBackup’s Amazon S3 destination now accepts any value for the Location keyword (S3 Region), without validating it. The values and their meanings as of May 5, 2010 are:
- no Location, US, or blank location = US Standard. Data will be stored on the east coast or west coast, whichever is closest
- us-west-1 = US west coast. It costs more to store data using this name vs using the more generic US region
- EU = Ireland
- ap-southeast-1 = Singapore
rm/retain: transmitting an older archive over rsync after a rm or retain uses less CPU time than with version #249
selftest: if an error occurred with -v0 (just read the database file), the displayed error count should have been 1 but was actually a library error code value, like 256.
selftest: added a new -v1 verify level that does not traverse each file’s block info since this takes time for backups with very large files such as VM images. -v2 is now like the old -v1:
-v0: read each page of main database, like cat hb.db
-v1: check database, don’t traverse file blocks, don’t read archives
-v2: check database, traverse file blocks, don’t read archives
-v3: v2 + read all archive blocks and verify crc
-v4: v3 + decrypt and decompress all data, verify block hashes, verify file hashes. Like a restore, without writing to disk.
-v9: v4 + low-level database integrity check
As before, the default verify level is -v9.
selftest: verify levels -v1 and -v2 (the new one) are a bit faster if the database isn’t already cached in memory
selftest: -v3 (old -v2) would sometimes display X GB verified, where X was much bigger than the entire set of backup data. Related to this, -v3 (old -v2) may run faster, depending on your backup data
if arc files exist on the local system and a recover -f command is issued, the existing arc files are renamed with a .old suffix. These .old files are now ignored during archive synchronization.
backup: if a pathname ending in . was used on the backup command line, /Users/jim/backup/. for example, a message like:
Pathname changed: /Users/jim/backup ⇒ /Users/jim/backup/.
was displayed, and the stored pathnames also contained .
rm/retain: with hundreds of archive files, rm and retain could run out of file descriptors when a large number of files were being removed, especially on systems like OSX where the number of open files is limited to 256 by default. rm and retain now use just a few file descriptors.
backup: exclude /private/tmp/ on OSX (/tmp symlinks here). To update an existing inex.conf, add ex /private/tmp/
INCOMPATIBILITY NOTE: the -n option (dry run) has been removed from the retain command, but the longer forms --dryrun and --dry-run are still available. This is in preparation for -n to mean "don’t transmit the database", as with the backup command, to allow retain to run more than once before transmitting hb.db
NOTE: this rev will do an automatic database upgrade to dbrev 4
beginning with this release, HashBackup releases are identified by the build number rather than a version number
the Linux build of 0.9.10 failed: ImportError: No module named acl
in some cases, the database upgrade in 0.9.10 could fail with: TypeError: 'NoneType' object is unsubscriptable
if the backup command did the database upgrade (vs any other hb command), it would take 10x longer than it should have, because IO buffering was disabled. On a Fedora test machine, the upgrade took 20 minutes with the backup command, but only 2 minutes with this fix
if the backup command did the database upgrade, it would then fail with an error like "not a database or encrypted: arc.x.x-journal". The next backup command would work. This was a bug in the archive synchronization procedure
the change in 0.9.9 to use multiple CPUs to prepare the database wasn’t actually enabled in previous beta builds
VMWare memory images, *.vmem, are now excluded when a new inex.conf is created. For existing inex.conf files, add an ex *.vmem line
ls: was looping when a wildcard filename like '*.vmem' was used
selftest: added multiple levels of selftest, taking increasing time:
-v0: read each page of main database, like cat hb.db
-v1: database consistency; no archive files are read
-v2: v1 + all archive blocks are read and crc verified
-v3: v2 + all data decrypted and decompressed, block hash verified, file hash verified. Like a restore, without writing to disk.
-v9: v3 + low-level database integrity check
The default level is -v9. This checks everything possible, as in earlier versions of selftest. On my MacBook, with 45GB backed up, using 22GB of backup space, the verify times are:
-v1: 4 minutes
-v2: 24 minutes
-v3: 71 minutes
-v9: 75 minutes
For comparison, it takes 10 minutes just to read the entire 22GB of backup data from disk at the maximum speed of 35 MB/sec with the command: time cat /hb/* >/dev/null
selftest: delete unused paths with -v1 or greater; this is normally not necessary, but unused paths may occur in some circumstances
rsync destination: the Dir keyword is checked to make sure it has
rsync destination: HB was always adding /filename to the end of the Dir keyword to form the target path, but if the Dir path ends in : then a slash should not be added
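The path-joining rule in this fix can be sketched as below. The function name and the example host are hypothetical; only the rule itself (no slash after a trailing colon) comes from the entry above.

```python
def rsync_target(dir_value, filename):
    """Build the rsync target path from the Dir keyword and a filename.

    If Dir ends with ':' (a remote host with no path), the filename is
    appended directly; otherwise a slash separates them.
    """
    if dir_value.endswith(":"):
        return dir_value + filename
    return dir_value + "/" + filename


# 'backuphost' is a made-up host name for the example.
print(rsync_target("backuphost:", "arc.0.1"))
print(rsync_target("backuphost:/backups", "arc.0.1"))
```

Without the special case, the first call would produce backuphost:/arc.0.1, which points at the remote root directory rather than the login directory.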
rsync destination: added debug keyword. If value is 1 or more, the rsync command line is printed and -v is added so that rsync is more verbose. If debug is 2, -vv is added (even more verbose), etc.
rsync destination: improved transfer efficiency, esp for rm and retain
S3 destination: added debug keyword. With a value 1, data being sent to and received from Amazon S3 is displayed. With a value 99, any exception during a file transfer will cause a traceback and HB will hang (use Ctrl C to terminate it).
S3 destination: added a DNS lookup during startup to display a better error message when a system’s DNS is not configured correctly
backup: version 0.9.10 would fail on very long (>1023 bytes) symlinks, ACLs, and extended attributes with the error message: TypeError: an integer is required
retain: added directory retention. Previously, retain only removed files, which could leave empty directories in the database
made "hb restore" an alias for "hb get"
rm and retain now overlap archive compression, archive transmission, and database compression
better cleanup of archive journal files
rsync destination: work around an rsync bug: rsync 3.0.4 client (PCBSD 7.1.1) with rsync 2.6.9 server (Mac OSX) gives "unknown option"
this version has a database format change and will automatically upgrade your backup database the first time hb is used. All backup data is maintained, except extended attributes (SELinux); they will be saved again on the next backup
ACLs are supported on Mac (OSX) and BSD systems (Linux ACLs were already supported)
NOTE: OSX 10.5 (Leopard), FreeBSD 7.1, and PCBSD 7.1 have an operating system bug that causes a small memory leak for every ACL restored. A patch to fix this was committed in the FreeBSD tree
the mount command (FUSE) is available on FreeBSD/PCBSD
mount: reading files from a mounted backup (FUSE) was sometimes extremely slow and CPU intensive because of a bad database query
on OSX, filenames are case-insensitive; but if /users/jim is used on the backup command line, it must still be saved as /Users/jim, and HB will print a notice:
Pathname changed: /users/jim/backup/x ⇒ /Users/jim/backup/x
NOTE: HB exclude/include processing is always case sensitive
on BSD & OSX, file flags are saved/restored like Linux version (see man chflags)
on BSD & OSX, extended attributes on symbolic links are now saved and restored. (Linux symlinks cannot have extended attributes)
get: on BSD & OSX, symbolic links with a different mode than their link target now have the correct mode after a restore
queue database to transmit next when the backup is finished. If transmitting all archive files takes a long time (days or weeks for a huge backup), there will be a database saved on the remote side to restore the archives that did finish transmitting
if the backup database doesn’t exist but the compressed database does, HB will ask if you want to expand the compressed database. This is useful to run selftest directly on a destination directory, for example, an external USB hard drive. Or, if the disk area storing the database itself goes bad (very unlikely, but possible), the compressed DB file can be expanded and used instead
recover: fix 1% failure with index out of range error
recover: print numbers so it’s clear that recover isn’t stuck
a problem restoring a symbolic link or extended attributes could cause the get command to abort. Now it will print an error message and continue the restore
some HashBackup data files had x (execute) permissions
a directory could be saved without its extended attributes (SELinux)
destination handlers were sending error messages to stdout vs stderr
when key.conf is created, a second line is written with spaces every 4 hex digits to make the key easier to copy by hand
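The grouping described above is simple to reproduce. This Python sketch shows one way to do it; the function names and the sample key are invented, and the actual layout of key.conf is not specified here.

```python
def group_hex(key_hex, width=4):
    """Insert a space after every `width` hex digits for easier copying."""
    return " ".join(key_hex[i:i + width]
                    for i in range(0, len(key_hex), width))


def ungroup_hex(grouped):
    """Remove all whitespace to recover the original hex string."""
    return "".join(grouped.split())


sample = "0123456789abcdef"        # made-up key, far shorter than a real one
print(group_hex(sample))           # 0123 4567 89ab cdef
```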
backup: if a pathname requested for backup is a symbolic link, for example, /home points to /usr/home on FreeBSD/PCBSD, the symbolic link’s target (/usr/home) is added to the backup with a notice:
Adding symlink target to backup: /home → /usr/home
This prevents the serious mistake of believing the files "in" /home are being backed up when in reality, only a symbolic link to /usr/home would be backed up. IMPORTANT: "symlink following" only occurs for command line paths!
the rsync destination now accepts a port keyword, to allow the rsync daemon to run on a port other than the standard port 873. This only works with rsync modules, ie, two colons used in the dir path.
cache sizes have been scaled back in this version; determining the optimum cache size needs further study
add Password keyword to rsync destinations, to set the rsync module password when using the two colon form of rsync (direct to rsyncd)
expanded documentation for rsync destination in dest.conf.example
preparing the database for transmission is 35% faster on multi-core systems
ls: added a note, noaccess, when directory contents can’t be shown because of insufficient permissions
ls: improved performance 10% when listing specific files
Amazon S3 uploads would sometimes fail with "bad marshal data" or "No parsers found", depending on the system’s configuration
Amazon S3 uploads would sometimes fail with an error like: s3(xxx): sending <pathname>: sent XXX of YYY bytes where XXX was much greater than YYY.
native FreeBSD/PCBSD build added to beta site
removed timeout code from all destinations; it caused some problems, especially with FTP, and wasn’t very useful since each destination runs in its own thread
the rsync timeout was set too low, causing the next file transfer to start, concurrently, every 15 seconds
fixed a selftest bug:
OperationalError: no such column: blockshas.sha
The database was fine - this was a bug in the selftest code
the recover command didn’t work with an rsync destination
fixed KeyError problem on PCBSD when sizing memory
fixed "Unable to read flags" error on PCBSD, in Linux compatibility mode; file flags (man chflags) are not yet supported on BSD / Mac
ACL’s are not yet supported on BSD / Mac
if FUSE wasn’t installed, mount would throw an exception
added Intel binary for Mac; 0.9.6 was compiled only for PowerPC and ran in emulation mode on Intel
new destination type: rsync; see dest.conf.examples
the Amazon S3 destination was broken in 0.9.6, but is fixed
database was being prepared for transmission even if it was deferred on all destinations; changed to avoid unnecessary work
the backup program is copied to the backup directory if necessary
imap (email) and S3 connection handling is improved
IMPORTANT NOTE: this version has a database change; use hb clear to remove beta test backups created with earlier versions, or create a new backup directory to use with this release. The database format will be forward compatible at release 1.0
improved scalability for backups >100GB
backup: saving VM images (.vmdk, .hdd, .qcow2, etc) will use more disk space for the initial backup, but incremental backups will be much smaller for typical work loads
space: document this 0.9.3 command on the beta site
space: performance improved 5x
get: regulate memory usage for large restores with millions of hard links (this change was for a 500GB restore with 31M files, more than half of which were hard links)
ftp: changed block write timeout from 30 seconds to 2 minutes
ftp: display a message when a timeout occurs
dir destination expands ~jim in directory name
a default inex.conf file (include/exclude) tailored for each computer system is created on the first backup. Create an empty inex.conf or edit the file if you don’t want the default exclusions
initial backups directly to NFS are ~15% faster, but still slower than backing up to a local drive. Incremental backups with few modified files are fast, comparable to backup on local drives
added check for unrecognized arguments to commands
fix typo in mount message: fusermount to unmount, not fuseumount
added destname to recover command’s help display
backup: if control-c was pressed at exactly the right time on the first backup, the database could be only partially initialized
selftest: error counter was not always incremented, so error messages could sometimes be displayed but not counted
selftest: detect missing root pathname record in hb.db
recover: if hb.db exists in the target directory but is not readable, for example, it’s empty, recover would say "run a backup first"
recover: a change in 0.9.2 caused the recover command to fail with a "transactions cannot be nested" message
directory destinations: when copying to a directory, .tmp files would be left if the target disk runs out of space
backup: an incremental backup of a huge directory (15M empty files with 32000 hard links) on a 1GB test machine now takes 45 mins vs 105 mins
new command "space" to show how backup space is being used
backup: reincarnated memory savings from version 0.3 for incremental backups to improve scalability on huge directories (>1M files)
backup: remove warning "Unable to stat file" on deleted files
backup: the built-in excluded path list (/proc, /tmp, etc) was removed because it caused confusion with /tmp backups, and /proc and /sys were excluded anyway as separate filesystems
backup: hitting control-C at just the right time after starting a backup could cause a database to be half-initialized
backup: hitting control-C during a backup could cause selftest to display a warning about high reference counts
retain: if backup is run with -n (no transmit), then retain must transmit the database even if retain didn’t remove any files
backup: if a file had extended attributes with names containing characters >= 0x80, no extended attributes were saved for that file
mount: fix Bad address error when trying to read ACL’s on the fuse root or next-level directories
mount: extended attributes fix
mount: an error message was incorrectly sent to stdout vs stderr
initial FreeBSD compatibility testing. This is not a FreeBSD native build yet and still relies on the Linux compatibility layer. There has only been very light testing, but backup, versions, ls, and get appear to work fine with only 1 minor change so far
don’t try to read file system flags on FreeBSD systems
S3 is not working yet on FreeBSD, but all other destinations have been tested and seem to work (FTP, ssh, IMAP, Gmail, directories)
clear: added -f option to force clear; used by test programs
the 0.9.x releases will be for important fixes, in preparation for the 1.0 release, and instead of expiring in 1 month, these beta releases will expire in 3 months (only the backup command expires; backup data is still accessible anytime)
database is 7-10% smaller. Run hb clear first to remove existing beta test backups. The database format will be stable and forward compatible at version 1.0
mount: improved performance of file open and close when the backup is mounted as a filesystem
get: if file1 and file2 were hard linked, restoring file2 w/o file1 and not using --orig would cause a checksum mismatch and an empty file was restored. Now, file2 is restored correctly, but will not be hard-linked to the existing file1; to cause the restored files to be hard-linked either use --orig or restore both files together
get: in some circumstances, a file that was hard-linked would be restored without being hard-linked
get: restoring a symbolic link that was also hard-linked could raise an exception
get: restoring a hard-linked file with extended attributes could cause an exception
mount: supports extended attributes (SELinux, ACL’s)
backup: if HASHBACKUP_DIR environment variable was set but was blank or empty, backup files were written to the current directory. Set the environment variable to . to get this behavior
if /var/hashbackup exists but the current user doesn’t have write access, hb would stop with an insufficient access message. Now hb will ignore this directory if the current user doesn’t have access, and offer to create ~/hashbackup as usual
backup: display a message after waiting 5 seconds for destination copies to finish, add a stat line for wait time
add write lock to ensure single user write access to backup data for backup, clear, recover, retain, rm, and selftest commands
versions: if a backup was interrupted, the versions command would sometimes print the current time as the backup’s ending time
retain: now refuses to run if the previous backup didn’t finish
retain: add -f/--force option to override previous safety feature
retain: an error in the -x option, for example, -x30p, printed an error message (correct), but retain would run anyway (incorrect)
retain: time-based retention (-t option) was based on the current time, ie, -t7d meant "in the last 7 days"; now it is relative to the last backup’s finish time. This prevents the problem where backups haven’t been run for a while, then a retain is run and removes all but the most recent backup of current files because the backup is very old
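The difference between the two reference points can be shown with a small Python example. It is a simplification: real retention works on versions of individual files, while this sketch only filters a list of hypothetical backup finish times.

```python
from datetime import datetime, timedelta


def kept_backups(finish_times, keep, reference):
    """Return the backups that fall within `keep` of the reference time."""
    cutoff = reference - keep
    return [t for t in finish_times if t >= cutoff]


# Three backups in early January, then nothing until retain runs on March 1.
backups = [datetime(2010, 1, 1), datetime(2010, 1, 5), datetime(2010, 1, 7)]
week = timedelta(days=7)

# Old behavior: -t7d measured from the current time.
old_style = kept_backups(backups, week, reference=datetime(2010, 3, 1))

# New behavior: -t7d measured from the last backup's finish time.
new_style = kept_backups(backups, week, reference=max(backups))

print(len(old_style), len(new_style))
```

Measured from March 1, no backup falls inside the 7-day window, so everything is a candidate for removal. Measured from the last backup on January 7, all three backups are inside the window.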
removed internal nice -19, because a backup that took 10 secs on an idle machine took 790 seconds when one CPU-bound program was running; let users do nice -19 or ionice -c3 if needed
mount: reading large files was very slow in ver 0.7
backup: add -n option to defer copying the main database to destinations. If retain is going to be run immediately after the backup, retain will upload the database
add -v (verbose) option to backup; default level is -v2. -v3 will print paths either excluded or with the "no dump" attribute set, v1 prints no filenames, v0 prints no statistics
add -v (verbose) option to get; default level is -v2
get: ask whether to remove the partially restored file when a control-c is pressed
get: display a warning that the current file was only partially restored if an error occurs during restore
for a command line like hb ugh /home, "unknown command: ugh" was displayed (correct), followed by "Unrecognized command: /home" (incorrect)
backup: if unable to stat a file, print full pathname
backup: better handling of hard-linked files that change during the backup. Because of this change, databases created before 0.8 may fail the selftest; use hb clear to remove beta test backups
backup: bypass exclusion checks when saving parent directories of a requested file
backup: exit code was the number of errors, but should have been either 0 for no errors or 1 if there were errors
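One practical hazard of using an error count as an exit code is that on POSIX systems only the low 8 bits of the status reach the parent process. The changelog does not say this was the motivation; the sketch below, with hypothetical function names, just shows the effect.

```python
def exit_status(error_count):
    """Collapse an error count to 0 (no errors) or 1 (one or more errors)."""
    return 1 if error_count > 0 else 0


def truncated_status(error_count):
    """What a parent process sees if the raw count is used as the exit code."""
    return error_count & 0xFF
```

With 256 errors, `truncated_status` yields 0, which a calling script would read as success, while `exit_status` still yields 1.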
versions: align columns for neater output when userid varies
backup: add number of files excluded to statistics
get: trap exceptions and, if --orig wasn’t used (i.e., we’re restoring to a temp file), keep going. If restoring with --orig, ask before continuing the restore
get: if there were any errors during the restore, ask before replacing the original file or directory
ls: display root path too
get: print full pathnames instead of filenames
recover: after recovering the database from a remote site, it was functional but was larger than the original
open: the command to set the archive size limit isn’t available yet (the limit is set to 1GB). Backups to IMAP servers may need to lower this limit, and huge backups may want a higher limit
ls: deleted files in earlier versions were not being displayed, even with -a
NOTE: database schema has changed - run hb clear if you have test backups from previous beta versions. The database schema will be forward compatible beginning with the 1.0 release
database uses up to 25% less space for very large files
database uses much less space for virtual machine disk images
get: verify file SHA hash matches after a restore
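A post-restore hash check can be sketched as follows. This is an illustration under assumptions: the entry only says "SHA hash", so SHA-256 is chosen arbitrarily here, and `file_digest` and `verify_restore` are hypothetical names, not hb functions.

```python
import hashlib


def file_digest(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large files are not read into memory at once."""
    h = hashlib.sha256()  # assumption: the changelog only says "SHA hash"
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_restore(path, expected_hex):
    """Return True if the restored file's digest matches the saved digest."""
    return file_digest(path) == expected_hex
```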
mount: fixed error message when mount directory doesn’t exist
mount: fixed Bad address error when accessing non-current backups
backup: after the first backup finishes, display a large notice about copying the key.conf and dest.conf files to safe locations
changed some common error messages to prevent traceback displays
backup: added the line number to exclude/include error messages
backup/ls: fifos and devices were listed as partially backed up
backup: a development assertion failed when hard-linked files had the "nodump" chattr attribute set or were not readable because of permission restrictions
get: sparse files ending with zeros were not restored correctly
get: verify with OS that a restored file is the correct size
get: verify sparse file hash with a separate read pass after restore
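The changelog does not describe the cause of the sparse file problem. One common pitfall when restoring sparse files, shown here purely as an illustration with hypothetical helpers, is that seeking past trailing zero blocks does not extend the file, so the final size has to be set explicitly and then checked with the OS.

```python
import os


def restore_blocks(path, blocks):
    """Write blocks to path, leaving holes for all-zero blocks.

    `blocks` is a list of bytes objects. Hypothetical helper, not hb code.
    """
    total = 0
    with open(path, "wb") as f:
        for block in blocks:
            if block.count(0) == len(block):
                # All zeros: skip ahead and leave a hole instead of writing.
                f.seek(len(block), os.SEEK_CUR)
            else:
                f.write(block)
            total += len(block)
        # Seeking alone does not grow the file, so set the size explicitly.
        f.truncate(total)
    return total


def restored_size_ok(path, expected_size):
    """Ask the OS for the file size and compare it with the expected size."""
    return os.stat(path).st_size == expected_size
```

Without the `truncate` call, a file whose last blocks are all zeros would come out shorter than the original.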
review and start to standardize error message displays
selftest: didn’t correctly handle a symbolic link that was also a hard link (yep, you can actually do this with Linux/ext3)
mount: generated an error when accessing a symbolic link that was also a hard link
mount: could return all zeroes when reading a hard linked file
mount: now returns EIO if there is a problem reading a file
new Freq keyword for destinations, like retain -t; defers copy until enough time has passed since the last copy to this destination
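The deferral rule described for Freq amounts to a simple elapsed-time test. The sketch below is an assumption about the general idea only: `copy_is_due` is a hypothetical name, and how hb records the last copy time or parses the Freq value is not covered by this entry.

```python
from datetime import datetime, timedelta


def copy_is_due(last_copy, now, freq):
    """Return True if enough time has passed since the last copy.

    `last_copy` is None when nothing has been copied to this destination yet.
    """
    if last_copy is None:
        return True
    return now - last_copy >= freq
```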
bug fix in ssh destination when target directory did not exist
ftp copy could leave a file open if an error occurred
dest.db was being sent even if a destination was deferred
dest.db is encrypted before sending offsite (other files already are)
rm displayed "Removing all files from version x" when -r was used, but should have displayed "Removing requested files…" if paths were also listed
rm and retain may defer archive uploads to save bandwidth
rm was sometimes leaving a few blocks that should have been removed
added block consistency tests to selftest
get would stop on files with attributes that require root privilege to restore, for example, journal mode (j). An error is now displayed
get from a specific version (-r) would fail on directories
get restores all directory attributes if parent directories have to be created with --orig (Ex: get --orig /a/b/c but only /a exists)
get --orig failed with "Not a directory" when restoring /a/b/c where /a/b already exists but b is a file, not a directory. This still fails (it has to fail), but with a better explanation
get will download an archive file from a destination if it is missing
help command added
get: if a and b were hard linked and only one was restored with --orig while the other already existed, they weren’t linked after the restore
backup: added /tmp to the platform excluded directories
get: clearer error message for pathnames ending with slash
get: fixed existing file mtime check when multiple files restored
get: instead of a warning, refuse to restore file over existing directory, or directory over existing file
get: instead of a warning, refuse to restore a partially backed up file/directory over an existing file/directory, unless the existing directory is empty
get: instead of a warning, refuse to restore / into a non-empty directory
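The first of these refusals, a type mismatch between what is being restored and what already exists, can be sketched as a pre-restore check. This is a hypothetical helper for illustration, not hb's code, and it does not cover the partial-backup or non-empty-directory cases.

```python
import os


def restore_conflict(target, restoring_directory):
    """Return a reason to refuse the restore, or None if the types are compatible."""
    if not os.path.lexists(target):
        return None
    # A symlink to a directory is treated as a non-directory here.
    target_is_dir = os.path.isdir(target) and not os.path.islink(target)
    if restoring_directory and not target_is_dir:
        return "refusing to restore a directory over an existing file"
    if not restoring_directory and target_is_dir:
        return "refusing to restore a file over an existing directory"
    return None
```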
get: for safety, removed -f (force) option
prevent tracebacks when expected error messages are displayed
renamed to HashBackup
added -a/-all option to mount, to allow all users access to the mounted backup filesystem. Standard Unix permission checks are still performed on all accesses within the backup filesystem. NOTE: by default, -a is only allowed by root, but it can be enabled for others with a /etc/fuse.conf setting
backup: removed /mnt from internal exclusion list; /mnt is still skipped if a filesystem is mounted, unless it is listed on the backup command line
backup: the backup directory contents were automatically excluded, but the directory itself was not. This could cause the version to increment with one file changed, even if nothing else changed
recover: if only 1 destination is set up, use it if none is specified
mount: if the backup directory was mounted, the backup, rm, and retain commands would fail with "Database is locked".
mount: accessing a file that didn’t exist caused a "Bad address" error instead of the correct "No such file or directory"
rm/ls: wrong version displayed for the first file of a backup
rm: didn’t copy database to remotes if it didn’t need to be compressed
rm: don’t display "Remove logid …" - it’s slow on very large removes
selftest: display file counts instead of log IDs, which were confusing
all: prevent stack traceback when piping output into head
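In Python, a traceback in this situation typically comes from an unhandled BrokenPipeError raised when the reader (head) exits early. The sketch below shows one way to handle it; `print_lines` is a hypothetical function and this is not necessarily how hb does it.

```python
import sys


def print_lines(lines, out=None):
    """Write lines to out; return False if the reader went away early."""
    if out is None:
        out = sys.stdout
    try:
        for line in lines:
            out.write(line + "\n")
        out.flush()
    except BrokenPipeError:
        # The reading end of the pipe was closed; stop quietly.
        return False
    return True
```

A real command-line program also has to avoid a second error when the interpreter flushes stdout at exit, for example by redirecting stdout to the null device after the first BrokenPipeError.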
backup: revert backup/memory reduction in 0.3 because of a database limitation: backup would fail after 5 minutes
backup: remove empty archive if backup is interrupted
backup: backups larger than 1GB fixed - nextarc typo
recover: -f option fixed
recover: no longer prompts for confirmation if no action would be taken
backup: removed features to simplify code testing:
removed -n option (dry run) from backup
removed raw device backup
removed --log option for separate log files (capture stderr instead)
backup now prints a message when it skips a directory because the "no dump" attribute is set
mount command is available to view backups as a filesystem (requires Linux fuse kernel module, fusermount, libfuse.so)
s3 dest.conf accepts a new Location EU keyword to create European buckets
selftest: verifying 11M files required 650MB of virtual memory, but now ~325M files can be verified in 650MB
-c option failed - code typo
using environment variable PALBACKUP_DIR failed - code typo
readme: permissions should be 0700 on backup directory, not 0600
readme: removed --force-full documentation; the option is still there, but it’s mostly just confusing to new users
readme: ls -a never required selection strings
readme: retain Nn time means minutes, not seconds
backup: decreased memory requirements of incremental backup for 1M file directory from 271MB to 110MB for better scalability
ls: -r option was not showing any results