
I've long been a huge fan of bup, and have even contributed some code. I may well be its single biggest user by far, since I host 96748 bup repositories at https://cloud.sagemath.com, where the snapshots for all user projects are made using bup (and mounted using bup-fuse).

Elsewhere in this discussion people note some shortcomings of bup, namely that it has no encryption of its own and no way to delete old backups. For my applications, the lack of encryption isn't an issue, since I make the backups locally on a full-disk-encrypted device and transmit them for long-term storage (to another full-disk-encrypted device) only over ssh. Not being able to easily delete old backups is also not an issue, since (1) I don't want to delete them (I want a complete history), and (2) bup's approach to deduplication and compression makes it extremely space-efficient, and it doesn't get (noticeably) slower as the number of commits grows. This is in contrast to ZFS, where performance can degrade dramatically if you make a large number of snapshots, and to other, much less space-efficient approaches where you have to regularly delete backups or run out of space.
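To illustrate why chunk-level deduplication keeps a full history cheap, here is a minimal Python sketch of content-defined chunking with a content-addressed store. It is not bup's actual code: the window size, the boundary mask, the plain rolling sum, and the names `split_chunks` and `ChunkStore` are all assumptions chosen for illustration.

```python
import hashlib
import random

WINDOW = 16   # bytes covered by the rolling sum (illustrative value)
MASK = 0x3F   # cut when the low 6 bits are all set, roughly 1 in 64 positions


def split_chunks(data: bytes) -> list[bytes]:
    """Cut data wherever the sum of the last WINDOW bytes matches MASK."""
    chunks, start, rolling = [], 0, 0
    for i, byte in enumerate(data):
        rolling += byte
        if i >= WINDOW:
            rolling -= data[i - WINDOW]  # drop the byte that left the window
        if (rolling & MASK) == MASK:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # trailing bytes after the last boundary
    return chunks


class ChunkStore:
    """Toy content-addressed store: each distinct chunk is kept once."""

    def __init__(self):
        self.chunks = {}

    def save(self, data: bytes) -> list[str]:
        """Store data and return the chunk ids needed to rebuild it."""
        ids = []
        for chunk in split_chunks(data):
            chunk_id = hashlib.sha1(chunk).hexdigest()
            self.chunks.setdefault(chunk_id, chunk)  # no-op if already stored
            ids.append(chunk_id)
        return ids

    def restore(self, ids) -> bytes:
        """Rebuild the original bytes from a list of chunk ids."""
        return b"".join(self.chunks[i] for i in ids)


rng = random.Random(0)
original = bytes(rng.getrandbits(8) for _ in range(4096))
edited = b"new header" + original  # small insertion at the front

store = ChunkStore()
first = store.save(original)
stored_after_first = len(store.chunks)
second = store.save(edited)
added_by_second = len(store.chunks) - stored_after_first
print(len(second), "chunks in second snapshot,", added_by_second, "newly stored")
```

Because the boundaries depend only on the last few bytes of content, an insertion near the start shifts the data without changing most chunks, so the second snapshot stores only the chunks around the edit.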

People in this discussion also bring up ZFS and deduplication. With SageMathCloud, the filesystem that all user projects use is a de-duplicated ZFS-on-Linux filesystem (mostly on SSD), with lz4 compression and rolling snapshots (using zfssnap). This configuration works well in practice: projects have a limited quota, so there are only a few hundred gigabytes of data (so far, less than even 1TB), and the machines have quite a lot of RAM (50+GB) since they are configured for lots of mathematics computation, running IPython notebooks, etc.



I've also been using bup to replace (in some areas) my use of rdiff-backup.

It's a great tool, but since you've already contributed, let me say that the importance of pruning old archives cannot be overstated. For online/disk-based backup solutions, space is always going to run out eventually.

I'm using bup where I already know the backup size will grow in a way that I can manage for the next 1-2 years.

For "classical" backup scenarios though, where binaries and many changes are involved and the backup grows by roughly 10-20% a week due to changes alone, I have to resort to tools where I can prune old archives, because otherwise I would have to either reduce the number of increments (which I don't want to do) or increase the backup space by a factor of 50 (which I practically cannot do either).
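As a rough back-of-the-envelope check on those numbers, this Python sketch estimates how many weeks an unpruned backup takes to reach 50 times its initial size at 10% and 20% weekly growth. Whether the growth is relative to the initial size (linear) or to the current size (compounding) isn't stated above, so both readings are shown as assumptions; `weeks_until_factor` is a hypothetical helper.

```python
import math


def weeks_until_factor(weekly_growth: float, factor: float, compounding: bool) -> float:
    """Weeks of unpruned growth until storage is `factor` times the initial size."""
    if compounding:
        # size(n) = (1 + g)^n, solve for n
        return math.log(factor) / math.log(1 + weekly_growth)
    # size(n) = 1 + g * n, solve for n
    return (factor - 1) / weekly_growth


for rate in (0.10, 0.20):
    linear = weeks_until_factor(rate, 50, compounding=False)
    compound = weeks_until_factor(rate, 50, compounding=True)
    print(f"{rate:.0%}/week: {linear:.0f} weeks linear, {compound:.1f} weeks compounding")
```

Under the compounding reading the 50x mark arrives in well under a year, while under the linear reading it takes several years.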



