
TL;DR: He wrote an OS X dedup app which finds files with the same contents and tells the filesystem that their contents are identical, so it can save space (using copy-on-write features).

He points out that it's dangerous, but the space savings could make it worth it.

I wonder if the implementation is using a hash only or does an additional step to actually compare the contents to avoid hash collision issues.
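To make the question concrete, here is a minimal sketch of the "hash, then verify" approach. It says nothing about how the app actually works, which isn't known. The names `file_digest` and `find_duplicates` are made up for illustration, and SHA-256 is just an assumed choice of hash.

```python
import filecmp
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(paths):
    """Group paths whose contents are byte-for-byte identical.

    Step 1 buckets files by (size, digest), which is cheap to compare.
    Step 2 confirms each bucket with a full byte comparison, so a hash
    collision can never cause two different files to be merged.
    """
    buckets = defaultdict(list)
    for p in paths:
        buckets[(os.path.getsize(p), file_digest(p))].append(p)

    groups = []
    for candidates in buckets.values():
        confirmed = []  # lists of paths verified identical to each other
        for p in candidates:
            for group in confirmed:
                # shallow=False forces a comparison of the actual bytes
                if filecmp.cmp(group[0], p, shallow=False):
                    group.append(p)
                    break
            else:
                confirmed.append([p])
        groups.extend(g for g in confirmed if len(g) > 1)
    return groups
```

The byte comparison costs one more read of each candidate file, but it only runs on files that already match on size and digest, so the extra work is limited to likely duplicates.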

It's not open source, so we'll never know. He chose a pay model instead.

Also, some files might not be identical but still share identical blocks, which could be worth exploring too. Other filesystems support that in their tooling, do it online, or both.
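As a rough illustration of the block-level idea, the sketch below hashes fixed-size blocks and counts how many blocks of one file also occur in another. The 4096-byte block size and the function names are assumptions for the example, not taken from any particular filesystem or tool.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, for illustration only


def block_hashes(data, block_size=BLOCK_SIZE):
    """Hash each fixed-size block of a bytes object."""
    return [
        hashlib.sha256(data[i:i + block_size]).digest()
        for i in range(0, len(data), block_size)
    ]


def shared_block_count(a, b, block_size=BLOCK_SIZE):
    """Count blocks of b whose hash also appears among the blocks of a."""
    seen = set(block_hashes(a, block_size))
    return sum(1 for h in block_hashes(b, block_size) if h in seen)


# Two files that differ only in their middle block
first = b"A" * 4096 + b"B" * 4096 + b"C" * 4096
second = b"A" * 4096 + b"X" * 4096 + b"C" * 4096
print(shared_block_count(first, second))  # prints 2
```

Fixed-size blocks only catch matches that fall on the same block boundaries, so inserting a single byte near the start of a file would hide every later match. As with whole files, a real tool would want to compare the block bytes before sharing them rather than trust the hash alone.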


