This is the standard API for deduplication on Linux (used for btrfs and XFS): you ask the OS nicely to deduplicate a given set of ranges, and it responds by locking the ranges, verifying that they are indeed identical, and only then deduplicating them for you (you get a field back saying how many bytes were deduplicated from each range). If the ranges differ, the kernel simply declines to deduplicate them, so a userspace program cannot corrupt your files through this call.
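To make the request/response shape concrete, here is a minimal sketch, assuming the API described above is the `FIDEDUPERANGE` ioctl from `linux/fs.h`. The helper names (`pack_request`, `unpack_results`, `dedupe_range`) are hypothetical and only for illustration. The struct layouts and the ioctl number are transcribed from the kernel header as I understand it, so check them against your own headers before relying on this.

```python
import fcntl
import struct

# Assumed value of _IOWR(0x94, 54, struct file_dedupe_range) on Linux.
FIDEDUPERANGE = 0xC0189436
FILE_DEDUPE_RANGE_SAME = 0      # ranges were identical and have been deduplicated
FILE_DEDUPE_RANGE_DIFFERS = 1   # ranges differ; the kernel left them untouched

# struct file_dedupe_range: src_offset, src_length, dest_count, reserved1, reserved2
HEADER = struct.Struct("=QQHHI")
# struct file_dedupe_range_info: dest_fd, dest_offset, bytes_deduped, status, reserved
INFO = struct.Struct("=qQQiI")


def pack_request(src_offset, length, dests):
    """Build the ioctl argument; dests is a list of (dest_fd, dest_offset) pairs."""
    buf = bytearray(HEADER.pack(src_offset, length, len(dests), 0, 0))
    for dest_fd, dest_offset in dests:
        # bytes_deduped and status are outputs, so they start out as zero.
        buf += INFO.pack(dest_fd, dest_offset, 0, 0, 0)
    return buf


def unpack_results(buf):
    """Return one (bytes_deduped, status) pair per destination range."""
    dest_count = HEADER.unpack_from(buf, 0)[2]
    results = []
    for i in range(dest_count):
        _, _, deduped, status, _ = INFO.unpack_from(buf, HEADER.size + i * INFO.size)
        results.append((deduped, status))
    return results


def dedupe_range(src_fd, src_offset, length, dests):
    """Ask the kernel to dedupe `length` bytes of src_fd against each destination."""
    buf = pack_request(src_offset, length, dests)
    # The buffer is mutable, so the kernel's per-range results are written back into it.
    fcntl.ioctl(src_fd, FIDEDUPERANGE, buf)
    return unpack_results(buf)
```

A caller would open the source for reading and each destination for writing, call `dedupe_range(src.fileno(), 0, size, [(dst.fileno(), 0)])`, and then inspect each returned status. A negative status is a negated errno for that range, and on a filesystem without dedupe support the ioctl itself raises `OSError`.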