Fwiw, the Internet Archive is very much trying to avoid the random S3 bucket deletion problem, and donations to them are tax deductible.
On long-term digital storage: use whatever you want for your own blog, but (imo) ASCII isn't going to save you any more than binary blobs will 300 years from now, after we're all long gone and buried. We're already in a world where UTF-8 is taking over in many places. (Many places, but not all. Fun fact: with some banks, you can't send Zelle to someone who has an emoji in their local contact name.)
If I (today) said I had a Word document and needed "an old version of Microsoft Word", I'm sure most people would know what I mean, and that I'd find someone with a Windows XP machine and a copy of Office '97. Meanwhile, there are tons of people who are just going to stare at you blankly if you tell them about EBCDIC, never mind help you find a decoder.
> If I (today) said I had a Word document and needed "an old version of Microsoft Word", I'm sure most people would know what I mean, and that I'd find someone with a Windows XP machine and a copy of Office '97. Meanwhile, there are tons of people who are just going to stare at you blankly if you tell them about EBCDIC, never mind help you find a decoder.
Funny, I suspect the precise reverse is true.
EBCDIC is a well-documented encoding. Worst case, you find a reference book and figure out how to deal with it, because that knowledge is open and available.
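To make that concrete, here's a rough sketch of how little it takes today. It assumes the bytes are in EBCDIC code page 037 (one of several EBCDIC variants, so that choice is a guess you'd have to verify against the data), and `decode_ebcdic` is just a name I made up for illustration. Python ships a `cp037` codec in its standard library.

```python
def decode_ebcdic(raw: bytes, codepage: str = "cp037") -> str:
    """Decode EBCDIC bytes to text.

    Assumption: the data uses code page 037. Other EBCDIC variants
    (e.g. cp500) map some punctuation differently.
    """
    return raw.decode(codepage)


# "Hello" in code page 037: H=0xC8, e=0x85, l=0x93, l=0x93, o=0x96
sample = bytes([0xC8, 0x85, 0x93, 0x93, 0x96])
print(decode_ebcdic(sample))
```

Even without the codec, the mapping is a 256-entry lookup table you could type in from a printed reference.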
The same is true of ASCII. If you can understand binary encodings with 8-bit groupings--a fairly fundamental concept in digital computing--you can probably find your way to an ASCII table in a library somewhere.
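As a sketch of how simple the ASCII case is: given the bits in 8-bit groups, each group is just a number you look up in the table. The function name `decode_ascii_bits` and the space-separated input format are my own assumptions for illustration; `chr` works here because the first 128 Unicode code points match ASCII.

```python
def decode_ascii_bits(bits: str) -> str:
    """Decode space-separated 8-bit groups (e.g. "01001000") as ASCII."""
    chars = []
    for group in bits.split():
        if len(group) != 8:
            raise ValueError(f"expected 8 bits, got {group!r}")
        code = int(group, 2)  # binary digits -> integer
        if code > 0x7F:
            # ASCII only defines codes 0-127; anything higher is some other encoding
            raise ValueError(f"not ASCII: {group}")
        chars.append(chr(code))
    return "".join(chars)


print(decode_ascii_bits("01001000 01101001"))  # 0x48 'H', 0x69 'i'
```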
But good luck finding a working Windows XP machine with Office '97 fifty or one hundred years from now, let alone a spec for the format.
And once the maintained version of LibreOffice inevitably drops Office 97 support, you are back to finding old LibreOffice versions and trying to get them to run, or porting the code.
And that's ignoring the fact that code is a terrible spec. Trying to reverse engineer a file format from a software implementation is a godawful nightmare, and I say that from personal experience.
Given the choice between that and having to figure out how ASCII works (a 7-bit code, usually stored one character per 8-bit byte), it's pretty clear which is the easier problem to solve.