
Second sight is advisable in such cases. Fact is, archives are essential to WP integrity and there's no credible alternative to this one.

I see WP is not proposing to run its own.



Wouldn't it be precisely because archives are important that using something known to modify the contents would be avoided?


> something known to modify the contents would be avoided?

Like Wikipedia?


No, not like that. There's a difference between a site that:

1) provides a snapshot of another site for archival purposes.

2) provides original content.

You're arguing that since encyclopedias change their content, the Library of Congress should be allowed to change the content of the materials in its stacks.

By modifying its archives, archive.today just flushed its credibility as an archival site. So what is it now?


> You're arguing that since encyclopedias change their content, the Library of Congress should be allowed to change the content of the materials in its stacks.

As an end user of Wikipedia there are occasions where content has been scrubbed and/or edits hidden. Admins can see some of those, but end users cannot (with various justifications, some excellent/reasonable and some.. nebulous). That's all I'm saying, nothing about Congress or such other nonsense. It seems like an occasion of the pot calling the kettle names from this side of the fence.


But Wikipedia promises you that it will modify its content. They're transparent about that promise.

An archival site (by default definition) promises you that it will not modify its content. And when it does, it's no longer an archival site.

Wikipedia has never been an archival site and it never will be. archive.today was an archival site, but now it never will be again.


This is your imaginary archive from the world of pink ponies.

Meanwhile their IMA on Reddit: no promises, no commitment. Just like Microsoft EULA :)

https://old.reddit.com/r/DataHoarder/comments/1i277vt/psa_ar...


What I don't see on that page is where they explicitly don't promise to not modify anything in the archive.


> What I don't see on that page is where they explicitly don't promise to not modify anything in the archive.

I'm quoting all of that because it lacks an explicit promise of non-modification /i

Meanwhile seriously, if you were disappointed not to see e.g. "We explicitly don't promise not to modify", then perhaps you should consider why, regardless, this site was trusted enough to get a gazillion links in Wikipedia... and HN.


> I'm quoting all of that because it lacks an explicit promise of non-modification.

And I'm quoting all of that because it lacks an explicit (or implicit) promise of modification. :)

It was (emphasis on past-tense) so-trusted because it advertises itself as an archival site. (The linked disclaimer is all about it not being a "long-term" archival site. It says it archives pages for latecomers. There is an implication here that it archives them accurately. What use is a site for latecomers if they change the content to be something else?) If they'd said or indicated they would be changing the content to no longer reflect the original site, Wikipedia would not have linked to them because they wouldn't be a credible source.

In any case, now I can't use them to share or use links since we can no longer trust those archives to be untampered. When I share a link to nyt content on archive.today or copy and paste content into email, I'm putting my name on that declaring "nyt printed this". If that's not true, it's my reputation.

Just like it was archive.today's.


> When I share a link to nyt content on archive.today or copy and paste content into email, I'm putting my name on that declaring "nyt printed this". If that's not true, it's my reputation.

What if the nyt article itself is the problem? How does that square?


Obviously not, since archive.org is encouraged.


What exactly is credible about archive.today if they are willing to change the archive to meet some desire of the leadership? That's not credible in the least.


A lot more credible than archive.org that lets archives be changed and deleted by the archive targets.

What's your better idea?


Does archive.org really let its archives be changed? That's very different than letting them be deleted from a credibility perspective.


Yes.

Archive.org snapshots may load JavaScript from external sites if the original page loaded it from there. That script can change anything on the page. Most often, the domain has expired and been hijacked by a parking company, so it just replaces the whole page with ads.

Example: https://web.archive.org/web/20140701040026/http://echo.msk.r...

----

And another example: https://web.archive.org/web/20260219005158/https://time.is/

The page "got changed" every second. It is easy to make an archived page that shows different content depending on the current time, whether you have a Mac or Windows, your locale, or your browser fingerprint, or one that is tailored to you personally.
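To make the external-script point concrete, here is a minimal sketch that lists the `<script>` URLs in a saved page that would be fetched from a host other than the one serving the page. The function name `external_scripts`, the sample HTML, and the `example` hostnames are all made up for illustration; it says nothing about how any particular archive rewrites URLs.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class ScriptSrcCollector(HTMLParser):
    """Collects the src attribute of every <script> tag."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            src = dict(attrs).get("src")
            if src:  # inline scripts have no src and are skipped
                self.sources.append(src)


def external_scripts(html, page_url):
    """Return script URLs whose host differs from the host of page_url."""
    collector = ScriptSrcCollector()
    collector.feed(html)
    page_host = urlparse(page_url).netloc
    # Resolve relative and protocol-relative URLs against the page URL.
    resolved = [urljoin(page_url, src) for src in collector.sources]
    return [url for url in resolved if urlparse(url).netloc != page_host]


# Hypothetical snapshot: one same-host script, one third-party script.
sample = """
<html><head>
<script src="/static/app.js"></script>
<script src="https://cdn.example.net/widget.js"></script>
<script>console.log("inline")</script>
</head></html>
"""
print(external_scripts(sample, "https://archive.example.org/snap/1"))
```

Any URL this reports is code that whoever controls that host today can change, regardless of what the page looked like at capture time.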


I don't think it's fair to equate running JS that can change the rendered output with the archive server actually changing the HTML it sends back.


I agree, JS is much worse. Because anyone could create an "untrustworthy" page on archive.org, no hack or admin assistance is required.


Much worse indeed. This is why one should be deeply sceptical of the handful of WP users seeking to replace archive.today with archive.org. AT allows tampering by the archive operator; IA allows tampering by half the planet... including WP editors who'd love that replacement.


> the archive targets

Isn't there a substantial overlap with the copyright holders?


Overlap?


The operator(s) of archive.today (and its other domains) are doing shady things and the links are not working, so why keep the site around when, for example, the Internet Archive's Wayback Machine works as an alternative to it?


What archive.today links are not working?

> Internet archives wayback machine works as alternative to it.

It is appallingly insecure. It lets archives be altered by page JS and deleted by the page domain owner.


Currently, as far as I know, at least both archive.today and archive.is have the same DDoS code on the main page. For more details, see https://gyrovague.com/2026/02/01/archive-today-is-directing-...


Is that what you call "not working"?


No it doesn't. You can just request content be removed from Archive.org and they will honor this: https://help.archive.org/help/how-do-i-request-to-remove-som...

Nonstarter for anything that you actually want to be preserved, especially anything controversial.


No request is needed. Just robots.txt to deliver a bulk removal.


> Fact is, archives are essential to WP integrity and there's no credible alternative to this one.

Yes, they are essential, and that was the main reason for not blacklisting Archive.today. But Archive.today has shown they do not actually provide such a service:

> “If this is true it essentially forces our hand, archive.today would have to go,” another editor replied. “The argument for allowing it has been verifiability, but that of course rests upon the fact the archives are accurate, and the counter to people saying the website cannot be trusted for that has been that there is no record of archived websites themselves being tampered with. If that is no longer the case then the stated reason for the website being reliable for accurate snapshots of sources would no longer be valid.”

How can you trust that the page that Archive.today serves you is an actual archive at this point?


> If ... If ...

Oh dear.

> How can you trust that the page that Archive.today serves you is an actual archive at this point?

Because no one has shown evidence that it isn't.


The quote uses ifs because it was written before this was verified, but the Wikipedia thread in question has links to evidence of tampering occurring.


Let's see them, then.



> They referring to https://en.wikipedia.org/wiki/Wikipedia:Requests_for_comment... ?

Wikipedia does not have a project page with this exact name.

I assume that is weasel words for 404 Not Found.


You seem to have truncated the link; it appears in full for me in kay_o's comment.


I did not. The link was subsequently edited.

To https://en.wikipedia.org/wiki/Wikipedia:Requests_for_comment...

I read that up to the first "proof", https://web.archive.org/web/20260218135501/https://www.googl...

It lands "503 Service Unavailable No server is available to handle this request."


Apologies, then. The Wayback link works just fine for me, no errors.


> there's no credible alternative to this one.

But this one is not credible either so...


Did you not read the article? They not only directed a DDOS against a blogger who crossed them, but altered their own archived snapshots to amplify a smear against them. That completely destroys their trustworthiness and credibility as a source of truth.


Sure I read it. But I don't believe everything I read on the internet.


The proof is right there for you to see. Denying it is rather wacky.


Altered snapshots = hide Nora name?

Ars Technica just did the same - removed Nora from older articles. How can you trust Ars Technica after that?


They didn't just remove her name, but replaced it with the target's name.

I don't know what you're talking about re: Ars removing her name from old articles.


Follow-up: maybe you're confusing Ars Technica with Wikipedia, whose admins did redact Nora's last name from discussions? If so, that's a weird equivalence to draw, since the change was disclosed and done to protect personal information, not attack someone else in the process. (Also, "Nora [redacted]" itself seems to be a name lifted from an unrelated person who had merely contacted Archive.today with a takedown request.)


1. I can't post links (I've already tried), my comments with links are getting shadowbanned. Check out Jon Brodkin's article on Ars about AT, not today's, but the previous one, 6 days ago. Nora's name was there, but now it's silently gone.

2. We learned about Nora's involvement from Patokallio. We learned about Nora's non-involvement... also from Patokallio. They could have reached a settlement with AT that includes hiding Nora's name.

3. Regardless of who Nora is, it is interesting to see the extent of this censorship: so far only gyrovague.com and arstechnica.com, but not tomshardware.com and not tech.yahoo.com. This shows which sites are working closely with the AT defamation campaign, and which are simply copywriting the news feed.


Silently? It tells you right there in the article: "Nora [last name redacted]". Maybe they could add a more fulsome explanation in an editor's note but it seems pretty obvious in context.

If AT is appropriating some random person's name as an alias, it seems helpful to report on that publicly in order to expose the practice and help clear up the misinformation.


Silently. Last article. Not today's.

One with title 'Archive.today CAPTCHA page executes DDoS; Wikipedia considers banning site'

I'll try to add the link with comment edit:

This has Nora's name https://web.archive.org/web/20260210195502/https://arstechni...

The current version does not.


Even if they did, so what? There's nothing wrong with a news article removing personal information as a precaution. It's light-years away from altering the content of an archival snapshot in order to target someone else.


Well, that's the only name they removed, even though it didn't stand out among the other names in the investigation. Secondly, it's ironic to do so in an article tagged "Streisand Effect" so perhaps we're witnessing part of the performance. And thirdly, it's strange to blame AT for removing... the same name, and not blame Ars. Immediately accusing... AT of double standards and hypocrisy.

I am lost here. It is definitively an organized defamation campaign.

“You are guilty simply because I am hungry”


Seems more like Ars trying to avoid piling more attention on the name of a person that isn't actually involved.

And again, the accusation against Archive.today isn't just that they removed their "Nora" alias from a snapshot, but that they replaced it with the name of the blogger they were quarreling with. There's no defensible reason to do that outside of petty revenge (which tracks with the emails and public statements from the Archive.today maintainer).


> Ars trying to avoid piling more attention on the name of a person that isn't actually involved.

Oh, yes, by removing the name in the context of "Streisand Effect".

> petty revenge

How is it "revenge"? Was it a porn page? Or something bad?

It is likely to be just a funny placeholder name of the same length that came to mind.

--

We could find good and bad motives for both AT and Ars.

The bias against AT was here a priori. Paywall-story for CondeNast, russophobia for the rest.


They apparently did a find + replace across their database to change the Nora alias to the blogger's name. So any archives of content referencing her would instead point to him, muddying the waters and blaming him for anything she was accused of. Like I said, petty.
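As a rough illustration of how a replacement like that could be spotted, the sketch below compares two captures of the same page line by line and reports which lines differ. It assumes you have both copies as plain text (say, an independent capture and the one being served now); the function name `changed_lines` and the sample text with placeholder names are invented for the example.

```python
import difflib


def changed_lines(original, later):
    """Return (removed, added): lines only in original, lines only in later."""
    diff = list(difflib.ndiff(original.splitlines(), later.splitlines()))
    # ndiff prefixes lines with "- " (only in first), "+ " (only in second),
    # "  " (in both) and "? " (hint lines, ignored here).
    removed = [line[2:] for line in diff if line.startswith("- ")]
    added = [line[2:] for line in diff if line.startswith("+ ")]
    return removed, added


# Hypothetical captures of the same page with placeholder names.
capture_a = "Report filed by Alice Example.\nThe site was unavailable on Monday."
capture_b = "Report filed by Bob Example.\nThe site was unavailable on Monday."

removed, added = changed_lines(capture_a, capture_b)
print("removed:", removed)
print("added:", added)
```

This only works if an independent copy exists to compare against, which is the whole problem when a single archive is the only record.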

The porn smear threats came later, via email.



