Some time ago I noticed that in Chrome, every time you click "Never translate $language", $language quietly gets added to the Accept-Language header that Chrome sends to every website!
My header ended up looking like a permuted version of this:
en-US,en;q=0.9,zh-CN;q=0.8,de;q=0.7,ja;q=0.6
I never manually configured any of those extra languages in the browser settings. All I had done was tell Chrome not to translate a few pages on some foreign news sites. Chrome then turned those one-off choices into persistent signals attached to every request.
I'd be surprised if anyone in my vicinity shares my exact combination of languages in that exact order, so this seems like a pretty strong fingerprinting vector.
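To make concrete what any site can read out of that header, here is a minimal Python sketch. It is not something Chrome or any real tracker is known to run; the function name `parse_accept_language` and the handling of malformed weights are my own assumptions, and the header value is the example from above.

```python
def parse_accept_language(header: str) -> list[tuple[str, float]]:
    """Parse an Accept-Language value into (tag, q) pairs, highest q first."""
    entries = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        tag, _, params = part.partition(";")
        q = 1.0  # a tag without an explicit weight counts as q=1
        params = params.strip()
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0  # assumption: treat an unparsable weight as "not acceptable"
        entries.append((tag.strip(), q))
    # sorted() is stable, so tags with equal q keep their header order
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


header = "en-US,en;q=0.9,zh-CN;q=0.8,de;q=0.7,ja;q=0.6"
parsed = parse_accept_language(header)

# The ordered tuple of tags is the part that could serve as a fingerprinting signal.
signature = tuple(tag for tag, _ in parsed)
print(signature)
```

Two users with the same languages in a different order produce different tuples, which is why the order matters as much as the set.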
There was even a proposal to reduce this surface area, but it wasn't adopted:
This is a problem: the software tries to guess what you mean (this is not specific to this feature; it is one case of something computer programs do in general). Just because you do not want it to translate a language automatically does not necessarily mean that you can read it or that you want to request documents written in that language. Fingerprinting is not the only issue with this.
Is Chrome assuming that, since you don’t want it to translate those pages/languages, you can read them and want them in your header? Interesting.
I'd read it more generously than that. I think Chrome is trying to stop the server choosing the language for you. By sending an Accept-Language header (which your browser does regardless of which one you use; it's not a Chrome thing), the browser asks the server to return the page in a language you've said you'll accept. By adding the languages you've told Chrome not to translate, it's attempting to show you pages in languages you want.
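As a rough illustration of the server side of that negotiation, here is a small Python sketch. It assumes the header has already been turned into a list of tags in preference order. `pick_language` is a hypothetical helper, and the fallback from a regional tag like `zh-CN` to its primary subtag `zh` is a common convention that I am assuming here, not something every server does. Wildcards (`*`) and `q=0` exclusions are left out.

```python
def pick_language(preferred: list[str], supported: list[str], default: str) -> str:
    """Return the first supported language matching the client's ordered preferences."""
    by_lower = {tag.lower(): tag for tag in supported}
    for tag in preferred:
        tag = tag.lower()
        if tag in by_lower:
            return by_lower[tag]
        # Assumed fallback: "zh-CN" may be served by a generic "zh" translation.
        primary = tag.split("-")[0]
        if primary in by_lower:
            return by_lower[primary]
    return default


# Preference order taken from the example header earlier in the thread.
preferred = ["en-US", "en", "zh-CN", "de", "ja"]

print(pick_language(preferred, ["de", "ja"], "fr"))   # no English version available
print(pick_language(preferred, ["zh", "de"], "fr"))   # Chinese outranks German
print(pick_language(["pt-BR"], ["de", "ja"], "fr"))   # nothing matches, use default
```

With a site that offers only Chinese and German, a user who merely declined translation of a Chinese page would now get the Chinese version, whether or not they can read it.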
I imagine Chrome is really adding the language to your browser preferences when you choose not to translate a page, and the HTTP client in the browser is generating request headers based on your preferred languages. A small (and largely unimportant) semantic point, but it's possible that the Google translate team weren't aware of how adding a preferred language might impact user privacy. That isn't to excuse the behaviour; they should have checked.
Translating pages is literally the only thing I use Chrome for. The built-in translation works way better than other browsers, even though they also use Google Translate.
Firefox does not use Google Translate and performs the translation locally, which works great for the most common languages out there. For the less common ones you still have to go to Google Translate, but IME it's definitely not worth changing the browser to Chrome over.
I don't really like Firefox's translation, despite having made the switch many years ago. For a long time it didn't have the (European) language of the country I live in. Now it does have it. Every time I want it to translate I have to manually find both languages in the insanely long dropdowns. It will not save it the way I want it, but impressively seems to manage to always save it in the other direction...
Ditching Chrome is something we need to teach everyone.
The DOJ is totally spineless and refuses to squash Google's absurd monopoly on the internet. We are literally the last line of defense, even though we really don't amount to much.
You don’t need a grassroots movement when other movements doing this exact thing already exist. In fact it is likely counterproductive. Mozilla Foundation is the organization you want to support, or EFF.
> Mozilla Foundation is the organization you want to support
Mozilla Foundation is rudderless. I'm convinced the leadership are all Google plants who are keeping the "antitrust litigation sponge" from doing anything damaging to Chrome.
Sorry but you're using a Google browser and Google translation service, when excellent alternatives to both exist. What did you expect regarding privacy?
A clueless person might not know any better, but you clearly do, and also you seemingly care. So why do you use Google all the same?
As uniform as possible is exactly the wrong way to go. It only takes one data point overlooked or newly discovered to make every person trying to look identical distinct. New fingerprinting techniques are being implemented all the time, so what's the point in taking chances when it's far easier to randomly change a browser's fingerprint for each site/connection, making it much harder to track any one browser over time?
Except I don't want to be flagged as a bot when I'm just visiting some website in my browser. (I also don't want to be flagged as a bot when I'm scraping some website with a bot).
Every time I manually touched the "fingerprinting" about:config settings, my entropy went up. I used the EFF site to test: https://coveryourtracks.eff.org/
AFAIK some of these options are there to be used by the Tor browser, which comes with strict configuration assumptions, and it doesn't translate well to normal Firefox usage. Especially if you change the window size on a non-standardized device. Mind you, the goal is not to block fingerprinting, but to not stand out. Safari on a macbook is probably harder to fingerprint than Firefox on your soldering iron.
However, judging by the fact that every data hungry website seemingly has a huge problem with VPN usage, I'd presume they are pretty effective and fingerprinting is not.
I've had good success with tracking tool tests and resistFingerprinting. Granted, I usually use it with uMatrix/NoScript most of the time which cuts down on the available data a lot and maybe makes it an unfair test.
One issue, I expect, is simply not enough people using resist fingerprinting to add variation to the mix. Since it's off by default, and only a small % of users use Firefox and an even tinier percentage use resistFingerprinting, unlike your example of Tor where probably most people on the tor network use the tor browser, it's likely that simply blocking things is a fingerprint all on its own.
The solution there would be to get more people using it :)
I will say one downside to using it is far more bot detection websites freaking out over generic information being returned to them, causing some sites to break (some of their settings breaking webgl games too due to low values). Using a different profile avoids this, or explicitly whitelisting certain sites in privacy.resistFingerprinting.exemptedDomains - obviously if a site is using a generic tracking service for bot detection, that kills a fair amount of the benefit of the flag, so a separate profile might be best. I wish firefox had a container option for this.
... and I'm not too sure what you mean by changing window size on a non-standardised device. They do try to ensure the window sizes are at standard intervals, as if they were fullscreened at typical widths to reduce fingerprinting, but surely that applies to using Tor too? I mean, people don't use Tor on dedicated monitors at standard sizes.
Oh, and a bit of followup. I tried the EFF cover your tracks on a Firefox profile with resist fingerprinting, and almost all the bits of identifying information came from the window size (which EFF considers "brittle") and the UA (I was testing in Firefox Nightly).
Apparently you need to add the hidden pref:
privacy.resistFingerprinting.letterboxing
Enabling letterboxing knocked off 5 bits of identifying information. Apparently my 1800px wide letterbox was still pretty identifiable, but, an improvement.
Setting a Chrome user agent string using a user agent string manager dropped that one from 12ish bits to <4 bits. 'Course, that has the disadvantage of reducing Firefox visibility online further, and probably of being more recognisable in combination with the other values (like Mozilla in the WebGL info). Using Firefox stable for Windows was <5 bits, so probably best to use that if on Linux. Although, it might conflict with the font list unless a Windows font list was pulled in.
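For scale, here is how I understand those "bits" figures, as a small Python sketch. It uses the standard surprisal definition (a value seen in a fraction p of browsers carries -log2(p) bits); that the test site reports exactly this quantity is my assumption, and the function names are made up for illustration.

```python
import math


def bits_of_information(share: float) -> float:
    """Surprisal, in bits, of a value seen in `share` of all browsers (0 < share <= 1)."""
    if not 0 < share <= 1:
        raise ValueError("share must be in (0, 1]")
    return -math.log2(share)


def one_in(bits: float) -> float:
    """Inverse view: a value carrying `bits` bits is shared by about 1 in 2**bits browsers."""
    return 2 ** bits


print(bits_of_information(1 / 32))  # a value shared by 1 in 32 browsers
print(one_in(12))                   # roughly the "12ish bits" user agent
print(one_in(4))                    # the same user agent after the change
```

Read this way, knocking 5 bits off means the value is now shared with about 32 times as many browsers as before.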
privacy.resistFingerprinting has potentially-unwanted side-effects, like wiping out most of your browser history (instead of the more sensible approach of just disabling purple links). I also recall something about it getting removed or nerfed, though I'm not sure whether that was a mere proposal.
It does not wipe your browser history. I can definitely attest to that since my generic JS active + resistFingerprinting profile has a history going back years. It does set your timezone to UTC in JS on websites. I've mostly encountered that when playing Wordle ;)
The browser should reasonably know what time zone you're in and what time zone you're reporting to the website and translate between them automatically.
Yeah, "should". Too bad it's unfeasible. As soon as you e.g. print the current date as part of a paragraph somewhere, the browser loses track of it, and the website can just read the element's content and parse it back.
what about duck duck go? We need a simple chart:
1. Which browsers are good at resisting fingerprinting
2. For each browser, does it work on Android, iOS, macOS, Windows, and Linux
3. What settings are needed to achieve this
For bonus points, is there no way to strip all headers in Chrome or control it better?
Modern Safari is pretty damned good at randomizing fingerprints with Intelligent Tracking Prevention. With iOS 26 and macOS 26, it's enabled in both private and non-private browser windows (used to be only in private mode).
All "fingerprint" tests I've run have returned good results.
Surely this is true, but if you’re a fingerprinting company aren’t you making so much money violating the privacy of the masses that it’s not worth your time going after the tiny set of Freedom Nerds trying to evade you?
They aren't specifically going after you... they just try to create a unique hash from everything they can and by doing weird things to your system you are making a truly unique hash easier
You can change the header, but browser developers are not that dumb and they added properties like "navigator.platform" which do not change and immediately give you away. Consider also writing a browser extension to patch these properties. Also, I think that the DRM module (Widevine) that is bundled with browsers can also report the actual software version. Sadly it is undocumented so I don't know what information it can provide, but I notice warnings from Firefox about attempts to use DRM on various sites like Yandex Market.
The article also mentions this, and suggests the UA is not a silver bullet. That said, they didn’t go into specifics. I’m assuming there are other details that correlate to particular browsers that will betray a false UA. Plus, having a UA that says Chrome while including an extension that’s exclusive to Safari (for example) will not only contradict the UA, but it will also be a highly distinctive datapoint for fingerprinting, in and of itself.
Using Chrome and caring about privacy? I thought, after Google killed uBlock Origin, it had become beyond clear these two things were incompatible, https://news.ycombinator.com/item?id=41905368
There's a way to force-load uBO in Chromium, but you need to download the extension by hand (git clone it from GitHub) and load it in "developer mode" in the extension settings. Also, you need to enable some legacy options related to extensions in about:flags.
Clearly it thinks you prefer Chinese to German. Was that correlated with the frequency of your requests on Google Translate? With your browsing history? With your shopping history?
Hmmm...YouTube has been getting confused about the language and displaying random languages for the closed captions on videos. This was happening to me across smart TVs but I access YouTube randomly from various devices and browsers...but mostly Chrome when using a browser.
> There was even a proposal to reduce this surface area, but it wasn't adopted:
>> Instead of sending a full list of the users' preferred languages from browsers and letting sites figure out which language to use, we propose a language negotiation process in the browser, which means in addition to the Content-Language header, the site also needs to respond with a header indicating all languages it supports
Who thought that made sense? Show me the website that (1) is available in multiple languages, and also (2) can't display a list of languages to the user for manual selection.
What language do you put that list in? Would you still want to show it to every visitor when you know most of them speak a particular language?
I used to do some work in this area. The first question is difficult and the answer to the second is no. We had the best results when we used various methods to detect the preferred language and then put up a language selector with a welcome message in that language. After they made a selection, it would stick on return visits.
> What language do you put that list in? Would you still want to show it to every visitor when you know most of them speak a particular language?
Judging by... a large number of websites, you make the list available in a topbar, and each language is named in itself. You don't apply one language to the entire list.
Here's the first page that popped into my head as one that would probably offer multiple languages (and it does!):
They've got the list in a page footer instead of a header, but otherwise it's an absolutely standard language selector. It does technically identify countries rather than languages. The options range from Azərbaycan to Україна. They are -- of course -- displayed to every visitor.
Why would you want to force someone to consume your website in the wrong language?
And why would the list be in a single language, again?
You’re looking at it with the perspective of someone who understands the language the site defaults to. Most non-native speakers have a hard time finding the link and they leave.
No, I'm looking at it from the perspective of someone who has needed to use that language selector in the past. Understanding the language the site defaults to wouldn't help, because the selector doesn't use that language anyway.
> Most non-native speakers have a hard time finding the link
You might notice the colorful flag right next to it.
Flags are a terrible way to indicate language. At best, they are unclear. At worst, they can be offensive.
Assuming you are a US company catering to non-English speakers in the US, which flag would you use for Spanish? Which flags would you use to differentiate between Mandarin and Cantonese? What would you do in Canada where they speak English and French? Show a French flag?
Except they're recognizable across languages. Faced with a UI in a language I don't know, going to settings -> languages -> my preferred language is a total guessing game. Meanwhile, if I'm confronted by a UI that has a tiny flag icon at the top, I know I can click on that and get to something familiar. Yes, someone looking to get offended can nitpick your flag choice, but a Spanish flag vs a Mexican flag for Spanish will at least let the user get to something closer to what they know, even though there's quite a bit of difference on the ground between Spanish in Spain and Spanish in Mexico. If your internationalization team is well funded enough to offer both, then show both flags. Same for UK English and American English; Simplified Chinese, Traditional Chinese, and Cantonese; and yes, Québécois French and French in France. Offer as many flags as you actually have translations for. If you can have a Chinese flag and a Hong Kong flag, users will appreciate it. A two-level menu is also an option: click on the Canada flag, which then offers Français and English.
Well, one of us has done research and work in this area. I don’t know what you’ve been doing. All of your suggestions perform poorly in the real world.
You can determine the user's language from IP address location. Of course, there are users with VPNs, but they are probably used to seeing foreign content. For example, YouTube shows me advertisements in a language I don't understand despite my language header saying I only understand "en-US" and "en". So this header is unnecessary; even YouTube ignores it.
Also, when using VPN, Google typically uses a language based on IP address, not my language header. I assume the header is only useful for fingerprinting today.
> You can determine user's language from IP address location.
There are reasons why it might not work (VPN is only one of them; there are others such as places with multiple languages, people traveling to foreign countries, and others), although it is also a bad idea for other reasons as well.
If the user specifies the language then you should use that one. I think it would probably be better to use the following order of figuring out which language you should want:
1. If the URL specifies the language to use, then use the language specified by the URL.
2. If the language is not specified by the URL, use the language specified by any cookies that are set for the purpose of selecting the language.
3. If the language is not specified by URL or cookies, but the user is logged in and the user account has a language setting, use the language specified by the user account. (If TLS client authentication is being used, then you might consider adding an extension into the client's X.509 certificate to select the language.)
4. If the language is not specified by URL or cookies or the user's account, or the user is not logged in, use the Accept-Language header.
5. If the language is not specified by URL or cookies or the user's account, or the user is not logged in, or the Accept-Language header is not present or cannot be parsed or does not specify any language that the requested file is available in, then use the default, such as the language that it was originally written in.
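The order above can be sketched as a simple fallback chain. This is only an illustration in Python: `resolve_language` and all of its parameter names are hypothetical, and it makes one assumption beyond the list, namely that a language named by the URL, a cookie, or the account is skipped when the document is not available in it.

```python
from typing import Optional


def resolve_language(
    url_lang: Optional[str],
    cookie_lang: Optional[str],
    account_lang: Optional[str],
    header_langs: list[str],   # Accept-Language tags, already in preference order
    available: set[str],       # languages the requested document exists in
    default: str,              # e.g. the language it was originally written in
) -> str:
    """Walk the sources in priority order: URL, cookie, account, header, default."""
    candidates = [url_lang, cookie_lang, account_lang, *header_langs]
    for lang in candidates:
        # Assumption: an unavailable language falls through to the next source.
        if lang is not None and lang in available:
            return lang
    return default


available = {"en", "bg", "de"}

# The URL wins even though a cookie and the header say otherwise.
print(resolve_language("bg", "en", None, ["de"], available, "en"))
# Nothing explicit is set, so the header decides; "ja" is unavailable, "de" is next.
print(resolve_language(None, None, None, ["ja", "de"], available, "en"))
# No usable signal at all: fall back to the default.
print(resolve_language(None, None, None, [], available, "en"))
```

The point of the ordering is that every explicit choice the user has made outranks whatever the browser happens to send.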
> You can determine user's language from IP address location.
I live in Hyderabad, Telangana, India. I do not yet speak enough Telugu or Hindi or Urdu to be useful, and cannot read Hindi or Urdu at all; but I’m a foreigner who grew up on English only, rather rare around here, so let’s consider native Indians instead. Many can speak these languages but not read them in their native scripts, only romanised (in which case they can probably speak English tolerably). And many (many) come from other parts of India (or even Nepal) and can’t speak Telugu. Or are Muslim and at least prefer to deal in Hindi, often not having very good Telugu. And so on. It’s messy.
Some IP geolocation doesn’t even get the city right—I’ve seen Noida suggested, which is up north in Hindi territory.
More and more websites with international audiences literally do this themselves, putting a language (sometimes even currency) select box at the top when they detect that your settings don’t match the page you first landed on.
Why not have this negotiation implemented at the browser level?
Because that prevents all of your users from selecting the language they want. It's a terrible idea with no upside and not-high-but-still-not-no downside.
It has a remarkably inconspicuous language selector, also using the names of countries rather than languages, located in the page footer. Compared to Dyson, Apple's list of country names is much more willing to use English in preference to whatever someone from that country would call it. This isn't consistent; many countries are rendered in their own language (日本 / Ελλάδα) and many aren't (Georgia / Kazakhstan).
The page defaults to the locale that you request in the URL. https://www.apple.com/ shows up in English, regardless of your country;† https://www.apple.com/bg/ shows up in Bulgarian. Switching your preferred location simply takes you to the page for that location. (Dyson does the same thing.) Some locations support more than one language; there's https://www.apple.com/lae/ for Latin America (English) and https://www.apple.com/la/ for Latin America (Spanish). If you're on the page for a location like this, a language selector (with language names) displays next to the location selector. In the case of Latin America, only two languages are supported, and the language selector automatically displays "Español" if you're on the English site and "English" if you're on the Spanish site, which makes sense but won't generalize.
Apple's selector is inconspicuous because it refuses to display flags, which I would guess is due to much higher political exposure than Dyson. So it's lower-quality in two ways, but fundamentally the same approach. The user asks for a language, and the site honors that.
Given that I presented Dyson as an example of doing language selection correctly, I'm confused about what you wanted me to see on apple.com. They're trying to do the right thing, but less effectively.
† I tested this by accessing the site(s) from Mongolia, Vietnam, and Morocco using ExpressVPN.
That was my point. Not comparing Apple/Dyson/whatever, but showing that websites do have this need.
If this was designed and implemented as a standard at the browser level, we would get something better in the end, rather than re-implementations on each and every website.
There was even a proposal to reduce this surface area, but it wasn't adopted:
https://github.com/explainers-by-googlers/reduce-accept-lang...