55366 - Don't reveal UI language to site/page -- Change navigator.language to use Accept-Language instead of the UI language

Assignee

Description

•

24 years ago

Reproduce:
1. Visit <http://gemal.dk/browserspy/language.html>.

Actual result:
Your UI language and locale (e.g. en-US) is displayed.

Expected result:
Neither UI nor OS language or locale are revealed to page or site.

Additional Comments:
Compare the HTTP 1.1 spec:
<quote src="http://www.ietf.org/rfc/rfc2616.txt">
15.1.4 Privacy Issues Connected to Accept Headers

   Accept request-headers can reveal information about the user to all
   servers which are accessed. The Accept-Language header in particular
   can reveal information the user would consider to be of a private
   nature, because the understanding of particular languages is often
   strongly correlated to the membership of a particular ethnic group.
   User agents which offer the option to configure the contents of an
   Accept-Language header to be sent in every request are strongly
   encouraged to let the configuration process include a message which
   makes the user aware of the loss of privacy involved.

   An approach that limits the loss of privacy would be for a user agent
   to omit the sending of Accept-Language headers by default, and to ask
   the user whether or not to start sending Accept-Language headers to a
   server if it detects, by looking for any Vary response-header fields
   generated by the server, that such sending could improve the quality
   of service.

   Elaborate user-customized accept header fields sent in every request,
   in particular if these include quality values, can be used by servers
   as relatively reliable and long-lived user identifiers. Such user
   identifiers would allow content providers to do click-trail tracking,
   and would allow collaborating content providers to match cross-server
   click-trails or form submissions of individual users. Note that for
   many users not behind a proxy, the network address of the host
   running the user agent will also serve as a long-lived user
   identifier. In environments where proxies are used to enhance
   privacy, user agents ought to be conservative in offering accept
   header configuration options to end users. As an extreme privacy
   measure, proxies could filter the accept headers in relayed requests.
   General purpose user agents which provide a high degree of header
   configurability SHOULD warn users about the loss of privacy which can
   be involved.
</quote>

While we don't send this info as an HTTP header, we offer it as a JS property.
See the source of the testcase mentioned above for details.

David Krause

Comment 1

•

24 years ago

Isn't the language info also in the useragent string?  According to the browser
sniffer at http://www.ufaq.org/ it is.

User Agent: Mozilla/5.0 (X11; U; Linux 2.2.16-3 i686; en-US; m18) Gecko/20001006
Application Name: Netscape
Application Version: 5.0 (X11; en-US)

While this could be considered a privacy thing, it could be really neat if sites
tailored the language of their content to your locale and language.
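
As an illustration of how easily a site can read the locale out of a UA string like the one quoted above, here is a minimal sketch. The helper name `localeFromUserAgent` and the "looks like a locale" pattern are assumptions made for this example; they are not taken from any real sniffer.

```javascript
// Hypothetical helper: pull the locale token out of a legacy Mozilla-style
// User-Agent string such as the one quoted in this comment.
function localeFromUserAgent(ua) {
  // The UA "comment" is the first parenthesised group.
  const comment = /\(([^)]*)\)/.exec(ua);
  if (!comment) return null;
  // Tokens inside the comment are separated by semicolons.
  const tokens = comment[1].split(";").map((t) => t.trim());
  // Assumption: a locale looks like "en" or "en-US"
  // (2-3 lowercase letters, optional two-letter uppercase region).
  const locale = tokens.find((t) => /^[a-z]{2,3}(-[A-Z]{2})?$/.test(t));
  return locale || null;
}

const ua =
  "Mozilla/5.0 (X11; U; Linux 2.2.16-3 i686; en-US; m18) Gecko/20001006";
console.log(localeFromUserAgent(ua));
```

A UA string without such a token (for example an IE-style one) simply yields `null` in this sketch.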

David Krause

Comment 2

•

24 years ago

Hmm, on second thought maybe this isn't so good.  I recommend disabling it or
at least pref-disabling it.

Ben Bucksch (:BenB)

Assignee

Comment 3

•

24 years ago

David, you are right, and everything is as you suggested. See the pref UI pane
under Navigator - it configures the HTTP header to send.
This bug is about the JS property, which is
- much less useful (do you want to send all available language versions and then
select on the client side?)
- not opt-in
- currently not (independently) changeable by the user, but identical to the UI
language.

timeless

Comment 4

•

24 years ago

The useragent should not include this information. But the point of accept is 
to say that the user WANTS the content in that language.  Some RFCs need 
comments like ~this is stupid and counterproductive~.

Verah I think you work on that privacy document, add a paragraph w/ link to 
that rfc:
According to <a>RFC</a> we do hereby warn you that asking for content in a 
language you prefer might divulge information about you (including <span 
style="-moz-type-timeless:shocking">the language you prefer to read</span>, 
which may imply <span style="-moz-type-timeless:shocking">your 
ethnicity</span>).

I presume mozilla does send preferred language headers; if not, we need a bug for 
that (mozilla1.0)

Severity: normal → enhancement

Keywords: relnoteRTM

OS: Linux → All

Hardware: PC → All

Whiteboard: [defective-privacy]

timeless

Updated

•

24 years ago

Blocks: 50205

Ben Bucksch (:BenB)

Assignee

Comment 5

•

24 years ago

timeless, I don't understand your last comment. Also, it has nothing to do with
privacy *links* (maybe the *document*). Removing dependency.

Please note the difference between the HTTP header "Accept-Language" and the JS
property. The implementation of the former is OK in Mozilla (I think). This bug
is about the latter.

David Krause, thanks for noting the UA string (anyhow, I missed your comment).
Will investigate.

No longer blocks: 50205

Ben Bucksch (:BenB)

Assignee

Updated

•

24 years ago

Severity: enhancement → normal

Mitchell Stoltz (not reading bugmail)

Comment 6

•

24 years ago

Future.

Status: NEW → ASSIGNED

Target Milestone: --- → Future

Gervase Markham [:gerv]

Comment 7

•

24 years ago

As there's nothing a user can do about this JS privacy leak, is it worth 
relnoting?

The blurb:
It seems unclear to me whether this bug requires either of a "developer" or 
"user" release note for Netscape 6 RTM. If anyone feels it does, can they please 
draft one and then nominate with the relnote-user or relnote-devel strings in 
the Status Whiteboard.

Thanks :-)

Gerv

Ben Bucksch (:BenB)

Assignee

Comment 8

•

24 years ago

> As there's nothing a user can do about this JS privacy leak, is it worth
> relnoting?

Sure, at least he has to know. He can do something: use another UA or use
English chrome.

Mitchell Stoltz (not reading bugmail)

Comment 9

•

24 years ago

Hasn't Netscape always revealed the UI language somehow? If this behavior is
present in 4.x, then I don't see why we need a relnote.

Ben Bucksch (:BenB)

Assignee

Comment 10

•

24 years ago

> Hasn't Netscape always revealed the UI language somehow?

Yes, I think so. I am not sure we need a relnote.

Ben Bucksch (:BenB)

Assignee

Comment 11

•

24 years ago

We *already* have something better than a relnote: Tasks|Privacy|Understanding
<chrome://communicator/locale/wallet/privacy.html>. removing relnoteRTM based on
that.

Keywords: relnoteRTM

John Unruh

Updated

•

24 years ago

QA Contact: czhang → junruh

John Unruh

Comment 12

•

24 years ago

Mass changing QA to ckritzer.

QA Contact: junruh → ckritzer

David Baron :dbaron: (⌚️UTC-4, no longer working on Mozilla)

Updated

•

24 years ago

Blocks: 71569

danielmc

Comment 13

•

23 years ago

We are past the UI freeze for Commercial Beta. We need a UI freeze to be able 
to ship Localised products simultaneously with the US. Would it be possible to 
check any changes to chrome://communicator/locale/wallet/privacy.html after we 
have branched in the commercial tree on 6/29?

danielmc

Comment 14

•

23 years ago

By the way here is a handy JS for revealing UA info...

<PRE>
<SCRIPT>
// Print the identifying properties that navigator exposes to any page.
document.writeln("navigator.userAgent is ", navigator.userAgent);
document.writeln("navigator.appCodeName is ", navigator.appCodeName);
document.writeln("navigator.appVersion is ", navigator.appVersion);
document.writeln("navigator.appName is ", navigator.appName);
</SCRIPT>
</PRE>

Mitchell Stoltz (not reading bugmail)

Comment 15

•

23 years ago

Target is now 0.9.5, Priority P1.

Priority: P3 → P1

Target Milestone: Future → mozilla0.9.5

Ben Bucksch (:BenB)

Assignee

Comment 16

•

23 years ago

Removing this info from the UA string is trivial. I did this for Beonex
Communicator, I can attach a patch.

Mitchell Stoltz (not reading bugmail)

Comment 17

•

23 years ago

Ben,
   By all means, attach your patch. I don't think we will make it the default,
but it would be nice to have as part of "high-privacy mode," which is something
I'm working on.

Ben Bucksch (:BenB)

Assignee

Comment 18

•

23 years ago

Mitch, why not make it the default? The HTTP spec explicitly recommends against
doing what we do atm. Sites won't break either, if we just always send "en-US".
(But possibly with an Accept-Language header, which is customized by the user.)

Ben Bucksch (:BenB)

Assignee

Comment 19

•

23 years ago

Attached patch Proposed Fix for UA-string. Also contains Win fix for bug 57555. (obsolete) — Details — Splinter Review

Ben Bucksch (:BenB)

Assignee

Comment 20

•

23 years ago

> (But possibly with an Accept-Language header, which is customized by the user.)

s/with/, we send/

Mitchell Stoltz (not reading bugmail)

Comment 21

•

23 years ago

It's not my decision to make. I will ask around here and find out if changing
the default UA is OK. There may be some resistance to changing it at all,
especially if we've always provided the language information in the UA.

timeless

Comment 22

•

23 years ago

this was discussed in a newsgroup that i read recently (well the discussion was 
1-4 years old but...) cc.

Jeremy M. Dolan

Comment 23

•

23 years ago

Ben, does your patch remove the region/language from navigator.appVersion as well?

Ben Bucksch (:BenB)

Assignee

Comment 24

•

23 years ago

Yes, it seems so. The test page now shows "en" (the hardcoded dummy value)
instead of "en-US". Looks like the JavaScript function pulls its value out of
the UA-string, which is nice. So, looks like I fixed this bug. I'll install a
German langpack when I get a chance to be sure.

mstoltz, would you mind, if I took the bug? Who has to be asked about checking
this in, apart from dbaron? (There's no pref to turn this on or off, since I see
no value in the current behaviour - see comment in patch and my earlier comments
here for reasons.)

dbaron, what do you think?

Ben Bucksch (:BenB)

Assignee

Comment 25

•

23 years ago

Posted proposal to .netlib: <news://news.mozilla.org/3B665A87.8050807@beonex.com>.

Katsuhiko Momoi

Comment 26

•

23 years ago

> It's not my decision to make. I will ask around here and find 
> out if changing the default UA is OK. There may be some 
> resistance to changing it at all, especially if we've always
> provided the language information in the UA.

Netscape commercial builds cannot remove the lang info from
UA string. They serve as important tracking tools. 

If Mozilla wants to make that an option, that is fine 
but that option should not be the default. 
If there is a proposed UI for it, commercial builds might
consider removing the UI.

Ben Bucksch (:BenB)

Assignee

Comment 27

•

23 years ago

> If Mozilla wants to make that an option, that is fine 
> but that option should not be the default.

Why? If it's a pref, Netscape can alter it trivially.

Katsuhiko Momoi

Comment 28

•

23 years ago

>> If Mozilla wants to make that an option, that is fine 
>> but that option should not be the default.

> Why? If it's a pref, Netscape can alter it trivially.

That is true. As long as it is Netscape's default not 
to turn off lang info, that would be fine.

By the way I would like to raise this issue about
the privacy clause in HTTP 1.1. It is wrong-headed to
single out lang info as the only thing compromising
security. What about the fact that you're using Mozilla,
Gecko, or Netscape, Win NT5, etc.? Why, someone could
discriminate against Netscape users or IE users or
whatever. That is also a privacy issue if lang info
is a privacy issue. The fact that something is mentioned
in an RFC document does not mean we need to be implementing
everything that is in it.
In this day and age, the fact that someone might be using
an en-US build means virtually nothing other than the fact that
someone might be able to read English. We have users
all over the world using an en-US build.
The fact that someone is using a ja-JP build does not
mean that that person is Japanese. It simply means that someone
possibly reads Japanese but may be a Canadian, etc.
The whole argument about the lang info being a compromising
factor is moot in my opinion. Whoever wrote the HTTP 1.1 section
on lang info and security should examine issues more broadly
and fairly.

Our proud Mozilla localizers around the world would probably
like to see their L10n work reflected accurately in the
UA string.

Ben Bucksch (:BenB)

Assignee

Comment 29

•

23 years ago

> wrong to single out lang info as the only thing compromising security.

Right. There are other bugs about other issues, e.g. bug 57555.

> en-US build means virtually nothing

Right, but if I speak Hebrew, it does mean something.

You can argue about the severity of this bug. But I do think that it should be
fixed. I care less about the default in Mozilla, but I would prefer that Mozilla
followed the advice of the spec.

I will attach a new patch which makes it dependent on a pref
(browser.reveal-ui-lang or similar) when I have time.

timeless

Comment 30

•

23 years ago

ben: you're German, no? I know a bunch of people who contribute to mozilla.org 
who can read Hebrew and I wonder if _any_ of them have this concern (I know 
it's really odd ..)

tao

Comment 31

•

23 years ago

Folks:

Some websites sniff the U-A string to redirect users to appropriate pages for 
downloading localized versions of their software/patch. When locale info is not
present, "en-US" is often used as the default.

Following the spec is a good thing when it does not break existing websites. I agree 
that making it a preference and defaulting to 'on' seems to be a good compromise.

Jeremy M. Dolan

Comment 32

•

23 years ago

> Some websites sniff U-A string to redirect users to appropriate pages for 
> downloading localized version of their software/patch.

IE 5.0, 5.01, 5.5, and presumably earlier versions don't put language in UA.
Opera doesn't put language in UA. Konqueror doesn't put language in UA. This is
a wholly inappropriate and nonportable place for that information. That's the
whole purpose of the Accept-Language header, to specify what language you want
information in.

If you want to ignore the RFC and default Accept-Language to on, that's fine
enough by me (and is a separate bug anyway). But there's no purpose to reveal
the *UI* language, if not for privacy, for correctness. Accept-Language is the
language(s) the user wants to receive information in.
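
To make the contrast with UA sniffing concrete, here is a rough sketch of how a server could read the user's stated preferences from an Accept-Language header. The function name `parseAcceptLanguage` is hypothetical; the only rules taken from RFC 2616 are that a missing `q` parameter means quality 1 and that `q=0` means "not acceptable".

```javascript
// Hypothetical helper: turn an Accept-Language header value into a list of
// { tag, q } entries, most preferred first.
function parseAcceptLanguage(header) {
  return header
    .split(",")
    .map((part) => {
      const [tag, ...params] = part.trim().split(";");
      // Default quality is 1 when no q parameter is given.
      let q = 1;
      for (const p of params) {
        const m = /^\s*q\s*=\s*([0-9.]+)\s*$/.exec(p);
        if (m) q = parseFloat(m[1]);
      }
      return { tag: tag.trim(), q };
    })
    // Drop empty entries and languages explicitly marked unacceptable (q=0).
    .filter((e) => e.tag !== "" && e.q > 0)
    // Highest quality first; ties keep their order from the header.
    .sort((a, b) => b.q - a.q);
}

console.log(parseAcceptLanguage("fr, en;q=0.8, sv;q=0.6").map((e) => e.tag));
```

Nothing in this list depends on which localization of the browser the user happens to run, which is the point being argued here.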

tao

Comment 33

•

23 years ago

I don't think I've ever said that it is proper to use the U-A string for content
negotiation; as you pointed out, accept-lang in the HTTP header serves that 
purpose. All I said is that there are indeed websites that misuse the U-A string...

Glad to hear from a standards advocate, though :-)

Katsuhiko Momoi

Comment 34

•

23 years ago

> But there's no purpose to reveal the *UI* language, if not 
> for privacy, for correctness. Accept-Language is the
> language(s) the user wants to receive information in.

I think you should read the definition of User-agent:

"14.43 User-Agent

   The User-Agent request-header field contains information about the
   user agent originating the request. This is for statistical purposes,
   the tracing of protocol violations, and automated recognition of user
   agents for the sake of tailoring responses to avoid particular user
   agent limitations. User agents SHOULD include this field with
   requests. The field can contain multiple product tokens (section 3.8)
   and comments identifying the agent and any subproducts which form a
   significant part of the user agent. By convention, the product tokens
   are listed in order of their significance for identifying the
   application.

       User-Agent     = "User-Agent" ":" 1*( product | comment )

   Example:

       User-Agent: CERN-LineMode/2.15 libwww/2"

Further, Product tokens are defined as:

" Product tokens are used to allow communicating applications to
   identify themselves by software name and version. Most fields using
   product tokens also allow sub-products which form a significant part
   of the application to be listed, separated by white space. By
   convention, the products are listed in order of their significance
   for identifying the application. .... etc."

UI language differs if localization files are different. 
It is clearly a significant part of the application. And though
this is not common, localization itself might reveal a bug that 
was not caught before the product was shipped. This latter type
of case does actually happen. For tracking purposes, it is in my opinion
significant info. 

I don't believe that we should use considerations raised
for Accept-Language for user-agent issues. I just want to point
out that there are arguments for both sides of this issue and also
that the HTTP 1.1 spec says nothing about not revealing the UI language in
the User-agent header. If MS or Opera wants not to include that
info, that is fine but let that not bind what we should do here.

Jeremy M. Dolan

Comment 35

•

23 years ago

Keep in mind, that where the RFC says "automated recognition of user agents for
the sake of tailoring responses", I think this would more refer to protocol
tailoring, not content. For example, Apache's default config contains some magic
to disable Keep-Alive for some broken versions of IE that claim to support it.

HTML includes its own means of "avoid[ing] particular user agent limitations",
such as CSS, and other ways of making content still accessible to older browsers.

The only browser revealing language in U-A I know of is Netscape 4.*, which,
last I heard, had 8% market share. Any page basing content off this field isn't
a whole hell of a lot effective right now. If future versions of Mozilla and
Netscape 6 remove it from U-A, fewer web designers will be tricked into
mistaking U-A for A-L (see also: Microsoft J++).

OK, I'll shut up now, sorry for all the spam, folks.

Daniel Veditz [:dveditz]

Comment 36

•

23 years ago

The user agent language is used for distribution tracking, right? Personally if
I were creating a language pack I'd be gratified to have a clue how far it had
spread, especially for a minority/endangered language (Navaho? Hawaiian?
Gaelic?). If I were using such a language I'd probably want to signify my
presence (ethnic pride).

This kind of thing should obviously be a pref. (and now we can commence arguing
over the default setting in Mozilla.)  If the UA were easier to change (i.e. via
the pref UI rather than hacking prefs.js) this wouldn't be so much of an issue.

As the person mostly responsible for foisting "navigator.language" on people in
4.x I think we could safely nuke it. Eh, I guess we should return a string so we
don't break pages accidentally on an undefined property, maybe "" or "unknown".
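
A small sketch of why returning a string matters. The page logic below (`primaryLanguage`) is a made-up example of the kind of code that might exist on sites, not something taken from a real page; the navigator-like objects are plain stand-ins.

```javascript
// Hypothetical page logic: take the two-letter primary language.
function primaryLanguage(nav) {
  return nav.language.substring(0, 2);
}

// Property removed entirely: the call throws a TypeError and the script stops.
let threw = false;
try {
  primaryLanguage({});
} catch (e) {
  threw = e instanceof TypeError;
}
console.log("undefined property throws:", threw);

// Property kept but emptied: the page gets "" and keeps running.
console.log("empty string result:", JSON.stringify(primaryLanguage({ language: "" })));
```

A page written this way would still have to cope with an empty result, but it would not die on an exception.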

tao

Comment 37

•

23 years ago

Hi, Kat:

We should probably inform webmasters of whatever change we make in the final so
they can adapt accordingly.

Mitchell Stoltz (not reading bugmail)

Comment 38

•

23 years ago

I agree that we should have a pref. I don't know what the default setting should
be, but I would lean towards leaving the language in there by default. We should
have a pref checkbox for "paranoid mode" that will turn off the language part of
the UA as well as other small privacy violations which are the norm.

Jacek Piskozub

Comment 39

•

23 years ago

I believe that as most potential Mozilla users do not live in zones of war or
ethnic cleansing, the default setting should be the present behavior.

Ben Bucksch (:BenB)

Assignee

Comment 40

•

23 years ago

> as most potential Mozilla users do not live in zones of war or ethnic
> cleansing, the default setting should be the present behavior.

The problem is that users might not know that we spread this info. And it's not
worth a UI pref IMO.

Default in Mozilla: So far, I count 1+4 (non-Netscape+Netscape) votes for on,
4+0 for off/dummy.

Daniel Veditz [:dveditz]

Comment 41

•

23 years ago

Any votes gathered here are going to be meaningless because mostly only people
who agree will find this bug. People who are happy with the way things are have
no clue others want to change things--though I will grant that most people
probably don't care one way or another.

The navigator.language issue should be dealt with separately. IMHO axe that,
leave the UA the way it is, and make it a hell of a lot easier for people to
spoof the UA (e.g. pref UI with radio-buttons for common options and then a text
box for custom text). Paranoid nuts who are worried about language in the UA
also don't like giving out OS info and nearly everything else in the UA -- we
shouldn't address UA privacy issues item by item.

So is this bug about navigator.language, or is it about the UA? If the latter it
should be invalid, in favor of some UA uber-bug

Daniel Glazman (:glazou) (not active in Mozilla any more)

Comment 42

•

23 years ago

Just to show that I am the kind of person who cares about languages: I use a
browser with an English UI. I configured it so it accepts the following
languages in this order: French, English, Swedish, Spanish, Yiddish, Xhosa.
And the person who Cc:ed me (timeless?) on this bug also knows that privacy is
at the top of my list of concerns.

I think this bug is a total non-issue and a waste of time and neurons. We have
had language strings crossing the web in all directions since 1996 and nobody
has ever complained about it.

My 0.02€ only...

Ben Bucksch (:BenB)

Assignee

Comment 43

•

23 years ago

> People who are happy with the way things are have no clue others want to
> change things

Right.

OK, you talked me into making it default to on for Mozilla.

> The navigator.language issue should be dealt with separately. IMHO axe that,

I care more about the UA-string, since we are really spreading this all over the
world and placing it in many server-logs. Implementation is coupled.

> and make it a hell of a lot easier for people to spoof the UA (e.g. pref UI
> with radio-buttons for common options and then a text box for custom text)

There is a bug about it, but it has its own problems, like side effects
unintended by the user. I am not a fan of having UI for setting the UA-string
to arbitrary values.

> So is this bug about navigator.language, or is it about the UA? If the latter
> it should be invalid, in favor of some UA uber-bug

It's about both ("don't reveal" includes all ways). But it is INVALID in no
case, because it's a legal request. It is also disabled in Beonex Communicator
(by default), so I would appreciate not having to carry around source
patches.

If I attach a patch to make it depend on a pref, default on, everybody is
happy, no?

Jeremy M. Dolan

Comment 44

•

23 years ago

I was originally arguing to remove the language from U-A, but enough NSCPies
want to leave it for various reasons, so I say just leave it. I didn't notice
4.7 had been doing it all this time, so it's nothing urgent. If it becomes an
issue, we'll address it post 1.0. But for god sakes, don't make it a pref. And
certainly no UI. 

If anything, a generic U-A pref (also with no UI, or a plain textbox under
Debug... none of this multi pulldown nonsense) could be used to remove it, or, I
have a bug open on disabling U-A altogether. But separate prefs to tweak each
part of the U-A string would be nuts.

Daniel Veditz [:dveditz]

Comment 45

•

23 years ago

Ben, implementation could be trivially uncoupled if we wanted to deal with
navigator.language separately.

>> is it about the UA? If the latter it should be invalid, in favor
>> of some UA uber-bug
>
> It's about both ("don't reveal" includes all ways). But it is INVALID in no
> case, because it's a legal request.
It's a valid concern, but wrong to consider in isolation from the other
user-agent privacy concerns. Do you really want a bunch of "Hide UA Language",
"Hide UA platform", "Hide UA OS", "Hide UA OS version", etc. prefs? The UI would
be ugly which means they'd be hidden prefs, and that means most people who might
benefit would have no clue they were there.

Mike Shaver (:shaver -- probably not reading bugmail closely)

Comment 46

•

23 years ago

If IE 5.0 and 5.5 don't send the language in the UA string, how many sites can
we really be breaking?  I think the compatibility stance so far, in DOM and
other key areas, has been that if IE5 and NS4 do different things, we should ape
IE5 because it has so many more users.

(I'm sure that localization folks would love to know how many people are using
their work, and I have no problem with that desire, but I don't want to turn the
UA string into about:credits.)

The argument that we've been doing this since 1996 doesn't sway me: browsers had
privacy-hostile cookie and image handling for years too, but I'm pretty sure
nobody's going to stand up and say that fixing it doesn't matter.

I'd support pulling UI language out of the UA because
 - people should be using Accept-Language to tailor content

 - I can't believe that we're functionally breaking that many sites if IE
doesn't do this, and the browser-number-tracking stuff really doesn't sway me,
because, again, IE is the majority of the browser population, and you can't do
this kind of tracking on it

 - there's too much crap in the UA anyway

tao

Comment 47

•

23 years ago

>If IE 5.0 and 5.5 don't send the language in the UA string, how many sites can
>we really be breaking?  

Some websites have logic like this:

  if (Netscape) {
    // assumes locale info is present in the U-A string
    // do some locale-specific things...
  } else if (MSIE) {
    // do nothing...
  } else {
    // do nothing...
  }

The problem is more that Netscape used to include locale info in the U-A and
some websites use it to do something for international users. Unless 
webmasters are advised of any upcoming change, their websites are doomed to
break. I won't be surprised to see them start advising people that their websites
work better with other browsers.
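
For webmasters in that position, one possible way to survive the change is sketched below: prefer Accept-Language, fall back to a locale token in the U-A if one is there, and only then default to "en-US". The function name `pickDownloadLocale`, the lowercase header keys, and the locale pattern are all assumptions made for this example, not a description of how any existing site works.

```javascript
// Hypothetical server-side helper. `headers` is a plain object with
// lowercase header names, e.g. { "accept-language": "...", "user-agent": "..." }.
function pickDownloadLocale(headers) {
  // 1. Prefer what the user asked for explicitly.
  const accept = headers["accept-language"];
  if (accept) {
    const first = accept.split(",")[0].split(";")[0].trim();
    if (first && first !== "*") return first;
  }
  // 2. Fall back to a locale-looking token in the U-A comment, if present.
  //    Assumption: it looks like "; en-US;" or "; ja-JP)".
  const ua = headers["user-agent"] || "";
  const m = /;\s*([a-z]{2,3}(?:-[A-Z]{2})?)\s*[;)]/.exec(ua);
  // 3. Otherwise use the common default.
  return m ? m[1] : "en-US";
}

console.log(
  pickDownloadLocale({
    "user-agent": "Mozilla/5.0 (X11; U; Linux i686; ja-JP; m18) Gecko/20001006",
  })
);
```

With this ordering, dropping the locale from the U-A only changes the result for requests that also carry no Accept-Language header.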

Mitchell Stoltz (not reading bugmail)

Comment 48

•

23 years ago

I agree with dveditz. People who care whether or not their UI language is
revealed in the UA string probably also don't want their browser and OS versions
revealed. We already have a (hidden) pref for overriding the UA string; why
don't we just encourage people to use that?

Jacek Piskozub

Comment 49

•

23 years ago

Mitchell: One of the reasons is bug 83376. It seems Sun Java uses the UA to
check if the browser is Netscape/Mozilla. It is either free choice of UA or
working Java :-(

timeless

Comment 50

•

23 years ago

please IGNORE the sun jvm problem, that's a bug which someone working on oji or 
the sun jre will fix, it shouldn't ask us about our spoofable useragent.  All 
things considered a simple pref could probably be exposed:

[x] Include system and locale information in useragent. checked by default.

unchecking it would strip out all information except very basic stuff.

Jacek Piskozub

Comment 51

•

23 years ago

Timeless: I'm actually for ignoring the Java bug. The thing that bothers me is
the reply from Sun edburns@acm.org posted on 7/17 as a comment to bug 83376:

> Java Plug-in depends on user-agent string for version information, no fix
> will be made.
>
> zhengyu.gu@sun.com

I believe this calls for applying some pressure on Sun.

Tim Powell

Comment 52

•

23 years ago

Although IE5,IE5.5 do not include the language in the user agent string, they do
expose it in JavaScript through navigator.userLanguage and
navigator.systemLanguage. I believe navigator.language should continue to report
useful and accurate information in Mozilla. Perhaps the language returned can be
tailored to the accept language stuff under edit->prefs->navigator->language.
Perhaps report the preferred language. It seems that this is the only way the DOM
can configure to the language, which could be useful if not serving special
pages for each language.
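
A minimal sketch of the cross-browser lookup a page would need, using only the three property names mentioned above. The helper name `detectLanguage` is hypothetical, and the objects passed in are plain stand-ins for a browser's `navigator`.

```javascript
// Hypothetical helper: read the language from whichever property the
// browser provides. navigator.language is the Netscape/Mozilla property;
// userLanguage and systemLanguage are the IE ones.
function detectLanguage(nav) {
  return nav.language || nav.userLanguage || nav.systemLanguage || "";
}

// Stand-in objects, since this sketch is not tied to a real browser.
console.log(detectLanguage({ language: "en-US" }));
console.log(detectLanguage({ userLanguage: "fr", systemLanguage: "en-us" }));
```

The empty-string fallback means a page using this helper keeps working even in a browser that exposes none of the three.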

I don't see any reason to remove this from UA by default, especially since we
allow changing the UA completely. I've always thought that this was one of the
nicer features of N4.

bobj

Comment 53

•

23 years ago

> Perhaps the language returned can be
> tailored to the accept language stuff under edit->prefs->navigator->language.
> Perhaps report the preferred language.

These are different.  The string in the u-a indicates the browser localization
(i.e. the browser UI and some default settings).  The pref indicates the
user's preferred language(s) for the content.  E.g., a user may run a Japanese
browser, but prefer content in Arabic. Since the browser is enabled for many
more languages than it is currently localized for, this is not unusual.

Jaime Rodriguez, Jr.

Comment 54

•

23 years ago

Removing ME ---> barrowma (acting browser PM)

Mitchell Stoltz (not reading bugmail)

Comment 55

•

23 years ago

time marches on...retargeting to 0.9.6

Target Milestone: mozilla0.9.5 → mozilla0.9.6

Mitchell Stoltz (not reading bugmail)

Comment 56

•

23 years ago

Moving to Moz1.0 as part of "paranoid mode" feature set

Target Milestone: mozilla0.9.6 → mozilla1.0

Asa Dotzler [:asa]

Comment 57

•

23 years ago

Bugs targeted at mozilla1.0 without the mozilla1.0 keyword moved to mozilla1.0.1 
(you can query for this string to delete spam or retrieve the list of bugs I've 
moved)

Target Milestone: mozilla1.0 → mozilla1.0.1

Mitchell Stoltz (not reading bugmail)

Comment 58

•

22 years ago

Futuring.

Target Milestone: mozilla1.0.1 → Future

Ben Bucksch (:BenB)

Assignee

Updated

•

19 years ago

Assignee: security-bugs → ben.bucksch

Status: ASSIGNED → NEW

Target Milestone: Future → ---

Tony Mechelynck [:tonymec]

Comment 59

•

16 years ago

Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9pre) Gecko/2008033001 SeaMonkey/2.0a1pre

Workaround: Set the pref general.useragent.locale (in about:config) to the empty string (or even to any language you want to spoof as using). If set to the empty string, the semicolon and one surrounding space are removed too.
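
The effect of the workaround on the string can be modelled roughly as follows. This is only an illustration of the described behaviour (an empty locale drops the token together with its separator); `buildUserAgent` is a made-up function and the fixed parts are copied from the UA string quoted above, so it says nothing about how Gecko actually assembles the string.

```javascript
// Hypothetical model of the UA string with the locale taken from a pref value.
function buildUserAgent(localePref) {
  const parts = ["X11", "U", "Linux i686", localePref, "rv:1.9pre"];
  // An empty locale is dropped, so no stray "; " is left behind.
  const comment = parts.filter((p) => p !== "").join("; ");
  return `Mozilla/5.0 (${comment}) Gecko/2008033001 SeaMonkey/2.0a1pre`;
}

console.log(buildUserAgent("en-US"));
console.log(buildUserAgent(""));
```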

According to http://kb.mozillazine.org/General.useragent.locale , that pref was created on 2000-02-07.

Whiteboard: [defective-privacy] → [defective-privacy] [has workaround, comment #59]

Ben Bucksch (:BenB)

Assignee

Comment 60

•

16 years ago

Per HTTP spec, "message which makes the user aware of the loss of privacy involved", i.e. not only user-configurable, but even an alert.
(This applies all the more if this is set by default based on UI or OS language, like Firefox does.)

Ben Bucksch (:BenB)

Assignee

Updated

•

16 years ago

Assignee: ben.bucksch → mozilla

Daniel Veditz [:dveditz]

Updated

•

16 years ago

Keywords: privacy

Phil Ringnalda (:philor)

Updated

•

15 years ago

QA Contact: ckritzer → toolkit

messi

Comment 63

•

15 years ago

While I believe that my request (bug 525637) is a superset of this very old bug and not a duplicate, I'd like to reactivate the topic here, though it seems Mozilla is avoiding the topic.

Also, setting the locale to blank is not a workaround because it will make you stand out even more. Please remove "has workaround" keyword.

Ben Bucksch (:BenB)

Assignee

Updated

•

14 years ago

Priority: P1 → P4

Henri Sivonen (:hsivonen)

Comment 65

•

14 years ago

Bug 543202 is also a superset and not an exact duplicate. For example, the Gecko build date is absolutely useless for any non-b.m.o sniffing but it makes UA strings have more fingerprintable parts.

obsolete.fax

Updated

•

14 years ago

Flags: blocking1.9.0.19?

Henri Sivonen (:hsivonen)

Comment 66

•

14 years ago

Note that CSS now exposes the directionality of the UI language. The HTML5 parser (via <isindex> prompt) exposes the UI language but not necessarily which regional variant when the string does not happen to vary by region.

Ben Bucksch (:BenB)

Assignee

Comment 67

•

14 years ago

Both should instead use the *content* language (Prefs|Content|Language|Choose..., which in a default Firefox install is the same as the UI language). The content language is fine to expose to the website, that's what it's for, and it's changeable independently. It's also more correct to use that, because the site should adjust based on that, and it would have the wrong effect if an English-UI Firefox nightly set to Arabic reported left-to-right and "en" in some places and Arabic in Accept-Language.
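
A rough sketch of what "use the content language" could mean in practice: take the first entry of the user's content-language list and derive both the reported language and the directionality from it. The function name `contentLanguageInfo` and the short right-to-left list are assumptions for illustration only; the list is deliberately incomplete.

```javascript
// Assumption: a small, incomplete set of right-to-left primary language subtags.
const RTL_LANGUAGES = new Set(["ar", "he", "fa", "ur"]);

// Hypothetical helper. `acceptLanguages` is a comma-separated preference
// list such as "ar, en-US;q=0.7, en;q=0.3".
function contentLanguageInfo(acceptLanguages) {
  // The first entry is the user's most preferred content language.
  const first = acceptLanguages.split(",")[0].split(";")[0].trim();
  const primary = first.split("-")[0].toLowerCase();
  return {
    language: first,
    direction: RTL_LANGUAGES.has(primary) ? "rtl" : "ltr",
  };
}

console.log(contentLanguageInfo("ar, en-US;q=0.7, en;q=0.3"));
```

In this model the English-UI browser set to Arabic content reports Arabic and right-to-left consistently, regardless of the UI localization.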

Gervase Markham [:gerv]

Comment 68

•

14 years ago

Are there _any_ known cases of a website using Accept-Language to infer someone's ethnicity and taking action against them, either electronic or physical?

The only way to avoid that possibility would be to remove all language-identifying features from what the browser sends - Accept, JS, everything. However, these features are used today to provide serious and measurable benefits to web users, 99.999% of whom don't care if a site knows what languages they speak.

Gerv

Daniel Glazman (:glazou) (not active in Mozilla any more)

Comment 69

•

14 years ago

(In reply to comment #68)

> The only way to avoid that possibility would be to remove all
> language-identifying features from what the browser sends - Accept, JS,
> everything. However, these features are used today to provide serious and
> measurable benefits to web users, 99.999% of whom don't care if a site knows
> what languages they speak.

I don't understand some of the comments above Gerv's... OK, the spec says something, but pragmatism sometimes helps if the spec is counter-productive.

There are many web sites out there that are tailored to serve me better, and that includes serving me in my own language if and when they support it. This is a hugely positive factor for all users around the world. Geolocating the IP address is not enough, and many web sites have elaborate detection based on everything that's available today, including the user-agent string. Change that and you'll break their behaviour, and I won't call that "for the greatest benefit of all".

I am myself strongly in favor of a wontfix for this bug unless the solution implemented is a "never reveal my language" pref in an advanced preference panel with default "on". And I'm not sure it's worth the bloat, honestly.

Ben Bucksch (:BenB)

Assignee

Comment 70

•

14 years ago

Gerv, I am not opposed to Accept-Language at all, which is user-configurable and defaulted to the UI language. I think I said as much in my last message.

I am opposed to the browser sending the *UI* language, where it's different from Accept-Language. Wherever we send the language, locale or country to the site, e.g. in the UserAgent string, it should be the user-configurable value from Prefs|Content|Language|Choose... (which is already used for Accept-Language), not the UI language value.

There is no loss for the user here. On the contrary, if anything it's going to work better.

Ben Bucksch (:BenB)

Assignee

Comment 71

•

14 years ago

Please note that the summary of this bug specifically says "UI language", i.e. browser locale/ package, not "content language" = Accept-Language = "Prefs|Content|Language|Choose...".

obsolete.fax

Updated

•

14 years ago

blocking1.9.1: --- → ?

blocking1.9.2: --- → ?

blocking2.0: --- → ?

Flags: blocking1.9.0.19?

Keywords: checkin-needed

Dão Gottwald [:dao]

Updated

•

14 years ago

Keywords: checkin-needed

Dão Gottwald [:dao]

Updated

•

14 years ago

blocking1.9.1: ? → ---

blocking1.9.2: ? → ---

Henri Sivonen (:hsivonen)

Comment 72

•

14 years ago

FWIW, I'm not really concerned about using language per se for nefarious purposes. I'm more concerned about UI language being yet another piece of configuration entropy that can be used for fingerprinting. See https://panopticlick.eff.org/

Ben Bucksch (:BenB)

Assignee

Comment 73

•

14 years ago

I'll update patch here as part of bug 57555, once I get to it.

Gervase Markham [:gerv]

Comment 74

•

14 years ago

Ben: so you are not arguing for this switch from a privacy point of view, but a functionality one? If so, that does make sense to me.

Gerv

Ben Bucksch (:BenB)

Assignee

Comment 75

•

14 years ago

I argue from both perspectives. If the user is in control, there is no privacy issue, or at least no critical one. Also, reducing the number of permutations is good for functionality (consistent) and privacy (fingerprinting), in this case.

Johnny Stenback (:jst)

Comment 76

•

14 years ago

Not holding the 1.9.3 release for this bug.

blocking2.0: ? → -

Henri Sivonen (:hsivonen)

Updated

•

14 years ago

Depends on: 572656

gionnico

Comment 77

•

14 years ago

I don't understand why all locales but the English one should have something like

it-it
it;q=0.8
en-us;q=0.5
en;q=0.3

If I download Italian, why is English there? And why two strings with different priorities (it-it and it)?

This, by the way, gives more information for fingerprinting than a plain "it" (which is in general more common than the double it-it) set as default for every Italian build: official or unofficial and for every platform.

Ben Bucksch (:BenB)

Assignee

Comment 78

•

14 years ago

gionnico, you're in the wrong bug. You talk about the Accept-Language header, which is not the subject of this bug.

Dão Gottwald [:dao]

Updated

•

14 years ago

Depends on: 580032

Ben Bucksch (:BenB)

Assignee

Comment 79

•

14 years ago

Attached patch Patch 3: Change navigator.language to use Accept-Language (obsolete) — Details — Splinter Review

After bug 572656 has fixed the UA string by removing the language part, this also fixes navigator.language. The property is retained (JS has no access to the Accept-Language header, to my knowledge), and retains the formal format, but uses the value from Accept-Language (which the user can freely configure in the pref window) instead of the UI language.

Asking biesi to review.

Attachment #460173 - Flags: review?(cbiesinger)

Ben Bucksch (:BenB)

Assignee

Comment 80

•

14 years ago

http://browserspy.dk/language.php
http://browserspy.dk/showprop.php

Ben Bucksch (:BenB)

Assignee

Updated

•

14 years ago

Attachment #43864 - Attachment is obsolete: true

Ben Bucksch (:BenB)

Assignee

Comment 81

•

14 years ago

Comment on attachment 460173 [details] [diff] [review]
Patch 3: Change navigator.language to use Accept-Language

Actually, bz reviewed bug 572656

Attachment #460173 - Flags: review?(cbiesinger) → review?(bzbarsky)

Axel Hecht [:Pike]

Comment 82

•

14 years ago

Comment on attachment 460173 [details] [diff] [review]
Patch 3: Change navigator.language to use Accept-Language

Would this break on 

en-gb;q=0.8, en;q=0.7

Didn't dig into whitespace handling.

Not that I'm in sync with the rationale of these bugs, for the record.

Boris Zbarsky [:bzbarsky]

Comment 83

•

14 years ago

Comment on attachment 460173 [details] [diff] [review]
Patch 3: Change navigator.language to use Accept-Language

Yeah, Axel's right.  This needs to handle q values.  And things like spaces around the ',' chars, etc.  Probably best to just use nsCharSeparatedTokenizer and then deal with the q value thing.

Attachment #460173 - Flags: review?(bzbarsky) → review-
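
To make the review point concrete, here is a minimal JavaScript sketch of pulling the first language out of a comma-separated list while tolerating q values and spaces around the commas. It is not the patch (which is C++ and would use nsCharSeparatedTokenizer); the function name firstLanguage is made up for illustration.

```javascript
// Hypothetical helper, for illustration only: returns the first language
// tag of a comma-separated list, ignoring ";q=..." and surrounding spaces.
function firstLanguage(list) {
  for (const entry of list.split(",")) {
    // Drop any parameters such as ";q=0.8", then trim whitespace.
    const tag = entry.split(";")[0].trim();
    if (tag !== "") {
      return tag;
    }
  }
  return ""; // empty or unusable list
}

console.log(firstLanguage("en-gb;q=0.8, en;q=0.7")); // en-gb
```

Note that this takes the first entry as it stands and does not sort by q value.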

Ben Bucksch (:BenB)

Assignee

Comment 84

•

14 years ago

> This needs to handle q values.

You mean I need to look for ";" in addition to ","? Yes, sure, sorry for the oversight. Will attach new patch.

(I do not care to support *manually* hacked prefs that have a lesser preferred language as the first entry, if that's what you meant.)

> nsCharSeparatedTokenizer

Will take a look, how big the code will be with that and with FindInReadable().

Ben Bucksch (:BenB)

Assignee

Comment 85

•

14 years ago

Actually, the pref "intl.accept_languages" does not contain ;q=. The UI doesn't write q= in there, and if a user does, it's ignored and not sent in the HTTP header; the HTTP code re-calculates the q= values.
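
As a rough illustration of q values being derived from list position rather than stored in the pref, the sketch below assigns evenly spaced weights. This is an assumption about the scheme, not a copy of the networking code, and it is only meant for short lists; it does happen to reproduce the header quoted in comment 77. The name buildAcceptLanguage is hypothetical.

```javascript
// Hypothetical sketch: derive an Accept-Language value from a plain
// comma-separated pref, assuming evenly spaced q values by position.
function buildAcceptLanguage(pref) {
  const langs = pref
    .split(",")
    .map((s) => s.trim())
    .filter((s) => s !== "");
  return langs
    .map((lang, i) => {
      if (i === 0) {
        return lang; // first entry carries the implicit q=1
      }
      // Weight falls linearly with position, rounded to one decimal.
      const q = Math.round((1 - i / langs.length) * 10) / 10;
      return `${lang};q=${q}`;
    })
    .join(",");
}

console.log(buildAcceptLanguage("it-it,it,en-us,en"));
// it-it,it;q=0.8,en-us;q=0.5,en;q=0.3
```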

Also, while nsCharSeparatedTokenizer is useful here, it seems to skip the first token. Either I don't know how to use its API, or the implementation is broken.

Ben Bucksch (:BenB)

Assignee

Comment 86

•

14 years ago

> nsCharSeparatedTokenizer ... seems to skip the first token

Nevermind, I was stupid.

Ben Bucksch (:BenB)

Assignee

Comment 87

•

14 years ago

Attached patch Patch 4: Change navigator.language to use Accept-Language (obsolete) — Details — Splinter Review

Attachment #460173 - Attachment is obsolete: true

Attachment #461774 - Flags: review?(bzbarsky)

Reşat SABIQ (Reshat)

Comment 88

•

14 years ago

I feel kinda bad for apparently slowing the approval of the patch down a bit, but I'd like to draw attention to a few things:
1. 3-char lang codes (need to be handled).
Based on url mentioned in another bug, there are 2 existing examples of them: http://mxr.mozilla.org/l10n-mozilla1.9.2/search?string=intl.accept_languages&find=global/intl.properties
(The same could be asked for 3-char country codes, but there are no examples of those, and they might not be probable (haven't looked into it).)
2. Default for messy accept-lang pref: should this be en-US rather than en for consistency w/ default en-US locale? I tend to think "yes".
3. Update urls: %LOCALE% in chrome prefs is resolved by navigator.language as of now, AFAIK. I'm 99.9% sure that the current UI language shouldn't be changed by an update based on preferred content language. If that's the case, then there needs to be an additional patch (in this bug or another), that accounts for necessary additional logic that is now required to substitute %LOCALE% in chrome app.update.url pref w/ UI lang, rather than navigator.language value.
(Plus, I'm not 100% sure that an update in language B for an installation whose UI is in language A works smoothly every time: i'm not saying it doesn't either, but it would probably be something worth verifying if the updates were going to be based on the current preferred language.)
4. empty pref special case (messed up, or user wishes to not specify accept-language in HTTP):
4.1. should this result in navigator.language being empty, for consistency with HTTP Accept-Language? I tend to think "yes".
4.2. i'm also throwing this note in based on Java StringTokenizer's throwing an exception if nextElement() is called w/o hasMoreTokens() having returned true prior to that. Quick look at the source appears to suggest that nsCharSeparatedTokenizer works differently and (at least currently) returns an empty string instead. Forgive me for not having time to verify that, but i hope those who are already building FX 4 can take a quick look to make sure current empty pref handling wouldn't crash the app, for instance.

Accounting for all of this should be easy, except maybe item 3., which necessitates a patch for another class.

P.S. For the record, I can't wait for this fix.

Reşat SABIQ (Reshat)

Comment 89

•

14 years ago

Just noticed this as well:
5. Currently (FX <=3.6.x), the country code in navigator.language is upper-case, whereas the accept-language pref's country code is lower-case.
Should navigator.language value be:
5.a. backwards compatible
(this doesn't seem feasible, because some locales have country code in accept-lang, and no country code in navigator.language; either that's something sites will need to adjust to, or there'd need to be a map of first-accept-lang=ui-lang pairs based on FX 3.6.x used for 100% backwards-compatibility)
or
5.b. upper-case
or
5.c. lower-case

IMHO, ideally 5.a., but 5.b. would be easier to implement and more consistent, at the expense of additional country code for some locales and/or case change for some locales (sorry, i haven't analyzed other locales for letter-case). If sites can adjust to UA format change for ALL locales, they can also adjust to navigator.language values changing slightly for SOME locales (but it might warrant a list of before-after values for affected locales).

Reşat SABIQ (Reshat)

Comment 90

•

14 years ago

(In reply to comment #89)
> 5.b. upper-case
> or
> 5.c. lower-case
Clarification, I meant:
5.b. using upper-case country code
or
5.c. using lower-case country code

[not reading bugmail]

Comment 91

•

14 years ago

That can be addressed in another bug if it needs to be changed.

Reşat SABIQ (Reshat)

Comment 92

•

14 years ago

(In reply to comment #88)
> 3. Update urls: %LOCALE% in chrome prefs is resolved by navigator.language as
> of now, AFAIK. I'm 99.9% sure that the current UI language shouldn't be changed
> by an update based on preferred content language. If that's the case, then
> there needs to be an additional patch (in this bug or another), that accounts
> for necessary additional logic that is now required to substitute %LOCALE% in
> chrome app.update.url pref w/ UI lang, rather than navigator.language value.

Please ignore item 3.: a couple of my memory wires got crossed, and in fact no changes need to be made to app.update.url handling, because %LOCALE% there appears to be replaced based on the locale in update.locale file in app directory, and NOT based on navigator.language value. Unlike elsewhere, I acknowledge having wastefully posted 10 lines in c88 for item 3, and 6 lines here to clear that up.

Also, to clarify, an obvious example for item 5 that i had in mind but didn't mention is en-US vs. en-us.

Ben Bucksch (:BenB)

Assignee

Comment 93

•

14 years ago

Thanks for the comments, Reshat.
1. I was assuming only 2-char ISO lang codes were valid, but the HTTP spec explicitly gives examples with other codes, so I'll be more lax. I'll still check that the user didn't use the wrong Windows syntax of en_GB instead of the correct Internet syntax en-GB.

3. If we broke %LOCALE% in the updater, that'd be really bad. I'll double-check that this is not the case, as well as whether there are other stupid uses of navigator.language.

5. Yes, I was already thinking of casing, thanks for pointing out that the current navigator.language is en-US. I'll maybe just fix the casing, but only in cases of a 5-letter code (lowercase 2-letter, dash, uppercase 2-letter).
Whether this is important also depends on other browsers. If they use different casing, likely sites will be tolerant (using .toLowerCase()).

Specs:
<http://asg.web.cmu.edu/rfc/rfc2616.html#sec-14.4>
<https://developer.mozilla.org/en/Navigator.language>

Ben Bucksch (:BenB)

Assignee

Comment 94

•

14 years ago

4. The fallback could be either "en" or "". I chose the former, but it's trivial to do the latter. Up to reviewer. *Informed* opinions welcome.

Ben Bucksch (:BenB)

Assignee

Comment 95

•

14 years ago

Responding to self, I think it would indeed be better to use "" as fallback, esp. in light of "i-cherokee" as first accept-lang, and of sites in other countries which want to use the local language as fallback.

When using "" as fallback, we leave the fallback to the site. When using "en", the site cannot differentiate whether it's a fallback or the user really meant English. So, I'll use "" as fallback.
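
From the site's side, an empty value could be handled along these lines. This is only a sketch: chooseSiteLanguage, the list of supported languages and the site default are all hypothetical, and a real site may well prefer the Accept-Language header on the server.

```javascript
// Hypothetical site-side helper: pick a content language from
// navigator.language, leaving the fallback decision to the site.
function chooseSiteLanguage(navLanguage, supported, siteDefault) {
  if (navLanguage === "") {
    return siteDefault; // the browser expressed no preference
  }
  const wanted = navLanguage.toLowerCase();
  // Try an exact (case-insensitive) match first.
  const exact = supported.find((s) => s.toLowerCase() === wanted);
  if (exact) {
    return exact;
  }
  // Otherwise fall back to the primary subtag, e.g. "de" for "de-DE".
  const primary = wanted.split("-")[0];
  const loose = supported.find((s) => s.toLowerCase() === primary);
  return loose || siteDefault;
}

console.log(chooseSiteLanguage("", ["de", "fr"], "fr")); // fr
console.log(chooseSiteLanguage("de-DE", ["de", "fr"], "fr")); // de
```

With "" the site can tell "no preference" apart from a user who really asked for English, which is the distinction argued for above.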

Axel Hecht [:Pike]

Comment 96

•

14 years ago

There's a good chance that we'll have script tags in language codes at some point, and maybe even x- stuff.

Reşat SABIQ (Reshat)

Comment 97

•

14 years ago

(In reply to comment #88)
> 2. Default for messy accept-lang pref: should this be en-US rather than en for
> consistency w/ default en-US locale? I tend to think "yes".
...
> 4. empty pref special case (messed up, or user wishes to not specify
> accept-language in HTTP): 
> 4.1. should this result in navigator.language being empty, for consistency with
> HTTP Accept-Language? I tend to think "yes".

IMHO, "" navigator.language for "" accept-language is good.
With regards to "" or afore-mentioned "en-US" navigator.language for messy/unrecognized accept-language, i'm not so sure, because accept-language header is not "" or "en-US" in this case as of now. I tend to think navigator.language should at least match the language in accept-language (allowing the (remote) possibility of not having a country code as it is the case in 3.6.x). So for "messed-up" intl.accept_languages, i'd either:
2.1. provide "messed-UP" as navigator.language, and keep HTTP accept-language as is (i.e., "messed-up")
or
2.2. provide "" for both navigator.language and HTTP accept-language
or
2.3. provide "" for navigator.language, and log a bug to do the same for HTTP accept-language

Taking into account Axel's comment as well, 2.1 might be the best way to go in the context of this bug. If 2.2, or 2.3 were found desirable, IMHO we need a separate bug for that (especially, because exceptional manually set values are the subject of discussion here).

Ben Bucksch (:BenB)

Assignee

Comment 98

•

14 years ago

Attached patch Patch 5: Change navigator.language to use Accept-Language (obsolete) — Details — Splinter Review

- removed the check, so now also allowing i-cherokee
- return "", if the accept-lang pref is empty (or otherwise invalid)
- replace _ with - (only first one for now)
- return uppercase for en-US

Attachment #461774 - Attachment is obsolete: true

Attachment #462244 - Flags: review?(bzbarsky)

Attachment #461774 - Flags: review?(bzbarsky)

Ben Bucksch (:BenB)

Assignee

Comment 99

•

14 years ago

Searching for "navigator.language" in source returns (only):
./extensions/reporter/resources/content/reporter/reportWizard.js:
  const gParamLanguage = window.navigator.language;
./testing/extensions/community/chrome/content/litmus.js:
  this.locale = navigator.language;
./testing/sisyphus/tests/mozilla.org/download-page/userhook.js:
  data['005 navigator.language']      = navigator.language;
I should fix these, too, but I will probably use bug 580032 as tracker for that.

Boris Zbarsky [:bzbarsky]

Comment 100

•

14 years ago

Hmm.  Why do we want to do the bits about converting '_' to '-' (how would the '_' get there?) and uppercasing stuff?

Reşat SABIQ (Reshat)

Comment 101

•

14 years ago

IMHO, these points are worth another look:
4.    IMHO, no assert is needed for empty accept-language, since RFC says:
"If no Accept-Language header is present in the request, the server
    SHOULD assume that all languages are equally acceptable."
Only affects debug builds, but still...
5. country code upper-casing:
The following cases should be handled as well:
abc-XY
abc-XY-dialect
https://wiki.mozilla.org/L10n:Locale_Codes
https://wiki.mozilla.org/L10n:Teams
One way of pseudocoding would be: if first '-' (at index 2 or 3), if any, is followed by no more than 2 alpha chars in a row, uppercase those 2 chars.
That said, the whole item goes away if we are going to make country code in navigator.language lower-case, though that wouldn't be backwards-compatible.

Axel Hecht [:Pike]

Comment 102

•

14 years ago

Please, don't use our wiki as a standard. The real document is http://tools.ietf.org/html/bcp47, which refers to http://tools.ietf.org/html/rfc4647 for the matching stuff.

Here, http://tools.ietf.org/html/bcp47#section-2.1.1 rules:

 At all times, language tags and their subtags, including private use
   and extensions, are to be treated as case insensitive: there exist
   conventions for the capitalization of some of the subtags, but these
   MUST NOT be taken to carry meaning.
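
For site authors this means comparisons should not depend on case. The sketch below follows the idea of basic filtering in RFC 4647 as I read it (a range matches a tag if it equals the tag, or is a prefix of it followed by "-", ignoring case); the function name matchesRange is hypothetical and wildcards are left out.

```javascript
// Hypothetical helper: case-insensitive match of a language range
// (e.g. "en") against a language tag (e.g. "en-US").
function matchesRange(range, tag) {
  const r = range.toLowerCase();
  const t = tag.toLowerCase();
  // Equal, or the range is a prefix that ends at a subtag boundary.
  return t === r || t.startsWith(r + "-");
}

console.log(matchesRange("en", "en-US")); // true
console.log(matchesRange("en", "eng")); // false
```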

Reşat SABIQ (Reshat)

Comment 103

•

14 years ago

Thanks, Axel. Well, i assumed we wanted consistency. If we are fine w/ en-US on one hand, and ast-es on the other, that's a different matter of course. If that is considered and decided to be OK, then that's the way it's gonna be. Still worth drawing attention to it, IMHO.

Ben Bucksch (:BenB)

Assignee

Comment 104

•

14 years ago

> Hmm.  Why do we want to do the bits about converting '_' to '-'
> (how would the '_' get there?) and uppercasing stuff?

1. "-" vs. "_". The RFCs and ISO say that the separator is "-", e.g. "en-US". However, POSIX uses LANG="de_DE" in env vars, and Windows also uses the "en_US" notation. The latter is therefore common, and I've seen many people use "en_US" although the spec/protocol explicitly said "en-US", so it's a common error. How would it get there? By somebody editing the pref manually in <about:config>. The normal pref dialog does not allow specifying arbitrary tags. To avoid confusion and problems on the site's end, I want to prevent this, even if it's unlikely. I tried to make sure it's neither a perf problem (just 2 int comparisons) nor a lot of code (just 2 lines).

2. uppercase: our pref contains e.g. "de-de,en-us,en", i.e. lower case. However, our locale codes are "de-DE", i.e. country part in upper case. BCP47 (mentioned by Axel above) also uses this notation in the examples. It's the convention. We used to return that as well, e.g. "en-US". If a site doesn't use navigator.language.toLowerCase(), the comparison will fail once we return lower case. If our own code (comment 99) is any indication, this error is common, so if we don't adjust the casing to the convention, we may break stuff.

Now, my parsing here is primitive. It works fine for the 2-letter-dash-2-letter codes that we use for locales, and IIRC that's all that the pref dialog allows currently, so it should be sufficient for now. The only "failure" is indeed "ast-es" and similar codes with 3 letters as first part. BCP47 also gives examples: zh-Hant, de-Latn-DE, de-DE-x-goethe. For these, my code would return zh-HANT, de-LATN-DE, de-DE-X-GOETHE, which is not the conventional casing. If you think I should improve this, I would use the nsCharSeparatedTokenizer here as well, and then uppercase every 2-letter part, apart from the first part.

Boris Zbarsky [:bzbarsky]

Comment 105

•

14 years ago

I guess my question is how far we're willing to go in terms of trying to canonicalize random input.  I think the only two sources of this pref are:

1)  What our prefs dialog generates.
2)  What our localizers set up as the default for their locale.
3)  about:config.

I claim we don't care about #3, can impose any reasonable rules we want on #2, and fully control #1.  So if it simplifies our code, we should just assume whatever we want and impose corresponding rules on #2.  Axel, thoughts?

Axel Hecht [:Pike]

Comment 106

•

14 years ago

rightly so

Boris Zbarsky [:bzbarsky]

Comment 107

•

14 years ago

s/two sources/sources/, clearly.  ;)

Reşat SABIQ (Reshat)

Comment 108

•

14 years ago

Concise:
IMHO, Ben's suggestion of uppercasing the first 2-letter part, after the first part, if any, might be ideal (though a bit harder to implement), but restricting such uppercasing to just the assumption that this 2-letter part is the second subtag (following 2- or 3-char first subtag), as i understand Boris and Axel appear to be inclined to do, will probably be sufficient for a long time (and might save x ms on each access?). If these are the 2 operational choices to proceed with, then choosing between them is almost a coin-toss situation IMHO.

Verbose additional info:
One could also say that we have 1 model, which is the intl.accept_languages chrome pref, whose value is presented by several views (in MVC pattern lingo).

Paraphrasing Boris, the input can come from:
i. pre-shipped intl.accept_languages pref values based on intl.properties (1) and 2))
ii. random about:config entries by the user (3))

FYI, if i enter "messed1-up,messed2-up" as random input, both prefs dialog and about:config reflect the same value (prefs dialog just shows them as codes in [] without displaying a recognized lang name in front of each value, although some pre-shipped values also don't have recognized lang names).

Ben Bucksch (:BenB)

Assignee

Comment 109

•

14 years ago

Attached patch Patch 6: Change navigator.language to use Accept-Language, using tokerizer for uppercasing (obsolete) — Details — Splinter Review

Here's another patch with better uppercasing code. It uses the tokenizer for this as well, and uppercases all 2-letter parts, apart from the first one.

Attachment #462638 - Flags: review?(bzbarsky)
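
For readers who don't want to open the patch, the casing rule described here can be sketched in JavaScript as follows. This is an illustration of the rule as stated (uppercase every 2-letter part except the first), not the C++ code from the attachment; the function name upperCaseTwoLetterParts is made up.

```javascript
// Hypothetical illustration of the rule: uppercase every 2-letter
// subtag of a language tag, except the first subtag.
function upperCaseTwoLetterParts(tag) {
  return tag
    .split("-")
    .map((part, index) =>
      index > 0 && part.length === 2 ? part.toUpperCase() : part
    )
    .join("-");
}

console.log(upperCaseTwoLetterParts("de-de")); // de-DE
console.log(upperCaseTwoLetterParts("ast-es")); // ast-ES
console.log(upperCaseTwoLetterParts("de-de-x-goethe")); // de-DE-x-goethe
```

Script subtags such as "hant" are left as they are under this rule; it does not try to produce the title case (Hant) used in the BCP47 examples.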

Ben Bucksch (:BenB)

Assignee

Comment 110

•

14 years ago

> I claim we don't care about [<about:config>], can impose any reasonable rules
> we want on [our prefs UI], and fully control [the defaults of the
> localized builds].  So if it simplifies our code, we should just assume
> whatever we want and impose corresponding rules on [the prefs UI].

OK, great, works for me.

Only catch is: the intl.accept_languages pref may have existing values. So, even if we change the prefs UI, the old values are still there. And unfortunately, they are in "de-de" notation. So, unless you want to migrate, we have to work with that.

I attached another patch with better lang part parsing. Make your pick. I'd take patch 6.

Boris Zbarsky [:bzbarsky]

Comment 111

•

14 years ago

Comment on attachment 462638 [details] [diff] [review]
Patch 6: Change navigator.language to use Accept-Language, using tokerizer for uppercasing

I still think the uppercasing is silly, but r=me if you s/PRBool/bool/ for that thing you assign bools into.

Attachment #462638 - Flags: review?(bzbarsky) → review+

Boris Zbarsky [:bzbarsky]

Updated

•

14 years ago

Attachment #462244 - Flags: review?(bzbarsky) → review-

Ben Bucksch (:BenB)

Assignee

Updated

•

14 years ago

Attachment #462244 - Attachment is obsolete: true

Ben Bucksch (:BenB)

Assignee

Comment 112

•

14 years ago

Comment on attachment 462638 [details] [diff] [review]
Patch 6: Change navigator.language to use Accept-Language, using tokerizer for uppercasing

what happens with sr?

Attachment #462638 - Flags: superreview?(bzbarsky)

Ben Bucksch (:BenB)

Assignee

Comment 113

•

14 years ago

Attached patch Patch 7: Change navigator.language to use Accept-Language — Details — Splinter Review

Playing Boules with bool and PRBool.

Attachment #462638 - Attachment is obsolete: true

Attachment #462663 - Flags: superreview?(bzbarsky)

Attachment #462663 - Flags: review+

Attachment #462638 - Flags: superreview?(bzbarsky)

Boris Zbarsky [:bzbarsky]

Comment 114

•

14 years ago

Comment on attachment 462663 [details] [diff] [review]
Patch 7: Change navigator.language to use Accept-Language

Let's have jst do that.

Attachment #462663 - Flags: superreview?(bzbarsky) → superreview?(jst)

Johnny Stenback (:jst)

Comment 115

•

14 years ago

Comment on attachment 462663 [details] [diff] [review]
Patch 7: Change navigator.language to use Accept-Language

+    while (localeTokenizer.hasMoreTokens())
+    {
+      const nsSubstring &code = localeTokenizer.nextToken();
+      if (code.Length() == 2 && !first)
+      {
+        nsAutoString upper(code);
+        ::ToUpperCase(upper);
+        aLanguage.Replace(pos, code.Length(), upper);
+      }
+      pos += code.Length() + 1; // 1 is the separator
+      if (first)
+        first = false;

Might as well lose the if check there, the result will be the same w/o it and with less branching and less code.

sr=jst

Attachment #462663 - Flags: superreview?(jst) → superreview+

Robert Kaiser

Comment 116

•

14 years ago

That bug basically means decreased usability for people on our website, as we can't offer the correct locale download matching the browser version they are using right now any more. Thanks for breaking us.

Dão Gottwald [:dao]

Updated

•

14 years ago

Attachment #462663 - Flags: approval2.0?

Ben Bucksch (:BenB)

Assignee

Comment 117

•

14 years ago

KaiRo, wrong. Just use Accept-Language. That's the standard and what you should have used anyway.

Robert Kaiser

Comment 118

•

14 years ago

(In reply to comment #117)
> KaiRo, wrong. Just use Accept-Language. That's the standard and what you should
> have used anyway.

Completely wrong. I don't give a damn about the preferred language of *web sites* for that user (and that's what Accept-Language is), I only care about the *UI language* he is using. If I can't match that, I probably should even think about not trying to give the user any specific preferred download but sending him through a hurdle run of clicks to select it himself. Very user friendly, but thank you for giving me no other choice.

Daniel Glazman (:glazou) (not active in Mozilla any more)

Comment 119

•

14 years ago

(In reply to comment #118)

> Completely wrong. I don't give a damn about the preferred language of *web
> sites* for that user (and that's what Accept-Language is), I only care about
> the *UI language* he is using. If I can't match that, I probably should even
> think about not trying to give the user any specific preferred download but
> sending him through a hurdle run of clicks to select it himself. Very user
> friendly, but thank you for giving me no other choice.

Guys, you are _both_ right. Ben wants to follow an IETF recommendation on
a privacy issue and KaiRo wants to match the UI language because that's the
only way he can correctly serve someone like me, i.e. someone browsing the web
in French when it's available but using only en-US software.

That said, KaiRo, I think the vast majority of internet users use a browser
UI locale matching the accept-language's topmost language, and only a small
minority of geeks have a different configuration.

Let me ask a naive question here: if I download a given localized version of
Firefox, is the Accept-Language set by default to match that language? If yes,
then at least KaiRo can rely on that for, again, the vast majority of users.
The minority of übergeeks will be annoyed a bit but hey we're always annoyed
by everything aren't we?

FWIW, I still think the war-on-privacy-issues goes too far here. Anyway...

Dão Gottwald [:dao]

Comment 120

•

14 years ago

(In reply to comment #119)
> Let me ask a naive question here: if I download a given localized version of
> Firefox, is the Accept-Language set by default to match that language?

Yes.

FWIW, this isn't just fixing a privacy issue. It's also about making navigator.language more useful when it's used on the client side very much like Accept-Language would be used on the server side.

Ben Bucksch (:BenB)

Assignee

Comment 121

•

14 years ago

> if I download a given localized version of
> Firefox, is the Accept-Language set by default to match that language?

For the big locales, yes. For some small locales, there may be differences.

Yes, the download button detecting system and language is only for a good default, to make it easier for the majority of users.

I think a link/page "Other languages and systems", like Firefox has, should solve this for the most part.

Robert Kaiser

Comment 122

•

14 years ago

(In reply to comment #119)
> Let me ask a naive question here: if I download a given localized version of
> Firefox, is the Accept-Language set by default to match that language?

Fully depends on the localizers, but usually, the UA language is *among* the value in Accept-Language, even if some carefully chosen variant of it might be the primary language listed there. Also, the user might change the Accept-Language at will, making it way more fingerprintable than the UI locale, and e.g. possibly making "ger-saxon" their primary Accept-Language header when using a German (de) build, i.e. having an Accept-Language header including all of ger-saxon, de-DE, de, and possibly even en-US and/or en.

> The minority of übergeeks will be annoyed a bit but hey we're always annoyed
> by everything aren't we?

At least by all the paranoia going on with some things like the so-called "privacy" or "fingerprinting" threats introduced by UA strings.

And I'm very sensitive about privacy matters usually, but there are clearly things where we are overdoing it while at the same time not working on stuff that ought to be much higher priority but may have higher impact overall, like the exposure of the plugin or installed font lists to the web, which are way more fingerprintable than ridiculous things like the UI languages or using a nightly.

Axel Hecht [:Pike]

Comment 123

•

14 years ago

In most cases, the first accept locale differs from the chosen locale. Maybe just in that it's de-de instead of de, or upper vs lower case things that folks reading locale-code specs should already deal with, but it's not exactly the same thing.

http://mxr.mozilla.org/l10n-mozilla1.9.2/search?string=intl.accept_languages&find=global/intl.properties for data.

Robert Kaiser

Comment 124

•

14 years ago

In any case, as I can't be bothered to write a heuristic parser for that new stupid property, I'll just offer en-US builds when that locale doesn't give an exact match with one of the locales we can offer, everyone else needs to use the "other platforms and languages" link. Who needs usability anyhow.

Ben Bucksch (:BenB)

Assignee

Comment 125

•

14 years ago

> usually, the UA language is *among* the values in Accept-Language

So, it's solvable. You can detect which UI locale the build is, and offer the right download, in most cases (98%? of users) at least. The others can fall back to "other languages".

Robert Kaiser

Comment 126

•

14 years ago

(In reply to comment #125)
> > usually, the UA language is *among* the values in Accept-Language
> 
> So, it's solvable. You can detect which UI locale the build is, and offer the
> right download, in most cases (98%? of users) at least.

Only if I build a parser for the Accept-Languages list, which is quite some work for a simple download box...

Ben Bucksch (:BenB)

Assignee

Comment 127

•

14 years ago

fairly simple:
// acceptLang is the Accept-Language value; mirror, currentVersion,
// platformSpec and platformExtension are assumed to be defined elsewhere.
var localeMapping = {
  "de-de" : "de",
  "fo-ba" : "ba",
  // ...
};
var useLocale = "unknown";
for (var entry of acceptLang.split(",")) { // separate langs
  // strip q and all whitespace
  var lang = entry.replace(/;.*/, "").replace(/\s+/g, "").toLowerCase();
  if (localeMapping[lang]) {
    useLocale = localeMapping[lang];
    break;
  }
}
if (useLocale == "unknown") {
  showOtherLangsInBiggerFont(); // or directly on page
  useLocale = "en-US";
}
var downloadURL = mirror + "seamonkey-" + currentVersion + "-" + platformSpec + "-" + useLocale + platformExtension;

That's 12 lines of JS code, plus the mapping (which is fairly static). It won't be much more in PHP or whatever you use on the website, if you want to do it there instead.
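
For reference, the loop in comment 127 takes the entries in the order they appear and ignores the q-values. A slightly fuller sketch that orders the entries by q before matching could look like the following. The function names parseAcceptLanguage and pickDownloadLocale are made up for illustration, and the mapping is assumed to be keyed by lowercased tags as in the snippet above; this is not code from any patch on this bug.

```javascript
// Hypothetical helper: returns the language tags from an Accept-Language
// value, lowercased and ordered by descending q-value (ties keep header order).
function parseAcceptLanguage(header) {
  return header
    .split(",")
    .map((entry, index) => {
      const parts = entry.trim().split(";");
      const tag = parts[0].trim().toLowerCase();
      let q = 1; // q defaults to 1 when it is omitted
      for (const param of parts.slice(1)) {
        const m = /^\s*q\s*=\s*([0-9.]+)\s*$/i.exec(param);
        if (m) q = parseFloat(m[1]);
      }
      return { tag, q, index };
    })
    // drop empty entries, the "*" wildcard, and anything with q=0 or a bad q
    .filter((e) => e.tag && e.tag !== "*" && e.q > 0)
    .sort((a, b) => b.q - a.q || a.index - b.index)
    .map((e) => e.tag);
}

// Hypothetical helper: first download locale that matches the header,
// or null if nothing matches (the caller can then fall back to en-US).
function pickDownloadLocale(header, localeMapping) {
  for (const tag of parseAcceptLanguage(header)) {
    if (Object.prototype.hasOwnProperty.call(localeMapping, tag)) {
      return localeMapping[tag];
    }
  }
  return null;
}
```

With a mapping like { "de-de": "de", "de": "de" }, a header of "ger-saxon, de-DE;q=0.9, de;q=0.8" skips the unknown first entry and still resolves to the de build.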

Pascal Chevrel:pascalc (PTO until April 26)

Comment 128

•

14 years ago

kairo, for PHP you can use my locale detection class:
http://granary.stage.mozilla.com/libs/l10n-demos/localeDetectionDemo.php

Axel Hecht [:Pike]

Comment 129

•

14 years ago

FTR, both ignore script tags, which we'll apparently not get for 4.0, but that are totally fine to use (not that we have UI for those).
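
One way a matcher could cope with script subtags is to try progressively less specific forms of each tag. The sketch below is only an illustration: candidateTags is a hypothetical name, and it assumes the script subtag, when present, is a four-letter subtag directly after the language (as in sr-Latn-RS), which does not cover every tag shape BCP 47 allows.

```javascript
// Hypothetical helper: returns lookup candidates for one language tag,
// most specific first: the full tag, the tag without its script subtag,
// and finally the bare language.
function candidateTags(tag) {
  const subtags = tag.toLowerCase().split("-");
  const candidates = [subtags.join("-")];
  // Assumption: a four-letter second subtag is a script (e.g. "latn", "hant").
  if (subtags.length > 1 && /^[a-z]{4}$/.test(subtags[1])) {
    candidates.push([subtags[0], ...subtags.slice(2)].join("-"));
  }
  candidates.push(subtags[0]);
  // Remove duplicates while keeping the order.
  return [...new Set(candidates)];
}
```

For "sr-Latn-RS" this yields sr-latn-rs, sr-rs, sr, so a mapping that only knows sr-rs or sr would still match.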

dwitte@gmail.com

Comment 130

•

14 years ago

Should we get a followup bug for removing the 'general.useragent.locale' pref from all.js and nuking the corresponding code/API in nsHttpHandler?

Axel Hecht [:Pike]

Comment 131

•

14 years ago

We don't have to have some pref to select the chrome locale, and I don't see a good argument for dropping this one.

dwitte@gmail.com

Comment 132

•

14 years ago

Sorry, I don't follow -- you're saying it's unnecessary but you want to keep it? What useful information does it provide?

Dão Gottwald [:dao]

Comment 133

•

14 years ago

(In reply to comment #130)
> Should we get a followup bug for [...] nuking the corresponding code/API in nsHttpHandler?

Reşat SABIQ (Reshat)

Comment 134

•

14 years ago

(In reply to comment #116)
> That bug basically means decreased usability for people on our website, as we
> can't offer the correct locale download matching the browser version they are
> using right now any more. 

It would also be more convenient if people had their name, and their preferred language written on their foreheads. Yet, somehow, i don't think i would want to be one of those people, and i think the majority of people wouldn't be either. It's not all about convenience and usability. There are trade-offs, and cost-benefit analyses involved.

(In reply to comment #130)
> Should we get a followup bug for removing the 'general.useragent.locale' pref
> from all.js and nuking the corresponding code/API in nsHttpHandler?

I think the pref might be used by some add-ons that provide UI for switching between more than 2 langpacks. Not sure if this is still workable in FF 4... That said, http should no longer have anything to do w/ this pref. If it stays, it should only be for manual or addon-based UI locale manipulation.

Daniel C

Comment 135

•

14 years ago

I don't understand comment 131 or comment 133.

Axel Hecht [:Pike]

Comment 136

•

14 years ago

(In reply to comment #131)
> We don't have to have some pref to select the chrome locale, and I don't see a
> good argument for dropping this one.

Can't type.

We do have to have some pref to select the chrome locale, and I don't see a good argument for dropping g.u.locale.

dwitte@gmail.com

Comment 137

•

14 years ago

It's now misnamed, that's all. We can leave the name as-is but we should probably remove the API for reading it on nsHttpHandler, because that doesn't belong there anymore.

Benjamin Smedberg

Comment 138

•

14 years ago

Please wait until after we branch.

Attachment #462663 - Flags: approval2.0? → approval2.0-

Dão Gottwald [:dao]

Updated

•

13 years ago

Depends on: post2.0

(no longer active)

Comment 139

•

13 years ago

I do not feel comfortable taking this on cedar. Please land this on mozilla-central when it's ready.

Whiteboard: [defective-privacy] [has workaround, comment #59] → [defective-privacy] [has workaround, comment #59][not-ready-for-cedar]

Dão Gottwald [:dao]

Comment 140

•

13 years ago

http://hg.mozilla.org/mozilla-central/rev/ead683169ef2

Status: NEW → RESOLVED

Closed: 13 years ago

Component: Security → DOM

QA Contact: toolkit → general

Resolution: --- → FIXED

Summary: Don't reveal UI language to site/page → Change navigator.language to use Accept-Language instead of the UI language

Target Milestone: --- → mozilla2.2
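
As a rough illustration of what the new summary describes, the value a page sees can be thought of as the first entry of the Accept-Language list. The sketch below is not the landed patch: firstAcceptLanguage is a hypothetical name, and uppercasing two-letter subtags after the first is an assumption based on the "uppercasing" mentioned in the patch 6 title.

```javascript
// Hypothetical sketch: derive a navigator.language-style value from an
// Accept-Language list such as "de-de, de, en-us".
function firstAcceptLanguage(acceptLanguages) {
  // Take the first entry and drop any ";q=..." parameter.
  const first = acceptLanguages.split(",")[0].replace(/;.*/, "").trim();
  if (!first) return "";
  return first
    .split("-")
    // Assumption: two-letter subtags after the first are region codes
    // and get uppercased ("de-de" -> "de-DE"); everything else is kept.
    .map((subtag, i) => (i > 0 && subtag.length === 2 ? subtag.toUpperCase() : subtag))
    .join("-");
}
```

Under these assumptions a list of "de-de, de, en-us" gives de-DE, regardless of which UI locale the build uses.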

Dão Gottwald [:dao]

Updated

•

13 years ago

No longer depends on: 580032

Dão Gottwald [:dao]

Updated

•

13 years ago

Whiteboard: [defective-privacy] [has workaround, comment #59][not-ready-for-cedar] → [defective-privacy]

Dão Gottwald [:dao]

Updated

•

13 years ago

No longer blocks: 71569

Ben Bucksch (:BenB)

Assignee

Updated

•

13 years ago

Summary: Change navigator.language to use Accept-Language instead of the UI language → Don't reveal UI language to site/page -- Change navigator.language to use Accept-Language instead of the UI language

Ben Bucksch (:BenB)

Assignee

Comment 142

•

13 years ago

Thanks, Dao, for committing!

Ben Bucksch (:BenB)

Assignee

Comment 143

•

13 years ago

(hihi. Almost 10 years after my first patch here. Do I get a prize? :) )

Dão Gottwald [:dao]

Updated

•

13 years ago

Blocks: 646428

shawn.sumin

Comment 144

•

13 years ago

@Ben: You get a Cookie (Brand name: Privacy Cookies). Congratz.

Ben Bucksch (:BenB)

Assignee

Comment 145

•

13 years ago

nomnomnom

Jorge Villalobos [:jorgev] (he/him)

Comment 146

•

13 years ago

sheppy, I think this might be important for add-on developers and worthwhile documenting. Thanks!

Keywords: dev-doc-needed

Eric Shepherd [:sheppy]

Comment 147

•

13 years ago

Documentation updated:

https://developer.mozilla.org/en/DOM/window.navigator.language

Also mentioned on Firefox 5 for developers.

Keywords: dev-doc-needed → dev-doc-complete

shawn.sumin

Updated

•

13 years ago

tracking-firefox5: --- → ?

Asa Dotzler [:asa]

Comment 148

•

13 years ago

not interested for 5.

tracking-firefox5: ? → -

Eric Shepherd [:sheppy]

Comment 149

•

13 years ago

I thought this landed already on Aurora. Did it not?

Benjamin Smedberg

Comment 150

•

13 years ago

Looks like it did, but release drivers still have no reason to track it in particular at this point.

shawn.sumin

Updated

•

13 years ago

tracking-firefox6: --- → ?

Dão Gottwald [:dao]

Updated

•

13 years ago

tracking-firefox6: ? → ---

Matthew N. [:MattN]

Updated

•

12 years ago

Blocks: 418485

Reşat SABIQ (Reshat)

Updated

•

7 years ago

Blocks: 1386461

Nobody; OK to take it and work on it

Updated

•

5 years ago

Component: DOM → DOM: Core & HTML

Proposed Fix for UA-string. Also contains Win fix for bug 57555. | 23 years ago | Ben Bucksch (:BenB) | 4.30 KB, patch
Patch 3: Change navigator.language to use Accept-Language | 14 years ago | Ben Bucksch (:BenB) | 1.73 KB, patch | bzbarsky: review-
Patch 4: Change navigator.language to use Accept-Language | 14 years ago | Ben Bucksch (:BenB) | 2.32 KB, patch
Patch 5: Change navigator.language to use Accept-Language | 14 years ago | Ben Bucksch (:BenB) | 3.81 KB, patch | bzbarsky: review-
Patch 6: Change navigator.language to use Accept-Language, using tokenizer for uppercasing | 14 years ago | Ben Bucksch (:BenB) | 4.22 KB, patch | bzbarsky: review+
Patch 7: Change navigator.language to use Accept-Language | 14 years ago | Ben Bucksch (:BenB) | 4.22 KB, patch | BenB: review+, jst: superreview+, benjamin: approval2.0-