PF character encoding: UTF-8 vs iso-8859-1

I manage two generally parallel, unmodified PF 7.00.05 sites, one in English, one in Polish. Both are operating normally.

In particular, the Polish site correctly renders Polish diacritic characters as well as a sprinkling of "special" German and Cyrillic characters. When I examine the PF-generated headers, I see

<meta http-equiv='Content-Type' content='text/html; charset=utf-8' />

exactly as I would expect. Unicode is the way to go.

The English site renders English characters correctly, of course, and it also correctly renders a similar sprinkling of "special" German and Cyrillic characters. When I examine the PF-generated headers, I see

<meta http-equiv='Content-Type' content='text/html; charset=iso-8859-1' />

which is not what I expect, since iso-8859-1, as far as I can tell, cannot represent Polish diacritics or any Cyrillic. (I suppose I must except the Polish letters without diacritics and the Cyrillic characters that look like Latin ones, but those overlaps are beside the point.)
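
To show what I mean by "incapable", here is a minimal sketch, assuming PHP 7 or later with the mbstring extension loaded. The function name fitsLatin1 and the three sample characters are my own, not anything from PF. It round-trips a UTF-8 string through iso-8859-1 and reports whether the string survives.

```php
<?php
// Hypothetical helper (not part of PF): true if every character of a
// UTF-8 string can be represented in iso-8859-1.
function fitsLatin1(string $utf8): bool
{
    // Convert to iso-8859-1; characters outside its repertoire are
    // replaced by mbstring's substitute character.
    $latin1 = mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');

    // Convert back; a lossy first step means the result differs.
    return mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1') === $utf8;
}

// Sample characters chosen for illustration only.
var_dump(fitsLatin1("\u{00FC}")); // German u-umlaut
var_dump(fitsLatin1("\u{0142}")); // Polish l with stroke
var_dump(fitsLatin1("\u{0416}")); // Cyrillic Zhe
```

As I understand it, only the German character should pass, which is why the iso-8859-1 declaration on the English site puzzles me.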

Q1: On a PF page declared as iso-8859-1, how is it that the Polish diacritics and Cyrillic characters render correctly? Could the browser be reading a BOM, or analyzing the actual content, and overriding the declared charset? Or is something else going on? Or am I completely misunderstanding character encoding?

Q2: Is there a good technical reason that the standard English 7.00.05 installation doesn't use utf-8? Or is there simply no pressing reason to convert it?

Q3: (Bonus question) Is there a quick and easy PHP method for determining the encoding of a particular file?
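
For Q3, the best I have come up with so far is the rough sketch below. It assumes the mbstring extension is loaded. The function names and the returned labels are my own invention. It can only tell a BOM, pure ASCII, and valid UTF-8 apart; it cannot positively identify iso-8859-1, because any byte sequence is valid in that encoding.

```php
<?php
// Hypothetical helper: classify raw bytes by the simplest available evidence.
function guessEncoding(string $bytes): string
{
    // A UTF-8 byte order mark is the three bytes EF BB BF at the start.
    if (strncmp($bytes, "\xEF\xBB\xBF", 3) === 0) {
        return 'UTF-8 (with BOM)';
    }
    // No byte above 0x7F: plain ASCII, valid in both utf-8 and iso-8859-1.
    if (preg_match('/[\x80-\xFF]/', $bytes) !== 1) {
        return 'ASCII';
    }
    // High bytes that form well-formed UTF-8 sequences.
    if (mb_check_encoding($bytes, 'UTF-8')) {
        return 'UTF-8';
    }
    // Anything else is some 8-bit encoding; which one cannot be proven.
    return 'unknown 8-bit (possibly iso-8859-1)';
}

// Hypothetical wrapper: read a file and classify its contents.
function guessFileEncoding(string $path): string
{
    $bytes = file_get_contents($path);
    if ($bytes === false) {
        throw new RuntimeException("Cannot read $path");
    }
    return guessEncoding($bytes);
}
```

Calling guessFileEncoding() on one of the locale files from each site would at least show whether the files themselves differ, independent of the header. Is there something better?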



author hen3ry