UTF8/ANSI encoding bug

Post by athei »

1. Create a new PHP file, echo some UTF-8 characters, and save it as UTF-8 without BOM.
In the browser these characters display correctly.
2. Delete the UTF-8 characters and echo plain non-UTF-8 (ASCII) text instead. [CTRL+S]
3. Close and reopen the file: it is now recognized as ANSI.
4. Echo some UTF-8 characters again. [CTRL+S]
In the browser these characters do not display correctly, because the file is treated as ANSI, not UTF-8.
IMHO this is a bug, because I shouldn't have to go to "File/Save as" and select UTF-8 from the encoding list every time I close and reopen a file.
This also happens after I close and start the program again.
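
A likely explanation for step 3, offered as an illustration rather than a description of the editor's actual code: once the file holds only ASCII text and has no BOM, its bytes are identical whether it is saved as UTF-8 or as ANSI, so nothing in the file says which encoding was intended. The Python sketch below assumes "ANSI" means Windows-1250 (the usual code page on a Polish system); the sample strings are made up for the demonstration.

```python
# Sketch: why an ASCII-only file without a BOM cannot be told apart from ANSI.
# Assumption: "ANSI" is Windows-1250 (cp1250); the real code page may differ.
import codecs

ascii_source = "<?php\necho 'test';\n?>"
polish_source = "<?php\necho 'ęółśążźćń';\n?>"

# Pure ASCII text encodes to exactly the same bytes in both encodings.
same_bytes = ascii_source.encode("utf-8") == ascii_source.encode("cp1250")

# Non-ASCII text does not: UTF-8 uses two bytes per Polish letter, cp1250 one.
differs = polish_source.encode("utf-8") != polish_source.encode("cp1250")

# A BOM would mark the file as UTF-8, but the file was saved without one.
has_bom = ascii_source.encode("utf-8").startswith(codecs.BOM_UTF8)

print(same_bytes, differs, has_bom)  # True True False
```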
Last edited by athei on Tue Jul 28, 2009 11:16 pm, edited 1 time in total.

Post by Karlis »

Karlis Blumentals
Blumentals Software
www.blumentals.net

Post by athei »

But I do declare the encoding in this file:

Code:

header('Content-Type: text/html; charset=utf-8');
and when I add UTF-8 characters (for example: ęółśążźćń) again, UTF-8 isn't detected and the file stays ANSI, so the characters are not displayed properly.
This doesn't happen in a program like Notepad++, where UTF-8 is well supported and recognized all the time, and I don't have to select the encoding manually.
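
To show what goes wrong in the browser, here is a small Python sketch. It assumes the editor, having classified the file as ANSI, writes the Polish letters as Windows-1250 bytes (an assumption; the actual code page depends on the system), while the header() call still tells the browser to decode the page as UTF-8.

```python
# Sketch: bytes saved as "ANSI" but served with charset=utf-8.
# Assumption: ANSI here means Windows-1250 (cp1250).
text = "ęółśążźćń"

saved_as_ansi = text.encode("cp1250")  # what ends up on disk
saved_as_utf8 = text.encode("utf-8")   # what the header promises

# The browser follows the header and decodes as UTF-8. The cp1250 bytes are
# not valid UTF-8, so malformed sequences turn into U+FFFD replacement marks.
seen_in_browser = saved_as_ansi.decode("utf-8", errors="replace")

print(len(saved_as_ansi), len(saved_as_utf8))  # 9 18
print(seen_in_browser == text)                 # False
print("\ufffd" in seen_in_browser)             # True
```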

Post by Karlis »

1) You say the file contains Unicode symbols and they are not detected (this should be impossible), so please send me the file (software[ at ]latnet.lv). By Unicode symbols I mean non-ASCII characters, which UTF-8 stores as two or more bytes, so each one shows up as several characters in a non-Unicode program; these should be detected 100%. I tried creating an empty file containing only "ęółśążźćń", saved it as UTF-8 without BOM, and it was detected properly. So these symbols DO get detected. They will not be detected if you entered and saved them from a non-Unicode program.
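
The detection described here can be approximated by checking whether the raw bytes form valid UTF-8. The sketch below is only a guess at that kind of heuristic, not the editor's actual implementation; the function name `guess_encoding` and its return labels are made up for illustration, and cp1250 stands in for ANSI.

```python
# Sketch of BOM-less UTF-8 detection by validation (hypothetical helper,
# not the editor's real algorithm).
import codecs


def guess_encoding(data: bytes) -> str:
    """Return 'utf-8-bom', 'utf-8', 'ascii' or 'ansi' for raw file bytes."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-bom"
    if all(byte < 0x80 for byte in data):
        # Only ASCII: valid in both encodings, so no evidence either way.
        return "ascii"
    try:
        data.decode("utf-8")  # strict mode raises on any malformed sequence
    except UnicodeDecodeError:
        return "ansi"
    return "utf-8"


print(guess_encoding("ęółśążźćń".encode("utf-8")))   # utf-8
print(guess_encoding("ęółśążźćń".encode("cp1250")))  # ansi
print(guess_encoding(b"<?php echo 'test'; ?>"))      # ascii
```

The third case is the one from this thread: with only ASCII in the file, a detector of this kind has nothing to go on and must fall back to a default, which here appears to be ANSI.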

2) A workaround in your case could be to place a

Code:

<meta http-equiv="content-type" content="text/html; charset=utf-8" />
tag in the file (even inside a comment will do), as

Code:

header('Content-Type: text/html; charset=utf-8');

is currently not supported.
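
For illustration, a detector that honours the meta tag could look for a charset declaration near the start of the file. This is a sketch under assumptions: the function name `declared_charset`, the regular expression, and the 1024-byte scan window are all invented here, and nothing in this thread says how the editor actually parses the tag.

```python
# Sketch: find a charset declared in a <meta> tag (hypothetical helper).
import re
from typing import Optional

# Matches <meta ... charset=NAME, case-insensitively, on raw bytes.
META_CHARSET = re.compile(
    rb"<meta[^>]*charset\s*=\s*[\"']?([A-Za-z0-9_-]+)", re.IGNORECASE
)


def declared_charset(data: bytes, window: int = 1024) -> Optional[str]:
    """Return the lowercased charset named in a meta tag, or None."""
    match = META_CHARSET.search(data[:window])
    return match.group(1).decode("ascii").lower() if match else None


with_meta = (
    b"<?php /* <meta http-equiv=\"content-type\" "
    b"content=\"text/html; charset=utf-8\" /> */ echo 'test'; ?>"
)
with_header = b"<?php header('Content-Type: text/html; charset=utf-8'); ?>"

print(declared_charset(with_meta))    # utf-8
print(declared_charset(with_header))  # None
```

The second call returns None because this sketch, like the behaviour described above, only looks at the meta tag and ignores the PHP header() call.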

Post by athei »

Karlis wrote:
Code:

    header('Content-Type: text/html; charset=utf-8');
is currently not supported.
- And that's the reason!

Karlis wrote:
1) As you are saying that you have Unicode symbols included in the file and they are not detected (this should be impossible),
- I'm afraid you don't understand me.
Please read my first post carefully. I have

Code:

<?php
header('Content-Type: text/html; charset=utf-8');
echo 'ęółśążźćń';
?>
1. Save as UTF-8 without BOM. In the browser these characters display correctly.
2. Delete the UTF-8 characters and echo plain ASCII characters instead, so the file looks like

Code:

<?php
header('Content-Type: text/html; charset=utf-8');
echo 'test';
?>
[CTRL+S] or save. The file is still shown at the bottom as UTF-8 without BOM.
3. Close and reopen the file (or restart the program): the file is now recognized as ANSI.
4. Echo some UTF-8 characters again, so the file looks like

Code:

<?php
header('Content-Type: text/html; charset=utf-8');
echo 'ęółśążźćń';
?>
[CTRL+S] (or save and restart)
In the browser these characters do not display correctly, because the file is treated as ANSI, not UTF-8. I then have to use "Save as" from the menu and select the UTF-8 encoding manually.

I hope you understand me now.
So the solution is to have the meta charset tag or to save the file manually from the menu.

Post by Karlis »

I got you now ;) Well, we could try to detect UTF-8 just from "charset=utf-8", but I wonder whether that would cause problems, since a non-UTF-8 file may happen to contain the string "charset=utf-8". So I am not sure whether it would be a good idea.
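
One way to picture that risk, and a possible guard against it, is sketched below in Python. This is not something proposed in the thread: the helper name `trust_utf8_declaration` is hypothetical, and cp1250 is assumed as the ANSI code page. The idea is to accept a "charset=utf-8" string only when the file's bytes also decode as valid UTF-8.

```python
# Sketch: trust "charset=utf-8" only if the bytes really are valid UTF-8.
# Hypothetical helper; cp1250 is an assumed stand-in for ANSI.
def trust_utf8_declaration(data: bytes) -> bool:
    """True if the file declares charset=utf-8 and decodes as UTF-8."""
    if b"charset=utf-8" not in data.lower():
        return False
    try:
        data.decode("utf-8")  # strict mode rejects malformed sequences
    except UnicodeDecodeError:
        return False
    return True


source = "<?php\nheader('Content-Type: text/html; charset=utf-8');\necho '{}';\n?>"

ascii_file = source.format("test").encode("utf-8")
utf8_file = source.format("ęółśążźćń").encode("utf-8")
ansi_file = source.format("ęółśążźćń").encode("cp1250")

print(trust_utf8_declaration(ascii_file))  # True
print(trust_utf8_declaration(utf8_file))   # True
print(trust_utf8_declaration(ansi_file))   # False
```

Under this sketch an ASCII-only file with the declaration would be opened as UTF-8, which is harmless because its bytes are the same either way, while an ANSI file that merely contains the string is rejected as soon as it holds bytes that are not valid UTF-8.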