Plugins: How to set encoding for TStringList -> SaveToFile?

Postby pmk65 » Sun Mar 12, 2017 4:15 pm

I'm having problems setting the encoding for ANSI/UTF-8 documents when saving with TStringList -> SaveToFile.

I tried various approaches, but none of them worked.
(Such as adding a 2nd parameter to SaveToFile, or setting TStringList.DefaultEncoding / TStringList.Encoding.)

So how do I set the encoding for TStringList SaveToFile, so that I can save in a specific ANSI/UTF-8 format?

Text in editor before I copy it to TStringList and save it using SaveToFile:
Code:
.test {
   content: "bøllehår";
}


Then when I load the content back into TStringList using LoadFromFile and paste it back in editor, it looks like this:
Code:
.test {
   content: "b�lleh�r";
}


Both extended characters ("ø" and "å") are turned into the SAME 3-byte sequence ("�", hex: EF BF BD, which is the UTF-8 encoding of the Unicode replacement character U+FFFD). So it's not possible to recover the original characters "manually" either.

It doesn't matter whether the content in the editor is ANSI or UTF-8. Both produce the same broken result.
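
One plausible reading of this symptom, not confirmed anywhere in this thread, is that the text reaches the disk as single-byte ANSI and is then decoded as UTF-8 on the way back. A minimal sketch in plain JavaScript (Node.js, not FastScript) shows that such a round trip would produce exactly the bytes described above. The helper names are made up for illustration, and "latin1" only stands in for the ANSI code page:

```javascript
// Hypothetical helper: the bytes of a string stored as single-byte ANSI.
// "latin1" is an assumption standing in for the real ANSI code page.
function ansiBytesOf(text) {
  return Buffer.from(text, "latin1");
}

// Hypothetical helper: decode bytes as UTF-8.
// Byte sequences that are invalid UTF-8 are replaced with U+FFFD.
function decodeAsUtf8(bytes) {
  return new TextDecoder("utf-8").decode(bytes);
}

const sample = "b\u00F8lleh\u00E5r"; // "bøllehår"

const ansiBytes = ansiBytesOf(sample);
const misread = decodeAsUtf8(ansiBytes);
const resavedHex = Buffer.from(misread, "utf8").toString("hex");

console.log(ansiBytes.toString("hex")); // 62f86c6c6568e572
console.log(misread === "b\uFFFDlleh\uFFFDr"); // true
console.log(resavedHex); // 62efbfbd6c6c6568efbfbd72
```

Neither F8 ("ø") nor E5 followed by "r" ("å") is a valid UTF-8 sequence, so both collapse into the same EF BF BD once the text is saved again, and the information about which character was there is gone.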

Re: Plugins: How to set encoding for TStringList -> SaveToFi

Postby pmk65 » Wed Mar 15, 2017 2:34 am

I tried using TFileStream instead, but I can only get it to write 1 char/byte to the file no matter what I do.

My code:
Code:
var len = Length(contents);
FS = new TFileStream(filename, fmCreate);
FS.Size = len;
FS.Write(contents, len);
delete FS;

Re: Plugins: How to set encoding for TStringList -> SaveToFi

Postby Aivars » Thu Mar 16, 2017 4:18 pm

Don't set FS.Size, just use FS.Write. Since "contents" is a UTF-16 string internally, you will need to use 2 x length to write it correctly, but the result still won't be UTF-8, which is probably what you need. I could add extra method(s) to TStrings for saving in various encodings and/or export this function for plugin scripts:
function SaveStringToFile(s, FName: string; Encoding: TEncoding): boolean;
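
To illustrate the byte counts involved, here is a minimal sketch in plain JavaScript (Node.js, not FastScript). The function name byteLengths is hypothetical, and "latin1" is only an assumption standing in for ANSI:

```javascript
// Hypothetical helper: how many bytes the same text needs in each encoding.
function byteLengths(text) {
  return {
    chars: text.length,                          // UTF-16 code units
    utf16: Buffer.from(text, "utf16le").length,  // 2 bytes per code unit
    utf8: Buffer.from(text, "utf8").length,      // 1 to 4 bytes per character
    ansi: Buffer.from(text, "latin1").length,    // 1 byte per character
  };
}

const sizes = byteLengths("b\u00F8lleh\u00E5r"); // "bøllehår"
console.log(sizes); // { chars: 8, utf16: 16, utf8: 10, ansi: 8 }

// The first two characters as raw UTF-16LE bytes: every second byte is 00.
const firstTwoHex = Buffer.from("b\u00F8", "utf16le").toString("hex");
console.log(firstTwoHex); // 6200f800
```

Writing only "length" bytes of the UTF-16 data therefore stores just the first half of the text, and writing all 2 x length bytes gives a UTF-16 file, not a UTF-8 one.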
Blumentals Software Programmer

Re: Plugins: How to set encoding for TStringList -> SaveToFi

Postby pmk65 » Thu Mar 16, 2017 8:22 pm

The TFileStream was just an experiment to see if that would help.
Thanks for the info about UTF-16. That explains a lot.

Adding an encoding method would really help. (In newer versions of Delphi, both "SaveToFile" and "LoadFromFile" have a 2nd parameter for setting the encoding. So there might be a need for a "Load" command with encoding too.)
It seems really stupid that LoadFromFile and SaveToFile were implemented without a way to set the file encoding.

So "SaveStringToFile" would be a new command, where you don't have to set up a TStringList first?

Currently I "solved" the problem in my Beautifier and CSSComb plugins, by converting all Unicode chars to Entities before saving. (Not an optimal solution, but at least it works.)

Re: Plugins: How to set encoding for TStringList -> SaveToFi

Postby Aivars » Fri Mar 17, 2017 1:17 pm

The problem is that FastScript does not support optional parameters, so it would have to be a new method, otherwise just adding the second parameter would break the old scripts.

Yes, SaveStringToFile will be a new global function, no need for any objects. I've already added this to my todo list.

Re: Plugins: How to set encoding for TStringList -> SaveToFi

Postby pmk65 » Tue Mar 21, 2017 7:53 pm

I have been experimenting a bit more with "CreateOleObject", and came up with these two functions:

Code:
/**
* Save data to file
*
* @param  string   file The path/name of file
* @param  string   data The content to save
* @param  int      enc The document encoding (WeBuilder numeric format)
*
* @return void
*/
function SaveToFile(file, data, enc) {

    switch (enc) {
        case 0:   charset = "iso-8859-1";   // ANSI
        case 1,2: charset = "utf-8";        // UTF-8 & UTF-8 without BOM
        default:  charset = "utf-16";       // UTF-16
    }

    var stream = CreateOleObject("ADODB.Stream");
    stream.Open;
    stream.Position = 0;
    stream.CharSet = charset;
    stream.Type = 2;              // 1 = binary, 2 = text
    stream.WriteText(data);
    stream.SaveToFile(file, 2);    // 1 don't overwrite, 2= overwrite
    stream.Close;
}

/**
* Load data from file
*
* @param  string   file The path/name of file
* @param  int      enc The document encoding (WeBuilder numeric format)
*
* @return string   the content of the file
*/
function FileGetContents(file, enc) {

    switch (enc) {
        case 0:   charset = "iso-8859-1";   // ANSI
        case 1,2: charset = "utf-8";        // UTF-8 & UTF-8 without BOM
        default:  charset = "utf-16";       // UTF-16
    }

    var stream = CreateOleObject("ADODB.Stream");
    stream.Open;
    stream.Position = 0;
    stream.CharSet = charset;
    stream.Type = 2;              // 1 = binary, 2 = text
    stream.LoadFromFile(file);
    var data = stream.ReadText();
    stream.Close;
    return data;
}


With those, I can load/save the Editor content in ANSI, UTF-8 and UTF-16 formats.
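
A note for anyone porting these functions outside WeBuilder: in standard JavaScript a switch without break falls through to the following cases, and "case 1,2:" only matches 2. Whether FastScript's JScript dialect behaves the same way is not stated in this thread. A minimal sketch of the same mapping that does not depend on either behavior (plain JavaScript; the function name charsetForEncoding is hypothetical):

```javascript
// Map the numeric document encoding used in the functions above
// to an ADODB.Stream charset name, without relying on switch semantics.
function charsetForEncoding(enc) {
  if (enc === 0) {
    return "iso-8859-1"; // ANSI
  }
  if (enc === 1 || enc === 2) {
    return "utf-8";      // UTF-8 & UTF-8 without BOM
  }
  return "utf-16";       // everything else is treated as UTF-16
}

console.log(charsetForEncoding(0)); // iso-8859-1
console.log(charsetForEncoding(2)); // utf-8
console.log(charsetForEncoding(3)); // utf-16
```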