How can I save a TStringList object to a file (Delphi XE2) as UTF-8 without a BOM?

When I save the contents of a TStringList object to a file, the file is correctly encoded as UTF-8, but by default it is written with a BOM.

My code is:

myFile := TStringList.Create;
try
  myFile.Text := myData;
  // Saves as UTF-8; by default the BOM (preamble) is written as well
  myFile.SaveToFile('myfile.dat', TEncoding.UTF8);
finally
  FreeAndNil(myFile);
end;

In this example, the file "myfile.dat" is reported as having "UTF-8 BOM" encoding.

How can I save the file without BOM?

Clywd answered 28/8, 2015 at 15:33 Comment(0)
P
14

You simply have to set the property TStrings.WriteBOM to False before saving.

The documentation tells us about this:

Will cause SaveToStream or SaveToFile to write a BOM.

Set WriteBOM to True to cause SaveToStream to write a BOM (byte-order mark) to the stream and to cause SaveToFile to write a BOM to the file.
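
A minimal sketch of this approach, assuming Delphi XE or later (where TStrings.WriteBOM is available) and the question's file name. The program name and the sample text are placeholders chosen for illustration:

```delphi
program SaveUtf8NoBom; // hypothetical program name

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes;

var
  Lines: TStringList;
begin
  Lines := TStringList.Create;
  try
    // Sample text (an assumption); #$00E9 is a non-ASCII character,
    // so the output is distinguishable from plain ASCII/ANSI
    Lines.Text := 'caf'#$00E9;
    // Tell SaveToFile/SaveToStream not to emit the encoding preamble (BOM)
    Lines.WriteBOM := False;
    Lines.SaveToFile('myfile.dat', TEncoding.UTF8);
  finally
    Lines.Free;
  end;
end.
```

Setting WriteBOM immediately before the SaveToFile call keeps the intent explicit, regardless of what was done with the list earlier.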

Pozzy answered 29/8, 2015 at 12:51 Comment(3)
Thanks @sir-rufo, this is really a good solution. But it only works with a new file. It does not work when a file is read before saving (LoadFromFile...). Maybe it is a problem with Delphi XE2. I will try to update the Delphi version. This is so strange. - Clywd
I did not get the point, because I do not see any line of code (XE8) where the property WriteBOM is set, except in the constructor and in AssignTo, which is as expected. The default value is True and will not change unless you change it. - Pozzy
Just an FYI, TStrings.WriteBOM was added in Delphi XE, but TStrings first gained TEncoding support in D2009. But SirRufo is correct, WriteBOM will always be True, even after a load, unless you explicitly set it to False. The TStrings.LoadFrom...() methods do not change the value of WriteBOM, but they do change the value of the TStrings.Encoding property. - Pyx
E
10

You can achieve this by creating your own encoding class descended from TUTF8Encoding and overriding the GetPreamble method:

type
  TUTF8EncodingNoBOM = class(TUTF8Encoding)
  public
    function GetPreamble: TBytes; override;
  end;

function TUTF8EncodingNoBOM.GetPreamble: TBytes;
begin
  // An empty preamble means no BOM is written
  SetLength(Result, 0);
end;
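
A possible way to use this class, sketched as a complete console program. The program name and sample text are placeholders, and the sketch assumes that an encoding instance you create yourself (unlike the shared TEncoding.UTF8) has to be freed by the caller:

```delphi
program SaveWithCustomEncoding; // hypothetical program name

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes;

type
  TUTF8EncodingNoBOM = class(TUTF8Encoding)
  public
    function GetPreamble: TBytes; override;
  end;

function TUTF8EncodingNoBOM.GetPreamble: TBytes;
begin
  // An empty preamble means no BOM is written
  SetLength(Result, 0);
end;

var
  Lines: TStringList;
  Enc: TEncoding;
begin
  // Own instance, so it is freed explicitly below (assumption, see above)
  Enc := TUTF8EncodingNoBOM.Create;
  try
    Lines := TStringList.Create;
    try
      Lines.Text := 'caf'#$00E9; // sample text with a non-ASCII character
      Lines.SaveToFile('myfile.dat', Enc);
    finally
      Lines.Free;
    end;
  finally
    Enc.Free;
  end;
end.
```

This variant does not depend on the WriteBOM property, which may matter on Delphi versions that predate it.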

Echovirus answered 28/8, 2015 at 15:48 Comment(6)
@Andy_D In my test using the class you provided, the final file is saved with ANSI encoding. - Clywd
@FabianoSilva Are you sure? How did you check the output encoding? - Gwen
Yes @GabrielF, in Notepad++ the output encoding is shown as ANSI. - Clywd
@FabianoSilva IIRC, by default, if a file contains only ASCII characters, Notepad++ identifies it as ANSI (in this case, the UTF-8 and ANSI bytes are identical). Try adding some special characters before saving, or change the Notepad++ configuration to identify it as UTF-8 without BOM (I don't remember where you change that). - Gwen
@FabianoSilva I think it's under Settings -> Preferences -> New document. If you choose UTF-8 without BOM as the encoding and check "Apply to opened ANSI files", Notepad++ will identify plain ASCII files as UTF-8. - Gwen
@Gwen Notepad++ deceived me. Closing and reopening the file made it show the correct encoding. Thanks. - Clywd
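
Since an editor's encoding detection can be misleading, as the comments above show, one option is to inspect the first bytes of the file directly. The sketch below checks for the UTF-8 BOM bytes EF BB BF; HasUtf8Bom and the program name are hypothetical names introduced here, and the file name is the one from the question:

```delphi
program CheckBom; // hypothetical program name

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes;

// Hypothetical helper: True if the file starts with the UTF-8 BOM (EF BB BF)
function HasUtf8Bom(const FileName: string): Boolean;
var
  Stream: TFileStream;
  Head: array[0..2] of Byte;
begin
  Stream := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
  try
    // Files shorter than three bytes cannot contain the BOM
    Result := (Stream.Read(Head, SizeOf(Head)) = SizeOf(Head)) and
      (Head[0] = $EF) and (Head[1] = $BB) and (Head[2] = $BF);
  finally
    Stream.Free;
  end;
end;

begin
  if HasUtf8Bom('myfile.dat') then
    Writeln('myfile.dat starts with a UTF-8 BOM')
  else
    Writeln('myfile.dat has no UTF-8 BOM');
end.
```

This only tells you whether a BOM is present; it does not validate that the rest of the file is well-formed UTF-8.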
