Getting "ÿþI" as output data when reading from a .log file using Delphi
G

3

6

I am trying to read a .log file and process its contents. The log file is created by another application. When I read it with Readln in Delphi and display the contents in a memo, I get only one line (ÿþI), even though the file has over 6000 lines.

    procedure TForm1.Button1Click(Sender: TObject);
    var
        F : TextFile;
        s : string;
    begin
        AssignFile(F, 'data.log');
        Reset(F);
        try
            while not Eof(F) do
            begin
                Readln(F, s);
                Memo1.Lines.Add(s);
            end;
        finally
            CloseFile(F);  // release the file handle even if reading fails
        end;
    end;

Does anyone know what the problem might be?

Gershom answered 6/10, 2011 at 11:59 Comment(2)
Probably an encoding error. You should check what encoding the file is in and adapt your program to handle that (or convert the file, whichever is easier in the long run). - Farl
It's not foolproof, but any decent text editor, and even some non-decent ones (like Notepad), will attempt to guess encodings. So if the data looks fine in Notepad, click File > Save As and see what it guessed. That's a good initial guess, anyway. - Lush
A
4

As Michael said, you are dealing with a UTF-16 encoded file, so you will have to load and decode it manually. There are various WideString-based TStringList-like classes floating around online, and Borland has its own implementation in the WideStrings unit. Try using one of them instead of Pascal file I/O, e.g.:

procedure TForm1.Button1Click(Sender: TObject);
var
  SL: TWideStringList;
  I: Integer;
begin
  SL := TWideStringList.Create;
  try
    SL.LoadFromFile('data.log');
    Memo1.Lines.BeginUpdate;
    try
      for I := 0 to SL.Count-1 do
        Memo1.Lines.Add(SL[I]);
    finally
      Memo1.Lines.EndUpdate;
    end;
  finally
    SL.Free;
  end;
end; 

Or:

uses
  .., WideStrings;

procedure TForm1.Button1Click(Sender: TObject);
var
  SL : TWideStringList;
begin
  SL := TWideStringList.Create;
  try
    SL.LoadFromFile('data.log');
    Memo1.Lines.Assign(SL);
  finally
    SL.Free;
  end;
end; 

Alternatively, install a copy of TNTWare or TMS, which both have Unicode-enabled components. Then you should be able to just LoadFromFile() the .log file directly into whichever Unicode Memo component you choose to use.

Allegro answered 6/10, 2011 at 20:11 Comment(3)
TNT Unicode Controls (the "free" version) is hosted here: tntunicodecontrols - Doriandoric
Or use TWideStringList from the standard WideStrings.pas (included since at least D2006). However, it doesn't seem to handle the BOM. - Zanezaneski
TWideStringList didn't use to be in its own standalone unit, which is why I didn't mention it before. It used to be part of another Borland internet framework that may or may not be installed. Good to know that they eventually separated it out for common use. - Allegro
S
4

You're dealing with a UTF-16 file (as evidenced by the first two characters), and Delphi 2007 is not prepared for that, so it stops reading on the first $0 byte, because Readln thinks the line ends there.

You'll need to use a different method of reading the file, and you'll have to read into a WideString (and probably convert that to a string). Since Delphi 2007 is not properly Unicode-capable, I think you'll also have to do your own line splitting, but I don't have Delphi 2007 available here, so I'm not completely certain.
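
One way to read the file into a WideString might look like the sketch below. It assumes the file is UTF-16 little-endian (the ÿþ prefix suggests so) and small enough to load in one go. LoadUtf16LEFile is a hypothetical helper name, not an RTL routine. Assigning the result to Memo1.Lines.Text lets the memo split the lines on CR/LF, so no manual line splitting is needed in this variant.

```pascal
uses
  Classes, SysUtils;

// Hypothetical helper: loads a UTF-16LE file into a WideString
// and drops a leading byte order mark if there is one.
function LoadUtf16LEFile(const FileName: string): WideString;
var
  FS: TFileStream;
  CharCount: Integer;
begin
  Result := '';
  FS := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
  try
    // Two bytes per UTF-16 code unit; a trailing odd byte is ignored.
    CharCount := Integer(FS.Size div SizeOf(WideChar));
    if CharCount = 0 then
      Exit;
    SetLength(Result, CharCount);
    FS.ReadBuffer(Result[1], CharCount * SizeOf(WideChar));
  finally
    FS.Free;
  end;
  // Strip the BOM (U+FEFF), which otherwise shows up as the first character.
  if (Length(Result) > 0) and (Result[1] = WideChar($FEFF)) then
    Delete(Result, 1, 1);
end;

procedure TForm1.Button1Click(Sender: TObject);
begin
  // Implicit WideString -> string conversion: in a non-Unicode Delphi,
  // characters outside the system code page are lost here.
  Memo1.Lines.Text := LoadUtf16LEFile('data.log');
end;
```

The conversion to string happens only at the last step, so any processing that needs the full Unicode text can work on the WideString before it reaches the memo.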

Seanseana answered 6/10, 2011 at 13:0 Comment(1)
A gross hack would be: read the entire contents into a WideString, strip the BOM, then Memo.Lines.Text := MyWideString. - Pythagorean
Z
0

As mentioned in my comment to Remy's answer, there is a TWideStrings/TWideStringList declared in WideStrings:

uses WideStrings;
//...
var
  Ws: TWideStrings;
  s: string;
  i: Integer;
begin
  Ws := TWideStringList.Create;
  try
    ws.LoadFromFile('C:\temp\UniTest.txt');
    for i := 0 to ws.Count - 1 do
    begin
      s := ws[i];
      Memo1.Lines.Add(s);
    end;
  finally
    ws.Free;
  end;
end;

Note, however, that it isn't a TStrings descendant, so it can't be directly assigned to TStrings properties like TMemo.Lines; you have to add the lines one by one.

It also doesn't seem to handle the BOM (your ÿþ) or big-endian encoding.
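
If the BOM and byte order have to be handled by hand, a minimal sketch could look like this. The names TUtf16ByteOrder, DetectUtf16ByteOrder and SwapUtf16Bytes are hypothetical helpers made up for illustration; none of them is part of WideStrings or the RTL. The first routine only inspects the first two bytes of the file, and the second swaps the bytes of each character in a WideString that was filled with raw big-endian data.

```pascal
uses
  Classes, SysUtils;

type
  TUtf16ByteOrder = (boUnknown, boLittleEndian, boBigEndian);

// Hypothetical helper: checks the first two bytes of a file for a UTF-16 BOM.
function DetectUtf16ByteOrder(const FileName: string): TUtf16ByteOrder;
var
  FS: TFileStream;
  Bom: array[0..1] of Byte;
begin
  Result := boUnknown;
  FS := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
  try
    if FS.Read(Bom, SizeOf(Bom)) = SizeOf(Bom) then
    begin
      if (Bom[0] = $FF) and (Bom[1] = $FE) then
        Result := boLittleEndian   // FF FE, shown as "ÿþ" in Windows-1252
      else if (Bom[0] = $FE) and (Bom[1] = $FF) then
        Result := boBigEndian;     // FE FF
    end;
  finally
    FS.Free;
  end;
end;

// Hypothetical helper: swaps the two bytes of every character in place,
// turning raw big-endian UTF-16 data into the little-endian layout
// that WideString uses on Windows.
procedure SwapUtf16Bytes(var S: WideString);
var
  I: Integer;
  W: Word;
begin
  for I := 1 to Length(S) do
  begin
    W := Ord(S[I]);
    S[I] := WideChar(((W shl 8) or (W shr 8)) and $FFFF);
  end;
end;
```

A file without a BOM returns boUnknown here; guessing the encoding in that case is a separate problem that this sketch does not attempt.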

Zanezaneski answered 7/10, 2011 at 0:24 Comment(3)
Actually, a TWideStringList from the WideStrings unit can be directly Assign()ed to any TStrings object (and vice versa), as TWideStringList overrides the virtual Assign() and AssignTo() methods to support exactly that. - Allegro
@Remy: Hmm, my attempt to assign Memo1.Lines := Ws; failed. I see that TWideStrings.Assign handles TStrings, but not the other way around, in Delphi 2007 at least. - Zanezaneski
Assigning a TWideStringList to a TStrings calls TStrings.Assign() first. TStrings.Assign() does not recognize TWideStringList, so TPersistent.Assign() calls TWideStringList.AssignTo() next. In D2010 at least, TWideStringList.AssignTo() recognizes TStrings. Maybe that is not the case in D2007 yet. - Allegro
