(Wide)String - storing in TFileStream, Delphi 7. What is the fastest way?

I'm using Delphi 7 (non-Unicode VCL) and need to store lots of WideStrings in a TFileStream. I can't use TStringStream because the (wide) strings are mixed with binary data; the format is designed to speed up loading and writing the data. However, I believe that the way I currently load and write the strings might be a bottleneck in my code.

Currently I write the length of a string and then write the string char by char. When loading, I first read the length and then read the string char by char.

So, what is the fastest way to save and load WideString to TFileStream?

Thanks in advance

Nashner asked 30/8, 2009 at 15:20 Comment(4)
Changing a particular area of your code because you believe it might be the bottleneck can be a huge waste of time. You should measure first, there are a lot of tools to help you there, some free, some commercial. Try these first for some links: #292131 and stackoverflow.com/questions/368938/delphi-profiling-tools - Impartial
Thanks, but I was using QueryPerformanceCounter to detect that ;) anyway that was the bottleneck for sure, as reading char by char is very slow... all the other operations were just saving some short binary data. - Nashner
Ah, OK. I was just reacting on your use of the words "believe" and "might", sorry then for the preaching ;-) - Impartial
You're welcome, probably I misused those words a bit, my English is very poor ;) - Nashner

Rather than read and write one character at a time, read and write them all at once:

procedure WriteWideString(const ws: WideString; stream: TStream);
var
  nChars: LongInt;
begin
  nChars := Length(ws);
  stream.WriteBuffer(nChars, SizeOf(nChars));
  if nChars > 0 then
    stream.WriteBuffer(ws[1], nChars * SizeOf(ws[1]));
end;

function ReadWideString(stream: TStream): WideString;
var
  nChars: LongInt;
begin
  stream.ReadBuffer(nChars, SizeOf(nChars));
  SetLength(Result, nChars);
  if nChars > 0 then
    stream.ReadBuffer(Result[1], nChars * SizeOf(Result[1]));
end;

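Now, technically, since WideString is a Windows BSTR, it can contain an odd number of bytes. The Length function reads the number of bytes and divides by two, so it's possible (although not likely) that the code above will cut off the last byte. You could use this code instead, which stores the byte count rather than the character count (SysStringByteLen and SysAllocStringByteLen are declared in the ActiveX unit):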

procedure WriteWideString(const ws: WideString; stream: TStream);
var
  nBytes: LongInt;
begin
  nBytes := SysStringByteLen(Pointer(ws));
  stream.WriteBuffer(nBytes, SizeOf(nBytes));
  if nBytes > 0 then
    stream.WriteBuffer(Pointer(ws)^, nBytes);
end;

function ReadWideString(stream: TStream): WideString;
var
  nBytes: LongInt;
  buffer: PAnsiChar;
begin
  Result := '';
  stream.ReadBuffer(nBytes, SizeOf(nBytes));
  if nBytes > 0 then begin
    GetMem(buffer, nBytes);
    try
      stream.ReadBuffer(buffer^, nBytes);
      // Take ownership of the new BSTR directly. A plain assignment would
      // copy it as a null-terminated PWideChar and leak the original.
      Pointer(Result) := SysAllocStringByteLen(buffer, nBytes);
    finally
      FreeMem(buffer);
    end;
  end;
end;

Inspired by Mghie's answer, I have replaced my Read and Write calls with ReadBuffer and WriteBuffer. The latter will raise exceptions if they are unable to read or write the requested number of bytes.
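
To illustrate the difference between Read and ReadBuffer, here is a minimal sketch. It assumes a TMemoryStream standing in for the file stream and deliberately truncated data; the program and function names are made up for the illustration. ReadBuffer raises EReadError when fewer bytes are available than requested, whereas Read would only return the number of bytes it managed to read.

```pascal
program ReadBufferDemo; // hypothetical name, for illustration only

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

// Returns True if ReadBuffer raised EReadError for a truncated string.
function TruncatedReadFails: Boolean;
var
  ms: TMemoryStream;
  nChars: LongInt;
  ws: WideString;
begin
  Result := False;
  ms := TMemoryStream.Create;
  try
    // Claim 10 characters in the length prefix but store only 3 of them.
    nChars := 10;
    ms.WriteBuffer(nChars, SizeOf(nChars));
    ws := 'abc';
    ms.WriteBuffer(ws[1], Length(ws) * SizeOf(WideChar));
    ms.Position := 0;

    ms.ReadBuffer(nChars, SizeOf(nChars));
    SetLength(ws, nChars);
    try
      // Only 6 bytes remain, but 20 are requested.
      ms.ReadBuffer(ws[1], nChars * SizeOf(WideChar));
    except
      on EReadError do
        Result := True;
    end;
  finally
    ms.Free;
  end;
end;

begin
  if TruncatedReadFails then
    WriteLn('ReadBuffer raised EReadError on truncated data')
  else
    WriteLn('No exception was raised');
end.
```

With plain Read, the same truncated input would leave the tail of the string uninitialized unless the return value is checked by hand.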

Edwin answered 30/8, 2009 at 15:38 Comment(7)
Your second WriteWideString() version does not compile (missing typecast to PWideChar, missing paren), but more importantly it fails for empty strings. Your second ReadWideString() should also check for length 0 and simply return an empty string in that case. - Impartial
I see no reason it wouldn't work for empty strings; SysStringByteLen returns zero for null pointers. The requirement to type-cast to PWideChar is because either SysStringByteLen is misdeclared to take a PWideChar instead of WideString, or BSTR is misdeclared to be PWideChar instead of WideString. Nonetheless, I've fixed that and addressed your other concerns, too. Thanks. - Edwin
I did see a reason it might fail for strings having just one byte, though. With range checking enabled, the expression ws[1] should fail in that case. (Delphi QC bug 9425 and Free Pascal bug 0010013 affect whether it fails on any particular version.) - Edwin
I tried your code with an empty string, in Delphi 4 and Delphi 2009, and on both a negative (error code) value was returned. This is on Windows XP 64. - Impartial
Try type-casting to Pointer instead, and feel free to edit and fix this answer if that works. I don't have Delphi handy to test it myself. I'm just going off what MSDN says. - Edwin
Good catch. SysStringByteLen is indeed declared to take a PWideChar, and the type cast feeds it a pointer to an empty wide string with a bogus length - it should be 0 but isn't. That's the negative value I saw in my tests. Using a pointer cast works as expected, and I edited accordingly. Thanks for the comment. - Impartial
Missing closing bracket on stream.WriteBuffer(nChars, SizeOf(nChars); - Eyrie

There is nothing special about wide strings; to read and write them as fast as possible, you need to read and write as much as possible in one go:

procedure TForm1.Button1Click(Sender: TObject);
var
  Str: TStream;
  W, W2: WideString;
  L: integer;
begin
  W := 'foo bar baz';

  Str := TFileStream.Create('test.bin', fmCreate);
  try
    // write WideString
    L := Length(W);
    Str.WriteBuffer(L, SizeOf(integer));
    if L > 0 then
      Str.WriteBuffer(W[1], L * SizeOf(WideChar));

    Str.Seek(0, soFromBeginning);
    // read back WideString
    Str.ReadBuffer(L, SizeOf(integer));
    if L > 0 then begin
      SetLength(W2, L);
      Str.ReadBuffer(W2[1], L * SizeOf(WideChar));
    end else
      W2 := '';
    Assert(W = W2);
  finally
    Str.Free;
  end;
end;
Impartial answered 30/8, 2009 at 15:39 Comment(0)

WideStrings contain a 'string' of WideChars, which use 2 bytes each. If you want to store the UTF-16 strings (the encoding WideStrings use internally) in a file and be able to open this file in other programs such as Notepad, you need to write a byte order mark first: #$FEFF.

If you know this, writing can look like this:

Stream1.Write(WideString1[1], Length(WideString1) * SizeOf(WideChar)); // SizeOf(WideChar) = 2

reading can look like this:

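Stream1.Read(WideChar1, SizeOf(WideChar)); // assert returned 2 and WideChar1 = #$FEFF
SetLength(WideString1, (Stream1.Size div SizeOf(WideChar)) - 1); // characters after the BOM
Stream1.Read(WideString1[1], Length(WideString1) * SizeOf(WideChar)); // byte count, not character count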
Symphonious answered 30/8, 2009 at 16:10 Comment(2)
He said he wants to store lots of strings, they're going to be intermixed with binary data, and they'll be prefixed by their lengths. Definitely not something to be used with Notepad. Your code dedicates the entire stream to a single string. - Edwin
Code unconditionally accessing the first element of an empty string will cause access violations. - Impartial
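
For the single-string, Notepad-readable case this answer describes, a minimal sketch might look as follows. It is an illustration under assumptions, not the answerer's code: the names SaveAsUtf16Text and LoadUtf16Text and the file name are hypothetical, the file is assumed to hold one little-endian UTF-16 string after the byte order mark, and empty strings are guarded so that the first element is never accessed.

```pascal
program BomDemo; // hypothetical name, for illustration only

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

// Writes the byte order mark followed by the raw UTF-16 characters.
procedure SaveAsUtf16Text(const FileName: string; const Text: WideString);
var
  fs: TFileStream;
  bom: WideChar;
begin
  fs := TFileStream.Create(FileName, fmCreate);
  try
    bom := WideChar($FEFF);
    fs.WriteBuffer(bom, SizeOf(bom));
    if Length(Text) > 0 then // never index an empty string
      fs.WriteBuffer(Text[1], Length(Text) * SizeOf(WideChar));
  finally
    fs.Free;
  end;
end;

// Reads the whole file back, checking for the byte order mark first.
function LoadUtf16Text(const FileName: string): WideString;
var
  fs: TFileStream;
  bom: WideChar;
  nChars: Integer;
begin
  Result := '';
  fs := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
  try
    fs.ReadBuffer(bom, SizeOf(bom));
    if bom <> WideChar($FEFF) then
      raise Exception.Create('Missing UTF-16 byte order mark');
    // Everything after the BOM belongs to the string.
    nChars := Integer((fs.Size - fs.Position) div SizeOf(WideChar));
    SetLength(Result, nChars);
    if nChars > 0 then
      fs.ReadBuffer(Result[1], nChars * SizeOf(WideChar));
  finally
    fs.Free;
  end;
end;

begin
  SaveAsUtf16Text('bom_demo.txt', 'Hello World!');
  WriteLn(Length(LoadUtf16Text('bom_demo.txt')));
end.
```

As the comments point out, this layout dedicates the entire file to one string, so it does not fit the length-prefixed mixed format from the question.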

You can also use TFastFileStream for reading the data or strings. I pasted the unit at http://pastebin.com/m6ecdc8c2, and a sample is below:

program Project36;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes,
  FastStream in 'FastStream.pas';

const
  WideNull: WideChar = #0;

procedure WriteWideStringToStream(Stream: TFileStream; var Data: WideString);
var
  len: Word;
begin
  len := Length(Data);
  // Write WideString length
  Stream.Write(len, SizeOf(len));
  if (len > 0) then
  begin
    // Write WideString
    Stream.Write(Data[1], len * SizeOf(WideChar));
  end;
  // Write null termination
  Stream.Write(WideNull, SizeOf(WideNull));
end;

procedure CreateTestFile;
var
  Stream: TFileStream;
  MyString: WideString;
begin
  Stream := TFileStream.Create('test.bin', fmCreate);
  try
    MyString := 'Hello World!';
    WriteWideStringToStream(Stream, MyString);

    MyString := 'Speed is Delphi!';
    WriteWideStringToStream(Stream, MyString);
  finally
    Stream.Free;
  end;
end;

function ReadWideStringFromStream(Stream: TFastFileStream): WideString;
var
  len: Word;
begin
  // Read length of WideString
  Stream.Read(len, SizeOf(len));
  // Read WideString
  Result := PWideChar(Cardinal(Stream.Memory) + Stream.Position);
  // Update position and skip null termination
  Stream.Position := Stream.Position + (len * SizeOf(WideChar)) + SizeOf(WideNull);
end;

procedure ReadTestFile;
var
  Stream: TFastFileStream;

  my_wide_string: WideString;
begin
  Stream := TFastFileStream.Create('test.bin');
  try
    Stream.Position := 0;
    // Read WideString
    my_wide_string := ReadWideStringFromStream(Stream);
    WriteLn(my_wide_string);
    // Read another WideString
    my_wide_string := ReadWideStringFromStream(Stream);
    WriteLn(my_wide_string);
  finally
    Stream.Free;
  end;
end;

begin
  CreateTestFile;
  ReadTestFile;
  ReadLn;
end.
Professorate answered 30/8, 2009 at 18:52 Comment(2)
Note: That code won't work if the string to be read contains any null characters. - Edwin
Code unconditionally accessing the first element of an empty string will cause access violations. - Impartial
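
One way to address the null-character limitation noted in the comments is to copy exactly the stored number of characters with SetString, rather than assigning a PWideChar, which stops at the first #0. The sketch below is an illustration only. It assumes the same layout as the answer (Word length, characters, trailing null) and uses a plain TMemoryStream as a stand-in for TFastFileStream, since only Memory and Position are needed. The names WriteCountedWideString, ReadCountedWideString and EmbeddedNullSurvives are made up for this example.

```pascal
program EmbeddedNullDemo; // hypothetical name, for illustration only

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

procedure WriteCountedWideString(Stream: TStream; const Data: WideString);
var
  len: Word;
  terminator: WideChar;
begin
  len := Length(Data); // 16-bit length prefix, as in the answer
  Stream.WriteBuffer(len, SizeOf(len));
  if len > 0 then // never index an empty string
    Stream.WriteBuffer(Data[1], len * SizeOf(WideChar));
  terminator := #0;
  Stream.WriteBuffer(terminator, SizeOf(terminator));
end;

function ReadCountedWideString(Stream: TMemoryStream): WideString;
var
  len: Word;
  p: PWideChar;
begin
  Stream.ReadBuffer(len, SizeOf(len));
  // Point at the characters inside the in-memory data.
  p := PWideChar(PAnsiChar(Stream.Memory) + Integer(Stream.Position));
  // Copy exactly len characters, including any embedded #0.
  SetString(Result, p, len);
  // Skip the characters and the trailing null terminator.
  Stream.Position := Stream.Position + len * SizeOf(WideChar) + SizeOf(WideChar);
end;

// Round-trips a string with an embedded #0, followed by an empty string.
function EmbeddedNullSurvives: Boolean;
var
  ms: TMemoryStream;
  original, loaded: WideString;
begin
  original := 'ab_cd';
  original[3] := #0;
  ms := TMemoryStream.Create;
  try
    WriteCountedWideString(ms, original);
    WriteCountedWideString(ms, '');
    ms.Position := 0;
    loaded := ReadCountedWideString(ms);
    Result := (Length(loaded) = 5) and (loaded[3] = #0) and (loaded[5] = 'd')
      and (ReadCountedWideString(ms) = '');
  finally
    ms.Free;
  end;
end;

begin
  if EmbeddedNullSurvives then
    WriteLn('Embedded null character preserved')
  else
    WriteLn('String was truncated');
end.
```

With an explicit length, the trailing null is no longer needed for reading; it is kept here only so the layout matches the answer.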
