TVirtualStringTree - resetting non-visual nodes and memory consumption
Asked Answered
Z

5

9

I have an app that loads records from a binary log file and displays them in a virtual TListView. There are potentially millions of records in a file, and the display can be filtered by the user, so I do not load all of the records in memory at one time, and the ListView item indexes are not a 1-to-1 relation with the file record offsets (List item 1 may be file record 100, for instance). I use the ListView's OnDataHint event to load records for just the items the ListView is actually interested in. As the user scrolls around, the range specified by OnDataHint changes, allowing me to free records that are not in the new range, and allocate new records as needed.

This works fine, speed is tolerable, and the memory footprint is very low.

I am currently evaluating TVirtualStringTree as a replacement for the TListView, mainly because I want to add the ability to expand/collapse records that span multiple lines (I can fudge it with the TListView by incrementing/decrementing the item count dynamically, but this is not as straight forward as using a real tree).

For the most part, I have been able to port the TListView logic and have everything work as I need. I notice that TVirtualStringTree's virtual paradigm is vastly different, though. It does not have the same kind of OnDataHint functionality that TListView does (I can use the OnScroll event to fake it, which allows my memory buffer logic to continue working), and I can use the OnInitializeNode event to associate nodes with records that are allocated.

However, once a tree node is initialized, it sees that it remains initialized for the lifetime of the tree. That is not good for me. As the user scrolls around and I remove records from memory, I need to reset those non-visual nodes without removing them from the tree completely, or losing their expand/collapse states. When the user scrolls them back into view, I can re-allocate the records and re-initialize the nodes. Basically, I want to make TVirtualStringTree act as much like TListView as possible, as far as its virtualization is concerned.

I have seen that TVirtualStringTree has a ResetNode() method, but I encounter various errors whenever I try to use it. I must be using it wrong. I also thought of just storing a data pointer inside each node to my record buffers, and I allocate and free memory, update those pointers accordingly. The end effect does not work so well, either.

Worse, my largest test log file has ~5 million records in it. If I initialize the TVirtualStringTree with that many nodes at one time (when the log display is unfiltered), the tree's internal overhead for its nodes takes up a whopping 260MB of memory (without any records being allocated yet). Whereas with the TListView, loading the same log file and all the memory logic behind it, I can get away with using just a few MBs.

Any ideas?

Zulemazullo answered 11/5, 2010 at 21:24 Comment(0)
A
1

You probably shouldn't switch to VST unless you have a use for at least some of the nice features of VST that a standard listbox / listview don't have. But there is of course a large memory overhead compared to a flat list of items.

I don't see a real benefit in using TVirtualStringTree only to be able to expand and collapse items that span multiple lines. You write

mainly because I want to add the ability to expand/collapse records that span multiple lines (I can fudge it with the TListView by incrementing/decrementing the item count dynamically, but this is not as straight forward as using a real tree).

but you can implement that easily without changing the item count. If you set the Style of the listbox to lbOwnerDrawVariable and implement the OnMeasureItem event you can adjust the height as required to draw either only the first or all lines. Drawing the expander triangle or the little plus symbol of a tree view manually should be easy. The Windows API functions DrawText() or DrawTextEx() can be used both to measure and draw the (optionally word-wrapped) text.

Edit:

Sorry, I completely missed the fact that you are using a listview right now, not a listbox. Indeed, there is no way to have rows with different heights in a listview, so that's no option. You could still use a listbox with a standard header control on top, but that may not support everything you are using now from listview functionality, and it may itself be as much or even more work to get right than dynamically showing and hiding listview rows to simulate collapsing and expanding.

Alphaalphabet answered 12/5, 2010 at 9:53 Comment(2)
I currently use a TListView in vsReport mode, not a TListBox. TListView does not support variable-height items in vsReport.Zulemazullo
I'm marking this one as the answer for now, but only because it tells me to not switch to VST.Zulemazullo
C
1

If I understand it correctly, the memory requirement of TVirtualStringTree should be:

nodecount * (SizeOf(TVirtualNode) + YourNodeDataSize + DWORD-align-padding)

To minimize the memory footprint, you could perhaps initialize the nodes with only pointers to offsets to a memory-mapped file. Resetting nodes which have already been initialized doesn't seem necessary in this case - the memory footprint should be nodecount * (44 + 4 + 0) - for 5 million records, about 230 MB.

IMHO you can't get any better with the tree but using a memory-mapped file would allow you to read the data directly from the file without allocating even more memory and copying the data to it.

You could also consider using a tree structure instead of a flat view to present the data. That way you could initialize child nodes of a parent node on demand (when the parent node is expanded) and resetting the parent node when it's collapsed (therefore freeing all its child nodes). In other words, try not to have too many nodes at the same level.

Conjunctive answered 12/5, 2010 at 9:13 Comment(1)
In fact, I already do use a memory mapped file, but due to the possible size of the file (can be up to 1+ GB), and the way the display filtering is implemented, I do not map the entire file into memory at one time. As for implementing a tree structure in code, that will not help very much as most of the log file items will not have child nodes. That is why a ListView is currently being used for the display. Supporting child nodes is to accomodate a feature where multi-line log entries are stored in the log file as separate records that I would merge together in memory for easier managementZulemazullo
A
1

You probably shouldn't switch to VST unless you have a use for at least some of the nice features of VST that a standard listbox / listview don't have. But there is of course a large memory overhead compared to a flat list of items.

I don't see a real benefit in using TVirtualStringTree only to be able to expand and collapse items that span multiple lines. You write

mainly because I want to add the ability to expand/collapse records that span multiple lines (I can fudge it with the TListView by incrementing/decrementing the item count dynamically, but this is not as straight forward as using a real tree).

but you can implement that easily without changing the item count. If you set the Style of the listbox to lbOwnerDrawVariable and implement the OnMeasureItem event you can adjust the height as required to draw either only the first or all lines. Drawing the expander triangle or the little plus symbol of a tree view manually should be easy. The Windows API functions DrawText() or DrawTextEx() can be used both to measure and draw the (optionally word-wrapped) text.

Edit:

Sorry, I completely missed the fact that you are using a listview right now, not a listbox. Indeed, there is no way to have rows with different heights in a listview, so that's no option. You could still use a listbox with a standard header control on top, but that may not support everything you are using now from listview functionality, and it may itself be as much or even more work to get right than dynamically showing and hiding listview rows to simulate collapsing and expanding.

Alphaalphabet answered 12/5, 2010 at 9:53 Comment(2)
I currently use a TListView in vsReport mode, not a TListBox. TListView does not support variable-height items in vsReport.Zulemazullo
I'm marking this one as the answer for now, but only because it tells me to not switch to VST.Zulemazullo
U
1

To meet your requirement "to expand/collapse records that span multiple lines", I'd simply use a drawgrid. To check it out, drag a drawgrid onto a form, then plug in the following Delphi 6 code. You can collapse and expand 5,000,000 multiline records (or whatever quantity you want) with essentially no overhead. It's a simple technique, doesn't require much code, and works surprisingly well.


unit Unit1;

interface

uses
  Windows, Messages, SysUtils, Variants, Classes, Graphics, Controls, Forms, Dialogs, Grids, StdCtrls;

type
  TForm1 = class(TForm)
    DrawGrid1: TDrawGrid;
    procedure DrawGrid1DrawCell(Sender: TObject; ACol, ARow: Integer; Rect: TRect; State: TGridDrawState);
    procedure DrawGrid1SelectCell(Sender: TObject; ACol, ARow: Integer; var CanSelect: Boolean);
    procedure DrawGrid1TopLeftChanged(Sender: TObject);
    procedure DrawGrid1DblClick(Sender: TObject);
    procedure FormCreate(Sender: TObject);
  private
    procedure AdjustGrid;
  end;

var
  Form1: TForm1;

implementation

{$R *.dfm}

// Display a large number of multi-line records that can be expanded or collapsed, using minimal overhead.
// LinesInThisRecord() and RecordContents() are faked; change them to return actual data.

const TOTALRECORDS = 5000000; // arbitrary; a production implementation would probably determine this at run time

// keep track of whether each record is expanded or collapsed
var isExpanded: packed array[1..TOTALRECORDS] of boolean; // initially all FALSE

function LinesInThisRecord(const RecNum: integer): integer;
begin // how many lines (rows) does the record need to display when expanded?
result := (RecNum mod 10) + 1; // make something up, so we don't have to use real data just for this demo
end;

function LinesDisplayedForRecord(const RecNum: integer): integer;
begin // how many lines (rows) of info are we currently displaying for the given record?
if isExpanded[RecNum] then result := LinesInThisRecord(RecNum) // all lines show when expanded
else result := 1; // show only 1 row when collapsed
end;

procedure GridRowToRecordAndLine(const RowNum: integer; var RecNum, LineNum: integer);
var LinesAbove: integer;
begin // for a given row number in the drawgrid, return the record and line numbers that appear in that row
RecNum := Form1.DrawGrid1.TopRow; // for simplicity, TopRow always displays the record with that same number
if RecNum > TOTALRECORDS then RecNum := 0; // avoid overflow
LinesAbove := 0;
while (RecNum > 0) and ((LinesDisplayedForRecord(RecNum) + LinesAbove) < (RowNum - Form1.DrawGrid1.TopRow + 1)) do
  begin // accumulate the tally of lines in expanded or collapsed records until we reach the row of interest
  inc(LinesAbove, LinesDisplayedForRecord(RecNum));
  inc(RecNum); if RecNum > TOTALRECORDS then RecNum := 0; // avoid overflow
  end;
LineNum := RowNum - Form1.DrawGrid1.TopRow + 1 - LinesAbove;
end;

function RecordContents(const RowNum: integer): string;
var RecNum, LineNum: integer;
begin // display the data that goes in the grid row.  for now, fake it
GridRowToRecordAndLine(RowNum, RecNum, LineNum); // convert row number to record and line numbers
if RecNum = 0 then result := '' // out of range
else
  begin
  result := 'Record ' + IntToStr(RecNum);
  if isExpanded[RecNum] then // show line counts too
    result := result + ' line ' + IntToStr(LineNum) + ' of ' + IntToStr(LinesInThisRecord(RecNum));
  end;
end;

procedure TForm1.AdjustGrid;
begin // don't allow scrolling past last record
if DrawGrid1.TopRow > TOTALRECORDS then DrawGrid1.TopRow := TOTALRECORDS;
if RecordContents(DrawGrid1.Selection.Top) = '' then // move selection back on to a valid cell
  DrawGrid1.Selection := TGridRect(Rect(0, TOTALRECORDS, 0, TOTALRECORDS));
DrawGrid1.Refresh;
end;

procedure TForm1.DrawGrid1DrawCell(Sender: TObject; ACol, ARow: Integer; Rect: TRect; State: TGridDrawState);
var s: string;
begin // time to draw one of the grid cells
if ARow = 0 then s := 'Data' // we're in the top row, get the heading for the column
else s := RecordContents(ARow); // painting a record, get the data for this cell from the appropriate record
// draw the data in the cell
ExtTextOut(DrawGrid1.Canvas.Handle, Rect.Left, Rect.Top, ETO_CLIPPED or ETO_OPAQUE, @Rect, pchar(s), length(s), nil);
end;

procedure TForm1.DrawGrid1SelectCell(Sender: TObject; ACol, ARow: Integer; var CanSelect: Boolean);
var RecNum, ignore: integer;
begin
GridRowToRecordAndLine(ARow, RecNum, ignore); // convert selected row number to record number
CanSelect := RecNum <> 0; // don't select unoccupied rows
end;

procedure TForm1.DrawGrid1TopLeftChanged(Sender: TObject);
begin
AdjustGrid; // keep last page looking good
end;

procedure TForm1.DrawGrid1DblClick(Sender: TObject);
var RecNum, ignore, delta: integer;
begin // expand or collapse the currently selected record
GridRowToRecordAndLine(DrawGrid1.Selection.Top, RecNum, ignore); // convert selected row number to record number
isExpanded[RecNum] := not isExpanded[RecNum]; // mark record as expanded or collapsed; subsequent records might change their position in the grid
delta := LinesInThisRecord(RecNum) - 1; // amount we grew or shrank (-1 since record already occupied 1 line)
if isExpanded[RecNum] then // just grew
else delta := -delta; // just shrank
DrawGrid1.RowCount := DrawGrid1.RowCount + delta; // keep rowcount in sync
AdjustGrid; // keep last page looking good
end;

procedure TForm1.FormCreate(Sender: TObject);
begin
Caption := FormatFloat('#,##0 records', TOTALRECORDS);
DrawGrid1.RowCount := TOTALRECORDS + 1; // +1 for column heading
DrawGrid1.ColCount := 1;
DrawGrid1.DefaultColWidth := 300; // arbitrary
DrawGrid1.DefaultRowHeight := 12; // arbitrary
DrawGrid1.Options := DrawGrid1.Options - [goVertLine, goHorzLine, goRangeSelect] + [goDrawFocusSelected, goThumbTracking]; // change some defaults
end;

end.

Underhung answered 22/6, 2010 at 4:52 Comment(10)
You are adjusting the grid's RowCount based on how many rows need to be added/removed. I stated in my original message that I already knew about that technique: "I can fudge it with the TListView by incrementing/decrementing the item count dynamically". That is the route I am going to have to take.Zulemazullo
@Remy: but you also said that was "not straight forward". with a drawgrid, expand/collapse is about as simple as you can get, plus you never have any loading/allocating/freeing/nodes/parents/children/etc to worry about. if listview works but is messy, and drawgrid works and is simple, go with the drawgrid. be sure to try the sample code. if you compare it side-by-side to the code necessary for the same functionality in listview, i think most coders would find the drawgrid solution easier to follow/maintain/modify/etc.Underhung
Unfortunately, due to the dynamic nature of my data loading and memory management, it is not a simple solution whether I use a ListView or a DrawGrid (also, being a custom control, a DrawGrid has extra internal overhead that a standard ListView does not).Zulemazullo
@Remy: sorry,i can't guess what data/memory issues prevent using a simple drawgrid. it sounded like all u needed to do was browse a filtered log file. i would just make RecordContents() seek and read the record of interest, possibly via an index, and never load or manage anything. i also don't know what u mean by "custom" or "extra overhead". to me, a listview is like a stringgrid, but a drawgrid lets you avoid overhead + do jit data fetches. i use it for instant browsing of a file of 50 million recs, with no loading/memory headaches at all. what does the sample code not do for you?Underhung
When dealing with multi-GB files, doing random seeks and reads on individual records of the file can very slow, especially if the seeks are large. My code has a complex buffering system so only a portion of the file data is accessed and displayed onscreen at a time (I also have to take into account that multi-line text spans multiple records in the log file, so the buffering code takes care of putting them back together)...Zulemazullo
... This works very well with very little overhead in my app right now, but as the cost that I do not get to expand/collapse multi-line text because I have to display each line in a separate ListView item instead of grouping them together.Zulemazullo
TDrawGrid is a TCustomControl descendant, which means it is implemented completely from scratch by Borland, so all of its overhead is in the VCL. And I would have the overhead of having to load all of my data at one time, or doing slow seek+read operations. TListView is a wrapper around the standard Win32 API ListView control, so all its overhead is inside the OS, and its virtual mode is optimized for handling large amounts of data. Your sample code does not actually deal with any file I/O. Try adding that and see how well it performs with millions of records.Zulemazullo
@Remy: you would think "seeks and reads" would be slow, but they're really not for humans browsing logs. try the sample code and you'll see. if you like "complex buffering", use it, but such code should be independent and segregated from whatever control is used. "I do not get to expand/collapse". note that the sample code expands/collapses with a double-click. try it. (continued...)Underhung
@Remy: your drawgrid vs. listview comparison isn't convincing. drawgrid is open source, with no preload requirement, and simple. your claim seeks and reads would be too slow is a guess, not a measurement. data access is all done from one simple RecordContents() routine, so any buffering or index method can still be bolted on without changing the component. listview is closed source, more complicated, and poses multiline problems. drawgrid sounds better. (continued...)Underhung
@Remy: "Your sample...does not actually deal with any...I/O" because i don't know the format of your log file. go ahead, add the 2 lines of code necessary to fetch your records: a seek and a read. "see how well it performs with millions of records". like i said, i'm using drawgrid to browse 50 million records, and it works great. i can lead you to the water, but i can't make u drink. try the sample to see the collapsing/exanding on fake lines. then stick a seek and a read in RecordContents() to directly grab your actual log records. i think you'll be surprised how nicely it works.Underhung
S
0

You shouldn't use ResetNode because this method invokes InvalidateNode and initializes node again, leading to opposite effect than expected. I don't know if it's possible to induce VST to free memory size specified in NodeDataSize without actually removing node. But why not set NodeDataSize to size of Pointer ( Delphi, VirtualStringTree - classes (objects) instead of records ) and manage data yourself? Just an idea...

Snuggle answered 12/5, 2010 at 9:11 Comment(2)
That is what I do right now. In fact, I am currently using NodeDataSize=0, but the app still uses up 200+ MB of memory. This is because the bare minimum size of TVirtualNode itself is 44 bytes, so multiplied by 5 million nodes is (44 x 5000000) = 220000000 = 214843.75 Kb = 209.81 MB, and that is just for the TVirtualNode memory alone, not including the overhead that the VCL memory manager uses on top of that.Zulemazullo
I think it's possible in VST that child nodes looks exactly like their parent node so it's imperceptible to user that they are child nodes. You could add five thousands top-level nodes which every has one thousand child nodes.Snuggle
C
0

Give "DeleteChildren" a try. Here's what this procedure's comment says:

// Removes all children and their children from memory without changing the vsHasChildren style by default.

Never used it, but as I read it, you can use that in the OnCollapsed event to free the memory allocated to nodes that just became invisible. And then re-generate those nodes in OnExpading so the user never knows the node went away from memory.

But I can't be sure, I never had a need for such behaviour.

Coacher answered 12/5, 2010 at 11:6 Comment(4)
DeleteChildren() does not address the concerns I have. Most of the nodes would not have any child nodes to begin with. I was hoping to add support for the few ones that would. What I need is to free memory for the top-level nodes that become invisible while scrolling.Zulemazullo
All visible nodes in TVirtualTree require an TVirtualNode because all the events used to get stuff related to the nodes (ex: CellText) expect an PVirtualNode as the only way to identify the Node. If that's a problem for you then you might want to look into the TVTNodeMemoryManager (and maybe hack it to be backed by a Memory Mapped File) OR go back to your initial ListView solution as suggested by mghie. And there's always the "100% custom control" solution, considering your very special needs! Making your own might be faster then hacking the alternatives.Coacher
Although a custom control would be nice, I don't have that kind of spare time to develop one. I'll look at TVTNodeMemoryManager, but I don't think it will help much since all of the physical TVirtualNode items still have to be allocated in order for the tree to function.Zulemazullo
@Remy or Else: I am across about this issue, while TListView supports only visually only a given string length as 256. So an OwnerData VT String component is required as on top of VT GetText procedure.Conformist

© 2022 - 2024 — McMap. All rights reserved.