Separate Data Structure vs VirtualStringTree's PVirtualNodes to Store Data?
Asked Answered
B

1

7

So I have been messing with creating my own separate data structure. I finally got it working, but then I discovered that the memory usage was ridiculously high compared to the old method.

To test this, I created the same test app, but I would store the data in my PVirtualNodes.

When adding 1000 roots, with 1000 children each, the separate data structure was using around 208 MB, while the PVirtualNode one only used about 160 MB, and it was a wee bit faster as well.

I thought using a separate data structure was supposed to use less memory, and be faster, but I suppose that is the price instead?

Here is the source for the "Store data in PVirtualNode": http://pastebin.com/j6L2eHJt

Here is the source for the "Store data in Seperate Datastructure": http://pastebin.com/iSwR0hW1

Brevity answered 13/5, 2011 at 14:0 Comment(2)
Also be sure you measure the right memory. Not "memory usage" or "Workingset" from taskmanager, but "VM Size" or "Commit Size"Impinge
Or better yet use AQTime to measure your VCL object memory usage directly.Southwards
T
21

The first problem I see with your separate-data-structure code is that it incorrectly makes the category own the users. That means that if the same physical person is in two categories, it will appear in your data structure as though there are two completely separate people. You'll spend the lifetime of the project regretting that decision.

You need a list of users, and you need a list of categories. A user can contain a list of categories that it belongs to, or a category can contain a list of users that belong to it. Maybe both.

type
  TUser = class;
  TCategory = class;
  TContactList = class
  private
    FUsers: TObjectList<TUser>;
    FCategories: TObjectList<TCategory>;
  end;

  TUser = class
  private
    FCategories: TObjectList<TCategory>;
  public
    constructor Create(const DisplayName: string; SkypeID: Integer);
    property DisplayName: string;
    property SkypeID: Integer;
    property Categories: TObjectList<TCategory> read FCategories;
  end;

  TCategory = class
  private
    FUsers: TObjectList<TUser>;
  public
    constructor Create(const Name: string; ID: Integer);
    property Name: string;
    property ID: Integer;
    property Users: TObjectList<TUser> read FUsers;
  end;

constructor TContactList.Create;
begin
  // The contact list is the single master list of all contacts; it
  // owns the user objects, so set OwnsObjects = True
  FUsers := TObjectList<TUser>.Create(True);
  FCategories := TObjectList<TCategory>.Create(True);
end;

constructor TUser.Create;
begin
  // A user does not own its categories; set OwnsObjects = False
  FCategories := TObjectList<TCategory>.Create(False);
end;

constructor TCategory.Create;
begin
  // A category does not own its members; set OwnsObjects = False
  FUsers := TObjectList<TUser>.Create(False);
end;

Notice how this code makes no mention of the tree control. A contact list does not own a tree control. If it did, then you'd be right back where you started months ago, where you had the problem of how to display users in multiple tree controls. And also notice that a user can appear in more than category, but no single category owns that user. Instead, the contact list owns the user, and grants categories permission to refer to them, and vice versa.

One of your first problems when you started this project was how to detect duplicate elements in the tree control. Now that problem is much easier because there is no tree control at all. You just have to find duplicates in the flat user list. If you don't add duplicates to that list in the first place, then you don't have to worry about finding duplicates in a more complicated GUI control anymore.

Notice that the data structure is not a tree. It's two lists, and each item in the list can refer to any number of items from the opposite list. In that sense, it's actually a graph. You merely display the data as a tree because otherwise it's hard to visualize it.

So, now that you have a data structure that's independent of the tree controls, how do you link the tree to the data it's supposed to display? Each node should hold a reference to the TUser or TCategory that it represents. The node's data record could be defined like this:

type
  PNodeData = ^TNodeData;
  TNodeData = record
  case Integer of
    0: Obj: TObject;
    1: User: TUser;
    2: Category: TCategory;
  end;

You can use it to implement the tree's OnGetText event like this:

procedure TJeffForm.TreeGetText(Sender: TBaseVirtualTree; Node: PVirtualNode;
  Column: TColumnIndex; TextType: TVSTTextType; var CellText: string);
var
  Data: PNodeData;
begin
  if TextType = ttStatic then
    Exit;
  Data := Sender.GetNodeData(Node);
  Assert(Assigned(Data), 'Node wasn''t initialized properly');
  Assert(Assigned(Data.Obj), 'Node has no object');

  if Data.Obj is TUser then
    CellText := Data.User.DisplayName
  else if Data.Obj is TCategory then
    CellText := Data.Category.Name
  else
    CellText := Format('Unknown node type %s', [Data.Obj.ClassName]);
end;

That is, each node will contain either a user or a category. The first element of the array will contain a value that's one of those types, but since you don't know in advance which it will be, you have a third type that's safe to use as either one, TObject. It's important to have the required data in the first field of the record because that's the field you're allowed to set when you call AddNewNode, even before the node has been completely initialized.

One of the secrets to avoiding long node-creation times is to avoid creating the nodes you don't need. If a top-level node is collapsed, then you don't actually need to create any of its children. Just set the flag on the top node indicating that is has children, and it will correctly paint itself with the "+" button. If the user later clicks the button to expand the node, then the tree control will ask you how many children it has, at which point it will create them. And even then, it will only initialize the nodes that need to be painted immediately. Delay the work until it's necessary. Somebody with a million contacts will probably never want to see all of them at once, so there's no need to create a million items in your GUI control.

When your program starts, all you need to do, at first, is load the user and category lists, and then set the tree's category count:

Tree.RootNodeCount := ContactList.Categories.Count;

That's all.

The tree's event will take care of the rest of the initialization phase. If you want some of the category nodes to be expanded from the start, then all you have to do is expand them. The tree's events will ask you how many children each node has, and you can implement the event to answer. Once the tree has created nodes for them, it will ask how to initialize them, and you can assign the corresponding user or category object to the node at that time. The tree will tell you when it needs more information. You don't have to give it any more than it asks for.

Tameka answered 13/5, 2011 at 14:59 Comment(5)
Strange, I believe I already replied to this. Anyways, here goes: Thanks for this detailed answer! May I ask, why would I store the TCategorys in the TUser object, and also store the TUsers in the TCategory object? Wouldnt that use more memory?Brevity
Jeff, given a user, how do you know which categories it is a member of? For each user, the additional memory used will be approximately this: TObjectList.InstanceSize + Categories.Count * SizeOf(Pointer). How many categories are there likely to be? How many categories is any one user likely to be in? As I said, either list is optional. Intended usage should determine the data structure. If you don't intend to look up categories for users, then you don't need that list.Tameka
@Rob - Well, all depending on how the end-user is using my software. I know someone who had over 100 categories, and a whole bunch of people were in many of them. But anyways, so what I did wrong, was that I had my Object Lists own the items? Should I store a pointer to a user in each category the user is in? How would I do that?Brevity
As I said in the first sentence, what you did wrong was that you had the categories own the users. (Whether that was done specifically via TObjectList's concept of ownership is beside the point). It meant that a user in two categories would have two separate TUser instances. Yes, each category should have a reference to each user in that category. That's what the TCategory.Users property is for in the code I gave you. To add a user to a category, call Category.Users.Add(User) (and also User.Categories.Add(Category), if you're keeping both lists).Tameka
Grate answer, I especially like the back references to Jeff's older questions.Laudation

© 2022 - 2024 — McMap. All rights reserved.