Is there an upper limit to the number of objects in a Smalltalk image?


2

6

I'm putting together an NLP experiment in which concepts are agents in a system designed to engender emergent properties consisting of new concepts (here's a link for those who don't know what Emergence is). Smalltalk (specifically the Pharo dialect) appears to be ideal for this kind of application because of the ease with which I can create fully encapsulated concept objects that relate to one another as independent agents, and because Smalltalk allows me to inspect the state of the system while it's running.

My concern is whether the system will start to choke if too many objects are present and all sending messages to one another. In theory, my implementation could engender millions of concept objects, and I don't want to devote the time to working this out in Smalltalk if the system can't handle something that large.

Are there limiting factors (software factors, not hardware) regarding the quantity of active objects in a Smalltalk image?
Can the system handle the message traffic that would be present in a system with millions of chatty objects?

Thank you in advance for your help!

Pearce answered 31/10, 2013 at 23:5 Comment(0)


3

The internal working size of object pointers within Pharo is still 32 bit I believe. There's been chatter of 64b versions, but it's one thing to have a 32b VM running on a 64b machine, and another thing to have an actual, 64b through and through VM.

So there's an implicit limit right there, but still room for "millions" of objects. Start reaching into the "hundreds of millions" and you may well bump into some limits.
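To put rough numbers on that, here is a back-of-the-envelope sketch in Python. The 32-byte average object size and the 512 MiB heap are made-up assumptions for illustration, not measured Pharo values; only the 4 GiB ceiling follows directly from 32-bit pointers.

```python
# Rough upper bound on how many objects fit in a given address space.
# ASSUMED_OBJECT_SIZE and ASSUMED_HEAP_LIMIT are hypothetical values,
# not figures taken from any Pharo or Squeak VM.

def max_objects(address_space_bytes: int, bytes_per_object: int) -> int:
    """Upper bound if the whole space held objects of one uniform size."""
    return address_space_bytes // bytes_per_object

ADDRESS_SPACE_32BIT = 2 ** 32        # 4 GiB addressable with 32-bit pointers
ASSUMED_HEAP_LIMIT = 512 * 2 ** 20   # hypothetical 512 MiB usable heap
ASSUMED_OBJECT_SIZE = 32             # hypothetical average bytes per object

for label, space in [("full 32-bit space", ADDRESS_SPACE_32BIT),
                     ("512 MiB heap", ASSUMED_HEAP_LIMIT)]:
    count = max_objects(space, ASSUMED_OBJECT_SIZE)
    print(f"{label}: about {count:,} objects")
```

Under those assumptions the ceiling lands in the tens to low hundreds of millions, which is in line with "millions are fine, hundreds of millions may hit limits". Swap in your own object size once you know what a concept object actually holds.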

Having millions of objects isn't really an issue in itself. The question then moves to threads of control, and Pharo doesn't do much threading. So it really comes down to how many distinct contexts you will have, not necessarily how many objects.

Having a chain of millions of objects talking to each other isn't really a big deal. You'll simply run into whatever message-passing overhead there is in the underlying VM, and that limits raw performance. Pharo is pretty fast, but it's not Java fast. Whether it's fast enough for you is for you to answer.

I also can't speak to how well the Pharo GC handles millions of live objects. I can only suggest that it's 2013, Squeak (upon which Pharo is based) has been around since the mid-90s, GC tech is pretty much mature now, and I don't suspect that Pharo's GC is spectacularly awful in this regard.

I would simply do some micro benchmarks and try for yourself.

Azedarach answered 31/10, 2013 at 23:24 Comment(2)

w00t - t/y for the quick reply! – Pearce 1/11, 2013 at 0:49

Just as an update from a few years later, Squeak (and Pharo, and Cuis) is available in 64-bit pointer form, as indeed are VisualWorks, GemTalk and likely other implementations. So the chances of running out of object identity are low. And with current RAM prices the odds of having enough storage to come within variable-sword range of a problem are minimal. – Sihunn 26/3, 2018 at 17:42


3

Regarding 1: The number of objects is limited by the virtual address space that is available to the VM - which, with the standard builds, is only a few hundred MBs large. My current Squeak image contains over 3.5 million instances of Object in its idle state - which should give you an impression about what is possible.

Regarding 2: My Squeak image performs at around 26 million message sends per second on my not-so-up-to-date Intel Core i7 2620M (but uses one core only, of course).
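To see what that rate means for a population of chatty objects, here is a small Python sketch of the arithmetic. The object counts and messages-per-object figures are hypothetical; only the 26 million sends per second comes from the measurement above. It also ignores the work done inside each method and any GC pauses, so treat the result as a floor, not a prediction.

```python
# Time for one "round" in which every object sends a fixed number of messages,
# at the single-core send rate quoted above.

SENDS_PER_SECOND = 26_000_000  # measured figure from the answer, one core

def seconds_per_round(num_objects: int, sends_per_object: int,
                      sends_per_second: int = SENDS_PER_SECOND) -> float:
    """Lower bound on wall-clock time: pure send overhead, no method bodies."""
    return num_objects * sends_per_object / sends_per_second

# Hypothetical workloads: (number of objects, messages each sends per round)
for num_objects, sends_each in [(1_000_000, 10), (10_000_000, 10)]:
    t = seconds_per_round(num_objects, sends_each)
    print(f"{num_objects:,} objects x {sends_each} sends: at least {t:.2f} s per round")
```

So a million objects each sending a handful of messages costs a fraction of a second per round in send overhead alone, while tens of millions start to take seconds.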

However, I doubt that you will be satisfied with the result of your current approach. You talked about inspecting the state of the system, which really is totally awesome in Squeak/Pharo, but you can't (manually) inspect the state of millions of objects. But then again, I don't know exactly what you are up to ;)

Liquidate answered 31/10, 2013 at 23:24 Comment(1)

t/y as well for jumping on this so quickly! FYI- I'm not looking to inspect all of the objects at once - but I do need to inspect the concept structures (concepts can self-replicate, engendering more complex forms to which they are still connected) to see what's being made, how it's being made, and what caused it to be made. So long as I can do that, I'm gtg. – Pearce 1/11, 2013 at 0:52

A

3