GAE ndb design, performance and use of repeated properties

2

12

Say I have a picture gallery and a picture could potentially have 100k+ fans. Which ndb design is more efficient?

from google.appengine.ext import ndb

class picture(ndb.Model):
    fanIds = ndb.StringProperty(repeated=True)
    ... [other picture properties]

or

class picture(ndb.Model):
    ... [other picture properties]

class fan(ndb.Model):
    pictureId = ndb.StringProperty()
    fanId = ndb.StringProperty()

Is there any limit on the number of items you can add to an ndb repeated property, and is there any performance hit from storing a large number of items in one? If repeated properties are less efficient, what is their intended use?

Antichrist asked 13/3, 2013 at 4:30 Comments (2):
Unrelated to the answer, but I would suggest following the conventions: CamelCase class names and lower_case_underscore property names. - Cwmbran
Also, for pictureId use ndb.KeyProperty(kind=picture), since you already have that model, and use fanId = ndb.KeyProperty(kind=fan, repeated=True) instead of StringProperty for better handling of the entities. - Cwmbran
33

Do not use repeated properties if you have more than 100-1000 values. (1000 is probably already pushing it.) They weren't designed for such use.

Cheeky answered 14/3, 2013 at 19:26 Comments (3):
Coming to this answer from another question (stackoverflow.com/questions/26740505): so one should not use repeated properties for more than 10 elements, and relationships via repeated keys should be avoided. Correct? - Gentry
@Guido What should we use for this type of bulk data storage? - Lillian
@Lillian I think the NDB PickleProperty is what you're looking for. - Elana
5

Generally, v1 (the repeated property on the picture) would be much cheaper.

In terms of read/write costs, you pay per entity fetched or written, so you want to reduce the number of entities. v1 will be cheaper, and significantly cheaper if you fetch every fan every time you fetch a picture.
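To make the per-entity argument concrete, here is a minimal sketch that only counts entities read. It assumes a simplified model of one billed read per entity fetched and ignores per-query overhead, index costs and write costs, so the numbers are illustrative, not a pricing calculation. The function names are made up for this example.

```python
# Simplified model: one billed read per entity fetched. Per-query overhead,
# index costs and write costs are deliberately ignored (assumption).


def reads_v1(picture_fetches):
    """v1: fans live inside the picture entity, so each fetch is one read."""
    return picture_fetches


def reads_v2(picture_fetches, fans_per_fetch):
    """v2: one read for the picture plus one per fan entity fetched."""
    return picture_fetches * (1 + fans_per_fetch)


# Compare a single picture fetch for small, medium and very large fan counts.
for fans in (5, 1000, 100000):
    print(fans, reads_v1(1), reads_v2(1, fans))
```

Under this model the gap grows linearly with the number of fans fetched per picture, which is why the difference matters most when every fan is loaded on every fetch.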

However, each entity is limited to 1MB. With 100k+ fans you could hit that limit, depending on the size of each fanId, and that's before counting your other picture data. You'll have to add some more complex code to handle overflow cases.
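A rough back-of-the-envelope check of the 1MB concern, plus one possible shape for the overflow handling, might look like the sketch below. It is plain Python with no datastore calls. The 8-byte per-value overhead is a guess rather than a documented figure, the 9-character IDs are invented sample data, and split_into_shards is a hypothetical helper: each shard it returns would need to be stored in its own entity.

```python
# Assumption: each stored value costs its UTF-8 length plus a fixed
# per-value overhead. The 8-byte default is a guess, not a documented figure.
ENTITY_LIMIT_BYTES = 1024 * 1024  # the 1MB per-entity limit


def estimate_bytes(fan_ids, per_value_overhead=8):
    """Approximate bytes needed to store fan_ids as repeated string values."""
    return sum(len(f.encode("utf-8")) + per_value_overhead for f in fan_ids)


def split_into_shards(fan_ids, budget_bytes, per_value_overhead=8):
    """Greedily pack fan_ids into lists that each stay within budget_bytes.

    A single ID larger than the budget still gets a shard of its own.
    """
    shards, current, used = [], [], 0
    for fan_id in fan_ids:
        cost = len(fan_id.encode("utf-8")) + per_value_overhead
        if current and used + cost > budget_bytes:
            shards.append(current)
            current, used = [], 0
        current.append(fan_id)
        used += cost
    if current:
        shards.append(current)
    return shards


fan_ids = ["fan%06d" % i for i in range(100000)]  # invented 9-character IDs
total = estimate_bytes(fan_ids)
# Leave half of each entity free for the other picture data (arbitrary choice).
shards = split_into_shards(fan_ids, budget_bytes=ENTITY_LIMIT_BYTES // 2)
print(total, total > ENTITY_LIMIT_BYTES, len(shards))
```

With these assumptions, 100k short IDs already exceed a single entity's budget, so even v1 ends up spread over a handful of entities once the list gets this large.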

Large entities take longer to fetch than small entities. If you're going to fetch all the fans at once every time, v1 will be better. If you're only going to fetch, say, 5 fans at any one point, v2 might be faster (only might). If, on the other hand, you try to pull 100k fan entities with v2, that is going to be very slow.

Chazan answered 13/3, 2013 at 15:35
