What are the speed comparisons of NDB vs DB (on High Replication Datastore)?

Taken from the Python NDB Overview:

When the application reads an entity, that entity is automatically cached; this gives fast (and inexpensive) reads for frequently-read entities.

...

The NDB function that writes the data (for example, put()) returns after the cache invalidation; the Apply phase happens asynchronously.

In the YouTube video of Google I/O 2011: More 9s Please: Under The Covers of the High Replication Datastore, at around 13:11, the average latencies are given as:

Master/Slave:

  • Read: 15ms
  • Write: 20ms

High Replication:

  • Read: 15ms
  • Write: 45ms

How significantly does NDB affect these speeds, from the app's perspective?

Edit: Specifically curious about timing stats (in milliseconds).

Extra Credit: I've also heard Nick Johnson refer to queries taking around 160ms each (in 2009) [link]. Does NDB provide any speed benefits on queries?

Wolfsbane answered 31/3, 2012 at 7:0 Comment(0)

Using NDB makes your datastore calls appear, from your app's perspective, significantly faster.

READ: In the best case, reads are served from the instance cache or memcache. In most cases, this will be significantly faster than reading from the datastore.

WRITE: The NDB put/write method returns right after the cache invalidation, without waiting for the Apply phase, so from your app's perspective it is considerably faster than a normal write. The actual write, however, is completed asynchronously.

NDB vs DB (High Replication): In terms of speed from your app's perspective, NDB should be a clear win.
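
To make the read and write behaviour described above concrete, here is a minimal sketch of a read-through cache that invalidates an entry on write. It is not NDB's implementation: `SlowStore` and `CachedStore` are hypothetical names, the sleep only imitates datastore latency, and the asynchronous Apply phase is not modelled.

```python
import time


class SlowStore(object):
    """Hypothetical stand-in for a datastore; each call sleeps to mimic latency."""

    def __init__(self, latency_s=0.005):
        self.latency_s = latency_s
        self.data = {}
        self.reads = 0  # number of reads that actually reached the store

    def get(self, key):
        time.sleep(self.latency_s)
        self.reads += 1
        return self.data.get(key)

    def put(self, key, value):
        time.sleep(self.latency_s)
        self.data[key] = value


class CachedStore(object):
    """Read-through cache that drops an entry whenever it is written."""

    def __init__(self, store):
        self.store = store
        self.cache = {}

    def get(self, key):
        if key in self.cache:
            return self.cache[key]  # cache hit: no store round trip
        value = self.store.get(key)  # cache miss: pay the store latency once
        self.cache[key] = value
        return value

    def put(self, key, value):
        self.cache.pop(key, None)  # invalidate before writing
        self.store.put(key, value)


cached = CachedStore(SlowStore())
cached.put("greeting", "hello")
first = cached.get("greeting")   # miss, goes to the store
second = cached.get("greeting")  # hit, served from the in-process dict
```

Only the first `get` after a write pays the store latency; how much that saves in a real app depends on the hit rate, which is why measured numbers matter more than this toy.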

Slate answered 31/3, 2012 at 8:6 Comment(4)
Thanks for the quick reply! I'm specifically interested in the timing, in milliseconds. Editing the post to reflect that now. :) - Wolfsbane
@wTyeRogers If you want to know exact figures, you'll have to do your own benchmarks. - Dimetric
@NickJohnson, awesome; thanks! Since this is my first Stack Overflow post, I have a question regarding etiquette that's not in the FAQ: Do I edit Albert's post to include the stats, or do I post my own individual answer? (It just feels a bit odd to post an answer to my own question.) - Wolfsbane
@wTyeRogers If the answer has substantial new information, answer your own question - that's perfectly acceptable behaviour on SO. If you think it's just a minor tweak, feel free to edit an existing answer. - Dimetric

You'll have to benchmark for yourself: times depend on many factors, such as entity size and complexity (more properties, or more items in repeated properties, make an entity more complex).

The numbers you quote are really old and probably no longer reflect reality; most users' experience is that HRD is not slower than M/S, on average (in part because M/S has much higher variability).

There were some NDB benchmarks done here: http://code.google.com/p/appengine-ndb-experiment/issues/detail?id=118 -- but they don't compare the numbers with the old db library.

You can use Appstats to quickly do some timing of operations in a real app.
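
If you want raw numbers outside Appstats, a small timing harness is enough to get started. The sketch below is only an illustration: `time_calls_ms`, `summarize` and `fake_datastore_get` are hypothetical names, and the sleeping placeholder should be swapped for the real db or NDB call you want to measure. It uses `time.perf_counter`, which needs Python 3.3+; on the Python 2.7 runtime `time.time` would be the substitute.

```python
import time


def time_calls_ms(func, repeats=20):
    """Call func() `repeats` times and return each wall-clock time in ms."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        timings.append((time.perf_counter() - start) * 1000.0)
    return timings


def summarize(timings):
    """Return min, median, mean and max of a non-empty list of timings."""
    ordered = sorted(timings)
    n = len(ordered)
    mid = n // 2
    if n % 2:
        median = ordered[mid]
    else:
        median = (ordered[mid - 1] + ordered[mid]) / 2.0
    return {
        "min": ordered[0],
        "median": median,
        "mean": sum(ordered) / n,
        "max": ordered[-1],
    }


def fake_datastore_get():
    # Hypothetical placeholder: replace with the real call under test,
    # for example a get by key through db or NDB.
    time.sleep(0.002)


if __name__ == "__main__":
    stats = summarize(time_calls_ms(fake_datastore_get))
    print("min %(min).2f ms, median %(median).2f ms, "
          "mean %(mean).2f ms, max %(max).2f ms" % stats)
```

Reporting the median and max alongside the mean matters here because, as noted above, variability differs a lot between configurations.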

Electronic answered 31/3, 2012 at 15:22 Comment(1)
Good to know for my first Stack Overflow post! I was secretly hoping that you would be one of the responders, given your intimate knowledge of NDB and your ability to analyze stats to death with your X-Ray vision of Python, a deadly combination for this question. Thank you for highlighting some of the complexity points, and the link is very helpful. - Wolfsbane
