Is BigTable slow or am I dumb?

I basically have the classic many-to-many model: a user, an award, and a "many-to-many" table mapping users to awards.

Each user has on the order of 400 awards and each award is given to about 1/2 the users.

I want to iterate over all of a user's awards and sum up their points. In SQL it would be a join against the many-to-many table and then a walk through the rows; on a decent machine with a MySQL instance, 400 rows should not be a big deal at all.
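Simplified, the models and the summing loop look something like this (names are illustrative):

    from google.appengine.ext import db

    class User(db.Model):
        name = db.StringProperty()

    class Award(db.Model):
        title = db.StringProperty()
        points = db.IntegerProperty()

    class UserAward(db.Model):
        # the many-to-many mapping row
        user = db.ReferenceProperty(User, collection_name='mapped_awards')
        award = db.ReferenceProperty(Award)

    # given some fetched User entity `user`:
    total = 0
    for mapping in user.mapped_awards:    # one query for the mapping rows
        total += mapping.award.points     # each dereference hits the datastore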

On App Engine I'm seeing around 10 seconds to do the sum, with most of the time spent in Google's datastore. Here are the first few rows from cProfile:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      462    6.291    0.014    6.868    0.015 {google3.apphosting.runtime._apphosting_runtime___python__apiproxy.Wait}
      913    0.148    0.000    1.437    0.002 datastore.py:524(_FromPb)
     8212    0.130    0.000    0.502    0.000 datastore_types.py:1345(FromPropertyPb)
      462    0.120    0.000    0.458    0.001 {google3.net.proto._net_proto___parse__python.MergeFromString}

Is my data model wrong? Am I doing the lookups wrong? Is this a shortcoming I just have to work around with caching and bulk updating (which would be a royal pain in the ass)?

Rustie answered 5/6, 2009 at 8:15 Comment(4)
+1 LOL. I love the title of this question!Blakley
Isn't Google's BigTable basically a hash table?Bumblebee
That top function is a Wait... why are you spending 6 seconds in a Wait?Baptistry
@balpha: no, it's not implemented as a hash table. @workmad3: that's the application process waiting for the datastore process (which may or may not be running on the same node, I'm not sure).Marlenmarlena

Could be a bit of both ;-)

If you're doing 400 queries on the Awards table, one for each result returned for a query on the mapping table, then I would expect that to be painful. The 1000-result limit on queries is there because BigTable thinks that returning 1000 results is at the limit of its ability to operate in a reasonable time. Based on the architecture, I'd expect the 400 queries to be way slower than the one query returning 400 results (400 log N vs. (log M) + 400).

The good news is that on GAE, memcaching a single hashtable containing all the awards and their point values is pretty straightforward (well, it looked pretty straightforward when I cast an eye over the memcache docs a while back; I've not needed to do it yet).
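Something along these lines, I'd guess (a sketch, untested; assumes an Award model with a points property):

    from google.appengine.api import memcache
    from google.appengine.ext import db

    def award_points_table():
        table = memcache.get('award_points')
        if table is None:
            # One query to build {award key: points}, cached for an hour
            table = dict((a.key(), a.points) for a in Award.all().fetch(1000))
            memcache.set('award_points', table, time=3600)
        return table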

Also, if you didn't already know, for result in query.fetch(1000) is way faster than for result in query, and you're restricted to 1000 results either way. The advantages of the latter are (1) it might be faster if you bail out early, and (2) if Google ever increases the limit beyond 1000, it gets the benefit without a code change.
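In code, that's the difference between these two loops (UserAward being my guess at your mapping model):

    query = UserAward.all().filter('user =', user)

    # One batched round trip for up to 1000 results:
    for result in query.fetch(1000):
        pass  # process result

    # Lazy iteration, fetched in small batches behind the scenes:
    for result in query:
        pass  # process result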

You might also have problems when you delete a user (or an award). I found on one test that I could delete 300 objects inside the time limit. Those objects were more complex than your mapping objects, having 3 properties and 5 indices (including the implicit ones), whereas your mapping table probably only has 2 properties and 2 (implicit) indices. [Edit: just realised that I did this test before I knew that db.delete() can take a list, which is probably much faster].
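For example, something like:

    # db.delete() accepts a list (entities or keys) and batches the work:
    mappings = UserAward.all().filter('user =', user).fetch(500)
    db.delete(mappings)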

BigTable does not necessarily do the things that relational databases are designed to do well. Instead, it distributes data well across many nodes. But almost all websites run fine with a bottleneck on a single db server, and hence don't strictly need the thing that BigTable does.

One other thing: if you're doing 400 datastore queries on a single HTTP request, then you will find that you hit your fixed datastore quota well before you hit your fixed request quota. Of course, if you're well within quotas, or if you're hitting something else first, then this might be irrelevant for your app. But the ratio between the two quotas is something like 8:1, and I take this as a hint about what Google expects my data model to look like.

Marlenmarlena answered 5/6, 2009 at 8:44 Comment(4)
Great advice. It sounds like I should move to the normal Django models and store it all in MySQL until I hit a scaling problem.Rustie
If your data is better in MySQL than BigTable, I think you have to ask yourself why you're using app engine at all. If there's a good reason ("free hosting", for instance), then I guess yes, but to me it looks like a bit of a hack. BigTable (and in general distribution across Google's cloud) is probably the only interesting technical difference between GAE and any old LAMP stack.Marlenmarlena
Or you could rethink your models. With the App Engine datastore, you do not want to iterate over rows during a request; instead, pull one row out quickly. One way to do this is to keep your totals/subtotals/aggregates up to date at write time, not at read time. Another way is to run background processes (either with their cron, or remote_api) to update totals/subtotals/aggregates asynchronously.Brenner
@dar: yes. In fact, I'd say that at the moment all the best reasons for using GAE imply that you should learn to work within its constraints rather than work around them. It's the only way to get the distribution/scalability which is its most interesting feature.Marlenmarlena

Is my data model wrong? Am I doing the lookups wrong?

Yes and yes, I'm afraid.

As far as your data model goes, the best way by far to handle this is to store the sum against the User record, and update it when a user gains/loses an award. There's really no point in counting their score every single time when the vast majority of the time, it will be unchanged. If you make the "UserAward" entity type a child entity of the "User", you can update the score and insert or delete the UserAward entry in a single atomic transaction, ensuring your count is always accurate.
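A sketch of what I mean (assuming User grows an integer total_points property, and UserAward keeps a reference to the Award):

    from google.appengine.ext import db

    def grant_award(user_key, award):
        def txn():
            user = db.get(user_key)
            # Child entity: same entity group as the user, so both
            # writes below happen atomically
            UserAward(parent=user, award=award).put()
            user.total_points += award.points
            user.put()
        db.run_in_transaction(txn)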

onebyone points out that you can memcache the awards table. That's a good idea, but given the limited amount of data, an even better one is to store it in local memory. Global members persist between HTTP requests, and since I presume you don't update the awards table often, you don't have to worry much about invalidating the cache. Just load it on the first request (or even hardcode it into your source). If you do change the list of awards, deploying a new minor update will reset all the instances, causing them to reload.
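For example (a sketch; the module global lives as long as the instance does):

    _award_points = None

    def award_points():
        global _award_points
        if _award_points is None:
            # Loaded once per instance, on the first request that needs it
            _award_points = dict((a.key(), a.points)
                                 for a in Award.all().fetch(1000))
        return _award_points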

For lookups, bear in mind that a substantial cost of doing datastore operations is the round-trip time. A get() operation, which looks up 1 or more records by ID (you can batch!) takes about 20-40ms. A query, however, takes about 160-200ms. Hence, the power of denormalization.

Langur answered 5/6, 2009 at 16:24 Comment(2)
Thank you. I did simplify my problem a little for the sake of this question. I do update awards quite a bit, as well as awardees. And for returning the "UserAwards" I'm going to need more info than just the points: I'd like the icon for the award, and probably the title. Is your batch get() done on references? When I have my 400 UserAward rows and I start walking to get the userAward.award, will it get by ID and batch them? That could be the lifesaver right there.Rustie
It's not possible to batch referenceproperty resolution the 'natural' way. What you can do is call myent.properties()['propname'].get_value_for_datastore(myent) to retrieve the Key, which allows you to batch things up. Even if you update awards a lot, I would still suggest storing them all in local memory or in memcache, with some way to invalidate the cache. Your other option is to use a ListProperty(Reference) of awards on each entity. If you don't need to look users up by their awards, you can set indexed=False to decrease the overhead, too.Langur
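Roughly, that batching trick looks like this (a sketch; UserAward is a guess at the mapping model):

    mappings = UserAward.all().filter('user =', user).fetch(400)
    # Pull the raw Keys off the ReferenceProperty without dereferencing:
    award_prop = UserAward.properties()['award']
    award_keys = [award_prop.get_value_for_datastore(m) for m in mappings]
    awards = db.get(award_keys)   # one batched get instead of 400 round trips
    total = sum(a.points for a in awards)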

One important idiom of App Engine is that storage is cheap but time is never in surplus. It seems the best way to do many-to-many relationships in App Engine is to simply store the information on both sides: i.e., a user has a list of awards and each award has a list of users. To look up all the awards a user has, you simply query the Award table for that user.

This idea is well demonstrated here: Building Scalable Complex Apps
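A sketch of the both-sides approach (property names are mine):

    from google.appengine.ext import db

    class Award(db.Model):
        title = db.StringProperty()
        points = db.IntegerProperty()
        # the relation, stored on the award side
        users = db.ListProperty(db.Key)

    # All awards held by a given user, in one indexed query:
    awards = Award.all().filter('users =', user.key()).fetch(1000)
    total = sum(a.points for a in awards)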

Nilla answered 13/10, 2010 at 14:22 Comment(0)

Google BigTable runs on the Google File System.

The data is distributed. For 400 rows MySQL may still be better, but for larger data sets Google BigTable might be faster.

I think that's why they encourage us to use memcache to make it faster.

Chungchungking answered 5/6, 2009 at 8:44 Comment(0)

Even though you mention BigTable, I think you are really implementing a relational database on Cloud SQL.

Your model is all right; it is the proper way of doing something like this. I don't see a good reason to de-normalize aggregates onto the user table.

Did you create indexes for fast table joining? It is pretty simple. You may need B-tree indexes for all fields involved in the join; there is no need to index the aggregated field (the one you take the SUM of). Basically, both foreign keys of the N:N table should be indexed. If those foreign keys refer to the primary keys of the other two tables, that is enough.

Above the order of 100 rows, a simple B-tree index on the foreign keys can give a decent, noticeable increase in throughput.
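In Django terms, a sketch (assuming User and Award models with a points field; ForeignKey columns are indexed by default):

    from django.db import models
    from django.db.models import Sum

    class UserAward(models.Model):
        # Both foreign keys get B-tree indexes automatically
        user = models.ForeignKey('User', on_delete=models.CASCADE)
        award = models.ForeignKey('Award', on_delete=models.CASCADE)

    # The whole aggregation runs as one SQL query with a join:
    total = UserAward.objects.filter(user=some_user).aggregate(
        s=Sum('award__points'))['s']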

I'm running a database on Cloud SQL where some edge tables have over 2 million records. Only past 2.5 million records am I considering some de-normalization, which also means some extra indexes, while still aggregating for the SUM. Otherwise I would be doing unnecessary updates to the SUM field whenever new records were added.

Only when the table went over 1 million records did we have to consider using a read replica, and that was once we could distinguish between processes that only read certain tables and processes that write.

If you are using Django, beware of how LIMIT is implemented according to their documentation, because it is very misleading. When you slice a record set with [:100], the SQL actually sent to the server is not what you expect. I had a very hard time figuring that out. Django is not a good option when you plan to scale to something very large, but at the order of 1000 records it would be fine.

Enclasp answered 1/3, 2017 at 19:46 Comment(0)