GAE DataStore vs Google Cloud SQL for Enterprise Management Systems
Asked Answered
G

2

14

I am building an application that is an enterprise management system using gae. I have built several applications using gae and the datastore, but never one that will require a high volume of users entering transactions along with the need for administrative and management reporting. My biggest fear is that when I need to create cross-tab and other detailed reports (or business intelligence reporting and data manipulation) I will be facing a mountain of problems with gae's datastore querying and data pull limits. Is it really just architectural preference or are there quantitative concerns here?

In the past I have built systems using C++/c#/Java against an Oracle/MySql/MSSql (with a caching layer sprinkled in for some added performance on complex or frequently accessed db results).

I keep reading that we are to throw away the old mentality of relational data and move to the new world of the big McHashTable in the sky... but new isnt always better... Any insight or experience on the above would be helpful.

Gamaliel answered 5/6, 2012 at 22:25 Comment(1)
The relational model has proven itself very useful, but existing products haven't been able to provide it at internet scale, which is why we have a lot of different solutions with their own issues. NuoDB is an example of an interesting and promising "NewDB" database. While we're waiting, people tend to combine solutions for their particular use cases, copying data to separate databases for querying and reports, using gigaspaces in front of rdbms and so on.Prevail
U
22

From the Cloud SQL FAQ:

Should I use Google Cloud SQL or the App Engine Datastore?

This depends on the requirements of the application. Datastore provides NoSQL key-value > storage that is highly scalable, but does not support the complex queries offered by a SQL database. Cloud SQL supports complex queries and ACID transactions, but this means the database acts as a ‘fixed pipe’ and performance is less scalable. Many applications use both types of storage.

If you need a lot of writes (~XXX per/s) to db entity w/ distributed keys, that's where the Google App Engine datastore really shine.

If you need support for complex and random user crafted queries, that's where Google Cloud SQL is more convenient.

Udela answered 6/6, 2012 at 15:5 Comment(4)
Sounds like a combination of the two might be best in this circumstance. Perhaps I can break things up into two stages. I can use the datastore as the application interface for OLTP, then through either async queues or cron jobs migrate that data to the cloud sql for OLAP. There is data duplication, but I could use the migration activity as an opportunity to transform the data in code itself before writing to a more normalized state and clean 'stale' data from the datastore side.... ahh the work I just created for myself...Gamaliel
I actually just came across this, which gives some weight to my earlier comment. https://mcmap.net/q/828614/-complex-queries-using-gae-datastoreGamaliel
If you have a huge amount of data, you could also consider using Big Query developers.google.com/bigquery, more suited to operate on a large set of imported data.Udela
Thanks this is great. It makes me ho-hum a bit about what are the right tools (OLTP = datastore, OLAP = Google Cloud SQL) OR (OLTP = Google Cloud SQL, OLAP = BigQuery)... In any case, I think that optimally I would see OLTP using datastore, daily querying / reporting using GCSQL with a sprinkle of big query for when a report pilfers through some massive data (eg. debit/credit transactional data)...Gamaliel
L
6

What is scare me more in GAE datastore is index number limitation. For example if you need search by some field or sorting - you need +1 index. Totally you can have 200 indexes. If you have entity with 10 searchable fields and you can sort by any field - there will be about 100 combunations. So you need 100 indexes. I have developed few small projects for gae - and this is success stories. But when big one come - this is not for gae.

About cache - you can do it with gae, but they distributed cache works very slow. I prefer to create private single instance of permanent backend with RESTfull API that holds cached values in memory. Frontend instances call this API to get/set values.

Maybe it is posible to build complex system with gae, but this will be a set of small applications/services.

Laundrywoman answered 6/6, 2012 at 6:9 Comment(4)
As described in this article developers.google.com/appengine/articles/indexselection, the new advanced query planner can greatly reduce the number of indexes for complex queries by using zigzag merge join on single property indexes instead of an expensive combinaison of composite indexes.Udela
Yes, read this already. Makes a big difference in terms of limits.Gamaliel
Yes, I have read it, but still by formula that provided if you have search by 1 of 10 fields and sort by 1 of 10 fields you got same about 100 indexes. So if you have few such entities in your project - yes gae can be the one. But some projects contains greate number of such Entities. So I still on my posiyion - gae with datastore is good for small projects. You can build something huge using gae if you distribute it via small projects.Laundrywoman
The index limitation is going away (there's been an announcement). Also costs for DS queries are changing, perhaps going down for most usersMariellemariellen

© 2022 - 2024 — McMap. All rights reserved.