Appengine Search API vs Datastore
J

4

21

I am trying to decide whether I should use the App Engine Search API or the Datastore for an App Engine Connected Android Project. The only distinction that the Google documentation makes is:

... an index search can find no more than 10,000 matching documents. The App Engine Datastore may be more appropriate for applications that need to retrieve very large result sets.

Given that I am already very familiar with the Datastore, and assuming I don't need 10,000 results, can someone please help me with the following?

  • Are there any advantages to using the Search API versus using Datastore for my queries (per the quote above, it seems sensible to use one or the other)? In my case the end user must be able to search, update existing entries, and create new entities. For example if my app is a bookstore, the user must be able to add new books, add reviews to existing books, search for a specific book.
  • My data structure is such that the content will be supplied by the end user. Document vs Datastore entity: which is cheaper to update (in dollars, etc.)?
  • Can they supplement each other: Datastore and Search API? What's the advantage? Why would someone consider pairing the two? What's the catch/cost?
Jarrad answered 26/4, 2014 at 22:41 Comment(1)
This is a great question, but the selected answer is subpar. I will upvote the question, but it needs a better answer that addresses the points raised in it. - Riddance
W
7

The key difference is that with the Datastore you cannot search inside entities. If you have a book called "War and Peace", you cannot find it if a user types "war peace" in a search box. The same goes for reviews, etc. Therefore, the Datastore on its own is not really an option for you.
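
To make the "war peace" example concrete, here is a minimal sketch in plain Python. It does not call the real Datastore or Search API; `BOOKS`, `equality_filter`, and `full_text_match` are hypothetical stand-ins that only mimic the two matching styles.

```python
# Hypothetical in-memory stand-in for Datastore entities; not the real API.
BOOKS = [
    {"id": 1, "title": "War and Peace"},
    {"id": 2, "title": "Peace Talks"},
]


def equality_filter(books, title):
    """Mimics a Datastore-style equality filter: the whole value must match."""
    return [b["id"] for b in books if b["title"] == title]


def full_text_match(books, query):
    """Mimics full-text search: every query word must appear in the title."""
    terms = set(query.lower().split())
    return [b["id"] for b in books if terms <= set(b["title"].lower().split())]


print(equality_filter(BOOKS, "war peace"))      # []
print(equality_filter(BOOKS, "War and Peace"))  # [1]
print(full_text_match(BOOKS, "war peace"))      # [1]
```

The equality filter only finds the book when the user types the stored title exactly, which is rarely what a search box receives.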

Wolford answered 26/4, 2014 at 23:26 Comment(5)
More precisely, in the Datastore you can't search by 'contains', so you can't find the example here with a two-word search. There are other limitations in the Datastore, like restricting inequality filters to a single property per query. - Phytoplankton
Thank you so much for the answers. This really helps a lot. So shall I assume that otherwise I can use the Search API and documents instead of the Datastore to store my data? I.e. am I to understand that the only advantage of the Datastore is the 10,000 limit? Otherwise, Search API documents can do anything that the Datastore can do? - Jarrad
You still need the Datastore. This is where you store your data, like book id/ISBN, author, price, category, etc. You can use the Search API to store book titles and reviews, but you need to link these records to entities in the Datastore. - Wolford
The Datastore also has transactions. If you don't need text search you would normally never use the Search API. - Ferreous
Cloud Datastore queries do not support substring matches, case-insensitive matches, or so-called full-text search. The NOT, OR, and != operators are not natively supported, but some client libraries may add support on top of Cloud Datastore. - Prosthodontics
V
18

Some other info:

  1. The Datastore is a transactional system, which is important in many use cases. The Search API is not. For example, you can't put and delete a document in a search index in a single transaction.
  2. The Datastore has a lot in common with a NoSQL DB like Cassandra, while the Search API is really a textual search engine, very similar to something like Lucene. If you understand how a reverse (inverted) index works, you'll get a better understanding of how the Search API works.
  3. A very good reason to combine usage of the Datastore API and the Search API is that the Datastore makes it very difficult to do some types of queries (e.g. free text queries, geospatial queries) that the Search API handles very easily. Thus, you could store your main entities in the Datastore, but then use the Search API if you need to search in ways the Datastore doesn't allow. Down the road, I think it would be great if the Datastore and Search API were more tightly integrated, for example by letting you do free text search against indexed Text fields, where App Engine would automatically create a search Document Index behind the scenes for you.
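
Points 2 and 3 can be illustrated with a toy inverted index in plain Python. This is only a sketch of the idea, not how the Search API is implemented; the `entities` dict, the keys, and both function names are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical "Datastore": entity key -> entity. Names are illustrative only.
entities = {
    "book-1": {"title": "War and Peace", "price": 12.5},
    "book-2": {"title": "Peace Talks", "price": 9.0},
    "book-3": {"title": "The Art of War", "price": 7.0},
}


def build_inverted_index(entities):
    """Map each lowercase title word to the set of entity keys containing it."""
    index = defaultdict(set)
    for key, entity in entities.items():
        for token in entity["title"].lower().split():
            index[token].add(key)
    return index


def search(index, query):
    """Return keys whose titles contain every query word (AND semantics)."""
    postings = [index.get(token, set()) for token in query.lower().split()]
    if not postings:
        return []
    return sorted(set.intersection(*postings))


index = build_inverted_index(entities)
keys = search(index, "war peace")
# Search yields keys only; the full entity still comes from the "Datastore".
results = [entities[k] for k in keys]
print(keys)     # ['book-1']
print(results)  # [{'title': 'War and Peace', 'price': 12.5}]
```

The search side only needs to hold the searchable text plus a key, while the authoritative record (price, author, and so on) stays in the entity store and is fetched by key.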
Vermiform answered 27/4, 2014 at 23:55 Comment(0)
P
2

The most serious con of Search API is Eventual Consistency as stated here: https://developers.google.com/appengine/docs/java/search/#Java_Consistency

It means that when you add or update a record with the Search API, searches may not reflect the change immediately. Imagine a case where a user uploads a book or updates their account settings, and nothing appears to change because the update hasn't reached all servers yet.

I think Search API is only good for one thing: Search. It basically acts as a search engine for your data in Datastore.

So my advice is to keep data for which the user expects an immediate result in the Datastore, and to use the Search API to search data for which the user won't expect an immediate result.
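
The effect described above can be modeled with a toy index whose writes only become visible after an explicit propagation step. This is a simulation under stated assumptions, not the Search API: `EventuallyConsistentIndex` and `apply_pending` are hypothetical names, and a real service propagates on its own schedule rather than on demand.

```python
class EventuallyConsistentIndex:
    """Toy index whose writes become visible only after apply_pending()."""

    def __init__(self):
        self._visible = {}   # doc_id -> text that searches can see
        self._pending = []   # writes accepted but not yet propagated

    def put(self, doc_id, text):
        self._pending.append((doc_id, text))

    def apply_pending(self):
        """Stand-in for the propagation delay inside a real service."""
        for doc_id, text in self._pending:
            self._visible[doc_id] = text
        self._pending.clear()

    def search(self, word):
        word = word.lower()
        return sorted(d for d, t in self._visible.items()
                      if word in t.lower().split())


datastore = {}   # stand-in for a store that is readable by key right away
index = EventuallyConsistentIndex()

datastore["book-1"] = {"title": "War and Peace"}
index.put("book-1", "War and Peace")

print(datastore["book-1"])    # the write is readable by key immediately
print(index.search("war"))    # [] until the index catches up
index.apply_pending()
print(index.search("war"))    # ['book-1']
```

A page that must show the user's own change right after saving would read the entity by key, and leave the search results to catch up.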

Preemie answered 2/9, 2014 at 1:34 Comment(0)
S
0

The Datastore provides only a few query operators (=, !=, <, >). Nested filters and multiple inequalities are either costly or impossible (timeouts), and search results may include a lot of false positives. You can do partial string search by tokenizing, but this will bloat your entities. The best way around these limitations is to use Structured Properties and/or Ancestor Queries.
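
As a rough illustration of why tokenizing bloats an entity, the sketch below generates the substring tokens that would have to be stored alongside a title to support partial matches. `substring_tokens` and the minimum length of 3 are assumptions for the example, not part of any Datastore API.

```python
def substring_tokens(text, min_len=3):
    """All distinct lowercase substrings of each word with length >= min_len."""
    tokens = set()
    for word in text.lower().split():
        for start in range(len(word)):
            for end in range(start + min_len, len(word) + 1):
                tokens.add(word[start:end])
    return tokens


title = "War and Peace"
tokens = substring_tokens(title)
print(sorted(tokens))
# ['ace', 'and', 'eac', 'eace', 'pea', 'peac', 'peace', 'war']
print(len(tokens))       # 8
print("peac" in tokens)  # True: an equality filter on the token list would hit
```

With a minimum length of 3, a single word of length n contributes up to (n-2)(n-1)/2 tokens, so long titles or review text multiply the stored values quickly.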

The Search API, on the other hand, runs a full-text search on search documents, which is faster and more accurate than NDB queries and does not rely on tokenized data. The downside is that it relies on the documents being kept up to date.

Use the Datastore to process your data (create, update, delete), then run a function to put the data into the Search API as documents, grouped into indexes, and run the searches using the Search API.
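
The workflow above might be sketched as follows, with a dict standing in for the Datastore and a small inverted index standing in for the Search API. The `Bookstore` class and its methods are hypothetical; in the real services the entity write and the document write are separate calls that do not share a transaction, so this is a model of the bookkeeping only.

```python
from collections import defaultdict


class Bookstore:
    """Toy model: a dict plays the Datastore, an inverted index plays Search."""

    def __init__(self):
        self.entities = {}             # source of truth
        self.index = defaultdict(set)  # token -> book ids

    def _unindex(self, book_id):
        """Remove the tokens of the currently stored version, if any."""
        old = self.entities.get(book_id)
        if old:
            for token in old["title"].lower().split():
                self.index[token].discard(book_id)

    def put(self, book_id, title):
        """Create or update: replace the entity and re-index its document."""
        self._unindex(book_id)
        self.entities[book_id] = {"title": title}
        for token in title.lower().split():
            self.index[token].add(book_id)

    def delete(self, book_id):
        self._unindex(book_id)
        self.entities.pop(book_id, None)

    def search(self, query):
        hits = [self.index.get(t, set()) for t in query.lower().split()]
        return sorted(set.intersection(*hits)) if hits else []


store = Bookstore()
store.put("b1", "War and Peace")
print(store.search("war peace"))  # ['b1']
store.put("b1", "Anna Karenina")  # an update must also drop the stale tokens
print(store.search("war peace"))  # []
print(store.search("anna"))       # ['b1']
store.delete("b1")
print(store.search("anna"))       # []
```

Skipping the re-index step on update or delete is what produces stale search hits, which is the "staying up to date" cost mentioned above.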

Standfast answered 14/6, 2017 at 3:3 Comment(0)
