How to implement a faster Search As You Type (SAYT) API in a Rails 3 application

I have millions of records in my MySQL database. I implemented a standard REST API for an iPhone app in Rails 3, but the SAYT feature's response is really slow: searching the database and returning the results takes a long time. How can I improve performance?

I have indexed my tables. What else should I look into, such as MySQL tuning, or should I use Thinking Sphinx or Sunspot? Will that help? Please help me with your expert advice.

Prajna answered 24/2, 2012 at 6:12 Comment(2)
Could you share a little more information? How many columns are you searching? What types of fields? Can you share the query you're using now? You should definitely look into a full-text search engine; Elasticsearch is another good one: railscasts.com/episodes?search=search – Theone
How slow is slow? Can you post a gist of the Rails development log for a search request? The answer to your question depends a lot on whether the most time is spent querying the database, returning the results, both, or something else. Profile before changing things! – Caldwell

I agree with the general answer: use a search engine like Sphinx (and limit the number of results returned); they are designed to do exactly what you want.

However, while millions of records may sound like a lot, you should first determine what is actually taking the time. I have major love for Sphinx and Thinking Sphinx -- they take what is a rather complex process and make it pretty simple and easy. But, in the end, a search engine is another system to manage, configure, learn and know. If you don't have to go there, it's easier not to, right?

It might be the query, or it might be the time spent returning the data (LIMIT is your friend!). Or it might be that you're getting hundreds of requests per second, perhaps because the delay on auto-complete is too short -- if a lookup fires on every keystroke, fast typists or multiple users can easily overrun the server with queries that provide no value to the user.

Watch the Rails logs and see what's really going on. If the slowness comes from running a complicated full-text search in SQL, then, yeah, that's going to be slow and Sphinx is going to be worth the effort. Your database has an EXPLAIN tool that, with some work, can help you understand what the database is doing to produce the result. It's not uncommon for an index to go unused.

What about caching? Memcached is a fantastic tool. Or even just raising the buffer size settings for the database may let it use more memory for caching.
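To make the caching idea concrete, here is a minimal cache-aside sketch in plain Ruby. `Rails.cache` backed by memcached follows the same fetch pattern; everything here (`CACHE`, `WIDGETS`, `slow_search`) is an illustrative stand-in, not code from the app in question.

```ruby
# In-memory stand-in for a cache store like memcached.
CACHE = {}

# Illustrative data set; pretend this lives in MySQL.
WIDGETS = %w[apple apricot banana berry cherry]

# Pretend this is the expensive database hit.
def slow_search(term)
  WIDGETS.select { |w| w.start_with?(term) }.first(10)
end

# Compute once per term, then serve repeats from the cache.
def cached_search(term)
  CACHE[term] ||= slow_search(term)
end

cached_search("ap")  # first call hits slow_search
cached_search("ap")  # second call is served from CACHE
```

In a Rails app the equivalent would be something like `Rails.cache.fetch("sayt/#{term}", expires_in: 5.minutes) { slow_search(term) }`, assuming a memcached-backed cache store is configured.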

Footstall answered 14/3, 2012 at 19:28 Comment(0)

I am not sure what you mean by a faster search, but it is good practice to limit your results to 100, for usability's sake: not many users will go through 100 records for their search.

To achieve such a search, I suggest adding a keyword table. It should hold each keyword, the record id it is associated with, and how many times that keyword has been searched in the database.

That will help you return the top hundred records and the most relevant matches.
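A plain-Ruby sketch of the keyword-table idea above: each keyword maps to a record id and a hit count, and matches are ordered by popularity. The `Keyword` struct, the sample data, and the field names are all illustrative assumptions, not the answerer's schema.

```ruby
# One row per (keyword, record) pair, with a popularity counter.
Keyword = Struct.new(:word, :record_id, :hits)

KEYWORDS = [
  Keyword.new("rails", 1, 120),
  Keyword.new("rake",  2, 45),
  Keyword.new("rack",  3, 300),
]

# Prefix-match the typed term, rank by hit count, cap the result size.
def top_matches(prefix, limit = 100)
  KEYWORDS.select { |k| k.word.start_with?(prefix) }
          .sort_by { |k| -k.hits }
          .first(limit)
          .map(&:record_id)
end

top_matches("ra")  # => [3, 1, 2]
```

In SQL this would be a `WHERE keyword LIKE 'ra%' ORDER BY hits DESC LIMIT 100` against the keyword table, with an index on the keyword column.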

There are also concurrent approaches such as MapReduce, but I don't think your mobile device's technology can handle MapReduce.

Portulaca answered 25/2, 2012 at 18:24 Comment(1)
For the first page of results on a mobile device, I'd say 5-10 results is plenty, even. – Caldwell

I would also recommend using a full text search engine like Sphinx.

There is a good screencast about using Sphinx with Rails via the thinking_sphinx gem:

Railscast thinking_sphinx gem

With that gem you can also influence the ranking of results, e.g. by adding field weights:

Thinking Sphinx documentation
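As a hedged sketch of what a weighted Thinking Sphinx search call can look like (the model, field names, and weight values here are assumptions, not from the screencast):

```ruby
# Boost matches in `name` over matches in `description`, and keep the
# page size small for a mobile client. Illustrative only.
Widget.search params[:q],
              field_weights: { name: 10, description: 1 },
              per_page: 10
```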

Since it is a mobile device, I would also keep the number of results sent to the device to a minimum, as madi already mentioned.

Have fun

Palladian answered 1/3, 2012 at 17:12 Comment(0)

For searching millions of records quickly, you're probably going to want a trie-type data structure. http://en.wikipedia.org/wiki/Trie has Ruby sample code if you'd like some help.

Simplified a bit, a trie is a highly storage-efficient way of tracking which children belong to which initial sequence of characters.

Essentially, your SAYT tech would take a string in, and return the top 15 or so results off the trie's entry for that string.

Depending on how self-similar your rows are, this will affect your RAM usage, of course.
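The idea above can be sketched in a few dozen lines of plain Ruby: insert words, then collect up to a fixed number of completions for a typed prefix. This is a minimal illustration of the technique, not a tuned implementation.

```ruby
# A minimal prefix trie: nested hashes keyed by character, with :end
# marking a complete word.
class Trie
  def initialize
    @root = {}
  end

  def insert(word)
    node = @root
    word.each_char { |c| node = (node[c] ||= {}) }
    node[:end] = true
  end

  # Return up to `limit` words that start with `prefix`.
  def complete(prefix, limit = 15)
    node = @root
    prefix.each_char do |c|
      node = node[c] or return []   # no entry for this prefix
    end
    results = []
    collect(node, prefix, results, limit)
    results
  end

  private

  # Depth-first walk gathering words until the limit is reached.
  def collect(node, word, results, limit)
    return if results.size >= limit
    results << word if node[:end]
    node.each do |char, child|
      next if char == :end
      collect(child, word + char, results, limit)
    end
  end
end

trie = Trie.new
%w[car card care cart cat dog].each { |w| trie.insert(w) }
trie.complete("car")  # => ["car", "card", "care", "cart"]
```

For SAYT, you would build the trie once from your searchable column and serve each keystroke's lookup entirely from memory.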

Lumbye answered 13/3, 2012 at 17:50 Comment(0)

Depending on what query you make, LIKE queries that match at the beginning of the column may use indices (in Postgres I know for sure that they do; in MySQL I'm not sure).

So,

Widget.where('name LIKE ?', "#{search_term}%").all

will use a database index (at least in Postgres), while

Widget.where('name LIKE ?', "%#{search_term}%").all

will not. Note the % at the beginning of the second search term. Your mileage may also vary between ILIKE (case-insensitive) and LIKE (case-sensitive) conditions. Read the database docs; this may be the lowest-hanging fruit.
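One caveat when interpolating user input into a LIKE pattern: `?` binding escapes quotes, but not the LIKE wildcards `%` and `_`, so a user typing `50%` would match far more than intended. A small helper like this (illustrative, not from the answer; newer Rails versions ship `sanitize_sql_like` for the same purpose) escapes them before appending the trailing `%`:

```ruby
# Escape LIKE wildcards in user input, then build a prefix pattern.
# Assumes the database uses backslash as the LIKE escape character
# (the default in MySQL and Postgres).
def prefix_pattern(term)
  escaped = term.gsub(/[\\%_]/) { |c| "\\#{c}" }
  "#{escaped}%"
end

prefix_pattern("50%")  # => "50\\%%"
# Usage: Widget.where('name LIKE ?', prefix_pattern(params[:q])).limit(10)
```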

A search engine, as proposed by another reply, is another option. If you're deployed on Heroku, there are some cloud-search add-ons that are pretty easy to integrate, but it's probably still an order of magnitude more work than tuning your queries a bit.

Scever answered 11/3, 2012 at 5:15 Comment(0)

You could make a table of the most-searched terms and prioritize those in your search. Hope this helps.

Kahler answered 14/3, 2012 at 13:34 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.