From a performance perspective, how efficient is it to use a MySQL temporary table for a highly used website feature?
Asked Answered
C

3

18

I'm attempting to write a search functionality for a website, and I've decided upon an approach of using MySQL temporary tables to handle the data input, via the query below:

CREATE TEMPORARY TABLE `patternmatch`
  (`pattern` VARCHAR(".strlen($queryLengthHere)."))

INSERT INTO `patternmatch` VALUES ".$someValues

Where $someValues is a set of data with the layout ('some', 'search', 'query') - or basically what the user searched. I then search my main table images based on the data within table patternmatch like so:

SELECT images.* FROM images JOIN patternmatch ON (images.name LIKE patternmatch.pattern)

I then apply a heuristic or scoring system based on how well each result matched the input and display the results by that heuristic etc.

What I'm wondering is how much overhead does creating a temporary table require? I understand that they only exist in session, and are dropped as soon as the session is ended, but if I have hundreds of thousands of searches per second, what sort of performance issues might I encounter? Is there any better way of implementing a search functionality?

Castigate answered 18/9, 2013 at 6:39 Comment(5)
If you have hundreds of thousands of searches per second then where the table is stored will only be a small part of your problems. You should plan for a more realistic scenario. Using a normal table with appropriate indices will take you a long way.Brebner
I don't think you're understanding me. A regular table called images containing all possible results is already properly set up and indexed. I'm asking whether a session-specific temporary table to handle the search input by the user (created on a search, dropped on the end of a session) is an appropriate way of handling a search functionality.Castigate
While I can't comment much on your existing approach (seems fine to me), you're probably looking for Redis. Here some links to see if this is what you need: Redis benchmark Why use Redis with MySQL User cases with Redis - Not sure if this fits the bill but you should have a look if you are concerned with database performance.Eloign
An important question, is why? Why do you want to create this table? Seems like a lot of overhead, for very little (if any) benefit. Why not just put the pattern(s) directly in the sql query on the images table.Capuchin
To evaluate performance you should add some quantitative data to your question. Like how many rows your images table has, how many patterns there will be in each query, what's the allowed format of the patterns (where will the %s be), how many matches do you expect for each query.Canonize
H
7

What you stated is totally correct, the temporary table will only be visible to the current user/connection. Still, there is some overhead and some other problems such as:

  • For each of the thousands of searches you are going to create and fill that table (and drop it later) - not per user, per search. Because each search most likely will re-execute the script, and "per session" does not mean PHP session - it means database session (open connection).
  • You will need the CREATE TEMPORARY TABLES privilege, which you might not have.
  • Still, that table really should have MEMORY type, which steals your RAM more than it looks like. Because even having VARCHAR, MEMORY tables use fixed length row-storage.
  • If your heuristics later need to refer to that table twice (like SELECT xyz FROM patternmatch AS pm1, patternmatch AS pm2 ...) - this is not possible with MEMORY tables.

Next, it would be easier for you - and also for the database - to add the LIKE '%xyz%' directly to your images tables WHERE clause. It will do the same without the overhead of creating a TEMP TABLE and joining it.

In any case - no matter which way you go - that WHERE will be horribly slow. Even if you add an index on images.name you most likely will need LIKE '%xyz%' instead of LIKE 'xyz%', so that index will not get used.

I'm asking whether a session-specific temporary table to handle the search input by the user (created on a search, dropped on the end of a session) is an appropriate way of handling a search functionality.

No. :)

Alternative options

MySQL has a build-in Fulltext-Search (since 5.6 also for InnoDB) that even can give you that scoring: I highly recommend giving it a read and a try. You can be sure that the database knows better than you how to do that search efficiently.

If you are going to use MyISAM instead of InnoDB, be aware of the often overlooked limitation that FULLTEXT searches only return anything if the number of results is less than 50% of the total table rows.

Other things that you might want to look at, are for example Solr (Nice introduction read to that topic itself would be the beginning of http://en.wikipedia.org/wiki/Apache_Solr ). We are using it in our company and it does a great job, but it requires quite some learning.

Summary

The solution to your current problem itself (the search) is to use the FULLTEXT capabilities.

If I have hundreds of thousands of searches per second, what sort of performance issues might I encounter? Is there any better way of implementing a search functionality?

To give you a number, 10.000 calls per second is not "trivial" already - with hundreds of thousands of searches per second the sort of performance issues you will encounter are everywhere in your set-up. You are going to need a couple of servers, load balancing and tons of other amazing tech crap. And one of this will be for example Solr ;)

Has answered 24/9, 2013 at 18:56 Comment(1)
Thank you very much for your answer! I will investigate the options you have mentioned...Castigate
V
3
  1. Creating temporary tables on disk is relatively expensive. In your scenario it sounds like it'll be slower than it's worth.
  2. It's usually only worthwhile to create temporary tables in memory. But you need to know you have enough memory available at all times. If you plan to support so many searches per second this is not a good solution.
  3. MySQL has full-text searching built-in. It's good for small systems. This would likely perform far better than your temp table and JOIN. But if you want to support thousands of searches per second I would not recommend it. It could consume too much of your overall database performance. Plus you're then forced to use MyISAM for storage which might have its own issues in your scenario.
  4. For so many searches you'll want to offload the work to another system. Plenty of searching systems with scoring already exist. Take a look at ElasticSearch, Solr/Lucene, Redis, etc.
Veii answered 23/9, 2013 at 21:58 Comment(0)
D
1

From the code you give, I really don't think tmp tables are needed, nor is FULLTEXT searching. But ... about tmp table performance:

The creation/cleanup of the tmp table is not written to transaction logs, so it will be relatively quick for the OS to do the I/O involved. If the temporary tables will be small and short-lived, and you have lots of buffers available for the OS, the disk realistically wont even be touched. If you think it will be anyways, get an SSD drive, and get more RAM.

But if you are realistic that you are looking at hundreds of thousands of searches per second then you have a big engineering project on hand. Why not just do:

select images.* from images where name in ('some', 'search', 'query')

?

Delacroix answered 28/9, 2013 at 4:33 Comment(1)
Because in that case he should use stuff like Solr or Sphinx ;) Try to do that number of different searches on DB servers in production that are on your hand - AND part of an affordable architectureHas

© 2022 - 2024 — McMap. All rights reserved.