MySQL number of items within "in clause"
Asked Answered
T

5

76

I have three tables to define users:

USER: user_id (int), username (varchar)
USER_METADATA_FIELD: user_metadata_field_id (int), field_name (varchar)
USER_METADATA: user_metadata_field_id (int), user_id (int), field_value (varchar)

I'd like to create a middle tier user that has certain access to other users within the application. To determine which users the logged in use can access, I am using a subquery like the following:

SELECT user_id FROM user WHERE user_id 
     IN (SELECT user_id 
         FROM user_metadata 
         WHERE user_metadata_field_id = 1 AND field_value = 'foo')

Currently I am storing the subquery string in a variable and then dynamically inserting it into the outer query each time I need to pull a list of users. After doing this I thought, "it has got to be better to just store a string of the actual user_ids".

So instead of storing this in a variable...

$subSql = "SELECT user_id FROM user_metadata WHERE user_metadata_field_id = 1 AND field_value = 'foo'";

... I actually perform the query and store the result like this...

$subSql = "12, 56, 89, 100, 1234, 890";

Then when I need to pull a lit of users that the logged in user has access to, I can do so with:

$sql = "SELECT user_id FROM user WHERE user_id IN ($subSql)";

And finally the questions:

How many items can you use in a MySQL IN CLAUSE? Storing the actual ids instead of the sub-sql statement has got to be faster for performing that outer query each time, right?

Tsunami answered 7/10, 2009 at 15:29 Comment(0)
F
38

Starting from a certain number, the IN tables are faster.

MySQL has something inside its code that makes building a range over a large number of constant values slower than doing the same in a nested loop.

See this article in my blog for performance details:

Flap answered 7/10, 2009 at 15:42 Comment(3)
Hey Quassnoi: there seems to be a lot of disagreement about your performance tests and your assertion that a temporary table is better. I think you may be wrong here.Jehovist
@IcedDante: there are scripts to reproduce everything I wrote about in the blog post. You are free to write your own blog post, with your own scripts, and demonstrate how wrong I am. Otherwise, talks like this are just hot air.Flap
I know this question is old, but your answer did nor answered the real question: "How many items can you use in a MySQL IN CLAUSE?"Belike
K
162

From the manual:

The number of values in the IN list is only limited by the max_allowed_packet value.

Knavish answered 10/1, 2012 at 19:2 Comment(2)
An example of default setting is max_allowed_packet | 1048576 (1 MB)Weak
The default value of max_allowed_packet is 4MBRabinowitz
F
38

Starting from a certain number, the IN tables are faster.

MySQL has something inside its code that makes building a range over a large number of constant values slower than doing the same in a nested loop.

See this article in my blog for performance details:

Flap answered 7/10, 2009 at 15:42 Comment(3)
Hey Quassnoi: there seems to be a lot of disagreement about your performance tests and your assertion that a temporary table is better. I think you may be wrong here.Jehovist
@IcedDante: there are scripts to reproduce everything I wrote about in the blog post. You are free to write your own blog post, with your own scripts, and demonstrate how wrong I am. Otherwise, talks like this are just hot air.Flap
I know this question is old, but your answer did nor answered the real question: "How many items can you use in a MySQL IN CLAUSE?"Belike
D
11

As hinted in Quassnoi's response, one stumbles upon other practical considerations, before hitting any possible limit imposed by a given MySql version's implementation (*). Therefore, as the number of admin users (or other criteria which may require an IN construct) grows, one should seek to use alternatives to a literal "IN", such as the use of temporary (or even permanent) tables.

Since you are considering special handling of the "admin user" criteria, for performance purposes, I'd like to offer an comment and a suggestion.

Comment: Could this be a case of premature optimization?
I'm unaware of the specifics of this database, its volume, complexity etc. And, yes, I am aware of some the performance tribute to be paid to the EAV (Entity-Attribute-Value) format, but I'm thinking that even for successful businesses, the accounts database rarely counts in excess of 10,000 users. So even with very many attributes per user we're still looking at a relatively small EAV table, which may not require this type of optimization. (On the other hand a few other optimization tricks may be welcome in other areas).
Furthermore, typical use cases, involve a relative few inquiries into the account database, relative to other queries, and this is therefore another reason to deffer any non trivial performance consideration for the accounts-related features of the application.

Suggestion: Maybe use "re-normalized attributes"
For attributes that are singled-valued, and in particular if they are short, they can be moved (or duplicated) in the Entity table ('USER' table in this case). This introduces a bit of logic at the time items are inserted or updated, but this sames many joins (or subqueries) and also provides opportunities to consider multi-field indexes to support the most common use cases.

(*) Is there a limt?
I haven't read about any such a limit; I know Oracle has (had) a 1,000 limit at some time, MSSQL doesn't; of course all servers do have a limit based on the overall length of the SQL statement, but this is a really big number! if one ever stumble upon that one, he/she has other problems... ;-)

Diplostemonous answered 7/10, 2009 at 16:16 Comment(0)
C
7

MySQL's IN Clause itself doesn't have such limit. I tried with 8000 elements its work fine for me. Stack overflow error could be of variable declared,

Creamy answered 20/4, 2011 at 7:31 Comment(0)
M
0

If you have more than 1000 values within the IN() clause MariaDB seems to automatically create temporary tables for performance improvement. You can see this using EXPLAIN.

Marilynnmarimba answered 30/1, 2020 at 10:34 Comment(0)

© 2022 - 2025 — McMap. All rights reserved.