In addition to what Bill says...
Smallest index
InnoDB picks the 'smallest' index for doing COUNT(*). It could be that all of the indexes of communication are bigger than the smallest index of transaction, hence the time difference. When judging the size of an index, include the PRIMARY KEY column(s) with any secondary index:
PRIMARY KEY(id), -- INT (4 bytes)
INDEX(flag), -- TINYINT (1 byte)
INDEX(name), -- VARCHAR(255) (? bytes)
For measuring size, the PRIMARY KEY is big since it includes (due to clustering) all the columns of the table. INDEX(flag) is "5 bytes" (1 for flag plus 4 for the implicit copy of id). INDEX(name) probably averages a few dozen bytes per row. SELECT COUNT(*) will clearly pick INDEX(flag).
Apparently transaction has a 'small' index, but communication does not.
TEXT/BLOB columns are sometimes stored "off-record". Hence, they do not count in the size of the PK index.
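A quick way to check both claims, sketched with placeholder names (a schema called your_db, tables communication and transaction) and assuming persistent index statistics are enabled (the default since 5.6): mysql.innodb_index_stats reports each index's size in pages, and EXPLAIN shows which index the optimizer picks (look at the "key" column).

-- Per-index size; stat_value is in pages, so multiply by the page size:
SELECT index_name,
       stat_value * @@innodb_page_size AS size_bytes
FROM   mysql.innodb_index_stats
WHERE  database_name = 'your_db'          -- placeholder schema name
  AND  table_name IN ('communication', 'transaction')
  AND  stat_name = 'size';

-- Which index each COUNT(*) actually uses (see the "key" column):
EXPLAIN SELECT COUNT(*) FROM communication;
EXPLAIN SELECT COUNT(*) FROM `transaction`;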
Query Cache
If the "Query cache" is turned on, the second running of a query may be immensely faster than the first. But that is only if there were no changes to the table in the mean time. Since any change to the table invalidates all QC entries for that table, the QC is rarely useful in production systems. By "faster" I mean on the order of 0.001 seconds; not 1.44 seconds.
The difference between 1m38s and 1.44s is probably due to what was cached in the buffer_pool -- the general caching area for InnoDB. The first run probably found none of the 'smallest' index in RAM so it did a lot of I/O, taking 98 seconds to fetch all 4.5M rows of that index. The second run found all that data cached in the buffer_pool, so it ran at CPU speed (no I/O), hence much faster.
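You can see this effect in the server counters: Innodb_buffer_pool_reads counts reads that had to go to disk, while Innodb_buffer_pool_read_requests counts logical reads satisfied (mostly) from RAM. Roughly:

SHOW VARIABLES LIKE 'innodb_buffer_pool_size';
-- Run before and after each COUNT(*); a big jump in Innodb_buffer_pool_reads means disk I/O:
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%';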
Good Enough
In situations like this, I question the necessity of doing the COUNT(*) at all. Notice how you said "2.8 mio entries", as if 2 significant digits were "good enough". If you are displaying the count to users on a UI, won't that be "good enough"? If so, one solution to the performance problem is to do the count once a day and store it somewhere. This would allow instantaneous access to a "good enough" value.
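A minimal sketch of the "count once a day" idea, with made-up table/event names (row_counts, refresh_communication_count) and assuming the event scheduler is enabled (event_scheduler=ON; on RDS that is a parameter-group setting):

CREATE TABLE row_counts (
  tbl_name   VARCHAR(64) NOT NULL PRIMARY KEY,
  row_count  BIGINT UNSIGNED NOT NULL,
  counted_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
);

CREATE EVENT refresh_communication_count
  ON SCHEDULE EVERY 1 DAY
  DO REPLACE INTO row_counts (tbl_name, row_count)
     SELECT 'communication', COUNT(*) FROM communication;

-- The UI then reads the stored value instantly:
SELECT row_count, counted_at FROM row_counts WHERE tbl_name = 'communication';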
There are other techniques. One is to keep the counter updated, either with active code, or with some form of Summary Table.
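For the "keep the counter updated" variant, a trigger-based sketch (reusing the hypothetical row_counts table above; note it adds a small cost to every insert/delete on the big table):

CREATE TRIGGER communication_count_ins AFTER INSERT ON communication
FOR EACH ROW
  UPDATE row_counts SET row_count = row_count + 1 WHERE tbl_name = 'communication';

CREATE TRIGGER communication_count_del AFTER DELETE ON communication
FOR EACH ROW
  UPDATE row_counts SET row_count = row_count - 1 WHERE tbl_name = 'communication';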
Throwing hardware at it
You already found that changing the hardware did not help.
- The 98s was as fast as any of RDS's I/O offerings can run.
- The 1.44s was as fast as any one RDS CPU can run.
- MySQL (and its variants) do not use more than one CPU per query.
- You had enough RAM, so the entire 'small' index fit in the buffer_pool and was still there for your second SELECT COUNT(*). (With too little RAM, the second run would also have been very slow.) A quick way to check this is sketched below.
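To confirm the RAM point, compare the buffer_pool size against the size of the index the COUNT(*) scans (same placeholder names as above):

SELECT index_name,
       stat_value * @@innodb_page_size / 1024 / 1024 AS index_mb,
       @@innodb_buffer_pool_size / 1024 / 1024       AS buffer_pool_mb
FROM   mysql.innodb_index_stats
WHERE  database_name = 'your_db'
  AND  table_name = 'transaction'
  AND  stat_name = 'size';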