If you wrap any long-running query like this:
SELECT COUNT(1) FROM (
<your query here>
) xyz
You'll see that the execution time matches any database tool (DBeaver, etc.) and the mysql CLI.
The complicated part:
The time differential comes from converting the bytes MySQL returns (along with a table definition) into JSON, i.e. JavaScript objects. The more rows returned, the more work has to be done. When we SELECT COUNT(1), the amount of data being sent/parsed/serialized is a couple of bytes, so what you're left with is how long the query itself actually takes from Node.
The differential between running the query directly and the COUNT(1) version is the parse/serialize time.
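Here's a minimal sketch of that measurement using the mysql2 driver (the connection settings and the big_table name are assumptions for illustration):

import mysql from 'mysql2/promise';

async function main() {
  // Assumed connection details.
  const conn = await mysql.createConnection({ host: 'localhost', user: 'root', database: 'test' });

  const query = 'SELECT * FROM big_table'; // hypothetical large result set

  // Direct run: server execution + network transfer + per-row parse/serialize in Node.
  let start = performance.now();
  await conn.query(query);
  console.log(`direct:   ${(performance.now() - start).toFixed(0)} ms`);

  // COUNT(1) wrapper: the server does the same query work, but Node only
  // has to parse a single one-column row.
  start = performance.now();
  await conn.query(`SELECT COUNT(1) FROM (${query}) xyz`);
  console.log(`count(1): ${(performance.now() - start).toFixed(0)} ms`);

  await conn.end();
}

main();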
Database tools normally cheat here: if your query returns 150_000 rows, they'll typically only parse/serialize the first 200 rows, then parse/serialize the next 200 as you scroll, and so on. This doesn't re-run the query, because the bytes for the next 200 rows are already sitting in an in-memory buffer.
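You can approximate the same trick in Node with mysql2's streaming interface. This isn't exactly what GUI tools do internally (they keep the raw bytes around and parse lazily), but it shows rows being parsed incrementally instead of all up front:

import mysql from 'mysql2';

const conn = mysql.createConnection({ host: 'localhost', user: 'root', database: 'test' });

// The callback API exposes the result set as a Node stream,
// so rows are parsed as they're read rather than all at once.
const stream = conn.query('SELECT * FROM big_table').stream(); // hypothetical table

let rows = 0;
stream.on('data', () => {
  rows += 1;
  if (rows % 200 === 0) {
    // Pausing applies backpressure: unread bytes sit in a buffer, unparsed,
    // until we resume (think: the user scrolling to the next page).
    stream.pause();
    setImmediate(() => stream.resume());
  }
});
stream.on('end', () => conn.end());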
After testing mysql2, Prisma, and other tools that implement the MySQL protocol, it's clear that serialization (interop) is a bottleneck in Node compared to other languages. Most likely this is due to under-optimized JavaScript that takes the returned bytes and builds the result objects.
You can confirm this: force MySQL to return a single JSON aggregate of the data (e.g. via JSON_ARRAYAGG), then parse that in Node using any of the common drivers, and you'll see roughly a 500% performance increase (at the cost of the database doing more work).
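A sketch of that approach with MySQL's JSON_ARRAYAGG/JSON_OBJECT (the table and column names are hypothetical):

import mysql from 'mysql2/promise';

async function main() {
  const conn = await mysql.createConnection({ host: 'localhost', user: 'root', database: 'test' });

  // MySQL aggregates the rows into one JSON array server-side, so Node
  // receives a single cell instead of 150_000 individually encoded rows.
  const [rows] = await conn.query(
    `SELECT JSON_ARRAYAGG(JSON_OBJECT('id', t.id, 'name', t.name)) AS data
     FROM big_table t`
  );

  // One native JSON.parse replaces the driver's per-row, per-column JS decoding.
  // (Depending on driver/config, JSON columns may already arrive parsed.)
  const raw = (rows as any[])[0].data;
  const data = typeof raw === 'string' ? JSON.parse(raw) : raw;
  console.log(data.length);

  await conn.end();
}

main();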
This is because the serialization process on the Node.js side is a lot simpler; the algorithm becomes:
bytes are a UTF-8 JSON string -> decode to a JS string -> JSON.parse(string) [native]
This is much faster than:
bytes encode numbers/strings/floats/etc. -> iterate through the bytes and build JS object literals manually (basically a hand-rolled version of bytes -> JSON.parse, except the parsing runs in JavaScript, not native code).
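A toy comparison of the two shapes of work (this is not the real wire protocol, and the numbers vary by machine; the point is that path 1 spends its time in native code while path 2 does every value conversion in JavaScript):

// Build a fake 150_000-row payload.
const N = 150_000;
const data = Array.from({ length: N }, (_, i) => ({ id: i, name: `row${i}`, price: i * 1.5 }));

// Path 1: one UTF-8 JSON payload, decoded then handed to native JSON.parse.
const jsonBuf = Buffer.from(JSON.stringify(data), 'utf8');
let t = performance.now();
JSON.parse(jsonBuf.toString('utf8'));
console.log('native JSON.parse:', (performance.now() - t).toFixed(1), 'ms');

// Path 2: a stand-in for a driver's decode loop: scan a delimited buffer
// and convert every value to the right JS type by hand.
const textBuf = Buffer.from(data.map(r => `${r.id}\t${r.name}\t${r.price}`).join('\n'), 'utf8');
t = performance.now();
textBuf.toString('utf8').split('\n').map(line => {
  const [id, name, price] = line.split('\t');
  return { id: Number(id), name, price: Number(price) };
});
console.log('manual JS decode :', (performance.now() - t).toFixed(1), 'ms');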
As a community, I think there's a serious challenge here for the Node.js ecosystem: we need to demand faster MySQL parsing/serialization. Other backend languages are eclipsing us in performance; I'm seeing differences as high as 25 seconds for 150_000 rows.
Debating the use cases for returning this much data is beside the point: if Java and Rust can make it fast, so can we.