Is DataSnap Optimized for responding to more than 1k users at the same time?

Asked 3/11, 2012 at 17:24 Answered 4/11, 2012 at 11:37

Solved delphi firebird datasnap multi-tier delphi-xe3

We want to start a large multi-tier application. The server-side application must respond to more than 1000 users at the same time. We plan to build the server with the 64-bit compiler and the client with the 32-bit compiler. Can DataSnap serve all of these clients without problems? The server machine is very powerful (multi-processor, more than 16 GB of RAM), and the database management system is Firebird 2.5.

Balkin answered 3/11, 2012 at 17:24 Comment(6)

This seems to be a very open-ended question, and may bring some debate... – Reece 3/11, 2012 at 17:59

My estimate is that this system will need ten or more servers... – Teage 3/11, 2012 at 19:48

@mjn: ten or more? really?! with this limited information, how can you jump to a conclusion like that? – House 4/11, 2012 at 2:6

1K concurrent users performing what amount of work each? You have to be much more precise to get a useful response. – Blake 4/11, 2012 at 4:19

@Blake That's exactly my view. If clients will be fetching only one short string value from the server, surely 1 server can handle 1k concurrent clients. But if clients are streaming video from your server, you'll want to limit it to 100 or so. It also depends on how powerful the server is and how much bandwidth can be transferred at a time. – Reece 4/11, 2012 at 17:36

@JerryDodge sure, I could say with modern hardware, usually the bandwidth becomes a bottleneck before the processing power, but again, it depends. – Blake 4/11, 2012 at 20:8

You need a way to perform realistic load tests.

For the Firebird database, you can simulate concurrent users with the free Apache JMeter tool. It can run SQL statements and record their execution time statistics (average, min/max etc.). So you could, for example, create a thread group with twenty different SQL queries, and then run twenty threads, each of which performs these queries sequentially.

JMeter lets you define a time limit for each SQL query and treats the query as an error if it exceeds this limit. You can then look for the maximum client count at which the overall error rate is still below (for example) five percent.
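The error-rate search described above can be sketched outside JMeter as well. This is only an illustration: the function names, the 1000 ms limit, and the sample durations are made up, and the code assumes you have exported per-query durations for each tested client count.

```python
def error_rate(durations_ms, limit_ms):
    """Fraction of queries whose duration exceeded the time limit."""
    if not durations_ms:
        return 0.0
    failures = sum(1 for d in durations_ms if d > limit_ms)
    return failures / len(durations_ms)


def max_clients_within(results_by_clients, limit_ms, max_error_rate=0.05):
    """Largest tested client count whose error rate is below the threshold.

    results_by_clients maps a client count to its measured durations.
    Returns None if no tested client count qualifies.
    """
    ok = [n for n, durations in results_by_clients.items()
          if error_rate(durations, limit_ms) < max_error_rate]
    return max(ok) if ok else None


# Made-up measurements in milliseconds, for illustration only.
sample = {
    10: [200] * 100,                # no query over the limit
    20: [200] * 98 + [1500] * 2,    # 2% over the limit
    40: [400] * 90 + [1500] * 10,   # 10% over the limit
}

best = max_clients_within(sample, limit_ms=1000)
print(best)
```

With these invented numbers, 20 clients is the largest tested count that stays under the five percent threshold.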

But you also need to know how high the expected database load will be, and you will need a test database of realistic size, not just a couple of records. Also, some database queries, such as reports, might cause higher load. These should be included in the simulation too, as they can affect overall performance. In JMeter, you can create a second thread group, running in parallel with the first one, for these long-running statements with different settings (fewer simulated clients).

Testing the database will show if there is a bottleneck already in this area. For example, the test result could be that the database can serve twenty clients with a total average transaction rate of 20 TPS (transactions per second), which means one client executes one transaction per second. But this TPS value will decrease with higher user count.
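The TPS arithmetic above, written out as a small sketch. The function names are made up, and the 1000-user line assumes each user keeps issuing one transaction per second, which the question does not state.

```python
def per_client_tps(total_tps, clients):
    """Average transactions per second that each client gets."""
    return total_tps / clients


def required_total_tps(clients, tps_per_client):
    """Total throughput the database must sustain for a given client count."""
    return clients * tps_per_client


# The example from the text: 20 clients sharing 20 TPS.
print(per_client_tps(20, 20))

# Assumption for illustration: 1000 users, each at one transaction per second.
print(required_total_tps(1000, 1))
```

Comparing the required total against the measured total at each client count shows how far the database is from the target load.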

Related question: Firebird usage in big projects which also has a link to http://www.firebirdsql.org/en/case-studies-catalog/

Regarding DataSnap client load simulation: this can be done with a scripted client, which runs a predefined set of statements / commands over the connection. To run a high number of load test clients simultaneously, you could use a service like Amazon Elastic Compute Cloud (EC2) to launch clones of your test machine image, saving you hardware costs. But of course I would start with a small client machine which simply runs ten or twenty scripted clients.
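A minimal sketch of the "ten or twenty scripted clients" idea, in Python rather than Delphi. `fake_server_call` is a hypothetical stand-in that only sleeps; a real test would replace it with an actual request to the DataSnap server. The command names are invented.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_server_call(command):
    """Hypothetical stand-in for one server request; replace with a real call."""
    time.sleep(0.001)  # pretend the server needed about 1 ms
    return f"ok:{command}"


def run_scripted_client(client_id, script, call=fake_server_call):
    """Run each command of the script once; return per-command durations in seconds."""
    durations = []
    for command in script:
        start = time.perf_counter()
        call(command)
        durations.append(time.perf_counter() - start)
    return durations


def run_load_test(client_count, script, call=fake_server_call):
    """Run client_count scripted clients in parallel threads and collect all durations."""
    with ThreadPoolExecutor(max_workers=client_count) as pool:
        futures = [pool.submit(run_scripted_client, i, script, call)
                   for i in range(client_count)]
        return [d for f in futures for d in f.result()]


if __name__ == "__main__":
    script = ["login", "get_customer", "update_order", "logout"]  # invented commands
    durations = run_load_test(10, script)
    print(len(durations), "requests, slowest:", max(durations), "s")
```

The collected durations can then be checked against a time limit to get an error rate per client count.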

Teage answered 3/11, 2012 at 20:1 Comment(2)

Overall good advice. A note on EC2 though: it's an interesting option for companies that don't have the knowledge or the time to maintain their own servers, but in the long run it's far more expensive than hosting your own servers. – House 4/11, 2012 at 2:18

@WoutervanNifterick I edited the text to make clear EC2 was my recommendation for running a high number of load test clients only for a short time period, which is not very expensive if the test takes only a couple of hours. – Teage 4/11, 2012 at 10:28

As far as I know, DataSnap is based on Indy, and Indy's connection handling model is not very scalable: one thread per connection, which is very resource-consuming. Even using Indy's thread pools is not an option, I think... Also, in Windows (32 bit), for example, there is a limit on the maximum number of threads you can create (2000 IIRC). Anyway, using many threads is not good and hurts server performance (for reference, see the Windows Internals book, the Windows Performance Team blog etc.)
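As a rough check on the 2000 figure: it matches simple address-space arithmetic rather than a hard limit. The sketch below assumes a 2 GB user address space for a 32-bit Windows process and a 1 MB stack reservation per thread; both values are assumptions and not taken from the answer. It is an upper bound, since code, heap, and DLLs also need address space.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2
KIB = 1024


def max_threads_by_address_space(user_address_space_bytes, stack_reserve_bytes):
    """Upper bound on threads if every thread reserves a full stack."""
    return user_address_space_bytes // stack_reserve_bytes


# Assumed defaults: 2 GB user address space, 1 MB stack reservation per thread.
print(max_threads_by_address_space(2 * GIB, 1 * MIB))

# A smaller stack reservation raises the bound accordingly.
print(max_threads_by_address_space(2 * GIB, 256 * KIB))
```

Under these assumptions the bound is 2048 threads with 1 MB stacks and 8192 with 256 KB stacks, so the practical ceiling depends on the stack size the server configures.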

A scalable, robust and professional application server would use IO Completion ports (IOCP) for data processing. But I don't know if DataSnap can take advantage of it.

UPDATE: At CodeRage7 I asked similar scalability questions. Here are the answers:

Q: Recently there was a question on StackOverflow about DataSnap's scalability/performance. So can DS handle, for example, 2000 or more concurrent user requests at the network and application level?

A: the scalability is based on scalability of TCP/HTTP/HTTPs and # of connections allowed in your server operating system. Also based on memory and hardware you employ. There is no specific limit in DataSnap.

My comment: While this is true, Indy's connection handling model, i.e. one thread per connection, introduces a bottleneck, especially in 32-bit Windows (2000 threads max). In Win64 it should not be as much of a problem, but again, handling data flow this way leads to performance degradation.

Q: Does DataSnap support some kind of load balancing?

A: Not directly. You can do this in code in your DataSnap server(s).

My comment: I've found a very good paper on implementing Failover/Load Balancing in DataSnap on Andreano Lanusse's blog

Q: Does DataSnap support IO Completion ports for better scalability?

This question of mine was left unanswered.

Hope this helps!

UPDATE2:

I found a very interesting post on DS performance: DataSnap analysis based on Speed & Stability tests

UPDATE3:

Rushy answered 4/11, 2012 at 11:37 Comment(6)

There's no limit on the number of threads that a 32 bit process can create. I guess you're confusing it with a situation where the threads have consumed the available address space for the process. – Fastness 28/11, 2012 at 21:30

While there is no hard limit, that "situation" does impose a limit, although the limit varies based on the thread's configuration, particularly stack size. For anyone else wanting more details, see Raymond Chen's blog post on the topic. – Globefish 19/3, 2013 at 5:50

DataSnap is NOT meant to be deployed to production based on the Indy test/development server model. It's meant to be deployed using ISAPI under IIS. – Natalya 3/9, 2014 at 20:5

@WarrenP: Where is the reference for that? – Fescennine 4/9, 2014 at 9:50

docwiki.embarcadero.com/RADStudio/XE6/en/… – Natalya 4/9, 2014 at 16:51

You will probably get 10x better scalability with ISAPI versus Indy. The problem with Indy scaling is well known and is a fundamental limitation of Indy component design, so calling it a problem with DataSnap is hardly fair. ISAPI is not infinitely scalable EITHER but it's better than Indy. – Natalya 4/9, 2014 at 16:52

When the specifications for a system are made, you need to be very precise when it's about multiple users.

For example: you create a website, and the client expects 15,000 unique users. Then the client usually comes up with a requirement that the system needs to support 15,000 simultaneous users, which is very naive.

You'll need a more detailed specification than that.

Usually it's more sensible to say something like: for 99% of requests, 99% of users get a response within 5 seconds on average.

In normal usage, you'll never see all users send a request within the same second. If at some point they all arrive within the same minute (also very unlikely), you'll have a lot fewer concurrent users.
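One way to put numbers on this is Little's law (concurrency = arrival rate × time in the system). The answer does not name this rule, so it is brought in here for illustration only. The 0.5 s average response time is an assumed value; the 15,000 users and the one-minute window come from the example above.

```python
def concurrent_requests(users, arrival_window_s, avg_response_s):
    """Little's law: average requests in flight = arrival rate * response time."""
    arrival_rate = users / arrival_window_s  # requests per second
    return arrival_rate * avg_response_s


# 15,000 users all sending one request within the same minute,
# with an assumed average response time of 0.5 seconds.
print(concurrent_requests(15_000, 60, 0.5))
```

Under these assumptions only about 125 requests are in flight at any moment, far fewer than 15,000 simultaneous users.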

Even for websites with tens of thousands of users, where most of them connect on a daily basis, the webserver is idle most of the time, and once in a while it jumps to 5% or, in extreme cases, to 20%. If we really had to serve all of these users at once we'd be screwed, but that never happens, and it's not realistic to prepare a server for such loads.

House answered 4/11, 2012 at 2:5 Comment(0)
