Number of instances needed for Windows Azure application
Asked Answered
D

3

5

I'm fairly new to Windows Azure and want to host a survey application that will be filled out by approximately 30,000 users simultaneously.

The application consists of a single .aspx page that is sent to the client once, asks 25 questions, and gives a wrap-up of the given answers at the end. When the user has answered a question and hits the 'next question' button, the answer is sent via an .ashx handler to the server; the response contains the next question and its answer options. The wrap-up is sent to the client after a full postback. Answers are saved in an Azure Table that is partitioned so that each partition holds a maximum of 450 users.
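One way to picture the 450-users-per-partition scheme is to bucket a sequential user number into a partition key. This is a hypothetical sketch, not the asker's actual code; the constant and the key naming scheme are assumptions:

```python
# Assumed scheme: bucket a 0-based sequential user number into
# Azure Table partitions capped at 450 users each.
USERS_PER_PARTITION = 450

def partition_key(user_number: int) -> str:
    """Map a sequential user number to a partition key string."""
    return f"survey-{user_number // USERS_PER_PARTITION:04d}"

# Users 0-449 share the first partition; user 450 starts the next.
print(partition_key(0))    # survey-0000
print(partition_key(450))  # survey-0001
```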

I would like to ask if someone can give an educated guess about how many web-role instances we need to start in order to keep this application running. (If that is too hard to say: is it more likely to be 5, 50 or 500 instances?)

What is a better way to go: 20 small instances or 5 large instances?

Thanks for your help!

Diaphone answered 15/1, 2011 at 9:17 Comment(0)
C
5

The most obvious answer: you would be best served by testing this yourself and seeing how your application holds up. You can easily get performance counters and other diagnostics out of Windows Azure; for instance, you can connect Microsoft SCOM (System Center Operations Manager) to monitor your environment during testing. Site Hammer is a simple load-testing tool for Windows Azure (on the MSDN code gallery).

Apart from this very obvious answer, I will share some guesstimates: given the type of load, you are probably better off with more small instances rather than a smaller number of large ones, especially since you already have your storage partitioned. If you really have 30K visitors simultaneously, each with a ~15-second interval between reading a question and posting the answer, you are looking at 2,000 requests per second. Ten nodes should be more than enough to handle that load. Remember that this is just a rough estimate, lacking any insight into your architecture, etc. For these types of loads, caching is a very good idea; it will dramatically increase the load each node can handle.
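The 2,000 requests-per-second figure follows from simple arithmetic on the numbers given above; the 15-second interval and the ten-node count are the answer's own assumptions:

```python
# Back-of-envelope load estimate from the figures in the answer.
users = 30_000      # simultaneous respondents (from the question)
interval_s = 15     # assumed seconds between answer submissions

requests_per_second = users / interval_s
print(requests_per_second)  # 2000.0

# Spread over the suggested ~10 web-role instances:
per_node = requests_per_second / 10
print(per_node)             # 200.0 requests/s per node
```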

However, the best advice I can give you is to make sure that you are actively monitoring. It takes less than 30 minutes to spin up additional instances, so if you monitor your environment and/or make sure that you are notified whenever it starts to choke, you can easily upgrade your setup. Keep in mind that you do need to contact customer support to be able to go over 20 instances (this is a default limit, in place to protect you from over-spending).

Caustic answered 15/1, 2011 at 12:34 Comment(2)
Hi Tijmen, thanks for your remarks. We started load tests, but since I'm rather new to this subject it is always good to try not to reinvent the wheel... The survey is somewhat different: all 30,000 visitors are watching a show and will answer each question at the same time. This raises the requests per second to an estimated 10,000. We use caching and singleton classes, and are optimizing the solution at this moment to make it as lean as possible. We will dive into the monitoring and adding resources right away! – Diaphone
For this type of throughput, look into the performance difference between writing to an Azure queue instead of directly into an Azure table... the queue should be faster, so you might gain some perf there. You do need to write a worker role to process the data in the queue, but that is not on the perf-critical path. Regardless of solution, make sure that you review the request execution time for all hits (and not averages), to make sure that there isn't ~10% of hits taking too long without showing up in the average values. – Caustic
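The tail-latency point in this last comment can be illustrated with a quick sketch; the timing figures below are invented purely for illustration:

```python
# Why averages hide a slow tail: 90% fast hits plus 10% slow hits
# can still produce an acceptable-looking mean. Figures are made up.
times_ms = [50] * 90 + [900] * 10   # 10% of requests take 900 ms

average = sum(times_ms) / len(times_ms)
p95 = sorted(times_ms)[int(len(times_ms) * 0.95)]

print(average)  # 135.0 -- the mean looks fine
print(p95)      # 900   -- the 95th percentile reveals the slow tail
```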
A
2

Aside from the sage advice tijmenvdk gave you, let me add my opinion on instance size. In general, go with the smallest size that will support your app, and then scale out to handle increased traffic. This way, when you scale back down, your minimum compute cost is kept low. If you ran, say, a pair of extra-large instances as your baseline (since you always want minimum two instances to get the uptime SLA), your cost footprint starts at 0.12 x 8 x 2 = $1.92 per hour, even during low-traffic times. If you go with small instances, you'd be at 0.12 x 1 x 2 = $0.24 per hour.
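The cost comparison above is straightforward to reproduce; the $0.12/hour rate is the 2011-era per-core figure quoted in the answer, not a current price:

```python
# Hourly baseline compute cost, using the 2011-era rate quoted in
# the answer ($0.12/hour per small-instance-equivalent core).
RATE = 0.12

def baseline_cost(cores_per_instance: int, instances: int = 2) -> float:
    """Minimum hourly cost for a baseline of `instances` instances."""
    return RATE * cores_per_instance * instances

print(baseline_cost(8))  # 1.92 -> two extra-large (8-core) instances
print(baseline_cost(1))  # 0.24 -> two small (1-core) instances
```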

Each VM size has associated CPU, memory, and local (non-durable) disk storage, so pick the smallest size unit that your app works efficiently in.

For load/performance-testing, you might also want to consider a hosted solution such as Loadstorm.

Apart answered 15/1, 2011 at 15:23 Comment(1)
Hi David, thanks for the advice. Since we only have one simple table in which we store the answers, one .aspx page and one .ashx handler, I think we can indeed best go for small instances. – Diaphone
U
0

How simultaneous are the requests in reality? Will they all type the address in at exactly the same time?

That said, profile your app locally; this will enable you to estimate CPU, network and memory usage on Azure. Then, rather than looking at how many instances you need, look at how you can reduce the requirement! Apply these tips, and profile locally again.

Most performance tips involve a trade-off between CPU, memory or bandwidth usage; the idea is to ensure that they scale equally. If your application runs out of memory but you have loads of CPU and network headroom, don't apply optimizations that save CPU or bandwidth at the cost of yet more memory.

For a single-page survey, ensure your HTML, CSS & JS are minified, and ensure they're cacheable.

Combine them if possible and, to get really scalable, push static files (CSS, JS & images) to a CDN. This all reduces the number of requests the web server has to deal with, and therefore reduces the number of web roles you will need = less network.

How does the .ashx handler return the response? I.e., is it sending HTML, XML or JSON? Personally, I'd get it to return JSON, as this will require less network bandwidth, and most likely less server-side processing = less mem and network.
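The bandwidth difference is easy to see by serializing the same payload both ways. This is a rough illustration, not the asker's actual payload; the question object and XML shape are invented:

```python
# Illustrative comparison (invented payload): the same question
# object serialized as compact JSON vs an equivalent XML string.
import json

question = {"id": 7, "text": "Favourite colour?",
            "answers": ["Red", "Green", "Blue"]}

as_json = json.dumps(question, separators=(",", ":"))
as_xml = ('<question id="7"><text>Favourite colour?</text>'
          '<answers><a>Red</a><a>Green</a><a>Blue</a></answers></question>')

# JSON avoids the repeated closing tags, so the payload is smaller.
print(len(as_json), len(as_xml))
```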

Use asynchronous APIs to access Azure storage (this uses I/O completion ports to free up the IIS thread to handle more requests until Azure storage comes back = enabling CPU to scale).

tijmenvdk has already mentioned using queues for writes. Does the list of questions change? If not, cache them, so that the app only has to read from table storage once on start-up and once for each client for the final wrap-up = saves network and CPU at the expense of memory.
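The caching idea amounts to a lazily initialized in-memory copy of the question list. A minimal sketch, where `load_questions_from_storage` is a hypothetical stand-in for the real Azure Table read:

```python
# Minimal sketch of caching a static question list: hit storage once,
# then serve every later request from memory.
_QUESTION_CACHE = None

def load_questions_from_storage():
    # Placeholder for a one-time Azure Table query (assumption).
    return ["Q1", "Q2", "Q3"]

def get_questions():
    global _QUESTION_CACHE
    if _QUESTION_CACHE is None:           # first call reads storage...
        _QUESTION_CACHE = load_questions_from_storage()
    return _QUESTION_CACHE                # ...later calls use memory

print(get_questions())
```

In a multi-instance web role, each instance would hold its own copy, which is fine here because the questions never change during the survey.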

All of these tips are equally applicable to a normal web application, on a single server or web-farm environment.

The point I'm trying to make is that what you can't measure, you can't improve; and measurement, improvement and cost all go hand in hand. Dynamic scaling will reduce costs, but fundamentally, if your application hasn't been measured and its resource usage optimised, asking how many instances you need is pointless.

Unbound answered 26/5, 2011 at 12:22 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.