Scaling a chat app - short polling vs. long polling (AJAX, PHP)

3

34

I run a website where users can chat with each other through the browser (think Facebook chat). What is the best way to handle the live interaction? (Right now I have a poll going every 30 seconds to update online users and new incoming messages, and another poll going on chat pages every second to get new messages.)

Things I've considered:

  • HTML5 WebSockets: I didn't use these because they don't work in all browsers (only Chrome).
  • Flash sockets: I didn't use these because I want to support mobile web eventually.

Right now, I am using short polling because I don't know how scalable AJAX long polling would be. I'm on a VPS from ServInt, running Apache. Should I use long polling or short polling? I don't need absolutely immediate response times (just "good enough" for a chat app). Is short polling this often, with a few hundred thousand users, going to kill my server? How do I scale this?

Weitzman answered 15/3, 2011 at 15:2 Comment(3)
I know that Apache generally does not handle many simultaneous connections well. I also realize that there may be other solutions built for this scenario (Node.js, etc.). But right now, I'd like to avoid rewriting the entire application. Weitzman
What about implementing multiple solutions for different platforms? I.e., if HTML5 is supported, the browser uses HTML5; if Flash is supported, the browser uses Flash; if neither is supported, the browser uses AJAX. Compound
You may be interested in this post: urbanairship.com/blog/2010/09/29/linux-kernel-tuning-for-c500k Interstate
45

A few notes:

  • Polling every second is overkill. The app will still feel very responsive with a few seconds of delay between checks.
  • To cut database traffic and speed up responses, consider keeping undelivered messages in an in-memory cache. You would still persist messages to the database; the cache would only serve the checks for new messages, so that each user's poll every x seconds does not hit the database.
  • Time out the user's chat after x seconds of inactivity so that it stops polling your server. This ensures that someone who leaves a window open won't keep generating traffic. Warn the user before the timeout so they can extend it, and offer a simple "Still there? Continue chatting." link for sessions that do time out.
  • I'd suggest starting out with polling rather than comet/long polling/sockets. Polling is simple to build and support and will likely scale just fine in the short term. If you get a lot of traffic, you can throw hardware and a load balancer at the problem to scale. The entire web is based on polling - polling most certainly scales. There's a point where the complexity of alternatives like comet/long polling/etc. makes sense, but you need a lot of traffic before the extra development time and complexity are justified.
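
The inactivity timeout from the list above could be sketched on the client roughly like this. It is only an illustration, not code from the answer: the `IdleGuard` name, the state labels, and the default durations are all assumptions, and the clock is passed in so the logic can be exercised without waiting.

```javascript
// Hypothetical helper: decides whether the chat client should keep polling.
class IdleGuard {
  constructor({
    idleLimitMs = 5 * 60 * 1000, // assumed "x seconds" of inactivity
    warnBeforeMs = 30 * 1000,    // assumed warning window before the timeout
    now = () => Date.now(),      // injectable clock
  } = {}) {
    this.idleLimitMs = idleLimitMs;
    this.warnBeforeMs = warnBeforeMs;
    this.now = now;
    this.lastActivity = now();
  }

  // Call on user activity (typing, clicking "Still there? Continue chatting.").
  touch() {
    this.lastActivity = this.now();
  }

  // 'active'    -> poll as usual
  // 'warning'   -> still polling, but show the "about to time out" notice
  // 'timed_out' -> stop polling until the user calls touch()
  state() {
    const idleMs = this.now() - this.lastActivity;
    if (idleMs >= this.idleLimitMs) return 'timed_out';
    if (idleMs >= this.idleLimitMs - this.warnBeforeMs) return 'warning';
    return 'active';
  }

  shouldPoll() {
    return this.state() !== 'timed_out';
  }
}
```

The polling loop would check `shouldPoll()` before each request, and the "Continue chatting" link would call `touch()` to resume.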
Pskov answered 16/3, 2011 at 15:56 Comment(1)
Your last point was very helpful - I've been trying to decide how future-proof a first polling implementation in my app needs to be, and I think I'll take your advice and get simple polling working quickly, then plan a smart long-term solution.Margeret
23

This is something everyone did once upon a time before the introduction of cometd and nodejs.

The issue as I see it is that PHP requests on Apache are very expensive. If your chat application checks for messages every second, you will reach a point where Apache does not have enough resources to respond to requests. The other area I think needs improvement is how much context your chat application has about each session.

Why does it update every second if not to retrieve new messages? What if there are no messages?

Some techniques you can use:

  • Provide a lightweight endpoint to your clients that returns some context about the chat session: whether a new message is pending, how many messages there are, etc. The client can respond by fetching the update immediately, or do nothing if there are no new messages. This endpoint can return a simple JSON object over HTTP. The status message will be a fixed size, and if the status response does not change you can decay the polling rate. See the next point.

  • Add a simple decay to your JavaScript polling: if the client receives the same response from the server a few times in a row, increase the polling interval by a set amount. At present you said it polls every second; with decay it would step up to every 2, 4, 6, 8, 10 seconds. As soon as the response from the server changes, reset the decay.
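
A minimal sketch of the decay described above, under some assumptions: the class name `PollDecay`, the status URL passed to `pollStatus`, and the choice of three identical responses before each step are all hypothetical. The 1 second base and the 2, 4, 6, 8, 10 second steps follow the numbers in the bullet.

```javascript
// Hypothetical helper: computes the wait before the next status poll.
class PollDecay {
  constructor({ baseMs = 1000, stepMs = 2000, maxMs = 10000, repeatsBeforeDecay = 3 } = {}) {
    this.baseMs = baseMs;
    this.stepMs = stepMs;
    this.maxMs = maxMs;
    this.repeatsBeforeDecay = repeatsBeforeDecay; // assumed meaning of "a few times in a row"
    this.level = 0;       // 0 = base interval, k = k * stepMs (capped at maxMs)
    this.repeats = 0;     // identical responses seen at the current level
    this.lastStatus = undefined;
  }

  // Feed in the raw status response body; returns the next delay in ms.
  next(status) {
    if (status !== this.lastStatus) {
      // The response changed: reset the decay.
      this.lastStatus = status;
      this.level = 0;
      this.repeats = 0;
    } else {
      this.repeats += 1;
      if (this.repeats >= this.repeatsBeforeDecay) {
        this.level += 1;
        this.repeats = 0;
      }
    }
    return this.level === 0 ? this.baseMs : Math.min(this.level * this.stepMs, this.maxMs);
  }
}

// Sketch of a polling loop around it; `url` is whatever lightweight
// status endpoint the application exposes (assumed to return JSON).
async function pollStatus(url, decay, onChange) {
  let delay = decay.maxMs; // after a failed request, wait the longest interval
  try {
    const res = await fetch(url);
    const body = await res.text();
    const changed = body !== decay.lastStatus;
    delay = decay.next(body);
    if (changed) onChange(JSON.parse(body));
  } catch (err) {
    console.error('status poll failed', err);
  }
  setTimeout(() => pollStatus(url, decay, onChange), delay);
}
```

Comparing the raw response body works here only because the status message is small and stable; a version counter or last-message ID in the JSON would serve the same purpose.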

Some optimizations to consider:

  • Use a PHP opcode cache like APC.

  • Set a low timeout on all requests; you do not want any request to hang your server.

  • Optimize your PHP code; make it lean and fast.

  • Run some load tests to see what your limits are.

  • Benchmark performance often to make sure your application is getting faster.

  • Check the Apache logs for telltale signs of the application's overall health and response times.

When scaling becomes necessary, add a new server and use a load balancer to distribute requests. I have used Varnish and HAProxy with great success, and setting them up is not complicated either.

Ellis answered 16/3, 2011 at 11:39 Comment(1)
The dynamic increment is something I never thought of, really good pointCynth
1

If I were you, I'd pick a library that uses HTML5 WebSockets and falls back to Flash sockets if HTML5 isn't available; the number of browsers that fall through the cracks should be minute.

Also, you should either abandon PHP or supplement it with a threaded socket server written in Python, or in Ruby with em-websocket.
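
The fallback order such a library applies could be sketched as below. This is not any particular library's API: `chooseTransport`, `detectWebSocket`, and the transport labels are hypothetical names, and Flash detection is left to the caller because it depends on the plugin environment.

```javascript
// Hypothetical sketch: pick the best available transport, in the order
// WebSocket -> Flash socket -> plain AJAX polling.
function chooseTransport({ hasWebSocket, hasFlash }) {
  if (hasWebSocket) return 'websocket';
  if (hasFlash) return 'flash-socket';
  return 'ajax-polling';
}

// Feature-detect WebSocket support on a global object (window in a browser).
function detectWebSocket(globalObj = globalThis) {
  return typeof globalObj.WebSocket === 'function';
}
```

A client would call something like `chooseTransport({ hasWebSocket: detectWebSocket(), hasFlash: false })` at startup and keep the existing polling code as the last resort.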

Leftwich answered 16/3, 2011 at 14:11 Comment(0)
