Massive multi-user realtime application with Google App Engine

I'm building a multiuser realtime application with Google App Engine (Python) that would look like the Facebook livestream plugin: https://developers.facebook.com/docs/reference/plugins/live-stream/

That is, anywhere from 1 to 1,000,000 users on the same web page can perform actions that are instantly pushed to everyone else. It's like a group chat, but with a very large number of participants.

My questions:
- Is App Engine able to scale to that kind of number?
- If yes, how would you design it?
- If no, what would be your suggestions?

Right now, this is my design:
- I'm using the App Engine Channel API
- I store every connected user in memcache
- Every time an action is performed, a notification task is added to a task queue
- The task consists of retrieving all users from memcache and sending each of them a notification.

I know my bottleneck is the task: everybody is notified through the same task/request. Right now, with 30 connected users, it takes about 1 second, so you can imagine how long it would take for 100,000 users.

How would you correct this?

Thanks a lot

Intelsat answered 3/12, 2011 at 7:53 Comment(0)

How many updates per user do you expect? Even if each user updates just once an hour, you'll be sending 10^12 messages per hour, because every incoming message results in 1,000,000 outgoing sends. Put another way, one message per user per hour works out to roughly 278 incoming messages per second and roughly 278 million outgoing messages per second.
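To make that arithmetic easy to re-run with your own numbers, here is a small sketch in plain Python. The function name `fanout_rates` and the constants are illustrative assumptions, not part of any App Engine API.

```python
SECONDS_PER_HOUR = 3600


def fanout_rates(users, updates_per_user_per_hour):
    """Return (incoming, outgoing) message rates per second.

    Every incoming message is broadcast to every connected user,
    so outgoing traffic grows with the square of the user count.
    """
    incoming_per_hour = users * updates_per_user_per_hour
    outgoing_per_hour = incoming_per_hour * users
    return (incoming_per_hour / SECONDS_PER_HOUR,
            outgoing_per_hour / SECONDS_PER_HOUR)


# The scenario from the question: 1,000,000 users, one update per hour each.
incoming, outgoing = fanout_rates(1_000_000, 1)
print(f"{incoming:,.0f} incoming/s, {outgoing:,.0f} outgoing/s")
```

Because the outgoing rate is quadratic in the number of users, halving the audience cuts the outgoing traffic to a quarter.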

So I think your basic design is flawed. But the underlying question: "how do I broadcast the same message to lots of users" is still valid, and I'll address it.

As you have discovered, the Channel API isn't great at broadcast because each call takes about 50ms. You could work around this with multiple tasks executing in parallel.
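A rough sketch of that workaround, in plain Python rather than real App Engine calls: split the recipient list into batches and queue one task per batch so the batches can run in parallel. Here `enqueue` is a hypothetical stand-in for whatever adds a task to your queue, and `chunked`, `fan_out` and the payload shape are illustrative names I made up.

```python
from typing import Callable, Dict, Iterator, List, Sequence

SEND_COST_SECONDS = 0.05  # roughly 50 ms per Channel API send, as noted above


def chunked(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive batches of at most `size` items."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def fan_out(user_ids: Sequence[str], message: str, batch_size: int,
            enqueue: Callable[[Dict], None]) -> int:
    """Queue one notification task per batch; return the number of tasks."""
    task_count = 0
    for batch in chunked(user_ids, batch_size):
        # Hypothetical payload: each worker task sends `message`
        # to every recipient in its own batch.
        enqueue({"recipients": batch, "message": message})
        task_count += 1
    return task_count


# Demo with a plain list standing in for the task queue.
queued_tasks: List[Dict] = []
users = [f"user{i}" for i in range(250)]
tasks = fan_out(users, "hello", batch_size=100, enqueue=queued_tasks.append)
seconds_per_full_batch = 100 * SEND_COST_SECONDS
print(tasks, "tasks, about", seconds_per_full_batch, "s per full batch")
```

Note that parallelism only shortens the wall-clock time. At about 50 ms per send, one broadcast to 1,000,000 users still costs about 50,000 seconds of total send time, however it is split up.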

For cases like this -- lots of clients who all need the exact same stateless data -- I would encourage you to use polling rather than the Channel API: there is no need to send individualized messages to each client. Decide on an acceptable average latency (e.g. 1 second) and poll at twice that interval (e.g. every 2 seconds). Write a very lightweight, memcache-backed servlet that just returns the most recent block of data, and let the clients de-dupe.
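Here is a minimal sketch of the polling idea, assuming each event gets an increasing sequence number so clients can de-dupe. An in-memory `deque` stands in for memcache, and the class names `RecentEventsCache` and `PollingClient` are made up for illustration. A real handler would read the block from memcache and return it over HTTP.

```python
from collections import deque


class RecentEventsCache:
    """Holds the most recent block of events (stand-in for memcache)."""

    def __init__(self, max_events=50):
        self._events = deque(maxlen=max_events)  # oldest events fall off
        self._next_seq = 1

    def publish(self, payload):
        """Store an event under the next sequence number and return it."""
        seq = self._next_seq
        self._events.append({"seq": seq, "payload": payload})
        self._next_seq += 1
        return seq

    def latest_block(self):
        """What the polling endpoint returns: same data for every client."""
        return list(self._events)


class PollingClient:
    """Client side: remembers the last sequence number it has seen."""

    def __init__(self):
        self.last_seen = 0

    def receive(self, block):
        """Return only payloads not seen before (the de-dupe step)."""
        fresh = [e for e in block if e["seq"] > self.last_seen]
        if fresh:
            self.last_seen = fresh[-1]["seq"]
        return [e["payload"] for e in fresh]


cache = RecentEventsCache()
client = PollingClient()
cache.publish("alice joined")
cache.publish("bob posted")
first_poll = client.receive(cache.latest_block())
second_poll = client.receive(cache.latest_block())  # nothing new yet
cache.publish("carol posted")
third_poll = client.receive(cache.latest_block())
print(first_poll, second_poll, third_poll)
```

One limitation of this sketch: if more than `max_events` events arrive between two polls, a client will miss the oldest ones, so the block size has to be chosen with the poll interval in mind.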

Durrace answered 3/12, 2011 at 17:58 Comment(3)
Thanks a lot Moishe! I was actually thinking about polling, and you've confirmed it, so I think I'll implement it right away. Thanks again for your responsiveness (even during weekends): the App Engine team is wonderful! - Intelsat
I think you mean "poll at half that rate" - or twice the interval. Also, it's worth noting that you can rely on frontend caching for this solution, so most polling requests won't hit the backend at all. - Sniffy
Actually, I benchmarked a lot of options in the past few days: hosted and non-hosted solutions ranging from PubNub, Pusher and Beaconpush to node.js/socket.io, etc. I think I'll go for PubNub. They seem to have an SDK for App Engine. One question though: is there any plan from the App Engine team to roll out a "fan-out"/"broadcast" push API in the next few months? Thanks - Intelsat
