What is the best, most efficient client pool technique in Erlang?

I'm a real Erlang newbie (started 1 week ago), and I'm trying to learn the language by building a small but efficient chat server. (By efficient I mean that I have 5 servers for stress-testing it with hundreds of thousands of connected clients. A million would be great!)

I have found some tutorials that do this, but every one I found is IRC-like: if one user sends a message, all users except the sender receive it. I would like to change that a bit and support one-to-one discussions.

What would be the most effective client pool for looking up a connected user? I thought about registering the process, because it seems to do everything I need, but I really don't think it is the best way to do it (or the prettiest, anyway).

Does anyone have any suggestions for doing this?

EDIT :

Every connected client is assigned an ID.

When a user connects, it first sends a login command to give its ID. When a user wants to send a message to another one, the message looks like this:

[ID-NUMBER][Message] %% ID-NUMBER IS A FIXED LENGTH
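
A frame in this format can be split with binary pattern matching. The sketch below is only an illustration: the module name `chat_frame` and the ID length of 8 bytes are assumptions, since the question only says the ID has a fixed length.

```erlang
-module(chat_frame).
-export([parse/1]).

%% Assumption: the ID is 8 bytes long; the question only says "fixed length".
-define(ID_LEN, 8).

%% Split a frame into the recipient ID and the message body.
parse(<<Id:?ID_LEN/binary, Msg/binary>>) ->
    {ok, Id, Msg};
%% Anything shorter than the ID cannot be a valid frame.
parse(_Short) ->
    {error, too_short}.
```

For example, `chat_frame:parse(<<"00000042hello">>)` returns `{ok, <<"00000042">>, <<"hello">>}`.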

When I ask for "the most effective client pool", I'm actually looking for the fastest way to retrieve, add, or delete one client in the list of connected clients, which could potentially be large (hundreds of thousands, maybe millions).

EDIT 2 :

To answer some questions:

  • I'm using raw sockets (using telnet right now to communicate with the server). I will probably move to SSL later.
  • It is my own protocol
  • Every client is a spawned Pid
  • Every client's Pid is linked to its own monitor (mostly for debugging; a disconnected client should reconnect on its own, starting authentication from scratch)
  • I read a couple of books before I started coding, so I have not yet mastered every aspect of Erlang, but I'm not unaware of them. I will read more as needed, I guess.
  • What I'm really looking for is the best way to store and look up those Pids so that messages can be sent directly from process to process.

Should I write my own client search function using lists?

Or should I use ets?

Or should I even use register/2, unregister/1 and whereis/1 to maintain my client list, using each client's unique ID as the atom? It seems to be the simplest way to do it. I really don't know whether it is efficient, but I'm pretty sure it is the ugly solution ;-)

Subdue answered 1/2, 2012 at 14:40 Comment(8)
I think this is an awesome question, though a bit wavy. - Improvise
Could you be a little more specific about 'What would be the most effective client pool for searching a connected user?'? I didn't get your problem. - Pleach
@Pleach: I have edited my post; I hope you'll find it more specific. - Subdue
@Subdue It is much better, thanks! - Pleach
@TheSquad, you must NOT use atoms for your system! You would have to create them dynamically, and that is generally a bad idea. There is a limit to the number of atoms in an Erlang VM (trapexit.org/Atom_Table). If you're looking for something really simple, give the users sequential numerical IDs and save their Pids in a dictionary. - Overside
Also, what do you mean by "every client's Pid is linked to its own monitor"? By monitor, do you mean the Erlang terminal? - Overside
@Overside: okay, I did not know about the atom limitation. When I say monitor I mean a linked Pid with an on_exit receiver. Sorry if I do not use the exact terms yet. - Subdue
Check out the answer edits as well. - Hydroxylamine

I'm doing something similar to your chat program using gproc as a pubsub (similar to the demo on that page). Each client registers under its ID. To find a particular client, you do a lookup on that client ID. To subscribe to a client, you add a property to the subscribing process containing the ID of the client being subscribed to. To publish, you call gproc:send(ClientId,Message). This covers your use case as well as the more general room-based chat, and it can handle a distributed, masterless registry of processes.

I haven't tested whether it scales to millions, but it uses ets for storage, and gproc is rock-solid code by Ulf Wiger. I wouldn't count on being able to write a better implementation.
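
A minimal sketch of the one-to-one case with gproc might look like the following. The module name `chat_gproc`, the `{client, ClientId}` key shape, and the `{chat, Message}` message format are assumptions made for illustration. gproc keys are tuples such as `{n, l, Name}` (a unique name in the local scope), so the `gproc:send(ClientId,Message)` call above should be read as shorthand for a call with such a key.

```erlang
-module(chat_gproc).
-export([login/1, whereis_client/1, send_to/2]).

%% Assumes the gproc application is already started, for example with
%% application:ensure_all_started(gproc).

%% Called by the client's own process after a successful login.
%% {n, l, Key} registers a unique name (n) in the local node scope (l).
login(ClientId) ->
    gproc:reg({n, l, {client, ClientId}}).

%% Returns the pid registered for ClientId, or undefined if nobody is.
whereis_client(ClientId) ->
    gproc:where({n, l, {client, ClientId}}).

%% Deliver Message to the client's process if it is online.
send_to(ClientId, Message) ->
    case whereis_client(ClientId) of
        undefined -> {error, offline};
        Pid -> Pid ! {chat, Message}, ok
    end.
```

gproc monitors registered processes, so a name disappears when its process exits and no explicit unregister is needed on disconnect.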

Insanitary answered 2/2, 2012 at 0:54 Comment(3)
Thanks, I'll look into it. Yeah, if it is written by Ulf Wiger, I'll have a hard time topping that! +1 for giving me the most pointed way to do this so far! - Subdue
Would it be more efficient than registering the process (knowing that the limitations of registering are not an issue for me in this case)? - Subdue
It seems to be exactly what I need! Thanks. Let's just hope it scales up nicely. - Subdue

I'm also kind of new to Erlang (a couple of months), so I hope this puts you on the right path :)

First of all, since you're a "newbie", you should know about these sites:

For a non-persistent store, I would suggest the sets or gb_sets modules (documentation here).

If you want persistence, you should try dets (see the documentation above), but I can't say anything about its efficiency, so you should research this topic a bit further.

In the book Learn You Some Erlang there is a chapter on data structures that says that sets are better for read-intensive systems, while gb_sets is more appropriate for balanced usage.
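
As a small illustration of the gb_sets API, the sketch below tracks which client IDs are online. The module name `online_ids` and the integer IDs are assumptions. Note that a set only answers membership questions; mapping an ID to a Pid needs a key/value structure instead.

```erlang
-module(online_ids).
-export([demo/0]).

%% Add two IDs, remove one, and return the IDs that are still online.
demo() ->
    S0 = gb_sets:new(),
    S1 = gb_sets:add_element(42, S0),
    S2 = gb_sets:add_element(7, S1),
    true = gb_sets:is_element(42, S2),
    S3 = gb_sets:del_element(42, S2),   % client 42 disconnects
    false = gb_sets:is_element(42, S3),
    gb_sets:to_list(S3).
```

Calling `online_ids:demo()` returns `[7]`.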

Overside answered 1/2, 2012 at 20:23 Comment(1)
Thanks, I have read most of what you posted and will look into Learn You Some Erlang. Yes, sets was one of my leads; I'm just looking to see whether there are more options. - Subdue

Messaging systems are what everyone wants to build when they come to Erlang, because the two blend naturally. However, there are a number of things to look into before continuing. Messaging basically involves the following: user registration, user authentication, session management, logging, message switching/routing, etc.

To do all or most of these, one needs a database, certainly in-memory, which leads me to either Mnesia or ETS tables. Since you are new to Erlang, I suppose you have not yet really mastered working with these. At some point you will need to track who is communicating with whom, who is available for chat, etc. Hence you will need to look things up and write things somewhere.

Another thing: you have not told us about the client. Is it going to be a web client (HTTP), or is it an entirely new protocol you are implementing over raw sockets? Either way, you will need to master concurrency in Erlang. If a user connects and is assigned an ID, and your design is one process per user, then you will have to save the Pids of these processes or register them against some criteria, and also monitor them in case they die. This brings me to OTP and supervision trees. There is quite a lot to cover, so tell us more about the client and server interaction and the network communication you need. Or is it just a simple Erlang RPC project you are doing for your own revision?
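
The idea of saving Pids and monitoring them can be sketched as a small registry process. Everything here (the module name `client_registry`, the message shapes, the 1-second timeout) is an assumption for illustration; a production version would more likely be a gen_server under a supervisor.

```erlang
-module(client_registry).
-export([start/0, register_client/3, lookup/2]).

%% Start the registry process with an empty Id => {Pid, MonitorRef} map.
start() ->
    spawn(fun() -> loop(#{}) end).

%% Ask the registry to remember Pid under Id (asynchronous).
register_client(Reg, Id, Pid) ->
    Reg ! {register, Id, Pid},
    ok.

%% Ask the registry for the Pid stored under Id.
lookup(Reg, Id) ->
    Reg ! {lookup, self(), Id},
    receive
        {lookup_reply, Id, Result} -> Result
    after 1000 -> {error, timeout}
    end.

loop(Clients) ->
    receive
        {register, Id, Pid} ->
            %% Monitor the client so we are told when it dies.
            Ref = erlang:monitor(process, Pid),
            loop(Clients#{Id => {Pid, Ref}});
        {lookup, From, Id} ->
            Reply = case maps:find(Id, Clients) of
                        {ok, {Pid, _Ref}} -> {ok, Pid};
                        error -> {error, offline}
                    end,
            From ! {lookup_reply, Id, Reply},
            loop(Clients);
        {'DOWN', Ref, process, _Pid, _Reason} ->
            %% Drop the entry whose monitor fired (a linear scan, fine for a sketch).
            loop(maps:filter(fun(_Id, {_P, R}) -> R =/= Ref end, Clients))
    end.
```

A single registry process serializes every lookup, so with very large numbers of clients it may become a bottleneck.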

EDIT

Use ETS tables, or use Mnesia RAM tables. Do not think of registering these Pids or storing them in a list, array or set. Look at this solution, which was given to this question.
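
A minimal sketch of an ETS-backed client table follows. The module name `client_table`, the table name `clients`, and the chosen table options are assumptions, not something taken from the linked solution.

```erlang
-module(client_table).
-export([init/0, add/2, find/1, remove/1]).

-define(TAB, clients).  % hypothetical table name

%% Create the table once, from a long-lived process: an ETS table is
%% deleted when its owner process exits.
init() ->
    ets:new(?TAB, [set, named_table, public, {read_concurrency, true}]).

%% Store (or overwrite) the Pid for a client ID.
add(Id, Pid) ->
    ets:insert(?TAB, {Id, Pid}).

%% Look up the Pid for a client ID.
find(Id) ->
    case ets:lookup(?TAB, Id) of
        [{Id, Pid}] -> {ok, Pid};
        [] -> {error, offline}
    end.

%% Forget a client, for example when it disconnects.
remove(Id) ->
    ets:delete(?TAB, Id).
```

Because the table is `public` and named, any client process can call `client_table:find/1` directly without going through a single registry process.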

Hydroxylamine answered 2/2, 2012 at 5:45 Comment(1)
Actually, I have already coded a whole server-side communication protocol in C++ with boost-asio, but there are limitations that I would like to get past, mostly around threads. I think Erlang is the language for that! I have edited my post to answer your questions; take a look at them when you have time, thanks. +1 - Subdue