Structured exception handling with a multi-threaded server

3

4

This article gives a good overview of why structured exception handling is bad. Is there a way to keep the robustness of a server that does not crash, while avoiding the problems mentioned in the article?

I have server software that serves about 400 concurrently connected users. If there is a crash, all 400 users are affected. We added structured exception handling and enjoyed the results for a while, but eventually had to remove it because some crashes caused the whole server to hang (which is worse than having it crash and restart itself).

So we have this:

With SEH: for most crashes, only 1 of the 400 users has a problem.
Without SEH: if any user triggers a crash, all 400 are affected.
But sometimes with SEH: the server hangs, and all 400 users are affected, along with any future users who try to connect.

Empanel answered 2/10, 2008 at 20:19 Comment(0)

2

Break your program up into worker processes and a single server process. The server process handles initial requests and then hands them off to the worker processes. If a worker process crashes, only the users on that worker are affected. Don't use SEH for general exception handling: as you have found out, it can and will leave you wide open to deadlocks, and you can still crash anyway.
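As a rough illustration of the trade-off (not part of the original answer): with a fixed pool of worker processes, the pool size decides how many users a single crash can take down. The sketch below assumes connections are spread evenly by a simple modulo rule. `worker_for` and `worst_case_affected` are hypothetical helper names, and actual process creation and socket hand-off are left out.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: index of the worker process that owns a connection.
// Assumes a plain modulo distribution; a real server might balance by load.
std::size_t worker_for(std::size_t connection_id, std::size_t worker_count) {
    assert(worker_count > 0);
    return connection_id % worker_count;
}

// Hypothetical helper: the largest number of users that share one worker,
// i.e. how many users are affected if that single worker crashes.
std::size_t worst_case_affected(std::size_t users, std::size_t worker_count) {
    assert(worker_count > 0);
    return (users + worker_count - 1) / worker_count;  // ceiling division
}
```

With 400 users, `worst_case_affected(400, 1)` is 400 (the current single-process situation), `worst_case_affected(400, 8)` is 50, and `worst_case_affected(400, 400)` is 1, at the cost of one process per connection.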

Piacular answered 2/10, 2008 at 20:24 Comment(4)

Well, you could be less granular than that. I don't know what your load is like, but I bet you could split it up as a fixed number of users per worker process. – Piacular 2/10, 2008 at 20:28

Gotcha, this might work with the number of worker processes as a parameter. – Empanel 2/10, 2008 at 20:31

I wouldn't do that if you want to be able to scale and have decent performance... But then I'm a big fan of IOCP and overlapped IO for servers. Don't let bugs in your program dictate architecture decisions as if you are powerless to fix the bugs. – Achromatic 3/10, 2008 at 7:18

You don't always have complete control of the code that runs in your process space, e.g., plugins or web servers. In that case, separating worker processes is a good architecture. – Piacular 3/10, 2008 at 20:36

3

Using SEH because your program crashes randomly is a bad idea. It's not magic pixie dust that you can sprinkle on your program to make it stop crashing. Tracking down and fixing the bugs that cause the crashes is the right solution.

Using SEH when you really need to handle a structured exception is fine. Larry Osterman made a follow-up post explaining which situations require SEH: memory-mapped files, RPC, and security boundary transitions.

Mixologist answered 3/10, 2008 at 4:9 Comment(0)

1

Fix the bugs in your program? ;)

Personally, I'd keep the SEH handlers in, have them dump out a call stack of where the access violation (or whatever it was) happened, and fix the problems. The "sometimes the server hangs" problem is probably a deadlock: the thread that hit the SEH exception was holding a lock and never released it. So the hang is unlikely to be caused by the fact that you're using SEH itself.
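The lock-related hang can be reproduced in miniature. This is only a portable simulation, not real SEH: a C++ exception stands in for the access violation, an atomic flag stands in for the shared lock, and every name (`session_table_locked`, `serve_unsafe`, `serve_guarded`, and so on) is hypothetical. A request that finds the lock busy fails immediately here so the example terminates; with a real blocking lock it would wait forever.

```cpp
#include <atomic>
#include <stdexcept>

// Stand-in for a lock shared by all request threads (hypothetical).
std::atomic<bool> session_table_locked{false};

bool try_acquire() {
    bool expected = false;
    return session_table_locked.compare_exchange_strong(expected, true);
}

void release() { session_table_locked.store(false); }

// Manual lock/unlock: a fault between the two skips the release.
void handle_request_unsafe(bool fault) {
    if (!try_acquire()) throw std::runtime_error("lock busy");
    if (fault) throw std::runtime_error("simulated access violation");
    release();
}

// Catch-all handler that swallows the fault and keeps serving,
// much like wrapping every request in a blanket exception handler.
bool serve_unsafe(bool fault) {
    try {
        handle_request_unsafe(fault);
        return true;
    } catch (const std::exception&) {
        return false;
    }
}

// Guard object: the destructor releases the lock during stack unwinding.
struct LockGuard {
    bool owns;
    LockGuard() : owns(try_acquire()) {}
    ~LockGuard() { if (owns) release(); }
};

void handle_request_guarded(bool fault) {
    LockGuard guard;
    if (!guard.owns) throw std::runtime_error("lock busy");
    if (fault) throw std::runtime_error("simulated access violation");
}

bool serve_guarded(bool fault) {
    try {
        handle_request_guarded(fault);
        return true;
    } catch (const std::exception&) {
        return false;
    }
}
```

After one `serve_unsafe(true)`, the flag stays set and every later `serve_unsafe(false)` fails, which matches the "one crash, then everyone hangs" symptom. The guarded variant releases the flag even when the fault occurs. In real SEH code the corresponding cleanup would live in a `__finally` block, and releasing the lock does not by itself repair whatever state the faulting thread left half-updated.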

Achromatic answered 3/10, 2008 at 7:16 Comment(0)
