We just tested an AppFabric cluster of 2 servers where we removed the "lead" server. The second server timeouts on any request to it with the error:
Microsoft.ApplicationServer.Caching.DataCacheException: ErrorCode<ERRCA0017>:SubStatus<ES0006>: There is a temporary failure. Please retry later. (One or more specified Cache servers are unavailable, which could be caused by busy network or servers. Ensure that security permission has been granted for this client account on the cluster and that the AppFabric Caching Service is allowed through the firewall on all cache hosts. Retry later.)
In practive this means that if one server in the cluster goes down then they all go down. (Note we are not using Windows cluster, only linking multiple AppFabric cache servers to each other.)
I need the cluster to continue operating even if a single server goes down. How do I do this?
(I realize this question is borderlining Serverfault, but imho developers should know this.)