We're trying to use the AppFabric distributed cache. After a lot of back and forth with non-domain servers, we finally put them in a domain, which made installation and setup a bit easier. We got it up and running after fighting through a ton of errors, most of which look like conditions AppFabric could easily test for or report with a more descriptive error message. "Temporary error" does not explain much...
But there are still issues.
We set up 3 servers, one of which is the "lead". We finally got the cache working and confirmed it by pointing a Network Load Balancer at one server at a time, verifying that we could set a cache item on one server and retrieve it from another.
Then I restarted the AppFabric Caching service on all servers, and suddenly it stopped working. Get-CacheHost reports the hosts as up, but we get exceptions like:
ErrorCode<ERRCA0018>:SubStatus<ES0001>:The request timed out
ErrorCode<ERRCA0017>:SubStatus<ES0001>:There is a temporary failure. Please retry later.
Why would this error condition occur simply from restarting the services?
Is AppFabric Cache really ready for production use?
What happens if a server goes offline? Long timeouts?
Are we dependent on the "lead" server being up?
I suspect it will be back up after 5-10 minutes of R&R. It seems to come back by itself sometimes.
Update: It did come back up after a few minutes. We have since tested removing one server from the cluster; that resulted in a long timeout and, finally, an exception.
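Since the ERRCA0017 message literally says "Please retry later", one client-side stopgap is to wrap cache calls in a retry with exponential backoff. The sketch below is in Python purely for illustration (the real AppFabric client is .NET). `TemporaryCacheError`, `call_with_retry` and `flaky_get` are hypothetical names, not AppFabric APIs, and the attempt count and delays are assumptions that would need tuning.

```python
import time


class TemporaryCacheError(Exception):
    """Hypothetical stand-in for a transient cache failure such as ERRCA0017."""


def call_with_retry(operation, attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call operation(); on a transient failure, wait and retry with
    exponential backoff. Re-raises the last error once attempts run out."""
    for attempt in range(attempts):
        try:
            return operation()
        except TemporaryCacheError:
            if attempt == attempts - 1:
                raise
            # Delay doubles on each failed attempt: base, 2*base, 4*base, ...
            sleep(base_delay * (2 ** attempt))


if __name__ == "__main__":
    # Simulated cache read that fails twice before succeeding.
    state = {"calls": 0}

    def flaky_get():
        state["calls"] += 1
        if state["calls"] < 3:
            raise TemporaryCacheError("There is a temporary failure. Please retry later.")
        return "cached-value"

    print(call_with_retry(flaky_get, base_delay=0.1))
```

This only hides short outages from the caller. It would not help with the multi-minute recovery or the long timeout seen after removing a server.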