Failed to load viewstate on NLB

We have a system that dynamically creates the controls of a page on every postback and handles Back navigation using the browser history and such.

The problem is that on the production server (2 nodes on NLB) we randomly get, in different spots and with no correlation that we have found, a "failed to load viewstate, the control tree might be different" error. However, with the exact same code on our staging server (same NLB setup as production), this has never happened.

I'm basically ruling out the code at this point, since it doesn't happen in dev/staging or local environments at all, whereas on production it is fairly frequent. This leads me to believe we have a configuration error somewhere.

I have set hardcoded machine keys in the web.config that is used on staging and production, and sessions are backed by MSSQL.
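
For readers unfamiliar with this kind of setup, here is a minimal sketch of what the two relevant web.config sections typically look like. It is not a copy of the actual file: every key value, the SQL host name and the algorithm choices are placeholders assumed purely for illustration.

```xml
<configuration>
  <system.web>
    <!-- Same explicit keys on every node, so that viewstate produced by one
         node can be validated by the other. Placeholder values only. -->
    <machineKey validationKey="PLACEHOLDER_VALIDATION_KEY_HEX"
                decryptionKey="PLACEHOLDER_DECRYPTION_KEY_HEX"
                validation="SHA1"
                decryption="AES" />
    <!-- Out-of-process session state in SQL Server, shared by both nodes.
         timeout is in minutes. HYPOTHETICAL_SQL_HOST is an assumed name. -->
    <sessionState mode="SQLServer"
                  sqlConnectionString="Data Source=HYPOTHETICAL_SQL_HOST;Integrated Security=SSPI"
                  timeout="20" />
  </system.web>
</configuration>
```

The only point of the sketch is that both elements need to be identical on every node behind the load balancer.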

If anyone has suggestions to get me going in the right direction, that would be great; our entire dev team is stumped by this.

Our web.config is here at pastebin: http://pastebin.com/m2kRTd0k

Broeder answered 4/7, 2011 at 13:35 Comment(10)
I had a similar issue that I posted on SO a few years back; it might be related. See here: #343653 - Lombard
The request is quite short; the viewstate isn't anywhere near the length indicated in the answer to your question. This is the POST request from Fiddler: pastebin.com/J2HaL51D The only weird thing is __viewstateencrypted=&... but I have no idea if that's how it should be or not. - Broeder
So, what's different between production and staging? Level of concurrency? Diversity of clients? Some subtle or overlooked configuration setting? Maybe some physical variation, hardware or environment... - Headlong
The only thing I've turned up so far is that our IIS is bound to IPs instead of hostnames on production, and Windows Update is disabled. I'm running the Windows updates now so it is at the same version as our staging environment. - Broeder
Also, the sites run SSL. I still haven't solved it, but I reduced the frequency of it occurring. For some reason it was trying to bounce users to the forms login page, which was a 404; it is set up the same way on our staging box, but it never redirects them there. I redirected that to the correct place and the error became less frequent. It is still happening a bit, though. - Broeder
Have you generated and configured a shared machine key? - Mid
We do have machine keys set up properly; they are in the web.config file, which gets deployed to both nodes of the NLB. - Broeder
I think this may have something to do with the session becoming invalidated. Our code depends on the session variables to recreate the page so the viewstate can then be loaded. However, I think the session is being invalidated very quickly, which causes it to blow up. It is set to invalidate after 20 minutes, but I just had the problem after no more than 5-10 minutes. Is there anything that could cause this? - Broeder
It is definitely a result of the session becoming invalidated well before the timeout occurs. However, it's only an issue when NLB is enabled. That is, with NLB currently disabled and running on only node 1, we have had no issues and several thousand hits in the last 24 hours. When we turn on NLB, we get these errors starting immediately. - Broeder
Did you ever find the root cause of the problem? - Gilmagilman

Here are a few things that you could check:

  • Are you sure that the web.config files you have edited are the ones that are used? Try putting a syntax error in them and see that you get an error.
  • Check that Sticky IP is not configured on your staging environment.
  • Check that your environments are at the same patch level; one may have a different default value than the other.
  • There could also be a difference in the network infrastructure where the two environments are located.
Trembly answered 14/7, 2011 at 20:55 Comment(1)
The web.config is definitely being used. We don't have sticky IP set up, and both are patched to the same level of Windows updates. The network is as close as practically possible, given that our production environment is at the colo and staging is in house. The two servers are identical vCenter boxes with the same Windows image. - Broeder

I have a dynamic questionnaire form in my company that is entirely database driven.

Since there are so many question types (e.g. yes/no, value based, multiple choice, sliders, multi-level drop downs, etc...), we have to create the form dynamically.

We would run into issues in production but not in our dev or QA environments, much like you, but our problem was the code. When we pushed the application to production, there were many more users running through many more scenarios than we were ever able to cover in dev/QA.

Whenever we see the 'failed to load viewstate', doing one of these two things will usually solve the problem:

  • Create all the controls dynamically only in the Init stage. It should be done here for the ViewState to work properly without much additional thought. ViewState is loaded after Init completes and is saved after PreRender completes. We can create the controls as late as the Load stage, but there may be nuances to the wiring. (http://msdn.microsoft.com/en-us/library/ms178472.aspx#general_page_lifecycle_stages)

  • Turn off AJAX and see if that fixes it (this is usually the culprit). If it does, then we just have to check that there are no AJAX postbacks that cause the page to change its layout. An AJAX call may do something as simple as making a control invisible, or re-rendering a control with a new ID, causing the next normal postback to detect changes in the control tree. If we have to make controls invisible through AJAX, we just add the attribute('display','none') instead. After we are done with the changes, we turn AJAX back on.

Many answered 14/7, 2011 at 18:52 Comment(1)
I can reproduce this bug easily myself; our production server is not yet live to the masses. Also, for the most part our code creates controls in Init. There are special cases, but it's been thoroughly tested. - Broeder

Does this happen for a variety of browsers? There are older versions of Safari and some proxy servers that would truncate the ViewState when it passes it back to the server.

One thing you might want to try is chunking the viewstate. You can do that by setting the maxPageStateFieldLength attribute on the pages element in your web.config. Here is an example:

<pages maxPageStateFieldLength="900">
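
For context, here is a sketch of where that attribute sits, assuming an otherwise default configuration. The value 900 is only the example figure from above, not a recommendation. As far as I know the default is -1, which sends the viewstate in a single hidden field.

```xml
<configuration>
  <system.web>
    <!-- Split the viewstate across several hidden fields
         (__VIEWSTATE, __VIEWSTATE1, ...) instead of one large field. -->
    <pages maxPageStateFieldLength="900" />
  </system.web>
</configuration>
```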

Finally, you may want to consider not using client-side viewstate at all. Here's an article that implements a server-side, SQL-based viewstate provider: http://www.codeproject.com/KB/viewstate/ViewStateProvider.aspx

Rik answered 14/7, 2011 at 19:18 Comment(4)
It happens in Chrome 12, Firefox 5 and IE9; we don't support old browsers (IE7 at the oldest). I'll try it tomorrow to be sure, though. Plus, I've seen it happen with very short viewstates, and they can be decoded successfully using a viewstate decoder. - Broeder
I see below that you can duplicate the problem. Have you tried using Fritz Onion's ViewState Decoder on what's being sent to the client and making sure that the dynamic controls aren't messing up the ControlState? FYI, ViewState Decoder is here: pluralsight-training.net/community/media/p/51688.aspx - Rik
The viewstate is decodable, and the data it contains is correct for the last few times the error occurred. - Broeder
I just put in the maxPageStateFieldLength setting to test it. I'm not encountering the error myself, but I've thought I had solved this one a couple of times, so I'm going to watch the error reports for a bit to be sure. - Broeder

We had a very similar issue: “failed to load viewstate” errors occurred on load-balanced production servers, but we were not able to reproduce them on local/development servers (even when we added an additional instance to load-balance the development site).

Finally, we found that, due to an error during deployment, one of the production servers had a different version from the other. When a user started on one server and continued on the other (no Sticky IP configured), viewstate errors occurred.

Gilmagilman answered 20/5, 2020 at 22:27 Comment(0)
