A few additional thoughts.
What does the CLR see when this code is executed?
As Jon and others have correctly noted, we are not doing variance on classes, only interfaces and delegates. So in your example, the CLR sees nothing; that code doesn't compile. If you force it to compile by inserting enough casts, it fails at runtime with an InvalidCastException.
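To make that concrete, here is a minimal sketch, assuming C# 4 or later. The class and method names are made up for illustration; List<T> stands in for whatever generic class the question used.

```csharp
using System;
using System.Collections.Generic;

static class ClassVarianceDemo
{
    // Hypothetical helper: returns true when the forced cast fails at runtime.
    public static bool ForcedCastThrows()
    {
        List<string> strings = new List<string> { "hello" };

        // List<object> objects = strings;   // does not compile: classes are invariant

        try
        {
            // Going through object silences the compiler, but not the CLR.
            List<object> objects = (List<object>)(object)strings;
            return objects == null;          // not reached; the cast above throws
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }
}
```

Calling ClassVarianceDemo.ForcedCastThrows() should return true, since a List<string> is never a List<object>.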
Now, it's still a reasonable question to ask how variance works behind the scenes when it does work. The answer is: we restrict variance to reference type arguments of interface and delegate types precisely so that nothing has to happen behind the scenes. When you say:
object x = "hello";
what happens behind the scenes is the reference to the string is stuck into the variable of type object without modification. The bits that make up a reference to a string are legal bits to be a reference to an object, so nothing needs to happen here. The CLR simply stops thinking of those bits as referring to a string and starts thinking of them as referring to an object.
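One way to observe this from C# is to check reference identity. This is only a sketch with made-up names; it shows that both variables hold the same reference, which is the visible consequence of the conversion being a no-op.

```csharp
using System;

static class ReferenceIdentityDemo
{
    // Hypothetical helper: true when the string and object variables
    // refer to the very same object.
    public static bool SameReference()
    {
        string s = "hello";
        object x = s;   // implicit reference conversion: no copy, no wrapper
        return object.ReferenceEquals(s, x);
    }
}
```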
When you say:
IEnumerator<string> e1 = whatever;
IEnumerator<object> e2 = e1;
Same thing. Nothing happens. The bits that make up a reference to a string enumerator are the same as the bits that make up a reference to an object enumerator. There is somewhat more magic that comes into play when you do a cast, say:
IEnumerator<string> e1 = whatever;
IEnumerator<object> e2 = (IEnumerator<object>)(object)e1;
Now the CLR must generate a check that e1 actually does implement that interface, and that check has to be smart about recognizing variance.
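A rough sketch of what that runtime check accepts and rejects, assuming .NET 4 or later. The Words iterator and the method names are invented for the example; Words stands in for "whatever".

```csharp
using System;
using System.Collections.Generic;

static class VariantCastDemo
{
    // Hypothetical source of an IEnumerator<string>.
    static IEnumerator<string> Words()
    {
        yield return "hello";
    }

    // The covariant cast goes through object, so the CLR must check it at runtime.
    public static bool CovariantCastSucceeds()
    {
        IEnumerator<string> e1 = Words();
        IEnumerator<object> e2 = (IEnumerator<object>)(object)e1;
        return object.ReferenceEquals(e1, e2);
    }

    // The same check rejects a type argument that string does not convert to.
    public static bool UnrelatedCastThrows()
    {
        IEnumerator<string> e1 = Words();
        try
        {
            IEnumerator<Uri> bad = (IEnumerator<Uri>)(object)e1;
            return bad == null;              // not reached; the cast above throws
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }
}
```

Both methods should return true: the first cast is allowed because string converts to object by reference conversion, the second is refused because string does not convert to Uri.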
But the reason we can get away with variant interfaces being just no-op conversions is that regular assignment compatibility already works that way. What are you going to use e2 for?
object z = e2.Current;
That returns bits that are a reference to a string. We've already established that those are compatible with object without change.
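Putting the pieces together in a small sketch (names are illustrative, and Words again stands in for "whatever"): the object read through e2 is the same string that e1 would hand back.

```csharp
using System;
using System.Collections.Generic;

static class CurrentDemo
{
    // Hypothetical source of an IEnumerator<string>.
    static IEnumerator<string> Words()
    {
        yield return "hello";
    }

    public static bool CurrentIsTheSameString()
    {
        IEnumerator<string> e1 = Words();
        IEnumerator<object> e2 = e1;     // no-op conversion; same enumerator
        e2.MoveNext();                   // advances the one shared enumerator

        object z = e2.Current;           // a reference to a string, typed as object
        return z is string && object.ReferenceEquals(z, e1.Current);
    }
}
```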
Why wasn't this introduced earlier? We had other features to do and a limited budget.
What's the principal benefit? That conversions from sequence of string to sequence of object "just work".
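For instance, a method written against IEnumerable<object> can take a List<string> directly. The sketch below assumes C# 4 or later; CountNonNull is a made-up helper, not a framework method.

```csharp
using System;
using System.Collections.Generic;

static class SequenceDemo
{
    // Hypothetical helper that only knows about sequences of object.
    public static int CountNonNull(IEnumerable<object> items)
    {
        int count = 0;
        foreach (object item in items)
        {
            if (item != null) count++;
        }
        return count;
    }

    public static int Run()
    {
        List<string> names = new List<string> { "a", null, "b" };
        // Before C# 4 this call needed something like Cast<object>() or a copy.
        return CountNonNull(names);
    }
}
```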