Implementing a complex rotation-based camera

Asked 13/7, 2012 at 12:56 Answered 31/7, 2012 at 13:19

Solved c++ math matrix directx quaternions

I am implementing a 3D engine for spatial visualisation, and am writing a camera with the following navigation features:

Rotate the camera (ie, analogous to rotating your head)
Rotate around an arbitrary 3D point (a point in space, which is probably not in the center of the screen; the camera needs to rotate around this keeping the same relative look direction, ie the look direction changes too. This does not look directly at the chosen rotation point)
Pan in the camera's plane (so move up/down or left/right in the plane orthogonal to the camera's look vector)

The camera is not supposed to roll - that is, 'up' remains up. Because of this I represent the camera with a location and two angles, rotations around the X and Y axes (Z would be roll.) The view matrix is then recalculated using the camera location and these two angles. This works great for pan and rotating the eye, but not for rotating around an arbitrary point. Instead I get the following behaviour:

The eye itself apparently moving further up or down than it should
The eye not moving up or down at all when m_dRotationX is 0 or pi. (Gimbal lock? How can I avoid this?)
The eye's rotation being inverted (changing the rotation makes it look further up when it should look further down, down when it should look further up) when m_dRotationX is between pi and 2pi.

(a) What is causing this 'drift' in rotation?

This may be gimbal lock. If so, the standard answer to this is 'use quaternions to represent rotation', said many times here on SO (1, 2, 3 for example), but unfortunately without concrete details (example. This is the best answer I've found so far; it's rare.) I've struggled to implement a camera using quaternions combining the above two types of rotations. I am, in fact, building a quaternion using the two rotations, but a commenter below said there was no reason to - it's fine to immediately build the matrix.

This occurs when changing the X and Y rotations (which represent the camera look direction) when rotating around a point, but does not occur simply when directly changing the rotations, i.e. rotating the camera around itself. To me, this doesn't make sense. It's the same values.

(b) Would a different approach (quaternions, for example) be better for this camera? If so, how do I implement all three camera navigation features above?

If a different approach would be better, then please consider providing a concrete implemented example of that approach. (I am using DirectX9 and C++, and the D3DX* library the SDK provides.) In this second case, I will add and award a bounty in a couple of days when I can add one to the question. This might sound like I'm jumping the gun, but I'm low on time and need to implement or solve this quickly (this is a commercial project with a tight deadline.) A detailed answer will also improve the SO archives, because most camera answers I've read so far are light on code.

Thanks for your help :)

Some clarifications

Thanks for the comments and answer so far! I'll try to clarify a few things about the problem:

The view matrix is recalculated from the camera position and the two angles whenever one of those things changes. The matrix itself is never accumulated (i.e. updated) - it is recalculated afresh. However, the camera position and the two angle variables are accumulated (whenever the mouse moves, for example, one or both of the angles will have a small amount added or subtracted, based on the number of pixels the mouse moved up-down and/or left-right onscreen.)
Commenter JCooper states I'm suffering from gimbal lock, and I need to:

add another rotation onto your transform that rotates the eyePos to be completely in the y-z plane before you apply the transformation, and then another rotation that moves it back afterward. Rotate around the y axis by the following angle immediately before and after applying the yaw-pitch-roll matrix (one of the angles will need to be negated; trying it out is the fastest way to decide which). double fixAngle = atan2(oEyeTranslated.z,oEyeTranslated.x);

Unfortunately, when implementing this as described, my eye shoots off above the scene at a very fast rate due to one of the rotations. I'm sure my code is simply a bad implementation of this description, but I still need something more concrete. In general, I find unspecific text descriptions of algorithms are less useful than commented, explained implementations. I am adding a bounty for a concrete, working example that integrates with the code below (i.e. with the other navigation methods, too.) This is because I would like to understand the solution, as well as have something that works, and because I need to implement something that works quickly since I am on a tight deadline.

Please, if you answer with a text description of the algorithm, make sure it is detailed enough to implement ('Rotate around Y, then transform, then rotate back' may make sense to you but lacks the details to know what you mean. Good answers are clear, signposted, will allow others to understand even with a different basis, are 'solid weatherproof information boards.')

In turn, I have tried to be clear describing the problem, and if I can make it clearer please let me know.

My current code

To implement the above three navigation features, in a mouse move event moving based on the pixels the cursor has moved:

// Adjust this to change rotation speed when dragging (units are radians per pixel mouse moves)
// This is both rotating the eye, and rotating around a point
static const double dRotatePixelScale = 0.001;
// Adjust this to change pan speed (units are meters per pixel mouse moves)
static const double dPanPixelScale = 0.15;

switch (m_eCurrentNavigation) {
    case ENavigation::eRotatePoint: {
        // Rotating around m_oRotateAroundPos
        const double dX = (double)(m_oLastMousePos.x - roMousePos.x) * dRotatePixelScale * D3DX_PI;
        const double dY = (double)(m_oLastMousePos.y - roMousePos.y) * dRotatePixelScale * D3DX_PI;

        // To rotate around the point, translate so the point is at (0,0,0) (this makes the point
        // the origin so the eye rotates around the origin), rotate, translate back
        // However, the camera is represented as an eye plus two (X and Y) rotation angles
        // This needs to keep the same relative rotation.

        // Rotate the eye around the point
        const D3DXVECTOR3 oEyeTranslated = m_oEyePos - m_oRotateAroundPos;
        D3DXMATRIX oRotationMatrix;
        D3DXMatrixRotationYawPitchRoll(&oRotationMatrix, dX, dY, 0.0);
        D3DXVECTOR4 oEyeRotated;
        D3DXVec3Transform(&oEyeRotated, &oEyeTranslated, &oRotationMatrix);
        m_oEyePos = D3DXVECTOR3(oEyeRotated.x, oEyeRotated.y, oEyeRotated.z) + m_oRotateAroundPos;

        // Increment rotation to keep the same relative look angles
        RotateXAxis(dX);
        RotateYAxis(dY);
        break;
    }
    case ENavigation::ePanPlane: {
        const double dX = (double)(m_oLastMousePos.x - roMousePos.x) * dPanPixelScale;
        const double dY = (double)(m_oLastMousePos.y - roMousePos.y) * dPanPixelScale;
        m_oEyePos += GetXAxis() * dX; // GetX/YAxis reads from the view matrix, so increments correctly
        m_oEyePos += GetYAxis() * -dY; // Inverted compared to screen coords
        break;
    }
    case ENavigation::eRotateEye: {
        // Rotate in radians around local (camera not scene space) X and Y axes
        const double dX = (double)(m_oLastMousePos.x - roMousePos.x) * dRotatePixelScale * D3DX_PI;
        const double dY = (double)(m_oLastMousePos.y - roMousePos.y) * dRotatePixelScale * D3DX_PI;
        RotateXAxis(dX);
        RotateYAxis(dY);
        break;
    }

The RotateXAxis and RotateYAxis methods are very simple:

void Camera::RotateXAxis(const double dRadians) {
    m_dRotationX += dRadians;
    m_dRotationX = fmod(m_dRotationX, 2 * D3DX_PI); // Keep in valid circular range
}

void Camera::RotateYAxis(const double dRadians) {
    m_dRotationY += dRadians;

    // Limit it so you don't rotate around when looking up and down
    m_dRotationY = std::min(m_dRotationY, D3DX_PI * 0.49); // Almost fully up
    m_dRotationY = std::max(m_dRotationY, D3DX_PI * -0.49); // Almost fully down
}

And to generate the view matrix from this:

void Camera::UpdateView() const {
    const D3DXVECTOR3 oEyePos(GetEyePos());
    const D3DXVECTOR3 oUpVector(0.0f, 1.0f, 0.0f); // Keep up "up", always.

    // Generate a rotation matrix via a quaternion
    D3DXQUATERNION oRotationQuat;
    D3DXQuaternionRotationYawPitchRoll(&oRotationQuat, m_dRotationX, m_dRotationY, 0.0);
    D3DXMATRIX oRotationMatrix;
    D3DXMatrixRotationQuaternion(&oRotationMatrix, &oRotationQuat);

    // Generate view matrix by looking at a point 1 unit ahead of the eye (transformed by the above
    // rotation)
    D3DXVECTOR3 oForward(0.0, 0.0, 1.0);
    D3DXVECTOR4 oForward4;
    D3DXVec3Transform(&oForward4, &oForward, &oRotationMatrix);
    D3DXVECTOR3 oTarget = oEyePos + D3DXVECTOR3(oForward4.x, oForward4.y, oForward4.z); // eye pos + look vector = look target position
    D3DXMatrixLookAtLH(&m_oViewMatrix, &oEyePos, &oTarget, &oUpVector);
}

Fidellia answered 13/7, 2012 at 12:56 Comment(4)

I think this error comes from rounding/truncation error. If I am right, increasing the range of the values should make the drift smaller – Synchronism 13/7, 2012 at 13:13

you should use more significant digits than you need – Synchronism 13/7, 2012 at 13:13

Last digit is the source of rounding/truncation – Synchronism 13/7, 2012 at 13:14

@tuğrulbüyükışık: Floating point error is possible, but I'm not sure where you mean? The angles are stored as doubles and are the 'whole' rotation (incrementing / adjusting a matrix every mouse movement would lead to much more error; here it's recalculated from scratch each time.) I'm also not sure how that would lead to angle-dependent drift. – Fidellia 13/7, 2012 at 13:44

It seems to me that "Roll" shouldn't be possible given the way you form your view matrix. Regardless of all the other code (some of which does look a little funny), the call D3DXMatrixLookAtLH(&m_oViewMatrix, &oEyePos, &oTarget, &oUpVector); should create a matrix without roll when given [0,1,0] as an 'Up' vector unless oTarget-oEyePos happens to be parallel to the up vector. This doesn't seem to be the case since you're restricting m_dRotationY to be within (-.49pi,+.49pi).

Perhaps you can clarify how you know that 'roll' is happening. Do you have a ground plane and the horizon line of that ground plane is departing from horizontal?

As an aside, in UpdateView, the D3DXQuaternionRotationYawPitchRoll seems completely unnecessary since you immediately turn around and change it into a matrix. Just use D3DXMatrixRotationYawPitchRoll as you did in the mouse event. Quaternions are used in cameras because they're a convenient way to accumulate rotations happening in eye coordinates. Since you're only using two axes of rotation in a strict order, your way of accumulating angles should be fine. The vector transformation of (0,0,1) isn't really necessary either. The oRotationMatrix should already have those values in the (_31,_32,_33) entries.
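The remark about the (_31,_32,_33) entries can be checked with a small standalone sketch. It does not use D3DX: the Vec3 and Mat3 types and the YawPitchRowVector helper are hypothetical, and the matrix layout (row vectors, pitch about X applied before yaw about Y) is an assumption about the convention, not something confirmed above.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-ins for D3DXVECTOR3 and the 3x3 part of D3DXMATRIX.
struct Vec3 { double x, y, z; };
struct Mat3 { double m[3][3]; };

// Row-vector convention (v' = v * M).
// Assumed order: pitch about X first, then yaw about Y.
Mat3 YawPitchRowVector(double yaw, double pitch) {
    const double cy = std::cos(yaw), sy = std::sin(yaw);
    const double cp = std::cos(pitch), sp = std::sin(pitch);
    return {{{ cy,      0.0, -sy      },
             { sy * sp, cp,   cy * sp },
             { sy * cp, -sp,  cy * cp }}};
}

// Row vector times matrix.
Vec3 Transform(const Vec3& v, const Mat3& a) {
    return { v.x * a.m[0][0] + v.y * a.m[1][0] + v.z * a.m[2][0],
             v.x * a.m[0][1] + v.y * a.m[1][1] + v.z * a.m[2][1],
             v.x * a.m[0][2] + v.y * a.m[1][2] + v.z * a.m[2][2] };
}

// The look vector read straight from the third row, with no multiplication.
Vec3 ForwardFromRow(const Mat3& a) {
    return { a.m[2][0], a.m[2][1], a.m[2][2] };
}
```

Transforming (0,0,1) simply selects the third row, so Transform and ForwardFromRow return the same vector for any yaw and pitch.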

Update

Given that it's not roll, here's the problem: you create a rotation matrix to move the eye in world coordinates, but you want the pitch to happen in camera coordinates. Since roll isn't allowed and yaw is performed last, yaw is always the same in both the world and camera frames of reference. Consider the images below:

[Image: Local rotation]

Your code works fine for local pitch and yaw because those are accomplished in camera coordinates.

[Image: Normal pitch around a point]

But when you rotate around a reference point, you are creating a rotation matrix that is in world coordinates and using that to rotate the camera center. This works okay if the camera's coordinate system happens to line up with the world's. However, if you don't check to see if you're up against the pitch limit before you rotate the camera position, you will get crazy behavior when you hit that limit. The camera will suddenly start to skate around the world--still 'rotating' around the reference point, but no longer changing orientation.

[Image: Locked pitch around a point]

If the camera's axes don't line up with the world's, strange things will happen. In the extreme case, the camera won't move at all because you're trying to make it roll.

[Image: Off axis pitch would cause roll]

The above is what would normally happen, but since you handle the camera orientation separately, the camera doesn't actually roll.

[Image: Camera orientation is handled separate from translation]

Instead, it stays upright, but you get strange translation going on.
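A small standalone sketch may make the world-versus-camera distinction concrete. The helpers are hypothetical (no D3DX) and the example geometry is an assumption chosen for illustration: the eye sits 5 units from the reference point along world X, looking back at it, so the camera's right axis lies along world Z. Pitching the eye offset about world X leaves it where it is, while pitching about the camera's right axis moves it vertically.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for D3DXVECTOR3.
struct Vec3 { double x, y, z; };

// Pitch applied in world coordinates: rotate about the world X axis.
Vec3 RotateAboutWorldX(const Vec3& v, double angle) {
    const double c = std::cos(angle), s = std::sin(angle);
    return { v.x, c * v.y - s * v.z, s * v.y + c * v.z };
}

// Rotate v about a unit-length axis k (Rodrigues' rotation formula).
// Passing the camera's right axis as k gives a pitch in camera coordinates.
Vec3 RotateAboutAxis(const Vec3& v, const Vec3& k, double angle) {
    const double c = std::cos(angle), s = std::sin(angle);
    const Vec3 cross = { k.y * v.z - k.z * v.y,
                         k.z * v.x - k.x * v.z,
                         k.x * v.y - k.y * v.x };
    const double dot = k.x * v.x + k.y * v.y + k.z * v.z;
    return { v.x * c + cross.x * s + k.x * dot * (1.0 - c),
             v.y * c + cross.y * s + k.y * dot * (1.0 - c),
             v.z * c + cross.z * s + k.z * dot * (1.0 - c) };
}
```

With the offset (5,0,0), RotateAboutWorldX returns (5,0,0) for any angle, which is the "camera won't move at all" case. RotateAboutAxis with the axis (0,0,1) returns (5cos θ, 5sin θ, 0), so the eye does travel around the reference point.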

One way to handle this would be to (1) always put the camera into a canonical position and orientation relative to the reference point, (2) make your rotation, and then (3) put it back when you're done (e.g., similar to the way that you translate the reference point to the origin, apply the Yaw-Pitch rotation, and then translate back). Thinking more about it, however, this probably isn't the best way to go.

Update 2

I think that Generic Human's answer is probably the best. The question remains as to how much pitch should be applied if the rotation is off-axis, but for now, we'll ignore that. Maybe it'll give you acceptable results.

The essence of the answer is this: Before mouse movement, your camera is at c₁ = m_oEyePos and being oriented by M₁ = D3DXMatrixRotationYawPitchRoll(&M_1,m_dRotationX,m_dRotationY,0). Consider the reference point a = m_oRotateAroundPos. From the point of view of the camera, this point is a'=M₁(a-c₁).

You want to change the orientation of the camera to M₂ = D3DXMatrixRotationYawPitchRoll(&M_2,m_dRotationX+dX,m_dRotationY+dY,0). [Important: Since you won't allow m_dRotationY to fall outside of a specific range, you should make sure that dY doesn't violate that constraint.] As the camera changes orientation, you also want its position to rotate around a to a new point c₂. This means that a won't change from the perspective of the camera. I.e., M₁(a-c₁)==M₂(a-c₂).

So we solve for c₂ (remember that the transpose of a rotation matrix is the same as the inverse):

M₂^TM₁(a-c₁)==(a-c₂) =>

-M₂^TM₁(a-c₁)+a==c₂

Now if we look at this as a transformation being applied to c₁, then we can see that it is first negated, then translated by a, then rotated by M₁, then rotated by M₂^T, negated again, and then translated by a again. These are transformations that graphics libraries are good at and they can all be squished into a single transformation matrix.

@Generic Human deserves credit for the answer, but here's code for it. Of course, you need to implement the function to validate a change in pitch before it's applied, but that's simple. This code probably has a couple typos since I haven't tried to compile:

case ENavigation::eRotatePoint: {
    const double dX = (double)(m_oLastMousePos.x - roMousePos.x) * dRotatePixelScale * D3DX_PI;
    double dY = (double)(m_oLastMousePos.y - roMousePos.y) * dRotatePixelScale * D3DX_PI;
    dY = validatePitch(dY); // dY needs to be kept within bounds so that m_dRotationY is within bounds

    D3DXMATRIX oRotationMatrix1; // The camera orientation before mouse-change
    D3DXMatrixRotationYawPitchRoll(&oRotationMatrix1, m_dRotationX, m_dRotationY, 0.0);

    D3DXMATRIX oRotationMatrix2; // The camera orientation after mouse-change
    D3DXMatrixRotationYawPitchRoll(&oRotationMatrix2, m_dRotationX + dX, m_dRotationY + dY, 0.0);

    D3DXMATRIX oRotationMatrix2Inv; // The inverse of the orientation
    D3DXMatrixTranspose(&oRotationMatrix2Inv,&oRotationMatrix2); // Transpose is the same in this case

    D3DXMATRIX oScaleMatrix; // Negative scaling matrix for negating the translation
    D3DXMatrixScaling(&oScaleMatrix,-1,-1,-1);

    D3DXMATRIX oTranslationMatrix; // Translation by the reference point
    D3DXMatrixTranslation(&oTranslationMatrix,
         m_oRotateAroundPos.x,m_oRotateAroundPos.y,m_oRotateAroundPos.z);

    D3DXMATRIX oTransformMatrix; // The full transform for the eyePos.
    // We assume the matrix multiply protects against variable aliasing
    D3DXMatrixMultiply(&oTransformMatrix,&oScaleMatrix,&oTranslationMatrix);
    D3DXMatrixMultiply(&oTransformMatrix,&oTransformMatrix,&oRotationMatrix1);
    D3DXMatrixMultiply(&oTransformMatrix,&oTransformMatrix,&oRotationMatrix2Inv);
    D3DXMatrixMultiply(&oTransformMatrix,&oTransformMatrix,&oScaleMatrix);
    D3DXMatrixMultiply(&oTransformMatrix,&oTransformMatrix,&oTranslationMatrix);

    D3DXVECTOR4 oEyeFinal;
    D3DXVec3Transform(&oEyeFinal, &m_oEyePos, &oTransformMatrix);

    m_oEyePos = D3DXVECTOR3(oEyeFinal.x, oEyeFinal.y, oEyeFinal.z);

    // Increment rotation to keep the same relative look angles
    RotateXAxis(dX);
    RotateYAxis(dY);
    break;
}
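The validatePitch function is left unimplemented above. One possible shape for it, as a sketch only: a free function (hypothetical name and signature; the code above calls a one-argument member instead) that trims dY so the accumulated pitch stays inside the ±0.49π range that RotateYAxis enforces in the question.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

const double kPi = 3.14159265358979323846;
const double kPitchLimit = kPi * 0.49; // same limit as RotateYAxis in the question

// Returns the part of dY that can be applied without pushing
// currentPitch outside [-limit, +limit].
double ValidatePitchDelta(double currentPitch, double dY,
                          double limit = kPitchLimit) {
    const double target = std::max(-limit, std::min(limit, currentPitch + dY));
    return target - currentPitch;
}
```

Inside the camera class this could be wrapped as dY = ValidatePitchDelta(m_dRotationY, dY), so that the same clamped value feeds both the eye transform and RotateYAxis.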

Mancunian answered 16/7, 2012 at 20:53 Comment(12)

Thanks JCooper. It's not rolling - I added some clarification above. Instead it seems to go 'wild' when m_dRotationX approaches 0 or pi. – Fidellia 25/7, 2012 at 12:47

This sounds very helpful - thank you! The image is good too. So, to both translate the eye and rotate it: translate the eye to YZ plane (by X); rotate by the above fixAngle (possibly negated); rotate around the X axis; rotate by the fix angle again (possibly negated); translate back; rotate around the Y axis? Hmm. If that is wrong, would you mind adding some pseudocode please to help me get it straight? – Fidellia 25/7, 2012 at 20:34

I have added a bounty for a code example (see question for details.) Thank you for the text description, but I am having trouble translating it to something workable. No doubt it is my own fault / problem and your answer is right! But I don't understand your description well enough to code something that works. – Fidellia 26/7, 2012 at 15:51

@DavidM I'm not much of a Direct3d guy, but I can probably put together at least some pseudocode. Tell me this though: what is the desired behavior if the camera is "pitching" around a reference point that is 90° off to the right? Consider the camera at the green point in my picture but facing off to the left. In this case, 'pitch' around the reference point would be 'roll' for the camera. You don't allow it to roll, but you do change the pitch even though it's around a completely different axis. It seems weird to change the direction of the look vector in this case. – Mancunian 27/7, 2012 at 14:22

The user can't click a point 90 degrees off to the right. I'm not completely sure I understand the problem though. By avoiding roll, what I mean is keeping the horizon horizontal. – Fidellia 28/7, 2012 at 16:44

@DavidM But the user can click a point that's not immediately ahead. So if you click a point that's as far off to the side as possible and then try to pitch, I'm not clear on what should happen. The camera should translate up or down around that point; but I don't think you want it to tilt by the same amount. I'm not sure how else to explain the problem at the moment. I'll get something together on Monday. – Mancunian 29/7, 2012 at 2:16

Ah, ok, I think I understand. If you can put something together I will appreciate it :) – Fidellia 30/7, 2012 at 9:55

@DavidM I've put some code together. Perhaps it will fit the bill. Essentially, I've just coded Generic Human's solution. – Mancunian 31/7, 2012 at 13:21

Hi @Mancunian - this looks good... it doesn't address the view matrix (eg eye look) though, does it, just the eye position? I haven't had time to try this, but I need to award the bounty within 30 minutes, so have yourself 150 points :) – Fidellia 2/8, 2012 at 15:8

@DavidM Although not exactly efficient, I think that your way of computing the view matrix, updateView(), should work okay once the eye position is calculated correctly. – Mancunian 2/8, 2012 at 15:17

Hi @Mancunian - sorry it took so long to reply. This gives an odd effect moving the mouse left or right - the camera corkscrews higher and higher. Any ideas? – Fidellia 16/8, 2012 at 16:58

@DavidM I don't have any guesses offhand. How big is the corkscrew effect relative to the yaw? Does it always happen? Does it happen if you use a hotkey to add to dX (so that you know that no dY is getting in)? Did you implement the validatePitch function? – Mancunian 18/8, 2012 at 2:10

I think there is a much simpler solution that lets you sidestep all rotation issues.

Notation: A is the point we want to rotate around, C is the original camera location, M is the original camera rotation matrix that maps global coordinates to the camera's local viewport.

Make a note of the local coordinates of A, which are equal to A' = M × (A - C).
Rotate the camera like you would in normal "eye rotation" mode. Update the view matrix M so that it is modified to M₂ and C remains unchanged.
Now we would like to find C₂ such that A' = M₂ × (A - C₂).
This is easily done by the equation C₂ = A - M₂^-1 × A'.
Voilà, the camera has been rotated and because the local coordinates of A are unchanged, A remains at the same location and the same scale and distance.

As an added bonus, the rotation behavior is now consistent between "eye rotation" and "point rotation" mode.
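The steps above can be sketched in standalone C++ (no D3DX). Everything named here is hypothetical: CameraToWorld builds M⁻¹ from yaw and pitch under an assumed convention (column vectors, pitch about X then yaw about Y), and M is taken as its transpose, since the inverse of a rotation matrix is its transpose. Which D3DX matrix plays the role of M depends on the library's row-vector convention, so check that before porting.

```cpp
#include <cassert>
#include <cmath>

const double kPi = 3.14159265358979323846;

// Hypothetical stand-ins for D3DXVECTOR3 and a 3x3 rotation matrix.
struct Vec3 { double x, y, z; };
struct Mat3 { double m[3][3]; };

Vec3 operator-(const Vec3& a, const Vec3& b) {
    return { a.x - b.x, a.y - b.y, a.z - b.z };
}

double Length(const Vec3& v) {
    return std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
}

// Camera-to-world rotation (this is M^-1 in the notation above).
// Assumed convention: column vectors, pitch about X, then yaw about Y.
Mat3 CameraToWorld(double yaw, double pitch) {
    const double cy = std::cos(yaw), sy = std::sin(yaw);
    const double cp = std::cos(pitch), sp = std::sin(pitch);
    return {{{ cy,  sy * sp, sy * cp },
             { 0.0, cp,      -sp     },
             { -sy, cy * sp, cy * cp }}};
}

// r * v
Vec3 Mul(const Mat3& r, const Vec3& v) {
    return { r.m[0][0] * v.x + r.m[0][1] * v.y + r.m[0][2] * v.z,
             r.m[1][0] * v.x + r.m[1][1] * v.y + r.m[1][2] * v.z,
             r.m[2][0] * v.x + r.m[2][1] * v.y + r.m[2][2] * v.z };
}

// transpose(r) * v, i.e. the world-to-camera rotation M applied to v
Vec3 MulTransposed(const Mat3& r, const Vec3& v) {
    return { r.m[0][0] * v.x + r.m[1][0] * v.y + r.m[2][0] * v.z,
             r.m[0][1] * v.x + r.m[1][1] * v.y + r.m[2][1] * v.z,
             r.m[0][2] * v.x + r.m[1][2] * v.y + r.m[2][2] * v.z };
}

// Returns C2 such that the pivot A keeps the same camera-local coordinates
// after the orientation changes from (yaw1, pitch1) to (yaw2, pitch2).
Vec3 OrbitEye(const Vec3& eye, const Vec3& pivot,
              double yaw1, double pitch1, double yaw2, double pitch2) {
    // Step 1: A' = M * (A - C)
    const Vec3 pivotLocal = MulTransposed(CameraToWorld(yaw1, pitch1), pivot - eye);
    // Step 4: C2 = A - M2^-1 * A'
    return pivot - Mul(CameraToWorld(yaw2, pitch2), pivotLocal);
}
```

As a check under these assumptions: with the eye at (0, 0, -5) looking at a pivot at the origin, changing the yaw from 0 to π/2 moves the eye to roughly (-5, 0, 0), where the new look direction (1, 0, 0) still points at the pivot. The distance to the pivot is preserved for off-centre pivots as well.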

Monahan answered 27/7, 2012 at 8:26 Comment(1)

Good call. This is an elegant solution. – Mancunian 1/8, 2012 at 17:48

You rotate around the point by repeatedly applying small rotation matrices, which probably causes the drift (small precision errors add up), and I bet you will not really trace a perfect circle after some time. Since the angles for the view are simple one-dimensional doubles, they drift much less.

A possible fix would be to store a dedicated yaw/pitch and relative position from the point when you enter that view mode, and use those to do the math. This requires a bit more bookkeeping, since you need to update them when moving the camera. Note that it will also make the camera move if the point moves, which I think is an improvement.

Fernferna answered 13/7, 2012 at 14:15 Comment(3)

It's not repeatedly applying matrices: "The view matrix is then recalculated using the camera location and these two angles." That's a recalculation, not merging the existing matrix with a new one. The only things accumulated are the two angles (individually) and the camera position itself, which are used to create the matrix anew. – Fidellia 16/7, 2012 at 10:56

Unless I completely misunderstand what you mean... have a look at the RotateXAxis() (and Y) and Update() methods - you can see that Update ignores the existing value of m_oViewMatrix and calculates a new value entirely. – Fidellia 16/7, 2012 at 11:16

Yes, I mean after 10 rotations, m_oEyePos will be the result of applying a rotation matrix 10 times to the initial position, but to m_dRotationX you simply add dX 10 times. The bigger accumulation of error for the first one caused me to say "not moving in a perfect circle" (BTW should be easy to test). I assumed that a positional drift was the only possible explanation for what you described (like the rotating point going out of view). – Fernferna 21/7, 2012 at 15:22

If I understand correctly, you are satisfied with the rotation component in the final matrix (save for inverted rotation controls in the problem #3), but not with the translation part, is that so?

The problem seems to come from the fact that you are treating them differently: you recalculate the rotation part from scratch every time, but accumulate the translation part (m_oEyePos). Other comments mention precision problems, but it's actually more significant than just FP precision: accumulating rotations from small yaw/pitch values is simply not the same, mathematically, as making one big rotation from the accumulated yaw/pitch. Hence the rotation/translation discrepancy. To fix this, try recalculating eye position from scratch simultaneously with the rotation part, similarly to how you find "oTarget = oEyePos + ...":

oEyePos = m_oRotateAroundPos - dist * D3DXVECTOR3(oForward4.x, oForward4.y, oForward4.z);

dist can be fixed or calculated from the old eye position. That will keep the rotation point in the screen center; in the more general case (which you are interested in), -dist * oForward here should be replaced by the old/initial m_oEyePos - m_oRotateAroundPos multiplied by the old/initial camera rotation to bring it to the camera space (finding a constant offset vector in camera's coordinate system), then multiplied by the inverted new camera rotation to get the new direction in the world.

This will, of course, be subject to gimbal lock when the pitch is straight up or down. You'll need to define precisely what behavior you expect in these cases to solve this part. On the other hand, locking at m_dRotationX=0 or =pi is rather strange (this is yaw, not pitch, right?) and might be related to the above.
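For the simpler centred case, the "oEyePos = m_oRotateAroundPos - dist * forward" idea can be sketched without D3DX. The names are hypothetical, and the look-direction formula assumes pitch about X followed by yaw about Y, with positive pitch looking down; check it against your own matrix convention. The point of the sketch is that the eye becomes a pure function of the accumulated angles, so no error builds up in m_oEyePos.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for D3DXVECTOR3.
struct Vec3 { double x, y, z; };

// Unit look vector for the assumed yaw/pitch convention.
Vec3 LookDirection(double yaw, double pitch) {
    return { std::sin(yaw) * std::cos(pitch),
             -std::sin(pitch),
             std::cos(yaw) * std::cos(pitch) };
}

// Eye position that keeps `pivot` straight ahead at distance `dist`.
Vec3 EyeFromOrbit(const Vec3& pivot, double dist, double yaw, double pitch) {
    const Vec3 f = LookDirection(yaw, pitch);
    return { pivot.x - dist * f.x,
             pivot.y - dist * f.y,
             pivot.z - dist * f.z };
}
```

With yaw and pitch both zero and the pivot at the origin, EyeFromOrbit places the eye at (0, 0, -dist), looking down +Z at the pivot. Calling it from the mouse handler after RotateXAxis and RotateYAxis would replace the incremental update of the eye position.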

Amphitropous answered 31/7, 2012 at 13:19 Comment(0)
