Actually it's not that simple. Look at this picture (it's a 2D projection for simplicity):
The blue area is the current camera's view frustum. The yellow area is the part of the blue frustum covered by the rectangular selection on the screen. The aim is to build a new frustum that best represents the yellow area. The question is how the new frustum should fit the yellow area.
One possibility is presented in the picture below:
The new camera's view frustum is coloured purple and the camera's eye lies on the green line. Assuming the new camera has the same properties (fovy, znear, zfar, aspect) as the old one, we can compute its new position and direction.
Now some calculations:
Height and width of the near plane:
h = 2 * tan(fovy/2) * znear
w = aspect * h
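As a minimal sketch, the two formulas above can be written in Python. The function name `near_plane_size` is made up for illustration, and the code assumes fovy is given in radians and that aspect means width divided by height (the text does not state either explicitly):

```python
import math

def near_plane_size(fovy, znear, aspect):
    """Return (w, h), the width and height of the near plane.

    Assumptions: fovy is the vertical field of view in radians,
    aspect is width / height.
    """
    h = 2.0 * math.tan(fovy / 2.0) * znear
    w = aspect * h
    return w, h
```

For example, a 90 degree vertical field of view with znear = 1 gives a near plane about 2 units high.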
Screen-space coordinates of the rectangle:
rectangle = ( x0, y0, x1, y1 )
Screen-space center of the rectangle:
rcenter = ( (x0+x1)/2, (y0+y1)/2 )
Another image to clarify the next calculations:
View-space vector lying on the near plane, pointing from the near plane's center to the rectangle's center:
center = ( (rcenter.x / screen_width - 0.5) * w,
(0.5 - rcenter.y / screen_height) * h, 0 )
Then we have to transform this vector to world-space:
centerWS = center.x * camera_right_dir + center.y * camera_up_dir
New direction of the camera (dir2n):
dir1 = camera_dir * znear
dir2 = dir1 + centerWS
dir2n = normalize(dir2)
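The steps from the rectangle's center to dir2n could be sketched in Python as follows. The function name `new_camera_direction` and the use of plain 3-tuples for vectors are assumptions made for this example. It also assumes that camera_dir, camera_right_dir and camera_up_dir are unit-length and mutually perpendicular, and that screen y grows downwards (which is what the 0.5 - y term implies):

```python
import math

def new_camera_direction(rect, screen_size, near_size, znear,
                         camera_dir, camera_right_dir, camera_up_dir):
    """Return dir2n, the normalized direction through the rectangle's center.

    rect        = (x0, y0, x1, y1) in screen pixels
    screen_size = (screen_width, screen_height)
    near_size   = (w, h) of the near plane
    Vectors are 3-tuples in world space (assumed orthonormal basis).
    """
    x0, y0, x1, y1 = rect
    screen_width, screen_height = screen_size
    w, h = near_size

    # Screen-space center of the rectangle.
    rcx = (x0 + x1) / 2.0
    rcy = (y0 + y1) / 2.0

    # Offset on the near plane, from its center to the rectangle's center.
    cx = (rcx / screen_width - 0.5) * w
    cy = (0.5 - rcy / screen_height) * h

    # dir2 = camera_dir * znear + centerWS
    dir2 = tuple(znear * d + cx * r + cy * u
                 for d, r, u in zip(camera_dir, camera_right_dir, camera_up_dir))

    length = math.sqrt(sum(c * c for c in dir2))
    return tuple(c / length for c in dir2)
```

A rectangle centered on the screen leaves the direction unchanged, while a rectangle in the right half of the screen turns the camera towards camera_right_dir.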
New position of the camera (pos2):
I've made some assumptions to simplify the calculations.
The new and old near planes are approximately parallel, so:
(w, h) / dist = (w * (x1-x0)/screen_width, h * (y1-y0)/screen_height) / znear
(1, 1) / dist = ( (x1-x0)/screen_width, (y1-y0)/screen_height) / znear
(1/dist, 1/dist) = ( (x1-x0)/screen_width, (y1-y0)/screen_height) / znear
Hence:
dist = znear * screen_width / (x1-x0)
Which should be equal to:
dist = znear * screen_height / (y1-y0)
That is true only if the rectangle has the same proportions as the screen. You can guarantee this by locking the rectangle's proportions while the user is drawing it, or you can simply use only the rectangle's width (x1-x0) and ignore its height (y1-y0), or vice versa.
Finally, with pos1 being the old camera's position:
pos2 = pos1 + dir2n * (dist-znear)
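The distance and position formulas can be sketched in Python like this. The function name `zoom_camera_position` is hypothetical. The sketch uses only the rectangle's width, as suggested above, and additionally assumes x1 > x0 (it raises an error otherwise, which is a choice made for this example):

```python
def zoom_camera_position(pos1, dir2n, rect, screen_width, znear):
    """Return pos2, the new camera position.

    pos1 and dir2n are 3-tuples; rect = (x0, y0, x1, y1) in screen pixels.
    Only the rectangle's width is used; its height is ignored.
    """
    x0, _, x1, _ = rect
    if x1 <= x0:
        # Assumption of this sketch: the rectangle is given left to right.
        raise ValueError("rectangle must have positive width (x1 > x0)")

    # dist = znear * screen_width / (x1 - x0)
    dist = znear * screen_width / (x1 - x0)

    # pos2 = pos1 + dir2n * (dist - znear)
    return tuple(p + d * (dist - znear) for p, d in zip(pos1, dir2n))
```

Selecting the whole screen gives dist = znear, so the camera does not move. Selecting half the screen width gives dist = 2 * znear, so the camera moves forward by znear along dir2n.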