Redraw image from 3d perspective to 2d

I need an inverse perspective transform written in Pascal/Delphi/Lazarus. See the following image:

[image: image process]

I think I need to walk through destination pixels and then calculate the corresponding position in the source image (To avoid problems with rounding errors etc.).

function redraw_3d_to_2d(sourcebitmap: TBitmap; sourceaspect: Extended;
  point_a, point_b, point_c, point_d: TPoint; megapixelcount: Integer): TBitmap;
var
  destinationbitmap: TBitmap;
  x, y, sx, sy: Integer;
begin
  destinationbitmap := TBitmap.Create;
  destinationbitmap.Width := megapixelcount * sourceaspect * ???;  // I don't know how to calculate this
  destinationbitmap.Height := megapixelcount * sourceaspect * ???; // I don't know how to calculate this
  for x := 0 to destinationbitmap.Width - 1 do
    for y := 0 to destinationbitmap.Height - 1 do
    begin
      sx := ??;
      sy := ??;
      destinationbitmap.Canvas.Pixels[x, y] := sourcebitmap.Canvas.Pixels[sx, sy];
    end;
  Result := destinationbitmap;
end;

I need the real formula... So an OpenGL solution would not be ideal...

Johnsonjohnsonese answered 9/1, 2013 at 18:31 Comment(0)

Note: There is a version of this with proper math typesetting on the Math SE.

Computing a projective transformation

A perspective is a special case of a projective transformation, which in turn is uniquely determined by the images of four points (no three of them collinear).

Step 1: Starting with the 4 positions in the source image, named (x1,y1) through (x4,y4), you solve the following system of linear equations:

[x1 x2 x3] [λ]   [x4]
[y1 y2 y3]∙[μ] = [y4]
[ 1  1  1] [τ]   [ 1]

The columns form homogeneous coordinates: one dimension more, created by appending a 1 as the last entry. In subsequent steps, multiples of these vectors will be used to denote the same points. See the last step for an example of how to turn these back into two-dimensional coordinates.
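Step 1 is just a 3×3 linear solve. A minimal sketch in Python (pure standard library, using Cramer's rule; the corner coordinates are made-up example values, not from the question):

```python
def det3(m):
    # Determinant of a 3x3 matrix given as a list of rows.
    a, b, c = m[0]
    d, e, f = m[1]
    g, h, i = m[2]
    return a * (e*i - f*h) - b * (d*i - f*g) + c * (d*h - e*g)

def solve_step1(p1, p2, p3, p4):
    # Solve [x1 x2 x3; y1 y2 y3; 1 1 1] * (lam, mu, tau)^T = (x4, y4, 1)^T
    # by Cramer's rule: replace one column at a time with the right-hand side.
    M = [[p1[0], p2[0], p3[0]],
         [p1[1], p2[1], p3[1]],
         [1.0,   1.0,   1.0]]
    rhs = [p4[0], p4[1], 1.0]
    d = det3(M)
    coeffs = []
    for col in range(3):
        Mc = [row[:] for row in M]
        for r in range(3):
            Mc[r][col] = rhs[r]
        coeffs.append(det3(Mc) / d)
    return coeffs  # [lam, mu, tau]

# Hypothetical example corners:
lam, mu, tau = solve_step1((0, 0), (4, 0), (0, 4), (4, 4))
```

For these example corners the solve yields λ = −1, μ = 1, τ = 1.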

Step 2: Scale the columns by the coefficients you just computed:

    [λ∙x1 μ∙x2 τ∙x3]
A = [λ∙y1 μ∙y2 τ∙y3]
    [λ    μ    τ   ]

This matrix will map (1,0,0) to a multiple of (x1,y1,1), (0,1,0) to a multiple of (x2,y2,1), (0,0,1) to a multiple of (x3,y3,1) and (1,1,1) to (x4,y4,1). So it will map these four special vectors (called basis vectors in subsequent explanations) to the specified positions in the image.
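Step 2 in code is a plain column scaling. A sketch (Python; the coefficients λ = −1, μ = 1, τ = 1 correspond to the hypothetical example corners (0,0), (4,0), (0,4), (4,4)):

```python
def build_matrix(p1, p2, p3, lam, mu, tau):
    # Columns are the homogeneous corner vectors (xk, yk, 1),
    # each scaled by the coefficient computed in step 1.
    return [[lam * p1[0], mu * p2[0], tau * p3[0]],
            [lam * p1[1], mu * p2[1], tau * p3[1]],
            [lam,         mu,         tau]]

A = build_matrix((0, 0), (4, 0), (0, 4), -1.0, 1.0, 1.0)
# A applied to (1,1,1) is the sum of its columns: (4, 4, 1), the fourth corner.
```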

Step 3: Repeat steps 1 and 2 for the corresponding positions in the destination image, in order to obtain a second matrix called B.

This is a map from basis vectors to destination positions.

Step 4: Invert B to obtain B⁻¹.

B maps from basis vectors to the destination positions, so the inverse matrix maps in the reverse direction.
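For a 3×3 matrix the inverse can be written out explicitly via the adjugate, so no general-purpose solver is needed. A sketch (Python; note that for projective purposes the division by the determinant could even be skipped, since the matrix is only needed up to scale):

```python
def inv3(m):
    # Inverse of a 3x3 matrix via the cofactor/adjugate formula:
    # inv(M) = adj(M) / det(M), adj(M) = transposed cofactor matrix.
    a, b, c = m[0]
    d, e, f = m[1]
    g, h, i = m[2]
    det = a * (e*i - f*h) - b * (d*i - f*g) + c * (d*h - e*g)
    return [[(e*i - f*h) / det, (c*h - b*i) / det, (b*f - c*e) / det],
            [(f*g - d*i) / det, (a*i - c*g) / det, (c*d - a*f) / det],
            [(d*h - e*g) / det, (b*g - a*h) / det, (a*e - b*d) / det]]
```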

Step 5: Compute the combined matrix C = A∙B⁻¹.

B⁻¹ maps from destination positions to basis vectors, while A maps from there to source positions. So the combination maps destination positions to source positions.
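The combination in step 5 is an ordinary matrix product. A sketch (Python):

```python
def matmul3(p, q):
    # Row-times-column product of two 3x3 matrices.
    return [[sum(p[r][k] * q[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]

# Small check with two shear matrices:
C = matmul3([[1, 2, 0], [0, 1, 0], [0, 0, 1]],
            [[1, 0, 0], [3, 1, 0], [0, 0, 1]])
```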

Step 6: For every pixel (x,y) of the destination image, compute the product

[x']     [x]
[y'] = C∙[y]
[z']     [1]

These are the homogeneous coordinates of your transformed point.

Step 7: Compute the position in the source image like this:

sx = x'/z'
sy = y'/z'

This is called dehomogenization of the coordinate vector.
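Putting steps 1–7 together: the sketch below (Python, pure standard library, with hypothetical corner coordinates) builds the combined map C and uses it to send destination pixels back to source positions. By construction, the four destination corners must land exactly on the four source corners, which makes a handy sanity check.

```python
def det3(m):
    a, b, c = m[0]; d, e, f = m[1]; g, h, i = m[2]
    return a * (e*i - f*h) - b * (d*i - f*g) + c * (d*h - e*g)

def basis_matrix(pts):
    # Steps 1+2: solve for (lam, mu, tau) by Cramer's rule, then scale columns.
    (x1, y1), (x2, y2), (x3, y3), (x4, y4) = pts
    M = [[x1, x2, x3], [y1, y2, y3], [1.0, 1.0, 1.0]]
    rhs = [x4, y4, 1.0]
    d = det3(M)
    coeffs = []
    for col in range(3):
        Mc = [row[:] for row in M]
        for r in range(3):
            Mc[r][col] = rhs[r]
        coeffs.append(det3(Mc) / d)
    return [[coeffs[c] * M[r][c] for c in range(3)] for r in range(3)]

def inv3(m):
    # Step 4: 3x3 inverse via the adjugate formula.
    a, b, c = m[0]; d, e, f = m[1]; g, h, i = m[2]
    det = det3(m)
    return [[(e*i - f*h)/det, (c*h - b*i)/det, (b*f - c*e)/det],
            [(f*g - d*i)/det, (a*i - c*g)/det, (c*d - a*f)/det],
            [(d*h - e*g)/det, (b*g - a*h)/det, (a*e - b*d)/det]]

def make_mapper(src_corners, dst_corners):
    # Steps 3-5: C = A * B^-1 maps destination positions to source positions.
    A = basis_matrix(src_corners)
    Binv = inv3(basis_matrix(dst_corners))
    C = [[sum(A[r][k] * Binv[k][c] for k in range(3)) for c in range(3)]
         for r in range(3)]
    def to_source(x, y):
        # Steps 6+7: apply C to (x, y, 1), then dehomogenize.
        xp = C[0][0]*x + C[0][1]*y + C[0][2]
        yp = C[1][0]*x + C[1][1]*y + C[1][2]
        zp = C[2][0]*x + C[2][1]*y + C[2][2]
        return xp / zp, yp / zp
    return to_source

# Hypothetical example: a skewed quadrilateral in the photo,
# mapped back from a 100x100 destination rectangle.
to_source = make_mapper(
    [(10, 10), (90, 20), (15, 80), (85, 85)],   # corners in the photo
    [(0, 0), (100, 0), (0, 100), (100, 100)])   # corners of the output
```

In the pixel loop of the question's redraw_3d_to_2d, sx and sy would then be the rounded components of to_source(x, y).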

All this math would be so much easier to read and write if SO were to support MathJax

Choosing the image size

The above approach assumes that you know the location of your corners in the destination image. For these you have to know the width and height of that image, which is marked by question marks in your code as well. So let's assume the height of your output image were 1, and the width were sourceaspect. In that case, the overall area would be sourceaspect as well. You have to scale that area by a factor of pixelcount/sourceaspect to achieve an area of pixelcount, which means you have to scale each edge length by the square root of that factor. So in the end, you have

pixelcount := 1000000 * megapixelcount;
width  := round(sqrt(pixelcount * sourceaspect));
height := round(sqrt(pixelcount / sourceaspect));
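A quick numeric check of those formulas (Python; the 2-megapixel count and 4:3 aspect are made-up example values):

```python
import math

def output_size(megapixelcount, sourceaspect):
    # Scale a 1 x sourceaspect rectangle so its area equals the pixel count:
    # each edge grows by sqrt(pixelcount / sourceaspect).
    pixelcount = 1000000 * megapixelcount
    width = round(math.sqrt(pixelcount * sourceaspect))
    height = round(math.sqrt(pixelcount / sourceaspect))
    return width, height

w, h = output_size(2, 4 / 3)
# w / h stays close to the 4:3 aspect and w * h close to 2,000,000.
```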
Silvery answered 9/1, 2013 at 19:8 Comment(12)
This is only valid if the photograph was taken at normal incidence. If it was taken at an angle, you will need the depth of the corners as well - these can, however, be deduced from the angle.Sartorial
@Thomas, I don't follow your concern. I know my approach does not handle any real-life aberrations, but as long as an ideal pinhole camera model applies, the angle will not matter: a perspective will be a projective transformation no matter the angle (unless your optical axis lies within the plane of the sheet of paper). And the above computation works for a general projective transformation. So no depth required. Note that I'm doing computations in the real projective plane, not in the real affine 3d space. Could it be you misunderstood this aspect?Silvery
Suppose the photograph was taken at a grazing angle of the drawing - you can apply your rotation as usual to obtain the properly aligned photograph, but still at a grazing angle, which is not the desired result (OP wants "right above the paper"). The OP did not mention whether input photographs were always taken from directly above the drawing.Sartorial
@Thomas, I'm applying a projective transformation, not just a rotation. It will map the image of a rectangle taken at a grazing angle to a rectangle filling the destination image. There is no requirement that the photo be taken from directly above the drawing.Silvery
Ah, I see. You will however not have enough information to accurately reconstruct the final image, since you will be upscaling the "squashed" pixels to pixels with the desired aspect ratio, but this is unavoidable.Sartorial
@Thomas, if the megapixelcount of the reconstructed image is much smaller than that of the input photograph, there is a chance, but I agree that in general the fidelity of the reconstructed image will suffer. Some forms of aliasing might be countered by techniques such as supersampling, but that won't restore information lost in the process, just make their lack less apparent.Silvery
@Sartorial - of course pixels will be interpolated and the reconstructed image will be more or less altered. I don't see how else this could be done - after all, the information needed is not present in a 2D photograph (CSI and all similar TV shows notwithstanding.) There are some impressive things done with video in this regard, but alas, this is not video.Pinchas
I can see that all of you know much more about math than I do... However, I would first like an answer on whether this projective transformation is technically possible to do. Allow me to outline where the source images come from: the user has a bunch of source images, each drawn with a small change - the purpose of my project is to create a tool for developing stop-motion animations. If I could somehow auto-detect the corners of the paper, it would be even better...Johnsonjohnsonese
@LeonardoHerrera, so this is video after all! :-) I see no reason why this should not work, except that manually detecting the corners will take a lot of time, and auto-detecting them requires some more work towards image recognition, edge detection in particular. The more vertical your optical axis is, and the higher the resolution of the photograph, the better the results will be.Silvery
@KasperDK - ouch, this is a complete new domain.Pinchas
In this case the algorithm shall consider only a single source image at a time. Is there anyone willing to transform MvG's solution into Pascal source code? I would be hugely grateful for that...Johnsonjohnsonese
I finally wrote a version of this answer on the Math Stackexchange, which has better math formatting capabilities due to its use of MathJax.Silvery

Use Graphics32, specifically TProjectiveTransformation (to use with the Transform method). Don't forget to leave some transparent margin in your source image so you don't get jagged edges.

Pinchas answered 9/1, 2013 at 20:6 Comment(0)
