I'm working on a machine learning task and I'm trying to use Blender to generate synthetic images as a training dataset for a neural network. To do this, I need to find the 2D bounding box of each object in the rendered image.
My code, so far, is heavily based on the one suggested in this thread, but that code doesn't take into account whether a vertex is visible or occluded by another object. The desired result is exactly the one described here. I've tried the suggestion given there, but it doesn't work, and I can't tell whether that's because I'm passing the wrong inputs to the ray_cast function (the bpy API really doesn't make this easy) or simply because of the poor performance of the function, as I've read elsewhere. My code, right now, is:
import bpy
import numpy as np


def boundingbox(scene, camera, obj, limit=0.3):
    # Get the inverse of the camera's world transformation matrix.
    matrix = camera.matrix_world.normalized().inverted()

    # Get an evaluated copy of the mesh and transform it into camera space.
    dg = bpy.context.evaluated_depsgraph_get()
    # eval_obj = bpy.context.object.evaluated_get(dg)
    eval_obj = obj.evaluated_get(dg)
    mesh = eval_obj.to_mesh()
    mesh.transform(obj.matrix_world)
    mesh.transform(matrix)

    # Get the camera frame bounding box coordinates.
    frame = [-v for v in camera.data.view_frame(scene=scene)[:3]]
    origin = camera.location

    lx = []
    ly = []
    for v in mesh.vertices:
        co_local = v.co
        z = -co_local.z

        direction = co_local - origin
        # ray_cast returns (result, location, normal, index, object, matrix).
        result = scene.ray_cast(view_layer=bpy.context.window.view_layer,
                                origin=origin, direction=direction)
        intersection = result[0]
        met_obj = result[4]
        if intersection:
            if met_obj.type == 'CAMERA':
                intersection = False

        if z <= 0.0 or (intersection and (result[1] - co_local).length > limit):
            # Vertex is behind the camera or occluded by another object; ignore it.
            continue
        else:
            # Perspective division
            frame = [(v / (v.z / z)) for v in frame]

        min_x, max_x = frame[1].x, frame[2].x
        min_y, max_y = frame[0].y, frame[1].y

        x = (co_local.x - min_x) / (max_x - min_x)
        y = (co_local.y - min_y) / (max_y - min_y)
        lx.append(x)
        ly.append(y)

    eval_obj.to_mesh_clear()

    # The object is not in view if all the mesh verts were ignored.
    if not lx or not ly:
        return None

    min_x = np.clip(min(lx), 0.0, 1.0)
    min_y = np.clip(min(ly), 0.0, 1.0)
    max_x = np.clip(max(lx), 0.0, 1.0)
    max_y = np.clip(max(ly), 0.0, 1.0)

    # The object is not in view if both bounding points fall on the same side.
    if min_x == max_x or min_y == max_y:
        return None

    # Figure out the rendered image size.
    render = scene.render
    fac = render.resolution_percentage * 0.01
    dim_x = render.resolution_x * fac
    dim_y = render.resolution_y * fac

    # Return the box in the form (top-left x, top-left y), (width, height).
    return (
        (round(min_x * dim_x),            # X
         round(dim_y - max_y * dim_y)),   # Y
        (round((max_x - min_x) * dim_x),  # Width
         round((max_y - min_y) * dim_y))  # Height
    )
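For context, this is how I call the function; the 'Camera' and 'Cube' names are just placeholders for my actual scene:

scene = bpy.context.scene
cam = scene.objects['Camera']    # the render camera (placeholder name)
target = scene.objects['Cube']   # the object I want the 2D box for (placeholder name)
box = boundingbox(scene, cam, target)
# box is ((top-left x, top-left y), (width, height)) in pixels,
# or None if the object is not visible in the frame.
print(box)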
I've also tried casting the ray from the vertex to the camera position (instead of the other way around) and using the small-cubes workaround explained here, but to no avail. Could someone please help me figure out how to do this properly, or suggest another strategy? For reference, the vertex-to-camera attempt looked roughly like the sketch below.
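This is only a sketch, not my exact code: the helper name vertex_visible, the small 1e-4 offset, and the limit tolerance are mine, and world_co is assumed to be the vertex position in world space (obj.matrix_world @ v.co), taken before the camera-space transform:

def vertex_visible(scene, camera, world_co, limit=0.3):
    # Cast a ray from the vertex towards the camera and treat the vertex as
    # occluded if the ray hits something that isn't right next to the vertex.
    cam_pos = camera.matrix_world.translation
    direction = (cam_pos - world_co).normalized()
    # Start slightly off the surface so the ray doesn't immediately hit the
    # vertex's own face.
    ray_origin = world_co + direction * 1e-4
    hit, location, normal, index, hit_obj, matrix = scene.ray_cast(
        bpy.context.window.view_layer,  # on Blender 2.91+ the first argument is a depsgraph instead
        ray_origin, direction)
    if hit and (location - world_co).length > limit:
        return False
    return True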