How to overlay a PNG with alpha/transparency onto a frame in real time

I'm working from the OpenCV Android 2.4.11 sample that detects faces using the camera. Instead of drawing a rectangle around the detected face, I'm trying to put a mask (a PNG image) over the face. But when the image is drawn over the face, the PNG shows a black background where there should be transparency.

FdActivity.java

public void onCameraViewStarted(int width, int height) {
        mGray = new Mat();
        mRgba = new Mat();

        //Load my mask png
        Bitmap image = BitmapFactory.decodeResource(getResources(), R.drawable.mask_1);

        mask = new Mat();

        Utils.bitmapToMat(image, mask);

}

public Mat onCameraFrame(CvCameraViewFrame inputFrame) {

        mRgba = inputFrame.rgba();
        mGray = inputFrame.gray();

        if (mAbsoluteFaceSize == 0) {
            int height = mGray.rows();
            if (Math.round(height * mRelativeFaceSize) > 0) {
                mAbsoluteFaceSize = Math.round(height * mRelativeFaceSize);
            }
            mNativeDetector.setMinFaceSize(mAbsoluteFaceSize);
        }

        MatOfRect faces = new MatOfRect();

        if (mDetectorType == JAVA_DETECTOR) {
            if (mJavaDetector != null)
                mJavaDetector.detectMultiScale(mGray, faces, 1.1, 2, 2,
                        new Size(mAbsoluteFaceSize, mAbsoluteFaceSize), new Size());
        }
        else if (mDetectorType == NATIVE_DETECTOR) {
            if (mNativeDetector != null)
                mNativeDetector.detect(mGray, faces);
        }
        else {
            Log.e(TAG, "Detection method is not selected!");
        }

        Rect[] facesArray = faces.toArray();


        for (int i = 0; i < facesArray.length; i++) {

              overlayImage(mRgba, mask, facesArray[i]);

        }

        return mRgba;
    }

    public Mat overlayImage(Mat background, Mat foregroundMask, Rect faceRect)
    {
        Mat mask = new Mat();

        Imgproc.resize(this.mask, mask, faceRect.size());

        Mat source = new Mat();
        Imgproc.resize(foregroundMask, source, background.size());

        mask.copyTo( background.submat( new Rect((int) faceRect.tl().x, (int) faceRect.tl().y, mask.cols(), mask.rows())) );

        source.release();
        mask.release();
        return background;
    }
Brien asked 28/4, 2016 at 17:23 – Comments (5)
Are you asking how to alpha blend with OpenCV? (See the explanation near the end, and port those two lines to Java.) – Tracy
I checked your code, and what happened was that the PNG came out with a black background instead of the alpha effect. That is, the PNG is apparently being loaded with this black background, but the original image has no background! – Brien
@DanMašek, thanks for the reply, but I tried this approach and couldn't get it to work. The PNG image ends up totally transparent, leaving only the visible image contours. I need to remove the black area that is originally transparent... No matter which combination of alpha, beta and gamma values I use, the result is not what I expect... Core.addWeighted(mRgba.submat(eyeArea), 1, maskEye, 1, 1, mRgba.submat(eyeArea)); – Brien
Hey @Brien, have you ported the Python code from DanMašek's answer to Java? Can you share it? – Tiemroth
I think this thread has a simpler solution: #47248553 – Calyx

Note: I will explain the general principle and give you an example implementation in Python, as I don't have the Android development environment set up. It should be fairly straightforward to port this to Java. Feel free to post your code as a separate answer.


You need to do something similar to what the addWeighted operation does, that is, the linear blend operation

dst = α · src1 + (1 − α) · src2

However, in your case, α needs to be a matrix (i.e. we need a different blending coefficient per pixel).


Sample Images

Let's use some sample images to illustrate this. We can use the Lena image as a sample face:

Sample Face

This image as an overlay with transparency:

Overlay with Alpha

And this image as an overlay without transparency:

Overlay without Alpha


Blending Matrix

To obtain the alpha matrix, we can either determine the foreground (overlay) and background (the face) masks using thresholding, or use the alpha channel from the input image if this is available.

It is useful to perform this on floating point images with values in range 0.0 .. 1.0. We can then express the relationship between the two masks as

foreground_mask = 1.0 - background_mask

i.e. the two masks added together result in all ones.

For the overlay image in RGBA format we get the following foreground and background masks:

Foreground mask from transparency

Background mask from transparency

When we instead use thresholding, erosion and blurring (for the RGB-format overlay), we get the following foreground and background masks:

Foreground mask from threshold

Background mask from threshold
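
If you are porting this part to the OpenCV Java bindings, the two masks could be derived from an RGBA overlay roughly as follows (only a sketch, assuming the overlay is the RGBA Mat produced by Utils.bitmapToMat; alphaMasks is just a hypothetical helper name):

import org.opencv.core.Core;
import org.opencv.core.CvType;
import org.opencv.core.Mat;
import java.util.ArrayList;
import java.util.List;

// Returns {foreground_mask, background_mask} as single-channel CV_32F Mats in range 0.0 .. 1.0
static Mat[] alphaMasks(Mat overlayRgba) {
    List<Mat> channels = new ArrayList<>();
    Core.split(overlayRgba, channels);                        // R, G, B, A planes

    Mat foregroundMask = new Mat();
    channels.get(3).convertTo(foregroundMask, CvType.CV_32F, 1.0 / 255.0);  // alpha -> 0..1

    Mat backgroundMask = new Mat();
    Core.subtract(Mat.ones(foregroundMask.size(), CvType.CV_32F),
            foregroundMask, backgroundMask);                  // 1.0 - foreground_mask

    return new Mat[] { foregroundMask, backgroundMask };
}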


Weighted Sum

Now we can calculate two weighted parts:

foreground_part = overlay_image * foreground_mask
background_part = face_image * background_mask

For RGBA overlay the foreground and background parts look as follows:

Foreground part (RGBA overlay)

Background part (RGBA overlay)

And for the RGB overlay the foreground and background parts look as follows:

Foreground part (RGB overlay)

Background part (RGB overlay)


And finally we add them together and convert the image back to 8-bit integers in the range 0-255.
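
In the Java bindings this last step could look roughly like this (a minimal sketch; fgPart and bgPart are hypothetical CV_32FC3 Mats holding the two weighted parts in the 0.0 .. 1.0 range):

Mat blended = new Mat();
Core.add(fgPart, bgPart, blended);                 // per-pixel sum of the two weighted parts
Mat result = new Mat();
blended.convertTo(result, CvType.CV_8UC3, 255.0);  // rescale to 0..255 and convert to 8-bit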

The result of the operations looks as follows (RGBA and RGB overlay respectively):

Merged (RGBA overlay)

Merged (RGB overlay)


Code Sample - RGB Overlay

import numpy as np
import cv2

# ==============================================================================

def blend_non_transparent(face_img, overlay_img):
    # Let's find a mask covering all the non-black (foreground) pixels
    # NB: We need to do this on grayscale version of the image
    gray_overlay = cv2.cvtColor(overlay_img, cv2.COLOR_BGR2GRAY)
    overlay_mask = cv2.threshold(gray_overlay, 1, 255, cv2.THRESH_BINARY)[1]

    # Let's shrink and blur it a little to make the transitions smoother...
    overlay_mask = cv2.erode(overlay_mask, cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3)))
    overlay_mask = cv2.blur(overlay_mask, (3, 3))

    # And the inverse mask, that covers all the black (background) pixels
    background_mask = 255 - overlay_mask

    # Turn the masks into three channel, so we can use them as weights
    overlay_mask = cv2.cvtColor(overlay_mask, cv2.COLOR_GRAY2BGR)
    background_mask = cv2.cvtColor(background_mask, cv2.COLOR_GRAY2BGR)

    # Create a masked out face image, and masked out overlay
    # We convert the images to floating point in range 0.0 - 1.0
    face_part = (face_img * (1 / 255.0)) * (background_mask * (1 / 255.0))
    overlay_part = (overlay_img * (1 / 255.0)) * (overlay_mask * (1 / 255.0))

    # And finally just add them together, and rescale it back to an 8bit integer image
    return np.uint8(cv2.addWeighted(face_part, 255.0, overlay_part, 255.0, 0.0))

# ==============================================================================

# We load the images
face_img = cv2.imread("lena.png", -1)
overlay_img = cv2.imread("overlay.png", -1)

result_1 = blend_non_transparent(face_img, overlay_img)
cv2.imwrite("merged.png", result_1)

Code Sample - RGBA Overlay

import numpy as np
import cv2

# ==============================================================================

def blend_transparent(face_img, overlay_t_img):
    # Split out the transparency mask from the colour info
    overlay_img = overlay_t_img[:,:,:3] # Grab the BGR planes
    overlay_mask = overlay_t_img[:,:,3:]  # And the alpha plane

    # Again calculate the inverse mask
    background_mask = 255 - overlay_mask

    # Turn the masks into three channel, so we can use them as weights
    overlay_mask = cv2.cvtColor(overlay_mask, cv2.COLOR_GRAY2BGR)
    background_mask = cv2.cvtColor(background_mask, cv2.COLOR_GRAY2BGR)

    # Create a masked out face image, and masked out overlay
    # We convert the images to floating point in range 0.0 - 1.0
    face_part = (face_img * (1 / 255.0)) * (background_mask * (1 / 255.0))
    overlay_part = (overlay_img * (1 / 255.0)) * (overlay_mask * (1 / 255.0))

    # And finally just add them together, and rescale it back to an 8bit integer image    
    return np.uint8(cv2.addWeighted(face_part, 255.0, overlay_part, 255.0, 0.0))

# ==============================================================================

# We load the images
face_img = cv2.imread("lena.png", -1)
overlay_t_img = cv2.imread("overlay_transparent.png", -1) # Load with transparency

result_2 = blend_transparent(face_img, overlay_t_img)
cv2.imwrite("merged_transparent.png", result_2)
Mazel answered 12/5, 2016 at 21:54 – Comments (8)
This code (blend_transparent) gives me this error: File "./test.py", line 19, in blend_transparent face_part = (face_img * (1 / 255.0)) * (background_mask * (1 / 255.0)) ValueError: operands could not be broadcast together with shapes (614,500,3) (640,500,3) – Linsang
The blending algorithm requires the images to be of the same size. It's simple to fix that in your code from the other question: just change line 35 to rotated = cv2.warpPerspective(glasses, M, (face.shape[1], face.shape[0])). – Tracy
TMI... Too many images ;) Thanks for this straightforward answer! – Volkslied
@Volkslied :) Yeah, it's a bit image-heavy, although they all seemed to be relevant to the explanation when I wrote it (I like to provide the inputs so the reader can reproduce it, as well as show the intermediate steps and the results). That said, if you have some ideas/suggestions on how to improve it, let me know (or even better, edit the answer directly). Glad it was useful. – Tracy
No, just kidding :P I also like having all the steps between input and output; that way it's easier to understand them all. BTW, just tested the code in my project a minute ago and it works just fine :D – Volkslied
@EB Sure, but only the first variant is relevant in that case, since JPEG doesn't support transparency. You may also have to adjust the part that determines the mask, since there may be some artifacts introduced by lossy compression. – Tracy
Thank you so much for this! – Shock
Hi Dan Mašek. Can I ask something? When I try your code on my video (not an image) I get the following error ==> overlay_img = overlay_t_img[:,:,:3] # Grab the BGR planes TypeError: 'cv2.VideoCapture' object is not subscriptable. I know what it means, but I don't know how to solve it. May I ask you to guide me in solving this issue? – Welby
