How to add Picture in Picture (PIP) for WebRTC Video Calls in iOS Swift

We used the following steps to integrate PIP (Picture in Picture) for a WebRTC video call:

  1. We enabled the Audio, AirPlay, and Picture in Picture background mode capability in our project.

  2. We added an entitlements file with the camera-access-while-multitasking entitlement (see Accessing the Camera While Multitasking; an example entitlements file is sketched just after this list).

  3. From the documentation link, we followed:

    Provision Your App

    After your account has permission to use the entitlement, you can create a new provisioning profile with it by following these steps:

    1. Log in to your Apple Developer Account.

    2. Go to Certificates, Identifiers & Profiles.

    3. Generate a new provisioning profile for your app.

    4. Select the Multitasking Camera Access entitlement from the additional entitlements for your account.

  4. We also followed the article below, but we have no particular hint on how to add the video render layer view to this SampleBufferVideoCallView: https://developer.apple.com/documentation/avkit/adopting_picture_in_picture_for_video_calls?changes=__8

  5. Also, RTCMTLVideoView creates an MTKView, which isn't supported, so we used WebRTC's default video render view, RTCEAGLVideoView, which uses a GLKView for video rendering.
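
For reference, a minimal entitlements file for step 2 might look like the sketch below; the com.apple.developer.avfoundation.multitasking-camera-access key is the Multitasking Camera Access entitlement Apple grants on request, and it only takes effect once the provisioning profile from step 3 also includes it:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <!-- Granted by Apple on request; must also be present in the provisioning profile -->
    <key>com.apple.developer.avfoundation.multitasking-camera-access</key>
    <true/>
</dict>
</plist>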

Our PIP integration code with WebRTC for iOS in Swift:

class SampleBufferVideoCallView: UIView {
    override class var layerClass: AnyClass {
        get { return AVSampleBufferDisplayLayer.self }
    }
    
    var sampleBufferDisplayLayer: AVSampleBufferDisplayLayer {
        return layer as! AVSampleBufferDisplayLayer
    }
}

func startPIP() {
    if #available(iOS 15.0, *) {
        let sampleBufferVideoCallView = SampleBufferVideoCallView()
        let pipVideoCallViewController = AVPictureInPictureVideoCallViewController()
        pipVideoCallViewController.preferredContentSize = CGSize(width: 1080, height: 1920)
        pipVideoCallViewController.view.addSubview(sampleBufferVideoCallView)
        
        let remoteVideoRenderer = RTCEAGLVideoView()
        remoteVideoRenderer.contentMode = .scaleAspectFill
        remoteVideoRenderer.frame = viewUser.frame
        viewUser.addSubview(remoteVideoRenderer)
        
        let pipContentSource = AVPictureInPictureController.ContentSource(
            activeVideoCallSourceView: self.viewUser,
            contentViewController: pipVideoCallViewController)
        
        let pipController = AVPictureInPictureController(contentSource: pipContentSource)
        pipController.canStartPictureInPictureAutomaticallyFromInline = true
        pipController.delegate = self
        
    } else {
        // Fallback on earlier versions
    }
}

How can we add the viewUser GLKView to the pipContentSource, and how can we feed the remote video buffer into the SampleBufferVideoCallView?

Is it possible this way, or is there another way to render the video buffer into an AVSampleBufferDisplayLayer?

Laic answered 10/3, 2022 at 6:17 Comment(1)
Hi, how did you add the entitlement? – Hoffarth

Apple Code-Level Support gave the following advice when asked about this problem:

In order to make recommendations, we'd need to know more about the code you’ve tried to render the video.

As discussed in the article you referred to, to provide PiP support you must first provide a source view to display inside the video-call view controller -- you need to add a UIView to AVPictureInPictureVideoCallViewController. The system supports displaying content from an AVPlayerLayer or AVSampleBufferDisplayLayer depending on your needs. MTKView/GLKView isn’t supported. Video-calling apps need to display the remote view, so use AVSampleBufferDisplayLayer to do so.

In order to handle the drawing in your source view, you could gain access to the buffer stream before it is turned into a GLKView, and feed it to the content of the AVPictureInPictureViewController. For example, you can create CVPixelBuffers from the video feed frames, then from those, create CMSampleBuffers. Once you have the CMSampleBuffers, you can begin providing these to the AVSampleBufferDisplayLayer for display. Have a look at the methods defined there to see how this is done. There's some archived ObjC sample code, AVGreenScreenPlayer, that you might look at to help you get started using AVSampleBufferDisplayLayer (note: it's Mac code, but the AVSampleBufferDisplayLayer APIs are the same across platforms).
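
As a rough illustration of that advice, the conversion pipeline can be sketched as below; enqueue(_:on:), displayLayer, and pixelBuffer are placeholder names, and the timing simply stamps each frame with the host clock:

import AVFoundation
import CoreMedia

// Sketch: wrap an incoming CVPixelBuffer in a CMSampleBuffer and hand it to the layer.
func enqueue(_ pixelBuffer: CVPixelBuffer, on displayLayer: AVSampleBufferDisplayLayer) {
    var formatDescription: CMVideoFormatDescription?
    CMVideoFormatDescriptionCreateForImageBuffer(allocator: kCFAllocatorDefault,
                                                 imageBuffer: pixelBuffer,
                                                 formatDescriptionOut: &formatDescription)
    guard let format = formatDescription else { return }

    // Live content: timestamp "now", no fixed duration.
    var timing = CMSampleTimingInfo(duration: .invalid,
                                    presentationTimeStamp: CMClockGetTime(CMClockGetHostTimeClock()),
                                    decodeTimeStamp: .invalid)

    var sampleBuffer: CMSampleBuffer?
    CMSampleBufferCreateReadyWithImageBuffer(allocator: kCFAllocatorDefault,
                                             imageBuffer: pixelBuffer,
                                             formatDescription: format,
                                             sampleTiming: &timing,
                                             sampleBufferOut: &sampleBuffer)
    if let sampleBuffer = sampleBuffer {
        displayLayer.enqueue(sampleBuffer)
    }
}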

In addition, for implementing PiP support you'll want to provide delegate methods for AVPictureInPictureControllerDelegate and, for the AVSampleBufferDisplayLayer, AVPictureInPictureSampleBufferPlaybackDelegate. See the recent WWDC video What's new in AVKit for more information about the AVPictureInPictureSampleBufferPlaybackDelegate delegates, which are new in iOS 15.
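
A rough sketch of what those conformances can look like, written as a stand-alone delegate object (CallPiPDelegate is a placeholder name; the sample-buffer playback delegate is only wired up when you use the sampleBufferDisplayLayer-based ContentSource rather than the video-call one):

import AVKit
import CoreMedia

// Sketch only: empty delegate stubs for call-style PiP.
@available(iOS 15.0, *)
final class CallPiPDelegate: NSObject, AVPictureInPictureControllerDelegate,
                             AVPictureInPictureSampleBufferPlaybackDelegate {

    // MARK: AVPictureInPictureControllerDelegate
    func pictureInPictureControllerDidStartPictureInPicture(_ pictureInPictureController: AVPictureInPictureController) {
        // e.g. hide the full-screen remote view while PiP is active
    }

    func pictureInPictureController(_ pictureInPictureController: AVPictureInPictureController,
                                    failedToStartPictureInPictureWithError error: Error) {
        print("PiP failed to start: \(error)")
    }

    // MARK: AVPictureInPictureSampleBufferPlaybackDelegate (iOS 15+)
    func pictureInPictureController(_ pictureInPictureController: AVPictureInPictureController,
                                    setPlaying playing: Bool) {
        // A live call has no play/pause; pause or resume rendering here if needed.
    }

    func pictureInPictureControllerTimeRangeForPlayback(_ pictureInPictureController: AVPictureInPictureController) -> CMTimeRange {
        // An indefinite range tells the system this is live content.
        CMTimeRange(start: .negativeInfinity, duration: .positiveInfinity)
    }

    func pictureInPictureControllerIsPlaybackPaused(_ pictureInPictureController: AVPictureInPictureController) -> Bool {
        false
    }

    func pictureInPictureController(_ pictureInPictureController: AVPictureInPictureController,
                                    didTransitionToRenderSize newRenderSize: CMVideoDimensions) {
        // Optionally adapt the requested video resolution here.
    }

    func pictureInPictureController(_ pictureInPictureController: AVPictureInPictureController,
                                    skipByInterval skipInterval: CMTime,
                                    completion completionHandler: @escaping () -> Void) {
        completionHandler() // seeking is meaningless for a live call
    }
}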

Laic answered 2/6, 2022 at 5:13 Comment(0)

To display a picture-in-picture (PIP) with WebRTC in a video call using the provided code, follow these steps:

Step 1: Initialize the WebRTC video call

Make sure you have already set up the WebRTC video call with the necessary signaling and peer connection establishment. This code assumes you already have a remoteVideoTrack that represents the video stream received from the remote user.

Step 2: Create a FrameRenderer object

Instantiate the FrameRenderer object, which will be responsible for rendering the video frames received from the remote user for the PIP display.

// Add this code where you initialize your video call (before rendering starts)

var frameRenderer: FrameRenderer?

Step 3: Render remote video to the FrameRenderer

In the renderRemoteVideo function, add the video frames from the remoteVideoTrack to the FrameRenderer object to render them in the PIP view.

func renderRemoteVideo(to renderer: RTCVideoRenderer) {
    // Make sure you have already initialized the remoteVideoTrack from the WebRTC video call.

    if frameRenderer == nil {
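        // recUserID is assumed to be a property of the surrounding class holding the remote user's ID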
        frameRenderer = FrameRenderer(uID: recUserID)
    }

    self.remoteVideoTrack?.add(frameRenderer!)
}

Step 4: Remove the FrameRenderer from rendering remote video

In the removeRenderRemoteVideo function, remove the FrameRenderer object from rendering the video frames when you want to stop the PIP display.

func removeRenderRemoteVideo(to renderer: RTCVideoRenderer) {
    if frameRenderer != nil {
        self.remoteVideoTrack?.remove(frameRenderer!)
    }
}

Step 5: Define the FrameRenderer class

The FrameRenderer class is responsible for rendering video frames received from WebRTC in the PIP view.

// Import required frameworks
import Foundation
import WebRTC
import AVKit
import VideoToolbox
import Accelerate
import libwebp

// Define closure type for handling CMSampleBuffer, orientation, scaleFactor, and userID
typealias CMSampleBufferRenderer = (CMSampleBuffer, CGImagePropertyOrientation, CGFloat, Int) -> ()

// Define closure variables for handling CMSampleBuffer from FrameRenderer
var getCMSampleBufferFromFrameRenderer: CMSampleBufferRenderer = { _,_,_,_ in }
var getCMSampleBufferFromFrameRendererForPIP: CMSampleBufferRenderer = { _,_,_,_ in }
var getLocalVideoCMSampleBufferFromFrameRenderer: CMSampleBufferRenderer = { _,_,_,_ in }

// Define the FrameRenderer class responsible for rendering video frames
class FrameRenderer: NSObject, RTCVideoRenderer {
// VARIABLES
var scaleFactor: CGFloat?
var recUserID: Int = 0
var frameImage = UIImage()
var videoFormatDescription: CMFormatDescription?
var didGetFrame: ((CMSampleBuffer) -> ())?
private var ciContext = CIContext()

init(uID: Int) {
    super.init()
    recUserID = uID
}

// Set the aspect ratio based on the size
func setSize(_ size: CGSize) {
    self.scaleFactor = size.height > size.width ? size.height / size.width : size.width / size.height
}

// Render a video frame received from WebRTC
func renderFrame(_ frame: RTCVideoFrame?) {
    guard let pixelBuffer = self.getCVPixelBuffer(frame: frame) else {
        return
    }

    // Extract timing information from the frame and create a CMSampleBuffer
    let timingInfo = convertFrameTimestampToTimingInfo(frame: frame)!
    let cmSampleBuffer = self.createSampleBufferFrom(pixelBuffer: pixelBuffer, timingInfo: timingInfo)!

    // Determine the video orientation and handle the CMSampleBuffer accordingly
    let oriented: CGImagePropertyOrientation?
    switch frame!.rotation.rawValue {
    case RTCVideoRotation._0.rawValue:
        oriented = .right
    case RTCVideoRotation._90.rawValue:
        oriented = .right
    case RTCVideoRotation._180.rawValue:
        oriented = .right
    case RTCVideoRotation._270.rawValue:
        oriented = .left
    default:
        oriented = .right
    }

    // Pass the CMSampleBuffer to the appropriate closure based on the user ID
    if objNewUserDM?.userId == self.recUserID {
        getLocalVideoCMSampleBufferFromFrameRenderer(cmSampleBuffer, oriented!, self.scaleFactor!, self.recUserID)
    } else {
        getCMSampleBufferFromFrameRenderer(cmSampleBuffer, oriented!, self.scaleFactor!, self.recUserID)
        getCMSampleBufferFromFrameRendererForPIP(cmSampleBuffer, oriented!, self.scaleFactor!, self.recUserID)
    }

    // Call the didGetFrame closure if it exists
    if let closure = didGetFrame {
        closure(cmSampleBuffer)
    }
}

// Function to create a CVPixelBuffer from a CIImage
func createPixelBufferFrom(image: CIImage) -> CVPixelBuffer? {
    let attrs = [
        kCVPixelBufferCGImageCompatibilityKey: false,
        kCVPixelBufferCGBitmapContextCompatibilityKey: false,
        kCVPixelBufferWidthKey: Int(image.extent.width),
        kCVPixelBufferHeightKey: Int(image.extent.height)
    ] as CFDictionary

    var pixelBuffer: CVPixelBuffer?
    let status = CVPixelBufferCreate(kCFAllocatorDefault, Int(image.extent.width), Int(image.extent.height), kCVPixelFormatType_32BGRA, attrs, &pixelBuffer)

    if status == kCVReturnSuccess {
        self.ciContext.render(image, to: pixelBuffer!)
        return pixelBuffer
    } else {
        // Failed to create a CVPixelBuffer
        portalPrint("Error creating CVPixelBuffer.")
        return nil
    }
}

// Function to create a CVPixelBuffer from a CIImage using an existing CVPixelBuffer
func buffer(from image: CIImage, oldCVPixelBuffer: CVPixelBuffer) -> CVPixelBuffer? {
    let attrs = [
        kCVPixelBufferMetalCompatibilityKey: kCFBooleanTrue,
        kCVPixelBufferCGImageCompatibilityKey: kCFBooleanTrue,
        kCVPixelBufferCGBitmapContextCompatibilityKey: kCFBooleanTrue
    ] as CFDictionary

    var pixelBuffer: CVPixelBuffer?
    let status = CVPixelBufferCreate(kCFAllocatorDefault, Int(image.extent.width), Int(image.extent.height), kCVPixelFormatType_32BGRA, attrs, &pixelBuffer)

    if status == kCVReturnSuccess {
        oldCVPixelBuffer.propagateAttachments(to: pixelBuffer!)
        return pixelBuffer
    } else {
        // Failed to create a CVPixelBuffer
        portalPrint("Error creating CVPixelBuffer.")
        return nil
    }
}

/// Convert RTCVideoFrame to CVPixelBuffer
func getCVPixelBuffer(frame: RTCVideoFrame?) -> CVPixelBuffer? {
    var buffer : RTCCVPixelBuffer?
    var pixelBuffer: CVPixelBuffer?
    
    if let inputBuffer = frame?.buffer {
        if let iBuffer = inputBuffer as? RTCI420Buffer {
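            // convertToCVPixelBuffer() / convertToPixelBuffer(_:) are the answer author's own
            // I420-to-pixel-buffer helpers; they are not part of WebRTC and are not shown here.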
            if let cvPixelBuffer = iBuffer.convertToCVPixelBuffer() {
                // Use the cvPixelBuffer as an RTCCVPixelBuffer
                // ...
                pixelBuffer = cvPixelBuffer
                return pixelBuffer
            }
            return convertToPixelBuffer(iBuffer)
        }
    }
    
    buffer = frame?.buffer as? RTCCVPixelBuffer
    pixelBuffer = buffer?.pixelBuffer
    return pixelBuffer
}
/// Convert an RTCVideoFrame timestamp to CMSampleTimingInfo
func convertFrameTimestampToTimingInfo(frame: RTCVideoFrame?) -> CMSampleTimingInfo? {
    let scale = CMTimeScale(NSEC_PER_SEC)
    let pts = CMTime(value: CMTimeValue(Double(frame!.timeStamp) * Double(scale)), timescale: scale)
    let timingInfo = CMSampleTimingInfo(duration: .invalid,
                                        presentationTimeStamp: pts,
                                        decodeTimeStamp: .invalid)
    return timingInfo
}

/// Convert CVPixelBuffer to CMSampleBuffer
func createSampleBufferFrom(pixelBuffer: CVPixelBuffer, timingInfo: CMSampleTimingInfo) -> CMSampleBuffer? {
    var sampleBuffer: CMSampleBuffer?

    var timingInfo = timingInfo
    var formatDescription: CMFormatDescription? = nil
    CMVideoFormatDescriptionCreateForImageBuffer(allocator: kCFAllocatorDefault,
                                                 imageBuffer: pixelBuffer,
                                                 formatDescriptionOut: &formatDescription)

    let osStatus = CMSampleBufferCreateReadyWithImageBuffer(
        allocator: kCFAllocatorDefault,
        imageBuffer: pixelBuffer,
        formatDescription: formatDescription!,
        sampleTiming: &timingInfo,
        sampleBufferOut: &sampleBuffer
    )
    
    // Report any failure from CMSampleBufferCreateReadyWithImageBuffer
    if osStatus != noErr {
        portalPrint("CMSampleBufferCreateReadyWithImageBuffer failed with OSStatus \(osStatus)")
    }
    
    guard let buffer = sampleBuffer else {
        portalPrint(StringConstant.samplbeBuffer)
        return nil
    }
    
    // Mark the sample so AVSampleBufferDisplayLayer displays it immediately (live content)
    let attachments: NSArray = CMSampleBufferGetSampleAttachmentsArray(buffer, createIfNecessary: true)! as NSArray
    let dict: NSMutableDictionary = attachments[0] as! NSMutableDictionary
    dict[kCMSampleAttachmentKey_DisplayImmediately as NSString] = true as NSNumber
    
    return buffer
}
} // end of class FrameRenderer

Step 6: Implement the PIP functionality

Based on the provided code, it seems you already have PIP functionality implemented using the AVPictureInPictureController. Ensure that the startPIP function is called when you want to enable PIP during the video call. The SampleBufferVideoCallView is used to display the PIP video frames received from the frameRenderer.

/// start PIP Method
fileprivate func startPIP() {
    runOnMainThread() {
        if #available(iOS 15.0, *) {
            if AVPictureInPictureController.isPictureInPictureSupported() {
                let sampleBufferVideoCallView = SampleBufferVideoCallView()
                
                getCMSampleBufferFromFrameRendererForPIP = { [weak self] cmSampleBuffer, videosOrientation, scalef, userId  in
                    guard let weakself = self else {
                        return
                    }
                    if weakself.viewModel != nil {
                        if objNewUserDM?.userId != userId && weakself.viewModel.pipUserId == userId {
                            runOnMainThread {
                                sampleBufferVideoCallView.sampleBufferDisplayLayer.enqueue(cmSampleBuffer)
                            }
                        }
                    }
                }
                
                sampleBufferVideoCallView.contentMode = .scaleAspectFit
                
                self.pipVideoCallViewController = AVPictureInPictureVideoCallViewController()
                
                // Pretty much just for aspect ratio, normally used for pop-over
                self.pipVideoCallViewController.preferredContentSize = CGSize(width: 1080, height: 1920)
                
                self.pipVideoCallViewController.view.addSubview(sampleBufferVideoCallView)
                
                sampleBufferVideoCallView.translatesAutoresizingMaskIntoConstraints = false
                let constraints = [
                    sampleBufferVideoCallView.leadingAnchor.constraint(equalTo: self.pipVideoCallViewController.view.leadingAnchor),
                    sampleBufferVideoCallView.trailingAnchor.constraint(equalTo: self.pipVideoCallViewController.view.trailingAnchor),
                    sampleBufferVideoCallView.topAnchor.constraint(equalTo: self.pipVideoCallViewController.view.topAnchor),
                    sampleBufferVideoCallView.bottomAnchor.constraint(equalTo: self.pipVideoCallViewController.view.bottomAnchor)
                ]
                NSLayoutConstraint.activate(constraints)
                
                sampleBufferVideoCallView.bounds = self.pipVideoCallViewController.view.frame
                
                let pipContentSource = AVPictureInPictureController.ContentSource(
                    activeVideoCallSourceView: self.view,
                    contentViewController: self.pipVideoCallViewController
                )
                
                self.pipController = AVPictureInPictureController(contentSource: pipContentSource)
                self.pipController.canStartPictureInPictureAutomaticallyFromInline = true
                self.pipController.delegate = self
                
                print("Is pip supported: \(AVPictureInPictureController.isPictureInPictureSupported())")
                print("Is pip possible: \(self.pipController.isPictureInPicturePossible)")
            }
        } else {
            // Fallback on earlier versions
            print("PIP is not supported in this device")
        }
    }
}

Note: The FrameRenderer object should be defined in your application, and you should ensure that the PIP view's position and size are appropriately set up to achieve the desired PIP effect. Additionally, remember to handle the call-end scenario and release the frameRenderer and WebRTC connections gracefully.

Keep in mind that the code provided assumes you already have the necessary WebRTC setup, and this code focuses on the PIP rendering aspect only. Additionally, PIP is supported from iOS 15.0 onwards, so make sure to handle devices running earlier versions appropriately.
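
For the call-end handling mentioned above, a teardown along these lines is one option; it is only a sketch and assumes pipController, pipVideoCallViewController, frameRenderer, and remoteVideoTrack are optional properties of the call screen, matching the names used earlier:

/// Sketch: tear down PiP and the frame renderer when the call ends.
fileprivate func stopPIP() {
    runOnMainThread {
        if #available(iOS 15.0, *) {
            if self.pipController?.isPictureInPictureActive == true {
                self.pipController?.stopPictureInPicture()
            }
            self.pipController = nil
            self.pipVideoCallViewController = nil
        }
        if let frameRenderer = self.frameRenderer {
            self.remoteVideoTrack?.remove(frameRenderer)
            self.frameRenderer = nil
        }
        // Stop feeding the PiP layer so no further frames are enqueued.
        getCMSampleBufferFromFrameRendererForPIP = { _, _, _, _ in }
    }
}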

Laic answered 5/8, 2023 at 5:29 Comment(1)
Can you please share your iOS native code? I have a question regarding the remote view track: can we share remoteViewTrack from the Flutter side to iOS native? – Summand
