DirectX Screen Capture - Desktop Duplication API - limited frame rate of AcquireNextFrame

I'm trying to use the Windows Desktop Duplication API to capture the screen and save the raw output to a video. I'm calling AcquireNextFrame with a very high timeout value (999 ms). This way I should get every new frame from Windows as soon as it has one, which should naturally be at 60 fps anyway. I end up getting sequences where everything looks good (frames 6-11), and then sequences where things look bad (frames 12-14). If I check AccumulatedFrames

lFrameInfo.AccumulatedFrames

the value is often 2 or higher. From my understanding, this means Windows is saying "hey, hold up, I don't have a frame for you yet", because calls to AcquireNextFrame take so long. But once Windows does finally give me a frame, it is saying "hey, you were actually too slow and ended up missing a frame". If I could somehow get these frames, I think I would be getting 60 Hz.

This can be further clarified with logging:

I0608 10:40:16.964375  4196 window_capturer_dd.cc:438] 206 - Frame 6 start acquire
I0608 10:40:16.973867  4196 window_capturer_dd.cc:451] 216 - Frame 6 acquired
I0608 10:40:16.981364  4196 window_capturer_dd.cc:438] 223 - Frame 7 start acquire
I0608 10:40:16.990864  4196 window_capturer_dd.cc:451] 233 - Frame 7 acquired
I0608 10:40:16.998364  4196 window_capturer_dd.cc:438] 240 - Frame 8 start acquire
I0608 10:40:17.007876  4196 window_capturer_dd.cc:451] 250 - Frame 8 acquired
I0608 10:40:17.015393  4196 window_capturer_dd.cc:438] 257 - Frame 9 start acquire
I0608 10:40:17.023905  4196 window_capturer_dd.cc:451] 266 - Frame 9 acquired
I0608 10:40:17.032411  4196 window_capturer_dd.cc:438] 274 - Frame 10 start acquire
I0608 10:40:17.039912  4196 window_capturer_dd.cc:451] 282 - Frame 10 acquired
I0608 10:40:17.048925  4196 window_capturer_dd.cc:438] 291 - Frame 11 start acquire
I0608 10:40:17.058428  4196 window_capturer_dd.cc:451] 300 - Frame 11 acquired
I0608 10:40:17.065943  4196 window_capturer_dd.cc:438] 308 - Frame 12 start acquire
I0608 10:40:17.096945  4196 window_capturer_dd.cc:451] 336 - Frame 12 acquired
I0608 10:40:17.098947  4196 window_capturer_dd.cc:464] 1 FRAMES MISSED on frame: 12
I0608 10:40:17.101444  4196 window_capturer_dd.cc:438] 343 - Frame 13 start acquire
I0608 10:40:17.128958  4196 window_capturer_dd.cc:451] 368 - Frame 13 acquired
I0608 10:40:17.130957  4196 window_capturer_dd.cc:464] 1 FRAMES MISSED on frame: 13
I0608 10:40:17.135459  4196 window_capturer_dd.cc:438] 377 - Frame 14 start acquire
I0608 10:40:17.160959  4196 window_capturer_dd.cc:451] 399 - Frame 14 acquired
I0608 10:40:17.162958  4196 window_capturer_dd.cc:464] 1 FRAMES MISSED on frame: 14

Frames 6-11 look good: the acquires are roughly 17 ms apart. Frame 12 should be acquired at 300 + 17 = 317 ms. Frame 12 starts waiting at 308 ms but doesn't get anything until 336 ms. Windows didn't have anything for me until the frame after (300 + 17 + 17 ≈ 336 ms). Okay, maybe Windows just missed a frame, but when I finally get it, AccumulatedFrames is 2 (meaning I missed a frame because I waited too long before calling AcquireNextFrame). In my understanding, it only makes sense for AccumulatedFrames to be larger than 1 if AcquireNextFrame returns immediately.
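
The same arithmetic can be checked mechanically. This is a minimal sketch, assuming a fixed 60 Hz refresh interval and taking the "acquired" timestamps (in ms) from the log as input; `estimate_missed` is a hypothetical helper for illustration, not part of any API.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: for each gap between consecutive "acquired" timestamps
// (in ms), estimate how many refresh intervals passed without a frame being
// delivered. Assumes a fixed refresh interval (16.667 ms at 60 Hz).
std::vector<int> estimate_missed(const std::vector<double>& acquired_ms,
                                 double interval_ms = 1000.0 / 60.0) {
    assert(interval_ms > 0.0);
    std::vector<int> missed;
    for (std::size_t i = 1; i < acquired_ms.size(); ++i) {
        const double gap = acquired_ms[i] - acquired_ms[i - 1];
        // Round to the nearest whole number of refresh intervals. One interval
        // is the expected spacing; anything beyond that is a skipped refresh.
        const long intervals = std::lround(gap / interval_ms);
        missed.push_back(intervals > 1 ? static_cast<int>(intervals - 1) : 0);
    }
    return missed;
}
```

Fed the timestamps for frames 6-14 (216, 233, 250, 266, 282, 300, 336, 368, 399), it reports no skipped refresh before frames 7-11 and one skipped refresh before each of frames 12, 13 and 14, which agrees with the "1 FRAMES MISSED" lines in the log.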

Furthermore, I can run PresentMon while my capture software is running. Its logs show MsBetweenDisplayChange for every frame, which is fairly steady at 16.666 ms (with a couple of outliers, but far fewer than my capture software is seeing).

These people (1, 2) seem to have been able to get 60fps, so I'm wondering what I am doing incorrectly.

My code is based on this:

int main() {
    int FPS = 60;
    int video_length_sec = 5;

    int total_frames = FPS * video_length_sec;
    for (int i = 0; i < total_frames; i++) {
        if(!CaptureSingleFrame()){
            i--;
        }
    }
}

ComPtr<ID3D11Device> lDevice;
ComPtr<ID3D11DeviceContext> lImmediateContext;
ComPtr<IDXGIOutputDuplication> lDeskDupl;
ComPtr<ID3D11Texture2D> lAcquiredDesktopImage;
ComPtr<ID3D11Texture2D> lGDIImage;
ComPtr<ID3D11Texture2D> lDestImage;
DXGI_OUTPUT_DESC lOutputDesc;
DXGI_OUTDUPL_DESC lOutputDuplDesc;
D3D11_TEXTURE2D_DESC desc;

// Driver types supported
D3D_DRIVER_TYPE gDriverTypes[] = {
    D3D_DRIVER_TYPE_HARDWARE
};
UINT gNumDriverTypes = ARRAYSIZE(gDriverTypes);

// Feature levels supported
D3D_FEATURE_LEVEL gFeatureLevels[] = {
    D3D_FEATURE_LEVEL_11_0,
    D3D_FEATURE_LEVEL_10_1,
    D3D_FEATURE_LEVEL_10_0,
    D3D_FEATURE_LEVEL_9_1
};
UINT gNumFeatureLevels = ARRAYSIZE(gFeatureLevels);


bool Init() {
    int lresult(-1);

    D3D_FEATURE_LEVEL lFeatureLevel;

    HRESULT hr(E_FAIL);

    // Create device
    for (UINT DriverTypeIndex = 0; DriverTypeIndex < gNumDriverTypes; ++DriverTypeIndex)
    {
        hr = D3D11CreateDevice(
            nullptr,
            gDriverTypes[DriverTypeIndex],
            nullptr,
            0,
            gFeatureLevels,
            gNumFeatureLevels,
            D3D11_SDK_VERSION,
            &lDevice,
            &lFeatureLevel,
            &lImmediateContext);

        if (SUCCEEDED(hr))
        {
            // Device creation success, no need to loop anymore
            break;
        }

        lDevice.Reset();

        lImmediateContext.Reset();
    }

    if (FAILED(hr))
        return false;

    if (lDevice == nullptr)
        return false;

    // Get DXGI device
    ComPtr<IDXGIDevice> lDxgiDevice;
    hr = lDevice.As(&lDxgiDevice);

    if (FAILED(hr))
        return false;

    // Get DXGI adapter
    ComPtr<IDXGIAdapter> lDxgiAdapter;
    hr = lDxgiDevice->GetParent(
        __uuidof(IDXGIAdapter), &lDxgiAdapter);

    if (FAILED(hr))
        return false;

    lDxgiDevice.Reset();

    UINT Output = 0;

    // Get output
    ComPtr<IDXGIOutput> lDxgiOutput;
    hr = lDxgiAdapter->EnumOutputs(
        Output,
        &lDxgiOutput);

    if (FAILED(hr))
        return false;

    lDxgiAdapter.Reset();

    hr = lDxgiOutput->GetDesc(
        &lOutputDesc);

    if (FAILED(hr))
        return false;

    // QI for Output 1
    ComPtr<IDXGIOutput1> lDxgiOutput1;
    hr = lDxgiOutput.As(&lDxgiOutput1);

    if (FAILED(hr))
        return false;

    lDxgiOutput.Reset();

    // Create desktop duplication
    hr = lDxgiOutput1->DuplicateOutput(
        lDevice.Get(), //TODO what im i doing here
        &lDeskDupl);

    if (FAILED(hr))
        return false;

    lDxgiOutput1.Reset();

    // Create GUI drawing texture
    lDeskDupl->GetDesc(&lOutputDuplDesc);
    desc.Width = lOutputDuplDesc.ModeDesc.Width;
    desc.Height = lOutputDuplDesc.ModeDesc.Height;
    desc.Format = lOutputDuplDesc.ModeDesc.Format;
    desc.ArraySize = 1;
    desc.BindFlags = D3D11_BIND_FLAG::D3D11_BIND_RENDER_TARGET;
    desc.MiscFlags = D3D11_RESOURCE_MISC_GDI_COMPATIBLE;
    desc.SampleDesc.Count = 1;
    desc.SampleDesc.Quality = 0;
    desc.MipLevels = 1;
    desc.CPUAccessFlags = 0;
    desc.Usage = D3D11_USAGE_DEFAULT;


    hr = lDevice->CreateTexture2D(&desc, NULL, &lGDIImage);

    if (FAILED(hr))
        return false;

    if (lGDIImage == nullptr)
        return false;

    // Create CPU access texture
    desc.Width = lOutputDuplDesc.ModeDesc.Width;
    desc.Height = lOutputDuplDesc.ModeDesc.Height;
    desc.Format = lOutputDuplDesc.ModeDesc.Format;
    std::cout << desc.Width << "x" << desc.Height << "\n\n\n";
    desc.ArraySize = 1;
    desc.BindFlags = 0;
    desc.MiscFlags = 0;
    desc.SampleDesc.Count = 1;
    desc.SampleDesc.Quality = 0;
    desc.MipLevels = 1;
    desc.CPUAccessFlags = D3D11_CPU_ACCESS_READ | D3D11_CPU_ACCESS_WRITE;
    desc.Usage = D3D11_USAGE_STAGING;

    return true;
}

void WriteFrameToCaptureFile(ID3D11Texture2D* texture) {

    D3D11_MAPPED_SUBRESOURCE mapped = {};  // filled in by Map
    UINT subresource = D3D11CalcSubresource(0, 0, 0);

    HRESULT hr = lImmediateContext->Map(texture, subresource, D3D11_MAP_READ_WRITE, 0, &mapped);
    if (FAILED(hr))
        return;

    char* data = reinterpret_cast<char*>(mapped.pData);

    // writes data to file
    WriteFrameToCaptureFile(data, 0);

    // Unmap so the GPU can access the resource again
    lImmediateContext->Unmap(texture, subresource);
}

bool CaptureSingleFrame()
{
    HRESULT hr(E_FAIL);
    ComPtr<IDXGIResource> lDesktopResource = nullptr;
    DXGI_OUTDUPL_FRAME_INFO lFrameInfo;
    ID3D11Texture2D* currTexture;

    hr = lDeskDupl->AcquireNextFrame(
        999,
        &lFrameInfo,
        &lDesktopResource);

    if (FAILED(hr)) {
        LOG(INFO) << "Failed to acquire new frame";
        return false;
    }

    if (lFrameInfo.LastPresentTime.QuadPart == 0) {
        // LastPresentTime is zero when the desktop image itself was not updated.
        // Not interested in mouse-only updates, which can arrive much faster than 60 fps if you really shake the mouse.
        hr = lDeskDupl->ReleaseFrame();
        return false;
    }

    int accum_frames = lFrameInfo.AccumulatedFrames;
    if (accum_frames > 1 && current_frame != 1) {
        // TOO MANY OF THESE is the problem
        // especially after having to wait >17ms in AcquireNextFrame()
    }

    // QI for ID3D11Texture2D
    hr = lDesktopResource.As(&lAcquiredDesktopImage);

    // Copy image into a newly created CPU access texture
    hr = lDevice->CreateTexture2D(&desc, NULL, &currTexture);
    if (FAILED(hr))
        return false;
    if (currTexture == nullptr)
        return false;

    lImmediateContext->CopyResource(currTexture, lAcquiredDesktopImage.Get());


    writer_thread->Schedule(
        FROM_HERE, [this, currTexture]() {
        WriteFrameToCaptureFile(currTexture);
    });
    pending_write_counts_++;

    hr = lDeskDupl->ReleaseFrame();

    return true;
}

**EDIT - According to my measurements, you must call AcquireNextFrame() roughly 10 ms before the frame actually appears, or Windows will fail to deliver it and give you the next one instead. Every time my recording program takes more than 7 ms to wrap around (from acquiring frame i until calling AcquireNextFrame() for frame i+1), frame i+1 is missed.

***EDIT - Here's a screenshot of GPUView showing what I'm talking about. The first 6 frames process in no time, then the 7th frame takes 119 ms. The long rectangle beside "capture_to_argb.exe" corresponds to me being stuck inside AcquireNextFrame(). If you look up at the hardware queue, you can see it cleanly rendering at 60 fps, even while I'm stuck in AcquireNextFrame(). At least this is my interpretation (I have no idea what I'm doing).

Chinchy answered 7/6, 2017 at 3:51 Comment(7)
The "60 Hz" you are seeing is the display refresh rate, the rate at which the physical display throws pixels to the screen. A display refresh does not require new data being available. New data being available is what causes AcquireNextFrame to return a success code. A new frame need not arrive at the same rate as the display refresh (although that would be optimal). - Jabalpur
See the comment below; PresentMon says my browser is updating at 60 Hz, but AcquireNextFrame has the same difficulty. - Chinchy
"every 17ms I ask..." so you Sleep between the calls within some loop? - Concession
I switched to a busy while loop (see question) instead of sleep, since sleep only guarantees a minimum time; i.e., it can oversleep past 17 ms. - Chinchy
Sleep makes no such guarantees. It's not clear why you have implemented a timeout at all, when IDXGIOutputDuplication::AcquireNextFrame already provides the functionality you are trying to replicate. - Jabalpur
Oops, sorry, I was using sleep_for, not Sleep. I tried with and without their timeout with no success (see question). - Chinchy
@AaronGermuth did you find any solutions? - Conte
W
2

"Current Display Mode: 3840 x 2160 (32 bit) (60hz)" refers to display refresh rate, that is how many frames can be passed to display per second. However the rate at which new frames are rendered is typically much lower. You can inspect this rate using PresentMon or similar utilities. When I don't move the mouse it reports me something like this:

[Image: PresentMon report]

As you can see, when nothing happens Windows presents a new frame only twice per second or even less often. This is actually convenient for video encoding: even if you are recording at 60 fps, when AcquireNextFrame reports that no new frame is available, it means the current frame is exactly the same as the previous one.

Wendywendye answered 7/6, 2017 at 12:14 Comment(1)
Unfortunately, I don't think this is the problem. For example, when I play a YouTube video at 60 fps, PresentMon reports "...chrome.exe[8884]: 0000018348E28F50 (DXGI): SyncInterval 1 | Flags 0 | 16.67 ms/frame (60.0 fps, 60.0 displayed fps, 14.87 ms CPU, 44.94 ms latency)..." yet my recorded desktop video during the same time has 182 timeouts for 300 frames. - Chinchy
C
1

By doing a blocking wait before the next call to AcquireNextFrame, you are missing the actual frames. The logic of the Desktop Duplication API suggests that you attempt to acquire the next frame immediately if you expect a decent frame rate. Your sleep call effectively gives up the remainder of the thread's time slice, with no hard promise that you will get a new slice within the scheduled interval.

You have to poll at the maximal frame rate. Do not sleep (even with zero sleep time) and request the next frame immediately. You will have the option to drop the frames that come too early. The Desktop Duplication API is designed in such a way that getting extra frames need not be too expensive if you identify them early and stop processing them.
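
One way to keep the capture thread free to re-enter AcquireNextFrame right away is to hand each frame to a separate writer thread through a queue. The following is a minimal, portable sketch of that hand-off only: `Frame` and `FrameQueue` are hypothetical names, the frame is a plain byte buffer standing in for pixels already copied out of the acquired texture, and the Direct3D copy and release calls are deliberately not shown.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <deque>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for a captured frame (pixels already copied out).
struct Frame {
    int index;
    std::vector<std::uint8_t> pixels;
};

// Hypothetical thread-safe queue: the capture thread calls push() and returns
// at once; a writer thread blocks in pop() and does the slow work (encoding,
// disk I/O) without delaying the next acquire.
class FrameQueue {
public:
    void push(Frame f) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            assert(!closed_);  // pushing after close() is a programming error
            frames_.push_back(std::move(f));
        }
        ready_.notify_one();
    }

    // Signals that no more frames will arrive; pop() still drains what is left.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        ready_.notify_all();
    }

    // Blocks until a frame is available. Returns false once the queue is
    // closed and empty, which tells the writer thread to exit its loop.
    bool pop(Frame& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return closed_ || !frames_.empty(); });
        if (frames_.empty()) return false;
        out = std::move(frames_.front());
        frames_.pop_front();
        return true;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::deque<Frame> frames_;
    bool closed_ = false;
};
```

The capture loop would then do only the acquire, copy, release and `push`, while a writer thread runs `while (queue.pop(frame)) { ... }`. The queue here is unbounded; if the writer cannot keep up, a real implementation would need a size limit or a drop policy.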

If you still prefer to sleep between the frames, you might want to read the accuracy remark:

To increase the accuracy of the sleep interval, call the timeGetDevCaps function to determine the supported minimum timer resolution and the timeBeginPeriod function to set the timer resolution to its minimum. Use caution when calling timeBeginPeriod, as frequent calls can significantly affect the system clock, system power usage, and the scheduler. If you call timeBeginPeriod, call it one time early in the application and be sure to call the timeEndPeriod function at the very end of the application.
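
If sleeping between frames is unavoidable, sleeping until an absolute deadline rather than for a relative interval at least keeps any oversleep from accumulating across frames. This is a minimal, portable sketch using `std::chrono`; `FramePacer` is a hypothetical helper, and it does not change the system timer resolution that the remark above describes.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <thread>

// Hypothetical helper: computes frame deadlines on a fixed grid anchored at a
// start time, so a late wake-up on one frame does not shift later frames.
class FramePacer {
public:
    using Clock = std::chrono::steady_clock;

    FramePacer(Clock::time_point start, int fps) : start_(start), fps_(fps) {
        assert(fps > 0);
    }

    // Deadline of frame n (0-based), computed from the start time rather than
    // from the previous wake-up, so rounding and oversleep do not accumulate.
    Clock::time_point deadline(std::int64_t n) const {
        const std::int64_t ns = n * 1000000000LL / fps_;
        return start_ + std::chrono::duration_cast<Clock::duration>(
                            std::chrono::nanoseconds(ns));
    }

    // Sleeps until frame n is due; returns immediately if it is already due.
    void wait_for_frame(std::int64_t n) const {
        std::this_thread::sleep_until(deadline(n));
    }

private:
    Clock::time_point start_;
    int fps_;
};
```

With `FramePacer pacer(FramePacer::Clock::now(), 60);`, calling `pacer.wait_for_frame(i)` at the top of iteration `i` targets the i-th slot of the 60 fps grid. How precisely the thread actually wakes up still depends on the scheduler and the timer resolution.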

Concession answered 8/6, 2017 at 20:17 Comment(5)
Unfortunately, if I understand correctly, this isn't solving the problem. My question mentions that I've tried 1) a 0 timeout with sleeping/busy-waiting, as well as 2) an infinite timeout, taking frames as fast as they come. The problem with the second approach is that I often end up waiting on AcquireNextFrame() for more than 17 ms. Worst of all, when this happens AccumulatedFrames is greater than 1, meaning I missed a frame. I've added a new GPUView picture that might illustrate what I'm saying. - Chinchy
I've edited the question to remove the first approach (with timeout zero). It's much simpler this way. - Chinchy
The timeout does not have to be infinite, just non-zero and greater than the inter-frame time, so that you can see whether you are capturing enough frames or not. Getting a timeout from AcquireNextFrame is possible if nothing changed on the monitor. If changes exist but you acquire the frame late, you could possibly still be calling AcquireNextFrame late (esp. because of other activity on the thread). - Concession
I'm currently using a timeout of 999 ms (sorry, I didn't realize you can pass an actual INFINITE value). At this point it will never realistically time out. I've tried using timeouts of just above 17 ms, as you say, but then I do get timeouts. However, I know this isn't because nothing changed: PresentMon shows it running at 60 fps. Also, if you look at the screenshot I attached above, the hardware queue of the graphics card is still rendering frames every 17 ms, even while my application "capture_to_argb.exe" is stalled for 100 ms. So the update is happening, but AcquireNextFrame waits through it. - Chinchy
I suppose one possibility is that, while I have called AcquireNextFrame(), it is not actually running, because it is still blocked on some previous operation such as the CopyResource of the previous frame. Is this what you are trying to say? If this were true, though, I would think we would see a queue on the GPU in GPUView, right? - Chinchy
W
0

As others have mentioned, the 60Hz refresh rate only indicates the frequency with which the display may change. It doesn't actually mean that it will change that frequently. AcquireNextFrame will only return a frame when what is being displayed on the duplicated output has changed.

My recommendation is to ...

  1. Create a Timer Queue timer with the desired video frame interval
  2. Create a compatible resource in which to buffer the desktop bitmap
  3. When the timer goes off, call AcquireNextFrame with a zero timeout
  4. If there has been a change, copy the returned resource to your buffer and release it
  5. Send the buffered frame to the encoder or to whatever further processing you need

This will yield a sequence of frames at the desired rate. If the display hasn't changed, you'll have a copy of the previous frame to use to maintain your frame rate.
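
The steps above can be sketched in portable C++ to show the frame-repeat logic on its own. In this sketch each loop iteration stands in for one timer tick and a callback stands in for the zero-timeout AcquireNextFrame call; `Image`, `TryAcquire` and `sample_fixed_rate` are hypothetical names, and the timer queue setup and Direct3D resource handling are not shown.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical frame type; a string stands in for the copied desktop image.
using Image = std::string;

// Hypothetical source callback: returns true and fills `out` when the desktop
// changed since the last call (the analogue of a zero-timeout acquire that
// succeeds), false when nothing new is available (the analogue of a timeout).
using TryAcquire = std::function<bool(Image& out)>;

// Runs `ticks` timer ticks and emits exactly one frame per tick: the fresh
// frame if the source has one, otherwise a repeat of the buffered frame.
std::vector<Image> sample_fixed_rate(const TryAcquire& try_acquire,
                                     int ticks, Image initial) {
    assert(ticks >= 0);
    std::vector<Image> output;
    Image buffered = std::move(initial);   // step 2: our own buffer
    for (int t = 0; t < ticks; ++t) {      // step 3: one iteration per tick
        Image fresh;
        if (try_acquire(fresh)) {
            buffered = std::move(fresh);   // step 4: keep the changed frame
        }
        output.push_back(buffered);        // step 5: forward at the fixed rate
    }
    return output;
}
```

If the source delivers a new image only on ticks 1 and 4 of 5, the output still contains 5 frames, with the earlier image repeated on the ticks where nothing changed.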

Waterway answered 31/8, 2017 at 15:39 Comment(0)
