DXGI Waitable SwapChain not waiting

I set up a DX12 application that only clears the back buffer every frame.

It really is bare-bones: no PSO, no root... The only peculiarity is that it waits on the swap chain to be done with Present() before starting a new frame (see MSDN on waitable swap chains). I also set the frame latency to 1 and only have 2 buffers.

The first frame works well, but the application immediately starts drawing the second frame and, of course, the command allocator complains that it is being reset while commands are still being executed on the GPU.

I could of course set up a fence to wait for the GPU to be done before moving on to a new frame, but I thought this was the job of the waitable swap chain object.

Here is the render routine:

if (m_command_allocator->Reset() == E_FAIL) { throw; }

HRESULT res = S_OK;
res = m_command_list->Reset(m_command_allocator.Get(), nullptr);
if (res == E_FAIL || res == E_OUTOFMEMORY) { throw; }

m_command_list->ResourceBarrier(1, 
&CD3DX12_RESOURCE_BARRIER::Transition(m_render_targets[m_frame_index].Get(), 
D3D12_RESOURCE_STATE_PRESENT, D3D12_RESOURCE_STATE_RENDER_TARGET));

m_command_list->RSSetViewports(1, &m_screen_viewport);
m_command_list->RSSetScissorRects(1, &m_scissor_rect);
m_command_list->ClearRenderTargetView(get_rtv_handle(), 
DirectX::Colors::BlueViolet, 0, nullptr);
m_command_list->OMSetRenderTargets(1, &get_rtv_handle(), true, nullptr);

m_command_list->ResourceBarrier(1, 
&CD3DX12_RESOURCE_BARRIER::Transition(m_render_targets[m_frame_index].Get(), 
D3D12_RESOURCE_STATE_RENDER_TARGET, D3D12_RESOURCE_STATE_PRESENT));

tools::throw_if_failed(m_command_list->Close());
ID3D12CommandList* ppCommandLists[] = { m_command_list.Get() };
m_command_queue->ExecuteCommandLists(_countof(ppCommandLists), 
ppCommandLists);

if (m_swap_chain->Present(1, 0) != S_OK) { throw; }
m_frame_index = m_swap_chain->GetCurrentBackBufferIndex();

I loop on this routine with a waitable object which I got from the swap chain:

while (WAIT_OBJECT_0 == WaitForSingleObjectEx(waitable_renderer, INFINITE, TRUE) && m_alive == true)
{
    m_graphics.render();
}

and I initialized the swapchain with the waitable flag:

DXGI_SWAP_CHAIN_DESC1 swap_chain_desc = {};
swap_chain_desc.BufferCount = s_frame_count;
swap_chain_desc.Width = window_width;
swap_chain_desc.Height = window_height;
swap_chain_desc.Format = m_back_buffer_format;
swap_chain_desc.BufferUsage = DXGI_USAGE_RENDER_TARGET_OUTPUT;
swap_chain_desc.SwapEffect = DXGI_SWAP_EFFECT_FLIP_DISCARD;
swap_chain_desc.SampleDesc.Count = 1;
swap_chain_desc.Flags = DXGI_SWAP_CHAIN_FLAG_FRAME_LATENCY_WAITABLE_OBJECT;

ComPtr<IDXGISwapChain1> swap_chain;
tools::throw_if_failed(
    factory->CreateSwapChainForHwnd(m_command_queue.Get(), window_handle, &swap_chain_desc, nullptr, nullptr, &swap_chain));

I call SetMaximumFrameLatency right after creating the swap chain:

ComPtr<IDXGISwapChain2> swap_chain2;
tools::throw_if_failed(m_swap_chain.As(&swap_chain2));

tools::throw_if_failed(swap_chain2->SetMaximumFrameLatency(1));

m_waitable_renderer = swap_chain2->GetFrameLatencyWaitableObject();

And the swap chain resize that goes with it:

tools::throw_if_failed(
    m_swap_chain->ResizeBuffers(s_frame_count, window_width, window_height, m_back_buffer_format, DXGI_SWAP_CHAIN_FLAG_FRAME_LATENCY_WAITABLE_OBJECT));

My question is: am I setting something up incorrectly, or is this the way a waitable swap chain works (i.e. you also need to sync with the GPU using fences before waiting for the swap chain to become available)?

EDIT: Adding SetFrameLatency call + C++ coloring

Liesa answered 4/5, 2017 at 13:32 Comment(2)
Your code seems to be fine, though it is not clear whether you actually called GetFrameLatencyWaitableObject to obtain waitable_renderer. - Adkisson
I edited my question based on your suggestion and comment. - Liesa

The waitable swap chain is independent of the work of protecting D3D12 objects from being modified or reset while they are still in use by the GPU.

A waitable swap chain lets you move the wait from the end of the frame, in Present, to the start of the frame, using a waitable object. The advantage is that it helps fight latency and gives you more control over queuing.

Fencing an object lets you query the GPU for completion. I recommend that you do not just cross your fingers: if it works one day on one system, it may not work the next day with a different driver or on a different machine.

Because you do not want every frame to wait for GPU completion, you have to create several command allocators. Usually you create min(max latency + 1, swap chain buffer count) of them, but for safety I personally use back buffer count + 1..3. You will later find that you end up creating many more allocators to deal with multi-threading anyway.
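As a small worked example of that rule of thumb, here is a sketch of the two counts. The helper names are made up for illustration and are not part of D3D12 or DXGI. With the values from the question (maximum frame latency 1, 2 back buffers), the minimum comes out to 2 allocators.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper: the usual minimum number of command allocators,
// min(max frame latency + 1, swap chain buffer count).
constexpr std::uint32_t min_allocator_count(std::uint32_t max_frame_latency,
                                            std::uint32_t back_buffer_count) {
    return std::min(max_frame_latency + 1, back_buffer_count);
}

// Hypothetical helper: the more conservative count, back buffer count plus
// a safety margin (the answer suggests a margin of 1 to 3).
constexpr std::uint32_t padded_allocator_count(std::uint32_t back_buffer_count,
                                               std::uint32_t margin = 1) {
    return back_buffer_count + margin;
}
```

Either count only sizes the ring; each allocator still needs a fence value checked before it is reset.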

What this means for your code:

  1. Create several allocators in a ring buffer, each with an associated fence value
  2. Create a fence (and a global "next fence value")
  3. Wait on the swap chain
  4. Pick the next allocator
  5. if fence.GetCompletedValue() < allocator.fenceValue then WaitCompletion
  6. Render
  7. Signal the fence with the command queue, store the fence value in the allocator, and increment the next fence value
  8. Jump to 3
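The bookkeeping in steps 4, 5 and 7 can be sketched without any D3D12 objects. Everything below is a model, not real API: FakeFence, AllocatorSlot, AllocatorRing and count_waits are hypothetical names, and the fake fence assumes the worst case, where the GPU only makes progress while the CPU blocks on it. In a real application `completed` would come from ID3D12Fence::GetCompletedValue, the signal would be ID3D12CommandQueue::Signal, and the wait would be ID3D12Fence::SetEventOnCompletion followed by WaitForSingleObject. The swap chain wait (step 3) is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for ID3D12Fence. `completed` plays the role of
// GetCompletedValue(); `signaled` is the last value given to the queue.
struct FakeFence {
    std::uint64_t completed = 0;
    std::uint64_t signaled = 0;

    // Models blocking until the GPU reaches `value`.
    void wait_for(std::uint64_t value) {
        assert(value <= signaled);  // waiting on an unsignaled value would hang
        completed = std::max(completed, value);
    }
};

// One ring entry: a command allocator plus the fence value of its last use.
struct AllocatorSlot {
    std::uint64_t fence_value = 0;  // 0 means "never submitted"
};

class AllocatorRing {
public:
    explicit AllocatorRing(std::size_t count) : slots_(count) {}

    // Steps 4 and 5: pick the next allocator and wait only if the GPU has
    // not yet passed the fence value recorded for it. Returns true on a wait.
    bool begin_frame(FakeFence& fence) {
        const AllocatorSlot& slot = slots_[index_];
        if (fence.completed < slot.fence_value) {
            fence.wait_for(slot.fence_value);
            return true;
        }
        return false;  // the allocator is already safe to Reset()
    }

    // Step 7: signal the fence, remember the value, move to the next slot.
    void end_frame(FakeFence& fence) {
        fence.signaled = next_value_;
        slots_[index_].fence_value = next_value_;
        ++next_value_;
        index_ = (index_ + 1) % slots_.size();
    }

private:
    std::vector<AllocatorSlot> slots_;
    std::size_t index_ = 0;
    std::uint64_t next_value_ = 1;
};

// Runs `frame_count` frames and returns how many of them had to wait on
// the fence before their allocator could be reused.
int count_waits(std::size_t allocator_count, int frame_count) {
    FakeFence fence;
    AllocatorRing ring(allocator_count);
    int waits = 0;
    for (int frame = 0; frame < frame_count; ++frame) {
        if (ring.begin_frame(fence)) ++waits;
        // Step 6 (record and execute the command list) would go here.
        ring.end_frame(fence);
    }
    return waits;
}
```

In this model a single allocator has to wait on every frame after the first, while a ring of two lets the first two frames start without any wait, which is why the ring avoids serializing the CPU and the GPU.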
Heave answered 4/5, 2017 at 16:2 Comment(4)
Under which conditions can rendering be completed and the frame presented without the completion fence firing? - Adkisson
The swap chain wait object tells you that one buffer is ready to be filled; you can assume nothing about the previous buffer you tried to fill. You use only one allocator. The first time you wait will be a no-op (because there are 2 buffers and one has not been used yet), but because the GPU may still be working on the first one, you can't reset the allocator. To reuse it you first need to wait on a fence marking the end of the render, but because you do not want to serialize the CPU and the GPU, you create several allocators. - Heave
Also, in DX12 you will see many more patterns that can untie your allocators from the swap chain back buffers: if you reuse bundles, if you use async compute in a separate queue, … - Heave
Given that the back buffer is flipped only when the GPU has finished processing commands, waiting for this buffer to become available also implies that the command list "attached" to it has been processed and thus can be reset. But if the swap chain also considers the front buffer as being available (for the first two frames), then yes, I would need two command allocators (and no need for GPU sync). I'm going to try that later today :) - Liesa
