DXGI Waitable SwapChain not waiting

I set up a DX12 application that only clears the back buffer every frame.

It really is bare-bones: no PSO, no root... The only particularity is that it waits on the swap chain to be done with Present() before starting a new frame (MSDN waitable swap chain). I also set the frame latency to 1 and only have 2 buffers.

The first frame works well but it immediately starts drawing the second frame, and of course, the command allocator complains that it is being reset while commands are still being executed on the GPU.

I could of course set up a fence to wait for the GPU to be done before moving to a new frame, but I thought this was the job of the waitable swap chain object.

Here is the render routine:

if (m_command_allocator->Reset() == E_FAIL) { throw; }

HRESULT res = S_OK;
res = m_command_list->Reset(m_command_allocator.Get(), nullptr);
if (res == E_FAIL || res == E_OUTOFMEMORY) { throw; }

m_command_list->ResourceBarrier(1, 
&CD3DX12_RESOURCE_BARRIER::Transition(m_render_targets[m_frame_index].Get(), 
D3D12_RESOURCE_STATE_PRESENT, D3D12_RESOURCE_STATE_RENDER_TARGET));

m_command_list->RSSetViewports(1, &m_screen_viewport);
m_command_list->RSSetScissorRects(1, &m_scissor_rect);
m_command_list->ClearRenderTargetView(get_rtv_handle(), 
DirectX::Colors::BlueViolet, 0, nullptr);
m_command_list->OMSetRenderTargets(1, &get_rtv_handle(), true, nullptr);

m_command_list->ResourceBarrier(1, 
&CD3DX12_RESOURCE_BARRIER::Transition(m_render_targets[m_frame_index].Get(), 
D3D12_RESOURCE_STATE_RENDER_TARGET, D3D12_RESOURCE_STATE_PRESENT));

tools::throw_if_failed(m_command_list->Close());
ID3D12CommandList* ppCommandLists[] = { m_command_list.Get() };
m_command_queue->ExecuteCommandLists(_countof(ppCommandLists), 
ppCommandLists);

if (m_swap_chain->Present(1, 0) != S_OK) { throw; }
m_frame_index = m_swap_chain->GetCurrentBackBufferIndex();

I loop on this routine with a waitable object which I got from the swapchain:

while (WAIT_OBJECT_0 == WaitForSingleObjectEx(waitable_renderer, INFINITE, TRUE) && m_alive == true)
{
    m_graphics.render();
}

and I initialized the swapchain with the waitable flag:

DXGI_SWAP_CHAIN_DESC1 swap_chain_desc = {};
swap_chain_desc.BufferCount = s_frame_count;
swap_chain_desc.Width = window_width;
swap_chain_desc.Height = window_height;
swap_chain_desc.Format = m_back_buffer_format;
swap_chain_desc.BufferUsage = DXGI_USAGE_RENDER_TARGET_OUTPUT;
swap_chain_desc.SwapEffect = DXGI_SWAP_EFFECT_FLIP_DISCARD;
swap_chain_desc.SampleDesc.Count = 1;
swap_chain_desc.Flags = DXGI_SWAP_CHAIN_FLAG_FRAME_LATENCY_WAITABLE_OBJECT;

ComPtr<IDXGISwapChain1> swap_chain;
tools::throw_if_failed(
    factory->CreateSwapChainForHwnd(m_command_queue.Get(), window_handle, &swap_chain_desc, nullptr, nullptr, &swap_chain));

I call SetMaximumFrameLatency right after creating the swap chain:

ComPtr<IDXGISwapChain2> swap_chain2;
tools::throw_if_failed(m_swap_chain.As(&swap_chain2));

tools::throw_if_failed(swap_chain2->SetMaximumFrameLatency(1));

m_waitable_renderer = swap_chain2->GetFrameLatencyWaitableObject();

And the swapChain resize that goes with it:

tools::throw_if_failed(
    m_swap_chain->ResizeBuffers(s_frame_count, window_width, window_height, m_back_buffer_format, DXGI_SWAP_CHAIN_FLAG_FRAME_LATENCY_WAITABLE_OBJECT));

My question is: am I setting something up incorrectly, or is this the way a waitable swap chain works (i.e. you also need to sync with the GPU using fences before waiting for the swap chain to become available)?

EDIT: Adding SetFrameLatency call + C++ coloring

The waitable swap chain is independent of the work of protecting D3D12 objects from being modified or reset while they are still in use by the GPU.

A waitable swap chain allows you to move the wait from the end of the frame, in Present, to the start of the frame, using a waitable object. This helps fight latency and gives you more control over queuing.

Fencing lets you query the GPU for completion. I recommend that you do not just cross your fingers: if it works one day on one system, it may not work the next day with a different driver or on a different machine.

Because you do not want each frame to wait for GPU completion, you have to create several command allocators. Usually you create min(max latency + 1, swap chain buffer count) of them, but for safety I personally use back buffer count + 1..3. You will later find that you end up creating far more allocators to deal with multi-threading anyway.
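As a rough illustration of that sizing rule, here is a small sketch. The function names and the default margin of 1 are my own choices for the example, not anything from D3D12 or DXGI.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper: the minimum described above,
// min(max frame latency + 1, swap chain buffer count).
constexpr std::uint32_t min_allocator_count(std::uint32_t max_frame_latency,
                                            std::uint32_t back_buffer_count) {
    return std::min(max_frame_latency + 1, back_buffer_count);
}

// Hypothetical helper: the more conservative "back buffer count + margin"
// choice, where margin is somewhere in 1..3.
constexpr std::uint32_t padded_allocator_count(std::uint32_t back_buffer_count,
                                               std::uint32_t margin = 1) {
    return back_buffer_count + margin;
}

// With the setup from the question (frame latency 1, 2 back buffers):
static_assert(min_allocator_count(1, 2) == 2, "two allocators at minimum");
static_assert(padded_allocator_count(2) == 3, "three with a margin of one");
```

Either way, a single allocator, as in the question, is below the minimum once frames are allowed to overlap.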

What this means for your code:

1. Create several allocators in a ring buffer, each with an associated fence value
2. Create a fence (and a global "next fence value")
3. Wait on the swap chain
4. Pick the next allocator
5. If fence.GetCompletedValue() < allocator.fenceValue, then WaitCompletion
6. Render
7. Signal the fence on the command queue, store the signaled value with the allocator, and increment the global next value
8. Jump to 3
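The bookkeeping in those steps can be sketched without any D3D12 headers. This is only a portable model of the logic, under my own assumptions: ModelFence, FrameSlot and FrameRing are hypothetical names, the "GPU" is a plain counter, and the blocking wait is simulated by letting the counter catch up. The comments point at the real calls each line would stand for (ID3D12Fence::GetCompletedValue, ID3D12Fence::SetEventOnCompletion, ID3D12CommandQueue::Signal, ID3D12CommandAllocator::Reset).

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for ID3D12Fence. The modeled GPU only makes progress
// when gpu_finish_up_to() is called.
struct ModelFence {
    std::uint64_t completed = 0;
    // Plays the role of ID3D12Fence::GetCompletedValue().
    std::uint64_t get_completed_value() const { return completed; }
    void gpu_finish_up_to(std::uint64_t value) { completed = std::max(completed, value); }
};

// One ring entry: in real code this would also own an ID3D12CommandAllocator.
struct FrameSlot {
    std::uint64_t fence_value = 0;  // value signaled after this slot's last submission
    int resets = 0;                 // counts the points where allocator->Reset() would be safe
};

template <std::size_t N>
class FrameRing {
public:
    // Steps 4 and 5: pick the next slot and wait only if the GPU has not
    // yet passed the fence value stored with it. Returns true if a wait
    // was needed.
    bool begin_frame(ModelFence& fence) {
        FrameSlot& slot = slots_[index_];
        bool waited = false;
        if (fence.get_completed_value() < slot.fence_value) {
            // Real code: fence->SetEventOnCompletion(slot.fence_value, event)
            // followed by WaitForSingleObject(event, INFINITE).
            // Here the wait is simulated by letting the GPU catch up.
            fence.gpu_finish_up_to(slot.fence_value);
            waited = true;
        }
        ++slot.resets;  // real code: allocator->Reset(), then record commands
        return waited;
    }

    // Step 7: after ExecuteCommandLists and Present, signal and remember the value.
    void end_frame() {
        // Real code: command_queue->Signal(fence, next_value_).
        slots_[index_].fence_value = next_value_++;
        index_ = (index_ + 1) % N;
    }

    const FrameSlot& slot(std::size_t i) const { return slots_[i]; }

private:
    std::array<FrameSlot, N> slots_{};
    std::size_t index_ = 0;
    std::uint64_t next_value_ = 1;  // the global "next fence value"
};
```

In the render loop, the swap chain wait (step 3) would come first, then begin_frame, then the recording and submission from the question, then end_frame. With two slots and a GPU that never reports progress, the first two frames start without waiting and the third blocks until the first submission has completed, which is exactly the protection the single-allocator version lacks.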
