Losing the training wheels: Some thoughts on moving from D3D11 to D3D12 – Part 1

A new generation of graphics APIs has been released into the wild. With it come promises of improved performance through lean and mean drivers, an API that gives you – the developer – raw and virtually unhindered access to your graphics hardware, and fever dreams of pushing hundreds of thousands of draw calls per frame through your rendering pipeline without any of your systems showing any signs of breaking a sweat. Industry professionals and hobbyist developers around the world working on their very own next big thing rejoice, and all is good in the world of video game (graphics) development.

Sound amazing, doesn’t it? You bet it does!

There’s a catch though. Like with most of these big technological promises there’s a price to pay, and in this particular case the price can be expensive. The philosophy behind DirectX 12 (and Vulkan and Mantle for that matter) is for the graphics driver to get out of the developer’s way, and it does this by only doing the minimum amount of work necessary to translate the commands you provide into a predictable output on screen. This has some consequences, as by default there are no more training wheels to help you out like in DirectX 11 (and there were probably a lot more training wheels in place than you were aware of!). You are now in charge of managing the video memory backing your resources, you are in charge of making sure that your GPU resources are in valid states at all times, you are responsible for closely managing video memory budgets and consumption and telling the operating system about them, you are responsible for generating efficient resource binding layouts for your shaders, and you are responsible for so many more fun and exciting things which I couldn’t list here because I’d just be going on forever. Failing to adhere to these responsibilities will relentlessly get you kicked into the wondrous world of undefined behavior, of which the side effects can range between nothing at all going wrong and your machine completely locking up or causing a kernel panic at seemingly random times. Fun times!

This series of blog posts (which I’ve been wanting to write for a very long time, but I just never found the time to do so) goes over some problem points and general findings I’ve encountered during the time I’ve spent with DirectX 12, both professionally and in my spare time. I want to spend time especially on problem scenarios which might not be explained too well in other documentation. Additionally I’d like to talk at some point about design and architecture when building a DirectX 12 application from the ground up without a legacy D3D11 codebase to start from. Do note that this series of posts is not meant as an all-inclusive guide on how to move your existing applications to this new API or as a tutorial on how to build D3D12 applications.

Some topics we’ll go over first will be related to memory management, residency management, resource lifetime management, resource transition strategies and pipeline state object and root signature strategies. Today we’ll start off with some memory and alignment related talk.

Enjoy!


Things you might want to consider doing up front: Dealing with resource alignment

DirectX 12 does away with specialized resource interfaces such as ID3D11Buffer, ID3D11Texture2D, etc. in favor of a generalized ID3D12Resource interface with a matching D3D12_RESOURCE_DESC structure used for resource creation.

typedef struct D3D12_RESOURCE_DESC {
  D3D12_RESOURCE_DIMENSION Dimension;
  UINT64                   Alignment;
  UINT64                   Width;
  UINT                     Height;
  UINT16                   DepthOrArraySize;
  UINT16                   MipLevels;
  DXGI_FORMAT              Format;
  DXGI_SAMPLE_DESC         SampleDesc;
  D3D12_TEXTURE_LAYOUT     Layout;
  D3D12_RESOURCE_FLAGS     Flags;
} D3D12_RESOURCE_DESC;

If you’re familiar with descriptor structures in DirectX 11, nothing in here should be a big surprise. One element in this structure, the Alignment entry, might carry more weight though than you’d at first assume.

D3D12 wants video memory allocations to be straightforward and fast – remember: the driver has to get out of the developer’s way -, so it makes a lot of sense to tie the size of its allocations to a value or set of values which it can deal with in a quick way, values closely tied to video memory page sizes for example. Because of this D3D12 only allows you to specify one of four (technically three) specific alignment values: 64Kb for general purpose allocations, 4Kb for small texture resources (note: I’m generalizing heavily here), 4Mb for multisampled textures and 0 to let the runtime decide for you.

For textures these alignment values don’t pose a lot of problems, and it can be safe in a lot of cases to let the driver choose an alignment for you as textures tend to be on the larger size when it comes to memory footprints. The amount of space wasted due to alignment overhead is generally tiny compared to the actual texture size.

Buffers on the other hand pose a very big and inconvenient problem, making it almost impossible to directly (and naively) map ID3D11Buffer objects onto ID3D12Resource counterparts. As you might have noticed above, buffers have no other option than to specify a 64Kb alignment value – a value of 0 will always result in the runtime choosing 64Kb anyway -. Lots of titles tend to have a large amount of smaller ID3D11Buffer objects (think small vertex buffers, index buffers, constant buffers, etc.), which was absolutely fine in D3D11 as the driver would manage your memory for you under the hood. Taking the same approach in D3D12 will cause every single buffer, no matter how small, to take up a minimum of 64Kb of memory.

Oops.

Not dealing with this problem right away might cause your title to suddenly consume hundreds of Megabytes to Gigabytes more video memory than necessary, causing an immense amount of memory pressure on the operating system which at this point will frantically try to deal with the situation. If you have a system for dealing with residency (a topic which I’ll elaborate on in my next post) your application will start working hard (and will fail) to stay within its provided memory budget, causing your title to stutter, slow down and maybe eventually crash when the situation becomes too dire.

Pictured: You, after wasting massive amounts of video memory.

Solving this problem is going to require some work as you’re going to want to implement a mechanism for doing sub-allocations on buffer resources. A lot of D3D11 titles already provide some form of buffer sub-allocation in the form of linear allocators or ring buffers which sit on top of an underlying ID3D11Buffer resource. These are very fast and very useful, especially for video memory allocations with a lifetime of a single frame or less, and are an essential tool in any graphics codebase, so porting them to D3D12 is a good first step if you have them. If you haven’t implemented any of these yet, they’re very similar to their system memory counterparts, only now you’re dealing with pointers acquired through calls to ID3D12Resource::Map (or ID3D11DeviceContext::Map in D3D11), and for certain scenarios D3D12_GPU_VIRTUAL_ADDRESS values which act as pointers in virtual GPU memory. If you are looking for a reference implementation of these constructs, the MiniEngine project, which is part of the DirectX Graphics Samples on github provides a great implementation of a linear allocator which can be found at MiniEngine/Core/LinearAllocator.h and MiniEngine/Core/LinearAllocator.cpp.

To solve our alignment problem completely however we’ll need to go a step further by implementing a general purpose allocation system. The allocation strategies used here can be similar to the ones you’d implement for system memory allocations (look at it as implementing malloc for buffer resources). Feel free to get creative based on your specific allocation requirements; a construct such as a simple small block allocator can already solve a lot if not most of your alignment problems.

One additional thing to keep in mind when implementing a general purpose allocator is that buffer resources which are used as a backing store for your allocations are created with specific usage flags (see D3D12_RESOURCE_FLAGS) and specific heap types (see D312_HEAP_TYPE), meaning that you’ll want to specify separate allocators for separate use cases. Additionally this will have an effect on how you deal with resource state transitions (see D3D12_RESOURCE_STATES), as state transitions on a specific sub-allocated buffer can now affect other allocations on the same backing resources.

Take the following pseudocode example:

void Foo(ID3D12Resource* buffer)
{
	// Sub-allocate two buffer resources
	BufferSubAlloc bufferA = MallocBuffer(buffer, 512);
	BufferSubAlloc bufferB = MallocBuffer(buffer, 256);

	// Transition buffer A into a vertex/constant buffer state
	ResourceStateTransition(bufferA, D3D12_RESOURCE_STATE_VERTEX_AND_CONSTANT_BUFFER);

	// Transition buffer B into an index buffer state
	ResourceStateTransition(bufferB, D3D12_RESOURCE_STATE_INDEX_BUFFER);

	DoSomeStuffWithBuffers(bufferA, bufferB); // What state is buffer A in at this point?
}


Two buffer sub-allocations, Buffer A and Buffer B, are allocated off of the same buffer. Buffer A gets transitioned into a vertex/constant buffer state, which results in the underlying ID3D12Resource to be transitioned into that same state. Next up, Buffer B gets transitioned into an index buffer state, causing that same ID3D12Resource from before to be transitioned in that state. Both Buffer A and Buffer B are now in the index buffer state because their backing resource is in that state, making Buffer A invalid for use as a vertex or constant buffer.

Separating allocators based on intended use cases can save you the headache of having to deal with situations like these and will allow you to use buffer sub-allocations like you would use ID3D11Buffer objects.


Closing thoughts

I hope this post didn’t turn into too much of a rant, I sometimes have trouble with staying coherent in these kinds of posts or discussions (as many who know me can tell you). I’m looking forward to writing the rest of these, and I hope that someone somewhere finds these overviews useful. Next time I’ll probably be talking about the MakeResident and Evict calls and how to use them to manage resource residency. I might throw in more stuff if I find the time, who knows.

Thank you for reading, until next time!

Leave a Reply

Your email address will not be published. Required fields are marked *