unique_ptr vs shared_ptr for resource objects (Textures, Buffers, etc.) #1099

@manon-traverse

Description

We need to reach a consensus on whether using shared_ptr is fine for ownership of resources like Buffers and Textures.
In order to have this discussion without it spreading over many PRs, we can centralize it here in an issue.

To kick off the conversation, here are my thoughts on the topic:

Why shared_ptr for resources in the first place?

Managing the lifetime of GPU resources requires more thought than that of CPU-only resources, because the GPU timeline also has to be taken into account. If a Buffer is freed while it is still being used on the GPU, it will cause a DEVICE_LOST error. Since those errors are quite difficult to track down, usually some sort of 'keep-alive' system is used.

A code sketch demonstrating the issue:

auto Cmd = Queue.createCommandBuffer();

// Imagine this being somewhere 5 function calls deep.
{
   shared_ptr<Buffer> Buffer = Device.createBuffer(...);
   auto DescriptorSet = DescriptorSetBuilder().read(Buffer).Build(Device, Cmd, Pipeline);
   Cmd.dispatch(Pipeline, 1024, 1024, 1, DescriptorSet);
   // Buffer cleaned-up here
}

uint64_t SyncValue = Queue.submit(Cmd, Fence); // DEVICE_LOST
Fence.waitForCompletion(SyncValue);

Keep-Alive System
The simplest form of a keep-alive system is to give shared ownership of resources used on a command buffer to the command buffer. Once the command buffer has been submitted and we have synced with the GPU, we can go through the keep-alive list and release ownership. It is at that point that resources are cleaned up if there are no other owners.

A little sketch of what this would look like:

auto Cmd = Queue.createCommandBuffer(); // Frees any KeepAlive lists of synced command buffers

// Imagine this being somewhere 5 function calls deep.
{
   shared_ptr<Buffer> Buffer = Device.createBuffer(...);
   auto DescriptorSet = DescriptorSetBuilder().read(Buffer).Build(Device, Cmd, Pipeline);
   Cmd.dispatch(Pipeline, 1024, 1024, 1, DescriptorSet); // Add resources used in DescriptorSet to Cmd's KeepAlive list.
}

uint64_t SyncValue = Queue.submit(Cmd, Fence);
Fence.waitForCompletion(SyncValue);

// Next time the Queue creates a CommandBuffer or the Queue is cleaned up, the command buffer's KeepAlive list is cleared.

Keep-alive systems can also be built using unique_ptr by introducing a retireResource function on the command buffer. However, this means that resources need to be retired manually: if the user forgets to transfer ownership via retireResource, the resource may be freed too soon, triggering a DEVICE_LOST error. One way to mitigate this is to log/assert when a resource is destroyed without going through retireResource, but that is not as robust as the shared_ptr route.

A code sketch of what the unique_ptr variant would look like

auto Cmd = Queue.createCommandBuffer(); // Frees any KeepAlive lists of synced command buffers

// Imagine this being somewhere 5 function calls deep.
{
   unique_ptr<Buffer> Buffer = Device.createBuffer(...);
   auto DescriptorSet = DescriptorSetBuilder().read(Buffer.get()).Build(Device, Cmd, Pipeline);
   Cmd.dispatch(Pipeline, 1024, 1024, 1, DescriptorSet);
   Cmd.retireResource(std::move(Buffer)); // Must not forget to call this
}

uint64_t SyncValue = Queue.submit(Cmd, Fence);
Fence.waitForCompletion(SyncValue);

// Next time the Queue creates a CommandBuffer or the Queue is cleaned up, the command buffer's KeepAlive list is cleared.

My preference is for automatic resource tracking using shared_ptr, as it leaves one less step for the user to remember.

shared_ptr Critique Points

The critique of using shared_ptr comes down to two main issues:

  1. Performance overhead: Copying a shared_ptr requires an atomic add operation, which can be expensive in cases where many of them are copied.
  2. Unclear Management of Ownership: shared_ptr breaks the single-ownership idiom by having shared ownership. This makes it less clear what is supposed to own what, and when resources are being released.

Regarding Performance (Point 1): This type of performance overhead only becomes significant when there is a large number of them being copied. On the scale of running the testing framework, I am more concerned with initializing a device for each test than with using a couple of shared pointers, performance-wise.

Regarding Ownership (Point 2): Because managing the lifetime of resources that live on both the CPU and GPU timelines is inherently more complicated, ownership rules are always going to be a bit trickier. When the choice is between (1) manually keeping track of the CPU and GPU timelines, (2) manually moving ownership onto the command buffer, or (3) tracking automatically, option 3 sounds like the most robust one.

Please let me know what your thoughts on this are. I am open to discussion.
