A class that encapsulates CUDA memory allocation and deallocation.
A Plan stores all the resources (preallocated buffers, custom CUDA kernels) required to evaluate nodes from the Operation graph.
Deregisters a kernel constructor associated with the given operation type.
Performs a one-off evaluation of a set of operations.
A convenience overload that evaluates a single operation and returns a single DeviceBuffer.
Provides a list of all operation types supported by the CUDA backend.
Registers a CUDA kernel constructor for a given operation type.
Provides a common interface for CUDA kernels.
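
As a rough illustration of how these pieces fit together, the sketch below evaluates a small set of operations once and then builds a Plan for repeated evaluation. The names `evaluate`, `makePlan`, `Plan`, `DeviceBuffer`, and `Node` are assumptions inferred from the summaries above, not the library's verified signatures.

```cpp
// Hypothetical usage sketch; evaluate, makePlan, Plan, DeviceBuffer and the
// Node handles are illustrative names, not the verified dopt API.
#include <vector>

void runGraph(Node loss, Node grad) {
    // One-off evaluation: hand a set of operations to the backend and get
    // back one device buffer per requested operation.
    std::vector<DeviceBuffer> outputs = evaluate({loss, grad});

    // Convenience overload for a single operation.
    DeviceBuffer lossValue = evaluate(loss);

    // For repeated evaluation, build a Plan once so that buffers are
    // preallocated and kernels are selected up front, then execute it
    // many times.
    Plan plan = makePlan({loss, grad});
    for (int step = 0; step < 100; ++step) {
        plan.execute();
    }
}
```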
This is the main interface for the dopt CUDA backend.
The APIs in this module allow users to evaluate operation graphs on GPUs using CUDA. There is also functionality for registering CUDA implementations of custom operations.
In future, this module will also provide an interface for registering user-defined optimisation passes, to be called when a plan is constructed.
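
To make the custom-operation workflow concrete, here is a minimal sketch of registering and deregistering a kernel constructor. The `Kernel` base class, `OperationNode`, `registerKernel`, and `deregisterKernel` names are assumptions based on the descriptions above rather than the library's actual identifiers.

```cpp
// Hypothetical sketch; Kernel, OperationNode, registerKernel and
// deregisterKernel are illustrative names based on the summaries above.
#include <memory>
#include <vector>

// A custom kernel implementing the common CUDA-kernel interface.
class ScaleKernel : public Kernel {
public:
    // Launch the CUDA kernel that evaluates this operation's node.
    void run(const std::vector<DeviceBuffer>& inputs,
             DeviceBuffer& output) override {
        // ... launch a hand-written __global__ kernel here ...
    }
};

void installScaleOp() {
    // Associate a kernel constructor with the custom operation type so that
    // plans containing "scale" nodes can be evaluated.
    registerKernel("scale", [](const OperationNode& node) {
        return std::make_unique<ScaleKernel>();
    });
}

void removeScaleOp() {
    // Remove the association when the custom operation is no longer needed.
    deregisterKernel("scale");
}
```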