Deformable DETR #81
If your custom op derives from `IPluginV2DynamicExt`, the enqueue signature is

```cpp
int32_t enqueue(const PluginTensorDesc* inputDesc, const PluginTensorDesc* outputDesc,
                const void* const* inputs, void* const* outputs, void* workspace,
                cudaStream_t stream) noexcept override;
```

In this method the batch size can be read from the tensor descriptors. The other option is

```cpp
int32_t enqueue(int32_t batchSize, void const* const* inputs, void* const* outputs,
                void* workspace, cudaStream_t stream) noexcept override;
```

where the batch size is the first parameter. The TensorRT OSS repository is a good starting point to learn how to write a custom plugin.
Does the batch size need to be defined for each input? Let me explain my doubt. The kernel launcher is

```cpp
template <typename scalar_t>
void DeformAttentionForwardCUDAKernelLauncher(
    cudaStream_t stream,
    const scalar_t* data_value,
    const int64_t* data_spatial_shapes,
    const int64_t* c,
    const scalar_t* data_sampling_loc,
    const scalar_t* data_attn_weight,
    const int batch_size,
    const int spatial_size,
    const int num_heads,
    const int channels,
    const int num_levels,
    const int num_query,
    const int num_point,
    scalar_t* data_col);
```

where the class members of the plugin are

```cpp
const int spatial_size;
const int num_heads;
const int channels;
const int num_levels;
const int num_query;
const int num_point;
```

which I pass in the constructor for serialization and deserialization (they are inferred from the inputs). The `inputs` pointer will contain the following 5 inputs:

```cpp
inputs[0] // data_value
inputs[1] // data_spatial_shapes
inputs[2] // data_level_start_index
inputs[3] // data_sampling_loc
inputs[4] // data_attn_weight
```

while `outputs[0]` is `data_col`. Now, according to the PyTorch operation, the dimensionalities are as follows: …

So I have all the info needed. My doubt is how TensorRT handles the batch size: I remember something along the lines of the plugin needing the batch size to be specified for all inputs, so I would need to reshape both …
Ok. And as for …
Hello,
I am planning to create a custom model using the mmdetection framework and I am interested in using deformable attention; I also wrote in the main project.
Are you planning on implementing the deformable attention plugin, or would you be available to discuss my implementation (it might become a pull request for the project)? I only need some clarification on the `enqueue` function and how TensorRT handles the batch size (two different versions of `enqueue` are available, one with an explicit batch size and one with tensor descriptors).