prepack serialization follow up PR #22670
base: main
Conversation
  return initializer_name + seperator + node_name;
}

size_t GetMemoryAlignedOffset(size_t current_offset) {
Check warning (Code scanning / PREfast):
You can attempt to make 'onnxruntime::utils::GetMemoryAlignedOffset' constexpr unless it contains any undefined behavior (f.4).
  return initializer_name + seperator + node_name;
}

size_t GetMemoryAlignedOffset(size_t current_offset) {
  const size_t alignment_number_of_bytes = 64;
Check warning (Code scanning / PREfast):
The const variable 'alignment_number_of_bytes' can be computed at compile-time. Consider using constexpr (con.5).
This function should be constexpr.
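A minimal sketch of how both PREfast findings could be addressed, assuming the 64-byte alignment shown in the diff; the rounding logic here is an assumption, since the function body is not visible in this excerpt:

constexpr size_t GetMemoryAlignedOffset(size_t current_offset) {
  constexpr size_t alignment_number_of_bytes = 64;
  // Round current_offset up to the next multiple of the (power-of-two) alignment.
  return (current_offset + alignment_number_of_bytes - 1) & ~(alignment_number_of_bytes - 1);
}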
Let's address some important issues in this round first.
@@ -38,15 +39,25 @@ class Attention : public OpKernel, public AttentionCPUBase {
                int input_idx,
                /*out*/ bool& used_shared_buffers) override;

  virtual std::optional<Tensor> GetPrePackTensor(int /*input_index*/) override;
virtual is redundant in overriding functions. They are virtual by default.
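As a standalone illustration of the point, with int standing in for Tensor purely for brevity:

#include <optional>

struct Base {
  virtual ~Base() = default;
  virtual std::optional<int> GetPrePackTensor(int input_index) = 0;
};

struct Derived : Base {
  // `override` already implies the function is virtual, so repeating the keyword adds nothing.
  std::optional<int> GetPrePackTensor(int /*input_index*/) override {
    return std::nullopt;
  }
};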
 private:
  bool IsPackWeightsSuccessful(int qkv_index, AllocatorPtr alloc, size_t head_size,
                               size_t input_hidden_size, const T* weights_data,
                               size_t weight_matrix_col_size, PrePackedWeights* prepacked_weights);

  void ConvertPrePackWeightsIntoTensor(onnxruntime::AllocatorPtr& alloc, const onnxruntime::Tensor& weights, PrePackedWeights* prepacked_weights);
Going by the name, prepacked_weights cannot be an optional parameter; otherwise there is nothing to prepack. When can it be nullptr?
Is it an input or an output?
template <typename T>
void Attention<T>::ConvertTensorToPrePackWeight(void* tensor_data_raw) {
  // buffer of packed_tensor is combine of:
Can we create a struct describing the layout and then reinterpret the pointer into it? The current code is difficult to read and understand.
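A sketch of that suggestion; the field names are assumptions based on this thread, not the actual PR code:

#include <cstddef>
#include <cstring>

// Describes the front of the packed buffer; the dimension values and the
// packed weight bytes follow it in memory.
struct PackedTensorHeader {
  size_t packed_weights_size;  // byte size of the packed weight data
  size_t rank;                 // number of int64_t dims that follow
};

// Reading the header becomes a single memcpy into a trivially copyable
// struct instead of a chain of casts and manual offsets.
inline PackedTensorHeader ReadPackedTensorHeader(const void* tensor_data_raw) {
  PackedTensorHeader header;
  std::memcpy(&header, tensor_data_raw, sizeof(header));
  return header;
}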
  TensorShapeVector shape_vector = weight_shape_.AsShapeVector();
  size_t shape_vector_mem_size = utils::CalculateTensorShapeVectorMemoryUsage(shape_vector);
  void* shape_vector_ptr = static_cast<void*>(&shape_vector);
This is undefined behavior. You need to copy the data, not the object of this class.
For this you can just get the GetDims() span and copy the data from it.
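A sketch of that approach, with std::vector standing in for the span returned by GetDims():

#include <cstdint>
#include <cstring>
#include <vector>

// Serialize a shape by copying the raw dimension values, never the container object itself.
inline void WriteShape(const std::vector<int64_t>& dims, char* dst) {
  const uint64_t rank = dims.size();
  std::memcpy(dst, &rank, sizeof(rank));  // rank first
  if (rank != 0) {
    std::memcpy(dst + sizeof(rank), dims.data(), rank * sizeof(int64_t));  // then the dimension values
  }
}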
  std::memcpy(weight_shape_buffer.get(),
              static_cast<char*>(tensor_data_raw) + 4 * sizeof(size_t),
              weight_shape_buffer_size);
  auto weight_shape_vector = static_cast<const InlinedVector<int64_t>*>(weight_shape_buffer.get());
This is just wrong. Undefined behavior.
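For contrast, a sketch of the inverse of the write above: rebuild the dimensions from the raw bytes rather than casting the buffer to an InlinedVector:

#include <cstdint>
#include <cstring>
#include <vector>

inline std::vector<int64_t> ReadShape(const char* src) {
  uint64_t rank = 0;
  std::memcpy(&rank, src, sizeof(rank));
  std::vector<int64_t> dims(rank);
  if (rank != 0) {
    std::memcpy(dims.data(), src + sizeof(rank), rank * sizeof(int64_t));
  }
  return dims;
}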
  Tensor ConvertPackedBufferAndShapeToTensor(onnxruntime::AllocatorPtr& alloc,
                                             const onnxruntime::Tensor& weights,
                                             size_t packed_weights_size_,
                                             TensorShape weight_shape_,
- Tensor shape is passed by value? Did you mean to pass span?
- Trailing underscores are reserved for member variables
- I am confused about what the result of this function is. Is the last parameter also an output? (A possible revised declaration is sketched further below.)
                                             void* original_packed_buffer,
                                             IAllocatorUniquePtr<void>& packed_buffer);

  Tensor ConvertPackedBufferAndShapeToTensorWithFlag(onnxruntime::AllocatorPtr& alloc,
Same comments for all other new functions.
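A possible shape for these declarations along the lines of the comments above; every name here is illustrative rather than taken from the PR:

Tensor ConvertPackedBufferAndShapeToTensor(
    onnxruntime::AllocatorPtr& alloc,
    const onnxruntime::Tensor& weights,
    size_t packed_weights_size,             // no trailing underscore: not a member
    gsl::span<const int64_t> weight_shape,  // dims passed as a span instead of a TensorShape by value
    const void* original_packed_buffer,
    /*out*/ IAllocatorUniquePtr<void>& packed_buffer);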
size_t CalculateTensorShapeVectorMemoryUsage(TensorShapeVector& tensor_shape_vector) {
  // Calculate memory for the vector object itself (metadata)
  size_t vector_metadata_size = sizeof(std::vector<int64_t>);
You do not want to do it.
How is tensor_shape_vector related to std::vector?
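A sketch of the point being made here: the bytes that actually need to be serialized are just the dimension values, so the size involves no sizeof of a vector object:

#include <cstddef>
#include <cstdint>

// Byte size of the shape payload: one int64_t per dimension, nothing else.
// `rank` would be dims.size() for the span returned by TensorShape::GetDims().
inline size_t ShapeDataByteSize(size_t rank) {
  return rank * sizeof(int64_t);
}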
  // 3. original packed_weights buffer
  TensorShapeVector shape_vector = weight_shape_.AsShapeVector();
  size_t shape_vector_mem_size = utils::CalculateTensorShapeVectorMemoryUsage(shape_vector);
  void* shape_vector_ptr = static_cast<void*>(&shape_vector);
Undefined behavior
  // 2. weight shape: first vector memory size, then vector content
  // 3. original packed_weights buffer
  TensorShapeVector shape_vector = weight_shape_.AsShapeVector();
  size_t shape_vector_mem_size = utils::CalculateTensorShapeVectorMemoryUsage(shape_vector);
I am struggling to understand what you are trying to do here.
Description
Follow-up PR to #22256.
Changes the remaining kernels that have PrePack implemented.