
prepack serialization follow up PR #22670

Open · wants to merge 20 commits into base: main

Conversation

frank-dong-ms (Contributor)

Description

Follow-up PR to #22256.

Changes the remaining kernels that have PrePack implemented.

frank-dong-ms requested a review from a team as a code owner · October 31, 2024 06:48
return initializer_name + seperator + node_name;
}

size_t GetMemoryAlignedOffset(size_t current_offset) {
  const size_t alignment_number_of_bytes = 64;

Check warning (Code scanning / PREfast): You can attempt to make 'onnxruntime::utils::GetMemoryAlignedOffset' constexpr unless it contains any undefined behavior (f.4).

Check warning (Code scanning / PREfast): The const variable 'alignment_number_of_bytes' can be computed at compile-time. Consider using constexpr (con.5).
@yuslepukhin (Member) · Oct 31, 2024:

This function should be constexpr.
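
A minimal sketch of the suggested change; the rounding logic is an assumption, since the full function body is not shown in this excerpt:

constexpr size_t GetMemoryAlignedOffset(size_t current_offset) {
  // 64-byte alignment, as in the snippet flagged by PREfast.
  constexpr size_t alignment_number_of_bytes = 64;
  // Round current_offset up to the next multiple of the alignment.
  return (current_offset + alignment_number_of_bytes - 1) & ~(alignment_number_of_bytes - 1);
}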

@yuslepukhin (Member) left a comment:

Let's address some important issues in this round first.

@@ -38,15 +39,25 @@ class Attention : public OpKernel, public AttentionCPUBase {
int input_idx,
/*out*/ bool& used_shared_buffers) override;

virtual std::optional<Tensor> GetPrePackTensor(int /*input_index*/) override;
@yuslepukhin (Member):

virtual is redundant in overriding functions. They are virtual by default.
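
For reference, the same declaration without the redundant keyword (override alone already implies virtual):

std::optional<Tensor> GetPrePackTensor(int /*input_index*/) override;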

private:
bool IsPackWeightsSuccessful(int qkv_index, AllocatorPtr alloc, size_t head_size,
size_t input_hidden_size, const T* weights_data,
size_t weight_matrix_col_size, PrePackedWeights* prepacked_weights);

void ConvertPrePackWeightsIntoTensor(onnxruntime::AllocatorPtr& alloc, const onnxruntime::Tensor& weights, PrePackedWeights* prepacked_weights);
@yuslepukhin (Member):

Going by the name, prepacked_weights cannot be an optional parameter; otherwise there is nothing to prepack. When can it be nullptr?
Is it an input or an output?


template <typename T>
void Attention<T>::ConvertTensorToPrePackWeight(void* tensor_data_raw) {
// buffer of packed_tensor is combine of:
@yuslepukhin (Member):

Can we create a struct describing the layout and then reinterpret the pointer into it? As written, the code is difficult to read and understand. (A sketch of this idea follows.)
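
A hypothetical sketch of that suggestion; the field names and layout are illustrative, not the PR's actual serialization format:

// Header describing the serialized prepacked-weight blob.
struct PackedWeightHeader {
  size_t is_packed;            // flag recorded by the packing code
  size_t packed_weights_size;  // byte size of the packed weight data
  size_t rank;                 // number of int64_t dimensions that follow
  // 'rank' int64_t dims follow the header, then the packed weight bytes.
};

// Reading then becomes one self-documenting step instead of manual offset math:
// const auto* header = reinterpret_cast<const PackedWeightHeader*>(tensor_data_raw);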


TensorShapeVector shape_vector = weight_shape_.AsShapeVector();
size_t shape_vector_mem_size = utils::CalculateTensorShapeVectorMemoryUsage(shape_vector);
void* shape_vector_ptr = static_cast<void*>(&shape_vector);
@yuslepukhin (Member):

This is undefined behavior. You need to copy the data, not the object of this class.
For this you can get the GetDims() span and copy the data from it (see the sketch below).
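
A minimal sketch of the suggested fix; destination_ptr is a hypothetical pointer into the serialization buffer:

// Copy the dimension values themselves, not the TensorShapeVector object.
gsl::span<const int64_t> dims = weight_shape_.GetDims();
std::memcpy(destination_ptr, dims.data(), dims.size_bytes());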

std::memcpy(weight_shape_buffer.get(),
static_cast<char*>(tensor_data_raw) + 4 * sizeof(size_t),
weight_shape_buffer_size);
auto weight_shape_vector = static_cast<const InlinedVector<int64_t>*>(weight_shape_buffer.get());
@yuslepukhin (Member):

This is just wrong. Undefined behavior.
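
A sketch of one well-defined alternative: rebuild the dimensions from the raw int64_t payload instead of casting the buffer to a vector object. Here rank and shape_offset are assumed to have been read from the serialized header:

// Reconstruct the shape from plain integer data; no object aliasing involved.
TensorShapeVector dims(rank);
std::memcpy(dims.data(), static_cast<const char*>(tensor_data_raw) + shape_offset,
            rank * sizeof(int64_t));
TensorShape weight_shape(gsl::make_span(dims));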

Tensor ConvertPackedBufferAndShapeToTensor(onnxruntime::AllocatorPtr& alloc,
const onnxruntime::Tensor& weights,
size_t packed_weights_size_,
TensorShape weight_shape_,
@yuslepukhin (Member):

  1. The tensor shape is passed by value? Did you mean to pass a span? (An illustrative revised signature follows this hunk.)
  2. Trailing underscores are reserved for member variables.
  3. I am confused about what the result of this function is. Is the last parameter also an output?

void* original_packed_buffer,
IAllocatorUniquePtr<void>& packed_buffer);
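
An illustrative revision of the signature addressing points 1 and 2 above; the parameter names are hypothetical, and the last two parameters are kept as-is because their intent is unclear from the diff:

// Sketch only: pass the shape as a span and drop the trailing underscores on parameters.
Tensor ConvertPackedBufferAndShapeToTensor(onnxruntime::AllocatorPtr& alloc,
                                           const onnxruntime::Tensor& weights,
                                           size_t packed_weights_size,
                                           gsl::span<const int64_t> weight_shape,
                                           void* original_packed_buffer,
                                           IAllocatorUniquePtr<void>& packed_buffer);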

Tensor ConvertPackedBufferAndShapeToTensorWithFlag(onnxruntime::AllocatorPtr& alloc,
@yuslepukhin (Member):

Same comments for all other new functions.


size_t CalculateTensorShapeVectorMemoryUsage(TensorShapeVector& tensor_shape_vector) {
// Calculate memory for the vector object itself (metadata)
size_t vector_metadata_size = sizeof(std::vector<int64_t>);
@yuslepukhin (Member):

You do not want to do it.

@yuslepukhin (Member):

How is tensor_shape_vector related to std::vector?
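
For context, a hedged sketch of a size computation based on the element data rather than sizeof(std::vector<int64_t>); TensorShapeVector is an InlinedVector, so the size of the vector object says nothing about the data it owns. The helper name is hypothetical:

// Serialized size of the dims: one count field plus the raw int64_t values.
size_t CalculateTensorShapeSerializedSize(gsl::span<const int64_t> dims) {
  return sizeof(uint64_t) + dims.size_bytes();
}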

// 3. original packed_weights buffer
TensorShapeVector shape_vector = weight_shape_.AsShapeVector();
size_t shape_vector_mem_size = utils::CalculateTensorShapeVectorMemoryUsage(shape_vector);
void* shape_vector_ptr = static_cast<void*>(&shape_vector);
@yuslepukhin (Member):

Undefined behavior

// 2. weight shape: first vector memory size, then vector content
// 3. original packed_weights buffer
TensorShapeVector shape_vector = weight_shape_.AsShapeVector();
size_t shape_vector_mem_size = utils::CalculateTensorShapeVectorMemoryUsage(shape_vector);
@yuslepukhin (Member):

I am struggling to understand what you are trying to do here.

Labels: None yet
Projects: None yet
2 participants