Improve default allocator wording #649

Open · wants to merge 2 commits into main
adoc/chapters/programming_interface.adoc (6 changes: 4 additions & 2 deletions)

@@ -4918,8 +4918,10 @@ STL-based libraries (e.g, Intel's TBB provides an allocator).
==== Default allocators

A default allocator is always defined by the implementation.
-For allocations greater than size zero, when successful it is guaranteed to
-return non-[code]#nullptr# and new memory positions every call.
+For successful allocations of size greater than zero, the default allocator must
+return a pointer to a contiguous, exclusive region of memory of at least the
+requested size.
+For unsuccessful allocations, the default allocator must return [code]#nullptr#.
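
As an illustration (not part of the patch or any comment below), here is a minimal C++ sketch of an allocator shape matching the proposed wording. The class name and the use of std::malloc are assumptions, and the nullptr-on-failure behavior is exactly the point questioned later in this review.

  #include <cstddef>
  #include <cstdlib>

  template <typename T>
  struct default_allocator_sketch {   // hypothetical name, for illustration only
    using value_type = T;

    // A successful allocation of n > 0 elements returns a pointer to a
    // contiguous region of at least n * sizeof(T) bytes that is not shared
    // with any other live allocation made through this allocator.
    T* allocate(std::size_t n) {
      return static_cast<T*>(std::malloc(n * sizeof(T)));  // returns nullptr on
                                                           // failure, as the
                                                           // proposed text reads
    }

    void deallocate(T* p, std::size_t) { std::free(p); }
  };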
Contributor:

Can we instead remove this wording? The table below already says that the default allocator meets the C++ requirements for Allocator, and C++ defines the behavior of Allocator quite precisely.

If you ask me, the most confusing part of this section is this statement:

The user can implement an allocator that returns the same address as the one passed in the buffer constructor, but it is the responsibility of the user to handle the potential race conditions.

It would be good to figure out what we meant here and clarify this. Or, maybe it should be deleted?

Contributor:

My best guess is that it's talking about the buffer constructor overloads that accept both a host pointer and an allocator:

  buffer(const T* hostData, const range<Dimensions>& bufferRange,
         AllocatorT allocator, const property_list& propList = {});

The buffer owns hostData during its lifetime, so I guess it might be safe sometimes for allocator to return hostData (or other pointers within the range). But I don't know why that's useful, and I don't know how the user can avoid race conditions. Maybe use_mutex?
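
To make that guess concrete, a minimal sketch (with an assumed name, host_ptr_allocator; nothing here is defined by the spec): it hands the user's host pointer back on the first allocation and falls back to the heap afterwards. Whether that is ever safe is the open question in this thread.

  #include <cstddef>
  #include <memory>

  template <typename T>
  struct host_ptr_allocator {          // hypothetical, for illustration only
    using value_type = T;
    T* host = nullptr;                 // the same pointer passed as hostData
    bool used = false;

    T* allocate(std::size_t n) {
      if (!used && host != nullptr) {  // first call: reuse the user's memory
        used = true;
        return host;
      }
      return std::allocator<T>{}.allocate(n);   // any other call: heap
    }

    void deallocate(T* p, std::size_t n) {
      if (p != host)                   // the user still owns the host memory
        std::allocator<T>{}.deallocate(p, n);
    }
  };

An instance would be constructed with the same pointer that is passed to the buffer constructor as hostData.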

Contributor:

I guess it might be safe sometimes for allocator to return hostData (or other pointers within the range)

Yes, that's my best guess too. However, this seems to assume that buffer will only call allocator in order to create a host buffer that contains hostData, and it won't call allocator for any other purpose. This is not specified anywhere, and it's contrary to the way the standard C++ library functions use allocators. When a class from the standard C++ library takes an allocator, I think it uses the allocator to allocate all the dynamic memory that it needs.

Contributor Author:

Can we instead remove this wording?

I'd also prefer to just remove the sentence and not replace it with anything.

If you ask me, the most confusing part of this section is this statement:

I interpreted that as saying that a user could use an allocator for the user's own purposes (whatever those may be) and that same allocator could be passed to a buffer constructor. I.e., the act of giving an allocator to a buffer doesn't disallow the user from using the same allocator. Given that the SYCL runtime might choose to allocate memory using the allocator asynchronously, the user must put some locking mechanism into the allocator to avoid races.

Which seems reasonable to me, but I don't know if that's what was intended.
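
A minimal sketch of that interpretation, assuming illustrative names: one allocator is shared between application code and a buffer, and every copy of the allocator shares one lock so that concurrent calls from the runtime and the application cannot race.

  #include <cstddef>
  #include <memory>
  #include <mutex>

  template <typename T>
  struct locking_allocator {           // hypothetical, for illustration only
    using value_type = T;

    // Copies of the allocator (including any copies held by the SYCL runtime)
    // share the same mutex, so allocations from any thread are serialized.
    std::shared_ptr<std::mutex> lock = std::make_shared<std::mutex>();

    T* allocate(std::size_t n) {
      std::lock_guard<std::mutex> guard(*lock);
      return std::allocator<T>{}.allocate(n);
    }

    void deallocate(T* p, std::size_t n) {
      std::lock_guard<std::mutex> guard(*lock);
      std::allocator<T>{}.deallocate(p, n);
    }
  };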

Contributor:

That's not how I interpret that sentence, but I think we all agree that the sentence is confusing. Personally, I'd be happy to delete it.

The issue about thread safety is interesting. Is it legal for the implementation to call the user's allocator simultaneously from different threads even if the application has only one thread? (Or only one thread calling SYCL APIs?) For example, could an implementation have a background thread that potentially calls the allocator at the same time as the application's main thread?

If we think this is a valid implementation, the specification should make it clear that the allocator must be written to be thread safe.

Contributor Author:

I skimmed the C++ spec, and I couldn't find anything on allocator thread safety one way or another. I could well have missed something, though.

I'm not convinced std::allocator is intended to be thread-safe. It's specified as stateless, and I'd hazard a guess that any mutex or such inside the allocator could qualify as state. Also, std::allocators get rebound inside the standard library (and elsewhere), which implies constructing different allocator objects. A mutex data member wouldn't provide thread-safety. If std::allocator does have a thread-safety guarantee, I'd expect it to be provided by whatever plumbing std::allocator relies on, which could be implementation-defined.
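
A minimal sketch of the rebinding point, with assumed names: a rebound allocator is a different object of a different type, so a mutex held directly as a data member does not travel with it.

  #include <cstddef>
  #include <memory>
  #include <mutex>
  #include <type_traits>

  template <typename T>
  struct stateful_allocator {          // hypothetical, for illustration only
    using value_type = T;
    std::mutex m;                      // per-object state, not shared by copies

    stateful_allocator() = default;
    stateful_allocator(const stateful_allocator&) {}      // copy gets a fresh mutex
    template <typename U>
    stateful_allocator(const stateful_allocator<U>&) {}   // rebind gets a fresh mutex

    T* allocate(std::size_t n) { return std::allocator<T>{}.allocate(n); }
    void deallocate(T* p, std::size_t n) { std::allocator<T>{}.deallocate(p, n); }
  };

  // A container (or runtime) that rebinds this allocator ends up with a
  // different type holding a different mutex, so locking m guards nothing shared:
  using rebound =
      std::allocator_traits<stateful_allocator<int>>::rebind_alloc<long>;
  static_assert(!std::is_same_v<rebound, stateful_allocator<int>>, "different type");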

For SYCL, I think there are two ways of looking at it:

  1. "SYCL is thread-safe so that the user doesn't need to worry about thread-safety. Therefore, the user shouldn't need to worry about providing a thread-safe allocator; it's the SYCL runtime that must provide the thread-safety."
  2. "SYCL is thread-safe. When a SYCL runtime relies on an entity provided by the user, the user-provided entity must be thread-safe so that the SYCL runtime can be thread-safe."

Both of these seem entirely reasonable to me, but only one is consistent with C++. Problem is, we don't know which one it is.

Contributor:

I skimmed the C++ spec, and I couldn't find anything on allocator thread safety one way or another. I could well have missed something, though.

It doesn't say "thread-safe", but allocator.members says:

Except for the destructor, member functions of the default allocator shall not introduce data races ([intro.multithread]) as a result of concurrent calls to those member functions from different threads. Calls to these functions that allocate or deallocate a particular unit of storage shall occur in a single total order, and each such deallocation call shall happen before the next allocation (if any) in this order.

For SYCL, I think there are two ways of looking at it:

I agree that both of these are reasonable interpretations. I have a preference for 2.

Collaborator (@nliber, Oct 29, 2024):

I think std::allocator is guaranteed thread-safe (at least as far as acquiring/releasing memory) because it is specified to call ::operator new, which in turn is specified to call malloc or aligned_alloc.

Contributor Author:

I have a preference for 2.

I agree. With option 1, a SYCL runtime would need to conservatively assume that any user-supplied allocator could race with any other, which hurts performance. Option 2 is also more similar to C++.

Member:

Yes, option 2 seems reasonable too.

Contributor:

For unsuccessful allocations, the default allocator must return [code]#nullptr#.

I think this is not the way allocators normally handle out-of-memory. A C++ allocator normally throws an exception in this case.
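
For contrast, a minimal sketch (with an assumed name) of the conventional C++ failure path: allocate() reports out-of-memory by throwing std::bad_alloc rather than by returning nullptr.

  #include <cstddef>
  #include <cstdlib>
  #include <new>

  template <typename T>
  struct throwing_allocator {          // hypothetical, for illustration only
    using value_type = T;

    T* allocate(std::size_t n) {
      void* p = std::malloc(n * sizeof(T));
      if (p == nullptr)
        throw std::bad_alloc{};        // conventional Allocator behavior on failure
      return static_cast<T*>(p);
    }

    void deallocate(T* p, std::size_t) { std::free(p); }
  };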

Contributor Author:

I agree that the named requirement Allocator implies that the only alternative an allocator has to providing the requested memory is to throw.

A few lines earlier in the SYCL spec (line 4892), it says:

When an allocator returns a `nullptr`, the runtime cannot allocate data on the host.

So I guess either the SYCL spec expects more C-like behavior (in which case default allocators are not actually Allocators in the named-requirement sense), or default allocators throw on failure, in which case an allocator will actually throw when the runtime can't allocate data on the host. Either way, one sentence or the other needs to change in the SYCL spec.

Contributor:

The SYCL spec should not invent some new way for C++ allocators to work. I think the previous statement about returning nullptr was just a mistake.

The default allocator for const buffers will remove the const-ness of the type
(therefore, the default allocator for a buffer of type [code]#const int# will be
an [code]#Allocator<int>)#.