core/logging: Fix data race on log_prefix #10568
Force-pushed: 72a7f23 to 846c64d
Please change the commit message heading to: |
Done! |
@dsciebu - You updated the GitHub comment and title (as seen on the web page), but not the commit message (as seen by git log). Both are useful, but we need the details in the commit message, as that is what gets added to git and is visible via git log. The changes themselves look good -- thanks! |
In ofi_get_core_info, which is supposed to be thread safe ("Multiple threads may call fi_getinfo simultaneously, without any requirement for serialization."), a global variable 'log_prefix' is modified, which may lead to a data race. Changing the variable to a thread-local one fixes that problem. Signed-off-by: Dariusz Sciebura <[email protected]>
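To illustrate the idea in the commit message, here is a minimal sketch, not the actual libfabric code. The names `demo_log_prefix`, `demo_format`, `demo_worker`, and `demo_run` are hypothetical, and `__thread` is the GCC/Clang spelling, so this assumes one of those compilers plus pthreads. Each thread sets the prefix, formats a message, and restores the prefix; because the variable is thread-local, one thread's write is never visible to the other.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for log_prefix. With a plain global, concurrent
 * writers would race; with __thread, every thread gets its own copy. */
static __thread const char *demo_log_prefix = "";

/* Hypothetical helper: set the prefix, format one message, restore it. */
static void demo_format(const char *prefix, const char *msg,
			char *out, size_t len)
{
	const char *saved = demo_log_prefix;

	demo_log_prefix = prefix;
	snprintf(out, len, "%s%s", demo_log_prefix, msg);
	demo_log_prefix = saved;
}

struct demo_arg {
	const char *prefix;
	int mismatches;
};

static void *demo_worker(void *p)
{
	struct demo_arg *arg = p;
	char expected[64], out[64];
	int i;

	snprintf(expected, sizeof(expected), "%sgetinfo", arg->prefix);
	for (i = 0; i < 1000; i++) {
		demo_format(arg->prefix, "getinfo", out, sizeof(out));
		if (strcmp(out, expected) != 0)
			arg->mismatches++;
	}
	return NULL;
}

/* Returns the number of messages that carried the wrong prefix. */
static int demo_run(void)
{
	struct demo_arg a = { "prov_a: ", 0 }, b = { "prov_b: ", 0 };
	pthread_t ta, tb;
	int rc;

	rc = pthread_create(&ta, NULL, demo_worker, &a);
	assert(rc == 0);
	rc = pthread_create(&tb, NULL, demo_worker, &b);
	assert(rc == 0);
	pthread_join(ta, NULL);
	pthread_join(tb, NULL);
	return a.mismatches + b.mismatches;
}
```

Calling `demo_run()` should return 0 here. Note that a passing run does not prove the absence of a race in general; a tool such as ThreadSanitizer is what would flag the equivalent code written with a plain global.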
Force-pushed: 846c64d to 6935b70
I am used to the workflow where commits are squashed on merge and the PR's title/comment is used to prepare the new commit. Anyway, I updated the commit title and description as you asked. |
Hey guys, friendly reminder: can you re-review and merge my change? |
bot:aws:retest |
cc: @shefty |
@dsciebu Looks like _Thread_local doesn't play nicely with Windows. We will have to figure out what the comparable Windows solution is. Looks like you have to modify it in the caller which is annoying. See https://learn.microsoft.com/en-us/windows/win32/procthread/using-thread-local-storage |
It looks like _Thread_local is defined in C11, but then deprecated in C23. thread_local is defined for C23. I found one reference where MS uses __declspec(thread). So... it looks like there needs to be some sort of abstraction for this, possibly set based on the platform, but maybe also the compiler? |
@shefty Yeah... this is going to get messy. Woof |
@shefty @aingerson
#ifndef OFI_THREAD_LOCAL_H
#define OFI_THREAD_LOCAL_H
#ifdef __cplusplus
extern "C" {
#endif
#if defined(__STDC_VERSION__) && (__STDC_VERSION__ >= 202311L)
// C23 and above: use thread_local directly
#define OFI_THREAD_LOCAL thread_local
#elif defined(__STDC_VERSION__) && (__STDC_VERSION__ >= 201112L) && !defined(__STDC_NO_THREADS__)
// C11/C17: use _Thread_local (a keyword, not a macro, so it cannot be tested with defined())
#define OFI_THREAD_LOCAL _Thread_local
#elif defined(__GNUC__) || defined(__INTEL_COMPILER) || defined(__SUNPRO_CC) || defined(__IBMCPP__) || defined(__clang__)
// GCC/Clang/Intel/SunPro/IBM compilers
#define OFI_THREAD_LOCAL __thread
#elif defined(_MSC_VER)
// Microsoft Visual C++ compiler
#define OFI_THREAD_LOCAL __declspec(thread)
#else
// Fallback: Unsupported compiler
#error "Thread-local storage is not supported on this platform"
#endif
#ifdef __cplusplus
}
#endif
#endif // OFI_THREAD_LOCAL_H |
@piotrchmiel You'll need to separate the Windows and Unix definitions and put them in their appropriate osd.h headers (see the include/ directory and the unix/ and windows/). But then something similar should work. Depending on the availability of the option, I would remove the |
The race impacts debug log messages only. I would simply live with the race by default. Definitely don't error |
I would at minimum add a compile warning/message and comment about it. Yes, this issue only applies to debug log messages, but anyone could see and call the thread local implementation for something else and think it will protect their code and be very disappointed :( |
Besides that, leaving the data race in place for debug logging masks other problems when running under sanitizers: TSan reports the data race here and stops, which makes it useless for examining the real problem. |
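For reference, the "warn instead of error" fallback suggested above could look roughly like the sketch below. All `DEMO_*` names are hypothetical and the compiler checks are deliberately reduced to two cases; this is not the header that was merged. The extra `DEMO_HAVE_THREAD_LOCAL` macro is an assumption added so that callers can tell whether the qualifier actually protects anything.

```c
#include <stdio.h>

#if defined(__GNUC__) || defined(__clang__)
/* GCC and Clang: __thread extension */
#define DEMO_THREAD_LOCAL __thread
#define DEMO_HAVE_THREAD_LOCAL 1
#elif defined(_MSC_VER)
/* Microsoft Visual C++ */
#define DEMO_THREAD_LOCAL __declspec(thread)
#define DEMO_HAVE_THREAD_LOCAL 1
#else
/* Fallback: the variable stays a plain global and the race remains.
 * Emit a build-time message instead of failing the build. */
#pragma message("no thread-local storage: DEMO_THREAD_LOCAL expands to nothing")
#define DEMO_THREAD_LOCAL
#define DEMO_HAVE_THREAD_LOCAL 0
#endif

/* Hypothetical per-thread prefix; only truly per-thread when
 * DEMO_HAVE_THREAD_LOCAL is 1. */
static DEMO_THREAD_LOCAL const char *demo_prefix = "";

static void demo_set_prefix(const char *prefix)
{
	demo_prefix = prefix;
}

static const char *demo_get_prefix(void)
{
	return demo_prefix;
}

/* Lets callers check at run time whether the prefix is race-free. */
static int demo_prefix_is_thread_local(void)
{
	return DEMO_HAVE_THREAD_LOCAL;
}
```

Code that needs real protection could then test `DEMO_HAVE_THREAD_LOCAL` at compile time rather than silently relying on the qualifier.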