Remove is_shutdown flag from processors. And fix logger::emit() to check for the flag before emit. #2462
@@ -268,6 +268,14 @@ impl opentelemetry::logs::Logger for Logger {

    /// Emit a `LogRecord`.
    fn emit(&self, mut record: Self::LogRecord) {
        if self.provider.inner.is_shutdown.load(Ordering::Relaxed) {
The issue with this is that it can affect throughput because of the contention introduced here. (Logs so far have no contention when using etw/user_events.)
Can you check the stress test before and after?
I am unsure of a solution. Maybe don't check shutdown anywhere except in non-prod processors like stdout, and rely on mechanisms like the export client, etw, etc. returning errors.
What harm can it cause? 🤔
Oh yes, it doesn’t make sense to introduce contention in the hot path, even if the contention is at the atomic level.
Since we are only reading the atomic variable, and its value does not change for most of the application's lifetime, it most likely should not have any visible effect on throughput.
We should be able to confirm that with the stress test.
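To make the pattern under discussion concrete, here is a minimal sketch of a read-only shutdown check on the emit path. `ProviderInner`, `Logger`, the `sink` parameter, and `demo_shutdown_check` are simplified, hypothetical stand-ins and not the SDK's actual types; only the `is_shutdown.load(Ordering::Relaxed)` call mirrors the diff above.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

// Hypothetical stand-in for the provider state shared by all loggers.
struct ProviderInner {
    is_shutdown: AtomicBool,
}

// Hypothetical, simplified logger; the real SDK type carries more state.
struct Logger {
    provider: Arc<ProviderInner>,
}

impl Logger {
    /// Returns true if the record was passed on, false if it was dropped
    /// because shutdown had already been requested.
    fn emit(&self, record: &str, sink: &mut Vec<String>) -> bool {
        // Read-only check on the hot path. The flag is written once, at
        // shutdown, so readers do not compete with writers in normal operation.
        if self.provider.is_shutdown.load(Ordering::Relaxed) {
            return false;
        }
        sink.push(record.to_owned());
        true
    }
}

// Emits once before and once after the flag is set.
fn demo_shutdown_check() -> (bool, bool, usize) {
    let provider = Arc::new(ProviderInner {
        is_shutdown: AtomicBool::new(false),
    });
    let logger = Logger {
        provider: Arc::clone(&provider),
    };
    let mut sink = Vec::new();

    let before = logger.emit("first", &mut sink);
    provider.is_shutdown.store(true, Ordering::Relaxed);
    let after = logger.emit("second", &mut sink);

    (before, after, sink.len())
}
```

In this sketch the first record reaches the sink and the second is dropped once the flag is set.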
Yes, the effect of an uncontended atomic read would be too small to be noticed in the stress test. It costs slightly more than a normal bool check, but much less than the cost incurred under contention. I added a benchmark for logger::emit() in this PR, which shows an added latency of 1-2 ns:
main branch:
logger_emit time: [37.916 ns 37.977 ns 38.077 ns]
change: [-0.2733% -0.0648% +0.1328%] (p = 0.58 > 0.05)
No change in performance detected.
PR branch:
logger_emit time: [38.941 ns 39.027 ns 39.172 ns]
change: [+2.6756% +2.9861% +3.3292%] (p = 0.00 < 0.05)
Performance has regressed.
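As a quick check on the numbers above, the sketch below subtracts the two middle estimates directly. The constants are copied from the benchmark output; the function names are hypothetical. This is a plain comparison of the two reported values, not the statistic Criterion computes for its own "change" line, so the percentage differs slightly from the one printed.

```rust
// Middle estimates copied from the benchmark output above, in nanoseconds.
const MAIN_NS: f64 = 37.977;
const PR_NS: f64 = 39.027;

// Absolute cost added by the shutdown check in this run.
fn added_latency_ns() -> f64 {
    PR_NS - MAIN_NS
}

// The same difference, relative to the main-branch estimate, in percent.
fn relative_change_pct() -> f64 {
    (PR_NS - MAIN_NS) / MAIN_NS * 100.0
}
```

The difference comes out at roughly 1 ns, or a little under 3% of the main-branch time.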
I am unsure of a solution. Maybe don't check shutdown anywhere except in non-prod processors like stdout, and rely on mechanisms like the export client, etw, etc. returning errors.
What harm can it cause?
In that case, any custom exporter that can be connected to the reentrant or simple processor needs to handle shutdown properly itself. As of now, etw and user_events don't do anything on shutdown, so they will continue to emit even after the user has invoked shutdown.
Yes, the effect of an uncontended atomic read would be too small to be noticed in the stress test.
@lalitb Just to confirm, you agree that the change in this PR (reading is_shutdown) is not introducing any contention, right?
Yes, for an uncontended relaxed atomic read of an AtomicBool, the cost is close to that of a regular bool read. I observed an added latency of 1-2 ns from this check, which I believe is acceptable. Just to add, we have this check for spans too, when they are ended.
@@ -201,15 +187,6 @@ impl Debug for BatchLogProcessor {

impl LogProcessor for BatchLogProcessor {
    fn emit(&self, record: &mut LogRecord, instrumentation: &InstrumentationScope) {
If emit is called after shutdown, the channel would already be closed even without an is_shutdown check, so wouldn't the error it returns be good enough?
something like
- Don't check for is_shutdown.
- Just export as usual.
- Since the channel is closed, it'll error out.
- Log that error.
No contention/perf cost for the normal path. If logs are still emitted after shutdown, it clearly indicates an issue with how the user is managing lifetimes.
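The steps above can be sketched roughly as follows. This is an illustration only: it assumes a bounded `std::sync::mpsc` channel, whereas the channel type actually used by BatchLogProcessor is not shown in this thread, and `BatchProcessor`, its error strings, and `demo_closed_channel` are hypothetical names.

```rust
use std::sync::mpsc::{sync_channel, Receiver, SyncSender, TrySendError};

// Hypothetical, simplified batch processor: records are handed to a
// background worker over a bounded channel.
struct BatchProcessor {
    sender: SyncSender<String>,
}

impl BatchProcessor {
    fn new(capacity: usize) -> (Self, Receiver<String>) {
        let (sender, receiver) = sync_channel(capacity);
        (BatchProcessor { sender }, receiver)
    }

    /// No shutdown flag is consulted here; a closed channel is the only signal.
    fn emit(&self, record: &str) -> Result<(), String> {
        match self.sender.try_send(record.to_owned()) {
            Ok(()) => Ok(()),
            Err(TrySendError::Full(_)) => Err("queue full, record dropped".to_owned()),
            // The receiving side is gone, which in this sketch means shutdown.
            Err(TrySendError::Disconnected(_)) => {
                Err("processor shut down, record dropped".to_owned())
            }
        }
    }
}

// Emits once while the worker side is alive and once after it is gone.
fn demo_closed_channel() -> (Result<(), String>, Result<(), String>) {
    let (processor, receiver) = BatchProcessor::new(8);
    let before = processor.emit("first");
    // Dropping the receiver stands in for shutdown: the worker has exited.
    drop(receiver);
    let after = processor.emit("second");
    (before, after)
}
```

The caller would log the returned error instead of checking a flag before every send, so the normal path carries no extra check.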
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files
@@           Coverage Diff           @@
##            main   #2462   +/-   ##
=====================================
  Coverage   76.9%   76.9%
=====================================
  Files        123     123
  Lines      22581   22548    -33
=====================================
- Hits       17379   17359    -20
+ Misses      5202    5189    -13
Changes
To discuss the change suggested by @cijothomas here - #2381 (comment)
Merge requirement checklist
CHANGELOG.md files updated for non-trivial, user-facing changes