Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GlobalOpenTelemetry.get and set Conflict Causing Runtime Exception in OpenTelemetry SDK #6937

Open
celikrecep opened this issue Dec 10, 2024 · 3 comments
Labels
Bug Something isn't working

Comments

@celikrecep
Copy link

Hello, I am using OpenTelemetry version 1.44.0 in my Android project. On Crashlytics, I see the following error: "GlobalOpenTelemetry.set". This error doesn’t happen all the time but occurs for some users. The root cause of the error is as follows:

io.opentelemetry.api.GlobalOpenTelemetry.set (GlobalOpenTelemetry.java:115)
io.opentelemetry.api.GlobalOpenTelemetry.get (GlobalOpenTelemetry.java:85)
io.opentelemetry.api.GlobalOpenTelemetry.getMeterProvider (GlobalOpenTelemetry.java:174)
io.opentelemetry.exporter.internal.ExporterMetrics.meter (ExporterMetrics.java:91)
io.opentelemetry.exporter.internal.ExporterMetrics.seen (ExporterMetrics.java:74)
io.opentelemetry.exporter.internal.ExporterMetrics.addSeen (ExporterMetrics.java:58)
io.opentelemetry.exporter.internal.http.HttpExporter.export (HttpExporter.java:58)
io.opentelemetry.exporter.otlp.http.logs.OtlpHttpLogRecordExporter.export (OtlpHttpLogRecordExporter.java:102)
io.opentelemetry.sdk.logs.export.BatchLogRecordProcessor$Worker.exportCurrentBatch (BatchLogRecordProcessor.java:321)
io.opentelemetry.sdk.logs.export.BatchLogRecordProcessor$Worker.run (BatchLogRecordProcessor.java:238)
java.lang.Thread.run (Thread.java:929)

Prior to this, the stack trace points to the Dagger module where I provide OpenTelemetry, as follows:

io.opentelemetry.sdk.OpenTelemetrySdkBuilder.buildAndRegisterGlobal (OpenTelemetrySdkBuilder.java:85)

I provide OpenTelemetry using Dagger Hilt with the following code:

OpenTelemetrySdk.builder()
    .apply { contextPropagators.ifPresent { setPropagators(it) } }
    .apply { tracerProvider.ifPresent { setTracerProvider(it) } }
    .apply { sdkLoggerProvider.ifPresent { setLoggerProvider(it) } }
    .apply { sdkMeterProvider.ifPresent { setMeterProvider(it) } }
    .buildAndRegisterGlobal()

Afterward, I perform all operations through this OpenTelemetry instance without calling the set method again.

When I follow the stack trace, the issue seems to occur because the get() method calls set() since GlobalOpenTelemetry is null. However, within the set() method, an exception is thrown because GlobalOpenTelemetry is not null.

Despite trying various approaches, I have been unable to reproduce this error. Could you please help me with this issue?

Artifacts: (opentelemetry-bom, opentelemetry-sdk, opentelemetry-api, opentelemetry-exporter-otlp, opentelemetry-exporter-logging, opentelemetry-exporter-otlp-common)

Version: (1.44.0)

How did you reference these artifacts?
build.gradle

Note: I’m not sure if it matters, but when using version 1.40.0, this error occurred very rarely—maybe once a week at most for users.

Please also share if you have any suggestion regarding how to initialize OpenTelemetry before any of the GlobalOpenTelemetry APIs get invoked automatically by the SDK?

@celikrecep celikrecep added the Bug Something isn't working label Dec 10, 2024
@jack-berg
Copy link
Member

@jkwatson I think there's a possible race condition.

  • OtlpHttpLogRecordExporter (and other exports) capture internal metrics with a lazily initialized meter from GlobalOpenTelemetry. See here.
  • BatchLogRecordProcessor's worker thread starts immediately when built, pulling logs off the queue and calling exporting them via OtlpHttpLogRecordExporter, and causing OtlpHttpLogRecordExporter's lazily initialized meter to initialize.
  • This means as soon as SdkLoggerProvider is plugged into instrumentation and has log records flowing, GlobalOpenTelemetry.get() is going to be called. And if you recall, GlobalOpenTelemetry.get() is guaranteed to always return the same result, so calling set after get produces an error.
  • So the race is:
    • A user (or autoconfiguration process) initializes a SdkLoggerProvider with BatchLoggerProvider and OtlpHttpLogRecordExporter
    • They plug the SdkLoggerProvider into instrumentation, which records logs
    • Moments later, they call GlobalOpenTelemetry.set with the SdkLoggerProvider they've just constructed
    • There is a race between when the BatchLogRecordProcessor's worker exports the first batch, causing OtlpHttpLogRecordExporter's meter to be lazily initialized, calling GlobalOpenTelemetry.get(), and when GlobalOpenTelemetry.set is called by the setup process

@breedx-splk do you know of any instances in the android setup process where the autoconfigured SdkLoggerProvider might be used to emit logs (or events) before GlobalOpenTelemetry.set is called?

@breedx-splk
Copy link
Contributor

@breedx-splk do you know of any instances in the android setup process where the autoconfigured SdkLoggerProvider might be used to emit logs (or events) before GlobalOpenTelemetry.set is called?

I don't know of an instance like that right now, but it sounds like @celikrecep might not be using opentelemetry-android at all and is just leveraging the api+sdk directly? Can we confirm?

We do have a scenario where we need to emit events during library initialization, before the otel sdk has been created...and in those cases we have to go through some pains to buffer/store those in a separate model before we can actually emit them with otel.

@celikrecep
Copy link
Author

yes, i'm just using api+sdk, i'm not not using opentelemetry-android @breedx-splk. If I use build() instead of buildRegisterAndGlobal(), GlobalOpenTelemetry will remain as no-op. Would this break or affect metrics and baggagePropagator? Or do you have any suggestions to avoid a crash?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants