Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor(otlp-transformer): json schema based validation #5265

Open
wants to merge 2 commits into
base: next
Choose a base branch
from

Conversation

SebastienGllmt
Copy link

Which problem is this PR solving?

The goal of oltp-transformer is to provide JS-friendly (de)serialization logic for traces, metrics and logs

However, currently, there is no deserialization logic (mostly equivalent to a no-op) for JSON and there are only placeholder types given.

This PR aims to make a big step towards providing proper deserialization by defining a JSON schema for the various formats (defined using Typebox)

This means you can now deserialize a trace for example with the following code

import { Value } from '@sinclair/typebox/value';

const requestBody = ...; // some raw data you got from somewhere
const parsed = Value.Parse(TExportLogsServiceRequest, request.body);
//        ^? this now has the proper type definition

Short description of the changes

Introduce JSON Schema types for log, trace and metrics in otlp-transformer as a setup for proper deserialization

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

The previous tests all still pass with the new refactoring.

You can verify yourself with npm run test inside `experimental/packages/otlp-transformer

Checklist:

  • Followed the style guidelines of this project
  • Unit tests have been added
  • Documentation has been updated

In preparation of stabilizing `@opentelemetry/otlp-transformer`,
this commit introduces some new entrypoints for the package:

* `@opentelemetry/otlp-transformer/protobuf`: utilities for working
  with the OTLP binary protobuf format
* `@opentelemetry/otlp-transformer/json`: utilities for working
  with the OTLP JSON format
* `@opentelemetry/otlp-transformer/experimental`: features to
  remain in experimental status post-stabilization

The intent of separating out the first two entrypoints is to both
aid bundlers with tree-shaking, but also to prevent the irrelevant
code from running at all, since the generated prtobuf code is known
to cause problems in certain environments (e.g. see open-telemetry#4987, open-telemetry#5096).

The last of those entrypoints is currently empty, but expected to
be utilized in future commits as features are triaged as part of
the stabilization effort.

Fixes open-telemetry#5216
@SebastienGllmt SebastienGllmt requested a review from a team as a code owner December 13, 2024 15:53
Copy link

linux-foundation-easycla bot commented Dec 13, 2024

CLA Signed

The committers listed above are authorized under a signed CLA.

  • ✅ login: chancancode / name: Godfrey Chan (c0d0966)
  • ✅ login: SebastienGllmt / name: Sebastien Guillemot (7bfdb6c)

@@ -77,7 +77,7 @@ describe('LogLevelFilter DiagLogger', () => {

const levelMap: Array<{
message: string;
level: DiagLogLevel;
level: number | DiagLogLevel;
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is because this test is purposely using invalid enum values to check that the function handles them correctly. Typescript v4 didn't complain about invalid values, but Typescript v5 did so I added number to explicitly tell it we're using invalid numbers

@@ -19,6 +19,10 @@ All notable changes to experimental packages in this project will be documented

### :boom: Breaking Change

* feat(otlp-transformer)!: add new entrypoints for non-core features [#5259](https://github.com/open-telemetry/opentelemetry-js/pull/5259/)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: this PR is based on #5259 to avoid ugly merge work once their PR is merged. It makes the diff unfortunately ugly until their PR is merged though

@@ -77,10 +115,11 @@
"nyc": "15.1.0",
"protobufjs-cli": "1.1.3",
"ts-loader": "9.5.1",
"typescript": "4.4.4",
"typescript": "5.7.2",
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to upgrade to Typescript v5 for typebox to work properly. Since this PR is targeting the next branch which is meant to allow newer typescript versions, I assume this isn't an issue (although I'm not sure which specific typescript v5 version you want to use)

/**
* Numerical value of the severity, normalized to values described in Log Data Model.
*/
export const enum ESeverityNumber {
export enum ESeverityNumber {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to convert a few const enum to regular enums in this PR to make the JSON Schema validation work correctly. otlp-transformer is the only place in the entire codebase that uses const enum (everywhere else uses regular enums), so probably these shouldn't have been const enum in the first place anyway

Copy link
Author

@SebastienGllmt SebastienGllmt Dec 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Presumably the use of const enum was motivated by bullet point number 2 in the spec below

Values of enum fields MUST be encoded as integer values`

https://github.com/open-telemetry/opentelemetry-proto/blob/main/docs/specification.md#json-protobuf-encoding

Fortunately, Typebox handles (non-const) enums properly in this case so this isn't an issue

enum Foo {
  foo = 1,
}
const box = Type.Enum(Foo);

const val: Static<typeof box> = 1;
const key: Static<typeof box> = 'foo'; // this is correctly an error

resourceSpans?: IResourceSpans[];
}
import { TFixed64, TInstrumentationScope, TKeyValue, TResource } from '../common/types';
import { Type, type Static } from "@sinclair/typebox";

export interface IExportTraceServiceResponse {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that I left these top-level "response" types untouches (so this PR doesn't actually change the deserialization behavior - it just does the ground work required to make it work)

The reason why is because right now the deserialization code is just some placeholders and I wasn't sure what you had planned for how to use them. Instead of making an opinionated changed, I wanted to keep the scope of this PR small where it just introduces the groundwork required for deserialization, then updating the deserialization logic can be done in a separate PR

@pichlermarc
Copy link
Member

De-serialization for anything other than export responses is out of scope for this package as it's not needed for our exporters. @opentelemetry/otlp-transformer is intended for internal use only. If someone needs to deserialize OTLP, be it protobuf or JSON, they can likely generate code from https://github.com/open-telemetry/opentelemetry-proto.

Is there something that I'm missing here?

@SebastienGllmt SebastienGllmt force-pushed the oltp-transformer-typebox branch from 6063edb to af3c301 Compare December 13, 2024 16:22
@SebastienGllmt SebastienGllmt force-pushed the oltp-transformer-typebox branch from af3c301 to 7bfdb6c Compare December 13, 2024 16:36
Comment on lines +37 to +45
IAnyValue: Type.Object({
stringValue: Type.Optional(Type.Union([Type.String(), Type.Null()])),
boolValue: Type.Optional(Type.Union([Type.Boolean(), Type.Null()])),
intValue: Type.Optional(Type.Union([Type.Number(), Type.Null()])),
doubleValue: Type.Optional(Type.Union([Type.Number(), Type.Null()])),
arrayValue: Type.Optional(Type.Ref("IArrayValue")),
kvlistValue: Type.Optional(Type.Ref("IKeyValueList")),
bytesValue: Type.Optional(Type.Uint8Array()),
}),
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original type for this before my PR was given as an intersection of possible keys

However, possibly this was meant to be a union? Is it possible for multiple of these keys to be present, or is it always at most one of them?

Copy link
Author

@SebastienGllmt SebastienGllmt Dec 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like, given the toAnyValue implementation, this is indeed supposed to be a union

"esnext": "build/esnext/index.js",
"types": "build/src/index.d.ts",
"main": "build/src/index.js",
"exports": {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another alternative to this PR could be to modify this to expose the generated code: https://github.com/open-telemetry/opentelemetry-js/pull/5259/files#r1884673988

@pichlermarc
Copy link
Member

De-serialization for anything other than export responses is out of scope for this package as it's not needed for our exporters. @opentelemetry/otlp-transformer is intended for internal use only. If someone needs to deserialize OTLP, be it protobuf or JSON, they can likely generate code from https://github.com/open-telemetry/opentelemetry-proto.

Is there something that I'm missing here?

@SebastienGllmt in case you did not see my comment last week.

@SebastienGllmt
Copy link
Author

@pichlermarc I think what to depends on the plans around whether or not you want to expose code-generation: #5259 (comment)

Right now the code is a kind of weird split where the types are manual but the JS code is code-generated (to avoid importing the protobuf package presumably). This PR moves both types & JS to be explicit in JS, whereas the comment in the other issue is the other option (have both come from the code-generation and exposing that to end-users)

@pichlermarc
Copy link
Member

@pichlermarc I think what to depends on the plans around whether or not you want to expose code-generation: #5259 (comment)

Right now the code is a kind of weird split where the types are manual but the JS code is code-generated (to avoid importing the protobuf package presumably). This PR moves both types & JS to be explicit in JS, whereas the comment in the other issue is the other option (have both come from the code-generation and exposing that to end-users)

Mhm, the JS code is generated to avoid having to load the proto from the file system, with is incompatible with bundlers. It used to be scattered all over the exporter code and I've consolidated it all into @opentelemetry/otlp-transformer. I have then removed any excessive public exports (see #4582, #4583) to enable bundle-size and performance improvements in SDK 2.0 (see #4385, @dyladan is working on this). Since I assume that these improvements and any follow-up changes will modify the internals significantly, and to make review of my refactoring moving preexisting code from the exporter packages to @opentelemetry/otlp-transformer easier, I did not bother improving the code there yet.

The @opentelemetry/otlp-transformer package is only intended to take care of conversion from the internal SDK data format to OTLP, and to de-serialize the response for the purpose of OTLP export. Since that is the case, we will not export any internals to end-users. We will not add features other than these as it's out-of-scope for the package and the project.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants