[receiver/github] add tracing via webhook skeleton (open-telemetry#36632)

#### Description
Adds the basic webhook configuration and logic, with a health check, to
enable development of tracing (and logs) in future iterations.

#### Testing
Added basic tests and built the component to test that the health check
endpoint, when tracing is enabled, operates correctly.

#### Documentation
Because this portion of the receiver is in development, and adds only
the skeleton, no docs have been added yet.
adrielp authored and sbylica-splunk committed Dec 17, 2024
1 parent 7472bb2 commit f345276
Showing 15 changed files with 439 additions and 45 deletions.
30 changes: 30 additions & 0 deletions .chloggen/gh-trace-skeleton.yaml
@@ -0,0 +1,30 @@
# Use this changelog template to create an entry for release notes.

# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix'
change_type: enhancement

# The name of the component, or a single word describing the area of concern, (e.g. filelogreceiver)
component: githubreceiver

# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`).
note: Adds webhook skeleton to GitHub receiver to receive events from GitHub for tracing.

# Mandatory: One or more tracking issues related to the change. You can use the PR number here if no issue exists.
issues: [27460]

# (Optional) One or more lines of additional information to render under the primary note.
# These lines will be padded with 2 spaces and then inserted directly into the document.
# Use pipe (|) for multiline entries.
subtext: |
  This PR adds a skeleton for the GitHub receiver to receive events from GitHub
  for tracing via a webhook. The trace portion of this receiver will run and
  respond to GET requests for the health check only.
# If your change doesn't affect end users or the exported elements of any package,
# you should instead start your pull request title with [chore] or use the "Skip Changelog" label.
# Optional: The change log or logs in which this entry should be included.
# e.g. '[user]' or '[user, api]'
# Include 'user' if the change is relevant to end users.
# Include 'api' if there is a change to a library API.
# Default: '[user]'
change_logs: [user]
65 changes: 59 additions & 6 deletions receiver/githubreceiver/README.md
@@ -3,18 +3,18 @@
<!-- status autogenerated section -->
| Status | |
| ------------- |-----------|
-| Stability | [alpha]: metrics |
+| Stability | [development]: traces |
+| | [alpha]: metrics |
| Distributions | [contrib] |
| Issues | [![Open issues](https://img.shields.io/github/issues-search/open-telemetry/opentelemetry-collector-contrib?query=is%3Aissue%20is%3Aopen%20label%3Areceiver%2Fgithub%20&label=open&color=orange&logo=opentelemetry)](https://github.com/open-telemetry/opentelemetry-collector-contrib/issues?q=is%3Aopen+is%3Aissue+label%3Areceiver%2Fgithub) [![Closed issues](https://img.shields.io/github/issues-search/open-telemetry/opentelemetry-collector-contrib?query=is%3Aissue%20is%3Aclosed%20label%3Areceiver%2Fgithub%20&label=closed&color=blue&logo=opentelemetry)](https://github.com/open-telemetry/opentelemetry-collector-contrib/issues?q=is%3Aclosed+is%3Aissue+label%3Areceiver%2Fgithub) |
| [Code Owners](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/CONTRIBUTING.md#becoming-a-code-owner) | [@adrielp](https://www.github.com/adrielp), [@andrzej-stencel](https://www.github.com/andrzej-stencel), [@crobert-1](https://www.github.com/crobert-1), [@TylerHelmuth](https://www.github.com/TylerHelmuth) |

+[development]: https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/component-stability.md#development
[alpha]: https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/component-stability.md#alpha
[contrib]: https://github.com/open-telemetry/opentelemetry-collector-releases/tree/main/distributions/otelcol-contrib
<!-- end autogenerated section -->

-The GitHub receiver receives data from [GitHub](https://github.com). As a
-starting point it scrapes metrics from repositories but will be extended to
-include traces and logs.
+The GitHub receiver receives data from [GitHub](https://github.com).

The current default set of metrics can be found in
[documentation.md](./documentation.md).
@@ -26,13 +26,13 @@ engineering practices.
[doracap]: https://dora.dev/capabilities/
[dorafour]: https://dora.dev/guides/dora-metrics-four-keys/

-## Getting Started
+## Metrics - Getting Started

The collection interval is common to all scrapers and is set to 30 seconds by default.

> Note: Generally speaking, if the vendor allows for anonymous API calls, then you
> won't have to configure any authentication, but you may only see public repositories
-> and organizations. You may run into significantly more rate limiting.
+> and organizations. You may also run into significantly more rate limiting.
```yaml
github:
@@ -92,3 +92,56 @@ For additional context on GitHub scraper limitations and inner workings please
see the [Scraping README][ghsread].
[ghsread]: internal/scraper/githubscraper/README.md#github-limitations
## Traces - Getting Started
Workflow tracing support is actively being added to the GitHub receiver.
This is accomplished through the processing of GitHub Actions webhook
events for workflows and jobs. The [`workflow_job`][wjob] and
[`workflow_run`][wrun] event payloads are then constructed into `trace`
telemetry.

Each GitHub Actions workflow or job, along with its steps, is converted
into trace spans, allowing the observation of workflow execution times,
success, and failure rates.

### Configuration

**IMPORTANT: At this time the tracing portion of this receiver only serves a health check endpoint.**

The WebHook configuration exposes the following settings:

* `endpoint`: (default = `localhost:8080`) - The address and port to bind the WebHook to.
* `path`: (default = `/events`) - The path for Action events to be sent to.
* `health_path`: (default = `/health`) - The path for health checks.
* `secret`: (optional) - The secret used to [validate the payload][valid].
* `required_header`: (optional) - The required header key and value for incoming requests.

The WebHook configuration block also accepts all the [confighttp][cfghttp]
settings.

An example configuration is as follows:

```yaml
receivers:
  github:
    scrapers:
      ... <scraper configuration>: # Scraper configurations are required until Tracing functionality is complete.
    webhook:
      endpoint: localhost:19418
      path: /events
      health_path: /health
      secret: ${env:SECRET_STRING_VAR}
      required_header:
        key: "X-GitHub-Event"
        value: "action"
```

For tracing, all configuration is set under the `webhook` key. The full set
of exposed configuration values can be found in [`config.go`][config.go].

[wjob]: https://docs.github.com/en/webhooks/webhook-events-and-payloads#workflow_job
[wrun]: https://docs.github.com/en/webhooks/webhook-events-and-payloads#workflow_run
[valid]: https://docs.github.com/en/webhooks/using-webhooks/validating-webhook-deliveries
[config.go]: ./config.go
[cfghttp]: https://pkg.go.dev/go.opentelemetry.io/collector/config/confighttp#ServerConfig
50 changes: 48 additions & 2 deletions receiver/githubreceiver/config.go
100644 → 100755
@@ -6,10 +6,13 @@ package githubreceiver // import "github.com/open-telemetry/opentelemetry-collec
import (
"errors"
"fmt"
"time"

"go.opentelemetry.io/collector/component"
"go.opentelemetry.io/collector/config/confighttp"
"go.opentelemetry.io/collector/confmap"
"go.opentelemetry.io/collector/receiver/scraperhelper"
"go.uber.org/multierr"

"github.com/open-telemetry/opentelemetry-collector-contrib/receiver/githubreceiver/internal"
"github.com/open-telemetry/opentelemetry-collector-contrib/receiver/githubreceiver/internal/metadata"
@@ -24,19 +27,62 @@ type Config struct {
scraperhelper.ControllerConfig `mapstructure:",squash"`
Scrapers map[string]internal.Config `mapstructure:"scrapers"`
metadata.MetricsBuilderConfig `mapstructure:",squash"`
WebHook WebHook `mapstructure:"webhook"`
}

+type WebHook struct {
+	confighttp.ServerConfig `mapstructure:",squash"`       // squash ensures fields are correctly decoded in embedded struct
+	Path                    string         `mapstructure:"path"`            // path for data collection. Default is /events
+	HealthPath              string         `mapstructure:"health_path"`     // path for health check api. Default is /health
+	RequiredHeader          RequiredHeader `mapstructure:"required_header"` // optional setting to set a required header for all requests to have
+	Secret                  string         `mapstructure:"secret"`          // secret for webhook
+}
+
+type RequiredHeader struct {
+	Key   string `mapstructure:"key"`
+	Value string `mapstructure:"value"`
+}

var (
_ component.Config = (*Config)(nil)
_ confmap.Unmarshaler = (*Config)(nil)

errMissingEndpointFromConfig = errors.New("missing receiver server endpoint from config")
errReadTimeoutExceedsMaxValue = errors.New("the duration specified for read_timeout exceeds the maximum allowed value of 10s")
errWriteTimeoutExceedsMaxValue = errors.New("the duration specified for write_timeout exceeds the maximum allowed value of 10s")
errRequiredHeader = errors.New("both key and value are required to assign a required_header")
errRequireOneScraper = errors.New("must specify at least one scraper")
)

// Validate the configuration passed through the OTEL config.yaml
 func (cfg *Config) Validate() error {
+	var errs error
+
+	// For now, scrapers are required to be defined in the config. As tracing
+	// and other signals are added, this requirement will change.
 	if len(cfg.Scrapers) == 0 {
-		return errors.New("must specify at least one scraper")
+		errs = multierr.Append(errs, errRequireOneScraper)
 	}
-	return nil
+
+	maxReadWriteTimeout, _ := time.ParseDuration("10s")
+
+	if cfg.WebHook.ServerConfig.Endpoint == "" {
+		errs = multierr.Append(errs, errMissingEndpointFromConfig)
+	}
+
+	if cfg.WebHook.ServerConfig.ReadTimeout > maxReadWriteTimeout {
+		errs = multierr.Append(errs, errReadTimeoutExceedsMaxValue)
+	}
+
+	if cfg.WebHook.ServerConfig.WriteTimeout > maxReadWriteTimeout {
+		errs = multierr.Append(errs, errWriteTimeoutExceedsMaxValue)
+	}
+
+	if (cfg.WebHook.RequiredHeader.Key != "" && cfg.WebHook.RequiredHeader.Value == "") || (cfg.WebHook.RequiredHeader.Value != "" && cfg.WebHook.RequiredHeader.Key == "") {
+		errs = multierr.Append(errs, errRequiredHeader)
+	}
+
+	return errs
 }

// Unmarshal a config.Parser into the config struct.
36 changes: 33 additions & 3 deletions receiver/githubreceiver/config_test.go
@@ -11,6 +11,7 @@ import (
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
"go.opentelemetry.io/collector/component"
"go.opentelemetry.io/collector/config/confighttp"
"go.opentelemetry.io/collector/confmap"
"go.opentelemetry.io/collector/otelcol/otelcoltest"
"go.opentelemetry.io/collector/receiver/scraperhelper"
@@ -26,6 +27,7 @@ func TestLoadConfig(t *testing.T) {

factory := NewFactory()
factories.Receivers[metadata.Type] = factory

// https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/33594
// nolint:staticcheck
cfg, err := otelcoltest.LoadConfigAndValidate(filepath.Join("testdata", "config.yaml"), factories)
@@ -36,12 +38,27 @@ func TestLoadConfig(t *testing.T) {
assert.Len(t, cfg.Receivers, 2)

r0 := cfg.Receivers[component.NewID(metadata.Type)]
-	defaultConfigGitHubScraper := factory.CreateDefaultConfig()
-	defaultConfigGitHubScraper.(*Config).Scrapers = map[string]internal.Config{
+	defaultConfigGitHubReceiver := factory.CreateDefaultConfig()
+
+	defaultConfigGitHubReceiver.(*Config).Scrapers = map[string]internal.Config{
 		metadata.Type.String(): (&githubscraper.Factory{}).CreateDefaultConfig(),
 	}

-	assert.Equal(t, defaultConfigGitHubScraper, r0)
+	defaultConfigGitHubReceiver.(*Config).WebHook = WebHook{
+		ServerConfig: confighttp.ServerConfig{
+			Endpoint:     "localhost:8080",
+			ReadTimeout:  500 * time.Millisecond,
+			WriteTimeout: 500 * time.Millisecond,
+		},
+		Path:       "some/path",
+		HealthPath: "health/path",
+		RequiredHeader: RequiredHeader{
+			Key:   "key-present",
+			Value: "value-present",
+		},
+	}
+
+	assert.Equal(t, defaultConfigGitHubReceiver, r0)

r1 := cfg.Receivers[component.NewIDWithName(metadata.Type, "customname")].(*Config)
expectedConfig := &Config{
@@ -52,6 +69,19 @@ func TestLoadConfig(t *testing.T) {
Scrapers: map[string]internal.Config{
metadata.Type.String(): (&githubscraper.Factory{}).CreateDefaultConfig(),
},
WebHook: WebHook{
ServerConfig: confighttp.ServerConfig{
Endpoint: "localhost:8080",
ReadTimeout: 500 * time.Millisecond,
WriteTimeout: 500 * time.Millisecond,
},
Path: "some/path",
HealthPath: "health/path",
RequiredHeader: RequiredHeader{
Key: "key-present",
Value: "value-present",
},
},
}

assert.Equal(t, expectedConfig, r1)
41 changes: 35 additions & 6 deletions receiver/githubreceiver/factory.go
@@ -7,8 +7,10 @@ import (
"context"
"errors"
"fmt"
"time"

"go.opentelemetry.io/collector/component"
"go.opentelemetry.io/collector/config/confighttp"
"go.opentelemetry.io/collector/consumer"
"go.opentelemetry.io/collector/receiver"
"go.opentelemetry.io/collector/receiver/scraperhelper"
@@ -21,6 +23,14 @@ import (

// This file implements a factory for the github receiver

const (
defaultReadTimeout = 500 * time.Millisecond
defaultWriteTimeout = 500 * time.Millisecond
defaultPath = "/events"
defaultHealthPath = "/health"
defaultEndpoint = "localhost:8080"
)

var (
scraperFactories = map[string]internal.ScraperFactory{
metadata.Type.String(): &githubscraper.Factory{},
@@ -35,6 +45,7 @@ func NewFactory() receiver.Factory {
metadata.Type,
createDefaultConfig,
receiver.WithMetrics(createMetricsReceiver, metadata.MetricsStability),
receiver.WithTraces(createTracesReceiver, metadata.TracesStability),
)
}

@@ -51,12 +62,15 @@ func getScraperFactory(key string) (internal.ScraperFactory, bool) {
 func createDefaultConfig() component.Config {
 	return &Config{
 		ControllerConfig: scraperhelper.NewDefaultControllerConfig(),
-		// TODO: metrics builder configuration may need to be in each sub scraper,
-		// TODO: for right now setting here because the metrics in this receiver will apply to all
-		// TODO: scrapers defined as a common set of github
-		// TODO: aqp completely remove these comments if the metrics build config
-		// needs to be defined in each scraper
-		// MetricsBuilderConfig: metadata.DefaultMetricsBuilderConfig(),
+		WebHook: WebHook{
+			ServerConfig: confighttp.ServerConfig{
+				Endpoint:     defaultEndpoint,
+				ReadTimeout:  defaultReadTimeout,
+				WriteTimeout: defaultWriteTimeout,
+			},
+			Path:       defaultPath,
+			HealthPath: defaultHealthPath,
+		},
 	}
 }

@@ -87,6 +101,21 @@ func createMetricsReceiver(
)
}

+func createTracesReceiver(
+	_ context.Context,
+	params receiver.Settings,
+	cfg component.Config,
+	consumer consumer.Traces,
+) (receiver.Traces, error) {
+	// check that the configuration is valid
+	conf, ok := cfg.(*Config)
+	if !ok {
+		return nil, errConfigNotValid
+	}
+
+	return newTracesReceiver(params, conf, consumer)
+}

func createAddScraperOpts(
ctx context.Context,
params receiver.Settings,
6 changes: 3 additions & 3 deletions receiver/githubreceiver/factory_test.go
@@ -30,11 +30,11 @@ func TestCreateDefaultConfig(t *testing.T) {

 func TestCreateReceiver(t *testing.T) {
 	factory := NewFactory()
-	cfg := factory.CreateDefaultConfig()
+	cfg := factory.CreateDefaultConfig().(*Config)

 	tReceiver, err := factory.CreateTraces(context.Background(), creationSet, cfg, consumertest.NewNop())
-	assert.Equal(t, err, pipeline.ErrSignalNotSupported)
-	assert.Nil(t, tReceiver)
+	assert.NoError(t, err)
+	assert.NotNil(t, tReceiver)

 	mReceiver, err := factory.CreateMetrics(context.Background(), creationSet, cfg, consumertest.NewNop())
 	assert.NoError(t, err)
7 changes: 7 additions & 0 deletions receiver/githubreceiver/generated_component_test.go
