-
Notifications
You must be signed in to change notification settings - Fork 89
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Include generated object files in commit history and published packages #1233
Comments
#695
I’m not really in favor of this without a really strong reason due to the
complications that binary files create with Git. I don’t think that is
worth the convenience over (for example) a generator that can be run via
Make, so I see it more as a tooling issue than a library issue.
…On Wed, Oct 30, 2024 at 5:03 PM Tyler Yahn ***@***.***> wrote:
This issue has been discussed before, I can't seem to find where though.
The goal would be to have the released go.opentelemetry.io/auto package
be a self-contained distribution.
Currently, when a project takes a dependency on go.opentelemetry.io/auto
the project will not compile:
package main
import "go.opentelemetry.io/auto"
func main() {
i := auto.NewInstrumentation()
_ = i.Close()
}
$ go version
go version go1.23.2 linux/amd64
$ cat go.mod
module testing/autoimprt
go 1.23.2
require go.opentelemetry.io/auto v0.16.0-alpha
require (
github.com/beorn7/perks v1.0.1 // indirect
github.com/cenkalti/backoff/v4 v4.3.0 // indirect
github.com/cespare/xxhash/v2 v2.3.0 // indirect
github.com/cilium/ebpf v0.16.0 // indirect
github.com/go-logr/logr v1.4.2 // indirect
github.com/go-logr/stdr v1.2.2 // indirect
github.com/google/uuid v1.6.0 // indirect
github.com/grpc-ecosystem/grpc-gateway/v2 v2.22.0 // indirect
github.com/hashicorp/go-version v1.7.0 // indirect
github.com/klauspost/compress v1.17.11 // indirect
github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822 // indirect
github.com/pkg/errors v0.9.1 // indirect
github.com/prometheus/client_golang v1.20.4 // indirect
github.com/prometheus/client_model v0.6.1 // indirect
github.com/prometheus/common v0.60.0 // indirect
github.com/prometheus/procfs v0.15.1 // indirect
go.opentelemetry.io/auto/sdk v0.1.0-alpha // indirect
go.opentelemetry.io/contrib/bridges/prometheus v0.56.0 // indirect
go.opentelemetry.io/contrib/exporters/autoexport v0.56.0 // indirect
go.opentelemetry.io/otel v1.31.0 // indirect
go.opentelemetry.io/otel/exporters/otlp/otlplog/otlploggrpc v0.7.0 // indirect
go.opentelemetry.io/otel/exporters/otlp/otlplog/otlploghttp v0.7.0 // indirect
go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetricgrpc v1.31.0 // indirect
go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp v1.31.0 // indirect
go.opentelemetry.io/otel/exporters/otlp/otlptrace v1.31.0 // indirect
go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc v1.31.0 // indirect
go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp v1.31.0 // indirect
go.opentelemetry.io/otel/exporters/prometheus v0.53.0 // indirect
go.opentelemetry.io/otel/exporters/stdout/stdoutlog v0.7.0 // indirect
go.opentelemetry.io/otel/exporters/stdout/stdoutmetric v1.31.0 // indirect
go.opentelemetry.io/otel/exporters/stdout/stdouttrace v1.31.0 // indirect
go.opentelemetry.io/otel/log v0.7.0 // indirect
go.opentelemetry.io/otel/metric v1.31.0 // indirect
go.opentelemetry.io/otel/sdk v1.31.0 // indirect
go.opentelemetry.io/otel/sdk/log v0.7.0 // indirect
go.opentelemetry.io/otel/sdk/metric v1.31.0 // indirect
go.opentelemetry.io/otel/trace v1.31.0 // indirect
go.opentelemetry.io/proto/otlp v1.3.1 // indirect
golang.org/x/arch v0.11.0 // indirect
golang.org/x/exp v0.0.0-20230224173230-c95f2b4c22f2 // indirect
golang.org/x/net v0.30.0 // indirect
golang.org/x/sys v0.26.0 // indirect
golang.org/x/text v0.19.0 // indirect
google.golang.org/genproto/googleapis/api v0.0.0-20241007155032-5fefd90f89a9 // indirect
google.golang.org/genproto/googleapis/rpc v0.0.0-20241007155032-5fefd90f89a9 // indirect
google.golang.org/grpc v1.67.1 // indirect
google.golang.org/protobuf v1.35.1 // indirect
)
$ go build
..***@***.***/internal/pkg/instrumentation/bpf/database/sql/bpf_x86_bpfel.go:165:12: pattern bpf_x86_bpfel.o: no matching files found
..***@***.***/internal/pkg/instrumentation/bpf/github.com/segmentio/kafka-go/consumer/bpf_x86_bpfel.go:172:12: pattern bpf_x86_bpfel.o: no matching files found
..***@***.***/internal/pkg/instrumentation/bpf/github.com/segmentio/kafka-go/producer/bpf_x86_bpfel.go:167:12: pattern bpf_x86_bpfel.o: no matching files found
..***@***.***/internal/pkg/instrumentation/bpf/go.opentelemetry.io/auto/sdk/bpf_x86_bpfel.go:159:12: pattern bpf_x86_bpfel.o: no matching files found
..***@***.***/internal/pkg/instrumentation/bpf/go.opentelemetry.io/otel/traceglobal/bpf_x86_bpfel.go:213:12: pattern bpf_x86_bpfel.o: no matching files found
..***@***.***/internal/pkg/instrumentation/bpf/google.golang.org/grpc/client/bpf_x86_bpfel.go:170:12: pattern bpf_x86_bpfel.o: no matching files found
..***@***.***/internal/pkg/instrumentation/bpf/google.golang.org/grpc/server/bpf_x86_bpfel.go:169:12: pattern bpf_x86_bpfel.o: no matching files found
..***@***.***/internal/pkg/instrumentation/bpf/net/http/client/bpf_no_tp_x86_bpfel.go:182:12: pattern bpf_no_tp_x86_bpfel.o: no matching files found
..***@***.***/internal/pkg/instrumentation/bpf/net/http/server/bpf_x86_bpfel.go:174:12: pattern bpf_x86_bpfel.o: no matching files found
We should ship a Go module/package that is self-contained. It should not
require any user to generate code within their package cache.
cc @damemi <https://github.com/damemi> @RonFed <https://github.com/RonFed>
—
Reply to this email directly, view it on GitHub
<#1233>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAOA77OFY2RJYDKU4ZOTTA3Z6FCSTAVCNFSM6AAAAABQ47ZXUKVHI2DSMVQWIX3LMV43ASLTON2WKOZSGYZDKMZTHAYTCNA>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
The thing is, you cannot execute a make task when someone downloads the package. What we ship now is a broken package. |
I meant a make task to run go generate in the module cache. Which is not
pretty, but I would like us to look into other ways to make this easier
before we commit to committing the object files directly.
The BPF object files themselves aren’t code, they’re really release
artifacts. The code manages and depends on these artifacts to run, but to
me they seem kind of like DLLs in that sense. So, the package isn’t broken,
it’s just missing dependencies by default.
It looks like others have had this question before (
cilium/ebpf#988 with an interesting
solution). I don’t disagree that we should do *something* to make this
easier to use, but I’m hesitant to commit them.
…On Wed, Oct 30, 2024 at 5:22 PM Tyler Yahn ***@***.***> wrote:
The thing is, you cannot execute a make task when someone downloads the
package. What we ship now is a broken package.
—
Reply to this email directly, view it on GitHub
<#1233 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAOA77M2YPC3Z4ZAQDTCWK3Z6FEYZAVCNFSM6AAAAABQ47ZXUKVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDINBYGQYDQNJRGI>
.
You are receiving this because you were mentioned.Message ID:
<open-telemetry/opentelemetry-go-instrumentation/issues/1233/2448408512@
github.com>
|
This is talking about running a make target within the project generating the eBPF code. It is not talking about providing a release target to others downloading the Go module. That approach is going to be flawed as it will require someone who runs |
The code that we ship already directly depends on the files we do not release: opentelemetry-go-instrumentation/internal/pkg/instrumentation/bpf/database/sql/bpf_x86_bpfel.go Lines 163 to 166 in 77e8de9
This seems like something that should be in source code if we are already referencing it. |
I’m saying we should release those files, but I don’t think publishing them
in a Git commit is the best way to do that.
Are there other (non eBPF) Go packages that embed binary files and commit
them? I’m wondering if there is any precedent for this pattern outside of
eBPF or if we would just be trading one unusual situation for another.
Within eBPF-Go there isn’t much guidance. I linked that other issue to show
that the main problem (is it idiomatic to commit BPF objects to git) is
still pretty unanswered. Maybe this is a gap in the Cilium/bpf2go projects
themselves.
Again, I’m totally with you on publishing these files but I just want to
pause and see if we can think of a better approach, if possible.
…On Wed, Oct 30, 2024 at 6:32 PM Tyler Yahn ***@***.***> wrote:
The code that we ship already directly depends on the files we do not
release:
https://github.com/open-telemetry/opentelemetry-go-instrumentation/blob/77e8de9e0f4e33afb00bacb77917bad4f3056d1f/internal/pkg/instrumentation/bpf/database/sql/bpf_x86_bpfel.go#L163-L166
This seems like something that should be in source code if we are already
referencing it.
—
Reply to this email directly, view it on GitHub
<#1233 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAOA77LCFKYWCDRFDLAB4ADZ6FM7PAVCNFSM6AAAAABQ47ZXUKVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDINBYGU4DANJWGM>
.
You are receiving this because you were mentioned.Message ID:
<open-telemetry/opentelemetry-go-instrumentation/issues/1233/2448580563@
github.com>
|
Looking at the remove import paths documentation, we need to use some VCS as the protocol to download source code (i.e. commit all the code), or run our own module proxy. Looking at the documentation for module proxies, it seems like we could respond to This module proxy will need to be added as a managed service run by OTel and require additional step in our release process for uploading the release zip files. |
FWIW, here's an example of another eBPF project checking in their object files: https://github.com/k8spacket/k8spacket/blob/65e5268975cc41a4a1b5daed8a534666a8a4fab5/ebpf/tc/tc_bpfel.o |
I'm very supportive of improving this part - I agree with @damemi - committing these files to git should only be done if we can't find another reasonable approach. |
Another idea: we could make our releases in release branches. Those branches would contained checked-in I'm not sure if this is better or just the same with regard to binary data being checked in to git. |
We could also post-curate the |
@MrAlias the post-curation idea is really interesting. That seems like the safest compromise and the pattern already exists for protobuf. I think that's my favorite option, and we could even separate the generated files into their own sub packages if we wanted. Just thinking, you could make a philosophical argument here that while this is broken in the traditional Go sense, it's also not a very traditional Go library. I would absolutely expect almost any other Go package to not behave this way. But anyone who's using this, even transitively, should be aware that it depends on compiled eBPF objects... and so I wonder if the extra build/generate steps (like the make command @RonFed shared) could be an acceptable pattern? At least until some better guidance or alternative emerges? To formalize that option, I'd propose our docs include a section on "using this library" that clearly includes exactly what commands the user needs as a setup step after they run |
This object file representation will only be updated on releases or every time we change the c files?
With this approach, if a user A common question to most of the propsed soultions is whether we want to support imports of non-official releases (with commit hashes)? I think some of the options rely on just updating the compiled code in each official release. |
I don't thing there objectively is a difference. The difference is stylistic. There is precedence with protobuf generating binary data directly held in If others are okay with this approach, I'm okay as well. It is not my first choice though. |
Correct. I haven't built out a prototype yet, but all the documentation I have read indicates this will be the behavior. We will upload snapshots of our code (maybe already zipped(?)) to some module proxy server. Those snapshots will be served and they will contain all the |
That's a good question. We would have to investigate this further to see what kind of support we could provide. I know we can provide the standard Go proxy as a fallback to our proxy and it may all users to download directly from git history. The issue would be that those downloads would not have the desired object files. I expect there is some more engineering we could do to try and solve the issue though. |
Go mod proxies we could run BTW: |
The problem I see with the Go proxy solution is it doesn't support custom Probes (outside our repo), and makes it tougher for us to do a contrib-style repo. I guess third party probes can choose to provide their object files however they'd like, but a consistent approach would be nice |
Another suggestion from ebpf slack was to use Git LFS, which from a quick search looks like it should be compatible with Go modules. I don't know if that would help with the code reviews and diffs, but it would solve the repo bloat. I also like the release branch idea, and I think we should start using release branches anyway for backports once our release model is more stable. |
It looks like using Git LFS will mean all modules that take this one as a dependency will need to have it installed and configured in their development environments: https://stackoverflow.com/questions/58280942/import-a-golang-module-with-lfs-objects |
Yeah, that is the downside. But I think it's still an improvement over the weird go cache generate commands you need otherwise since "all" you have to do is install git lfs. By the way, when we finish the custom probes API #1105 this should contain the blast radius of this issue to just the probe modules themselves, since at that point the main auto-instrumentation framework won't have an explicit dependency on the probes. The problem still exists, but I think that in conjunction with one of the other solutions is going to be enough |
I also had an idea for a proposal we could make to the Cilium libraries to support runtime object storage: The compiled object bytes aren’t actually a build dependency, they’re only needed at runtime when the Cilium libraries load them into the kernel. The file just needs to be there for the Go compiler to build. So if we built a similar approach to Git LFS and had bpf2go embed a file that just held a pointer to the actual object, then there wouldn’t be any compile errors and you could support any kind of runtime object storage: local, remote, lfs, etc… just have Cilium read the bytes into memory at runtime. The pointer file format would just have to declare the protocol and location and use that if an actual local file isn’t available. There’s details to work out in the UX, like for example when developing you would want to be able to build a local file that takes precedence over whatever the pointer file says. The downside is that you would need a connection to wherever the remote file is. We could solve this by supporting a persistent local cache (ie, download it to /tmp and read from there) or even a runtime flag that points to a local directory where the objects live (for example, each release we just publish the objects as a GitHub artifact, users download the ones they want for the probes they’re using, and save that in their host/image/etc). The tradeoff there is file I/O, but you could cache it in memory for efficiency if it's going to be loading/unloading often. Wdyt? I'm pretty sure this is feasible but I have to look more into the details |
SIG notes:
|
More updates from the slack thread:
|
This issue has been discussed before, I can't seem to find where though. The goal would be to have the released
go.opentelemetry.io/auto
package be a self-contained distribution.Currently, when a project takes a dependency on
go.opentelemetry.io/auto
the project will not compile:We should ship a Go module/package that is self-contained. It should not require any user to generate code within their package cache.
cc @damemi @RonFed
Proposals
Check in object files to git
Run and manage our own Go module proxy
#1233 (comment)
Commit the object files into release branches
#1233 (comment)
Post-process Go code to contain the object file
#1233 (comment)
Document dependency generation steps
#1233 (comment)
Use Git LFS
#1233 (comment)
Update Cilium libraries to support runtime object storage
#1233 (comment)
The text was updated successfully, but these errors were encountered: