@RomanDzhabarov On the overhead, I don't mean computational, and +1 for the batching of course! What I mostly mean is that I like the way LightStep's Recorder is used here. What it does is express to the Tracer implementation the right way, in the context of a given application, to send out-of-band data through http. This is very useful. Now let's assume that you have a Zipkin C++ Tracer and it implements OT. Given the lack of standardisation on the recorder, the Zipkin Tracer will likely use a (likely blocking) library (e.g. cURL) to send HTTP and will force you to link against it. Then you will have to hack the Zipkin implementation to also allow a Recorder pattern that fits the Envoy async way of working, and even then it's highly likely you would still have to link in Zipkin's other HTTP-sending mechanism. This whole mess is the overhead I'm talking about. I think from what I see here, an API could look roughly like this:
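A minimal sketch of such an API, with hypothetical names (`HttpRequest`, `HttpTransporter`, `Recorder`): the application supplies the transport, and the tracer's recorder merely delegates to it, so the tracer never links an HTTP client of its own.

```cpp
#include <functional>
#include <string>
#include <utility>

// Hypothetical types for illustration only.
struct HttpRequest {
    std::string url;
    std::string body;
};

// Supplied by the host application: it knows the right way, in its own
// context, to send out-of-band data (async, batched, via its event loop).
using HttpTransporter =
    std::function<void(const HttpRequest&, std::function<void(int /*status*/)>)>;

// A Tracer implementation accepts the transporter at construction time
// instead of linking its own (possibly blocking) HTTP client.
class Recorder {
  public:
    explicit Recorder(HttpTransporter transport) : m_transport(std::move(transport)) {}

    void recordSpan(std::string encodedSpan) {
        // Delegate the actual I/O to the application-provided transport.
        m_transport(HttpRequest{"/api/v1/spans", std::move(encodedSpan)},
                    [](int /*status*/) { /* retry/drop policy would go here */ });
    }

  private:
    HttpTransporter m_transport;
};
```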
Note that implementations don't have to use this, since it might be taken in the constructor, which is outside the main API.
I do like @jmacd's idea. For C++ at least, I would take it one step further and enrich the existing Span interface to require an injection hook on the Span itself (sketched below). By doing so, we can specialize the type system but always have a reliable fallback. None of this would require runtime checks on behalf of the implementor; C++ would do it for us via double dispatch. (No guarantees this compiles, but I hope it gets the point across.)
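A rough sketch of that double-dispatch shape, with illustrative names only (`Carrier`, `ZipkinSpan`): the first dispatch is virtual on the Span's dynamic type, the second is plain overload resolution on the Carrier, so no runtime type checks are needed.

```cpp
#include <iostream>

class Span;
class ZipkinSpan;

class Carrier {
  public:
    virtual ~Carrier() = default;
    // Reliable fallback that works for any Span.
    virtual void inject(const Span& span);
    // Specialization hook a carrier may override for a concrete type.
    virtual void inject(const ZipkinSpan& span);
};

class Span {
  public:
    virtual ~Span() = default;
    // First dispatch: virtual on the Span's dynamic type.
    virtual void injectInto(Carrier& carrier) const { carrier.inject(*this); }
};

class ZipkinSpan : public Span {
  public:
    // Second dispatch: overload resolution statically picks
    // Carrier::inject(const ZipkinSpan&) here.
    void injectInto(Carrier& carrier) const override { carrier.inject(*this); }
};

void Carrier::inject(const Span&) { std::cout << "generic fallback\n"; }
void Carrier::inject(const ZipkinSpan& span) {
    inject(static_cast<const Span&>(span));  // default to the generic fallback
}
```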
This was one of my original thoughts when tackling the API, but it required that the Span interface be enhanced. I thought the spec was pretty rigid at the time and wasn't sure how welcome an addition to the base span interface would be, but it does clean up some things nicely. I'm not convinced I've thought through all of the ramifications yet; I'm most concerned about when we have different tracers interoperating, but I'll save all of that for Tuesday.
Upon further review, I'm not sure my previous idea would work either. Spans would still need to know about all the carriers and vice versa. The physical organization of the libraries would still be a problem :(
Submitted a virtual inheritance rough draft in #7. Trade-offs:

Pros:

Cons:
I didn't want to close this PR, but I'd hate to split the discussions across two of them. I'll let the opentracing gurus decide where the discussions should continue.
First, to summarize what was a fun and productive hangout today with @jquinn47, @SaintDubious, @lookfwd, @yurishkuro, @RomanDzhabarov, @jmacd, @tedsuo, and myself:
Anyway, @jquinn47: thank you for #7! IMO it's a worthwhile tradeoff. I feel odd cutting off the (valuable) discussion in this PR... let's wait for other opinions about #6 vs #7 and we can go from there.
Thank you @bensigelman for arranging the hangout yesterday, that was extremely useful. And thank you @jquinn47 for quickly giving us a new API to consider. My attempt to summarize the sticking point over Carriers and Spans:
For us, there is no way to change Span types across our system. There are too many services, transport mechanisms, and languages; there is no realistic way to upgrade them all. So we have to enforce exactly one Span type. With the virtual approach we have to play policy police with every Carrier implementer and say "You MUST cast this virtual Span to a concrete type and you may NEVER use anything else." Eventually some team will decide that within their island of services they are going to use some OTHER Span type, eventually that will leak out, and the whole system becomes unstable. We feel that the virtual approach allows for a flexibility that is dangerous to the system. It would be very helpful for us if someone has some real-world examples of when they might want to use this feature.
The way I perceived it, there was no definite "the Span (or anything) has to be there at runtime." What I felt was mostly that there's a "cognitive overhead" to templates: the more we can narrow their usage, the better. Let's keep the
To clarify this - I guess that
One more thing I would like to clarify for the entire discussion that, I believe, is quite C++ specific. We have to set the expectations right for the OT C++ spec/library. Please spend a few seconds to have a look at the source code/header file of a C++ standard library component. I guess my point is quite clear.

I think that implicitly, and partially because of other languages like Java, Go and Python, we have unrealistic expectations of what good, simple, quality C++ code looks like. Even large companies might have C++ experts that don't use templates, and that's perfectly fine and expected and lovely for application code. Not for library code. For example, let's have a quick look at google test/google mock, e.g. here. I can't see anything that looks like what's so nicely described in the documentation, and I seriously doubt anyone could create C++ library code that meets google mock's objectives by using just inheritance.
If I understand correctly what you mean by "changing Span type", the situation mostly arises with higher order tracers. These are some scenarios where higher order tracers are needed:
On the topic of custom carriers, as always I find it hard to follow abstract discussions about what could happen, and prefer to take concrete use cases and see how they are solved in the proposed API. The specific use case for a custom carrier I have is the following. Uber uses tchannel, a kind-of RPC framework & binary wire protocol, which has built-in support for zipkin-like tracing at the binary format level. Using those message slots is more efficient than passing tracing data via headers. So the instrumentation in various tchannel libraries (in different languages) performs inject/extract as a two-step process (sketched below).
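A hedged sketch of that two-step shape in C++, with hypothetical field and key names (tchannel's real frame layout may differ): the tracer first injects into a plain string-map carrier, then the tchannel library copies those entries into the frame's dedicated slots.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical slot layout: the binary frame reserves space for exactly
// these tracing fields.
struct TChannelTracingFields {
    std::uint64_t traceId  = 0;
    std::uint64_t spanId   = 0;
    std::uint64_t parentId = 0;
    std::uint8_t  flags    = 0;
};

// Step 1 (not shown): the tracer injects into a plain string-map carrier.
// Step 2: the tchannel library copies those entries into the frame slots.
TChannelTracingFields toFrameSlots(const std::map<std::string, std::string>& carrier) {
    TChannelTracingFields fields;
    fields.traceId  = std::stoull(carrier.at("trace-id"), nullptr, 16);
    fields.spanId   = std::stoull(carrier.at("span-id"), nullptr, 16);
    fields.parentId = std::stoull(carrier.at("parent-id"), nullptr, 16);
    fields.flags    = static_cast<std::uint8_t>(std::stoul(carrier.at("flags")));
    return fields;
}
```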
There was a sticky point about the interface of the carrier. In Go it wasn't much of a problem due to duck typing. In Python/Java we simply used a dict/Map as the carrier to pass the 4 fields. Another concrete example of custom carriers often arises in Java, where various http frameworks use different types to represent an http request. So what most instrumentations do is use the TextMap (or rather HttpHeaders) OT format with a wrapper around the specific framework's http request type that masquerades that request object as the standard text map OT carrier.
Trying to keep the conversation here, but this is a response to @yurishkuro's comment on #7. This is rehashing what we discussed yesterday. The problem with the approach you're suggesting, in a statically built language like C++, is the way the libraries are physically laid out. If I have a tracer that needs to know about different carriers, today we're asked to write something like this:
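A rough reconstruction of the shape being criticized, with illustrative carrier names: the tracer interface enumerates every concrete carrier it could ever meet.

```cpp
// Illustrative names only.
class SpanContext;
class TChannelCarrier;
class ProtobufOverTcpCarrier;
class BlueMessageCarrier;

// One overload per concrete carrier: the tracer library now has a
// physical dependency on every RPC mechanism in the organization.
class Tracer {
  public:
    virtual ~Tracer() = default;
    virtual void inject(const SpanContext& context, TChannelCarrier& carrier) = 0;
    virtual void inject(const SpanContext& context, ProtobufOverTcpCarrier& carrier) = 0;
    virtual void inject(const SpanContext& context, BlueMessageCarrier& carrier) = 0;
};
```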
This library is going to be built one of two ways:

1. The tracer defines the carriers in its implementation.
2. The carriers are defined outside of the tracer implementation.

In both situations, we have real-world build problems with this library at scale. In case 1, if I have an application that only sends 'Blue' messages but I want the tracer implementation, I'm now forced to link against all of the different carrier libraries that the tracer needs. Using your concrete example: if you only had tchannel in use in your applications, but the tracer implementation also supported protobuf-over-tcp carriers, would you want to link against that library? In case 2, if the tchannel carrier is defined outside of the tracer implementation, the tracer still needs to link against it, and the link lines start to sprawl across every carrier library in the organization. Now what I did in #7 was try to do what @jmacd was suggesting: push those details into the carriers, so that the link lines look more appropriate for your task.
Now, ignoring the CRTP vs virtual debate, I believe there is a fundamental problem with the spec: I have not yet seen how a migration path should work for switching out tracer implementations at scale. A lot of our discussion has been about features used to accommodate this use case, but I don't see a clean way to pull it off. The demos are nifty for testing a corner of the interface, but I'd posit they're not practical at scale, once multiple applications begin communicating with each other over a set protocol.

Say we're using one tracer (e.g. zipkin) and want to move everyone to another (e.g. jaeger). For this to work, we first need a multiplexing tracer (see the sketch after this comment). All of the original tasks need to be rebuilt with a multiplexor first, but only emitting the original format. Now once everyone has that linked in, we can start to make the multiplexor emit both formats. Once that's done, we then need to require the new format alone before the old one can be retired.

This is a nightmare to pull off. I think it's trickier than traditional library rollouts, because every application is able to communicate with every other application over this tracing protocol; it's one dense graph. All of the carriers used throughout an entire organization will need to be coordinated throughout this entire process. The alternative is to have the tracer implementation do it all, which has all the negative side effects described in my previous response to Yuri.

This is the line in the spec that is resonating the most with me:
Since using multiple tracers has significant maintenance side effects, and there is no reason to assume more than one tracer is operating system-wide to begin with, I begin to ask: why is all of this flexibility put in place? I'm going to try my hardest to avoid the scenario I've described above anyway. I hope this helps drive the discussion. I really don't care about CRTP vs virtual, to be honest. The semantics of how these tasks are upgraded and interoperate are loosely defined; that's what I'm most interested in tightening up. I believe the design of the classes would come naturally once the problem is well formed and we're all talking about the same thing.
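For concreteness, a minimal sketch of the multiplexing tracer such a migration would need (illustrative names, not the proposed API): every inject fans out, so during the migration window each request carries both the old and the new propagation formats side by side.

```cpp
#include <memory>
#include <utility>
#include <vector>

// Illustrative names only.
class SpanContext;

class Tracer {
  public:
    virtual ~Tracer() = default;
    // Writes this tracer's propagation format into the carrier.
    virtual void inject(const SpanContext& context, class TextMapWriter& carrier) = 0;
};

class MultiplexingTracer : public Tracer {
  public:
    void addTracer(std::shared_ptr<Tracer> tracer) {
        m_tracers.push_back(std::move(tracer));
    }

    // Fan out: each wrapped tracer writes its own format.
    void inject(const SpanContext& context, TextMapWriter& carrier) override {
        for (const auto& tracer : m_tracers) {
            tracer->inject(context, carrier);
        }
    }

  private:
    std::vector<std::shared_ptr<Tracer>> m_tracers;
};
```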
As Ben mentioned on the call, should this be considered the fault of the spec, or the nature of the problem itself? How would you solve this if there were no OpenTracing and you had 10k apps instrumented with a bespoke API? Fundamentally there are only two ways to solve this problem as far as I can see: either (a) follow the process you described and go through a migration period when two tracers are running in the apps, or (b) try to make the two tracing implementations interoperate. In this specific example (ironically), Jaeger tracers have the ability to read incoming Zipkin B3 headers, and the Jaeger backend has the ability to accept the Zipkin data model for spans, so one could get away with interop instead of dual tracers. But that's only because Jaeger was originally built on parts of Zipkin and we had similar data models and wire propagation formats. If you're trying to swap tracing systems that have more significant differences (very simple example: one uses uint64 trace IDs, the other string IDs), then interop is not possible, and dual-deployment migration is the only path. Also, if we avoid extreme examples of 10k apps (per the 80-20 rule, the majority of companies have far fewer microservices, say fewer than 100), then the dual-tracer approach becomes a lot more feasible and not as time-consuming.
I'd agree with your last point. I am (for better or worse) part of the 20% there.
You mean this comment: #7 (comment). I do not believe it introduces any library linkage issues, because the custom carrier is defined purely in the specific RPC library and the tracer does not need to know about it. The carrier in that case is just an adapter to transform the internal representation of the framework's remote request (like servlet.HttpRequest) to the interface required for the Format, like TextMapReader/TextMapWriter in Go. I am not sure if CRTP can do that; I'd think you need virtual methods for that. This usage of custom carriers is by far the most prevalent. Even Go OT uses it as part of the official API, see HTTPHeaderCarrier, which wraps http.Header as a plain TextMap carrier. The other example I gave, with TChannel, is indeed where the library linkage issue comes into view. That is not just a custom carrier, but also a custom Format. This is a very rare use case as far as I know, and we managed to solve it without having cross-dependencies by agreeing to use a standard (widely available) interface for the carrier itself; e.g. in Java we'd simply use a plain Map.
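A hedged C++ rendering of that adapter-style carrier, with illustrative names (`TextMapWriter`, `FrameworkHttpRequest`): the adapter lives with the framework's instrumentation, so the tracer never sees the concrete request type.

```cpp
#include <string>

// Illustrative carrier interface.
class TextMapWriter {
  public:
    virtual ~TextMapWriter() = default;
    virtual void set(const std::string& key, const std::string& value) = 0;
};

// Some framework's request type that we do not control.
struct FrameworkHttpRequest {
    void addHeader(const std::string& /*name*/, const std::string& /*value*/) {
        // framework-specific
    }
};

// Masquerades the framework request as a standard text-map carrier.
class HttpRequestCarrier : public TextMapWriter {
  public:
    explicit HttpRequestCarrier(FrameworkHttpRequest& request) : m_request(request) {}

    void set(const std::string& key, const std::string& value) override {
        m_request.addHeader(key, value);
    }

  private:
    FrameworkHttpRequest& m_request;
};
```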
To make this discussion more hands-on, I implemented here an ugly throw-away draft for:
To compile & run:
I don't want to speak too much for @SaintDubious and @lookfwd, so if they disagree, prepare to hear back from them :) I've personally gone full circle on this in my head at least 3 times now. Re-reading @yurishkuro's comments, I may be looking at what I see as problems too harshly through my 'Bloomberg' goggles. I'm still seeing issues with regard to where in the code Span implementations are exposed, and which components are responsible for knowing about those details. I've got many teams to convince to get this spread out, and if we can't come to a consensus here, I believe it will be even more difficult as I add more cooks to the opentracing kitchen.

We (myself, @SaintDubious, and @lookfwd) will have to be picky about the first tracer we deploy at large. We will be experimenting on a small scale over the next few weeks (likely with the CRTP interface since we already have it, then replacing it) to iron out a few kinks with other teams and see what speed bumps are waiting for us. CRTP has already shown itself to be problematic; we hit a template compiler bug on Solaris and have seen otherwise experienced developers complain about the cognitive overload that @bensigelman was wary of. While @lookfwd, myself, and probably STL implementors may be fine with it, we're likely in the minority. In my opinion, those are enough red flags to nix it. I would love to see what @jmacd does in the

I'm certain the implementation we move forward with is going to need to be "version aware" and that we only make changes in our organization in a backwards-compatible way. That's my problem to deal with and not necessarily OT's, but if the spec is still very fluid that would make adoption harder. I was hoping the spec would have a recommendation for that, but @yurishkuro's arguments are convincing; that's likely an intractable problem to ask this spec to solve.
@jquinn47, after our last chat, I came to appreciate another dimension of complexity, i.e. service ownership. That was the "final nail", and I'm perfectly convinced that there's a layer of OT that we will have to adopt and stick with, unchanged, for a couple of years. It's impossible to coordinate all those people to even just relink and redeploy. I don't think we should miss all the developments and opportunities in the tracing industry because of this, though. "All problems in computer science can be solved by another level of indirection", and this brings in the concept of OT "virtualisation" I vaguely mentioned (continued after the image).

You have a very lightweight Tracer that "talks OT v1.0" (and will stick to it, likely, for a decade). It does no transformation or anything clever whatsoever. Maybe it could have some tiny or moderate (configurable by polling the ot-proxy) intelligence around sampling, but that's about it. All that data gets forwarded to an OT proxy that must be installed on every server. You can update and configure the proxy however you like. It instantiates as many OT Tracers as it needs to talk to any implementations, and those Tracers might be of any version of OT. It might have migration logic that decides routing. You own this, so you do whatever you want. You (1 owner) roll out 1 service (ot-proxy) to (all) N servers, and it's a brand new day without other services ever having to change. Would something like this solve the problem?

Note: by getting the HOT in here right, we can prove the API is composable. If we can implement a proxy, then we can prove it's forwardable, i.e. that any actions can be replicated at another place/time. Things that could prevent the latter are e.g. if we can't explicitly set the timestamps for each operation. Another issue might be that proxies should be able to communicate ID mappings with each other (
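A minimal sketch of the in-process half of that idea, with hypothetical names (`ProxyConnection`, `ForwardingTracer`); the real split between the frozen in-process layer and the upgradable ot-proxy would be richer.

```cpp
#include <string>

// Hypothetical local transport to the per-host ot-proxy, e.g. over a
// unix domain socket; fire-and-forget from the application's view.
class ProxyConnection {
  public:
    void send(const std::string& /*encodedEvent*/) { /* transport-specific */ }
};

// Speaks a frozen OT v1.0 surface and does no transformation; all the
// clever routing/migration logic lives in the ot-proxy, which can be
// upgraded independently of the N applications.
class ForwardingTracer {
  public:
    explicit ForwardingTracer(ProxyConnection& proxy) : m_proxy(proxy) {}

    void finishSpan(const std::string& encodedSpan) {
        if (m_sampled) {
            m_proxy.send(encodedSpan);
        }
    }

  private:
    ProxyConnection& m_proxy;
    bool m_sampled = true;  // could be refreshed by polling the ot-proxy
};
```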
One thing I'd like to highlight is that it is only an api design choice that ends up conflating how to contribute to in-band propagation with how to contribute to out-of-band data. While we say things like "10 years" etc., keep in mind that the java library is version 0.something, and it would be highly optimistic, if not silly, for people to assume version 1.0 of OpenTracing is an LTS idea. There's a general rule in open source I've noticed, which is that nothing is really stable until its second year of experience. Some of these sentiments should hopefully steer next versions of OpenTracing.
Here are some thoughts on the in-band + out-of-band apis which may or may not help:
* Propagation interop is the most important interop in distributed tracing. There exist today different systems that talk B3, and there are single systems that talk B3 or "similar", where similar is some slightly different variant. Inter-process propagation is a separate problem from out-of-band data. Unless propagation works, traces are broken, and broken traces can be difficult, if possible at all, to stitch back together. When traces are joined properly, the out-of-band data reported by whichever tracer library or transport can be reconstructed into one or more trace depots. For this reason, I think propagation (inject/extract) formats, and possibly apis, are fundamental to distributed tracing.
* Propagation is a multi-system concern, and deserves a system-level abstraction. In Zipkin we spend time on [B3](https://github.com/openzipkin/b3-propagation), as it is the most important thing, more important than which tracer api is in use. Not only is it easier to pin down, it is used across systems which share no code at all. Maybe this year there will be a [cleaner format](https://groups.google.com/forum/#!topic/distributed-tracing/3pW5moQ0yyE)... who knows. If that happens, and as Yuri mentioned, the high priority would be at a transport abstraction: how to link together traces which use different propagation formats. OpenTracing is right to have an api exposed to support multi-system concerns, though in practice its non-goal of interop has pushed testing to end implementors. For example, Jaeger has system tests to ensure their propagation across library variants. This is where the rubber hits the road: the system abstraction. The library code is usually really uninteresting; plus, most tracers I've seen at least somewhat punt on dynamic type support.
* Propagation apis happen to be linked to the Tracer api in OpenTracing 1.0; it is mostly harmless. Not everyone wanted the current form of the tracer api, particularly the part about inject/extract being dynamically resolved (me raises hand). When the decision was made, it was made like that, and I don't think many of the arguments raised here were absent then. Maybe a future OT api will make a different choice, especially as different people are involved now. Note that in practice, span mutation and propagation are often implemented decoupled. For example, both Jaeger and Brave have independent Extractor/Injector types that can fix both the form of propagation and the implementation of it. In Brave, it is a completely separate api (which we stitch back to OT in openzipkin-contrib/brave-opentracing#14, since OT requires it). You can make instrumentation that avoids the OT propagation api, but that would significantly impact re-use. I really hope the next OT version is more modular with regard to highly decoupled features such as propagation (in or out of process).
@adriancole: the propagation interop pain point is painful for you as someone who maintains a tracing system that is dealing with version skew across numerous forks/etc. The lack of standard tracing instrumentation is the pain point I hear over and over again when speaking with (50+? 100+?) engineering orgs that are integrating tracing systems. TL;DR it seems you are conflating the needs of people who spend their time implementing tracing systems with the needs of people who spend their time trying to integrate with them.

OpenTracing does indeed separate the concern of in-band propagation formats from the concern of describing the semantics of application software. In fact, that's mostly the point. As software complexity grows, we need some common way of describing behavior while making minimal assumptions about consumers of those descriptions. But indeed: distinct tracing systems have distinct in-band propagation formats (even distinct versions of tracing systems have distinct in-band propagation formats).

FWIW, I agree wholeheartedly with @adriancole about the red-flag-ish-ness of talking about "10 years" with any API that is in its early stages. I feel relatively comfortable making "conceptual" guarantees about most of the core abstractions in OT, but true compile-time compatibility between whatever is in this PR (or #7) and
> @adriancole: the propagation interop pain point is painful for *you* as someone who maintains a tracing system that is dealing with version skew across numerous forks/etc. The lack of standard tracing instrumentation is the pain point I hear over and over again when speaking with (50+? 100+?) engineering orgs that are integrating tracing systems.
>
> TL;DR it seems you are conflating the needs of people who spend their time implementing tracing systems with the needs of people who spend their time trying to integrate with them.
Version skew and numerous forks (to the degree that's even true) is no more a part of my focus today than any time I've been outside a monorepo. I *happen* to also do a workshop occasionally for implementors, *because* I spend most of my time with integrators. Integrators show up organically in my world. I don't go company to company; employees of companies (and random people) show up from every time zone. It isn't organized.

My goal was to highlight that propagation is a worthwhile thing to focus on, as if it is busted, your traces are busted. In other words, I'm encouraging people to accept vs perpetually reject negative feedback about decisions made.
For what it's worth, I don't go company to company either, Adrian... I probably should, but have been too busy. Most of these conversations I've had are ad hoc things. It's a little hard to read the tea leaves, but I think (?) you are suggesting that I'm the "people" who should "accept" rather than "perpetually reject" negative feedback. If so, I always welcome feedback, positive or negative, about anything I'm involved with. That said, if that "feedback" is actually in the form of advice to 3rd parties, it seems rather important to present the other side of the argument. (Which, in this case, is about whether (a) instrumentation of application code or (b) consistency of in-band propagation formats is the sticking point for distributed tracing deployments in the wild.)
> It's a little hard to read the tea leaves, but I think (?) you are suggesting that I'm the "people" who should "accept" rather than "perpetually reject" negative feedback. If so, I always welcome feedback, positive or negative, about anything I'm involved with.
The challenge I'm making is indeed pointed at you, not because of your opinion on the topic being discussed, but rather because you are the person who offers meetings in response to people questioning things.

My 2p is that when something comes up where people are participating in something like C++, but are not privy to the historical reasons for something, go ahead and find or open an issue for them! The issue description could include the rationale which is otherwise lost, being held across N issues. This also helps others who aren't watching threads like this or able to attend F2F meetings.

There's already a spec repo, so most of this is just applying it: https://github.com/opentracing/specification/issues

The only really new part is the explicit idea that progress is possible (i.e. an api past 1.0, like a github tag or milestone), so that fresh people have a role in shaping the future.
"10 years" <- This will happen anyway... it would happen with any API we would create internally or Open Source
That's the main reason I was pushing towards using OT instead of something proprietary. OT is way more mature conceptually compared to any v1.0 we would make in-house in such a short time. The proxy idea above means we can decouple instrumentation code in apps that might be stuck on v1.0 for a long time, and still evolve with the OT API (behind the proxy) without having to relink/redeploy all apps.
After much internal discussion, @jquinn47 and I have decided not to press for this PR anymore. So unless someone really wants to argue for it (which I strongly doubt), it is safe to close this. While we continue to like the static polymorphism and the concrete type of the span when passed to the carrier, the CRTP is annoying to work with (as @bensigelman feared it would be).
@SaintDubious @jquinn47 @lookfwd Thanks for the note, I was wondering how your conversation was proceeding. So what do you think about something like #7?
@bensigelman We like the direction of #7, with the proxy note above, but we should have more solid evidence in ~2 weeks' time.
Closing this issue, as the discussion has moved on to #7.
This is a significant rewrite of the current opentracing-cpp implementation. I've been working on this with @lookfwd and @SaintDubious internally for the last couple weeks.
Major Changes
I took issue with the original implementation in a few significant ways:
Library Organization
The original implementation required tracers to `dynamic_cast` `Writers` and `Readers` into a concrete type before they could be used. This is an anti-pattern. We would go down one of two roads:

- Organizations pull down an open source (e.g., Zipkin) implementation, and it is very difficult for them to add their own carriers to it. If I have an `OrganizationTransportRPC`, there is no way that's getting pushed back into the open source implementation.
- An organization writes their own implementation, and the new library needs to be dependent on every RPC mechanism in the organization that may need to use the tracing library, with obvious side effects for code size and dependency management.
The rewrite instead makes heavy use of CRTP templates to define Reader/Writer interfaces. Functors can be passed into the API at compile time without affecting the base library, and implementations still have an interface to rely on for a more OO approach.
Allocations
The original implementation always used `new`/`delete` and exposed raw pointers to calling code. Without an interface to pass pointers back to the Tracer, this removes the ability to use custom allocators or object pools for managing spans/span contexts.
Added the `Tracer::cleanup()` method for returning resources to the implementation. Introduced a `StringRef` to avoid unnecessary creation of temporary `std::string`s at the interface level.
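A minimal sketch of the `StringRef` idea (illustrative, not necessarily the exact class in this PR): a non-owning (pointer, length) view so both string literals and `std::string`s can cross the interface without allocating.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

class StringRef {
  public:
    // Implicit constructors are intentional: both literals and
    // std::strings convert without creating a temporary std::string.
    StringRef(const char* data) : m_data(data), m_length(std::strlen(data)) {}
    StringRef(const std::string& str) : m_data(str.data()), m_length(str.size()) {}

    const char* data() const { return m_data; }
    std::size_t length() const { return m_length; }

  private:
    const char* m_data;   // not owned; caller guarantees lifetime
    std::size_t m_length;
};
```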
V-Table hits
In high-throughput scenarios, the v-table is not desirable. If we're planning on providing an API for performant C++ applications, we would ideally be able to avoid as much overhead as possible in the applications we're instrumenting.
We've worked around this with a compile-time pattern for polymorphism, sketched below.
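A minimal sketch of that pattern (CRTP), with illustrative names: the interface is bound at compile time, so the instrumented hot path pays no v-table indirection and calls can be inlined.

```cpp
// The interface template dispatches statically to the implementation.
template <typename Implementation>
class GenericSpan {
  public:
    void finish() {
        // Resolved at compile time; typically inlined, no virtual call.
        static_cast<Implementation*>(this)->finishImpl();
    }
};

class NoopSpan : public GenericSpan<NoopSpan> {
  public:
    void finishImpl() { /* no-op */ }
};
```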
Spec concerns
We were missing a few things in the spec:
Minor Changes
This is obviously a large change and warrants a major version bump. We've kept @lookfwd in the loop the whole time, as he's definitely our most active opentracing advocate. I believe he's on board but I'd want him to chime in here too.