|
1 | 1 | # OpenCensus Specs |
2 | 2 |
|
3 | | -## General Design |
4 | | -* For details about the library structured see [Namespace and Package](https://github.com/census-instrumentation/opencensus-specs/blob/master/NamespaceAndPackage.md) |
| 3 | +This is a high-level design of the OpenCensus library. Some of the API examples may be written
| 4 | +(linked) in C++, Java or Go, but every language should translate/modify them based on
| 5 | +language-specific patterns/idioms. Our goal is that all libraries have a consistent "look and feel".
5 | 6 |
|
6 | | -## Trace Design |
7 | | -* Trace API is described [here](https://github.com/census-instrumentation/opencensus-specs/blob/master/trace/README.md) |
| 7 | +This repository uses terminology (MUST, SHOULD, etc.) from [RFC 2119][RFC2119].
8 | 8 |
|
9 | | -* Data model is defined [here](https://github.com/census-instrumentation/opencensus-proto/blob/master/trace/trace.proto) |
| 9 | +## Overview |
| 10 | +Today, distributed tracing systems and stats collection tend to use unique protocols and |
| 11 | +specifications for propagating context and sending diagnostic data to backend processing systems. |
| 12 | +This is true amongst all the large vendors, and we aim to provide reliable implementations in
| 13 | +service of frameworks and agents. We do that by standardizing APIs and data models.
10 | 14 |
|
11 | | -## Tags Design |
| 15 | +OpenCensus provides a tested set of application performance management (APM) libraries, such as |
| 16 | +Metrics and Tracing, under a friendly OSS license. We acknowledge the polyglot nature of modern
| 17 | +applications, and provide implementations in all main programming languages including C/C++, |
| 18 | +Java, Go, Ruby, PHP, Python, C#, Node.js, Objective-C and Erlang. |
12 | 19 |
|
13 | | -## Stats Design |
| 20 | +## Ecosystem Design |
14 | 21 |
|
15 | | -## Supported Encodings |
| 22 | +![alt text][EcosystemLayers] |
| 23 | + |
| 24 | +### Layers |
| 25 | + |
| 26 | +#### Service Exporters |
| 27 | +Each backend service SHOULD implement this API to export data to their services. |
| 28 | + |
| 29 | +#### OpenCensus Library |
| 30 | +This is what we design/describe in the current document. |
| 31 | + |
| 32 | +#### Manually instrumented Frameworks |
| 33 | +We are going to instrument some of the most popular frameworks for each language using the |
| 34 | +OpenCensus library to allow users to get traces/stats when they use these frameworks. |
| 35 | + |
| 36 | +#### Tools for automatic instrumentation |
| 37 | +Some of the languages may support libraries for automatic instrumentation. Java for example can use |
| 38 | +byte-code manipulation (monkey patching) to provide an agent that automatically instruments a |
| 39 | +binary. Note: not all the languages support this. |
| 40 | + |
| 41 | +#### Application |
| 42 | +This is the customer's application/binary.
| 43 | + |
| 44 | +## Library Design |
| 45 | + |
| 46 | +### Namespace and Package |
| 47 | +* For details about the library package names structure see [Namespace and Package][NamespaceAndPackage]. |
| 48 | + |
| 49 | +### Components |
| 50 | +This section focuses on the important components that each OpenCensus library must have to |
| 51 | +support all the functionalities. |
| 52 | + |
| 53 | +Here is a layering structure of the proposed OpenCensus library: |
| 54 | + |
| 55 | +![alt text][LibraryComponents] |
| 56 | + |
| 57 | + |
| 58 | +#### Context |
| 59 | +Some of the features like tracing (distributed tracing) and tagging (possibly others) need a way
| 60 | +to propagate a specific context (trace, tags) in-process (possibly between threads) and across
| 61 | +function calls.
| 62 | + |
| 63 | +The key elements of the Context Support are: |
| 64 | +* Every implementation MUST offer an explicit or implicit generic Context propagation mechanism |
| 65 | +that allows different sub-contexts to be propagated. |
| 66 | +* Languages that already have this support, like [Go][goContext] or C# (ExecutionContext), MUST |
| 67 | +use the language supported generic context instead of building their own. |
| 68 | +* For an explicit generic context implementation you can look at the Java [io.grpc.Context][gRPCContext]. |
| 69 | + |
| 70 | + |
| 71 | +#### Trace |
| 72 | +Trace component is designed to support distributed tracing (see [dapper paper][DapperPaper]). |
| 73 | +OpenCensus allows functionality beyond data collection and export, for example, tracking active
| 74 | +spans and keeping local samples for interesting requests.
| 75 | + |
| 76 | +The key elements of the API can be broken down as: |
| 77 | +* A Span represents a single operation within a trace. Spans can be nested to form a trace tree. |
| 78 | +* Libraries must allow users to record tracing events for a span (attributes, annotations, links, |
| 79 | +etc.). |
| 80 | +* Spans are carried in the Context. Libraries MUST provide a way of getting, manipulating, |
| 81 | +and replacing the Span in the current context. |
| 82 | +* Libraries MUST provide a means of dynamically controlling the trace global config at runtime
| 83 | +(e.g. trace sampling rate/probability).
| 84 | +* Libraries SHOULD keep track of active spans and in-memory samples based on latency/errors and
| 85 | +offer ways to access the data.
| 86 | +* Because context must also be propagated across processes, libraries MUST offer functionality
| 87 | +that allows any transport system (e.g. RPC, HTTP, etc.) to encode/decode the “trace context” for
| 88 | +placement on the wire.
| 89 | + |
| 90 | +##### Links |
| 91 | +* Trace API is described [here][TraceAPI]. |
| 92 | +* Data model is defined [here][TraceDataModel]. |
| 93 | + |
| 94 | +#### Tags |
| 95 | +Tags are values propagated through the Context subsystem inside a process and among processes by |
| 96 | +any transport (e.g. RPC, HTTP, etc.). For example, tags are used by the Stats component to break
| 97 | +down measurements by arbitrary metadata set in the current process or propagated from a remote |
| 98 | +caller. |
| 99 | + |
| 100 | +The key elements of the Tags component are: |
| 101 | +* A tag: this is a key-value pair, where the key is a string, and the value can be one of a 64-bit |
| 102 | +integer, a boolean, or a string. The API allows for creating, modifying and querying objects |
| 103 | +representing a tag value. |
| 104 | +* A set of tags (with unique keys) carried in the Context. Libraries MUST provide a means |
| 105 | +of manipulating the tags in the context, including adding new tags, replacing tag values, deleting |
| 106 | +tags, and querying the current value for a given tag key. |
| 107 | +* Because tags must also be propagated across processes, libraries MUST offer functionality that
| 108 | +allows RPC systems to encode/decode the set of tags for placement on the wire. |
| 109 | + |
| 110 | +##### Links |
| 111 | +* TODO: Add links to API definition and data model. |
| 112 | + |
| 113 | +#### Stats |
| 114 | +The Stats component is designed to record measurements, dynamically break them down by |
| 115 | +application-defined tags, and aggregate those measurements in user-defined ways. It is designed |
| 116 | +to offer multiple types of aggregation (e.g. distributions) and be efficient (all measurement |
| 117 | +processing is done as a background activity); aggregating data enables reducing the overhead of |
| 118 | +uploading data, while also allowing applications direct access to stats. |
| 119 | + |
| 120 | +The key elements the API MUST provide are: |
| 121 | +* Defining what is to be measured (the types of data collected, and their meaning), and how data |
| 122 | +will be aggregated (e.g. into a distribution, cumulative aggregation vs. deltas, etc.). Libraries |
| 123 | +must offer ways for customers to define Metrics that make sense for their application, and support
| 124 | +a canonical set for RPC/HTTP systems.
| 125 | +* Recording data - APIs for recording measured values. This data is then broken down by tags
| 126 | +carried in the context (e.g. a tag can have a value that describes the current RPC service/method |
| 127 | +name; when RPC latency is recorded, this can be made in a generic call, without having to specify |
| 128 | +the exact method), and aggregated as needed (e.g. a histogram of all latency values). |
| 129 | +* Accessing the aggregated data. This can be filtered by data type, resource name, etc. This |
| 130 | +allows applications to easily get access to their own data in-process. |
| 131 | + |
| 132 | +##### Links |
| 133 | +* TODO: Add links to API definition and data model. |
| 134 | + |
| 135 | +### Supported propagation formats |
| 136 | +* Libraries MUST support the [TraceContext][TraceContextSpecs] format for Trace and Tags components.
16 | 137 | * Binary encoding is defined [here](https://github.com/census-instrumentation/opencensus-specs/blob/master/encodings/BinaryEncoding.md) |
| 138 | + |
| 139 | +[EcosystemLayers]: https://github.com/census-instrumentation/opencensus-specs/blob/master/drawings/EcosystemLayers.png "Ecosystem Layer" |
| 140 | +[DapperPaper]: https://research.google.com/pubs/pub36356.html |
| 141 | +[goContext]: https://golang.org/pkg/context |
| 142 | +[gRPCContext]: https://github.com/grpc/grpc-java/blob/master/context/src/main/java/io/grpc/Context.java |
| 143 | +[LibraryComponents]: https://github.com/census-instrumentation/opencensus-specs/blob/master/drawings/LibraryComponents.png "OpenCensus Library Components" |
| 144 | +[NamespaceAndPackage]: https://github.com/census-instrumentation/opencensus-specs/blob/master/NamespaceAndPackage.md |
| 145 | +[RFC2119]: https://www.ietf.org/rfc/rfc2119.txt |
| 146 | +[TraceAPI]: https://github.com/census-instrumentation/opencensus-specs/blob/master/trace/README.md |
| 147 | +[TraceContextSpecs]: https://github.com/TraceContext/tracecontext-spec |
| 148 | +[TraceDataModel]: https://github.com/census-instrumentation/opencensus-proto/blob/master/trace/trace.proto |