# OpenCensus Specs

This is a high-level design of the OpenCensus library. Some of the API examples may be written
(linked) in C++, Java, or Go, but every language should translate/modify them based on
language-specific patterns/idioms. Our goal is that all libraries have a consistent "look and feel".

This repository uses terminology (MUST, SHOULD, etc.) from [RFC 2119][RFC2119].

## Overview

Today, distributed tracing systems and stats collection tend to use unique protocols and
specifications for propagating context and sending diagnostic data to backend processing systems.
This is true amongst all the large vendors, and we aim to provide a reliable implementation in
service of frameworks and agents. We do that by standardizing APIs and data models.

OpenCensus provides a tested set of application performance management (APM) libraries, such as
Metrics and Tracing, under a friendly OSS license. We acknowledge the polyglot nature of modern
applications, and provide implementations in all main programming languages including C/C++,
Java, Go, Ruby, PHP, Python, C#, Node.js, Objective-C and Erlang.

## Ecosystem Design

![Ecosystem layers][EcosystemLayers]

### Layers

#### Service exporters

Each backend service SHOULD implement this API to export data to their services.

#### OpenCensus library

This is what the following sections of this document define and explain.

#### Manually instrumented frameworks

We are going to instrument some of the most popular frameworks for each language using the
OpenCensus library to allow users to get traces/stats when they use these frameworks.

#### Tools for automatic instrumentation

Some of the languages may support libraries for automatic instrumentation. For example, Java
applications can use byte-code manipulation (monkey patching) to provide an agent that
automatically instruments an application. Note: not all the languages support this.

#### Application

This is the customer's application/binary.

## Library Design

### Namespace and Package

* For details about the library package name structure see [Namespace and Package][NamespaceAndPackage].

### Components

This section focuses on the important components that each OpenCensus library must have to
support all required functionalities.

Here is a layering structure of the proposed OpenCensus library:

![Library components][LibraryComponents]

#### Context

Some of the features for distributed tracing and tagging need a way to propagate a specific
context (trace, tags) in-process (possibly between threads) and between function calls.

The key elements of the context support are:

* Every implementation MUST offer an explicit or implicit generic context propagation mechanism
that allows different sub-contexts to be propagated.
* Languages that already have this support, like Go ([context.Context][goContext]) or C# (ExecutionContext),
MUST use the language-supported generic context instead of building their own.
* For an explicit generic context implementation you can look at the Java [io.grpc.Context][gRPCContext].

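The explicit style can be sketched as follows. This is a language-neutral illustration (written in Python for brevity), loosely modeled on the immutable-context idea of `io.grpc.Context`; all names here are illustrative, not part of the OpenCensus API.

```python
class Context:
    """An immutable bag of sub-contexts (trace, tags, ...) keyed by name."""

    def __init__(self, values=None):
        self._values = dict(values or {})

    def with_value(self, key, value):
        # Returns a NEW context; the original is never mutated, so a
        # context can be shared safely across threads.
        new_values = dict(self._values)
        new_values[key] = value
        return Context(new_values)

    def value(self, key):
        return self._values.get(key)


ROOT = Context()


def handle_request(ctx):
    # Sub-contexts (e.g. the current span, the tag set) ride along in
    # the generic context instead of living in globals.
    return ctx.value("trace"), ctx.value("tags")
```

Because each `with_value` derives a new context, passing a context down a call chain never affects callers that still hold the original.
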
#### Trace

The Trace component is designed to support distributed tracing (see the [Dapper paper][DapperPaper]).
OpenCensus allows functionality beyond data collection and export. For example, it allows
tracking of active spans and keeping local samples for interesting requests.

The key elements of the API can be broken down as:

* A Span represents a single operation within a trace. Spans can be nested to form a trace tree.
* Libraries must allow users to record tracing events for a span (attributes, annotations, links,
etc.).
* Spans are carried in the Context. Libraries MUST provide a way of getting, manipulating,
and replacing the Span in the current context.
* Libraries SHOULD provide a means of dynamically controlling the trace global configuration at
runtime (e.g. trace sampling rate/probability).
* Libraries SHOULD keep track of active spans and in-memory samples based on latency/errors and
offer ways to access the data.
* Because context must also be propagated across processes, libraries MUST offer the functionality
that allows any transport (e.g. RPC, HTTP, etc.) systems to encode/decode the “trace context” for
placement on the wire.

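The span-nesting and wire-encoding points above can be sketched like this. This is an illustrative sketch, not the OpenCensus API: the field names and the toy wire format are assumptions made for the example.

```python
import itertools

_ids = itertools.count(1)  # toy span-id allocator for the example


class Span:
    def __init__(self, name, trace_id, parent=None):
        self.name = name
        self.trace_id = trace_id   # shared by every span in the trace
        self.span_id = next(_ids)  # unique within the process
        self.parent = parent

    def start_child(self, name):
        # Nesting spans builds the trace tree: each child records its
        # parent, and all spans share the trace id.
        return Span(name, self.trace_id, parent=self)


def encode(span):
    # Toy wire format "<trace_id>/<span_id>"; a real propagation format
    # also carries options such as the sampling decision.
    return "%s/%d" % (span.trace_id, span.span_id)


root = Span("/api/users", trace_id="abc123")
child = root.start_child("db.query")
```

Decoding the pair on the receiving side lets the remote process start its own spans under the same trace id, which is what stitches the cross-process trace together.
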
##### Links

* The Trace API is described in the [Trace API][TraceAPI] document.
* The data model is defined in the [Trace Data Model][TraceDataModel] document.

#### Tags

Tags are values propagated through the Context subsystem inside a process and among processes by
any transport (e.g. RPC, HTTP, etc.). For example, tags are used by the Stats component to break
down measurements by arbitrary metadata set in the current process or propagated from a remote
caller.

The key elements of the Tags component are:

* A tag: this is a key-value pair, where the key is a string, and the value can be one of a 64-bit
integer, a boolean, or a string. The API allows for creating, modifying and querying objects
representing a tag value.
* A set of tags (with unique keys) carried in the Context. Libraries MUST provide a means
of manipulating the tags in the context, including adding new tags, replacing tag values, deleting
tags, and querying the current value for a given tag key.
* Because tags must also be propagated across processes, libraries MUST offer the functionality that
allows RPC systems to encode/decode the set of tags for placement on the wire.

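The tag-set operations above can be sketched as follows. This is an illustrative sketch, not the OpenCensus API; the function names and the textual encoding are assumptions (the real binary encoding is defined separately).

```python
def insert_tag(tags, key, value):
    # Tag values are restricted to the three types the spec allows:
    # 64-bit integer, boolean, or string.
    if not isinstance(value, (int, bool, str)):
        raise TypeError("tag values must be int, bool, or str")
    new_tags = dict(tags)      # tag sets are treated as immutable
    new_tags[key] = value      # adds a new tag or replaces an existing one
    return new_tags


def delete_tag(tags, key):
    new_tags = dict(tags)
    new_tags.pop(key, None)
    return new_tags


def encode_tags(tags):
    # Toy textual encoding for cross-process propagation.
    return ";".join("%s=%r" % (k, v) for k, v in sorted(tags.items()))
```

Keys are unique, so inserting an existing key replaces its value, and querying is a plain lookup on the returned set.
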
##### Links

* TODO: Add links to API definition and data model.

#### Stats

The Stats component is designed to record measurements, dynamically break them down by
application-defined tags, and aggregate those measurements in user-defined ways. It is designed
to offer multiple types of aggregation (e.g. distributions) and be efficient (all measurement
processing is done as a background activity); aggregating data reduces the overhead of
uploading data, while also allowing applications direct access to stats.

The key elements the API MUST provide are:

* Defining what is to be measured (the types of data collected, and their meaning), and how data
will be aggregated (e.g. into a distribution, cumulative aggregation vs. deltas, etc.). Libraries
must offer ways for customers to define Metrics that make sense for their application, and support
a canonical set for RPC/HTTP systems.
* Recording data - APIs for recording measured values. The recorded data is then broken down by tags
carried in the context (e.g. a tag can have a value that describes the current RPC service/method
name; when RPC latency is recorded, this can be done in a generic call, without having to specify
the exact method), and aggregated as needed (e.g. a histogram of all latency values).
* Accessing the aggregated data. This can be filtered by data type, resource name, etc. This
allows applications to easily get access to their own data in-process.
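
The define / record / access cycle above can be sketched like this. This is an illustrative sketch, not the OpenCensus API: the `View` class, its fields, and the count/sum aggregation are assumptions made for the example.

```python
from collections import defaultdict


class View:
    """Aggregates one measure (e.g. RPC latency in ms) broken down by one tag key."""

    def __init__(self, measure_name, tag_key):
        self.measure_name = measure_name
        self.tag_key = tag_key
        self.count = defaultdict(int)
        self.total = defaultdict(float)

    def record(self, context_tags, value):
        # The caller records a bare value; the breakdown comes from the
        # tags already carried in the context, not from the call site.
        tag_value = context_tags.get(self.tag_key, "unknown")
        self.count[tag_value] += 1
        self.total[tag_value] += value


latency = View("rpc_latency_ms", tag_key="method")
latency.record({"method": "GetUser"}, 12.0)
latency.record({"method": "GetUser"}, 8.0)
latency.record({"method": "ListUsers"}, 30.0)
```

The recording call stays generic; only the tag set in the context determines which bucket the measurement lands in, and the aggregated `count`/`total` maps are directly accessible in-process.
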

##### Links

* TODO: Add links to API definition and data model.

### Supported propagation formats

* Libraries MUST support the [TraceContext][TraceContextSpecs] format for the Trace and Tags components.
* The binary encoding is defined in the [BinaryEncoding][BinaryEncoding] document.

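As a rough illustration, a TraceContext-style textual header of the shape `<version>-<trace-id>-<span-id>-<options>` (hex fields) could be parsed as below. The exact field layout is an assumption here; the TraceContext specification is authoritative.

```python
def parse_trace_header(header):
    # Assumed shape: "<version>-<trace-id>-<span-id>-<options>",
    # e.g. "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01".
    version, trace_id, span_id, options = header.split("-")
    if len(trace_id) != 32 or len(span_id) != 16:
        raise ValueError("malformed trace header")
    return {
        "version": int(version, 16),
        "trace_id": trace_id,
        "span_id": span_id,
        # Low bit of options assumed to carry the sampling decision.
        "sampled": bool(int(options, 16) & 0x1),
    }
```
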
[EcosystemLayers]: /drawings/EcosystemLayers.png "Ecosystem Layer"
[DapperPaper]: https://research.google.com/pubs/pub36356.html
[TraceAPI]: https://github.com/census-instrumentation/opencensus-specs/blob/master/trace/README.md
[TraceContextSpecs]: https://github.com/TraceContext/tracecontext-spec
[TraceDataModel]: https://github.com/census-instrumentation/opencensus-proto/blob/master/trace/trace.proto
[BinaryEncoding]: https://github.com/census-instrumentation/opencensus-specs/blob/master/encodings/BinaryEncoding.md