I’m currently working with the SensorThings specification and exploring how to integrate or extend the OGC SensorThings API into an Apache Kafka-based architecture.
The SensorThings specification is designed around REST (request/response) and MQTT (pub/sub), where clients can dynamically query entities and their relationships, such as Things, Datastreams, Sensors, and Observations. Kafka, however, is built around an append-only log, and consumers have no built-in way to query metadata on demand, which introduces some architectural challenges.
I’m looking for guidance or best practices on the following points:
1. Topic Structure
What would be the recommended way to map SensorThings entities to Kafka topics?
- Separate topics per entity type (e.g., Things, Datastreams, Observations)?
- Or a more datastream-centric approach (e.g., one topic per Datastream)?
- Are there any established patterns for partitioning in this context?
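To make the first option concrete, here is a rough sketch of what I have in mind: one topic per entity type, with Observations keyed by Datastream ID so that all Observations of a Datastream land in the same partition and keep their order. The topic prefix `sensorthings.`, the partition count, and the helper names are my own assumptions, and `crc32` is only a stand-in for Kafka's actual default partitioner (which is murmur2-based), so this illustrates the idea rather than real partition assignments.

```python
import zlib

NUM_PARTITIONS = 6  # hypothetical partition count, not a recommendation


def topic_for(entity_type):
    """Map a SensorThings entity type to a topic name (naming is an assumption)."""
    return f"sensorthings.{entity_type.lower()}"


def partition_for(key, num_partitions=NUM_PARTITIONS):
    """Pick a partition from the record key.

    Stand-in for Kafka's default partitioner: any stable hash shows that
    equal keys always map to the same partition.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def route_observation(datastream_id):
    """Return (topic, key, partition) for an Observation of the given Datastream."""
    key = str(datastream_id)
    return topic_for("Observations"), key, partition_for(key)


topic, key, partition = route_observation(42)
print(topic, key, partition)
```

With this layout the number of topics stays fixed, whereas one topic per Datastream would grow with the number of Datastreams.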
2. Message Format
How should messages be structured?
- Fully enriched payloads (including Thing, Sensor, Datastream metadata)?
- Or lightweight messages with only references (IDs)?
- Are there existing schema definitions (preferably JSON) aligned with SensorThings?
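To illustrate the two payload options, here is a minimal sketch. The property names (`@iot.id`, `phenomenonTime`, `result`, `unitOfMeasurement`) follow the SensorThings entity model, but the overall message layout and the function names are my own assumptions, not an established schema.

```python
import json


def lightweight_message(observation, datastream_id):
    """Observation that only references its Datastream by ID."""
    return {
        "phenomenonTime": observation["phenomenonTime"],
        "result": observation["result"],
        "Datastream": {"@iot.id": datastream_id},
    }


def enriched_message(observation, datastream, thing, sensor):
    """Observation with Datastream, Thing, and Sensor metadata embedded."""
    message = lightweight_message(observation, datastream["@iot.id"])
    message["Datastream"] = {
        "@iot.id": datastream["@iot.id"],
        "name": datastream["name"],
        "unitOfMeasurement": datastream["unitOfMeasurement"],
        "Thing": {"@iot.id": thing["@iot.id"], "name": thing["name"]},
        "Sensor": {"@iot.id": sensor["@iot.id"], "name": sensor["name"]},
    }
    return message


# Illustrative sample data, not taken from a real deployment.
observation = {"phenomenonTime": "2024-01-01T12:00:00Z", "result": 21.5}
datastream = {
    "@iot.id": 7,
    "name": "Air temperature",
    "unitOfMeasurement": {"name": "degree Celsius", "symbol": "°C"},
}
thing = {"@iot.id": 1, "name": "Weather station"}
sensor = {"@iot.id": 3, "name": "Thermometer"}

light = lightweight_message(observation, datastream["@iot.id"])
rich = enriched_message(observation, datastream, thing, sensor)
print(json.dumps(light))
print(json.dumps(rich))
```

The lightweight form keeps messages small but forces every consumer to resolve the ID somewhere; the enriched form is self-describing but repeats the same metadata in every record.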
3. Metadata & State Management
Since Kafka does not support direct querying like an API:
- How should consumers obtain and maintain metadata (Things, Sensors, Locations, etc.) needed to interpret Observations?
- Are there examples of this pattern applied to SensorThings?
4. Handling Relationships
SensorThings defines rich relationships between entities (Thing → Datastream → Observation, Sensor → Datastream, etc.).
- How are these relationships best maintained and resolved in a streaming context?
- Should consumers reconstruct this graph locally, or is there a recommended enrichment strategy?
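One approach I am considering for points 3 and 4 is to publish metadata as a changelog (for example on compacted topics, keyed by entity ID) and let each consumer rebuild a local lookup table from it, then join incoming Observations against that table. The sketch below simulates this with a plain Python list instead of a real Kafka client; the class name, the record layout, and the use of `None` as a delete marker (mirroring Kafka tombstones) are assumptions for illustration.

```python
class MetadataCache:
    """Local view of SensorThings metadata rebuilt from a changelog."""

    def __init__(self):
        self.tables = {"Things": {}, "Datastreams": {}}

    def apply(self, entity_type, key, value):
        """Apply one changelog record; a value of None deletes the entity."""
        table = self.tables[entity_type]
        if value is None:
            table.pop(key, None)
        else:
            table[key] = value

    def enrich(self, observation):
        """Resolve Datastream and Thing for an Observation.

        Returns None if the Datastream is not (yet) known locally.
        """
        datastream_id = observation["Datastream"]["@iot.id"]
        datastream = self.tables["Datastreams"].get(datastream_id)
        if datastream is None:
            return None
        thing = self.tables["Things"].get(datastream["Thing"]["@iot.id"])
        return {**observation, "Datastream": datastream, "Thing": thing}


# Simulated changelog: (entity type, key, value) in log order.
changelog = [
    ("Things", 1, {"@iot.id": 1, "name": "Weather station"}),
    ("Datastreams", 7, {"@iot.id": 7, "name": "Air temp", "Thing": {"@iot.id": 1}}),
    ("Things", 1, {"@iot.id": 1, "name": "Weather station (roof)"}),
]

cache = MetadataCache()
for entity_type, key, value in changelog:
    cache.apply(entity_type, key, value)

observation = {"result": 21.5, "Datastream": {"@iot.id": 7}}
enriched = cache.enrich(observation)
unknown = cache.enrich({"result": 3.0, "Datastream": {"@iot.id": 99}})
print(enriched)
```

The open question for me is what a consumer should do in the `None` case, where an Observation arrives before its Datastream metadata has been seen.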
5. Existing Work / Extensions
Has there been any prior work, discussion, or extension of SensorThings toward Kafka or similar event-streaming systems?
Also, if there is a general idea or design, or even a working example, available, I would be happy to build on it.
More generally, the challenge seems to be bridging the gap between:
- SensorThings: query-driven, relational access to entities
- Kafka: event-driven, append-only log with eventual state reconstruction
Any insights, examples, or pointers would be greatly appreciated!
Thanks,
Jesse