Data Model
There are two main use cases for the storage:
- User data manually generated by the User Interface
- Streams generated by IoT Devices, Operations, UX
The two types of data might be stored in different storage services. In particular, the first type is managed via the Storage Adapter API.
The storage is organized in tables, partitioned and indexed differently depending on the access patterns. Some data, like images and deployment artifacts, is hosted in blobs.
All tables have a partition key, used to distribute data across multiple servers, and a primary key, used to identify single records. Other indexes are available to support additional access patterns. Small tables store all their records in one partition.
A small non-partitioned storage.
It might be a single record or a blob, holding the PCS name, the PCS logo, etc.
The storage is accessed via the Storage Adapter API.
A small non-partitioned table, indexed by Group ID.
The storage is accessed via the Storage Adapter API.
A small non-partitioned table, indexed by Rule ID.
The storage is accessed via the Storage Adapter API.
A storage designed to contain billions of records about millions of devices, with long retention. Records are never updated.
The table is partitioned into a fixed number of partitions (e.g. 16) to allow queries fetching telemetry for multiple devices at once, and indexed by timestamps.
Alternative partitioning methods not used:
- Partition by Device ID: when creating a graph showing data about multiple devices, a client would have to run one query per device
- Partition by time: reads and writes would always hit one hot partition
All queries must specify a partition ID, which is calculated from the Device ID. For instance, it's possible to use one query to fetch data about multiple devices, as long as the devices have the same partition ID.
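As a minimal sketch of the partitioning rule described above, the following shows one way a partition ID could be derived from a Device ID, and how a list of devices could be grouped so that each group needs only one query. The hash function (`crc32`), the function names, and the fixed count of 16 are assumptions for illustration; the text only states that the partition ID is calculated from the Device ID.

```python
import zlib
from collections import defaultdict

# Example value taken from the text ("e.g. 16"); the real count is a design choice.
PARTITION_COUNT = 16


def partition_id(device_id: str, partitions: int = PARTITION_COUNT) -> int:
    """Map a Device ID to a stable partition number (hypothetical scheme)."""
    # crc32 is deterministic across processes, unlike Python's built-in hash(),
    # so the same device always lands in the same partition.
    return zlib.crc32(device_id.encode("utf-8")) % partitions


def group_by_partition(device_ids):
    """Group Device IDs by partition: one query per group is enough."""
    groups = defaultdict(list)
    for device_id in device_ids:
        groups[partition_id(device_id)].append(device_id)
    return dict(groups)


if __name__ == "__main__":
    devices = ["elevator-01", "elevator-02", "truck-17", "chiller-03"]
    for pid, ids in sorted(group_by_partition(devices).items()):
        print(f"partition {pid}: query covers {ids}")
```

With a fixed number of partitions, a graph covering many devices needs at most one query per partition rather than one query per device, which is the trade-off the rejected alternatives above fail to offer.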
The storage is accessed directly by the Streaming Service and the Telemetry API.
A storage designed to contain millions of records, with long retention. Records can be updated.
The table is partitioned into a fixed number of partitions (e.g. 16) to allow queries fetching annotations for multiple devices at once, and indexed by timestamps.
Alternative partitioning methods not used:
- Partition by Device ID: when creating a graph showing data about multiple devices, a client would have to run one query per device
- Partition by time: reads and writes would always hit one hot partition
All queries must specify a partition ID, which is calculated from the Device ID. For instance, it's possible to use one query to fetch data about multiple devices, as long as the devices have the same partition ID.
The storage is accessed directly by the Streaming Service and the Telemetry API.
Annotations can be of multiple types, for example:
- Device Alert, with acknowledgement status
- Operational event, e.g. "deployed firmware 1.2", "added DPS"
- Business event, e.g. "product going public"
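The annotation types listed above could be modelled roughly as follows. This is only a sketch: the class names, field names, and the `acknowledge` helper are hypothetical and are not taken from the actual schema. It illustrates why these records, unlike telemetry, need to be updatable: a Device Alert carries an acknowledgement status that changes over time.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class AnnotationType(Enum):
    # The three example types from the list above; others may exist.
    DEVICE_ALERT = "device_alert"
    OPERATIONAL_EVENT = "operational_event"
    BUSINESS_EVENT = "business_event"


@dataclass
class Annotation:
    type: AnnotationType
    timestamp: float                      # seconds since epoch (assumed unit)
    description: str
    device_id: Optional[str] = None       # business events may not relate to a device
    acknowledged: Optional[bool] = None   # meaningful only for device alerts


def acknowledge(annotation: Annotation) -> Annotation:
    """Mark a device alert as acknowledged (an in-place record update)."""
    if annotation.type is not AnnotationType.DEVICE_ALERT:
        raise ValueError("only device alerts carry an acknowledgement status")
    annotation.acknowledged = True
    return annotation


if __name__ == "__main__":
    alert = Annotation(AnnotationType.DEVICE_ALERT, 1_500_000_000.0,
                       "temperature too high", device_id="chiller-03",
                       acknowledged=False)
    print(acknowledge(alert))
```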