Time Series Databases - Metrics vs. tags

I'm new to time series databases (TSDBs) and I have a lot of temperature sensors to store in my database, at one point per second. Is it better to use one unique metric per sensor, or a single metric (temperature, for example) with distinct tags for each sensor?

I searched the Internet for the best practice, but I didn't find a good answer...

Thank you! :-)

Edit: I will have 8 types of measurements (temperature, setpoint, energy, power, ...) from 2500 sources.

Montserrat answered 20/7, 2015 at 11:10 Comment(0)

If you are storing your data in InfluxDB, I would recommend storing all the metrics in a single measurement and using tags to differentiate the sources, rather than creating a measurement per source. The reason is that you can trivially merge or decompose the metrics using tags within a measurement, but the newest InfluxDB cannot merge or join across measurements.

Ultimately the decision rests with both your choice of TSDB and the queries you care most about running.
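As a rough illustration of the single-measurement layout, here is a minimal Python sketch that formats one point in InfluxDB's line protocol (measurement, comma-separated tags, fields, then a timestamp). The measurement name `metrics` comes from the comments on this answer; the tag key `sensor_id`, the field names, and the helper `to_line_protocol` are assumptions made for illustration only. The sketch does not escape spaces or commas in names or values.

```python
def to_line_protocol(measurement, tags, fields, timestamp_ns):
    """Format one point as an InfluxDB line-protocol string.

    Hypothetical helper: no escaping of spaces, commas, or string fields.
    """
    # Sort tags by key so the output is deterministic.
    tag_part = ",".join(f"{key}={value}" for key, value in sorted(tags.items()))
    field_part = ",".join(f"{key}={value}" for key, value in fields.items())
    return f"{measurement},{tag_part} {field_part} {timestamp_ns}"


# One measurement for every sensor; the sensor is identified by a tag.
line = to_line_protocol(
    "metrics",
    {"sensor_id": "sensor-0001"},             # assumed tag key and value
    {"temperature": 21.5, "setpoint": 22.0},  # assumed field names
    1437390600000000000,                      # 2015-07-20T11:10:00Z in nanoseconds
)
print(line)
```

With this layout, each of the 2500 sources only adds a new tag value rather than a new measurement.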

Wagon answered 20/7, 2015 at 23:33 Comment(3)
Thank you for your reply! I added more details about what I need. The most common queries will get all values of one sensor for one type and sometimes (rarely) get all temperatures > x, for example. - Montserrat
You want queries like these, then, which strongly imply everything goes into one measurement: SELECT * FROM metrics WHERE sensor_type='foo' and also SELECT * FROM metrics WHERE temperature > 100. If you split each sensor into an individual measurement, the first is easy but the second is impossible. Note that the second query will be very expensive, since you are filtering on a non-indexed field value, but since it's infrequent that should be fine. Limiting the query to a particular time range will mitigate the performance hit. - Wagon
OK! Thanks a lot for your help! - Montserrat
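The time-range advice in the comments above could be sketched as follows. This is only a hypothetical Python helper (`temperature_above` is not part of any client library) that builds the threshold query from the comment and adds a bounded time window; it assumes the measurement is called `metrics` and the field is called `temperature`, as in the comment.

```python
from datetime import datetime, timedelta, timezone


def temperature_above(threshold, start, end, measurement="metrics"):
    """Build an InfluxQL query for readings above `threshold` in [start, end)."""
    fmt = "%Y-%m-%dT%H:%M:%SZ"  # RFC 3339 timestamps in UTC
    return (
        f"SELECT * FROM {measurement} "
        f"WHERE temperature > {threshold} "
        f"AND time >= '{start.strftime(fmt)}' AND time < '{end.strftime(fmt)}'"
    )


# Restrict the expensive field filter to a single day.
start = datetime(2015, 7, 20, tzinfo=timezone.utc)
query = temperature_above(100, start, start + timedelta(days=1))
print(query)
```

The narrower the window, the fewer field values the database has to scan for the unindexed comparison.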

For comparison purposes, in Axibase Time-Series Database (ATSD) you can store temperature as a metric and the sensor id as the entity name. The ATSD schema has a notion of an entity, which is the name of the system for which the data is being collected. The advantage is more compact storage and the ability to define tags for the entities themselves, for example sensor location, sensor type, etc. This way you can filter and group results not just by sensor id but also by sensor tags.

To give you an example, in this blog article 0601911 is the entity id, which is the EPA station id. This station collects several environmental metrics and at the same time is described with multiple tags in the database: http://axibase.com/environmental-monitoring-using-big-data/.

The bottom line is that you don't have to stage a second database, typically a relational one, just to store extended information about sensors, servers etc. for advanced reporting.

UPDATE 1: Sample network command:

series e:sensor-001 d:2015-08-03T00:00:00Z m:temperature=42.2 m:humidity=72 m:precipitation=44.3

Tags that describe sensor-001, such as location, type, etc., are stored separately, minimizing the storage footprint and speeding up queries. If you're collecting energy/power metrics, you often have to attach attributes such as Status to a series, because the data may not arrive clean or verified. You can use series tags for this purpose.

series e:sensor-001 d:2015-08-03T00:00:00Z m:temperature=42.2 ... t:status=Provisional
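For readers generating such commands from code, here is a minimal Python sketch that assembles a string in the same shape as the two samples above (`e:` entity, `d:` timestamp, `m:` metrics, `t:` series tags). The helper name `series_command` is hypothetical, and the sketch does no quoting or escaping; it only mirrors the format shown in this answer.

```python
def series_command(entity, timestamp, metrics, tags=None):
    """Assemble a `series` command string in the format shown in this answer."""
    parts = ["series", f"e:{entity}", f"d:{timestamp}"]
    # One m:name=value token per metric collected for the entity.
    parts += [f"m:{name}={value}" for name, value in metrics.items()]
    # Optional series tags, such as a verification status.
    parts += [f"t:{name}={value}" for name, value in (tags or {}).items()]
    return " ".join(parts)


command = series_command(
    "sensor-001",
    "2015-08-03T00:00:00Z",
    {"temperature": 42.2, "humidity": 72},
    tags={"status": "Provisional"},
)
print(command)
```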
Hotshot answered 21/7, 2015 at 6:53 Comment(0)

You should use one metric per sensor. You probably won't need to aggregate values from different temperature sensors, but you will need to aggregate the values of a given sensor (an average over a minute, for instance).
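To make the per-sensor aggregation concrete, here is a small Python sketch of a one-minute average over the points of a single sensor. It runs in memory rather than inside a TSDB, and the function name `minute_averages` and the `(epoch_seconds, value)` point format are assumptions for illustration.

```python
from collections import defaultdict


def minute_averages(points):
    """Average (epoch_seconds, value) points into one value per minute."""
    buckets = defaultdict(list)
    for timestamp, value in points:
        # Round the timestamp down to the start of its minute.
        buckets[timestamp - timestamp % 60].append(value)
    return {
        minute: sum(values) / len(values)
        for minute, values in sorted(buckets.items())
    }


points = [(0, 20.0), (30, 22.0), (60, 21.0), (119, 23.0)]
averages = minute_averages(points)
print(averages)  # {0: 21.0, 60: 22.0}
```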

Metrics correspond to data coming from the same source, or at least data you will be likely to aggregate. You can create almost as many metrics as you want (up to 16 million metrics in OpenTSDB for instance).

Tags make distinctions between these pieces of data. For instance, you could tag data differently if the values suddenly change a lot, so that you can retrieve only the relevant data when needed without losing the rest. That said, for a temperature sensor reporting every second, the best approach would probably be to filter the data and only store a point when the value has changed...
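The store-on-change idea can be sketched in a few lines of Python. This is an illustrative filter, not a feature of any particular TSDB; the `tolerance` parameter is an added assumption that lets small fluctuations be ignored, and the default of 0.0 keeps every real change.

```python
def changed_only(points, tolerance=0.0):
    """Yield only points whose value differs from the last kept value."""
    last = None
    for timestamp, value in points:
        # Keep the first point, then any point that moved by more than tolerance.
        if last is None or abs(value - last) > tolerance:
            last = value
            yield timestamp, value


readings = [(0, 21.0), (1, 21.0), (2, 21.5), (3, 21.5), (4, 21.0)]
kept = list(changed_only(readings))
print(kept)  # [(0, 21.0), (2, 21.5), (4, 21.0)]
```

One trade-off of this approach is that a gap in the stored data no longer distinguishes "value unchanged" from "sensor offline", so a periodic heartbeat point may still be worth storing.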

Best practices are summed up here

Grammer answered 20/7, 2015 at 14:35 Comment(0)
