I often use the Informix Developer Edition for ARM on my Raspberry Pi for quick prototyping and proof-of-concept (POC) experiments as the Informix Developer Edition has all of the features of the full Enterprise Edition, including native JSON and Time Series capabilities, in a small footprint that I can run locally on independent hardware. It’s also fun to run an enterprise database server on a Pi!
For this POC I wanted to see how easily I could convert existing time series tables from their rigidly-defined row types to a “schema-less” model.
The Informix TimeSeries solution stores time series data in a special format within a relational database in a way that takes advantage of the benefits of both non-relational and standard relational implementations of time series data.
My time series POCs on Pi typically involve utilizing the Sense Hat to provide sensor readings (temperature, humidity and pressure). These reading can be accessed via a Python module and can be inserted into a database from Python as well. My POC started as simply looking to move from a rigid row type for my time series to a schema-less row type using the native Informix JSON data type.
My original time series was defined as a table built on a rigid row type:
create row type reading_t
(
timestamp datetime year to fraction(5),
temperature decimal(6,2),
humidity decimal(6,2),
pressure smallint
);
create table reading_ts
(
sensor_id smallint not null primary key,
reading_data timeseries(reading_t)
) lock mode row;
The row type reading_t can only contain a data structure with the format of decimal, decimal and smallint values following the timestamp, which is why I refer to this as a “rigid row type.” This works very well if the data you store in the time series always follows this format, but what if you need to add another reading value to the time series weeks or months after you begin collecting data? You would need to unload your existing data, drop and rebuild the row type with the additional data field, then drop and rebuild the time series table before reloading the old data into the new time series. This is a lot of work and introduces some downtime into your data collection process.
JSON to the rescue
Informix has included a default JSON data type (and its corresponding binary representation BSON) since version 12.10.xC2.
JSON/BSON objects in a time series row type have been supported since 12.10.xC4 versions.
Since the data in a JSON type can be “schema-less” one can introduce new key-value pairs at any time without impacting existing data or requiring a new schema for an existing table.
For this POC I created the reading_t row type using the binary JSON data type BSON:
create row type reading_t
(
timestamp datetime year to fraction(5),
reading BSON
);
create table reading_ts
(
sensor_id smallint not null primary key,
reading_data timeseries(reading_t)
) lock mode row;
Since time series data is represented as a single row per sensor in a table the specific element is determined by the offset of time intervals from the origin of the time series. This makes visual inspection of the raw data difficult when there are hundreds or thousands of values recorded. The initial NULL values indicate empty reading “slots” after the origin timestamp in intervals of the temporal units specified by the calendar for the time series, 1 minute in this example (ts_1min).
dbaccess sensor<<!
select * from reading_ts
!
1|origin(2015-01-01 00:00:00.00000), calendar(ts_1min),
container(autopool00000002), threshold(0), regular, [NULL, NULL, NULL,
NULL, NULL, NULL, NULL, NULL, NULL,
…
('{"pressure":832.9,"temperature":29.7,"humidity":30.9}'),
('{"pressure":832.9,"temperature":29.7,"humidity":30.1}'),
('{"pressure":832.9,"temperature":29.9,"humidity":30.5}'),
('{"pressure":833,"temperature":29.9,"humidity":31.1}'),
('{"pressure":833,"temperature":29.8,"humidity":31.8}'),
('{"pressure":833,"temperature":29.8,"humidity":30.5}'),
('{"pressure":833,"temperature":29.9,"humidity":30.5}'),
('{"pressure":833,"temperature":29.8,"humidity":30.7}'),
('{"pressure":833,"temperature":29.7,"humidity":30.5}'),
('{"pressure":833,"temperature":30,"humidity":30.6}'),
('{"pressure":833,"temperature":29.8,"humidity":30.3}'),
('{"pressure":833,"temperature":29.8,"humidity":30.5}'),
('{"pressure":833,"temperature":29.8,"humidity":31.5}'),
('{"pressure":833,"temperature":30,"humidity":30.3}'),
('{"pressure":833,"temperature":29.9,"humidity":29.5}'),
('{"pressure":833.1,"temperature":30.1,"humidity":29.4}'),
('{"pressure":833.1,"temperature":30,"humidity":29.8}'),
('{"pressure":833,"temperature":30,"humidity":29.6}'),
('{"pressure":833,"temperature":29.9,"humidity":27.8}'),
('{"pressure":833,"temperature":29.8,"humidity":30}'),
('{"pressure":833,"temperature":29.9,"humidity":30}'),
('{"pressure":833,"temperature":29.8,"humidity":30.9}'),
('{"pressure":833,"temperature":29.8,"humidity":29.8}'),
('{"pressure":833,"temperature":29.9,"humidity":29.9}'),
('{"pressure":833,"temperature":29.8,"humidity":29.9}'),
('{"pressure":833,"temperature":29.8,"humidity":31.2}'),
('{"pressure":833,"temperature":30.1,"humidity":29.4}'),
('{"pressure":833,"temperature":29.9,"humidity":30.2}')
]
1 row(s) retrieved.
Once the time series table has been created using this JSON row type one can create a virtual table on top of the time series table to present a relational view of the time series data:
execute procedure tscreatevirtualtab('reading_vt', 'reading_ts',
"calendar(ts_1min),origin(2015-01-01 00:00:00.0000)", 'reduced_log');
Now the data is visually represented as relational data:
timestamp temp humidity pressure
2017-08-01 09:36:00
2017-08-01 09:39:00 35.60 34.60 848.00
2017-08-01 09:40:00 35.70 34.50 847.90
2017-08-01 09:41:00 35.80 34.30 847.90
2017-08-01 09:42:00 35.50 34.60 847.90
2017-08-01 09:43:00 35.30 34.00 848.00
2017-08-01 09:44:00 35.40 34.80 848.00
2017-08-01 09:45:00 35.30 35.20 848.00
2017-08-01 09:46:00 35.30 35.10 848.00
2017-08-01 09:47:00 35.20 35.00 848.00
2017-08-01 09:48:00 35.00 35.00 848.00
2017-08-01 09:49:00 35.20 35.60 848.00
2017-08-01 09:50:00 35.20 35.30 848.00
2017-08-01 09:51:00 35.40 35.60 848.00
2017-08-01 09:52:00 35.20 35.40 848.00
2017-08-01 09:53:00 35.40 35.80 848.00
2017-08-01 09:54:00 35.10 35.80 848.00
2017-08-01 09:55:00 35.00 34.30 848.00
2017-08-01 09:56:00 35.10 36.20 848.00
2017-08-01 09:57:00 35.20 35.80 848.00
2017-08-01 09:58:00 35.30 36.00 848.00
You can also insert into a virtual table:
insert into reading_vt values(1, current::datetime year to fraction(5),
'{"reading":{"temperature":34.5,"pressure":845.1,"humidity":41.5}}'::JSON);
Note how the reading field is actually a JSON string:
{"reading": {
"temperature":34.5,
"pressure":845.1,
"humidity":41.5
}
}
There is an implicit cast built into Informix that allows a JSON string to be inserted into a BSON field, but we could easily have written the insert with an explicit cast if we wanted additional clarity in our code:
insert into reading_vt values(1, current::datetime year to fraction(5),
'{"reading":{"temperature":34.5,"pressure":845.1,"humidity":41.5}}'::JSON::BSON);
At this point you can introduce additional key-value pairs into the reading element of the JSON in your time series without impacting existing data:
insert into reading_vt values(1, current::datetime year to fraction(5),
'{"reading":{"temperature":34.5,"pressure":845.1,"humidity":41.5},"elevation":1609.34}'::JSON);
Elements in the time series will return a null where the specified key does not exist in the reading data:
select timestamp::datetime year to second,
BSON_VALUE_DOUBLE(reading, 'elevation') as elevation,
BSON_VALUE_DOUBLE(reading, 'temperature')::decimal(6,2) as temp,
BSON_VALUE_DOUBLE(reading, 'humidity')::decimal(6,2) as humidity,
BSON_VALUE_DOUBLE(reading, 'pressure')::decimal(6,2) as pressure
from reading_vt where timestamp > '2017-08-03 10:00:00';
(expression) elevation temp humidity pressure
2017-08-03 10:01:00 33.90 36.00 846.50
2017-08-03 10:02:00 34.10 34.60 846.50
2017-08-03 10:03:00 34.10 33.90 846.50
2017-08-03 10:04:00 34.40 35.60 846.50
2017-08-03 10:05:00 34.30 35.70 846.50
As you can see, you can easily find the value of key value in a JSON field using the built-in BSON_VALUE_* functions. In this case I specified the BSON_VALUE_DOUBLE since the value is a non-string, non-integer numerical value. Informix also offers BSON_VALUE_* functions for other data types as well: BSON processing functions
Now that I have my “schema-less” time series set up and ready to accept sensor data it is time to write a simple Python program to actually insert the readings. I installed a clean copy of Raspbian Jessie prior to installing Informix 12.10.UC9DE (the latest of both) and compiled the InformixDB Python module to connect with the database engine but discovered that the latest version of the InformixDB module version doesn’t work with the version of Python included in Raspbian (Python 2.7.9). While I wait for the maintainer of the module to resolve the issue I was able to utilize another feature of Informix for my POC: the Informix REST API.
The REST API allows you to interface with the database server utilizing a wire listener process that accepts standard REST protocol calls. The wire listener gateway is included as part of all Informix editions.
The wire listener is a mid-tier gateway server that enables communication between MongoDB, REST API, and MQTT clients and the Informix database server.
With the REST API, you can use MongoDB and SQL queries against JSON and BSON document collections, traditional relational tables, and time series data. The REST API uses MongoDB syntax and returns JSON documents.
IBM Informix JSON compatibility
A standard POST payload for a “rigid” row type like my first example above would be something like this in JSON format:
{'timestamp': '2017-07-31 15:18:30.00000', 'sensor_id': 1,
'temperature':35.3, 'humidity':37.6, 'pressure':845}
# import requests library for making REST calls
import requests
import json
# specify url
url = 'http://localhost:27017/sensor/reading_vt’
ts = time.strftime('%Y-%m-%d %H:%M:00.00000')
data = {
"sensor_id": 1,
"timestamp": ts,
"temperature": 35.3,
"humidity": 37.6,
"pressure": 845
}
headers = {"Content-Type": "application/json"}
# Call REST API
response = requests.post(url, data=json.dumps(data), headers=headers)
# Print Response
print(response.text)
But how do you format a JSON payload that contains embedded JSON data, though? I struggled with this question and was unsuccessful finding an example in documentation or online.
I experimented by JSON-encoding the reading field values of the data object:
jreading = {
"temperature": 35.3,
"humidity": 37.6,
"pressure": 845
}
data = {
"sensor_id": 1,
"reading":
{
json.dumps(jreading)
}
}
I even attempted to explicitly cast the reading key value to JSON
data = {
"sensor_id": 1,
"reading":
{
json.dumps(jreading) + '::JSON'
}
Both of these attempts resulted in errors inserting into the time series via the REST API.
I then realized I was attempting to make this too complicated and allowed the REST API to do the work for me. I created a single data structure in Python and allowed the json module to encode the entire payload string without any double-encoding or additional casting. As I have found in the past, DIY conversion of data to valid JSON rarely works well; instead, allow the existing JSON encoding tools to do all of the work.
# import requests library for making REST calls
import requests
import json
import time
import string
from sense_hat import SenseHat
sense = SenseHat()
# specify url
url = 'http://localhost:27017/sensor/reading_vt'
while True:
t = sense.get_temperature()
p = sense.get_pressure()
h = sense.get_humidity()
t = round(t, 1)
p = round(p, 1)
h = round(h, 1)
ts = time.strftime('%Y-%m-%d %H:%M:00.00000')
jdata = {
"temperature": t,
"pressure": p,
"humidity": h
}
data = {
"sensor_id": 1,
"timestamp": ts,
"reading": jdata
}
headers = {"Content-Type": "application/json"}
print(data)
time.sleep(60)
# Call REST API
response = requests.post(url, data=json.dumps(data), headers=headers)
# Print Response
print(response.text)
The program now worked as intended and successfully inserted into the JSON row type time series via the REST API:
{'timestamp': '2017-08-03 10:52:00.00000', 'sensor_id': 1,
'reading': {'pressure': 846.5, 'temperature': 34.9, 'humidity': 34.9}}
[{"serverUsed":"/192.168.7.66:50001","n":1,"ok":1.0}]
Note the response includes the number of successful operations performed (“n”:1).
I can actually streamline the program further by creating only one object instead of two:
# import requests library for making REST calls
import requests
import json
import time
import string
from sense_hat import SenseHat
sense = SenseHat()
# specify url
url = 'http://localhost:27017/sensor/reading_vt'
while True:
t = sense.get_temperature()
p = sense.get_pressure()
h = sense.get_humidity()
t = round(t, 1)
p = round(p, 1)
h = round(h, 1)
l = time.strftime('%Y-%m-%d %H:%M:00.00000')
jdata = {
"temperature": t,
"pressure": p,
"humidity": h
}
data = {
"sensor_id": 1,
"timestamp": l,
"reading": {
"temperature": t,
"pressure": p,
"humidity": h
}
}
headers = {"Content-Type": "application/json"}
print(data)
time.sleep(60)
# Call REST API
response = requests.post(url, data=json.dumps(data),
headers=headers)
# Print Response
print(response.text)
At this point you can query the data via the REST API using the curl utility:
curl localhost:27017/sensor/reading_vt
[{"sensor_id":1,"timestamp":{"$date":1501601760000},
"reading":{"reading":{"temperature":34.5,"pressure":845.1,"humidity":41.5}}},
{"sensor_id":1,"timestamp":{"$date":1501601940000},
"reading":{"pressure":848.0,"temperature":35.6,"humidity":34.6}},
{"sensor_id":1,"timestamp":{"$date":1501602000000},
"reading":{"pressure":847.9,"temperature":35.7,"humidity":34.5}},
{"sensor_id":1,"timestamp":{"$date":1501602060000},
"reading":{"pressure":847.9,"temperature":35.8,"humidity":34.3}},
{"sensor_id":1,"timestamp":{"$date":1501602120000},
"reading":{"pressure":847.9,"temperature":35.5,"humidity":34.6}},
{"sensor_id":1,"timestamp":{"$date":1501602180000},
"reading":{"pressure":848.0,"temperature":35.3,"humidity":34.0}},
{"sensor_id":1,"timestamp":{"$date":1501602240000},
"reading":{"pressure":848.0,"temperature":35.4,"humidity":34.8}},
{"sensor_id":1,"timestamp":{"$date":1501602300000},
"reading":{"pressure":848.0,"temperature":35.3,"humidity":35.2}}]
Utilizing built-in features of Informix (native JSON support, the wire listener and time series) you can easily build a schema-less time series application using today’s standard programming practices.