I have been tasked to come up with a schema to store our application logs using Cassandra. I am quite new to Cassandra but from what I have read and learned so far, it could be the best approach for our use case.
Our application send thousands of SMS each day (provided by 3 local service providers) and we would love to keep a log each time an SMS is sent (for reconciliation purposes at each month's end among other things). We intend to store the information below:
id text, // uuid
phone_number text, // recipient of the SMS
message text, // Message sent
status boolean, // if the SMS was sent or not
response text, // Request response
service_provider text, // e.g Twilio, Telnyx, Venmo etc
date timestamp, // Time SMS is sent
We would like to query the following reports at any one time:
- Total number of SMS sent
- Total SMS sent for a given period of time (between 2 dates)
- Total SMS sent by a specific service provider (also within a given time period)
- Total SMS sent to a specific recipient phone number (also within a given time period)
- Total failed or successful SMS sent (also within a given period of time)
I have come up with the following tables (3) but I feel like I am over engineering or over thinking it? Perhaps it could be done simpler? I would appreciate any advice in getting this to work efficiently.
create table sms_logs_by_id
(
id text,
phone_number text,
message text,
status boolean,
response text,
provider text,
service_provider text,
date timestamp,
primary key (id, date)
) with clustering order by (date DESC);
create table sms_logs_by_service_provider
(
id text,
phone_number text,
message text,
status boolean,
response text,
provider text,
service_provider text,
date timestamp,
primary key (service_provider, date)
) with clustering order by (date DESC);
create table sms_logs_by_phone_number
(
id text,
phone_number text,
message text,
status boolean,
response text,
provider text,
service_provider text,
date timestamp,
primary key (phone_number, date)
) with clustering order by (date DESC);
create table sms_logs_by_status
(
id text,
phone_number text,
message text,
status boolean,
response text,
provider text,
service_provider text,
date timestamp,
primary key (status, date)
) with clustering order by (date DESC);
Queries run pretty well so far. I am not sure if this is the most optimum way of modelling the data. I would appreciate any advice on how I can improve this data model. Thank you!