Resampling time series or dataframes with Javascript / Node.js

I need to resample time series in Node.js, so I would like to know whether there is a JavaScript tool that works similarly to pandas in Python.

Let's say I have data that looks similar to this example:

[{
    "time": "28-09-2018 21:29:04",
    "value1": 1280,
    "value2": 800
},
{   
    "time": "28-09-2018 21:38:56",
    "value1": 600,
    "value2": 700
},
{
    "time": "29-09-2018 10:40:00",
    "value1": 1100,
    "value2": 300
},
{
    "time": "29-09-2018 23:50:48",
    "value1": 140,
    "value2": 300
}]

In Python I would put this data into a pandas dataframe and then resample it into a new dataframe with a different sample rate. In this example to daily data:

import pandas
df = pandas.DataFrame(...)
df_days = df.resample('1440min').apply({'value1':'sum', 'value2':'sum'}).fillna(0)

So my new data would look something like this:

[{
    "time": "28-09-2018 00:00:00",
    "value1": 1880,
    "value2": 1500
},
{   
    "time": "29-09-2018 00:00:00",
    "value1": 1240,
    "value2": 600
}]

What is generally the best way to do this in Node.js / JavaScript?

Mayest answered 30/9, 2018 at 8:59 Comment(4)
Possible duplicate of Python Pandas equivalent in JavaScript - Alpers
Have you found a way to solve this? - Podesta
Yes, I export the data as an excel sheet and then execute a python script, which loads the data and performs resampling on it. I can post the code if you like. Unfortunately I couldn't find a native nodejs solution. - Mayest
@baermathias I'm trying to do something similar, would you mind posting the code? - Embrasure

I don't think you need a node.js/JS library for this task. What you want to achieve can be done with a reduce function.

var a = [{
    "time": "28-09-2018 21:29:04",
    "value1": 1280,
    "value2": 800
},
{   
    "time": "28-09-2018 21:38:56",
    "value1": 600,
    "value2": 700
},
{
    "time": "29-09-2018 10:40:00",
    "value1": 1100,
    "value2": 300
},
{
    "time": "29-09-2018 23:50:48",
    "value1": 140
}];

var b = Object.values(a.reduce((container, current) => {
  var date = current['time'].substring(0, 10);
  if (!container[date])
    container[date] = {time: date + ' 00:00:00', value1: current['value1'] || 0, value2: current['value2'] || 0};
  else {
    container[date]['value1'] += current['value1'] || 0;
    container[date]['value2'] += current['value2'] || 0;
  }
  return container;
}, {}));

This function builds an object keyed by date and aggregates the values into it. You have to check whether the date already exists in that object. The || 0 guards against a property that is missing from an element, so the sum does not turn into NaN, and Object.values then extracts the values into an array. Since you gave the dates as strings, I treated them as strings; if they are Date objects, you have to adjust the part where date is declared.
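If the records already hold Date objects, the same idea could look roughly like the sketch below. The helper name resampleDaily and its fields parameter are made up for illustration; it assumes every time is a valid Date and groups by the local calendar day.

```javascript
// Hypothetical helper (not from any library): sum the given numeric fields
// per local calendar day. Assumes record.time is a valid Date object.
function resampleDaily(records, fields) {
  const pad = (n) => String(n).padStart(2, "0");
  const buckets = new Map(); // a Map keeps the buckets in insertion order
  for (const record of records) {
    const t = record.time;
    // Build a DD-MM-YYYY key from the local date parts.
    const key = `${pad(t.getDate())}-${pad(t.getMonth() + 1)}-${t.getFullYear()}`;
    if (!buckets.has(key)) {
      const fresh = { time: `${key} 00:00:00` };
      for (const field of fields) fresh[field] = 0;
      buckets.set(key, fresh);
    }
    const bucket = buckets.get(key);
    // A missing property counts as 0, like the || 0 above.
    for (const field of fields) bucket[field] += record[field] || 0;
  }
  return [...buckets.values()];
}

const rows = [
  { time: new Date(2018, 8, 28, 21, 29, 4), value1: 1280, value2: 800 },
  { time: new Date(2018, 8, 28, 21, 38, 56), value1: 600, value2: 700 },
  { time: new Date(2018, 8, 29, 10, 40, 0), value1: 1100, value2: 300 },
  { time: new Date(2018, 8, 29, 23, 50, 48), value1: 140 },
];
const daily = resampleDaily(rows, ["value1", "value2"]);
console.log(daily);
```

Passing the field names as a parameter avoids repeating a line per column, which is roughly what the dict passed to apply does in the pandas version.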

Side note: as always, you can reference a property in JS with ['value1'] or with .value1. I stuck to the bracket syntax, which looks more familiar coming from Python, since Python was mentioned.

Of course, this is just an example of daily resampling; for a larger or smaller interval you have to manipulate the dates. Let's say we want to emulate a 12-hour resample; you would write:

var resample = 12;
var b = Object.values(a.reduce((container, current) => {
  var date = new Date(current['time'].replace(/(\d+)-(\d+)-(\d+) (\d+):(\d+):(\d+)/, '$3-$2-$1T$4:$5:$6'));
  date.setHours(Math.floor(date.getHours() / resample) * resample);
  date.setMinutes(0);
  date.setSeconds(0);
  if (!container[date.toString()])
    container[date.toString()] = {time: date, value1: current['value1'] || 0, value2: current['value2'] || 0};
  else {
    container[date.toString()]['value1'] += current['value1'] || 0;
    container[date.toString()]['value2'] += current['value2'] || 0;
  }
  return container;
}, {}));

The regex replace is needed because the dates are not in ISO format. You could use a library such as moment for that, but I wanted to show that it can all be done with plain JS.
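As an alternative to rewriting the string with a regex, the parts can be split and passed to the Date constructor directly. This is only a sketch: parseDayFirst and floorToHours are hypothetical helper names, the input is assumed to always match DD-MM-YYYY HH:mm:ss, and the result is in local time just like the code above.

```javascript
// Hypothetical helper: parse "DD-MM-YYYY HH:mm:ss" into a Date in local time.
function parseDayFirst(text) {
  const [datePart, timePart] = text.split(" ");
  const [day, month, year] = datePart.split("-").map(Number);
  const [hours, minutes, seconds] = timePart.split(":").map(Number);
  // Months are zero-based in the Date constructor.
  return new Date(year, month - 1, day, hours, minutes, seconds);
}

// Hypothetical helper: floor a Date to the start of its N-hour bucket
// (local time). Returns a new Date and leaves the input untouched.
function floorToHours(date, hours) {
  const floored = new Date(date.getTime());
  // setHours also accepts minutes, seconds and milliseconds.
  floored.setHours(Math.floor(date.getHours() / hours) * hours, 0, 0, 0);
  return floored;
}

const parsed = parseDayFirst("28-09-2018 21:29:04");
const bucketStart = floorToHours(parsed, 12);
console.log(parsed.toString(), "->", bucketStart.toString());
```

bucketStart.getTime() could then serve as the container key in place of date.toString().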

Remember one thing when using JS dates: in the browser the timezone is the client's, and on a server it is the server's. If the time values carry no timezone, I don't think there should be problems, because everything is handled in the local timezone.
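If the result must not depend on where the code runs, one option is to do all the arithmetic in UTC. The sketch below assumes the timestamps are meant as UTC, which the question does not state; parseAsUtc and resampleUtc are hypothetical names. Buckets are aligned to the Unix epoch, so a 12-hour interval starts at 00:00 and 12:00 UTC.

```javascript
// Hypothetical helper: read "DD-MM-YYYY HH:mm:ss" as UTC, return epoch ms.
function parseAsUtc(text) {
  const [day, month, year, hours, minutes, seconds] =
    text.split(/[- :]/).map(Number);
  return Date.UTC(year, month - 1, day, hours, minutes, seconds);
}

// Hypothetical helper: sum `fields` into buckets of `hours` hours,
// aligned to the Unix epoch and independent of the machine's timezone.
function resampleUtc(records, hours, fields) {
  const width = hours * 60 * 60 * 1000; // bucket width in milliseconds
  const buckets = new Map();
  for (const record of records) {
    const start = Math.floor(parseAsUtc(record.time) / width) * width;
    if (!buckets.has(start)) {
      const fresh = { time: new Date(start).toISOString() };
      for (const field of fields) fresh[field] = 0;
      buckets.set(start, fresh);
    }
    const bucket = buckets.get(start);
    for (const field of fields) bucket[field] += record[field] || 0;
  }
  return [...buckets.values()];
}

const input = [
  { time: "28-09-2018 21:29:04", value1: 1280, value2: 800 },
  { time: "28-09-2018 21:38:56", value1: 600, value2: 700 },
  { time: "29-09-2018 10:40:00", value1: 1100, value2: 300 },
  { time: "29-09-2018 23:50:48", value1: 140, value2: 300 },
];
const resampled = resampleUtc(input, 12, ["value1", "value2"]);
console.log(resampled);
```

Unlike pandas resample, this only returns buckets that contain at least one record; empty intervals would have to be filled in separately.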

Gamin answered 22/8, 2022 at 21:44 Comment(0)

Simple approach

  1. A very simple Flask app that does the pandas processing for you
  2. A simple jQuery AJAX call to use it

HTML

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, minimum-scale=1.0, maximum-scale=1.0, user-scalable=no, viewport-fit=cover">
    <script src="https://code.jquery.com/jquery-3.5.1.min.js" integrity="sha256-9/aliU8dGd2tb6OSsuzixeV4y/faTqgFtohetphbbj0=" crossorigin="anonymous"></script>
</head>
<body>
    <main id="main">
        <section id="data-section">
            <h2>Data</h2>
            <div id="data"></div>
        </section>
    </main>
</body>
<script>
    function apicall(url, data) {
        $.ajax({
            type:"POST", url:url, data:{data:JSON.stringify(data)},
            success: (data) => { $("#data").text(JSON.stringify(data)); }
        });
    }
    data = [{"time": "28-09-2018 21:29:04","value1": 1280,"value2": 800},{"time": "28-09-2018 21:38:56","value1": 600,"value2": 700},{"time": "29-09-2018 10:40:00","value1": 1100,"value2": 300},
            {"time": "29-09-2018 23:50:48","value1": 140,"value2": 300}];
    window.onload = function () {
        apicall("/handle_data", data);
    }
</script>
</html>

Flask App

import pandas as pd, json
from flask import Flask, redirect, url_for, request, render_template, Response

app = Flask(__name__)

@app.route('/')
@app.route('/home')
def home():
    return render_template('home.html')

@app.route('/handle_data', methods=["POST"])
def handle_data():
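    # The form field "data" carries the JSON-encoded list of records.
    df = pd.DataFrame(json.loads(request.form.get("data")))
    # The timestamps are day-first (DD-MM-YYYY), so state the format explicitly.
    df["time"] = pd.to_datetime(df["time"], format="%d-%m-%Y %H:%M:%S")
    df.set_index("time", inplace=True)
    df = df.resample('1440min').apply({'value1':'sum', 'value2':'sum'}).fillna(0)
    # Note: orient="records" omits the index, so the bucket times are not returned.
    return Response(json.dumps(df.to_dict(orient="records")),
                    mimetype="application/json")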

if __name__ == '__main__':
    app.run(debug=True, port=3000)

Compute answered 24/8, 2020 at 6:7 Comment(3)
This is not an answer to the question. It may use javascript, but the question is for a JS library that is equivalent to pandas for aggregation. - Microbarograph
@SidKwakkel it's a full stack answer. It's so simple to build and deploy to any cloud stack (AWS EB, GCloud, Azure). Effectively heterogeneous micro services. Homogeneous solutions quite often land up more complicated. - Compute
Nice idea, this could even be a cloud function. - Duality
