Is Bigtable or Datastore more suited to storing and using financial data for online applications?

I'm creating a stock analysis web application. I want to store financial data for multiple stocks and then run a stock screener on them. The screener retrieves multiple stocks from the backend and performs a technical indicator test on each one. Stocks that pass the test are returned to the user. Let's say I want to store a pandas.DataFrame for exampleStock:

          open    high      low   close    volume
date                                                 
2017-08-01  247.46  247.50  246.716  247.32  55050401
2017-08-02  247.47  247.60  246.370  247.44  47211216
2017-08-03  247.31  247.34  246.640  246.96  40855997
2017-08-04  247.52  247.79  246.970  247.41  60191838
2017-08-07  247.49  247.87  247.370  247.87  31995021
....

I have been using Datastore. I create one entity per stock, with the stock's symbol as the key. I use a model like this:

from google.appengine.ext import ndb

class Stocks(ndb.Model):
    dates  = ndb.StringProperty(repeated=True)
    open   = ndb.FloatProperty(repeated=True)
    high   = ndb.FloatProperty(repeated=True)
    low    = ndb.FloatProperty(repeated=True)
    close  = ndb.FloatProperty(repeated=True)
    volume = ndb.FloatProperty(repeated=True)

Then I retrieve multiple entities and loop over them with the technical indicator check:

import numpy

# get_multi returns None for any key that has no stored entity.
listOfStocks = ndb.get_multi(list_of_keys)
for stock in listOfStocks:
    if stock is None:
        continue
    doIndicatorCheck(numpy.array(stock.close))
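
For illustration only, an indicator check of this kind might look like the sketch below. The function name `is_above_moving_average`, the `window` parameter, and the choice of a simple moving average are assumptions made for the example; the real doIndicatorCheck is not shown in this post. The function accepts any sequence of closing prices, including a NumPy array.

```python
def is_above_moving_average(closes, window=3):
    """Return True if the latest close is above the mean of the last `window` closes.

    Hypothetical stand-in for an indicator check; not the original doIndicatorCheck.
    """
    if len(closes) < window:
        # Not enough history to compute the average, so the stock does not pass.
        return False
    recent = closes[-window:]
    average = sum(recent) / float(window)
    return bool(closes[-1] > average)


# Closing prices taken from the sample DataFrame above.
closes = [247.32, 247.44, 246.96, 247.41, 247.87]
passed = is_above_moving_average(closes, window=3)
```

With the sample closes, the last close (247.87) is above the three-day average, so `passed` is True.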

I want to query for stocks, run the indicator check, and return the results to the user as fast as possible. Should I be using Bigtable for this, or is Datastore fine? If Datastore is fine, is this the ideal way to do it?

Thanks in advance.

Toile answered 7/9, 2018 at 7:11 Comment(1)
How much data do you think you'll have? Also, how many operations per second? How often will the data be updated? Spanner, Cloud SQL and BigQuery may be additional options depending on the answer. - Percolator

Disclosure: I am a product manager for Cloud Bigtable.

If you plan to have a large amount of financial data, covering the entire stock market, Cloud Bigtable is a good choice. It scales to terabytes and petabytes while returning low-latency responses, it is already in use in financial, risk, and anti-fraud applications, and it natively supports time series via its third dimension (the timestamp attached to each cell). See this blog post and video on how FIS used Cloud Bigtable for their bid on the SEC CAT project.
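
As a rough sketch of how daily price data could be laid out for Bigtable, one common pattern (an alternative to relying only on cell timestamps) is to put the symbol and the date in the row key. Bigtable keeps rows sorted lexicographically by key, so all rows for one symbol sit together in date order and can be read with a single range scan. The `SYMBOL#YYYY-MM-DD` format and the helper names `make_row_key` and `scan_bounds` are assumptions for illustration, not something prescribed by this answer, and the example only builds keys without calling any Bigtable client.

```python
from datetime import date
from typing import Tuple


def make_row_key(symbol: str, day: date) -> bytes:
    """Build a hypothetical row key of the form b'SYMBOL#YYYY-MM-DD'."""
    return f"{symbol.upper()}#{day.isoformat()}".encode("utf-8")


def scan_bounds(symbol: str, start: date, end: date) -> Tuple[bytes, bytes]:
    """Return inclusive (start_key, end_key) bounds for one symbol's date range."""
    return make_row_key(symbol, start), make_row_key(symbol, end)


# Sorting the keys as bytes mimics the lexicographic order Bigtable stores rows in.
keys = sorted(
    make_row_key(symbol, day)
    for symbol in ("SPY", "AAPL")
    for day in (date(2017, 8, 3), date(2017, 8, 1), date(2017, 8, 2))
)

start_key, end_key = scan_bounds("spy", date(2017, 8, 1), date(2017, 8, 7))
```

Because ISO dates sort chronologically as strings, the sorted keys group each symbol's rows together in date order, which is what makes a per-symbol range scan cheap.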

That said, Cloud Bigtable is strongly consistent within a single cluster but eventually consistent if you use replication, so you have to keep that in mind. If your users expect strong consistency, your options are:

  • use a single cluster instance (replication only within a single zone)
  • if you use cross-zone replication, route requests to a single cluster via application profiles
  • consider using a different system which provides strong consistency

Firestore will provide a serverless document database with strong consistency for your applications, so you should consider Firestore if that is important for your use case.

If you want to be able to run SQL queries on your data, consider:

Hope this helps!

Luxury answered 7/9, 2018 at 22:49 Comment(0)

As you know, Datastore is implemented using Bigtable, so you can expect them to perform similarly. In terms of use-case suitability, Datastore, which is going to be replaced by Firestore soon, is more suitable for storing user or user-session related data. Bigtable, on the other hand, is explicitly recommended for finance-related workloads.

There is a page specifically dedicated to selecting the most suitable storage option; you can use this page as a guide.

Diamante answered 7/9, 2018 at 13:27 Comment(3)
You cannot expect Datastore and Bigtable to perform similarly, because Datastore is on top of Megastore (to provide strong consistency), which is on top of Bigtable (which only provides strong consistency in a single cluster, but not across replicated clusters). As a result, they cannot be expected to perform similarly. - Luxury
Also, FYI, the page you're linking to talks about Firestore, not Firebase; those are different solutions, though I agree that they do sound similar. - Luxury
Mistyped the name. - Diamante
