Which is the suitable database for storing a large JSON? [closed]
Asked Answered
A

4

9

I have only one large JSON file. For example,

{   
    "Name": "Motor_M23",
    "AASID": {
        "IDType": "URI",
        "IDSpec": "http://acplt.org/AAS/Motor_M23"
    },
    "AssetID": {
        "IDType": "URI",
        "IDSpec": "http://acplt.org/Assets/Motor_M23"
    },
    "Header": {
        "PropertyValueStatementContainers": [
            {
                "Name": "Config",
                        .
                        .
                        .
                        .

I need to support operations following operations:

  1. Querying for an element should return all child elements e.g. Querying for AssetID should return

    "AssetID": {
        "IDType": "URI",
        "IDSpec": "http://acplt.org/Assets/Motor_M23"
    }
    
  2. Update value of elements.

  3. Delete elements.
  4. Move elements between hierarchy levels e.g. make AssetID child element of AASID.

I considered following approaches:

  1. Graph database : I started reading about Neo4J. However, it can not create graph from JSON intellgently. One has to specify node type and their hierarchy order.
  2. ElasticSearch : It can work by treating JSON as text and hence not efficient solution.
  3. Postgres : It supports querying over JSON objects but updating, deletions won't be efficient.

Is there any good database out there which can load data from large JSON and handle my operations?

Allman answered 4/9, 2018 at 11:44 Comment(3)
Document oriented nosql databases. Checkout MongoDB, it's one of the most populars and it has a lot of good documentation.Jurisconsult
@Jurisconsult I checked out Model Trees with Nested Sets. Does MongoDB support Tree creation automatically by just providing JSON? I couldn't find any way to do that.Allman
I used mongoDB with a java application using an ODM which provided the automatic creation. I do not really know if you can do it automatically, but I am pretty sure there are scripts out there that can do it for you.Jurisconsult
R
3

This is a typical architectural question to choose the right database, wherein you have to consider quite a few important aspects such as HA, resiliency, replication, sharding, tools support, maturity, licensing, backup & restore etc.

MongoDB and Couchbase DB are the two most popular and widely used document databases. There is no straight-forward answer to choose one, as you have to do trade-off analysis. I can share my two cents, hopefully this would help you in arriving at the right decision.

Either MongoDB or Couchbase NoSQL document databases can be considered, as json is the first class citizen in both and you get really good options to perform operations using fields.

  1. MongoDB (CP support out of CAP) prefers consistency over availability whereas couchbase (AP out of CAP) is high available database.
  2. MongoDB cluster works with master/slave architecture whereas couchbase cluster works with peer-to-peer distribution architecture.

There are many more dimensions to be considered and following links would take you in right direction.

https://suyati.com/blog/mongodb-vs-couchbase/

https://www.couchbase.com/comparing-couchbase-vs-mongodb

Since, in your particular case you have highlighted that you have only one large file, IMDG (in-memory data grid such as Apache Ignite) based solutions can also be considered with a single node set up.

Ratafia answered 17/9, 2018 at 5:7 Comment(1)
In my case, most of my DB data is not JSON, but I do have one table that stores large JSON objects with sizes around 10-15mb. For this specific table, my thoughts were to use a cloud service to store the files & zip them (JSON's compress very well) and then just reference the cloud store ID so I can reference / delete them etc. in my database, which is Postgres.Failing
B
2

If you are only working with JSON then you should really use a document oriented database as it will save you having to wrestle something sql related.

MongoDB is a good choice, supports many drivers and can deal with tree structures (Though I'm not sure about the automatic creation)

CRUD operations are simple and cover a wide range of cases.

For very large datasets on busy servers you should use the XFS file system and the WiredTiger storage engine as there are some gains in performance.

It's well supported, and isn't that much of a learning curve. (I came from Pure SQL without too much trouble)

You also have the option of MariaDB or MySQL which also both support JSON though I have no experience with either, and in the case of MySQL I feel it was just a 'bolt on' which had to be added in the face of an up-coming requirement.

Bainite answered 15/9, 2018 at 12:25 Comment(0)
E
0

Couchbase’ SQL-like N1QL does deep JSON traversal and manipulation. It can also index nested array elements and UNNEST arrays at query time. It will do what you are asking.

Equites answered 20/9, 2018 at 21:13 Comment(0)
A
-1

You can programmatically parse and save data in SQL tables depending on operation logic you need:

  1. Keep all children (with all contents) as text in 1 table, per top element.

Element | Children (text)

"AssetID" |  {
        "IDType": "URI",
        "IDSpec": "http://acplt.org/Assets/Motor_M23"
    },
  1. Keep elements in a table, and first level of children (with all contents) separately per top element.

ID | Element

Element ID | Child (text)

1 | "AssetID"

1 | "IDType": "URI"
1 | "IDSpec": "http://acplt.org/Assets/Motor_M23"
  1. Keep elements (names) in a table, id and parent id to parse on unlimited levels, value if no children. This way you can query all elements (at any level) and content, update as necessary. You can also parse from any element based on ids and parent ids.

ID | Element | Parent | Value

1  | "Name"   | 0 | "Motor_M23"
2  | "AASID"  | 0 | - 
3  | "IDType" | 2 | "URI",
4  | "IDSpec" | 2 | "http://acplt.org/AAS/Motor_M23"
5  | "AssetID"| 0 | -
6  | "IDType" | 5 | "URI",
7  | "IDSpec" | 5 | "http://acplt.org/Assets/Motor_M23"
8  | "Header" | 0 | -
9  | "PropertyValueStatementContainers" | 8 | -
10 | "Name"   | 9 | "Config"
Ailey answered 14/9, 2018 at 16:22 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.