What factors to consider when choosing a Multi-model DBMS? (OrientDB vs ArangoDB)

I want to dip my toes into the world of multi-model DBMSs. I have no particular use case; I just want to start learning.

I find that there are two prominent ones, OrientDB and ArangoDB, but I was unable to find any meaningful, unopinionated comparison between them. Can someone shed some light on the difference in features between the two, and any caveats in using one over the other? If I learn one, would I be able to easily transition to the other?

(I tagged FoundationDB as well, but it is proprietary and I probably won't consider it)

This question asks for a general comparison between OrientDB and ArangoDB for someone looking to learn about multi-model DBMSs, not an opinionated answer about which is better.

Jermyn answered 17/2, 2015 at 2:58 Comment(5)
I'm one of the developers of ArangoDB, so I cannot give an unbiased answer. The exception is the last question, about "transition". Unfortunately there is no common query language like SQL in the NoSQL world. Gremlin is a move towards a common QL for graph databases, but IMHO there are still a lot of open issues. With Datastax buying Titan, I'm not sure what will happen to Tinkerpop3. Therefore you would need to learn a completely different query language for true multi-model. Traversals are written in Java (Orient) or JavaScript (ArangoDB), again a different language. - Hoof
I'm the founder of OrientDB (so it's biased) and I can say that OrientDB is a multi-model DBMS at the engine level, while ArangoDB and FoundationDB just implemented layers on top of it. It's like using Hibernate on top of Oracle thinking that you have an ODBMS. My 0,02. - Hulburt
I'm the other founder of ArangoDB and therefore also biased. Luca, I fear you got this one wrong: ArangoDB is designed and built as a multi-model DB from its inception. All three data models, together with complete support in the API and query language, are implemented in the DB engine as first class citizens and with high-performance C++ code. It is untrue that graphs and key/value are only implemented as layers on top of the document store. - Unclog
Hey weinberger, a GraphDB is defined as "index-free adjacency". ArangoDB uses a Hash Index to cross relationships, so it's not "index-free", sorry. It's rather a JOIN. It's like traversing relationships with a RDBMS. - Hulburt
Who made up this definition? This is more marketing than anything else. When we perform a traversal, it's not a JOIN. What you wrote is not true. j.mp/1EOt8gk - Unclog

Disclaimer: I would no longer recommend OrientDB, see my comments below.


I can provide a slightly less biased opinion, having used both ArangoDB and OrientDB. It's still biased, as I'm the author of oriento, OrientDB's node.js driver, but I don't have a vested interest in either company or product; I've just necessarily used OrientDB more.

ArangoDB and OrientDB are both targeting a similar market and have a lot of similarities:

  1. Both are multi-model, you can use them to store documents, graphs and simple key / values.
  2. Both have support for Gremlin, but it's firmly a second class citizen compared to their own preferred query languages.
  3. Both support server-side "stored procedures" in JavaScript. In both systems this comes via a slightly less than idiomatic JavaScript API, although ArangoDB's is a lot better. This is getting fixed in a forthcoming version of OrientDB.
  4. Both offer REST APIs, both aim to be usable as an "API Server" via JavaScript request handlers. This is a lot more practical in ArangoDB than OrientDB.
  5. Both are distributed under a permissive license.
  6. Both are ACID and have transaction support, but in both the transactions are server-side operations - they're more like atomic batches of commands rather than the kinds of transactions you might be used to in a traditional RDBMS.
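
To make the "atomic batch of commands" idea in point 6 concrete, here is a minimal Python sketch. It is not either database's API: `run_batch`, the `accounts` store and the two command functions are hypothetical names invented for illustration. The whole batch is submitted at once and either every command applies or none does; there is no client-side session in which you interleave application logic between statements.

```python
import copy

def run_batch(store, commands):
    """Apply every command to a working copy; commit only if all succeed.

    Hypothetical helper: it stands in for a server that receives the whole
    batch at once and applies it atomically.
    """
    working = copy.deepcopy(store)
    for command in commands:
        command(working)  # any exception aborts before the commit below
    store.clear()
    store.update(working)  # commit: replace the live data in one step

accounts = {"alice": 100, "bob": 0}

def credit_bob(db):
    db["bob"] += 150

def debit_alice(db):
    if db["alice"] < 150:
        raise ValueError("insufficient funds")
    db["alice"] -= 150

try:
    run_batch(accounts, [credit_bob, debit_alice])
except ValueError:
    pass  # the batch was rejected as a whole

print(accounts)  # bob's credit was not kept, because the debit failed
```

The point of the sketch is only the all-or-nothing shape of the operation; real servers achieve it very differently.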

However, there are a lot of differences:

  1. ArangoDB has no concept of "links", which are a very useful feature in OrientDB. They allow unidirectional relationships (just like a hyperlink on the web), without the overhead of edges.
  2. ArangoDB is written in C++ (and JavaScript), whereas OrientDB is written in Java. Both have their advantages:
    • Being written in C++ means ArangoDB uses V8, the same high performance JavaScript engine that powers node.js and Google Chrome, whereas being written in Java means OrientDB uses Nashorn, which is still fast but not the fastest. This means that ArangoDB can offer a greater level of compatibility with the node.js ecosystem compared to OrientDB.
    • Being written in Java means that OrientDB runs on more platforms, including e.g. Raspberry PI. It also means that OrientDB can leverage a lot of other technologies written in Java, e.g. OrientDB has superb full text / geospatial search support via Lucene, which is not available to ArangoDB.
  3. OrientDB uses a dialect of SQL as its query language, whereas ArangoDB uses its own custom language called AQL. In theory, AQL is better because it's designed explicitly for the problem. In practise, though, it feels quite similar to SQL but with different keywords, and it is yet another language to learn, while OrientDB's implementation feels a lot more comfortable if you're used to SQL. SQL reads declaratively, whereas AQL's FOR / FILTER / RETURN style reads more imperatively - YMMV here.
  4. ArangoDB is a "mostly-memory" database; it works best when most of your data fits in RAM. This may or may not be suitable for your needs. OrientDB doesn't have this restriction (but also loves RAM).
  5. OrientDB is fully object oriented - it supports classes with properties and inheritance. This is exceptionally useful because it means that your database structure can map 1-1 to your application structure, with no need for ugly hacks like ActiveRecord. ArangoDB supports something fairly similar via models in Foxx, but it's more like an optional addon rather than a core part of how the database works.
  6. ArangoDB offers a lot of flexibility via Foxx, but it has not been designed by people with strong server-side JS backgrounds and reinvents the wheel a lot of the time. Rather than leveraging frameworks like express for their request handling, they created their own clone of Sinatra, which of course makes it almost the same as express (express is also a Sinatra clone), but subtly different, and means that none of express's middleware or plugins can be reused. Similarly, they embed V8, but not libuv, which means they do not offer the same non blocking APIs as node.js and therefore users cannot be sure about whether a given npm module will work there. This means that non trivial applications cannot use ArangoDB as a replacement for the backend, which negates a lot of the potential usefulness of Foxx.
  7. OrientDB supports first class property level and database level indices. You can query and insert into specific indexes directly for maximum efficiency. I've not seen support for this in ArangoDB.
  8. OrientDB is the more established option, with many high profile users. ArangoDB is newer, less well known, but growing fast.
  9. ArangoDB's documentation is excellent, and they offer official drivers for many different programming languages. OrientDB's documentation is not quite as good, and while there are drivers for most platforms, they're community powered and therefore not always kept up to date with bleeding edge OrientDB features.
  10. If you're using Java (or a Java bridge), you can embed OrientDB directly within your application, as a library. This use case is not possible in ArangoDB.
  11. OrientDB has the concept of users and roles, as well as Record Level Security. This may be a killer feature for you; it is for me. It also supports token based authentication, so it's possible to use OrientDB as your primary means of authorizing/authenticating users. OrientDB also has LDAP integration. In contrast, ArangoDB supports only a very simple auth option.
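
The link versus edge distinction from point 1 can be sketched with plain Python data structures. This is a conceptual model only, not either database's storage format or API: the `records` and `edges` containers and both helper functions are hypothetical, and the `#12:0`-style strings merely imitate OrientDB record IDs. A link is a target ID stored in a property of the source record, so it can only be followed one way; an edge is a record of its own that names both endpoints, so it can be walked from either side at the cost of an extra record.

```python
# Hypothetical in-memory records keyed by record ID.
records = {
    "#12:0": {"name": "Alice", "employer": "#13:0"},  # link: target ID kept in a property
    "#12:1": {"name": "Bob"},
    "#13:0": {"name": "Acme"},
}

# Edges are separate records, each carrying both endpoints and a label.
edges = [
    {"from": "#12:0", "to": "#12:1", "label": "knows"},
]

def follow_link(record_id, prop):
    """Follow a unidirectional link: one lookup, no edge record involved."""
    target_id = records[record_id][prop]
    return records[target_id]

def neighbours(record_id, label):
    """Walk edges in both directions from a record."""
    outgoing = [records[e["to"]] for e in edges
                if e["from"] == record_id and e["label"] == label]
    incoming = [records[e["from"]] for e in edges
                if e["to"] == record_id and e["label"] == label]
    return outgoing + incoming

print(follow_link("#12:0", "employer")["name"])
print([r["name"] for r in neighbours("#12:1", "knows")])
```

Note that nothing on the Acme record points back to Alice: finding Acme's employees through the link alone would mean scanning every record, which is the trade-off for skipping the edge.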

Both systems have their own advantages, so choosing between them comes down to your own situation:

  • If you're building a small application, and you're a web developer optimizing for developer productivity, it will probably be easier to get up and running quickly with ArangoDB.

  • If you're building a larger application, which could potentially store many gigabytes or terabytes of data, or have many thousands of concurrent users, or have "enterprise" use cases, or need fine grained security controls, OrientDB is the one for you.

  • If you're storing RDF or similarly structured linked data, choose OrientDB.

  • If you're using Java, just choose OrientDB.

Note: This is (my opinion of) the state of play today, things change quickly and I would not underestimate the ruthless efficiency of the awesome team behind ArangoDB, I just think that it's not quite there yet :)


Charles Pick (codemix.com)

Profiteer answered 19/2, 2015 at 21:58 Comment(5)
Unfortunately I cannot delete this answer, but I would like to retract my recommendation that you'd use OrientDB. Since writing this answer we encountered some really major, fundamental problems with OrientDB that forced us to migrate to another system. - Profiteer
Why? That's a big statement to drop at the end after all of that. We need some more info. - Wiliness
I can't give a lot of details because of NDA, but we faced a long string of problems which were a direct result of how OrientDB is developed, rather than anything being wrong with it conceptually. To put it simply, it is not developed to the high standards that you'd expect from a database vendor, and while it claims a lot of features, a lot of those features are half baked or just don't work. We spent a lot of time and effort working on OrientDB for our clients, and in the end it disappointed all of them. - Profiteer
@Profiteer You can't delete, but you can edit and add some kind of "disclaimer" at the top. Your comment is easy to overlook, but what you wrote is important for people who are going to make decisions about choosing a database vendor. - Approximation
This post is just over a year old now, and (quite apart from your own perspective shifting) since these are both fast-moving products I would imagine the situation has moved on a little from where things stood when this excellent summary was first written. Are you, or perhaps anyone else, in a position to edit the post and update with more current information? Has ArangoDB addressed any of its highlighted shortcomings (e.g. improved auth, better performance when data is not in RAM, or an asynchronous API)? Has OrientDB, whether in terms of features, stability or development approach? - Sublett
