A graph DB vs a Prolog (or miniKanren)

Asked 22/3, 2015 at 9:44 Answered 3/11, 2019 at 10:12

Solved neo4j prolog graph-databases minikanren

Recently I have been looking into graph databases like Neo4j and into logic programming in Prolog and miniKanren. From what I have learned so far, both allow specifying facts and relations between them, and also querying the resulting system for some selections. So, actually I cannot see much difference between them in that they both can be used to build a graph and query it, but using different syntax. However, they are presented as totally different kinds of software.

Except the technicality that databases maybe propose a more space-time effective storage technology, and except that tiny logic cores like miniKanren are simpler and embeddable, what is the actual difference between graph databases and logic programming languages, if they are both just a graph database + query API?

Bivens answered 22/3, 2015 at 9:44 Comment(2)

Prolog is a programming language, whereas a graph database is only a data base. Most things that you can do with Prolog are impossible to accomplish with just a database. For example, building a webserver like the one that powers the SWI-Prolog web site is possible with Prolog, but impossible with just a database engine. – Moise 22/3, 2015 at 17:23

add stored procedures and triggers to graph database, load some source code parsed into attributed AST (it's a graph natively), and you'll get you db execute or transform this AST – Triboluminescent 7/8, 2017 at 22:25

No, logic programming as embodied by those things and neo4j are quite different.

On one level, you're right that they conceptually both amount to graph storage and graph query. But for logic programming, it's only conceptually graph query, there's no guarantee that it's actually stored that way (where with neo4j, it is).

Second, with logic programming you're usually trying to establish horn clauses that allow you to reason through lots of data. You can think of a horn clause as a simple rule, like "If a person is male, and is the direct ancestor of a biological child, that implies that person is a father". In cypher with neo4j, you would describe a graph pattern you wish to match, that results in data, e.g.:

 MATCH (p:Person)-[:father*]->(maleAncestor:Person)
 RETURN maleAncestor

This tells to traverse the graph by father relationships, and return male ancestors. In a logic programming language, you wouldn't do it this way. You might specify that a being a father of b means that a is male, and a is an ancestor of b. This would implicitly and transitively state that for all valid a/b pairings. Then you'd ask a question, "who are the male ancestors"? The programming environment would then answer that by exploiting your rules. That would have the effect of building a traversal through the data that's very similar to the cypher I specified above, but the way you go about understanding your data and building that traversal is totally different.

Logic programming languages usually work via predicate resolution. A graph query language like cypher works by a combination of pattern matching, and explicit path designation. They're very different.

Caa answered 22/3, 2015 at 13:42 Comment(3)

great answer... 2 things occurred to me: (1) demorgan rules transform disjunction to implication, so any set of horn clauses should be isomorphic to some directed graph where all the arrows are implication (this could fit in a neo db). (2) interpreting a neo database's meta graph yields a relation among node labels, relationship types and properties that could easily be represented in formal logic and processed by a reasoner. The point I'm making is just that, while distinct, graph queries & logic programming are compatible and potentially complementary – Conclave 22/10, 2015 at 13:32

Agreed, they are compatible, and complementary in the sense that implementing a reasoner over a constrained schema neo4j graph would be relatively straightforward. But most people, upon choosing one approach or the other, would be implicitly choosing a particular model of thinking about code/data. They could touch, but they don't usually. – Caa 22/10, 2015 at 17:9

for Prolog's N-ary predicates look on hypergraphs -- they looks much more native representations – Triboluminescent 7/8, 2017 at 22:20

There are some very cool parallels between a graph database and logic programming. It's pretty clever that you associated the two.

However, while they are abstractly able to describe the same datasets, prolog usually operates on small datasets and performs its scans in-memory. It's not a database and certainly not suited to scale with many/most of the real-time constraints that databases run across – namely a large volume of database writes.

A database like Datomic uses a subset of prolog (Datalog) as its query language, and might be slightly more compatible with your idea, but even that is far from a Labeled Property Graph (LPG) like neo4j. A big difference is that "labeled edges" between nodes, with properties, are not (as far as I can tell) a first-order concept in anything but an LPG. Although you can describe these edges or realtionships between nodes with eg. a join-table to create many-to-many relationships, they are much more fluid in something like neo4j.

Dosser answered 26/6, 2019 at 21:21 Comment(0)

From a computational point of view there is a big difference between both models. Prolog is Turing complete, which means that in principle any program in any other language can be translated into Prolog.

However, Neo4j query language *Cypher, and in general most database query languages, are not Turing complete, and therefore not suitable for representing any general program. This has pros and cons. The main con is that you usually need to combine the power on Neo4j with an external program in Python or in any other language to produce a useful application. The main pro is that all the queries in Cypher are terminating (although they might take very long time to finish) a very nice property for database queries; when you query the database you always expect an answer.

This does not happen in Prolog. A simple program like

 p(X):-p(X).

and a goal like p(a) leads to a non-terminating computation. This is the price you have to pay for having all the power of a Turing complete language.

If you want to look at another related paradigm, which is somehow between Prolog and Neo4j, have a look to deductive databases, like Datalog. The syntax of Datalog is similar to prolog (in fact it is a subset) but, analogously to database query languages, goals/queries in Datalog are always terminating.

For instance, the previous program in Datalog

  p(X):-p(X).

with the same goal p(a), produce the empy set {} of answers readily, instead of looping infinitely as in Prolog.

Plumbiferous answered 3/11, 2019 at 10:12 Comment(0)

Recommended topics

Hot tags