Create an addEdge() Gremlin query that won't create duplicate edges in Titan

Is there a way to create a unique edge between two vertices in a Titan graph and guarantee that it can't be created again unless it is first deleted?

Basically I need to create:

vertex1--follows-->vertex2

But I keep creating multiple edges for the same relationship:

vertex1--follows-->vertex2
vertex1--follows-->vertex2
vertex1--follows-->vertex2
vertex1--follows-->vertex2

My basic addEdge query is this:

def follow(target)
  grem = "g.addEdge(
    g.V('id', '#{id}').next(),
    g.V('id', '#{target.id}').next(),
    'follows',
    [since:#{Time.now.year}]
  )"

  $graph.execute(grem).results
end

What I am trying to find is something like this:

def follow(target)
  grem = "g.addEdge(
    g.V('id', '#{id}').next(),
    g.V('id', '#{target.id}').next(),
    'follows',
    [since:#{Time.now.year}]
  ).unique(Direction.OUT)"

  $graph.execute(grem).results
end

In this document there is a method called unique, but I cannot seem to get this to work on edges, only properties of vertices.

https://github.com/thinkaurelius/titan/wiki/Type-Definition-Overview

I could run a query before calling addEdge to check for an existing edge, but that seems hacky and is open to a race condition.

Is there a method I can append to addEdge that prevents it from creating a duplicate when the edge already exists?

Or, is there a way to create a unique property label on an edge?

Here is a gremlin session of the issue:

gremlin>  g.makeType().name('follows').unique(IN).makeEdgeLabel();
==>v[36028797018964558]
gremlin> u = g.addVertex([name:'brett'])
==>v[120004]
gremlin> u2 = g.addVertex([name:'brettU'])
==>v[120008]
gremlin> e = g.addEdge(u, u2, 'follows')
==>e[2w5N-vdy-2F0LaTPQK2][120004-follows->120008]
gremlin> e = g.addEdge(u, u2, 'follows')
An edge with the given type already exists on the in-vertex
Display stack trace? [yN] 
gremlin> e = g.addEdge(u2, u, 'follows')
==>e[2w5P-vdC-2F0LaTPQK2][120008-follows->120004]
gremlin> u3 = g.addVertex([name:'brett3'])
==>v[120012]
gremlin> e = g.addEdge(u3, u, 'follows')
An edge with the given type already exists on the in-vertex
Display stack trace? [yN] N
gremlin> g.E
==>e[2w5N-vdy-2F0LaTPQK2][120004-follows->120008]
==>e[2w5P-vdC-2F0LaTPQK2][120008-follows->120004]

Setting unique(IN|BOTH|OUT) on the label limits each vertex to a single 'follows' edge in that direction. With unique(IN), as above, each user can have only one follower, which makes a user -> follows -> [users] relationship impossible.

Here is another example, this time trying to set a unique property on an edge. This also fails:

gremlin> g.makeType().name('follows_id').unique(BOTH).makeEdgeLabel();
==>v[36028797018964942]
gremlin>  u = g.addVertex([name:'brett'])
==>v[200004]
gremlin>  u2 = g.addVertex([name:'brett2'])
==>v[200008]
gremlin>  u3 = g.addVertex([name:'brett3'])
==>v[200012]
gremlin> e = g.addEdge(u, u2, 'follows', [follows_id:'200004-20008'])
Value must be a vertex
Display stack trace? [yN] N
gremlin> g.E
==>e[4c9z-Q1S-2F0LaTPQQu][200004-follows->200008]
gremlin> e = g.addEdge(u, u2, 'follows', [follows_id:'200004-20008'])
Value must be a vertex
Display stack trace? [yN] N
gremlin> g.E
==>e[4c9z-Q1S-2F0LaTPQQu][200004-follows->200008]
==>e[4c9B-Q1S-2F0LaTPQQu][200004-follows->200008]
Marks answered 12/9, 2013 at 23:22 Comment(0)
J
7

To close the loop here, this question was answered in the Aurelius Graphs Mailing List. Basically:

we don't really see a use case for a uniqueness constraint that applies to pairs of vertices (a la - only one edge can exist between vertex A and B) for these reasons:

  • most times, you can get rid of the duplication quite cheaply on the query side with a dedup(): v.out('follows').dedup().....
  • the likelihood of conflict is much lower (due to the N^2 combinations of vertices), which makes locks way too expensive compared to the likelihood of conflict.

In short, you should validate edge existence in your application as it cannot be enforced by Titan.
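
As a hedged illustration of the query-side approach quoted above, the Ruby sketch below builds a Gremlin script that lists each followed user once, even when duplicate 'follows' edges exist. The helper name followed_users_query is hypothetical, the 'id' lookup key is taken from the question, and the quoting is a minimal assumption rather than a complete sanitizer.

```ruby
# Hypothetical helper (not part of Titan or any client library): returns a
# Gremlin script listing each followed user once, even if duplicate
# 'follows' edges exist between the same pair of vertices.
def followed_users_query(uid)
  # Minimal escaping for a Groovy single-quoted string: prefix backslashes
  # and single quotes with a backslash.
  quoted = uid.to_s.gsub(/[\\']/) { |c| "\\#{c}" }
  "g.V('id', '#{quoted}').out('follows').dedup()"
end

puts followed_users_query('42')
# prints: g.V('id', '42').out('follows').dedup()
```

With the client used in the question, the resulting string would presumably be run as $graph.execute(followed_users_query(id)).results.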

Jaquez answered 16/9, 2013 at 11:13 Comment(1)
Idempotent updates are part and parcel of distributed systems. Dedup works... until you've modified an edge thousands of times (e.g. changed the weight), at which point the accumulated duplicate edges become a massive resource leak. - Melise
M
0

This checks for an existing edge in application code rather than relying on a DB constraint, and it solved the issue we were having.

grem = "
  if (g.V('uid', '#{id}').out('follows').has('id', g.V('uid', '#{target.id}').next().id).hasNext()) {
    println 'already connected'
  } else {
    g.addEdge(
      g.V('uid', '#{id}').next(),
      g.V('uid', '#{target.id}').next(),
      'follows',
      [since:(new java.util.Date()).getTime()]
    )
  }"
$graph.execute(grem).results
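
A possible variant of the same check-then-add idea, sketched under assumptions: it resolves each vertex once and has the script evaluate to true or false instead of printing. The names groovy_quote and follow_script are hypothetical, and whether the client returns the script's final value depends on the driver, so treat this as a sketch, not a tested Titan recipe.

```ruby
# Hypothetical helper: minimal escaping for a Groovy single-quoted string.
def groovy_quote(value)
  value.to_s.gsub(/[\\']/) { |c| "\\#{c}" }
end

# Hypothetical helper: builds a Gremlin script that adds a 'follows' edge
# only when the source vertex does not already point at the destination.
def follow_script(from_uid, to_uid)
  src = groovy_quote(from_uid)
  dst = groovy_quote(to_uid)
  <<~GROOVY
    src = g.V('uid', '#{src}').next()
    dst = g.V('uid', '#{dst}').next()
    if (src.out('follows').has('id', dst.id).hasNext()) {
      false
    } else {
      g.addEdge(src, dst, 'follows', [since:(new java.util.Date()).getTime()])
      true
    }
  GROOVY
end
```

With the client from the question this would presumably be run as $graph.execute(follow_script(id, target.id)).results. The check and the insert are still two separate steps, so two concurrent requests can both pass the check and each add an edge.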
Marks answered 16/9, 2013 at 18:13 Comment(0)
