I'm looking at using Titan to create a scalable geospatial data store (I'm thinking R-trees). The documentation describes a GeoShape query and says that Titan can handle geo data with Lucene or Elasticsearch. However, it seems like this would be very slow, because traversing nodes stored in Cassandra is essentially doing join queries in Cassandra, which is a really bad idea. I think I might be misunderstanding the data representation.
I read the Titan Data Model doc, and I still don't quite get it. If all of a vertex's edges are stored in a single Cassandra row, then Titan would still have to "join" against a vertex table to get the data for the vertices on the other end. One way to solve this would be to store the edge property data in the column value, so that the vertex data and the edge data are neatly packaged into the same row. However, this breaks down when you want to run queries that go more than one hop deep, and we're back to the joining problem again.
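To make my mental model concrete, here is a toy sketch in Python, with plain dicts standing in for Cassandra rows. The names (`rows`, `neighbors_within`) and the row layout are made up for illustration; this is how I imagine it working, not Titan's actual storage format or API.

```python
# Toy model: one "row" per vertex, keyed by vertex id.
# Each row packs the vertex's own properties together with its outgoing
# edges, and each edge carries its property data in the "column value".
rows = {
    "a": {"props": {"name": "A"}, "edges": {"b": {"weight": 1}}},
    "b": {"props": {"name": "B"}, "edges": {"c": {"weight": 2}}},
    "c": {"props": {"name": "C"}, "edges": {}},
}


def neighbors_within(rows, start, depth):
    """Walk `depth` hops out from `start`.

    Returns (set of vertex ids reached, number of row reads performed).
    """
    frontier = {start}
    seen = {start}
    reads = 0
    for _ in range(depth):
        next_frontier = set()
        for vertex_id in frontier:
            row = rows[vertex_id]  # one row lookup per vertex in the frontier
            reads += 1
            for neighbor_id in row["edges"]:
                if neighbor_id not in seen:
                    seen.add(neighbor_id)
                    next_frontier.add(neighbor_id)
        frontier = next_frontier
    return seen - {start}, reads


one_hop = neighbors_within(rows, "a", 1)
two_hops = neighbors_within(rows, "a", 2)
```

In this sketch a one-hop query needs a single row read, but every extra hop needs one more read per vertex in the frontier, which is the part that looks like a join to me.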
So: is Titan emulating join queries in Cassandra? And how well does it perform on geo lookups under these conditions?