We're currently using Hazelcast (http://hazelcast.org/) as a distributed in-memory data grid. That's been working sort-of-well for us, but a purely in-memory approach has run its course for our use case, and we're considering porting our application to a persistent NoSQL store. After the usual comparisons and evaluations, we're close to picking Cassandra, plus eventually Spark for analytics.
Nonetheless, there is a gap in our architectural needs that we still haven't figured out how to solve in Cassandra (with or without Spark): Hazelcast allows us to create a Continuous Query such that, whenever a row is added to, removed from, or modified within the clause's result set, Hazelcast calls us back with the corresponding notification. We use this to continuously update the clients via AJAX streaming with the new/changed rows.
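To make the behaviour we rely on concrete, here is a minimal sketch in plain Python, with no Hazelcast or Cassandra dependency. It only models the semantics described above. The names (`ContinuousQuery`, `put`, `remove`, `in_box`) and the bounding-box coordinates are made up for illustration and are not Hazelcast's API.

```python
from typing import Callable, Dict, List, Optional, Tuple

Row = Dict[str, float]
Event = Tuple[str, str, Row]  # (kind, key, row)


class ContinuousQuery:
    """Toy in-memory model (hypothetical, not a real library class).

    Calls `listener` whenever a row enters, changes within, or leaves
    the result set defined by `predicate`.
    """

    def __init__(self, predicate: Callable[[Row], bool],
                 listener: Callable[[Event], None]) -> None:
        self._predicate = predicate
        self._listener = listener
        self._rows: Dict[str, Row] = {}

    def put(self, key: str, row: Row) -> None:
        old: Optional[Row] = self._rows.get(key)
        was_in = old is not None and self._predicate(old)
        is_in = self._predicate(row)
        self._rows[key] = row
        if is_in and not was_in:
            self._listener(("added", key, row))
        elif is_in and was_in:
            self._listener(("updated", key, row))
        elif was_in and not is_in:
            self._listener(("removed", key, row))
        # Rows that were outside and stay outside produce no event.

    def remove(self, key: str) -> None:
        old = self._rows.pop(key, None)
        if old is not None and self._predicate(old):
            self._listener(("removed", key, old))


def in_box(row: Row) -> bool:
    # Illustrative bounding box only; the numbers are an assumption.
    return 32.0 <= row["lat"] <= 42.0 and -125.0 <= row["lon"] <= -117.0


events: List[Event] = []
cq = ContinuousQuery(in_box, events.append)

cq.put("ship-1", {"lat": 36.6, "lon": -122.0})  # enters the result set
cq.put("ship-1", {"lat": 36.7, "lon": -122.1})  # changes inside it
cq.put("ship-2", {"lat": 10.0, "lon": -150.0})  # never matches: no event
cq.put("ship-1", {"lat": 50.0, "lon": -130.0})  # leaves the result set
cq.remove("ship-2")                             # was never in it: no event

for kind, key, row in events:
    print(kind, key, row)
```

The point is that the store pushes only the deltas for the rows matching the clause, so the streamer never has to rescan anything. That push is the piece we can't find an equivalent for.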
We're probably making a conceptual mismatch here, so: how can we best address this use case in Cassandra (with or without Spark's help)? Is there something in the API that allows for Continuous Queries on key/clause changes (we haven't found it)? Is there some other way to get a stream of key/clause updates? Events of some sort?
I'm aware that we could, if need be, poll Cassandra periodically, but in our use case the client is potentially interested in a large number of table clause notifications (think "all changes to Ship positions on California's coastline"), and repeatedly iterating over the store would kill the streamer's scalability.
Hence, the magic question: what are we missing? Is Cassandra the wrong tool for the job? Is there a particular part of the API, or an external library inside or outside the Apache realm, that we're not aware of and that would allow for this?
Many thanks for any assistance!
Hugo