I have been reading a lot about Apache Avro lately, and I am leaning towards using it instead of JSON. Currently, we serialize a JSON document using Jackson and then write that serialized JSON document into Cassandra for each row key/user id. We then have a REST service that reads the whole JSON document using the row key, deserializes it, and uses it further.
We write into Cassandra like this:
user-id column-name serialized-json-document-value
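To make that layout concrete, here is a minimal in-memory sketch of it, using nested maps as a stand-in for Cassandra. The class name `RowLayoutSketch`, its methods, and the sample keys are made up for illustration and are not our actual code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for the layout described above:
// user id (row key) -> column name -> serialized JSON document.
class RowLayoutSketch {
    private final Map<String, Map<String, String>> rows = new HashMap<>();

    // Store one serialized document under (userId, columnName).
    void write(String userId, String columnName, String serializedJson) {
        rows.computeIfAbsent(userId, k -> new HashMap<>()).put(columnName, serializedJson);
    }

    // Return the serialized document, or null if the row or column is absent.
    String read(String userId, String columnName) {
        Map<String, String> columns = rows.get(userId);
        return columns == null ? null : columns.get(columnName);
    }

    public static void main(String[] args) {
        RowLayoutSketch store = new RowLayoutSketch();
        // "user-1" and "column-a" are placeholder keys, not real data.
        store.write("user-1", "column-a", "{\"lmd\":1379214255197}");
        System.out.println(store.read("user-1", "column-a"));
    }
}
```

The REST service side corresponds to `read`: it looks up the document by row key and column name and then deserializes the returned string.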
Below is an example of the JSON document that we write into Cassandra. This JSON document is for a particular row key/user id.
{
  "lv" : [ {
    "v" : {
      "site-id" : 0,
      "categories" : {
        "321" : {
          "price_score" : "0.2",
          "confidence_score" : "0.5"
        },
        "123" : {
          "price_score" : "0.4",
          "confidence_score" : "0.2"
        }
      },
      "price-score" : 0.5,
      "confidence-score" : 0.2
    }
  } ],
  "lmd" : 1379214255197
}
Now we are thinking of using Apache Avro so that we can compact this JSON document by serializing it with Apache Avro and then storing it in Cassandra. I have a couple of questions about this:
- Is it possible to serialize the above JSON document using Apache Avro and then write it into Cassandra? If yes, how can I do that? Can anyone provide a simple example?
- We also need to deserialize it when reading it back from Cassandra in our REST service. Is this possible as well?
Below is my simple code, which serializes the JSON document and prints it to the console.
public static void main(String[] args) {
    final long lmd = System.currentTimeMillis();

    Map<String, Object> props = new HashMap<String, Object>();
    props.put("site-id", 0);
    props.put("price-score", 0.5);
    props.put("confidence-score", 0.2);

    Map<String, Category> categories = new HashMap<String, Category>();
    categories.put("123", new Category("0.4", "0.2"));
    categories.put("321", new Category("0.2", "0.5"));
    props.put("categories", categories);

    AttributeValue av = new AttributeValue();
    av.setProperties(props);

    Attribute attr = new Attribute();
    attr.instantiateNewListValue();
    attr.getListValue().add(av);
    attr.setLastModifiedDate(lmd);

    // serialize it
    try {
        String jsonStr = JsonMapperFactory.get().writeValueAsString(attr);
        // then write into Cassandra
        System.out.println(jsonStr);
    } catch (JsonGenerationException e) {
        e.printStackTrace();
    } catch (JsonMappingException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }
}
The serialized JSON document will look something like this:
{"lv":[{"v":{"site-id":0,"categories":{"321":{"price_score":"0.2","confidence_score":"0.5"},"123":{"price_score":"0.4","confidence_score":"0.2"}},"price-score":0.5,"confidence-score":0.2}}],"lmd":1379214255197}
The AttributeValue and Attribute classes use Jackson annotations.
One more important note: the properties inside the above JSON document change depending on the column name. We have different properties for different column names. Some column names will have two properties, some will have five. So the above JSON document will have the correct properties and values according to the metadata that we have.
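To illustrate what I mean by properties varying per column, here is a rough sketch. The column names (`column-a`, `column-b`), the class `ColumnMetadataSketch`, and the lookup method are all hypothetical and only stand in for our real metadata; the property names are taken from the example document above.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration: each column name maps to the property names
// that its JSON document is expected to carry.
class ColumnMetadataSketch {
    private static final Map<String, List<String>> METADATA = new HashMap<>();
    static {
        // Made-up columns; the real metadata lives elsewhere.
        METADATA.put("column-a", Arrays.asList("site-id", "price-score"));
        METADATA.put("column-b", Arrays.asList(
                "site-id", "categories", "price-score", "confidence-score"));
    }

    // Keep only the properties that the metadata lists for this column.
    static Map<String, Object> propertiesFor(String columnName, Map<String, Object> candidates) {
        List<String> names = METADATA.get(columnName);
        if (names == null) {
            throw new IllegalArgumentException("Unknown column: " + columnName);
        }
        Map<String, Object> props = new LinkedHashMap<>();
        for (String name : names) {
            if (candidates.containsKey(name)) {
                props.put(name, candidates.get(name));
            }
        }
        return props;
    }

    public static void main(String[] args) {
        Map<String, Object> all = new HashMap<>();
        all.put("site-id", 0);
        all.put("price-score", 0.5);
        all.put("confidence-score", 0.2);
        System.out.println(propertiesFor("column-a", all).keySet());
    }
}
```

The point is only that the set of keys under "v" is not fixed, so whatever Avro schema we end up with has to cope with a different set of properties per column.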
I hope the question is clear enough. Can anyone provide a simple example of how I can achieve this using Apache Avro? I am just starting with Apache Avro, so I am having a lot of problems.