How to use CQL queries to get different data types out of Cassandra with the Java client Hector

I'm new to Cassandra and Hector, and I'm trying to execute CQL queries. The problem is that not all columns are of type string, so how do I execute the query "select * from users"?

My column family looks like this:

UPDATE COLUMN FAMILY users
WITH comparator = UTF8Type
AND key_validation_class=UTF8Type
AND column_metadata = [
{column_name: full_name, validation_class: UTF8Type}
{column_name: email, validation_class: UTF8Type}
{column_name: state, validation_class: UTF8Type, index_type: KEYS}
{column_name: gender, validation_class: UTF8Type}
{column_name: birth_year, validation_class: LongType, index_type: KEYS}
{column_name: education, validation_class: UTF8Type}
];

I use the following code to execute the query:

CqlQuery<String, String, String> cqlQuery = new CqlQuery<String, String, String>(Keyspace,stringSerializer,stringSerializer,stringSerializer);

    cqlQuery.setQuery("select * from users");

    QueryResult<CqlRows<String, String, String>> result = cqlQuery.execute();


    if (result != null && result.get() != null) {
        List<Row<String, String, String>> list = result.get().getList();
        for (Row row : list) {
            System.out.println(".");
            List columns = row.getColumnSlice().getColumns();
            for (Iterator iterator = columns.iterator(); iterator.hasNext();) {
                HColumn column = (HColumn) iterator.next();
                System.out.print(column.getName() + ":" + column.getValue()
                        + "\t");
            }
            System.out.println("");
        }
    }

But because the 'birth_year' column has validation class LongType, I can't get its value. Assuming there is only one record, I get the following result:

KEY:Carl birth_year: 'strange chars?' full_name:Carl Smith gender:M education:electrician state:LA

If I change my query to this:

CqlQuery<String, String, Long> cqlQuery = new CqlQuery<String, String, Long>(
    TutorialBase.tutorialKeyspace, stringSerializer, stringSerializer, longSerializer);

    cqlQuery.setQuery("select birth_year from users");

Then it works.

So how can I do this with only one query and what if I have more datatypes like booleans and floats in the rows of a column family?

Superfecundation answered 11/10, 2011 at 21:19 Comment(0)

You specify the value type as String in the CqlRows, so every value is expected to be a string. Because you want to mix value types, keep the column metadata, but also set the default validation class to BytesType in the schema and use ByteBuffer as the value type in CqlRows:

QueryResult<CqlRows<String, String, ByteBuffer>> result = cqlQuery.execute();

Then, when processing the values, you will have to convert to the appropriate type, and instead of iterating through the columns, you will probably get the specific column by name:

ColumnSlice<String, ByteBuffer> slice = row.getColumnSlice();
HColumn<String,ByteBuffer> col = slice.getColumnByName("birth_year");
System.out.println(" birth_year: " + col.getValue().getLong());

Of course, Strings have to be handled differently, using java.nio.charset.Charset:

Charset.defaultCharset().decode(col.getValue()).toString()

You can determine types from the column metadata, but I've only done this via the Thrift API (see ColumnDef), so I'm not sure how to do it via the Hector API. HColumn does provide a getValueSerializer() method, though, so that could be a start.
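To address the "what about booleans and floats" part of the question, here is a minimal sketch of converting the raw ByteBuffer values by hand. It uses only java.nio, so nothing in it is Hector API; the class and method names (CassandraValueDecoder, toLong, and so on) are made up for illustration. It assumes the usual Cassandra encodings: LongType as 8 big-endian bytes, FloatType as 4 bytes, BooleanType as a single byte, and UTF8Type as UTF-8 text. It also assumes Java 7+ for StandardCharsets, and decodes strings as UTF-8 explicitly rather than relying on the platform default charset.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of Hector.
public class CassandraValueDecoder {

    // LongType: 8 bytes, big-endian (ByteBuffer's default byte order).
    // The absolute read leaves the buffer's position untouched.
    public static long toLong(ByteBuffer raw) {
        return raw.getLong(raw.position());
    }

    // FloatType: 4 bytes, big-endian.
    public static float toFloat(ByteBuffer raw) {
        return raw.getFloat(raw.position());
    }

    // BooleanType: a single byte, non-zero meaning true.
    public static boolean toBoolean(ByteBuffer raw) {
        return raw.get(raw.position()) != 0;
    }

    // UTF8Type: decode a duplicate so the caller's buffer is not consumed.
    public static String toUtf8(ByteBuffer raw) {
        return StandardCharsets.UTF_8.decode(raw.duplicate()).toString();
    }

    public static void main(String[] args) {
        // Stand-ins for what col.getValue() would return from a
        // CqlRows<String, String, ByteBuffer> result.
        ByteBuffer birthYear = ByteBuffer.allocate(8).putLong(0, 1975L);
        ByteBuffer fullName = ByteBuffer.wrap("Carl Smith".getBytes(StandardCharsets.UTF_8));
        ByteBuffer flag = ByteBuffer.wrap(new byte[] {1});
        ByteBuffer score = ByteBuffer.allocate(4).putFloat(0, 1.5f);

        System.out.println("birth_year: " + toLong(birthYear));
        System.out.println("full_name: " + toUtf8(fullName));
        System.out.println("flag: " + toBoolean(flag));
        System.out.println("score: " + toFloat(score));
    }
}
```

In real code you would pass col.getValue() to these helpers instead of the hand-built buffers. Hector's own serializers should be able to do the same job (for example LongSerializer.get().fromByteBuffer(...)), which avoids hand-rolling the conversions.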

Arvad answered 11/10, 2011 at 22:19 Comment(6)
Hi libjack, thanks for your reaction. Do you mean that it is only possible if all the columns in a column family have ByteBuffer as the default validation class? That's not exactly what I want, because then the validity check does not work when inserting data into Cassandra. It would be possible to insert a string in the column birth_year. I am trying your code, but the method 'getLong()' is not recognized. - Superfecundation
I found what was wrong with "col.getValue().getLong()": it should be "column.getValueBytes().getLong()". My previous question is solved. It is possible to have multiple validation classes in a column family. - Superfecundation
Right, getLong() is a method on ByteBuffer, so getValue() will only return a ByteBuffer if that is the type specified for HColumn. - Arvad
Hi libjack, I'm trying to determine types from the column metadata with ColumnDef. But do I need to ask for the columns through the Thrift API? And if yes, how do I do that? You said that you've done this; do you have an example? - Superfecundation
The Cluster#describeKeyspace method will return a KeyspaceDefinition with the Keyspace, ColumnFamily and metadata hierarchy fully populated. - Soup
You can also replace Charset.defaultCharset().decode(col.getValue()).toString() with ByteBufferUtil.string(col.getValue()) (as of Hector 1.0-2). - Gracielagracile
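Following up on the metadata idea in the comments above: once you have each column's validation class name (from ColumnDef or from the KeyspaceDefinition hierarchy), you could pick the conversion per column instead of hard-coding it. The sketch below is plain Java with no Hector calls. ValidatorDispatch and decode are hypothetical names, and the name-to-validator map is filled in by hand from the schema in the question; in a real program you would populate it from the metadata. It assumes validator names arrive either short ("LongType") or fully qualified ("org.apache.cassandra.db.marshal.LongType").

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class, not part of Hector or Cassandra.
public class ValidatorDispatch {

    // Decode one raw column value according to its validation class name.
    public static Object decode(String validationClass, ByteBuffer raw) {
        // Strip any package prefix so both name forms are accepted.
        String type = validationClass.substring(validationClass.lastIndexOf('.') + 1);
        if (type.equals("LongType")) {
            return raw.getLong(raw.position());
        } else if (type.equals("FloatType")) {
            return raw.getFloat(raw.position());
        } else if (type.equals("BooleanType")) {
            return raw.get(raw.position()) != 0;
        } else if (type.equals("UTF8Type")) {
            return StandardCharsets.UTF_8.decode(raw.duplicate()).toString();
        }
        // Unknown validator (for example BytesType): hand back the raw bytes.
        return raw.duplicate();
    }

    public static void main(String[] args) {
        // Assumption: filled by hand here; normally read from column metadata.
        Map<String, String> validators = new LinkedHashMap<String, String>();
        validators.put("full_name", "UTF8Type");
        validators.put("birth_year", "org.apache.cassandra.db.marshal.LongType");

        // Stand-ins for the name/value pairs of one result row.
        Map<String, ByteBuffer> row = new LinkedHashMap<String, ByteBuffer>();
        row.put("full_name", ByteBuffer.wrap("Carl Smith".getBytes(StandardCharsets.UTF_8)));
        row.put("birth_year", ByteBuffer.allocate(8).putLong(0, 1975L));

        for (Map.Entry<String, ByteBuffer> col : row.entrySet()) {
            Object value = decode(validators.get(col.getKey()), col.getValue());
            System.out.print(col.getKey() + ":" + value + "\t");
        }
        System.out.println();
    }
}
```

This keeps the per-column validation classes in the schema, so Cassandra still rejects a string written to birth_year, while a single "select * from users" can be read back with ByteBuffer as the value type.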
