When you create an index for a column, the values in the column are used literally as the keys of the index. In your case, the keys for your bitrates
index would be the objects within the bitrates
array in the document.
It seems like what you want is an index that's derived from the values in a field of the document. To do that, you want to define a custom indexing function that reduces the document to just the data you care about. The easiest way to experiment with them is to start by writing a query, and once you're happy with the results, converting it into an indexCreate()
statement.
Here's a statement that grabs your sample document (with id 1), and plucks the format
and rate
terms from all of the objects in its bitrate
array, and then merges them together to create a distinct set of strings:
r.db('music').table('catalog').get(1).do(function(row) {
return row('bitrates').map(function(bitrate) {
return [bitrate('format'), bitrate('rate')];
}).reduce(function(left, right) {
return left.setUnion(right);
})
})
Running this statement will return the following:
["mp3", "128K", "aac", "192K"]
This looks good, so we can use our function to create an index. In this case, since we're expecting the indexing function to return a set of items, we also want to specify {multi: true}
to ensure we can query by the items in the set, not the set itself:
r.db('music').table('catalog').indexCreate('bitrates', function(row) {
return row('bitrates').map(function(bitrate) {
return [bitrate('format'), bitrate('rate')];
}).reduce(function(left, right) {
return left.setUnion(right);
})
}, {multi: true})
Once created, you can query your index like this:
r.db('music').table('catalog').getAll('mp3', {index: 'bitrates'})
You can also supply multiple query terms, to match rows that match any of the items:
r.db('music').table('catalog').getAll('mp3', '128K', {index: 'bitrates'})
However, if a single document matches more than one term in your query, it will be returned more than once. To fix this, add distinct()
:
r.db('music').table('catalog').getAll('mp3', '128K', {index: 'bitrates'}).distinct()
If necessary, you might also consider using downcase()
to normalize the casing of the terms used in the secondary index.
You could also skip all of the indexing business entirely and use a filter()
query:
r.db('music').table('catalog').filter(function(row) {
return row('bitrates').map(function(bitrates) {
return [bitrates('format'), bitrates('rate')];
}).reduce(function(left, right) {
return left.setUnion(right);
}).contains('mp3');
})
That said, if you're almost always querying your table in the same manner, generating a secondary index using a custom function will result in dramatically better performance.
{multi:true}
. This is more of an exploratory thing I'm doing - either way your answer is stellar thank you :) – Ensconce