What is the CouchDB equivalent of the SQL COUNT(*) aggregate function?

Yep, I'm a SQL jockey (sorta) coming into the CouchDB Map/Reduce world. I thought I had figured out the equivalent of the SQL COUNT(*) aggregate function for CouchDB datasets with the following:

Map:

function(doc) {
  emit(doc.name, doc);
}

Reduce:

function(keys, values, rereduce){
  return values.length;
}

Which I thought worked, returning something like:

"super fun C"   2
"super fun D"   2
"super fun E"   2
"super fun F"   18

... but not really. When I add a record, this count varies wildly. Sometimes the count actually decreases, which was very surprising. Am I doing something wrong? Maybe I don't fully understand the concept of eventual consistency?

Wrath answered 19/10, 2009 at 1:27 Comment(0)

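It looks like your reduce results are being re-reduced. That is, reduce can be called more than once for each key, each time on a subset of the emitted rows, and then called again with those partial results as its values. On that second pass, values.length counts the partial results rather than the documents, which is why the totals jump around. You can handle that with a reduce function like this: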

function(keys, values, rereduce) {
  if (rereduce) {
    return sum(values);
  } else {
    return values.length;
  }
}
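
To make the effect visible outside CouchDB, here is a minimal sketch that simulates the two passes in plain JavaScript. The `simulateView` driver and its fixed chunk size are hypothetical simplifications (CouchDB picks chunk boundaries from its B-tree layout, passes real keys on the first pass, and may skip the rereduce pass entirely), and `sum` is a stand-in for the helper the view server provides.

```javascript
// Stand-in for the sum() helper available inside CouchDB view functions.
function sum(values) {
  return values.reduce((total, n) => total + n, 0);
}

// The reduce from the question: only correct when it sees raw map values.
function naiveReduce(keys, values, rereduce) {
  return values.length;
}

// The rereduce-aware version shown above.
function countReduce(keys, values, rereduce) {
  return rereduce ? sum(values) : values.length;
}

// Hypothetical driver: reduce fixed-size chunks, then rereduce the partials.
// Keys are passed as null here because neither reduce function uses them.
function simulateView(reduceFn, values, chunkSize) {
  const partials = [];
  for (let i = 0; i < values.length; i += chunkSize) {
    partials.push(reduceFn(null, values.slice(i, i + chunkSize), false));
  }
  return reduceFn(null, partials, true);
}

// 18 emitted values for one key, as in the "super fun F" row of the question.
const emitted = new Array(18).fill({ name: "super fun F" });

const naiveSmallChunks = simulateView(naiveReduce, emitted, 5);  // 4 (number of chunks)
const naiveOneChunk = simulateView(naiveReduce, emitted, 18);    // 1 (number of chunks)
const fixedSmallChunks = simulateView(countReduce, emitted, 5);  // 18
const fixedOneChunk = simulateView(countReduce, emitted, 18);    // 18

console.log(naiveSmallChunks, naiveOneChunk, fixedSmallChunks, fixedOneChunk);
```

In this simulation the naive version reports however many chunks were reduced, so its answer depends on how the data happened to be split, while the rereduce-aware version returns 18 either way.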

Alternatively, you can change the map function so that the values are always a count of documents:

// map
function(doc) {
  emit(doc.name, 1);
}

// reduce
function(keys, values, rereduce) {
  return sum(values);
}
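
As a rough illustration of how this pair produces per-name counts, the sketch below runs the same map and reduce over a few made-up documents and groups the emitted rows by key, which is approximately what querying the view with group=true does. The `emit` collector, `sum` stand-in, sample documents, and grouping loop are all assumptions for the sake of a standalone example, not CouchDB's actual implementation.

```javascript
// Stand-ins for what the CouchDB view server provides to view functions.
const rows = [];
function emit(key, value) {
  rows.push({ key, value });
}
function sum(values) {
  return values.reduce((total, n) => total + n, 0);
}

// The map and reduce from the answer, given names so they can be called here.
function mapFn(doc) {
  emit(doc.name, 1);
}
function reduceFn(keys, values, rereduce) {
  return sum(values);
}

// Made-up sample documents.
const docs = [
  { name: "super fun C" },
  { name: "super fun C" },
  { name: "super fun D" },
];
docs.forEach((doc) => mapFn(doc));

// Group emitted values by key, then reduce each group.
const byKey = new Map();
for (const row of rows) {
  if (!byKey.has(row.key)) byKey.set(row.key, []);
  byKey.get(row.key).push(row.value);
}
const grouped = [...byKey].map(([key, values]) => ({
  key,
  value: reduceFn(null, values, false),
}));

console.log(grouped);
```

Because every emitted value is already a count of 1 and addition gives the same total however the values are split, the same function works for both the reduce and rereduce passes.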
Vantassel answered 19/10, 2009 at 1:56 Comment(1)
Using JavaScript reduce functions instead of the built-in ones will give you very bad performance. See David's answer. - Costumier

In your reduce just put:

_count

You can also get a sum using:

_sum

So basically, set reduce: "_sum" or reduce: "_count" in the view definition. For _sum, make sure the value your map emits is numeric; _count simply counts the emitted rows, whatever their values are.

See "Built in reduce functions".
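
For context, a view with a built-in reduce might be declared in a design document along these lines. This is only a sketch: the database name `mydb`, the design document id `_design/people`, and the view name `by_name` are made up for illustration.

```javascript
// Hypothetical design document; the ids and names are illustrative only.
// View functions are stored as strings, and a built-in reduce is just its name.
const designDoc = {
  _id: "_design/people",
  views: {
    by_name: {
      map: "function (doc) { emit(doc.name, 1); }",
      reduce: "_count",
    },
  },
};

// JSON body that would be saved to the database as the design document.
const designDocJson = JSON.stringify(designDoc, null, 2);
console.log(designDocJson);

// Path to query for one count per distinct name (group=true groups by key).
const viewPath = "/mydb/" + designDoc._id + "/_view/by_name?group=true";
console.log(viewPath);
```

Without group=true the view returns a single row with the total across all keys, which is the closest match to a plain COUNT(*).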

Rathbun answered 30/3, 2010 at 15:29 Comment(1)
This is the better answer. Read the link David posted here about built-in functions. - Helluva
