What is the best way to retrieve distinct / unique values using SPQuery?

Asked 18/2, 2009 at 10:39 Answered 30/8, 2017 at 20:17

I have a list that looks like:

Movie          Year
-----          ----
Fight Club     1999
The Matrix     1999
Pulp Fiction   1994

Using CAML and the SPQuery object I need to get a distinct list of items from the Year column which will populate a drop down control.

Searching around there doesn't appear to be a way of doing this within the CAML query. I'm wondering how people have gone about achieving this?

Sedum answered 18/2, 2009 at 10:39 Comment(0)

Another way to do this is to use DataView.ToTable-Method - its first parameter is the one that makes the list distinct.

            SPList movies = SPContext.Current.Web.Lists["Movies"];
            SPQuery query = new SPQuery();
            query.Query = "<OrderBy><FieldRef Name='Year' /></OrderBy>";

            DataTable tempTbl = movies.GetItems(query).GetDataTable();
            DataView v = new DataView(tempTbl);
            String[] columns = {"Year"};
            DataTable tbl = v.ToTable(true, columns);

You can then proceed using the DataTable tbl.

Enthusiastic answered 18/2, 2009 at 16:35 Comment(2)

Even though this was marked as the answer, this method still relies on retrieving every list item from the database and then finding the unique values. I think the whole point of asking how to do it in CAML is to avoid retrieving every item from the database... – Hudak 15/6, 2012 at 16:24

I agree with MgSam, however, as CAML is not a real substitute for an actual query language like SQL. This is as good as it gets. And this answer does move the conversation forward. +1 from me. – Transfiguration 18/8, 2013 at 16:35

If you want to bind the distinct results to a DataSource of for example a Repeater and retain the actual item via the ItemDataBound events' e.Item.DataItem method, the DataTable way is not going to work. Instead, and besides also when not wanting to bind it to a DataSource, you could also use Linq to define the distinct values.

// Retrieve the list. NEVER use the Web.Lists["Movies"] option as in the other examples as this will enumerate every list in your SPWeb and may cause serious performance issues
var list = SPContext.Current.Web.Lists.TryGetList("Movies");

// Make sure the list was successfully retrieved
if(list == null) return;

// Retrieve all items in the list
var items = list.GetItems();

// Filter the items in the results to only retain distinct items in an 2D array
var distinctItems = (from SPListItem item in items select item["Year"]).Distinct().ToArray()

// Bind results to the repeater
Repeater.DataSource = distinctItems;
Repeater.DataBind();

Remember that since there is no CAML support for distinct queries, each sample provided on this page will retrieve ALL items from the SPList. This may be fine for smaller lists, but for lists with thousands of listitems, this will seriously be a performance killer. Unfortunately there is no more optimized way of achieving the same.

Comprador answered 29/12, 2010 at 10:25 Comment(0)

There is no DISTINCT in CAML to populate your dropdown try using something like:

foreach (SPListItem listItem in listItems)
    {
        if ( null == ddlYear.Items.FindByText(listItem["Year"].ToString()) )
       {
                   ListItem ThisItem = new ListItem();
                   ThisItem.Text = listItem["Year"].ToString();
                   ThisItem.Value = listItem["Year"].ToString();
                   ddlYear.Items.Add(ThisItem);
        }
   }

Assumes your dropdown is called ddlYear.

Tyrr answered 18/2, 2009 at 11:5 Comment(0)

Can you switch from SPQuery to SPSiteDataQuery? You should be able to, without any problems.

After that, you can use standard ado.net behaviour:

SPSiteDataQuery query = new SPSiteDataQuery();
/// ... populate your query here. Make sure you add Year to the ViewFields.

DataTable table = SPContext.Current.Web.GetSiteData(query);

//create a new dataview for our table
DataView view = new DataView(table);

//and finally create a new datatable with unique values on the columns specified

DataTable tableUnique = view.ToTable(true, "Year");

Waltraudwaltz answered 18/2, 2009 at 16:29 Comment(0)

After coming across post after post about how this was impossible, I've finally found a way. This has been tested in SharePoint Online. Here's a function that will get you all unique values for a column. It just requires you to pass in the list Id, View Id, internal list name, and a callback function.

function getUniqueColumnValues(listid, viewid, column,  _callback){
var uniqueVals = [];
$.ajax({
    url: _spPageContextInfo.webAbsoluteUrl + "/_layouts/15/filter.aspx?ListId={" + listid + "}&FieldInternalName=" + column + "&ViewId={" + viewid + "}&FilterOnly=1&Filter=1",
    method: "GET",
    headers: { "Accept": "application/json; odata=verbose" }
    }).then(function(response) {
        $(response).find('OPTION').each(function(a,b){
            if ($(b)[0].value) {
                uniqueVals.push($(b)[0].value);
            }
        });
        _callback(true,uniqueVals);
},function(){
    _callback(false,"Error retrieving unique column values");
});

}

Eustatius answered 30/8, 2017 at 20:17 Comment(0)

I was considering this problem earlier today, and the best solution I could think of uses the following algorithm (sorry, no code at the moment):

L is a list of known values (starts populated with the static Choice options when querying fill-in options, for example)
X is approximately the number of possible options

1. Create a query that excludes the items in L
1. Use the query to fetch X items from list (ordered as randomly as possible)
2. Add unique items to L
3. Repeat 1 - 3 until number of fetched items < X

This would reduce the total number of items returned significantly, at the cost of making more queries.

It doesn't much matter if X is entirely accurate, but the randomness is quite important. Essentially the first query is likely to include the most common options, so the second query will exclude these and is likely to include the next most common options and so on through the iterations.

In the best case, the first query includes all the options, then the second query will be empty. (X items retrieved in total, over 2 queries)

In the worst case (e.g. the query is ordered by the options we're looking for, and there are more than X items with each option) we'll make as many queries as there are options. Returning approximately X * X items in total.

Jolda answered 23/11, 2012 at 21:8 Comment(0)

Recommended topics

Hot tags