I don't see any way to do this in less than O(C) per query, but I have some ideas on how to maximize efficiency. The idea is basically to build a lookup table for each element. If some elements are rare and some are common, you can have positive and negative lookup tables:
s[i] // your query, an array of size 2 thousand, true/false
sign[i] // whether the ith element is positive/negative lookup. +/- 1
sets[i] // a list of all the sets that the ith element belongs/(doesn't) to
query(s):
overlaps[i] // an array of size C, initialized to 0's
for i in len(s):
if s[i]:
for j in sets[i]:
overlaps[j] += sign[i]
return max_index(overlaps)
Especially if many of your elements are of widely differing probabilities (as you said), this approach should save you some time: very rare or very common elements can be dealt with almost instantly.
To further optimize: you can sort the structure so that the elements that are most common/most rare are dealt with first. After you have done the first e.g. 3/4, you can do a quick pass to see if the closest matching set is so far ahead of the next set that it is not necessary to continue, though again whether that is worthwhile depends on the details of your data's distribution.
Yet another refinement: make sets[i] one of two possible structures: if the element is very rare or common, sets[i] is just a list of the sets that the ith element is in/not in. However, suppose the ith element is in half the sets. Then sets[i] is just a list of indices half as long as the number of sets, looping through it and incrementing overlaps is wasteful. Have a third value for sign[i]: if sign[i] == 0, then the ith element is relatively close to 50% commonality (this may just mean between 5% and 95%, or anything else), and instead of a list of sets in which it appears, it will simply be an array of 1's and 0's with length equal to C. Then you would just add the array in its entirety to overlaps which would be faster.