The problem is naturally O(n), as a result your solution is probably the most efficient.
Since you are trying to count an arbitrary subset of bits you cannot count the bits when they are set (would would provide a speed boost if you are not setting the bits too often).
You could check to see if the processor you are using has a command which will return the number of set bits. For example a processor with SSE4 could use the POPCNT according to this post. This would probably not work for you since .Net does not allow assembly (because it is platform independent). Also, ARM processors probably do not have an equivalent.
Probably the best solution would be a look up table (or switch if you could guarantee the switch will compiled to a single jump to currentLocation + byteValue). This would give you the count for the whole byte. Of course BitArray does not give access to the underlying data type so you would have to make your own BitArray. You would also have to guarantee that all the bits in the byte will always be part of the intersection which does not sound likely.
Another option would be to use an array of booleans instead of a BitArray. This has the advantage not needing to extract the bit from the others in the byte. The disadvantage is the array will take up 8x as much space in memory meaning not only wasted space, but also more data push as you iterate through the array to perform your count.
The difference between a standard array look up and a BitArray look up is as follows:
Array:
- offset = index * indexSize
- Get memory at location + offset and save to value
BitArray:
- index = index/indexSize
- offset = index * indexSize
- Get memory at location + offset and save to value
- position = index%indexSize
- Shift value position bits
- value = value and 1
With the exception of #2 for Arrays and #3 most of these commands take 1 processor cycle to complete. Some of the commands can be combined into 1 command using x86/x64 processors, though probably not with ARM since it uses a reduced set of instructions.
Which of the two (array or BitArray) perform better will be specific to your platform (processor speed, processor instructions, processor cache sizes, processor cache speed, amount of system memory (Ram), speed of system memory (CAS), speed of connection between processor and RAM) as well as the spread of indexes you want to count (are the intersections most often clustered or are they randomly distributed).
To summarize: you could probably find a way to make it faster, but your solution is the fastest you will get for your data set using a bit per boolean model in .NET.
Edit: make sure you are accessing the indexes you want to count in order. If you access indexes 200, 5, 150, 151, 311, 6 in that order then you will increase the amount of cache misses resulting in more time spent waiting for values to be retrieved from RAM.