First I would change how ListControl
sees your data source, you're converting result IEnumerable<string>
to List<string>
. Especially when you just typed few characters this may be inefficient (and unneeded). Do not make expansive copies of your data.
- I would wrap
.Where()
result to a collection that implements only what is required from IList
(search). This will save you to create a new big list for each character is typed.
- As alternative I would avoid LINQ and I'd write something more specific (and optimized). Keep your list in memory and build an array of matched indices, reuse array so you do not have to reallocate it for each search.
Second step is to do not search in the big list when small one is enough. When user started to type "ab" and he adds "c" then you do not need to research in the big list, search in the filtered list is enough (and faster). Refine search every time is possible, do not perform a full search each time.
Third step may be harder: keep data organized to be quickly searched. Now you have to change the structure you use to store your data. imagine a tree like this:
A B C
Add Better Ceil
Above Bone Contour
This may simply be implemented with an array (if you're working with ANSI names otherwise a dictionary would be better). Build the list like this (illustration purposes, it matches beginning of string):
var dictionary = new Dictionary<char, List<string>>();
foreach (var user in users)
{
char letter = user[0];
if (dictionary.Contains(letter))
dictionary[letter].Add(user);
else
{
var newList = new List<string>();
newList.Add(user);
dictionary.Add(letter, newList);
}
}
Search will be then done using first character:
char letter = textBox_search.Text[0];
if (dictionary.Contains(letter))
{
listBox_choices.DataSource =
new MyListWrapper(dictionary[letter].Where(x => x.Contains(textBox_search.Text)));
}
Please note I used MyListWrapper()
as suggested in first step (but I omitted by 2nd suggestion for brevity, if you choose right size for dictionary key you may keep each list short and fast to - maybe - avoid anything else). Moreover note that you may try to use first two characters for your dictionary (more lists and shorter). If you extend this you'll have a tree (but I don't think you have such big number of items).
There are many different algorithms for string searching (with related data structures), just to mention few:
- Finite state automaton based search: in this approach, we avoid backtracking by constructing a deterministic finite automaton (DFA) that recognizes stored search string. These are expensive to construct—they are usually created using the powerset construction—but are very quick to use.
- Stubs: Knuth–Morris–Pratt computes a DFA that recognizes inputs with the string to search for as a suffix, Boyer–Moore starts searching from the end of the needle, so it can usually jump ahead a whole needle-length at each step. Baeza–Yates keeps track of whether the previous j characters were a prefix of the search string, and is therefore adaptable to fuzzy string searching. The bitap algorithm is an application of Baeza–Yates' approach.
- Index methods: faster search algorithms are based on preprocessing of the text. After building a substring index, for example a suffix tree or suffix array, the occurrences of a pattern can be found quickly.
- Other variants: some search methods, for instance trigram search, are intended to find a "closeness" score between the search string and the text rather than a "match/non-match". These are sometimes called "fuzzy" searches.
Few words about parallel search. It's possible but it's seldom trivial because overhead to make it parallel can be easily much higher that search itself. I wouldn't perform search itself in parallel (partitioning and synchronization will become soon too expansive and maybe complex) but I would move search to a separate thread. If main thread isn't busy your users won't feel any delay while they're typing (they won't note if list will appear after 200 ms but they'll feel uncomfortable if they have to wait 50 ms after they typed). Of course search itself must be fast enough, in this case you don't use threads to speed up search but to keep your UI responsive. Please note that a separate thread will not make your query faster, it won't hang UI but if your query was slow it'll still be slow in a separate thread (moreover you have to handle multiple sequential requests too).
HashSet<T>
won't help you here, because you're searching the part of string. – CantilevertextBox_search.Text
which makes native API calls to get the content of the textbox. So practice wins yet again. See my answer bellow for details. – Incipit