Is it possible to search a string with collator?

I'm currently creating a mechanism to filter items by a query string.

I want to convert this to a locale-aware version (basically case-insensitive matching for English, with the equivalent handling for Japanese Kana):

return items.filter((item) => {
  return item.name.indexOf(query) !== -1;
});

I have heard of Intl.Collator (part of the ECMAScript Internationalization API), and I would like to use it if it can achieve my goal.

Longtin answered 29/3, 2017 at 8:26 Comment(1)
Yes, but toLowerCase is English only. (Longtin)

The following works for French, and it may carry over to Japanese too:

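// 'base' sensitivity treats letters that differ only by case or accent as equal.
const collator = new Intl.Collator('en', { sensitivity: 'base', usage: 'search' });
function contains(string, substring) {
  if (substring.length === 0) {
    return true;
  }
  // Normalize both inputs so canonically equivalent sequences are represented the same way.
  string = string.normalize('NFC');
  substring = substring.normalize('NFC');
  // Slide a window of substring.length over the string and compare each slice.
  for (let scan = 0; scan + substring.length <= string.length; scan += 1) {
    const slice = string.slice(scan, scan + substring.length);
    if (collator.compare(substring, slice) === 0) {
      return true;
    }
  }
  return false;
}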

E.g.:

contains("à point", "a point") // true

Cf. https://github.com/adobe/react-spectrum/blob/7f63e933e61f20891b4cf3f447ab817f918cb263/packages/%40react-aria/i18n/src/useFilter.ts#L58
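
To tie this back to the filter in the question, here is a minimal sketch. The names makeContains and filterItems, the sample items, and the locale tags are made up for illustration and are not from the question or from react-spectrum. Whether Hiragana and Katakana compare as equal at 'base' sensitivity depends on the collation data shipped with your engine, so the last line only prints the result rather than assuming it.

```javascript
// Hypothetical helper: builds a "contains" matcher for one locale.
function makeContains(locale) {
  const collator = new Intl.Collator(locale, { sensitivity: 'base', usage: 'search' });
  return function contains(string, substring) {
    const text = string.normalize('NFC');
    const query = substring.normalize('NFC');
    if (query.length === 0) return true;
    // Compare every window of query.length against the query.
    for (let i = 0; i + query.length <= text.length; i += 1) {
      if (collator.compare(text.slice(i, i + query.length), query) === 0) return true;
    }
    return false;
  };
}

// Hypothetical replacement for the indexOf-based filter in the question.
function filterItems(items, query, locale = 'en') {
  const contains = makeContains(locale);
  return items.filter((item) => contains(item.name, query));
}

// Sample data, assumed for illustration only.
const items = [{ name: 'Café au lait' }, { name: 'Tea' }, { name: 'CAFETERIA' }];
console.log(filterItems(items, 'cafe').map((item) => item.name));

// Check the Kana behaviour in your own environment:
console.log(makeContains('ja')('ひらがな', 'ヒラ'));
```

Creating the collator once per query, rather than once per item, keeps the filter cheap for long lists.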

Disaccustom answered 6/6, 2023 at 7:6 Comment(0)

I tried to improve and optimize the answer by charles-at-stack:

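const searchCollator = new Intl.Collator('en', {
  usage: 'search',
  sensitivity: 'base', // ignore case and accent differences
  ignorePunctuation: true,
})

function includes(string, subString, {normalizeInputs = false} = {}) {
  // Optionally normalize to NFC so equivalent character sequences are represented the same way
  if (normalizeInputs) {
    string = string.normalize('NFC')
    subString = subString.normalize('NFC')
  }
  // Number of start positions at which a window of subString.length still fits
  const upperBound = string.length - subString.length + 1
  // compare() returns 0 when the collator treats the slice and subString as equal
  for (let i = 0; i < upperBound; i++)
    if (!searchCollator.compare(string.slice(i, i + subString.length), subString)) return true
  return false
}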

console.log(includes('à point', 'a point'))
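
As a rough illustration of when normalizeInputs matters, the sketch below repeats the function so it runs on its own and feeds it a decomposed string. The sample strings are assumptions chosen for the demonstration. Because the loop compares fixed-length slices, a decomposed "a" plus combining accent (two code units) no longer lines up with the precomposed "à" (one code unit) unless both inputs are normalized first.

```javascript
const searchCollator = new Intl.Collator('en', {
  usage: 'search',
  sensitivity: 'base',
  ignorePunctuation: true,
})

function includes(string, subString, {normalizeInputs = false} = {}) {
  if (normalizeInputs) {
    string = string.normalize('NFC')
    subString = subString.normalize('NFC')
  }
  const upperBound = string.length - subString.length + 1
  for (let i = 0; i < upperBound; i++)
    if (!searchCollator.compare(string.slice(i, i + subString.length), subString)) return true
  return false
}

// Assumed sample inputs: the same text in two Unicode representations.
const decomposed = 'a\u0300 point'  // "a" followed by U+0300 COMBINING GRAVE ACCENT
const precomposed = '\u00e0 point'  // single code point U+00E0

console.log(includes(decomposed, precomposed))                          // false
console.log(includes(decomposed, precomposed, {normalizeInputs: true})) // true
```

If the inputs are already known to be in NFC, leaving normalizeInputs off avoids two extra passes over the strings.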

For TypeScript, just add string types to the arguments:

function includes(string: string, subString: string, {normalizeInputs = false} = {}) // ...
Zacek answered 25/9 at 14:4 Comment(0)
