Power Query check if string contains strings from a list
Asked Answered
A

4

6

Is there a way to check a text field to see if it contains any of the strings from a list?

Example Strings to Check:

The raisin is green
The pear is red
The apple is yellow

List Example to Validate Against

red
blue
green

The result would be

either:

green
red
null

or:

TRUE
TRUE
FALSE
Ajani answered 7/11, 2018 at 20:49 Comment(0)
D
7

Daniel has a decent solution, but it won't work if the example strings aren't space-separated. For example, The brick is reddish would detect red as a substring.

You can create a custom column with this formula instead:

(C) => List.AnyTrue(List.Transform(Words, each Text.Contains(C[Texts], _)))

This takes the list Words = {"red","blue","green"} and checks if each of the colors in the list is contained in the [Texts] column for that row. If any are, then it returns TRUE otherwise FALSE.

The whole query looks like this:

let
    TextList = {"The raisin is green","The pear is red","The apple is yellow"},
    Texts = Table.FromList(TextList, Splitter.SplitByNothing(), {"Texts"}, null, ExtraValues.Error),
    Words = {"red","blue","green"},
    #"Added Custom" = Table.AddColumn(Texts, "Check", (C) => List.AnyTrue(List.Transform(Words, each Text.Contains(C[Texts], _))))
in
    #"Added Custom"

Edit: As @Greg points out, this can be simplified using List.MatchesAny:

(C) => List.MatchesAny(Words, each Text.Contains(C[Texts], _)))
Digraph answered 8/11, 2018 at 2:33 Comment(2)
Was exactly the solution to my problem and I think it should be the approved solution. Just an additional note for the people who read that but need to that the word list coming from the row instead of static. If you have a column with the word list [Words] then you can use: (C) => List.AnyTrue(List.Transform(C[Words], each Text.Contains(C[Texts], _)))Anamorphoscope
@Anamorphoscope We can simplify List.AnyTrue(List.Transform(...)) into List.MatchesAny(...).Granulation
G
3

Here's a nifty function Text_ContainsAny(), which returns true wherever one (or more) of the substrings is contained in text, and false otherwise.

let Text_ContainsAny = (
    text as text,
    substrings as list,
    optional comparer as nullable function
) as logical =>
    List.MatchesAny(substrings, each Text.Contains(text, _, comparer))
in
    Text_ContainsAny

If you want the matching substrings themselves, then this function Text_Substrings() should return them in a list:

let Text_Substrings = (
    text as text,
    substrings as list,
    optional comparer as nullable function
) as list =>
    List.Select(substrings, each Text.Contains(text, _, comparer))
in
    Text_Substrings
Granulation answered 10/4 at 4:41 Comment(0)
F
0

This will make the trick, it's PowerQuery ("M") code:

let
    Texts = {"The raisin is green","The pear is red","The apple is yellow"},
    Words = {"red","blue","green"},
    TextsLists = List.Transform(Texts, each Text.Split(_," ")),
    Output = List.Transform(TextsLists, each List.Count(List.Intersect({_,Words}))>0)
in
    Output

There are two lists: the sentences (Texts) and the words to check (Words). The first thing to do is to convert the sentences in lists of words splitting the strings using " " as the delimiter.

TextsLists = List.Transform(Texts, each Text.Split(_," ")),

Then you "cross" the new lists with the list of Words. The result are lists of elements (strings) that appears in both lists (TextLists and Words). Now you count these new lists and check if the result is bigger than cero.

Output = List.Transform(TextsLists, each List.Count(List.Intersect({_,Words}))>0)

Output is a new list {True, True, False).

Alternatively, you can change the Output line by this one:

Output = List.Transform(TextsLists, each List.Intersect({_,Words}){0}?)

This will return a list of the first coincidence or null if there's no coincidence. In the example: {"green", "red", "null"}

Hope this helps you.

Fillister answered 7/11, 2018 at 22:2 Comment(1)
If you're transforming to a list, you can use the List.ContainsAny function.Digraph
P
0

each if Text.Remove([Texts], {"The raisin is green","The pear is red","The apple is yellow"})<>[Texts] then ...

Paleontography answered 3/2, 2022 at 19:4 Comment(1)
As it’s currently written, your answer is unclear. Please edit to add additional details that will help others understand how this addresses the question asked. You can find more information on how to write good answers in the help center.Rights

© 2022 - 2024 — McMap. All rights reserved.