PostgreSQL Regex Word Boundaries?
Asked Answered
S

3

84

Does PostgreSQL support \b?

I'm trying \bAB\b but it doesn't match anything, whereas (\W|^)AB(\W|$) does. These 2 expressions are essentially the same, aren't they?

Shona answered 29/9, 2010 at 20:41 Comment(0)
F
111

PostgreSQL uses \m, \M, \y and \Y as word boundaries:

\m   matches only at the beginning of a word
\M   matches only at the end of a word
\y   matches only at the beginning or end of a word
\Y   matches only at a point that is not the beginning or end of a word 

See Regular Expression Constraint Escapes in the manual.

There is also [[:<:]] and [[:>:]], which match the beginning and end of a word. From the manual:

There are two special cases of bracket expressions: the bracket expressions [[:<:]] and [[:>:]] are constraints, matching empty strings at the beginning and end of a word respectively. A word is defined as a sequence of word characters that is neither preceded nor followed by word characters. A word character is an alnum character (as defined by ctype) or an underscore. This is an extension, compatible with but not specified by POSIX 1003.2, and should be used with caution in software intended to be portable to other systems. The constraint escapes described below are usually preferable (they are no more standard, but are certainly easier to type).

Femmine answered 29/9, 2010 at 20:46 Comment(1)
I don't quite understand if "\m" and "[[:<:]]" are synonymous, or a case where they would act differently.Ehr
E
21

A simple example

select * from table_name where column ~* '\yAB\y';

This will match AB ab ab - text text ab text AB text-ab-text text AB text ...

But you have to use:

select * from sometable where name ~* '\\yAB\\y';

in case you have standard_conforming_strings flag set to OFF. Note the double slashes.
You can set it manually :

set standard_conforming_strings=on;

Then :select * from table_name where column ~* '\yAB\y'; should work.

Escobedo answered 3/1, 2015 at 8:55 Comment(3)
I use postgres 9.3.10 and value ~* '\yAB\y' works just fine. Is your note 9.2-specific?Goldschmidt
Ditto, re: postgres 10.2: ... where synonyms ~* '\ya1b\y'; works; the double-backslashed version '\\ya1b\\y' does not work.Scaler
"Change the default value of standard_conforming_strings to on" - postgresql.org/docs/9.1/release-9-1.htmlSparoid
S
6

Exact word search in text:

I was facing following problem.

I wanted to search all contacts which has 'cto' as exact word in titles, but in results was getting results with title having 'director' in it, I was using following query

select * from contacts where title ilike '%cto%';

I also tried with whitspaces around wildcard as '% cto %', it was getting matched with text which contains ' cto ', got results like 'vp, cto and manger', but not results with exact title as 'cto'.

I wanted both 'vp, cto and manger' and 'cto' in results, but not 'director' in results

Following worked for me

select * from contacts where title ~* '\\ycto\\y';

~   Matches regular expression, case sensitive
~*  Matches regular expression, case insensitive    
Sapsucker answered 22/4, 2015 at 13:15 Comment(1)
When you match phrase ILIKE '% cto %, you just need to add spaces around phrase: ' ' || phrase || ' ' ILIKE '% cto %'. This will works for title as 'cto'. Thank you, because your idea helps me to found this solution: #18080604Sheldonshelduck

© 2022 - 2024 — McMap. All rights reserved.