Hive: regexp_replace square brackets
Asked Answered
W

3

6

I need to replace the square brackets around a key:value pair similar to the following. Any help is much appreciated!

data in 'properties' looks like this:

name: property1
value: [12345667:97764458]

**code**
SELECT p.name, regexp_replace(p.value,'[','') AS value
FROM properties p

Solved: Revised code

SELECT p.name, regexp_replace(p.value,'\\[|\\]','') AS value
FROM properties p;
Windstorm answered 15/1, 2014 at 2:20 Comment(1)
code: regexp_replace(p.value,'\[|\]','') worksCheloid
C
5

You always need to double your backslashes in Hive regexes. That's because a single backslash is used as an escape character in Hive strings so it would get stripped off before the regex parser can see it. A double backslash becomes a single backslash, which is what you need.

To see how backslashes get stripped just run a select using your regex as a string literal:

select '\\[' from t limit 1;
OK
\[
Calicut answered 5/2, 2014 at 18:44 Comment(0)
G
2

Here is you regex [\[\]]+ this will match one or more [ and ] in a string.

Gobo answered 15/1, 2014 at 2:31 Comment(3)
João, thanks. I'm still getting an error running this. Maybe I'm using the code incorrectly? Please see the revised code in the original submission.Windstorm
pe.value does not exist, you table name alias is 'p'.Buttermilk
Sorry that was a typo in the code above. I added the error returned.Windstorm
P
0

If you are attempting this from Python virtual environment, triple the /// to escape [ or ]

spark.sql("""    
    select 
        REGEXP_REPLACE(
            'arn:scheduler:::job/3bf19fab[xyz]-“#date{dt($YYY_MM_DD)}”',
            '[^a-zA-Z0-9:\/\\\-_{}()\\\[\\\]#$%,.]+',
            '',
            1) 
""").show(truncate=False)
arn:scheduler:::job/3bf19fab[xyz]-#date{dt($YYY_MM_DD)} 
Polymath answered 7/9, 2024 at 2:38 Comment(0)

© 2022 - 2025 — McMap. All rights reserved.