I have a string field "myfield.keyword", where entries have the following format:
AAA_BBBB_CC
DDD_EEE_F
I am trying to create three scripted fields: one that outputs the substring before the first _, one that outputs the substring between the first and second _, and one that outputs the substring after the second _.
I was trying to use .split('_') to do this, but found that this method is not available in Painless:
def newfield = "";
def path = doc['myfield.keyword'].value;
if (...) {
    newfield = path.split('_')[1];
} else {
    newfield = "null";
}
return newfield;
I then tried the workaround suggested here, but found that it requires enabling regexes in Elastic (which is not possible in my case):
def newfield = "";
def path = doc['myfield.keyword'].value;
if (...) {
    newfield = /_/.split(path)[1];
} else {
    newfield = "null";
}
return newfield;
Is there a way to do this that does not presuppose enabling regexes?
EDIT (after answer):
My question was not well formed. In particular, the string that needs to be split has four occurrences of '_'. Something like:
AAA_BB_CCC_DD_E
FFF_GGG_HH_JJJJ_KK
So, if I understand correctly, indexOf() and lastIndexOf() cannot give me BB, CCC or DD. I thought that I could adapt your solution and find the index of the second and third occurrences of _ by using string.indexOf("_", 1) and string.indexOf("_", 2). However, I always get the same result as string.indexOf("_") without any extra parameters (i.e. the result is always the index of the first occurrence of _).
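
A likely explanation, assuming Painless exposes Java's String.indexOf(String, int) unchanged: the second argument is the index at which the search starts, not the occurrence number. In AAA_BB_CCC_DD_E the first _ sits at index 3, so starting the search at index 1 or 2 still finds that same _. The sketch below is plain Java rather than a tested Painless script, and the class name SplitWithoutRegex and the helper part are hypothetical names used only for illustration. It reaches the n-th piece by restarting the search just past the previous delimiter.

```java
public class SplitWithoutRegex {

    // Returns the n-th (0-based) piece of s when split on delim,
    // or null if s contains fewer than n + 1 pieces.
    // Hypothetical helper: it uses only indexOf and substring, no regex.
    static String part(String s, String delim, int n) {
        int start = 0;
        // Skip over the first n delimiters.
        for (int i = 0; i < n; i++) {
            int next = s.indexOf(delim, start);
            if (next < 0) {
                return null; // not enough delimiters
            }
            start = next + delim.length();
        }
        // The piece ends at the next delimiter, or at the end of the string.
        int end = s.indexOf(delim, start);
        return end < 0 ? s.substring(start) : s.substring(start, end);
    }

    public static void main(String[] args) {
        String s = "AAA_BB_CCC_DD_E";
        System.out.println(s.indexOf("_", 1)); // 3: search starts at index 1, first _ is still ahead
        System.out.println(s.indexOf("_", 4)); // 6: search starts just past the first _
        System.out.println(part(s, "_", 1));   // BB
        System.out.println(part(s, "_", 3));   // DD
    }
}
```

The same loop should carry over to a scripted field with doc['myfield.keyword'].value as the input, provided these String methods are permitted in the Elasticsearch version in use.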