Regex to extract value between a single quote and parenthesis using boost token iterator

Suppose I have a string:

s = "server ('m1.labs.teradata.com') username ('u\'se)r_*5') password('uer 5')  dbname ('default')";

I need to extract

  • token1 : 'm1.labs.teradata.com'
  • token2 : 'u\'se)r_*5'
  • token3 : 'uer 5'

I am using the following regex in C++:

regex re("(\'[!-~]+\')"); 

sregex_token_iterator i(s.begin(), s.end(), re, 0);
sregex_token_iterator j;

unsigned count = 0;
while(i != j)
  {
    cout << "the token is"<<"   "<<*i++<< endl;
    count++;
  }
cout << "There were " << count << " tokens found." << endl;

return 0;
Defalcate answered 19/7, 2017 at 13:37 Comment(16)
The simplest would be '[^']+' – Untold
You need to capture that part, and use str(1) to get the capturing group #1 value. – Reeding
@Slava: I have a value like this: arg1('FooBar') arg2('Another Value') something else. What regex will return the values enclosed in the quotation marks (e.g. FooBar and Another Value)? I am using the following regex in C++: regex re("(\'[^']+\')"). Like this? – Defalcate
Suppose I have a string: s = "server ('m1.labs.teradata.com') username ('u\'se)r_*5') password('uer 5') dbname ('default')"; I need to extract token1: 'm1.labs.teradata.com', token2: 'u\'se)r_*5', token3: 'uer 5'. – Defalcate
Do you have to use the sregex_token_iterator rather than the regular one? – Florentinaflorentine
You should put examples in your question that are relevant to your problem. Perhaps edit your question to include your new examples? They turn it into a different question. – Florentinaflorentine
No, I can use anything else, but it should extract the string. – Defalcate
Here is your solution. Here is a safer variation. – Reeding
This utterly trivial search doesn't need the complexity of regular expressions. Use start = std::string::find('\'') to find the beginning of the target text and then use end = std::string::find('\'', start) to find the end. – Catheryncatheter
@WiktorStribiżew I want to extract strings in the form 'm1.labs.teradata.com' 'u\'se)r_*5' 'uer 5', with the single quotes around them. Is there a way? Also, my string can contain special characters, such as "user')5". With your regex I am not able to do so. I have used the regex as asked in the question, but that does not work for me. Can you please help me out with this? And can you please explain the regex you have written? – Defalcate
With single quotes and with both types of quotes. – Reeding
@WiktorStribiżew The one with single quotes: if I have a string username ('u\\'se)'r_*5'), I want to include the quote which is before r, but the regex is not doing that. Thanks for your time. I really need the help. – Defalcate
No idea what the rules are now. I cannot help when example strings are inconsistent. Ask Slava, he seems to understand your issue better. – Reeding
@WiktorStribiżew Can you help me out with how I can include a single quote within the string which I am extracting? Example: user'5 – Defalcate
Do you mean username ('u\\'se)r_*5')? Please correct your question or, even better, provide a minimal reproducible example. – Untold
@Untold: Suppose my string is: "server ('m1.labs.teradata.com') username ('u\'se)r_*5') password('uer 5') dbname ('default')"; Consider the username ('u\'se)r_*5'). As you can see, I am trying to escape the single quote in 'u\'se)r_*5' using \', so the output token looks like 'u'se)r_*5'. Can you please help me out with the regex for the same? – Defalcate
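
The non-regex approach suggested in the comments above can be sketched roughly as follows. Note that the search for the closing quote has to begin one position past the opening quote, otherwise find returns the same quote again. The helper name scan_quoted is hypothetical, and treating a backslash as an escape for the following character is an assumption based on the asker's examples, not a rule stated in the question.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not from the thread): returns each single-quoted token,
// quotes included. Assumption: a backslash escapes the character after it,
// so \' does not terminate the token.
std::vector<std::string> scan_quoted(const std::string& s) {
    std::vector<std::string> tokens;
    std::size_t pos = 0;
    while ((pos = s.find('\'', pos)) != std::string::npos) {
        std::size_t end = pos + 1;  // start looking after the opening quote
        while (end < s.size() && s[end] != '\'') {
            if (s[end] == '\\' && end + 1 < s.size()) {
                ++end;              // skip the escaped character
            }
            ++end;
        }
        if (end >= s.size()) {
            break;                  // unterminated token: stop scanning
        }
        tokens.push_back(s.substr(pos, end - pos + 1));
        pos = end + 1;              // continue after the closing quote
    }
    return tokens;
}
```

Calling scan_quoted on the asker's sample input (with a real backslash in the data) should yield the three requested tokens plus 'default'.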

If you do not expect the ' character inside your strings, then '[^']+' matches what you need:

regex re("'[^']+'");

Result (from the live example):

the token is   'FooBar'
the token is   'Another Value'
There were 2 tokens found.

If you do not need the single quotes to be part of the match, change the code to:

regex re("'([^']+)'");

sregex_token_iterator i(s.begin(), s.end(), re, {1});

Result (from another live example):

the token is   FooBar
the token is   Another Value
There were 2 tokens found.
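
For reference, the two fragments above can be folded into one small function. This is only a sketch: the name quoted_values and the submatch parameter are illustrative, and it still assumes no ' appears inside a value. Passing 0 selects the whole match (quotes kept), and passing 1 selects the first capture group (quotes dropped).

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: submatch 0 keeps the surrounding quotes,
// submatch 1 returns only the text between them.
std::vector<std::string> quoted_values(const std::string& s, int submatch) {
    static const std::regex re("'([^']+)'");
    std::vector<std::string> out;
    std::sregex_token_iterator it(s.begin(), s.end(), re, submatch), end;
    for (; it != end; ++it) {
        out.push_back(it->str());
    }
    return out;
}
```

With the input arg1('FooBar') arg2('Another Value') something else, submatch 0 gives the first result listing above and submatch 1 gives the second.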
Untold answered 19/7, 2017 at 13:47 Comment(4)
OP already mentioned the string may contain escaped single quotes. – Reeding
@WiktorStribiżew where? – Untold
@WiktorStribiżew Comments are not part of the question, OP needs to fix the question. Currently this answer is correct. – Florentinaflorentine
I do not understand what OP needs now. – Reeding

The correct regex for this string would be

(?:'(.+?)(?<!\\)')

https://regex101.com/r/IpzB80/1
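
std::regex with its default ECMAScript grammar has no lookbehind, so the pattern above cannot be used there as written. A lookbehind-free alternative is to consume backslash-escaped pairs explicitly, as in the sketch below. The helper name quoted_tokens is hypothetical, and the code assumes the backslash is really present in the data: inside an ordinary C++ string literal, \' is just an escape for ', so the test string has to be written with \\' or as a raw string literal.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collects every single-quoted token, quotes included.
// Assumption: a backslash escapes the next character, so \' stays inside the token.
std::vector<std::string> quoted_tokens(const std::string& s) {
    // '(?:[^'\\]|\\.)*'
    //   opening quote, then any mix of ordinary characters (neither quote nor
    //   backslash) and backslash-escaped pairs, then the closing quote.
    static const std::regex re(R"rx('(?:[^'\\]|\\.)*')rx");
    std::vector<std::string> out;
    for (std::sregex_iterator it(s.begin(), s.end(), re), end; it != end; ++it) {
        out.push_back(it->str());
    }
    return out;
}
```

Because each alternative consumes either one ordinary character or one complete escape pair, an escaped quote can never be mistaken for the closing one.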

Derry answered 19/7, 2017 at 14:43 Comment(3)
The output string should contain the escaped single quote, but the regex given by you does not work. – Defalcate
Try this string, it is not working: server ('m1.labs.teradata.com') username ('u\'se)r_*5') password('ue/'r5') dbname ('default') – Defalcate
@Derry C++11 regex does not support lookbehind - #14539187 – Untold
