Regex to get the words after matching string
P

6

100

Below is the content:

Subject:
    Security ID:        S-1-5-21-3368353891-1012177287-890106238-22451
    Account Name:       ChamaraKer
    Account Domain:     JIC
    Logon ID:       0x1fffb

Object:
    Object Server:  Security
    Object Type:    File
    Object Name:    D:\ApacheTomcat\apache-tomcat-6.0.36\logs\localhost.2013-07-01.log
    Handle ID:  0x11dc

I need to capture the text that follows Object Name: on that line, which is D:\ApacheTomcat\apache-tomcat-6.0.36\logs\localhost.2013-07-01.log.

How can I do this?

^.*\bObject Name\b.*$ matches the whole Object Name line, not just the value.

Picked answered 5/10, 2013 at 2:7 Comment(0)
E
60

If you are using a regex engine that doesn't support \K, the following should work for you:

[\n\r].*Object Name:\s*([^\n\r]*)

Working example

Your desired match will be in capture group 1.


[\n\r][ \t]*Object Name:[ \t]*([^\n\r]*)

This is similar, but it does not allow things such as " blah Object Name: blah", and it makes sure not to capture the next line when there is no content after "Object Name:" (in the first pattern, \s* can match the line break).
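As a rough illustration of reading capture group 1, here is a minimal Python sketch using the stricter pattern. The variable names (LOG, PATTERN, object_name) are made up for this example, and Python is only one possible host language; the asker's tool may expose groups differently.

```python
import re

# Sample text copied from the question (raw string keeps the backslashes).
LOG = r"""Subject:
    Security ID:        S-1-5-21-3368353891-1012177287-890106238-22451
    Account Name:       ChamaraKer
    Account Domain:     JIC
    Logon ID:       0x1fffb

Object:
    Object Server:  Security
    Object Type:    File
    Object Name:    D:\ApacheTomcat\apache-tomcat-6.0.36\logs\localhost.2013-07-01.log
    Handle ID:  0x11dc
"""

# The stricter pattern: only spaces/tabs may surround the label.
PATTERN = re.compile(r"[\n\r][ \t]*Object Name:[ \t]*([^\n\r]*)")

match = PATTERN.search(LOG)
# Group 0 is the whole match (line break and label included);
# group 1 holds only the value after the label.
object_name = match.group(1) if match else None
print(object_name)
```

Note that the whole match (group 0) still starts with the line break and the label, which is why this approach only helps when the tool lets you pick a group.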

Eberle answered 5/10, 2013 at 2:18 Comment(12)
But I need the match result itself to be D:\ApacheTomcat\apache-tomcat-6.0.36\logs\localhost.2013-07-01.log, not a match group. - Picked
@CasperNine, why? And what language are you using? - Eberle
Because the program I'm using captures only the match result. I'm using a log management tool called logstash. Put your regex into regexpal.com and see: it matches the whole line. - Picked
@CasperNine, it depends on whether that supports lookbehinds. Try this and let me know your result: (?<=Object Name:)([^\n\r]*) - Eberle
@CasperNine, then you'll have to either use capture groups or base it off the following line, like this: [^\s]+(?=\s+Handle ID:) The problem is that this isn't flexible: if your format or order changes at all, it won't work. - Eberle
Sorry, it works, but it keeps a blank space at the beginning of the match. - Picked
Let us continue this discussion in chat. - Picked
In lookbehinds you cannot use quantifiers (most engines require a fixed-length lookbehind), so you could remove the blank space by putting in the exact number of spaces... but this wouldn't be flexible if the number of spaces between the key/value pair varies. - Eberle
I have one more question for you. How do I use [^\s]+(?=\s+Handle ID:) when the string is something like Object Name: F:\Shared\Full_Option\Standed sinhala letters\Lalith\~$rapt order.doc? Something with spaces. - Picked
@CasperNine, you could try matching against newlines instead of any whitespace characters... [^\r\n]+(?=\s+Handle ID:) - Eberle
For any future visitors, I strongly suggest looking at @Ticonderoga's answer, which captures the need better and is a more general-purpose solution. - Hoseahoseia
It depends on the regex engine being used. - Eberle
T
75

But I need the match result to be ... not in a match group...

For what you are trying to do, this should work. \K resets the starting point of the match.

\bObject Name:\s+\K\S+

You can do the same for getting your Security ID matches.

\bSecurity ID:\s+\K\S+
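
If your engine has no \K (Python's standard re module, for example, does not), roughly the same result can be had by moving the value into a capture group. The sketch below is only an illustration under that assumption; value_after and SAMPLE are hypothetical names invented for this example.

```python
import re
from typing import Optional

# Two lines taken from the question's sample text.
SAMPLE = r"""    Security ID:        S-1-5-21-3368353891-1012177287-890106238-22451
    Object Name:    D:\ApacheTomcat\apache-tomcat-6.0.36\logs\localhost.2013-07-01.log
"""

def value_after(label: str, text: str) -> Optional[str]:
    """Return the first run of non-whitespace that follows `label:` in `text`."""
    # Same idea as  \b<label>:\s+\K\S+  but the value sits in group 1
    # instead of being the whole match.
    pattern = r"\b" + re.escape(label) + r":\s+(\S+)"
    match = re.search(pattern, text)
    return match.group(1) if match else None

print(value_after("Object Name", SAMPLE))
print(value_after("Security ID", SAMPLE))
```

Like the \S+ in the patterns above, this stops at the first whitespace, so a path containing spaces would be cut short.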
Ticonderoga answered 5/10, 2013 at 2:51 Comment(3)
\K is not working in JavaScript; any other solutions? - Patten
This worked great for me in Notepad++. I'm not sure what regex processor it uses, but it does allow the \K when doing regex searches. - Exchangeable
regexr says \K works only with PCRE and not in JavaScript. No clue what PCRE is though; it seems to be server-side stuff. - Floro
M
20

You're almost there. Use the following regex (with multi-line option enabled)

\bObject Name:\s+(.*)$

The complete match would be

Object Name:   D:\ApacheTomcat\apache-tomcat-6.0.36\logs\localhost.2013-07-01.log

while the captured group one would contain

D:\ApacheTomcat\apache-tomcat-6.0.36\logs\localhost.2013-07-01.log

If you want the file path itself to be the complete match, use a lookbehind (the match will still include the spaces that follow the colon):

(?m)(?<=\bObject Name:).*$
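
To see what the lookbehind version returns, here is a small Python sketch. It assumes Python's re module, which accepts only fixed-width lookbehinds; SAMPLE, raw, path and variable_width_ok are names invented for this illustration.

```python
import re

# The "Object:" block from the question's sample text.
SAMPLE = r"""Object:
    Object Server:  Security
    Object Type:    File
    Object Name:    D:\ApacheTomcat\apache-tomcat-6.0.36\logs\localhost.2013-07-01.log
    Handle ID:  0x11dc
"""

# Fixed-width lookbehind: the whole match is everything after the label,
# including the padding spaces that follow the colon.
raw = re.search(r"(?m)(?<=\bObject Name:).*$", SAMPLE).group(0)
print(repr(raw))

# Trimming afterwards avoids the need for a variable-width lookbehind.
path = raw.strip()
print(path)

# Python's re rejects a variable-width lookbehind such as (?<=Object Name:\s+).
try:
    re.compile(r"(?<=Object Name:\s+).*")
    variable_width_ok = True
except re.error:
    variable_width_ok = False
print(variable_width_ok)
```

If the tool can post-process the match, trimming is simpler than hard-coding the number of spaces inside the lookbehind.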
Modiste answered 5/10, 2013 at 2:21 Comment(9)
I want the complete match to be D:\ApacheTomcat\apache-tomcat-6.0.36\logs\localhost.2013-07-01.log. Can't I do that? - Picked
@CasperNine Yes, you can. Updated the regex. - Modiste
@Ticonderoga Yes, that's correct. But how does that actually work? What if I need to match the words on the Security ID: line? - Picked
@CasperNine, did you try (?m)(?<=\bObject Name:).*$? - Modiste
@RaviThapliyal Your updated regex keeps a blank space at the front of the match. How do I avoid that? - Picked
@CasperNine, I guess it's not possible for you to trim it, but variable-length lookbehind is not supported by most regex engines. You could use (?m)(?<=\bObject Name:\s{4}).*$, but it would fail for other fields like Security ID: because the amount of whitespace varies. - Modiste
@hwnd: that would fail if the file structure changes (fields re-ordered or the next token dropped). - Modiste
Yes, I saw that, I posted an answer on how he could do it. - Ticonderoga
@RaviKThapliyal I need to extract "slprop: Information Analysis for Microsoft Office,Show Color next to Signal,Red" from pastebin.com/NRU4vJk6 . Please note there are line breaks. - Hamate
M
18

This might work out for you depending on which language you are using:

(?<=Object Name:).*

It's a positive lookbehind assertion.

It won't work in JavaScript engines without lookbehind support, though (lookbehind was only added in ES2018). In your comment I read that you're using it for logstash. If you are using grok parsing for logstash, then it would work. You can verify it yourself here:

https://grokdebug.herokuapp.com/


Module answered 20/9, 2016 at 10:56 Comment(0)
H
-4

Here's a quick Perl script to get what you need. The \s* after the label skips the padding whitespace.

#!/usr/bin/perl
use strict;
use warnings;

# Interpolating heredoc, so each backslash in the path is doubled.
my $sample = <<END;
Subject:
  Security ID:        S-1-5-21-3368353891-1012177287-890106238-22451
  Account Name:       ChamaraKer
  Account Domain:     JIC
  Logon ID:       0x1fffb

Object:
  Object Server:  Security
  Object Type:    File
  Object Name:    D:\\ApacheTomcat\\apache-tomcat-6.0.36\\logs\\localhost.2013-07-01.log
  Handle ID:  0x11dc
END

foreach my $line (split /\n/, $sample) {
  # Capture everything after the label, skipping the padding spaces.
  my ($path) = $line =~ m/Object Name:\s*(.+)/;
  print "$path\n" if defined $path;
}
Healing answered 5/10, 2013 at 2:35 Comment(1)
regex not python - Hyohyoid
H
-4

This is a Python solution.

import re

# Raw string so the backslashes in the Windows path are kept literally
# (otherwise "\a" would be read as the bell character \x07).
line = r"""Subject:
    Security ID:        S-1-5-21-3368353891-1012177287-890106238-22451
    Account Name:       ChamaraKer
    Account Domain:     JIC
    Logon ID:       0x1fffb

Object:
    Object Server:  Security
    Object Type:    File
    Object Name:    D:\ApacheTomcat\apache-tomcat-6.0.36\logs\localhost.2013-07-01.log
    Handle ID:  0x11dc"""

# Group 1 captures the rest of the line after the label and its padding.
regex = r'Object Name:\s+(.*)'
match1 = re.findall(regex, line)
print(match1)

Output:

['D:\\ApacheTomcat\\apache-tomcat-6.0.36\\logs\\localhost.2013-07-01.log']
Hodden answered 1/8, 2017 at 17:35 Comment(0)
