Regex to get the words after matching string

P

6

100

Below is the content:

Subject:
    Security ID:        S-1-5-21-3368353891-1012177287-890106238-22451
    Account Name:       ChamaraKer
    Account Domain:     JIC
    Logon ID:       0x1fffb

Object:
    Object Server:  Security
    Object Type:    File
    Object Name:    D:\ApacheTomcat\apache-tomcat-6.0.36\logs\localhost.2013-07-01.log
    Handle ID:  0x11dc

I need to capture the text that follows Object Name: on that line, which is D:\ApacheTomcat\apache-tomcat-6.0.36\logs\localhost.2013-07-01.log.

How can I do this?

^.*\bObject Name\b.*$ matches - Object Name

Picked answered 5/10, 2013 at 2:7 Comment(0)

E

60

If you are using a regex engine that doesn't support \K, the following should work for you:

[\n\r].*Object Name:\s*([^\n\r]*)

Your desired match will be in capture group 1.

[\n\r][ \t]*Object Name:[ \t]*([^\n\r]*)

This second pattern is similar, but it does not allow for things such as " blah Object Name: blah", and it also makes sure not to capture the next line if there is no actual content after "Object Name:".
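A minimal Python sketch of the capture-group approach, assuming the whole event text is held in one string. Python is only an illustration here (the asker's tool is Logstash), and the names LOG and PATTERN are made up for the example.

```python
import re

# Sample text shortened from the question; raw string keeps backslashes literal.
LOG = r"""Subject:
    Security ID:        S-1-5-21-3368353891-1012177287-890106238-22451
    Account Name:       ChamaraKer

Object:
    Object Type:    File
    Object Name:    D:\ApacheTomcat\apache-tomcat-6.0.36\logs\localhost.2013-07-01.log
    Handle ID:  0x11dc
"""

# Stricter pattern from the answer: only spaces/tabs may precede the key,
# and the value may not run past the end of the line.
PATTERN = re.compile(r"[\n\r][ \t]*Object Name:[ \t]*([^\n\r]*)")

match = PATTERN.search(LOG)
# The whole match includes the line break and the key; group 1 is the value.
object_name = match.group(1) if match else None
print(object_name)
```

Because the pattern starts with [\n\r], it cannot match an Object Name: entry that sits on the very first line of the input.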

Eberle answered 5/10, 2013 at 2:18 Comment(12)

But i need the match result to be D:\ApacheTomcat\apache-tomcat-6.0.36\logs\localhost.2013-07-01.log not in a match group – Picked 5/10, 2013 at 2:26

@CasperNine, why? And what language are you using? – Eberle 5/10, 2013 at 2:26

Because the program I'm using captures only the match result. I'm using a log management tool called Logstash. Put your regex into regexpal.com and see: it matches the whole line. – Picked 5/10, 2013 at 2:30

@CasperNine, it depends on whether that tool supports lookbehinds. Try this and let me know your result: (?<=Object Name:)([^\n\r]*) – Eberle 5/10, 2013 at 2:37

@CasperNine, then you'll have to either use capture groups or base it off the following line, like this: [^\s]+(?=\s+Handle ID:) The problem with this is that it isn't flexible, so if your format or order changes at all it won't work. – Eberle 5/10, 2013 at 2:45

n̶o̶p̶e̶ ̶i̶t̶ ̶d̶o̶e̶s̶n̶'̶t̶ ̶w̶o̶r̶k̶ ̶:̶(̶ . Sorry it Works. but keeps a blank space at the beginning of the match line/ – Picked 5/10, 2013 at 2:46

let us continue this discussion in chat – Picked 5/10, 2013 at 2:47

Most engines do not allow variable-length quantifiers in lookbehinds, so you could only remove the blank space by putting in the exact number of spaces... but this wouldn't be flexible if the number of spaces between the key and the value varies. – Eberle 5/10, 2013 at 2:53

I have one more question from you. How do i use [^\s]+(?=\s+Handle ID:) when the string is something like Object Name: F:\Shared\Full_Option\Standed sinhala letters\Lalith\~$rapt order.doc? Something with spaces – Picked 8/10, 2013 at 8:42

@CasperNine, you could try matching against newlines instead of any space characters... [^\r\n]+(?=\s+Handle ID:) – Eberle 8/10, 2013 at 12:55

For any future visitors, I strongly suggest looking at @Ticonderoga's answer, which captures the need better and is a more general-purpose solution – Hoseahoseia 25/10, 2022 at 13:52

depends on the regex engine being used. – Eberle 15/11, 2022 at 23:17

T

75

But I need the match result to be ... not in a match group...

For what you are trying to do, this should work. \K resets the starting point of the match.

\bObject Name:\s+\K\S+

You can do the same for getting your Security ID matches.

\bSecurity ID:\s+\K\S+
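Python's built-in re module does not support \K, so the closest equivalent there is a capture group. The sketch below wraps that in a small helper so the same pattern shape serves both keys; value_after and LOG are hypothetical names used only for this illustration.

```python
import re
from typing import Optional

# Sample text shortened from the question; raw string keeps backslashes literal.
LOG = r"""Subject:
    Security ID:        S-1-5-21-3368353891-1012177287-890106238-22451
    Account Name:       ChamaraKer

Object:
    Object Name:    D:\ApacheTomcat\apache-tomcat-6.0.36\logs\localhost.2013-07-01.log
    Handle ID:  0x11dc
"""


def value_after(key: str, text: str) -> Optional[str]:
    """Return the first run of non-whitespace characters that follows `key:`."""
    # Same shape as \bKEY:\s+\K\S+, with a capture group standing in for \K.
    match = re.search(r"\b" + re.escape(key) + r":\s+(\S+)", text)
    return match.group(1) if match else None


object_name = value_after("Object Name", LOG)
security_id = value_after("Security ID", LOG)
print(object_name)
print(security_id)
```

As with the original pattern, \S+ stops at the first whitespace character, so a path that contains spaces would be cut short.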

Ticonderoga answered 5/10, 2013 at 2:51 Comment(3)

\K not working in javascript, any other solutions? – Patten 1/11, 2016 at 3:59

This worked great for me in Notepad++. I'm not sure what regex processor it uses, but it does allow the \K when doing regex searches. – Exchangeable 7/6, 2017 at 20:33

regexr says \K works only with PCRE and not in javascript, no clue what PCRE is though, seems server sided stuff. – Floro 11/9, 2018 at 14:15

M

20

You're almost there. Use the following regex (with multi-line option enabled)

\bObject Name:\s+(.*)$

The complete match would be

Object Name:   D:\ApacheTomcat\apache-tomcat-6.0.36\logs\localhost.2013-07-01.log

while the captured group one would contain

D:\ApacheTomcat\apache-tomcat-6.0.36\logs\localhost.2013-07-01.log

If you want to capture the file path directly use

(?m)(?<=\bObject Name:).*$
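A short Python sketch of the lookbehind variant, assuming an engine that accepts fixed-width lookbehinds (Python's re module does). It shows that the whole match still carries the padding after the colon, which has to be trimmed outside the regex; LOG, LOOKBEHIND, raw_value and trimmed_value are illustrative names.

```python
import re

# Sample text shortened from the question; raw string keeps backslashes literal.
LOG = r"""Object:
    Object Server:  Security
    Object Type:    File
    Object Name:    D:\ApacheTomcat\apache-tomcat-6.0.36\logs\localhost.2013-07-01.log
    Handle ID:  0x11dc
"""

# (?m) makes $ match at the end of each line; the lookbehind keeps the key
# itself out of the match.
LOOKBEHIND = re.compile(r"(?m)(?<=\bObject Name:).*$")

match = LOOKBEHIND.search(LOG)
raw_value = match.group(0) if match else ""
# The padding between the colon and the path is part of the match,
# so it is stripped after the fact.
trimmed_value = raw_value.strip()
print(repr(raw_value))
print(trimmed_value)
```

In a tool that only exposes the whole match and offers no trimming step, the leading spaces remain, which is the limitation raised in the comments on this answer.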

Modiste answered 5/10, 2013 at 2:21 Comment(9)

I want the complete match to be D:\ApacheTomcat\apache-tomcat-6.0.36\logs\localhost.2013-07-01.log can't i do that? – Picked 5/10, 2013 at 2:32

@CasperNine Yes, you can. Updated the regex. – Modiste 5/10, 2013 at 2:37

@Ticonderoga yes, that's correct. But how does that actually work? What if I need to match the words on the Security ID: line? – Picked 5/10, 2013 at 2:39

@CasperNine, did you try (?m)(?<=\bObject Name:).*$? – Modiste 5/10, 2013 at 2:43

@RaviThapliyal your updated regex keeps a blank space in front of the line. how do i avoid that? – Picked 5/10, 2013 at 2:44

@CasperNine, I guess it's not possible for you to trim it afterwards, and variable-length lookbehind is not supported by most regex engines. You could use (?m)(?<=\bObject Name:\s{4}).*$ but it would fail for other keys like Security ID: because the amount of whitespace varies. – Modiste 5/10, 2013 at 2:47

@hwnd: that would fail if the file structure changes (re-ordered or the next token is dropped). – Modiste 5/10, 2013 at 2:53

Yes, I saw that, I posted an answer on how he could do it. – Ticonderoga 5/10, 2013 at 2:54

@RaviKThapliyal I need to extract "slprop: Information Analysis for Microsoft Office,Show Color next to Signal,Red" from pastebin.com/NRU4vJk6 . Please note there are line breaks. – Hamate 26/4, 2023 at 9:59

M

18

This might work out for you depending on which language you are using:

(?<=Object Name:).*

It's a positive lookbehind assertion: it requires Object Name: to appear immediately before the match without including it in the match.

It won't work in JavaScript engines that predate ES2018, which is when lookbehind assertions were added to the language. In your comment I read that you're using it for Logstash. If you are using grok parsing in Logstash, then it would work. You can verify it yourself here:

https://grokdebug.herokuapp.com/

Module answered 20/9, 2016 at 10:56 Comment(0)

H

-4

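Here's a quick Perl script to get what you need. The \s* in front of the capture group skips the padding after the colon, so the captured path needs no further whitespace trimming.

#!/usr/bin/perl
use strict;
use warnings;

# Single-quoted heredoc, so the backslashes in the path stay literal.
my $sample = <<'END';
Subject:
  Security ID:        S-1-5-21-3368353891-1012177287-890106238-22451
  Account Name:       ChamaraKer
  Account Domain:     JIC
  Logon ID:       0x1fffb

Object:
  Object Server:  Security
  Object Type:    File
  Object Name:    D:\ApacheTomcat\apache-tomcat-6.0.36\logs\localhost.2013-07-01.log
  Handle ID:  0x11dc
END

my @sample_lines = split /\n/, $sample;

foreach my $line (@sample_lines) {
  # Capture the rest of the line after the key (the original [^s]+ stopped at the first letter "s").
  my ($path) = $line =~ m/Object Name:\s*(.+)/;
  if ($path) {
    print $path . "\n";
  }
}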

Healing answered 5/10, 2013 at 2:35 Comment(1)

regex not python – Hyohyoid 6/2, 2018 at 19:12

H

-4

This is a Python solution.

import re

# Raw string (r prefix) so that \a in the path is not read as the bell character.
line = r"""Subject:
    Security ID:        S-1-5-21-3368353891-1012177287-890106238-22451
    Account Name:       ChamaraKer
    Account Domain:     JIC
    Logon ID:       0x1fffb

Object:
    Object Server:  Security
    Object Type:    File
    Object Name:    D:\ApacheTomcat\apache-tomcat-6.0.36\logs\localhost.2013-07-01.log
    Handle ID:  0x11dc"""

# \s+ skips the padding after the colon; (.*) captures the rest of the line.
regex = r'Object Name:\s+(.*)'
match1 = re.findall(regex, line)
print(match1)

Output:

['D:\\ApacheTomcat\\apache-tomcat-6.0.36\\logs\\localhost.2013-07-01.log']

Hodden answered 1/8, 2017 at 17:35 Comment(0)
