I found this useful article on using Active Patterns with Regular Expressions: http://www.markhneedham.com/blog/2009/05/10/f-regular-expressionsactive-patterns/
The original code snippet used in the article was this:
    open System.Text.RegularExpressions

    let (|Match|_|) pattern input =
        let m = Regex.Match(input, pattern) in
        if m.Success then Some (List.tl [ for g in m.Groups -> g.Value ]) else None

    let ContainsUrl value =
        match value with
        | Match "(http:\/\/\S+)" result -> Some(result.Head)
        | _ -> None
If I understand the snippet correctly, this tells you whether at least one URL was found and, if so, what the first URL was.
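To check that reading, here is a small sketch of how I would expect the snippet to behave. The sample strings and the names `containsUrl`, `found`, and `missing` are my own. I also swapped in `List.tail` for the older `List.tl` and added an explicit `Seq.cast<Group>`, so this is an approximation of the article's code rather than a copy of it.

```fsharp
open System.Text.RegularExpressions

// Active pattern modelled on the article: returns the capture groups
// (everything after group 0, the whole match) when the regex matches.
let (|Match|_|) (pattern: string) (input: string) =
    let m = Regex.Match(input, pattern)
    if m.Success then
        Some (List.tail [ for g in Seq.cast<Group> m.Groups -> g.Value ])
    else
        None

// Hypothetical helper: returns the first URL in the input, if there is one.
let containsUrl value =
    match value with
    | Match @"(http://\S+)" result -> Some (List.head result)
    | _ -> None

let found = containsUrl "see http://www.bob.com and http://www.b.com"
let missing = containsUrl "no links here"

printfn "%A" found    // Some "http://www.bob.com"
printfn "%A" missing  // None
```

Only the first URL comes back, even though the input contains two.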
Then in the comment section Joel suggested this modification:
Alternative, since a given group may or may not be a successful match:
    List.tail [ for g in m.Groups -> if g.Success then Some g.Value else None ]
Or maybe you give labels to your groups and you want to access them by name:
    (re.GetGroupNames()
     |> Seq.map (fun n -> (n, m.Groups.[n]))
     |> Seq.filter (fun (n, g) -> g.Success)
     |> Seq.map (fun (n, g) -> (n, g.Value))
     |> Map.ofSeq)
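To see what the named-group version produces, I tried it on a pattern with labelled groups. The pattern and the group names (`scheme`, `host`, `path`) are my own invention for illustration and do not come from Joel's comment.

```fsharp
open System.Text.RegularExpressions

// Hypothetical pattern with three named groups; "path" is optional.
let re = Regex(@"(?<scheme>https?)://(?<host>[^/\s]+)(?<path>/\S*)?")
let m = re.Match("http://www.bob.com")

// Same pipeline as in the comment: keep only the groups that actually matched,
// keyed by group name.
let groups =
    re.GetGroupNames()
    |> Seq.map (fun n -> (n, m.Groups.[n]))
    |> Seq.filter (fun (_, g) -> g.Success)
    |> Seq.map (fun (n, g) -> (n, g.Value))
    |> Map.ofSeq

printfn "%A" groups
```

The map contains `"0"` (the whole match), `"scheme"`, and `"host"`. The `"path"` group did not take part in the match, so the filter drops it.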
After trying to combine all of this I came up with the following code:
    open System.Text.RegularExpressions

    let testString = "http://www.bob.com http://www.b.com http://www.bob.com http://www.bill.com"

    let (|Match|_|) pattern input =
        let re = new Regex(pattern)
        let m = re.Match(input)
        if m.Success then
            // Map of group name -> value for every group that matched
            re.GetGroupNames()
            |> Seq.map (fun n -> (n, m.Groups.[n]))
            |> Seq.filter (fun (n, g) -> g.Success)
            |> Seq.map (fun (n, g) -> (n, g.Value))
            |> Map.ofSeq
            |> Some
        else
            None

    let GroupMatches stringToSearch =
        match stringToSearch with
        | Match "(http:\/\/\S+)" result -> printfn "%A" result
        | _ -> ()

    GroupMatches testString;;
When I run this code in F# Interactive, this is the output:
    map [("0", "http://www.bob.com"); ("1", "http://www.bob.com")]
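My current guess about this output, which may be incomplete: `Regex.Match` only returns the first match, and its `Groups` collection holds the whole match as group `"0"` plus my single capture group as group `"1"`, which is why the same URL shows up twice. `Regex.Matches`, by contrast, returns every match in the input. The sketch below compares the two; the names `text`, `firstGroups`, and `allUrls` are just for illustration.

```fsharp
open System.Text.RegularExpressions

let text = "http://www.bob.com http://www.b.com http://www.bob.com http://www.bill.com"
let pattern = @"(http://\S+)"

// Regex.Match stops at the first hit; its Groups are the whole match (group 0)
// followed by the numbered capture groups (here just group 1).
let first = Regex.Match(text, pattern)
let firstGroups = [ for g in Seq.cast<Group> first.Groups -> g.Value ]

// Regex.Matches walks the whole input and yields one Match per URL.
let allUrls = [ for m in Seq.cast<Match> (Regex.Matches(text, pattern)) -> m.Value ]

printfn "%A" firstGroups            // ["http://www.bob.com"; "http://www.bob.com"]
printfn "%d" (List.length allUrls)  // 4
```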
The result I am trying to achieve would look something like this:
    map [("http://www.bob.com", 2); ("http://www.b.com", 1); ("http://www.bill.com", 1)]
In other words, a map from each unique match to the number of times that match occurs in the text.
If you think I'm going down the wrong path here please feel free to suggest a completely different approach. I'm somewhat new to both Active Patterns and Regular Expressions so I have no idea where to even begin in trying to fix this.
I also came up with this, which is basically what I would write in C#, translated to F#:
    open System.Collections.Generic
    open System.Text.RegularExpressions

    let testString = "http://www.bob.com http://www.b.com http://www.bob.com http://www.bill.com"

    let matches =
        // Mutable dictionary of URL -> number of occurrences
        let matchDictionary = new Dictionary<string,int>()
        for mtch in (Regex.Matches(testString, "(http:\/\/\S+)")) do
            for m in mtch.Captures do
                if(matchDictionary.ContainsKey(m.Value)) then
                    matchDictionary.Item(m.Value) <- matchDictionary.Item(m.Value) + 1
                else
                    matchDictionary.Add(m.Value, 1)
        matchDictionary
When run, this returns:
    val matches : Dictionary<string,int> = dict [("http://www.bob.com", 2); ("http://www.b.com", 1); ("http://www.bill.com", 1)]
This is basically the result I am looking for, but I'm trying to learn the functional way to do this, and I think that should include active patterns. Feel free to try to "functionalize" this if it makes more sense than my first attempt.
Thanks in advance,
Bob