Match with empty sequence

4

10

I'm learning F# and I've started to play around with both sequences and match expressions.

I'm writing a web scraper that's looking through HTML similar to the following and taking the last URL in a parent <span> with the paging class.

<html>
<body>
    <span class="paging">
        <a href="http://google.com">Link to Google</a>
        <a href="http://TheLinkIWant.com">The Link I want</a>
    </span>
</body>
</html>

My attempt to get the last URL is as follows:

type AnHtmlPage = FSharp.Data.HtmlProvider<"http://somesite.com">

let findMaxPageNumber (page:AnHtmlPage)= 
    page.Html.Descendants()
    |> Seq.filter(fun n -> n.HasClass("paging"))
    |> Seq.collect(fun n -> n.Descendants() |> Seq.filter(fun m -> m.HasName("a")))
    |> Seq.last
    |> fun n -> n.AttributeValue("href")

However, I'm running into issues when the class I'm searching for is absent from the page. In particular, Seq.last throws an ArgumentException with the message: The input sequence was empty.

My first thought was to build another function that matched empty sequences and returned an empty string when the paging class wasn't found on a page.

let findUrlOrReturnEmptyString (span:seq<HtmlNode>) =
    match span with 
    | Seq.empty -> String.Empty      // <----- This is invalid
    | span -> span
    |> Seq.collect(fun (n:HtmlNode) -> n.Descendants() |> Seq.filter(fun m -> m.HasName("a")))
    |> Seq.last
    |> fun n -> n.AttributeValue("href")

let findMaxPageNumber (page:AnHtmlPage)= 
    page.Html.Descendants()
    |> Seq.filter(fun n -> n.HasClass("paging"))
    |> findUrlOrReturnEmptyString

My issue now is that Seq.empty is not a literal and cannot be used in a pattern. Most pattern matching examples use the empty list [] in their patterns, so I'm wondering: how can I use a similar approach to match empty sequences?

Thermel answered 11/8, 2016 at 22:44 Comment(3)
Just use an if .. else here; match is just complicating things. (if Seq.isEmpty span then "" else ...) - Corymb
The example has been simplified; there are a few places in my pipeline where I'd have to start adding if-else. Since I'm new to F#, I'm mostly wondering if there is a proper way to match empty sequences, since it seems like a common thing to do. - Thermel
If it's common in your code and you are adamant about sticking with match, then create an active pattern for it. - Corymb
16

The suggestion that ildjarn gave in the comments is a good one: if you feel that using match would create more readable code, then make an active pattern to check for empty seqs:

let (|EmptySeq|_|) a = if Seq.isEmpty a then Some () else None

let s0 = Seq.empty<int>

match s0 with
| EmptySeq -> "empty"
| _ -> "not empty"

Run that in F# interactive, and the result will be "empty".
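As a rough sketch of how this pattern might slot into the question's scenario: the helper name lastLinkOrEmpty is hypothetical, and plain strings stand in for the HtmlNode values so the snippet runs without the HTML type provider.

```fsharp
// Partial active pattern from the answer above: matches only empty sequences.
let (|EmptySeq|_|) (a: seq<'T>) = if Seq.isEmpty a then Some () else None

// Hypothetical helper: returns the last link, or "" when there are none.
// Plain strings stand in for the HtmlNode values from the question.
let lastLinkOrEmpty (links: seq<string>) =
    match links with
    | EmptySeq -> ""
    | nonEmpty -> Seq.last nonEmpty

let noLinks = lastLinkOrEmpty Seq.empty
let someLinks = lastLinkOrEmpty [ "http://google.com"; "http://TheLinkIWant.com" ]
```

One thing to keep in mind: Seq.isEmpty starts enumerating the sequence, and Seq.last enumerates it again, so a lazy sequence with expensive or side-effecting elements may be evaluated more than once.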

Draft answered 12/8, 2016 at 3:15 Comment(2)
Thanks, I'm unfamiliar with active patterns but this looks like a pretty reasonable solution as well. - Thermel
FYI: msdn.microsoft.com/visualfsharpdocs/conceptual/… - Southeastwardly
13

You can use a when guard to further qualify the case:

match span with 
| sequence when Seq.isEmpty sequence -> String.Empty
| span -> span
|> Seq.collect (fun (n: HtmlNode) ->
    n.Descendants()
    |> Seq.filter (fun m -> m.HasName("a")))
|> Seq.last
|> fun n -> n.AttributeValue("href")

ildjarn is correct in that in this case, an if...then...else may be the more readable alternative, though.
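For reference, here is a minimal sketch of a variant in which both branches return a string, so the match type-checks. It assumes each "span" is represented as a plain list of href strings instead of an HtmlNode, which is a simplification and not the question's actual types.

```fsharp
open System

// Hypothetical stand-in for the question's function: each span is modelled
// as a list of href strings rather than an HtmlNode.
let findUrlOrReturnEmptyString (spans: seq<string list>) =
    // Flatten first, so the guard also covers spans that contain no links.
    let links = spans |> Seq.collect id
    match links with
    | s when Seq.isEmpty s -> String.Empty
    | s -> Seq.last s

let noSpans = findUrlOrReturnEmptyString Seq.empty
let emptySpan = findUrlOrReturnEmptyString [ [] ]
let lastHref = findUrlOrReturnEmptyString [ [ "http://google.com"; "http://TheLinkIWant.com" ] ]
```

Applying the guard after the flattening step is a design choice of this sketch: checking only the outer sequence would still let Seq.last throw when a paging span exists but holds no links.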

Mccrory answered 11/8, 2016 at 22:48 Comment(3)
This shouldn't compile. span is of type seq<HtmlNode> and if String.Empty were defined, which it isn't as far as I know, it certainly should be of type string. Did you maybe mean to indent the pipeline? Otherwise, use a one-line early-out like if Seq.isEmpty span then "" else to allow opening a branch without requiring deeper indentation. - Suspect
That is correct; it doesn't compile. I knew that without setting up the whole thing including the HTML type provider, it wouldn't compile anyway, so I didn't make any effort to fix it. It would have been nicer to have at least the match expression compiling, yes. - Mccrory
String.Empty has been defined since .NET 1.0. Provided you open System of course. - Gimpel
5

Use a guard clause:

match myseq with
| s when Seq.isEmpty s -> "empty"
| _ -> "not empty"
Forta answered 3/4, 2019 at 16:3 Comment(0)
2

Building on the answer from @rmunn, you can make a more general sequence equality active pattern.

let (|Seq|_|) test input =
    if Seq.compareWith Operators.compare input test = 0
        then Some ()
        else None

match [] with
| Seq [] -> "empty"
| _ -> "not empty"
Joceline answered 12/8, 2016 at 7:27 Comment(1)
BTW, you might think that naming the active pattern Seq would conflict with the Seq module, but it won't. You'd still be able to use Seq.append and other functions from the Seq module; the compiler will figure it out. Inside a match pattern, the name Seq will reference the active pattern; outside a pattern, the name Seq will continue to reference the module. - Draft
