Inconsistent IEnumerable ArgumentException while generating a complex object using FsCheck
Asked Answered
E

1

12

The Problem

In F#, I am using FsCheck to generate an object (which I'm then using in an Xunit test, but I can recreate entirely outside of Xunit, so I think we can forget about Xunit). Running the generation 20 times in FSI,

  • 50% of the time, the generation runs successfully.
  • 25% of the time, the generation throws:

    System.ArgumentException: The input must be non-negative.
    Parameter name: index
    >    at Microsoft.FSharp.Collections.SeqModule.Item[T](Int32 index, IEnumerable`1 source)
       at [email protected](Int32 n, StdGen r0) in C:\Users\Kurt\Projects\FsCheck\FsCheck\src\FsCheck\Gen.fs:line 63
       at FsCheck.Gen.go@290-1[b](FSharpList`1 gs, FSharpList`1 acc, Int32 size, StdGen r0) in C:\Users\Kurt\Projects\FsCheck\FsCheck\src\FsCheck\Gen.fs:line 295
       at [email protected](Int32 n, StdGen r) in C:\Users\Kurt\Projects\FsCheck\FsCheck\src\FsCheck\Gen.fs:line 297
       at [email protected](Int32 n, StdGen r0) in C:\Users\Kurt\Projects\FsCheck\FsCheck\src\FsCheck\Gen.fs:line 63
       at FsCheck.Gen.sample@155[a](Int32 size, Gen`1 gn, Int32 i, StdGen seed, FSharpList`1 samples) in C:\Users\Kurt\Projects\FsCheck\FsCheck\src\FsCheck\Gen.fs:line 157
       at FsCheck.Gen.Sample[a](Int32 size, Int32 n, Gen`1 gn) in C:\Users\Kurt\Projects\FsCheck\FsCheck\src\FsCheck\Gen.fs:line 155
       at <StartupCode$FSI_0026>.$FSI_0026.main@() in C:\projects\Alberta\Core\TestFunc\Script.fsx:line 57
    Stopped due to error
    
  • 25% of the time, the generation throws:

    System.ArgumentException: The input sequence has an insufficient number of elements.
    Parameter name: index
    >    at Microsoft.FSharp.Collections.IEnumerator.nth[T](Int32 index, IEnumerator`1 e)
       at Microsoft.FSharp.Collections.SeqModule.Item[T](Int32 index, IEnumerable`1 source)
       at [email protected](Int32 n, StdGen r0) in C:\Users\Kurt\Projects\FsCheck\FsCheck\src\FsCheck\Gen.fs:line 63
       at FsCheck.Gen.go@290-1[b](FSharpList`1 gs, FSharpList`1 acc, Int32 size, StdGen r0) in C:\Users\Kurt\Projects\FsCheck\FsCheck\src\FsCheck\Gen.fs:line 295
       at [email protected](Int32 n, StdGen r) in C:\Users\Kurt\Projects\FsCheck\FsCheck\src\FsCheck\Gen.fs:line 297
       at [email protected](Int32 n, StdGen r0) in C:\Users\Kurt\Projects\FsCheck\FsCheck\src\FsCheck\Gen.fs:line 63
       at FsCheck.Gen.sample@155[a](Int32 size, Gen`1 gn, Int32 i, StdGen seed, FSharpList`1 samples) in C:\Users\Kurt\Projects\FsCheck\FsCheck\src\FsCheck\Gen.fs:line 157
       at FsCheck.Gen.Sample[a](Int32 size, Int32 n, Gen`1 gn) in C:\Users\Kurt\Projects\FsCheck\FsCheck\src\FsCheck\Gen.fs:line 155
       at <StartupCode$FSI_0025>.$FSI_0025.main@() in C:\projects\Alberta\Core\TestFunc\Script.fsx:line 57
    Stopped due to error
    

The Situation

The object is as follows:

type Event =
| InitEvent of string
| RefEvent of string
type Stream = Event seq

The object must be follow these rules to be valid:

  1. All InitEvents must come before all RefEvents
  2. All InitEvents strings must be unique
  3. All RefEvent names must have an earlier corresponding InitEvent
  4. But it's OK if some InitEvents NOT have later corresponding RefEvents
  5. But it's OK if multiple RefEvents have the same name

The Working Workaround

If I have the generator call a function which returns a valid object and do a Gen.constant (function), I never run into the exceptions, but this is not the way FsCheck is meant to be run! :)

/// <summary>
/// This is a non-generator equivalent which is 100% reliable
/// </summary>
let randomStream size =
   // valid names for a sample
   let names = Gen.sample size size Arb.generate<string> |> List.distinct
   // init events
   let initEvents = names |> List.map( fun name -> name |> InitEvent )
   // reference events
   let createRefEvent name = name |> RefEvent
   let genRefEvent = createRefEvent <!> Gen.elements names
   let refEvents = Gen.sample size size genRefEvent
   // combine
   Seq.append initEvents refEvents


type MyGenerators =
   static member Stream() = {
      new Arbitrary<Stream>() with
         override x.Generator = Gen.sized( fun size -> Gen.constant (randomStream size) )
   }

// repeatedly running the following two lines ALWAYS works
Arb.register<MyGenerators>()
let foo = Gen.sample 10 10 Arb.generate<Stream>

The Broken Right Way?

I cannot seem to completely get away from generating a constant (need to store the list of names outside of the InitEvents so that RefEvent generation can get at them, but I can get more in line with how FsCheck generators seem to work:

type MyGenerators =
   static member Stream() = {
      new Arbitrary<Stream>() with
         override x.Generator = Gen.sized( fun size ->
            // valid names for a sample
            let names = Gen.sample size size Arb.generate<string> |> List.distinct
            // generate inits
            let genInits = Gen.constant (names |> List.map InitEvent) |> Gen.map List.toSeq
            // generate refs
            let makeRef name = name |> RefEvent
            let genName = Gen.elements names
            let genRef = makeRef <!> genName
            Seq.append <!> genInits <*> ( genRef |> Gen.listOf )
         )
   }

// repeatedly running the following two lines causes the inconsistent errors
// If I don't re-register my generator, I always get the same samples.
// Is this because FsCheck is trying to be deterministic?
Arb.register<MyGenerators>()
let foo = Gen.sample 10 10 Arb.generate<Stream>

What I've Already Checked

  • Sorry, forgot to mention in original question that I've tried to Debug in Interactive, and because of the inconsistent behavior, it's somewhat hard to track down. However, when the exceptions hit, it seems to be between the end of my generator code and what is asking for the generated samples -- while FsCheck is DOING the generation, it seems to be trying to process a malformed sequence. I'm further assuming that this is because I've incorrectly coded the generator.
  • IndexOutOfRangeException using FsCheck suggests a potentially similar situation. I have tried running my Xunit tests both via Resharper test runner as well as Xunit's console test runner on the real-world tests on which the above simplification is based. Both runners exhibit identical behavior, so the issue is somewhere else.
  • Other "How do I generate..." questions such as In FsCheck, how to generate a test record with non-negative fields? and How does one generate a "complex" object in FsCheck? deal with the creation of objects of a lesser complexity. The first was a great help for getting to the code I have, and the second gives a much-needed example of Arb.convert, but Arb.convert doesn't make sense if I'm converting from a "constant" list of randomly generated names. It all seems to come back to that -- the need to make random names, which are then pulled from to make a complete set of InitEvents, and some sequence of RefEvents, both which refer back to the "constant" list, doesn't match anything that I've yet come across.
  • I've looked through most examples of FsCheck generators I can find, including the included examples in FsCheck: https://github.com/fscheck/FsCheck/blob/master/examples/FsCheck.Examples/Examples.fs These also do not deal with an object needing internal consistency, and do not seem to apply to this case, even though they have been helpful overall.
  • Perhaps this means that I'm approaching the generation of the object from an unhelpful perspective. If there is a different way to generate an object which follows the above rules, I'm open to switching to it.
  • Further backing away from the problem, I've seen other SO posts which roughly say "If your object has such restrictions, then what happens when you receive an invalid object? Perhaps you need to rethink the way this object is consumed to better handle invalid cases." If, for example, I were able to initialize on-the-fly a never before seen name in a RefEvent, the entire need for giving an InitEvent first would go away -- the problem gracefully reduces to simply a sequence of RefEvents of some random name. I am open to this kind of solution, but it would require a bit of rework -- in the long run, it may be worth it. In the mean time, the question remains, how can you reliably generate a complex object which follows the above rules using FsCheck?

Thanks!

EDIT(S): Attempts to Solve

  • The code in Mark Seemann's answer works, but yields a slightly different object than I was looking for (I was unclear in my object rules -- now hopefully clarified). Putting his working code in my generator:

    type MyGenerators =
       static member Stream() = {
          new Arbitrary<Stream>() with
             override x.Generator =
                gen {
                   let! uniqueStrings = Arb.Default.Set<string>().Generator
                   let initEvents = uniqueStrings |> Seq.map InitEvent
    
                   let! sortValues =
                      Arb.Default.Int32()
                      |> Arb.toGen
                      |> Gen.listOfLength uniqueStrings.Count
                   let refEvents =
                      Seq.zip uniqueStrings sortValues
                      |> Seq.sortBy snd
                      |> Seq.map fst
                      |> Seq.map RefEvent
    
                   return Seq.append initEvents refEvents
                }
        }
    

    This yields an object where every InitEvent has a matching RefEvent, and there is only one RefEvent for each InitEvent. I'm trying to tweak the code so that I can get multiple RefEvents for each name, and not all names need to have a RefEvent. ex: Init foo, Init bar, Ref foo, Ref foo is perfectly valid. Trying to tweak this with:

    type MyGenerators =
       static member Stream() = {
          new Arbitrary<Stream>() with
             override x.Generator =
                gen {
                   let! uniqueStrings = Arb.Default.Set<string>().Generator
                   let initEvents = uniqueStrings |> Seq.map InitEvent
    
                   // changed section starts
                   let makeRef name = name |> RefEvent
                   let genRef = makeRef <!> Gen.elements uniqueStrings
                   return! Seq.append initEvents <!> ( genRef |> Gen.listOf )
                   // changed section ends
                }
       }
    

    The modified code still exhibits the inconsistent behavior. Interestingly, out of 20 sample runs, only three worked (down from 10), while the insufficient number of elements was thrown 8 times and The input must be non-negative was thrown 9 times -- these changes have made the edge case more than twice as likely to be hit. We're now down to a very small section of code with the error.

  • Mark quickly responded with another version to address changed requirements:

    type MyGenerators =
       static member Stream() = {
          new Arbitrary<Stream>() with
             override x.Generator =
                gen {
                   let! uniqueStrings = Arb.Default.NonEmptySet<string>().Generator
                   let initEvents = uniqueStrings.Get |> Seq.map InitEvent
    
                   let! refEvents =
                      uniqueStrings.Get |> Seq.map RefEvent |> Gen.elements |> Gen.listOf
    
                   return Seq.append initEvents refEvents
                }
       }
    

    This allowed for some names to not have a RefEvent.

FINAL CODE A very minor tweak gets it so that duplicate RefEvents can happen:

type MyGenerators =
   static member Stream() = {
      new Arbitrary<Stream>() with
         override x.Generator =
            gen {
               let! uniqueStrings = Arb.Default.NonEmptySet<string>().Generator
               let initEvents = uniqueStrings.Get |> Seq.map InitEvent

               let! refEvents =
                  //uniqueStrings.Get |> Seq.map RefEvent |> Gen.elements |> Gen.listOf
                  Gen.elements uniqueStrings.Get |> Gen.map RefEvent |> Gen.listOf

               return Seq.append initEvents refEvents
            }
   }

Big thanks to Mark Seemann!

Evaporate answered 10/3, 2016 at 17:34 Comment(4)
Here's a suggestion: what happens if on the last line of your generator you use List.append instead of Seq.append?Satiny
Thanks for suggestion, I just tried turning the overall Stream type to be an Event list instead of Event seq, which means that at the end of my generator I do a List.append instead -- end result is same behavior, same inconsistency, same Sequence-related classes even though it's now a list. I'm not sure what the sequence is, perhaps internals of FsCheck.Gen?Evaporate
The stack trace points to GenBuilder.bind@62 github.com/fscheck/FsCheck/blob/master/src/FsCheck/Gen.fs#L61, which is completely generic and doesn't reference anything sequence-y at all. This makes me believe that the sequence functions come from the arguments of GenBuilder.bind. Initially I thought that the arguments are supplied by you, but now that you don't have any seqs in there either, that theory doesn't stand.Satiny
Not yet getting computational expressions, I'm not sure what FsCheck is doing here. Looking down the call stack one and two levels, puts you inside a nested recursive function which seems to convert a sequence to a list one element at a time, with line 295 github.com/fscheck/FsCheck/blob/master/src/FsCheck/Gen.fs#L295 calling bind for each item? The initial sequence passed into the SequenceToList function according to the call stack is again bind @62, I've run into a circle, and I'm not sure where to get off! :)Evaporate
N
6

Gen

Here's one way to address the requirements:

open FsCheck

let streamGen = gen {
    let! uniqueStrings = Arb.Default.Set<string>().Generator
    let initEvents = uniqueStrings |> Seq.map InitEvent

    let! sortValues =
        Arb.Default.Int32()
        |> Arb.toGen
        |> Gen.listOfLength uniqueStrings.Count
    let refEvents =
        Seq.zip uniqueStrings sortValues
        |> Seq.sortBy snd
        |> Seq.map fst
        |> Seq.map RefEvent

    return Seq.append initEvents refEvents }

The semi-official answer on how to generate unique strings is to generate a Set<string>. Since Set<'a> also implements 'a seq, you can use all the normal Seq functions on it.

Generating the InitEvent values, then, is a simple map operation over the unique strings.

Since each RefEvent must have a corresponding InitEvent, you can reuse the same unique strings, but you may want to give the RefEvent values on option to appear in a different order. To do that, you can generate sortValues, which is a list of random int values. This list has the same length as the set of strings.

At this point, you have a list of unique strings, and a list of random integers. Here are some fake values that illustrate the concept:

> let uniqueStrings = ["foo"; "bar"; "baz"];;
val uniqueStrings : string list = ["foo"; "bar"; "baz"]

> let sortValues = [42; 1337; 42];;    
val sortValues : int list = [42; 1337; 42]

You can now zip them:

> List.zip uniqueStrings sortValues;;
val it : (string * int) list = [("foo", 42); ("bar", 1337); ("baz", 42)]

Sorting such a sequence on its second element will give you a randomly shuffled list, and then you can map to only the first element:

> List.zip uniqueStrings sortValues |> List.sortBy snd |> List.map fst;;
val it : string list = ["foo"; "baz"; "bar"]

Since all InitEvent values must come before the RefEvent values, you can now append refEvents to initEvents, and return this combined list.

Verification

You can verify that streamGen works as intended:

open FsCheck.Xunit
open Swensen.Unquote

let isInitEvent = function InitEvent _ -> true | _ -> false
let isRefEvent =  function RefEvent  _ -> true | _ -> false

[<Property(MaxTest = 100000)>]
let ``All InitEvents must come before all RefEvents`` () =
    Prop.forAll (streamGen |> Arb.fromGen) <| fun s ->
        test <@ s |> Seq.skipWhile isInitEvent |> Seq.forall isRefEvent @>

[<Property(MaxTest = 100000)>]
let ``All InitEvents strings must be unique`` () =
    Prop.forAll (streamGen |> Arb.fromGen) <| fun s ->
        let initEventStrings =
            s |> Seq.choose (function InitEvent s -> Some s | _ -> None)
        let distinctStrings = initEventStrings |> Seq.distinct

        distinctStrings |> Seq.length =! (initEventStrings |> Seq.length)

[<Property(MaxTest = 100000)>]
let ``All RefEvent names must have an earlier corresponding InitEvent`` () =
    Prop.forAll (streamGen |> Arb.fromGen) <| fun s ->
        let initEventStrings =
            s
            |> Seq.choose (function InitEvent s -> Some s | _ -> None)
            |> Seq.sort
            |> Seq.toList
        let refEventStrings =
            s
            |> Seq.choose (function RefEvent s -> Some s | _ -> None)
            |> Seq.sort
            |> Seq.toList

        initEventStrings =! refEventStrings

These three properties all pass on my machine.


Looser requirements

Based on the looser requirements outlined in the comments to this answer, here's an updated generator that draws values from the InitEvents strings:

open FsCheck

let streamGen = gen {
    let! uniqueStrings = Arb.Default.NonEmptySet<string>().Generator
    let initEvents = uniqueStrings.Get |> Seq.map InitEvent

    let! refEvents =
        uniqueStrings.Get |> Seq.map RefEvent |> Gen.elements |> Gen.listOf

    return Seq.append initEvents refEvents }

This time, uniqueStrings is a non-empty set of strings.

You can use Seq.map RefEvent to generate a sequence of all valid RefEvent values based on uniqueStrings, and then Gen.elements to defined a generator of valid RefEvent values that draws from that sequence of valid values. Finally, Gen.listOf creates lists of values generated by that generator.

Tests

These tests demonstrate that streamGen generates values according to the rules:

open FsCheck.Xunit
open Swensen.Unquote

let isInitEvent = function InitEvent _ -> true | _ -> false
let isRefEvent =  function RefEvent  _ -> true | _ -> false

[<Property(MaxTest = 100000)>]
let ``All InitEvents must come before all RefEvents`` () =
    Prop.forAll (streamGen |> Arb.fromGen) <| fun s ->
        test <@ s |> Seq.skipWhile isInitEvent |> Seq.forall isRefEvent @>

[<Property(MaxTest = 100000)>]
let ``All InitEvents strings must be unique`` () =
    Prop.forAll (streamGen |> Arb.fromGen) <| fun s ->
        let initEventStrings =
            s |> Seq.choose (function InitEvent s -> Some s | _ -> None)
        let distinctStrings = initEventStrings |> Seq.distinct

        distinctStrings |> Seq.length =! (initEventStrings |> Seq.length)

[<Property(MaxTest = 100000)>]
let ``All RefEvent names must have an earlier corresponding InitEvent`` () =
    Prop.forAll (streamGen |> Arb.fromGen) <| fun s ->
        let initEventStrings =
            s
            |> Seq.choose (function InitEvent s -> Some s | _ -> None)
            |> Seq.sort
            |> Set.ofSeq

        test <@ s
                |> Seq.choose (function RefEvent s -> Some s | _ -> None)
                |> Seq.forall initEventStrings.Contains @>

These three properties all pass on my machine.

Natala answered 10/3, 2016 at 21:8 Comment(2)
Sorry for brusqueness, hindered by comment length! :) Your code works as is, but isn't quite what I am looking for. I was unclear -- no need for each Init to have a matching Ref, and there can be multiple Refs for any given Init. I'm trying to tweak your code, but I don't yet get the gen {...} -- let appears to keep the result in the realm of the Gen, while let! "drops" the result to a value the Gen is based on. (Drop being the opposite of "lifting" a value into a Gen). Am I on the right track? My first attempt still runs into same error, so at least honing in on problem!Evaporate
Perfect! I have much appreciation and gratitude for your time spent preparing a helpful and informative answer. You have given me many things to think about!Evaporate

© 2022 - 2024 — McMap. All rights reserved.