language-ext v3.4.14-beta Release Notes

Release Date: 2020-03-05 // 5 months ago
  • An ongoing thorn in my side has been the behaviour of Traverse and Sequence for certain pairs of monadic types (when nested). These issues document some of the problems:

    #552
    #651

    The Traverse and Sequence functions were previously auto-generated by a T4 template, because for 25 monads that's 25 * 25 * 2 = 1250 functions to write. In practice it's a bit less than that, because not all nested monads should have a Traverse and Sequence function, but it is in the many hundreds of functions.

    Because the same issue kept popping up I decided to bite the bullet and write them all by hand. This has a number of benefits:

    • 🏗 The odd rules of various monads when paired can have bespoke code that makes more sense than any auto-generated T4 template could ever build. This fixes the bugs that keep being reported and removes the surprising nature of Traverse and Sequence working most of the time, but not in all cases.
    • 🐎 I'm able to hand-optimise each function based on what's most efficient for the monad pairing. This is especially powerful for working with Traverse and Sequence on list/sequence types. The generic T4 code-gen had to create singleton sequences and the concat them, which was super inefficient and could cause stack overflows. Often now I can pre-allocate an array and use a much faster imperative implementation with sequential memory access. Where possible I've tried to avoid nesting lambdas, again in the quest for performance but also to reduce the amount of GC objects created. I expect a major performance boost from these changes.
    • The lazy stream types Seq and IEnumerable when paired with async types like Task, OptionAsync, etc. can now have bespoke behaviour to better handle the concurrency requirements (These types now have TraverseSerial and SequenceSerial which process tasks in a sequence one-at-a-time, and TraverseParallel and SequenceParallel which processes tasks in a sequence concurrently with a window of running tasks - that means it's possible to stop the Traverse or Sequence operation from thrashing the scheduler.

    Help

    🐎 Those are all lovely things, but the problem with writing several hundred functions manually is that there's gonna be bugs in there, especially as I've implemented them in the most imperative way I can to get the max performance out of them.

    🏗 I have just spent the past three days writing these functions, and frankly, it was pretty soul destroying experience - the idea of writing several thousand unit tests fills me with dread; and so if any of you lovely people would like to jump in and help build some unit tests then I would be eternally grateful.

    Sharing the load on this one would make sense. If you've never contributed to an open-source project before then this is a really good place to start!

    I have...

    • 🚀 Released the updates in 3.4.14-beta - so if you have unit tests that use Traverse and Sequence then any feedback on the stability of your tests would be really helpful.
    • Created a github project for managing the cards of each file that needs unit tests. It's the first time using this, so not sure of its capabilities yet, but it would be great to assign a card to someone so work doesn't end up being doubled up.
    • The code is in the hand-written-traverse branch.
    • The folder with all the functions is transformers/traverse

    Things to know

    • Traverse and Sequence take a nested monadic type of the form MonadOuter<MonadInner<A>> and flips it so the result is MonadInner<MonadOuter<A>>
    • If the outer-type is in a fail state then usually the inner value's fail state is returned. i.e. Try<Option<A>> would return Option<Try<A>>.None if the outer Try was in a Fail state.
    • If the inner-type is in a fail state then usually that short-cuts any operation. For example Seq<Option<A>> would return an Option<Seq<A>>.None if any of the Options in the Seq were None.
    • 👻 Where possible I've tried to rescue a fail value where the old system returned Bottom. For example: Either<Error, Try<A>>. The new system now knows that the language-ext Error type contains an Exception and can therefore be used when constructing Try<Either<Error, A>>
    • 🏁 All async pairings are eagerly consumed, even when using Seq or IEnumerable. Seq and IEnumerable do have windows for throttling the consumption though.
    • 🚚 Option combined with other types that have an error value (like Option<Try<A>>, Option<Either<L, R>>, etc.) will put None into the resulting type (Try<Option<A>>(None), Either<L, Option<A>>(None) if the outer type is None - this is because there is no error value to construct an Exception or L value - and so the only option is to either return Bottom or a success value with None in it, which I think is slightly more useful. This behaviour is different from the old system. This decision is up for debate, and I'm happy to have it - the choices are: remove the pairing altogether (so there is no Traverse or Sequence for those types) or return None as described above

    ✅ Obviously, it helps if you understand this code, what it does and how it should work. I'll make some initial tests over the next few days as guidance.

    Discussion will be here


Previous changes from v3.4.11

  • 🆓 Free monads allow the programmer to take a functor and turn it into a monad for free.

    🆓 The [Free] code-gen attribute provides this functionality in C#.

    Below, is a the classic example of a Maybe type (also known as Option, here we're using the Haskell naming parlance to avoid confusion with the language-ext type).

    [Free]public interface Maybe\<A\> { [Pure] A Just(A value); [Pure] A Nothing(); public static Maybe\<B\> Map\<B\>(Maybe\<A\> ma, Func\<A, B\> f) =\> ma switch { Just\<A\>(var x) =\> Maybe.Just(f(x)), \_=\> Maybe.Nothing\<B\>() }; }
    

    👀 Click here to see the generated code

    The Maybe<A> type can then be used as a monad:

    var ma = Maybe.Just(10);var mb = Maybe.Just(20);var mn = Maybe.Nothing\<int\>();var r1 = from a in mafrom b in mbselect a + b; // Just(30)var r2 = from a in mafrom b in mbfrom \_ in mnselect a + b; // Nothing
    

    And so, in 11 lines of code, we have created a Maybe monad that captures the short-cutting behaviour of Nothing.

    But, actually, it's possible to do this in fewer lines of code:

    [Free]public interface Maybe\<A\> { [Pure] A Just(A value); [Pure] A Nothing(); }
    

    🏗 If you don't need to capture bespoke rules in the Map function, the code-gen will build it for you.

    A monad, a functor, and a discriminated union in 6 lines of code. Nice.

    🆓 As with the discriminated-unions, [Free] types allow for deconstructing the values when pattern-maching:

    var txt = ma switch{ Just\<int\> (var x) =\> $"Value is {x}", \_=\> "No value"};
    

    🆓 The type 'behind' a free monad (in Haskell or Scala for example) usually has one of two cases:

    • Pure
    • 🆓 Free

    Pure is what we've used so far, and that's why Just and Nothing had the Pure attribute before them:

    [Pure] A Just(A value); [Pure] A Nothing();
    

    They can be considered terminal values. i.e. just raw data, nothing else. The code generated works in exactly the same way as the common types in language-ext, like Option, Either, etc. However, if the [Pure] attribute is left off the method-declaration then we gain an extra field in the generated case type: Next.

    Next is a Func<*, M<A>> - the * will be the return type of the method-declaration.

    For example:

    [Free]public interface FreeIO\<T\> { [Pure] T Pure(T value); [Pure] T Fail(Error error); string ReadAllText(string path); Unit WriteAllText(string path, string text); }
    

    👀 Click here to see the generated code

    👀 If we look at the generated code for the ReadAllText case (which doesn't have a [Pure] attribute), then we see that the return type of string has now been injected into this additional Next function which is provided as the last argument.

    public sealed class ReadAllText\<T\> : FreeIO\<T\>, System.IEquata... { public readonly string Path; public readonly System.Func\<string, FreeIO\<T\>\> Next; public ReadAllText(string Path, System.Func\<string, FreeIO\<T\>\> Next) { this.Path = Path; this.Next = Next; }
    

    💅 Why is all this important? Well, it allows for actions to be chained together into a continuations style structure. This is useful for building a sequence of actions, very handy for building DSLs.

    var dsl = new ReadAllText\<Unit\>("I:\\temp\\test.txt", txt =\> new WriteAllText\<Unit\>("I:\\temp\\test2.txt", txt, \_ =\> new Pure\<Unit\>(unit)));
    

    You should be able to see now why the [Pure] types are terminal values. They are used at the end of the chain of continuations to signify a result.

    But that's all quite ugly, so we can leverage the monadic aspect of the type:

    var dsl = from t in FreeIO.ReadAllText("I:\\temp\\test.txt") from \_ in FreeIO.WriteAllText("I:\\temp\\test2.txt", t) select unit;
    

    The continuation itself doesn't do anything, it's just a pure data-structure representing the actions of the DSL. And so, we need an interpreter to run it (which you write). This is a simple example:

    public static Either\<Error, A\> Interpret\<A\>(FreeIO\<A\> ma) =\> ma switch{ Pure\<A\> (var value) =\> value, Fail\<A\> (var error) =\> error, ReadAllText\<A\> (var path, var next) =\> Interpret(next(Read(path))), WriteAllText\<A\> (var path, var text, var next) =\> Interpret(next(Write(path, text))), };static string Read(string path) =\>File.ReadAllText(path);static Unit Write(string path, string text) { File.WriteAllText(path, text); return unit; }
    

    We can then run it by passing it the FreeIO<A> value:

    var result = Interpret(dsl);
    

    🔔 Notice how the result type of the interpreter is Either. We can use any result type we like, for example we could make the interpreter asynchronous:

    public static async Task\<A\> InterpretAsync\<A\>(FreeIO\<A\> ma) =\> ma switch{ Pure\<A\> (var value) =\> value, Fail\<A\> (var error) =\> await Task.FromException\<A\>(error), ReadAllText\<A\> (var path, var next) =\> await InterpretAsync(next(await File.ReadAllTextAsync(path))), WriteAllText\<A\> (var path, var text, var next) =\> await InterpretAsync(next(await File.WriteAllTextAsync(path, text).ToUnit())), };
    

    Which can be run in a similar way, but asynchronously:

    var res = await InterpretAsync(dsl);
    

    And so, the implementation of the interpreter is up to you. It can also take extra arguments so that state can be carried through the operations. In fact it's very easy to use the interpreter to bury all the messy stuff of your application (the IO, maybe some ugly state management, etc.) in one place. This then allows the code itself (that works with the free-monad) to be referentialy transparent.

    🤡 Another trick is to create a mock interpreter for unit-testing code that uses IO without having to ever do real IO. The logic gets tested, which is what is often the most important aspect of unit testing, but not real IO occurs. The arguments to the interpreter can be the mocked state.

    Some caveats though:

    • The recursive nature of the interpreter means large operations could blow the stack. This can be dealt with using a functional co-routines/trampolining trick, but that's beyond the scope of this doc.
    • 🐎 Although it's the perfect abstraction for IO, it does come with some additional performance costs. Generating the DSL before interpreting it is obviously not as efficient as directly calling the IO functions.

    🆓 Caveats aside, the free-monad allows for complete abstraction from side-effects, and makes all operations pure. This is incredibly powerful.

    📚 Full code-gen documentation