How to return all relevant error messages in a composable way.

I've previously suggested that I consider validation a solved problem. I still do, until someone disproves me with a counterexample. Here's a fairly straightforward applicative validation example in C#.

After corresponding and speaking with readers of Code That Fits in Your Head, I've learned that some readers have objections to the following lines of code:

Reservation? reservation = dto.Validate(id);
if (reservation is null)
    return new BadRequestResult();

This code snippet demonstrates how to parse, not validate, an incoming Data Transfer Object (DTO). This code base uses C#'s nullable reference types feature to distinguish between null and non-null objects. Other languages (and earlier versions of C#) can instead use the Maybe monad. Nothing in this article or the book hinges on the nullable reference types feature.

If the Validate method (which I really should have called TryParse instead) returns a null value, the Controller from which this code snippet is taken returns a 400 Bad Request response.

The Validate method is an instance method on the DTO class:

internal Reservation? Validate(Guid id)
{
    if (!DateTime.TryParse(At, out var d))
        return null;
    if (Email is null)
        return null;
    if (Quantity < 1)
        return null;
 
    return new Reservation(
        id,
        d,
        new Email(Email),
        new Name(Name ?? ""),
        Quantity);
}

What irks some readers is the loss of information. While Validate 'knows' why it's rejecting a candidate, that information is lost and no error message is communicated to unfortunate HTTP clients.

One email from a reader went on about this for quite some time and I got the impression that the sender considered this such a grave flaw that it invalidates the entire book.

That's not the case.

Rabbit hole, evaded #

When I wrote code like the above, I was fully aware of the trade-offs and priorities. I understood that this particular design would mean that clients get no information about why a particular reservation JSON document is rejected - only that it is.

This was a simplification that I explicitly decided to make for educational reasons.

The above design is based on something as simple as a null check. I expect all my readers to be able to follow that code. As hinted above, you could also model a method like Validate with the Maybe monad, but while Maybe preserves success cases, it throws away all information about errors. In a production system, this is rarely acceptable, but I found it acceptable for the example code in the book, since this isn't the main topic.

Instead of basing the design on nullable reference types or the Maybe monad, you can base parsing on applicative validation. In order to explain that, I'd first need to explain functors, applicative functors, and applicative validation. It might also prove helpful to the reader to explain Church encodings, bifunctors, and semigroups. That's quite a rabbit hole to fall into, and I felt that it would be such a big digression from the themes of the book that I decided not to go there.

On this blog, however, I have all the space and time I'd like. I can digress as much as I'd like. Most of that digression has already happened. Those articles are already on the blog. I'm going to assume that you've read all of the articles I just linked, or that you understand these concepts.

In this article, I'm going to rewrite the DTO parser to also return error messages. It's an entirely local change that breaks no existing tests.

Validated #

Most functional programmers are already aware of the Either monad. They often reach for it when they need to expand the Maybe monad with an error track.

The problem with the Either monad is, however, that it short-circuits error handling. It's like throwing exceptions. As soon as an Either composition hits the first error, it stops processing the rest of the data. As a caller, you only get one error message, even if there's more than one thing wrong with your input value.

In a distributed system where a client posts a document to a service, you'd like to respond with a collection of errors.

You can do this with a data type that's isomorphic with Either, but behaves differently as an applicative functor. Instead of short-circuiting on the first error, it collects them. This, however, turns out to be incompatible with the Either monad's short-circuiting behaviour, so this data structure is usually not given monadic features.
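To make the difference in behaviour concrete before any generic types enter the picture, here is a minimal sketch. It is deliberately imperative rather than applicative, and the `Candidate` record and `ErrorStrategies` class are hypothetical names invented for illustration; they are not part of the code base discussed in this article.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical stand-in for an incoming DTO; not the article's ReservationDto.
public sealed record Candidate(string? At, string? Email, int Quantity);

public static class ErrorStrategies
{
    // Either-style: stop at the first problem, so the caller sees at most one message.
    public static string? FirstError(Candidate c)
    {
        if (!DateTime.TryParse(c.At, out _))
            return $"Invalid date or time: {c.At}.";
        if (c.Email is null)
            return "Email address is missing.";
        if (c.Quantity < 1)
            return "Quantity must be a positive integer.";
        return null;
    }

    // Validation-style: run every check and collect every message.
    public static IReadOnlyList<string> AllErrors(Candidate c)
    {
        var errors = new List<string>();
        if (!DateTime.TryParse(c.At, out _))
            errors.Add($"Invalid date or time: {c.At}.");
        if (c.Email is null)
            errors.Add("Email address is missing.");
        if (c.Quantity < 1)
            errors.Add("Quantity must be a positive integer.");
        return errors;
    }
}
```

For `new Candidate("large", null, -1)`, `FirstError` reports only the date problem, while `AllErrors` returns three messages. The applicative approach aims for the second behaviour, but by composing small parsers instead of mutating a list.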

This data type is usually called Validation, but when I translated that to C# various static code analysis rules lit up, claiming that there was already a referenced namespace called Validation. Instead, I decided to call the type Validated<F, S>, which I like better anyway.

The type arguments are F for failure and S for success. I've put F before S because by convention that's how Either works.

I'm using an encapsulated variation of a Church encoding and a series of Apply overloads as described in the article An applicative password list. There's quite a bit of boilerplate, so I'll just dump the entire contents of the file here instead of tiring you with a detailed walk-through:

public sealed class Validated<F, S>
{
    private interface IValidation
    {
        T Match<T>(Func<F, T> onFailure, Func<S, T> onSuccess);
    }
 
    private readonly IValidation imp;
 
    private Validated(IValidation imp)
    {
        this.imp = imp;
    }
 
    internal static Validated<F, S> Succeed(S success)
    {
        return new Validated<F, S>(new Success(success));
    }
 
    internal static Validated<F, S> Fail(F failure)
    {
        return new Validated<F, S>(new Failure(failure));
    }
 
    public T Match<T>(Func<F, T> onFailure, Func<S, T> onSuccess)
    {
        return imp.Match(onFailure, onSuccess);
    }
 
    public Validated<F1, S1> SelectBoth<F1, S1>(
        Func<F, F1> selectFailure,
        Func<S, S1> selectSuccess)
    {
        return Match(
            f => Validated.Fail<F1, S1>(selectFailure(f)),
            s => Validated.Succeed<F1, S1>(selectSuccess(s)));
    }
 
    public Validated<F1, S> SelectFailure<F1>(
        Func<F, F1> selectFailure)
    {
        return SelectBoth(selectFailure, s => s);
    }
 
    public Validated<F, S1> SelectSuccess<S1>(
        Func<S, S1> selectSuccess)
    {
        return SelectBoth(f => f, selectSuccess);
    }
 
    public Validated<F, S1> Select<S1>(
        Func<S, S1> selector)
    {
        return SelectSuccess(selector);
    }
 
    private sealed class Success : IValidation
    {
        private readonly S success;
 
        public Success(S success)
        {
            this.success = success;
        }
 
        public T Match<T>(
            Func<F, T> onFailure,
            Func<S, T> onSuccess)
        {
            return onSuccess(success);
        }
    }
 
    private sealed class Failure : IValidation
    {
        private readonly F failure;
 
        public Failure(F failure)
        {
            this.failure = failure;
        }
 
        public T Match<T>(
            Func<F, T> onFailure,
            Func<S, T> onSuccess)
        {
            return onFailure(failure);
        }
    }
}
 
public static class Validated
{
    public static Validated<F, S> Succeed<F, S>(
        S success)
    {
        return Validated<F, S>.Succeed(success);
    }
 
    public static Validated<F, S> Fail<F, S>(
        F failure)
    {
        return Validated<F, S>.Fail(failure);
    }
 
    public static Validated<F, S> Apply<F, T, S>(
        this Validated<F, Func<T, S>> selector,
        Validated<F, T> source,
        Func<F, F, F> combine)
    {
        if (selector is null)
            throw new ArgumentNullException(nameof(selector));
 
        return selector.Match(
            f1 => source.Match(
                f2 => Fail<F, S>(combine(f1, f2)),
                _  => Fail<F, S>(f1)),
            map => source.Match(
                f2 => Fail<F, S>(f2),
                x  => Succeed<F, S>(map(x))));
    }
 
    public static Validated<F, Func<T2, S>> Apply<F, T1, T2, S>(
        this Validated<F, Func<T1, T2, S>> selector,
        Validated<F, T1> source,
        Func<F, F, F> combine)
    {
        if (selector is null)
            throw new ArgumentNullException(nameof(selector));
 
        return selector.Match(
            f1 => source.Match(
                f2 => Fail<F, Func<T2, S>>(combine(f1, f2)),
                _  => Fail<F, Func<T2, S>>(f1)),
            map => source.Match(
                f2 => Fail<F, Func<T2, S>>(f2),
                x  => Succeed<F, Func<T2, S>>(y => map(x, y))));
    }
 
    public static Validated<F, Func<T2, T3, S>> Apply<F, T1, T2, T3, S>(
        this Validated<F, Func<T1, T2, T3, S>> selector,
        Validated<F, T1> source,
        Func<F, F, F> combine)
    {
        if (selector is null)
            throw new ArgumentNullException(nameof(selector));
 
        return selector.Match(
            f1 => source.Match(
                f2 => Fail<F, Func<T2, T3, S>>(combine(f1, f2)),
                _  => Fail<F, Func<T2, T3, S>>(f1)),
            map => source.Match(
                f2 => Fail<F, Func<T2, T3, S>>(f2),
                x  => Succeed<F, Func<T2, T3, S>>((y, z) => map(x, y, z))));
    }
 
    public static Validated<F, Func<T2, T3, S>> Apply<F, T1, T2, T3, S>(
        this Func<T1, T2, T3, S> map,
        Validated<F, T1> source,
        Func<F, F, F> combine)
    {
        return Apply(
            Succeed<F, Func<T1, T2, T3, S>>((x, y, z) => map(x, y, z)),
            source,
            combine);
    }
}

I only added the Apply overloads that I needed for the following demo code. As stated above, I'm not going to launch into a detailed walk-through, since the code follows the concepts outlined in the various articles I've already mentioned. If there's something that you'd like me to explain, then please leave a comment.

Notice that Validated<F, S> has no SelectMany method. It's deliberately not a monad, because monadic bind (SelectMany) would conflict with the applicative functor implementation.
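The conflict can be sketched with a compact stand-in type. Everything below (`V<T>`, `Ok<T>`, `Err<T>`, `BindVersusApply`) is hypothetical and much simpler than Validated<F, S>: failures are fixed to strings, and the types are plain records rather than a Church encoding. The sketch assumes the usual expectation that, for a monad, Apply agrees with the version derived from bind.

```csharp
using System;

// Compact, hypothetical stand-in for Validated<string, T>.
public abstract record V<T>;
public sealed record Ok<T>(T Value) : V<T>;
public sealed record Err<T>(string Message) : V<T>;

public static class BindVersusApply
{
    // A bind has no value to pass to 'f' when 'source' is a failure,
    // so 'f' never runs and any error it would report is never seen.
    public static V<R> Bind<T, R>(V<T> source, Func<T, V<R>> f) =>
        source is Ok<T> ok ? f(ok.Value) : new Err<R>(((Err<T>)source).Message);

    // Apply expressed through Bind: inherits the short-circuiting.
    public static V<R> ApplyViaBind<T, R>(V<Func<T, R>> selector, V<T> source) =>
        Bind<Func<T, R>, R>(selector, f => Bind<T, R>(source, x => new Ok<R>(f(x))));

    // Apply that inspects both sides and joins the messages when both fail.
    public static V<R> ApplyAccumulating<T, R>(V<Func<T, R>> selector, V<T> source)
    {
        if (selector is Err<Func<T, R>> e1 && source is Err<T> e2)
            return new Err<R>(e1.Message + Environment.NewLine + e2.Message);
        if (selector is Err<Func<T, R>> onlyFirst)
            return new Err<R>(onlyFirst.Message);
        if (source is Err<T> onlySecond)
            return new Err<R>(onlySecond.Message);

        var f = ((Ok<Func<T, R>>)selector).Value;
        var x = ((Ok<T>)source).Value;
        return new Ok<R>(f(x));
    }
}
```

When both arguments are failures, `ApplyViaBind` keeps only the first message, whereas `ApplyAccumulating` keeps both. The two definitions disagree, so a type that wants the accumulating behaviour is better off not exposing a bind at all.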

Individual parsers #

An essential quality of applicative validation is that it's composable. This means that you can compose a larger, more complex parser from smaller ones. Parsing a ReservationDto object, for example, involves parsing the date and time of the reservation, the email address, and the quantity. Here's how to parse the date and time:

private Validated<string, DateTime> TryParseAt()
{
    if (!DateTime.TryParse(At, out var d))
        return Validated.Fail<string, DateTime>($"Invalid date or time: {At}.");
 
    return Validated.Succeed<string, DateTime>(d);
}

In order to keep things simple I'm going to use strings for error messages. You could instead decide to encode error conditions as a sum type or other polymorphic type. This would be appropriate if you also need to be able to make programmatic decisions based on individual error conditions, or if you need to translate the error messages to more than one language.
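As a rough sketch of the sum-type alternative, the error cases could be modelled as a small record hierarchy. The names `ReservationError`, `InvalidDate`, `MissingEmail`, `NonPositiveQuantity`, and `ReservationErrors` are assumptions made up for this illustration; the rest of the article sticks to strings.

```csharp
#nullable enable
using System;

// Hypothetical error cases; the article itself uses plain strings.
public abstract record ReservationError;
public sealed record InvalidDate(string? Input) : ReservationError;
public sealed record MissingEmail : ReservationError;
public sealed record NonPositiveQuantity(int Quantity) : ReservationError;

public static class ReservationErrors
{
    // Rendering is separate from detection, so another language
    // could be supported by adding another function like this one.
    public static string ToEnglish(ReservationError error) =>
        error switch
        {
            InvalidDate e => $"Invalid date or time: {e.Input}.",
            MissingEmail _ => "Email address is missing.",
            NonPositiveQuantity e =>
                $"Quantity must be a positive integer, but was: {e.Quantity}.",
            _ => throw new ArgumentOutOfRangeException(nameof(error))
        };

    // A programmatic decision based on the error case rather than on message text.
    public static bool IsMissingField(ReservationError error) =>
        error is MissingEmail;
}
```

With a design like this, the failure type would no longer be a single string, so the combining operation passed to Apply would need to work on something like a collection of `ReservationError` values instead.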

The TryParseAt function only attempts to parse the At property to a DateTime value. If parsing fails, it returns a Failure value with a helpful error message; otherwise, it wraps the parsed date and time in a Success value.

Parsing the email address is similar:

private Validated<string, Email> TryParseEmail()
{
    if (Email is null)
        return Validated.Fail<string, Email>($"Email address is missing.");
 
    return Validated.Succeed<string, Email>(new Email(Email));
}

As is parsing the quantity:

private Validated<string, int> TryParseQuantity()
{
    if (Quantity < 1)
        return Validated.Fail<string, int>(
            $"Quantity must be a positive integer, but was: {Quantity}.");
 
    return Validated.Succeed<string, int>(Quantity);
}

There's no reason to create a parser for the reservation name, because if the name is missing, the code uses the empty string instead. That operation can't fail.

Composition #

You can now use applicative composition to reuse those individual parsers in a more complex parser:

internal Validated<string, Reservation> TryParse(Guid id)
{
    Func<DateTime, Email, int, Reservation> createReservation =
        (at, email, quantity) =>
        new Reservation(id, at, email, new Name(Name ?? ""), quantity);
    Func<string, string, string> combine =
        (x, y) => string.Join(Environment.NewLine, x, y);
 
    return createReservation
        .Apply(TryParseAt(), combine)
        .Apply(TryParseEmail(), combine)
        .Apply(TryParseQuantity(), combine);
}

createReservation is a local delegate, defined by a lambda expression, that closes over id and Name. Specifically, it uses the null coalescing operator (??) to turn a null name into the empty string. On the other hand, it takes at, email, and quantity as inputs, since these are the values that must first be parsed.

A type like Validated<F, S> is only an applicative functor when the failure dimension (F) gives rise to a semigroup. The way I've modelled it here is as a binary operation that you need to pass as a parameter to each Apply overload. This seems awkward, but is good enough for a proof of concept.
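One way to avoid passing the binary operation around, sketched here under the assumption that failures are always a list of strings, is to fix the semigroup to list concatenation. The `Outcome<S>`, `Valid<S>`, and `Invalid<S>` types below are hypothetical and are not a drop-in replacement for Validated<F, S>; they only show the shape of an Apply that needs no combine parameter.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical simplification: the failure type is fixed to a list of
// strings, and list concatenation serves as the semigroup operation.
public abstract record Outcome<S>;
public sealed record Valid<S>(S Value) : Outcome<S>;
public sealed record Invalid<S>(IReadOnlyList<string> Errors) : Outcome<S>;

public static class Outcome
{
    public static Outcome<S> Apply<T, S>(
        this Outcome<Func<T, S>> selector,
        Outcome<T> source)
    {
        if (selector is Valid<Func<T, S>> f && source is Valid<T> x)
            return new Valid<S>(f.Value(x.Value));

        // At least one side failed: concatenate whatever errors exist,
        // keeping the selector's errors before the source's.
        return new Invalid<S>(Errors(selector).Concat(Errors(source)).ToList());
    }

    private static IEnumerable<string> Errors<T>(Outcome<T> outcome) =>
        outcome is Invalid<T> invalid
            ? invalid.Errors
            : Enumerable.Empty<string>();
}
```

This sketch uses a curried `Func<int, Func<int, int>>` instead of the multi-parameter overloads shown earlier, so a single Apply method is enough. The trade-off is that callers lose the freedom to choose how failures combine.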

The combine function joins two strings together, separated by a line break.

The TryParse function composes createReservation with TryParseAt, TryParseEmail, and TryParseQuantity using the various Apply overloads. The combination is a Validated value that's either a failure string or a properly encapsulated Reservation object.

One thing that I still don't like about this function is that it takes an id parameter. For an article about why that is a problem, and what to do about it, see Coalescing DTOs.

Using the parser #

Client code can now invoke the TryParse function on the DTO. Here is the code inside the Post method on the ReservationsController class:

[HttpPost("restaurants/{restaurantId}/reservations")]
public Task<ActionResult> Post(int restaurantId, ReservationDto dto)
{
    if (dto is null)
        throw new ArgumentNullException(nameof(dto));
 
    var id = dto.ParseId() ?? Guid.NewGuid();
    var parseResult = dto.TryParse(id);
 
    return parseResult.Match(
        msgs => Task.FromResult<ActionResult>(new BadRequestObjectResult(msgs)),
        reservation => TryCreate(restaurantId, reservation));
}

When the parseResult matches a failure, it returns a new BadRequestObjectResult with all collected error messages. When, on the other hand, it matches a success, it invokes the TryCreate helper method with the parsed reservation.

HTTP request and response #

A client will now receive all relevant error messages if it posts a malformed reservation:

POST /restaurants/1/reservations?sig=1WiLlS5705bfsffPzaFYLwntrS4FCjE5CLdaeYTHxxg%3D HTTP/1.1
Content-Type: application/json
{ "at": "large", "name": "Kerry Onn", "quantity": -1 }

HTTP/1.1 400 Bad Request
Invalid date or time: large.
Email address is missing.
Quantity must be a positive integer, but was: -1.

Of course, if only a single element is wrong, only that error message will appear.

Conclusion #

The changes described in this article were entirely local to the two involved types: ReservationsController and ReservationDto. Once I'd expanded ReservationDto with the TryParse function and its helper functions, and changed ReservationsController accordingly, the rest of the code base compiled and all tests passed. The point is that this isn't a big change, and that's why I believe that the original design (returning null or non-null) doesn't invalidate anything else I had to say in the book.

The change did, however, take quite a bit of boilerplate code, as witnessed by the Validated code dump. That API is, on the other hand, completely reusable, and you can find packages on the internet that already implement this functionality. It's not much of a burden in terms of extra code, but it would have taken a couple of extra chapters to explain in the book. The book could easily have doubled in size if I had to include material about functors, applicative functors, semigroups, Church encoding, etcetera.

To fix two lines of code, I didn't think that was warranted. After all, it's not a major blocker. On the contrary, validation is a solved problem.


Comments

Dan Carter #
you can find packages on the internet that already implement this functionality

Do you have any recommendations for a library that implements the Validated<F, S> type?

2022-08-15 11:15 UTC

Dan, thank you for writing. The following is not a recommendation, but the most comprehensive C# library for functional programming currently seems to be LanguageExt, which includes a Validation functor.

I'm neither recommending nor arguing against LanguageExt.

  • I've never used it in a real-world code base.
  • I've been answering questions about it on Stack Overflow. In general, it seems to stump C# developers, since it's very Haskellish and quite advanced.
  • Today is just a point in time. Libraries come and go.

Since all the ideas presented in these articles are universal abstractions, you can safely and easily implement them yourself, instead of taking a dependency on a third-party library. If you stick with lawful implementations, the only variation possible is with naming. Do you call a functor like this one Validation, Validated, or something else? Do you call monadic bind SelectMany or Bind? Will you have a Flatten or a Join function?

When working with teams that are new to these things, I usually start by adding these concepts as source code as they become useful. If a type like Maybe or Validated starts to proliferate, sooner or later you'll need to move it to a shared library so that multiple in-house libraries can use the type to communicate results across library boundaries. Eventually, you may decide to move such a dependency to a NuGet package. You can, at such time, decide to use an existing library instead of your own.

The maintenance burden for these kinds of libraries is low, since the APIs and behaviour are defined and locked in advance by mathematics.

2022-08-16 5:54 UTC

If you stick with lawful implementations, the only variation possible is with naming.

There are also language-specific choices that can vary.

One example involves applicative functors in C#. The "standard" API for applicative functors works well in Haskell and F# because it is designed to be used with curried functions, and both of those languages curry their functions by default. In contrast, applicative functors push the limits of what you can express in C#. I am impressed with the design that Language Ext uses for applicative functors, which is an extension method on a (value) tuple of applicative functor instances that accepts a lambda expression that is given all the "unwrapped" values "inside" the applicative functors.

Another example involves monads in TypeScript. To avoid the Pyramid of doom when performing a sequence of monadic operations, Haskell has do notation and F# has computation expressions. There is no equivalent language feature in TypeScript, but it has row polymorphism, which fp-ts uses to effectively implement do notation.

A related dimension is how to approximate higher-kinded types in a language that lacks them. Language Ext passes in the monad as a type parameter as well as the "lower-kinded" type parameter and then constrains the monad type parameter to implement a monad interface parameterized by the lower type parameter as well as being a struct. I find that second constraint very interesting. Since the type parameter has a struct constraint, it has a default constructor that can be used to get an instance, which then implements methods according to the interface constraint. For more information, see this wiki article for a gentle introduction and Trans.cs for how Language Ext uses this approach to only implement traverse once. Similarly, F#+ has a feature called generic functions that enables one to write F# like map aFoo instead of the typical Foo.map aFoo.

2022-09-20 02:00 UTC

Tyson, thank you for writing. I agree that details differ. Clearly, this is true across languages, where, say, Haskell's fmap has a name different from C#'s Select. To state the obvious, the syntax is also different.

Even within the same language, you can have variations. Functor mapping in Haskell is generally called fmap, but you can also use map explicitly for lists. The same could be true in C#. I've seen functor and monad implementations in C# that use method names like Map and Bind rather than Select and SelectMany.

To expand on this idea, one may also observe that what one language calls Option, another language calls Maybe. The same goes for Result versus Either.

As you know, the names Select and SelectMany are special because they enable C# query syntax. While methods named Map and Bind are 'the same' functions, they don't light up that language feature. Another way to enable syntactic sugar for monads in C# is via async and await, as shown by Eirik Tsarpalis and Nick Palladinos.

I do agree with you that there are various options available to an implementer. The point I was trying to make is that while implementation details differ, the concepts are the same. Thus, as a user of one of these APIs (monads, monoids, etc.) you only have to learn the mental model once. You still have to learn the implementation details.

I recently heard a professor at DIKU state that once you know one programming language, you should be able to learn another one in a week. That's the same general idea.

(I do, however, have issues with that statement about programming languages as a universal assertion, but I agree that it tends to hold for mainstream languages. When I read Mazes for Programmers I'd never programmed in Ruby before, but I had little trouble picking it up for the exercises. On the other hand, most people don't learn Haskell in a week.)

2022-09-20 17:42 UTC



Published

Monday, 25 July 2022 06:56:00 UTC
