Reason #714 that I’m loving F#: Discriminated Unions

The more experience I gain with F#, the more I like it. So, when thinking about how I might convince someone to give it a try, I briefly considered which feature of the language might be most compelling, and quickly decided on Discriminated Unions.

I’ll try to explain the value of Discriminated Unions by walking you through an example, rather than trying to define them in a paragraph format. I will walk through the example by using a C#-centric perspective, because that is where most of my experience and the experience of my peers tends to be.


Exceptions

Consider the following C# code snippet, which is designed to throw various exceptions if it is unable to perform its task:

public decimal CalculateSalesTax(Invoice invoice)
{
    if (invoice == null)
        throw new ArgumentNullException("invoice");

    if (invoice.Customer.Address.State == null)
        throw new InvalidOperationException("State is required for sales tax calculation!");

    // Pretend sales tax calculation
    return 1.23m;
}

Typically, teams struggle to consistently design code that handles an exception in a way that both provides a sensible response to the user of the software and does not obscure details about the cause or origin of the exception from future maintainers. Designing code to sensibly handle exceptions is both an art and a science, and few teams manage to come close to getting it right.

Additionally, when using someone else’s code (such as a 3rd party library) it can be difficult or impossible to know or anticipate every type of exception that the code might throw. Therefore it can be difficult or impossible to design your code with certainty that some unknown exception type will not make it crash in the future. Extensive testing (both manual and automated) can help you discover errors that your code might encounter, but you can never be completely certain that you’ve handled all scenarios until you’ve handed your code off to customers and your customers have been using your software forever.

Null References

Null references can become the bane of programming in C#. Until software developers have mastered the right combinations of Null Object Patterns, Invariant Protections, and contextual constraints, any object reference can be a potentially application-crashing null reference.

Many software development teams fail to master these techniques, and instead resort to what has been cleverly termed the Fly Swallower Anti-Pattern. That is to say, nearly every object reference is checked for null, and unfortunately, the context leading to the object’s nullness in the first place is not fixed.

The code sample above can be modified for a nicely typical example:

public decimal CalculateSalesTax(Invoice invoice)
{
    if (invoice != null
        && invoice.Customer != null
        && invoice.Customer.Address != null
        && invoice.Customer.Address.State != null)
    {
        // Pretend sales tax calculation
        return 1.23m;
    }

    return 0m;
}

It might look ugly, you might know better than to ever do anything like this, but man, I see this sort of thing all the time. This tendency can be even more prevalent in void methods because the compiler does not enforce that the method returns any value or even performs any operation at all.

Both of these problems, Exceptions and Null References, can be virtually eliminated, or at least handled far, far more gracefully and at compile time, using F# and Discriminated Unions.

Consider this F# code snippet:

type Option<'a> =       
   | Some of 'a         
   | None   

… and replace the <'a> with <T> in your head if it helps, to translate the generic type parameter into C#-ese. We can use this discriminated union to write code as follows:

type CustomerAddress = {
    HouseNumber : int
    StreetName : string
    State : string
}

type Invoice = {
    Address : CustomerAddress Option
}

let calculateSalesTax invoice =
    match invoice.Address with
    | Some x -> 10
    | None -> failwith "Need a customer address to calculate sales tax!"

(Ignore for the moment that failwith is basically throwing an exception. We will first address the possible absence of the Address field of the Invoice record, and will then provide a more elegant solution to the exception throwing in a bit.)

So far, having declared our Address as a CustomerAddress Option prevents us (and future developers) from failing to consider that the Address portion of the Invoice record is optional. The compiler prevents us from doing this:

let calculateSalesTax invoice =
    match invoice.Address.State with
    | "WA" -> 10
    | _ -> 0

This code snippet fails at compile time, with the error: Type constraint mismatch. The type ‘Option<CustomerAddress>’ is not compatible with the type ‘CustomerAddress’.

So in other words, the compiler makes it absolutely impossible for you, or any other developer, to neglect to consider that the Invoice’s CustomerAddress might or might not exist.
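For contrast, here is a minimal sketch of a version the compiler does accept, because it unwraps the Option before touching State. The types are repeated from above so the snippet stands alone; the rates of 10 for "WA" and 0 elsewhere are placeholders carried over from the rejected snippet, not real tax rules.

```fsharp
type Option<'a> =
    | Some of 'a
    | None

type CustomerAddress = {
    HouseNumber : int
    StreetName : string
    State : string
}

type Invoice = {
    Address : CustomerAddress Option
}

// The Address has to be unwrapped before its State can be inspected.
let calculateSalesTax invoice =
    match invoice.Address with
    | Some address when address.State = "WA" -> 10 // placeholder rate
    | Some _ -> 0                                  // placeholder rate
    | None -> failwith "Need a customer address to calculate sales tax!"
```

Deleting the None case does not slip by unnoticed either: the compiler then warns that the match is incomplete.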

You can still encounter null references just as readily when interacting with any other .NET code, but at the very least you can limit your code’s awareness of null references at the F# boundary by converting every reference you receive into an Option type.

And oh yeah, although creating a discriminated union for optional references is no more difficult than the sample declaration above, this particular type is already built right into the F# language for you.
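As a rough sketch of that boundary conversion, the built-in option type comes with Option.ofObj in the F# core library, which turns a possibly-null reference into an option. The value stateFromDotNet and the function describe are hypothetical names made up for this illustration.

```fsharp
// Hypothetical stand-in for a value handed to us by C# code; it may be null.
let stateFromDotNet : string = null

// Option.ofObj maps null to None and any other reference to Some.
let state = Option.ofObj stateFromDotNet

// From here on, the compiler insists that both cases are handled.
let describe (s : string option) =
    match s with
    | Some code -> sprintf "State: %s" code
    | None -> "No state supplied"

printfn "%s" (describe state)               // prints "No state supplied"
printfn "%s" (describe (Option.ofObj "WA")) // prints "State: WA"
```

Doing this once at the edge means the rest of the F# code only ever sees Some or None, never a raw null.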

So now we want to tackle how to improve the failwith above to be something a bit less error prone. For this, I will use a success/failure discriminated union which I have ganked from the site, which itself is a really fantastic resource:

type Result<'TSuccess,'TFailure> = 
    | Success of 'TSuccess
    | Failure of 'TFailure  

The sales tax calculation can then be modified to use the success/failure discriminated union as follows:

let calculateSalesTax invoice =
    match invoice.Address with
    | Some x -> Success 10
    | None -> Failure "Need a customer address to calculate sales tax!"

Now, it is impossible to neglect to account for the fact that calculateSalesTax can fail in some situations. The following code snippet:

let processInvoice invoice =
    let salesTax = calculateSalesTax invoice
    let productsTotal = 100 // todo: leverage F#'s unit of measure feature
    let invoiceTotal = productsTotal + salesTax // problem occurs here

… results in the compilation error: The type ‘Result<int,string>’ does not match the type ‘int’

To get the code to compile, both the Success and Failure cases must be accounted for, as follows:

let processInvoice invoice =
    let salesTax = calculateSalesTax invoice
    let productsTotal = 100 // todo: leverage F#'s unit of measure feature
    match salesTax with
    | Success x -> Success (productsTotal + x)
    | Failure x -> Failure x

… which will require the next caller in the call stack to account for the potential failure of the operation, and so on.

There are much more elegant solutions to reap these exact same benefits without having a match … with | Success | Failure explosion in your code, but I’ll save that for a future blog post, or perhaps just refer once again to the excellent article here.
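To give a taste of what such a solution can look like, here is one common approach, sketched under my own assumptions: the helper names bind and map are introduced here for illustration, and the types are repeated so the snippet stands alone. The helpers do the matching once, so callers no longer have to spell out both cases at every step.

```fsharp
type Result<'TSuccess,'TFailure> =
    | Success of 'TSuccess
    | Failure of 'TFailure

// bind runs the next step only on Success; a Failure passes through untouched.
let bind f result =
    match result with
    | Success x -> f x
    | Failure e -> Failure e

// map is bind for a step that cannot itself fail.
let map f result = bind (f >> Success) result

type CustomerAddress = { HouseNumber : int; StreetName : string; State : string }
type Invoice = { Address : CustomerAddress option }

let calculateSalesTax invoice =
    match invoice.Address with
    | Some _ -> Success 10
    | None -> Failure "Need a customer address to calculate sales tax!"

// No explicit match here: the Failure case is carried along by map.
let processInvoice invoice =
    let productsTotal = 100
    calculateSalesTax invoice
    |> map (fun salesTax -> productsTotal + salesTax)
```

The caller of processInvoice still receives a Result and still has to deal with the Failure case eventually, so nothing about the compile-time guarantee is lost.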

Hopefully I’ve given at least a good enough overview to illustrate to an experienced C# developer how some of these F# techniques can be used to create less error prone code. Huge classes of errors that plague C# code can be eliminated at compile time in F#.

If You’re Waiting for Permission to Refactor Technical Debt, You’re Doing it Wrong

Or perhaps more accurately, if you’re waiting for your business to allocate time to refactor existing technical debt, you’re doing it wrong, say, about 97% of the time.

Having worked numerous shorter-term contract jobs in recent years, I find that the majority of teams are dissatisfied (and I believe rightfully so) with the current state of Technical Debt in their projects. And yet, these teams seem consistently unable to convince their business counterparts to allocate time for refactoring to improve their situation.

Oftentimes, these teams are saddled with mountains of poorly designed legacy code from which they cannot seem to escape. In spite of the fact that developers working on such projects are painfully aware of the extent to which their big ball of mud is killing their daily productivity, the teams never seem able to convince their business counterparts to allocate them time to correct, or even attempt to improve, the mess.

I believe this usually stems from the difficulty in quantifying software development productivity, which in turn makes a dollars-and-cents justification for software refactoring essentially impossible.

I also believe that we as software developers struggle with some unfortunate stereotypes which, at least on many teams, I have found to be completely inaccurate. While I have encountered the occasional neckbeard stereotype, I have also worked at times on entire teams of well-adjusted, intelligent and good-natured developers, perfectly capable of understanding the need to produce valuable software features consistently, in order to justify their own team’s continued existence.

I have personally found the inability to convince business to allocate development time toward refactoring particularly frustrating given that my whole intent is to be able to produce software that is of value in the most rapid, efficient manner possible, while also ensuring the correctness of that code (for which my primary techniques are 1) automated testing and 2) maximally leveraging static type checking). As Robert Martin would say, “The only way to go fast is to go well.”

The ‘Providing Business people with 100% of the Information They Need to Make What is Essentially a Technical Decision About Which They Know Nothing so that They Can Make the Decision Themselves’ Anti-Practice

Have you ever found yourself in this situation: Some business person at your company is involved with technical decision making, and at some point you realize that you’re having to provide 100% of the information that person needs to make a decision? In the worst cases, the correct decision might be completely obvious to any software developer, and yet the business person might head off in some nonsensical direction, and so you have to provide even more and more justification to convince the business person to do the right thing.

Why is this person involved in decision making at this level in the first place? This is a tell-tale sign that the business person is being involved in decision making at the wrong level.

If a plumber were to repair a water pipe in your house, and that plumber were to continually ask you about what types of material to use for welding pipe joints, or about how much weld to add to each pipe joint, what would your response be? You would probably respond that the plumber should just apply his/her expertise to ensure that the pipe gets repaired correctly, and to ensure that the water keeps flowing. Involving you, the client, in the nuances of pipe welding would simply be involving you at the wrong level of abstraction, so to speak.

Such is often the case with refactoring technical debt. As has been pointed out elsewhere, software developers can easily fall into the trap of thinking that refactoring should be some large-scale effort in which time is taken away from producing valuable feature work to “clean house,” but as Erik points out in his blog post, even the filthiest of home owners probably shouldn’t take weeks off from their day jobs to clean up their squalor. Rather, such cleaning should be done incrementally, a few hours per evening, and perhaps those small incremental efforts should be ongoing and never necessarily ending.

Granted, this suggestion implies a certain level of refactoring expertise on the part of each developer on the team. Comprehending large-scale changes that need to be made to an architecture, but applied in an incremental manner (a la Martin Fowler’s book on refactoring), is a skill that can take some time to develop. But there’s nothing to do but try. Expecting the business to allocate time for large “time out” refactorings is a strategy almost certain to fail.

Sometimes teams do convince business to allocate large time-out-style refactorings, but these efforts can go badly for the same reason that integrating large modules towards the end of Waterfall-style projects can go badly. I once worked with a team which, prior to the first release of their product, convinced the business to allow them to take a roughly year-long timeout to reengineer the project’s architecture, only to discover toward the end of that year that the reengineering was based on some substantially harmful decisions. Ouch! It’s almost always better to work incrementally.

Working incrementally can add quite a bit of cognitive overhead to day-to-day feature work. I know personally, for the first few years of my career, banging out new code or fixing bugs right where my cursor happened to be was about all that I could handle. But over time, common refactoring patterns can become second nature, making it quite feasible to mentally hold your day-to-day feature work, while simultaneously considering an undesirable old architecture you wish to be moving away from, as well as a new architecture to work towards.

Aggressively refactoring every day is not ‘going rogue’

Some might argue “secretly” refactoring, as a part of one’s day-to-day work, might run counter to the Agile philosophy of openness. I would counter that this refactoring would be “secret” in the same sense that you’re “secretly” choosing to make any particular method public or private, virtual or abstract, without involving a business person in that decision. This is simply not the level of in-depth technical decision making at which business people need to be operating.

In case anyone might still be put off by this notion of refactoring, even at a very granular scale, without involving business in the decision, let me offer a counter-perspective. I would willingly encourage any business for which I work to continue judging me, or my software team, by my/our continued feature-based work output. This provides the added benefit that I/we had better be damn sure we’re not just gold-plating the code, refactoring according to personal style preferences, or anything else that does not improve the long-term viability of the project.

At any point at which a refactoring effort might lend itself to a quantifiable business justification, consider involving the business. Suppose that your car mechanic tells you he can replace your water pump for $50 in parts plus $500 in labor, but while he has your engine disassembled he can replace your timing belt for another $50 in parts, with no added labor cost. Scenarios in software development tend to be less measurable so I struggle to envision (or to recall) a comparable situation, but I’m certain that it’s possible.

If I were to come a bit closer to tooting my own profession’s horn, I might argue that, while software developers certainly possess at least some ability to appreciate the importance of regularly producing valuable feature work, business stakeholders do not typically possess much, and often not any, ability to understand the impact that crippling Technical Debt can have on a team’s productivity. From a technical perspective, your typical software developer might much better understand what is necessary to ensure the long-term technical viability of the project.


Might it be possible that by involving the business in technical debt decisions, we are involving them at entirely the wrong level of decision-making? The wrong level of abstraction? Might development teams be better off positioning themselves as feature brokers, who handle issues of technical debt internally, leaving the business to judge the team based only on their output of customer-facing features? I believe this is indeed most often the case. If you’re waiting for your business to allocate time for you to improve upon code that “already works,” odds are you’ll be waiting forever. What is your move going to be?