Wednesday, January 25, 2012

Pattern Matching and Guards are Fundamentally Different

Pattern Matching and Guards are fundamentally quite different! At least in Haskell, at any rate.

Function using pattern matching

check :: [a] -> String
check [] = "Empty"
check (x:xs) = "Contains Elements"
Function using guards

check_ :: [a] -> String
check_ lst
    | length lst < 1 = "Empty"
    | otherwise = "Contains elements"

Guards are both simpler and more flexible: They're essentially just special syntax that translates to a series of if/then expressions. You can put arbitrary boolean expressions in the guards, but they don't do anything you couldn't do with a regular if.

Pattern matches do several additional things: They're the only way to deconstruct data, and they bind identifiers within their scope. In the same sense that guards are equivalent to if expressions, pattern matching is equivalent to case expressions. Declarations (either at the top level, or in something like a let expression) are also a form of pattern match, with "normal" definitions being matches with the trivial pattern, a single identifier.

Pattern matches also tend to be the main way stuff actually happens in Haskell--attempting to deconstruct data in a pattern is one of the few things that forces evaluation.

By the way, you can actually do pattern matching in top-level declarations:

square = (^2)

(one:four:nine:_) = map square [1..]

This is occasionally useful for a group of related definitions.

GHC also provides the ViewPatterns extension which sort of combines both; you can use arbitrary functions in a binding context and then pattern match on the result. This is still just syntactic sugar for the usual stuff, of course.

As for the day-to-day issue of which to use where, here's some rough guides:

Definitely use pattern matching for anything that can be matched directly one or two constructors deep, where you don't really care about the compound data as a whole, but do care about most of the structure. The @ syntax lets you bind the overall structure to a variable while also pattern matching on it, but doing too much of that in one pattern can get ugly and unreadable quickly.

Definitely use guards when you need to make a choice based on some property that doesn't correspond neatly to a pattern, e.g. comparing two Int values to see which is larger.

If you need only a couple pieces of data from deep inside a large structure, particularly if you also need to use the structure as a whole, guards and accessor functions are usually more readable than some monstrous pattern full of @ and _.

If you need to do the same thing for values represented by different patterns, but with a convenient predicate to classify them, using a single generic pattern with a guard is usually more readable. Note that if a set of guards is non-exhaustive, anything that fails all the guards will drop down to the next pattern (if any). So you can combine a general pattern with some filter to catch exceptional cases, then do pattern matching on everything else to get details you care about.

Definitely don't use guards for things that could be trivially checked with a pattern. Checking for empty lists is the classic example, use a pattern match for that.

In general, when in doubt, just stick with pattern matching by default, it's usually nicer. If a pattern starts getting really ugly or convoluted, then stop to consider how else you could write it. Besides using guards, other options include extracting subexpressions as separate functions or putting case expressions inside the function body in order to push some of the pattern matching down onto them and out of the main definition.