Static and dynamic types in pattern matching
Pattern matching is a control flow structure in some functional programming languages. It allows developers to examine the structure of an object (or some other data), bind variables to parts of the structure, and execute different cases depending on the structure. I wrote about pattern matching earlier when I added it to Gypsum.
Gypsum now supports pattern matching using both static and dynamic type information.
Let's break this down a bit, since this is a complicated statement. In general, pattern matching involves checking whether a value has a given type at run-time using dynamic type information. For example, we may want to check whether an
Object is a
String. Not all types can be checked at run-time though. For example, we can't check whether something is a
List[String] using only dynamic type information because of type erasure. The Gypsum compiler discards type arguments like
[String] at compile-time.
By incorporating static type information about the value being matched, we can perform some checks that wouldn't normally be safe to perform at run-time.
The Gypsum standard library has an abstract class called
Result with two concrete subclasses,
Err. Functions that can fail may return some kind of
Result as an alternative to throwing exceptions.
public abstract class Result[static +V, static +E] ... public final class Ok[static +V](public value: V) <: Result[V, Nothing] ... public final class Err[static +E](public error: E) <: Result[Nothing, E] ...
Recall that type parameters are defined in square brackets. Type parameters with a
+ prefix are covariant. In the example above,
V is the type of a successful result;
E is the type of an error.
Suppose we want to define a function that helps us recover from an error: it takes a result and a default value. If the result is
Ok, it returns the value from the result. If the result is
Err, it returns the default value. We can implement this function using pattern matching.
def recover(result: Result[String, _], default-value: String): String = match (result) case Ok[String](value) => value case _ => default-value
This function needs to be able to determine that its argument is an instance of
Ok[String]. This is impossible using only run-time type information because of type erasure. We could only determine whether the object is an instance of the
Ok class. If the compiler allowed the check, we might match an
Ok[I32] instead of an
Using static type information
Instead of relying solely on run-time information to perform the type check, we can also use the static type information available to the compiler. The compiler knows that the value being matched is of type
Result[String, _]. This means that if the value is an instance of
Ok, it must be an instance of
Ok[String]. Therefore, it's sufficient to only check that the value is an instance of
Ok at run-time.
The compiler function that determines whether these run-time type checks are safe is
typeCanBeTested. Most of the heavy lifting is done by
extractTypeArgsForSubtype though. This function takes a type (in this example,
Result[String, _]), down-casts it to a given class (
Ok), then figures out what the type arguments would be in the resulting type (
typeCanBeTested compares the extracted type arguments against the type arguments of the type being matched (
Ok[String]). If all type arguments are equivalent or if type arguments are covered by wildcards or existential variables, then the match is deemed safe.
What happens if the supertype is too vague? For example, suppose we were trying to test whether an
Object is an
Ok it won't be able to tell that the type argument should be
String so it just returns
typeCanBeTested will judge the match to be unsafe, and the compiler will report an error.
Pattern matching is a really powerful construct, and I wish more mainstream languages had it. Scala does an amazing job with pattern matching. It combines static and dynamic information the same way that Gypsum does; Scala is an inspiration for Gypsum in many ways.
Languages that use variants, like OCaml and Haskell, also do this. They lack a "top" type (like
Object in Gypsum), so it's difficult to get into a situation where a pattern can't be matched.