Pattern Matching

Scala supports a general way of taking an expression apart and checking whether it matches a pattern, including the pattern's constituent parts. Data corresponding to the whole pattern or to some of its parts can be extracted and bound to new values for further processing.

Pattern matching is commonly used to distinguish the individual types of objects in a class hierarchy and to access their data. It can be understood as the reverse of an object's construction: we need to find out which type we are dealing with, extract the necessary data and process it accordingly. Typically this is the same data that was passed to the constructor when the instance was created.

Pattern matching is not just a switch. It extracts data and then branches the code according to the result of the extraction. It is therefore a generalization of the switch statement known from Java and C, and is sometimes called “switch on steroids”.

The idea of pattern matching does not come from the C family of languages. It has its roots in the theory of algebraic data types: composite types that are created by combining other types and that can be analyzed (decomposed) using pattern matching. Pattern matching is therefore a well-known construct in functional languages such as Haskell, OCaml and Standard ML.

Syntax of match expression

The matched expression is followed by the match keyword and curly braces containing a list of case branches. Each branch consists of the case keyword, a pattern, the arrow symbol (=>) and the code that is executed when the expression matches the pattern. Case patterns are tried sequentially, in the order in which they are written. Once a matching pattern is found, no other case branch is executed.

When no case branch matches, scala.MatchError is thrown. The case branches should therefore cover all possible cases; uninteresting or general cases can be collected under a final “case _”, because the pattern _ matches everything.

A match is an expression with a value: it returns the result of evaluating the branch that matched. Its type is the common supertype of the types returned by the individual branches.
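The syntax and behavior described in this section can be sketched as follows. This is only a minimal illustration; the object and method names (MatchSyntaxDemo, describe, partial) are made up for the example and do not come from the article.

```scala
object MatchSyntaxDemo {
  // The match is an expression: its value is the result of the branch that matched.
  // All branches return a String here, so the whole match has type String.
  def describe(x: Any): String = x match {
    case 0       => "zero"
    case "hello" => "a greeting"
    case _       => "something else" // catch-all branch, tried last
  }

  // No catch-all branch: any argument other than 1 or 2 throws scala.MatchError.
  def partial(n: Int): String = n match {
    case 1 => "one"
    case 2 => "two"
  }
}
```

Calling MatchSyntaxDemo.partial(3) fails at runtime with scala.MatchError, which is why a final “case _” is usually added.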

Extraction of data from objects

To be decomposed in a pattern, a type must provide an unapply method (typically on its companion object), which extracts data from an instance so that its parameters can be matched. The unapply method is the dual of apply, which is typically used as a factory method that constructs instances from input parameters. Both of these special methods are recognized by the Scala compiler.

For example, we can parse a file in CSV format using an unapply method defined on an object:

products.csv (id, name and price of products):

    23;Chair;60.50
    45;Table;430.40

    object Product {
      // Splits a line such as "23;Chair;60.50" into id, name and price.
      def unapply(str: String): Option[(Int, String, BigDecimal)] = {
        val parts = str split ";"
        if (parts.length == 3)
          Some((parts(0).toInt, parts(1), BigDecimal(parts(2))))
        else
          None
      }
    }

    import scala.io.Source

    val lines = Source.fromFile("products.csv").getLines()
    // Lines that do not match the Product pattern are skipped.
    for (Product(id, name, price) <- lines) {
      println(s"$id, $name, $price")
    }
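The duality between apply and unapply can also be shown on a small hand-written class. The sketch below uses a hypothetical Point class that is not part of the article; a case class would generate both methods automatically.

```scala
object ExtractorDemo {
  class Point(val x: Int, val y: Int)

  object Point {
    // Factory method: builds an instance from its parameters.
    def apply(x: Int, y: Int): Point = new Point(x, y)

    // Extractor: gives the same parameters back so they can be matched.
    def unapply(p: Point): Option[(Int, Int)] = Some((p.x, p.y))
  }

  // Point(x, y) in the pattern calls Point.unapply and binds x and y.
  def sum(p: Point): Int = p match {
    case Point(x, y) => x + y
  }
}
```

Here ExtractorDemo.Point(2, 3) calls apply, while the pattern Point(x, y) calls unapply on the matched instance.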

Types of patterns

We will briefly introduce the individual kinds of patterns that can be written after the case keyword.

Typed pattern matches a non-null instance of the given type and optionally binds it to a named val. Not only case classes, but also value types such as Int can be matched with a typed pattern.

case n: Number => n
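A slightly larger sketch of typed patterns, including the fact that null is not matched by any of them. The names TypedPatternDemo and describe are chosen only for this illustration.

```scala
object TypedPatternDemo {
  def describe(x: Any): String = x match {
    case i: Int    => "Int " + i                      // value types can be matched too
    case s: String => "String of length " + s.length  // s is bound as a String
    case _: Double => "some Double"                   // type test without binding a name
    case _         => "other (including null)"        // typed patterns never match null
  }
}
```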

Constructor pattern matches an instance and optionally binds it to a named val. The parameters (extractable inner data) of the instance can be matched as well. Pattern matching is applied recursively to the parameters (deep matching): a constructor pattern (for inner extractable classes), a variable pattern or a constant pattern can be used for each of them.

case op @ BinOperation("+", expr, Number(n)) => op

The @ symbol can be used in front of any pattern: the value matched by that pattern is bound to the val named before the @. The constructor pattern supports classes with a defined unapply method, e.g. case classes, which also include tuples.
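To make the constructor pattern above runnable, the following sketch assumes a small expression hierarchy. The article does not show the definitions of BinOperation and Number, so the ones below are a guess made for illustration.

```scala
object ConstructorPatternDemo {
  // Assumed definitions; the article only uses these names in patterns.
  sealed trait Expr
  case class Number(n: Int) extends Expr
  case class BinOperation(op: String, left: Expr, right: Expr) extends Expr

  // Deep matching: a constant pattern ("+"), a variable pattern (expr)
  // and a nested constructor pattern (Number(0)) in one case.
  def simplify(e: Expr): Expr = e match {
    case BinOperation("+", expr, Number(0)) => expr
    case other                              => other
  }

  // op @ ... binds the whole matched instance, n binds the nested value.
  def describe(e: Expr): String = e match {
    case op @ BinOperation("+", _, Number(n)) => "adds " + n + " to " + op.left
    case _                                    => "something else"
  }
}
```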

Sequence pattern allows matching of sequences with any number of elements. For example, List(0, 1, _, _*) matches a list that starts with the two given elements followed by any third element; _* then matches the remaining zero or more elements of the sequence. A sequence pattern cannot be used with Sets or Maps, only with sequences.
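A short sketch of sequence patterns follows. It uses Seq instead of List so that the same patterns work for lists and vectors; the names are again chosen only for the example.

```scala
object SequencePatternDemo {
  def describe(xs: Seq[Int]): String = xs match {
    case Seq()            => "empty"
    case Seq(0, 1, _, _*) => "starts with 0, 1 and has at least three elements"
    case Seq(first, _*)   => "starts with " + first // one or more elements
  }
}
```

Note that Vector(0, 1) does not match the second case, because the pattern requires a third element before _* takes over.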

Variable pattern matches anything; the matched instance is bound to a named val.

case x => x

Constant pattern matches a value against a named constant. It differs from the variable pattern only by a naming convention: the first letter of the constant name is uppercase. If the name started with a lowercase letter, it would be treated as a variable pattern. A qualified name such as math.Pi is always considered a constant. A val or an object can also be used in a constant pattern (e.g. Nil matches only the empty List).

case Pi => println("PI = " + Pi)
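The difference between constant and variable patterns can be sketched as follows. Limit is a hypothetical constant introduced for this example.

```scala
object ConstantPatternDemo {
  val Limit = 10.0 // uppercase first letter: usable as a constant pattern

  def check(x: Double): String = x match {
    case Limit   => "the limit"            // compared with the value of Limit
    case math.Pi => "pi"                   // a qualified name is always a constant
    case other   => "other value " + other // lowercase name: variable pattern, matches anything
  }

  def isEmptyList(xs: List[Int]): Boolean = xs match {
    case Nil => true  // Nil is an object and matches only the empty list
    case _   => false
  }
}
```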

Wildcards

The wildcard _ can be used in a variable pattern instead of the name of the bound value (if the matched value is not needed in the code that follows). It can also be used in a typed pattern in place of concrete type parameters (e.g. m: Map[_,_]). In a constructor pattern, _ serves well as a placeholder for a parameter that may match anything. As you can see, the underscore is an especially favored and versatile symbol in Scala.
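The three uses of the wildcard mentioned above might look like this in one match. Person is a hypothetical case class used only for this sketch.

```scala
object WildcardDemo {
  case class Person(name: String, age: Int)

  def describe(x: Any): String = x match {
    case Person(name, _) => "person " + name // constructor pattern: age is ignored
    case _: Map[_, _]    => "some map"       // typed pattern: type parameters are ignored
    case _               => "anything else"  // the value itself is ignored
  }
}
```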

Pattern guards

The pattern after the case keyword can be followed by if with a condition (written before the arrow symbol =>). The condition can refer to the named values bound by the pattern. If the condition is not fulfilled, the case branch does not match the expression and the following cases are tried.

    n match {
      case i: Int if i > 0 => "positive int"
      case s: String if s.length() > 0 => "non-empty string"
    }
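A runnable variant of the guard example is sketched below. The extra branches are added here so that the match never throws scala.MatchError; they are not part of the original snippet.

```scala
object GuardDemo {
  def classify(x: Any): String = x match {
    case i: Int if i > 0         => "positive int"
    case _: Int                  => "non-positive int" // reached when the guard above fails
    case s: String if s.nonEmpty => "non-empty string"
    case _                       => "something else"   // e.g. the empty string
  }
}
```

For the value -1 the first pattern matches by type, but its guard fails, so the next case is tried.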

Pattern matching everywhere

Pattern matching is not a marginal construct in Scala; it is an integral part of the language, used e.g. for handling exceptions:

    try {
      file.createNewFile()
    } catch {
      case e: IOException =>
        println("IO error while creating new file '" + file.getCanonicalPath() + "': " + e.getMessage())
      case e: Exception =>
        println("Error while creating new file '" + file.getCanonicalPath() + "': " + e.getMessage())
    }

It can also be used to decompose an instance directly into a number of vals:

    val myTuple = (1, 2)
    val (x, y) = myTuple // (x, y) is constructor pattern for tuple
    println(x) // 1

    val addOperation = Add(Number(1), Add(Number(2), Number(3)))
    val Add(Number(_), secondOperand @ Add(_, b)) = addOperation
    // Add(Number(_), secondOperand @ Add(_, b)) is constructor pattern
    // with nested constructor patterns for first and second parameters,
    // b is matched and bound by variable pattern
    println(b) // Number(3)

Pattern matching or polymorphism?

Polymorphism, known from object-oriented programming, is an alternative to pattern matching. It can also be used in an intuitive and type-safe manner to work with instances from a class hierarchy, access the necessary data and perform follow-up operations on it. Each class can implement the logic of the operation in its own way using an overridable virtual method.

If we implement an operation that processes different instances from a class hierarchy using pattern matching, we can move the implementation entirely outside the class hierarchy, or place it in the base trait that is mixed into all classes of the hierarchy. In either case, the complete implementation of the operation is in one place, in one source file. We can easily prepare various implementations (strategies) of the operation and choose between them, although this can also be achieved with polymorphic operations and strategies passed as implicit parameters.

Pattern matching is generally preferable when we want to add new operations over the class hierarchy flexibly, that is, when operations change more often than the set of classes, which stays relatively stable. With a polymorphic implementation, adding a new operation requires changes in several different classes of the hierarchy, whereas with pattern matching the logic can be conveniently changed in one place.

By contrast, when we need to add new classes flexibly and the operations do not change much, polymorphism is preferable: we can add a new class and implement the necessary operations for it in its own specific way (again, all changes are made in a single source file).
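The trade-off can be sketched on a tiny expression hierarchy implemented both ways. All names below (PolyExpr, PolyNumber, PolyAdd, Expr, Number, Add, eval, show) are assumptions made for the illustration, not definitions taken from the article.

```scala
object TradeOffDemo {
  // Polymorphic variant: every class carries its own implementation of eval.
  // A new class is one local change; a new operation touches every class.
  sealed trait PolyExpr { def eval: Int }
  case class PolyNumber(n: Int) extends PolyExpr { def eval: Int = n }
  case class PolyAdd(l: PolyExpr, r: PolyExpr) extends PolyExpr {
    def eval: Int = l.eval + r.eval
  }

  // Pattern matching variant: the classes hold only data ...
  sealed trait Expr
  case class Number(n: Int) extends Expr
  case class Add(l: Expr, r: Expr) extends Expr

  // ... and each operation lives in one place outside the hierarchy.
  def eval(e: Expr): Int = e match {
    case Number(n) => n
    case Add(l, r) => eval(l) + eval(r)
  }

  // A new operation is added without touching the classes;
  // a new class, however, would require a new case in every operation.
  def show(e: Expr): String = e match {
    case Number(n) => n.toString
    case Add(l, r) => "(" + show(l) + " + " + show(r) + ")"
  }
}
```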

Growth in the number of classes and growth in the number of operations pull in opposite directions, so it is desirable to choose the technique that suits the situation at hand.

Where to next?

For additional examples of pattern matching, see Scala Essentials: Pattern Matching and Case Classes, or read the chapter Case Classes and Pattern Matching in the book Programming in Scala.
