Saturday, September 12, 2009

Type Lists and Heterogeneously Typed Arrays

In previous blog posts (HList in Scala and HList in Scala Revisited (or Scala Metaprogramming Works!)) I described a way to encode heterogeneously typed lists, HList, in Scala. This data type can be used as a generic replacement for the TupleX types found in the Scala library. In this post I will generalize the concept of type lists and introduce a new data type, HArray.

Note: the MetaScala code linked to in this post must be compiled using a recent Scala 2.8.0 build. I've given up on making it work with 2.7.x.

Type Lists

In HList the types of the elements and the element values are intertwined in the same type. It doesn't have to be this way though and the types of the elements can be tracked separate from the actual element values. To do this we create an purely abstract type (it has no instances) which models a list of types, let's call it TList. Here's a simple implementation:

// Abstract base type
sealed trait TList

// The empty type list
final class TNil extends TList

// A pair of a type and a type list
final class TCons[H, T <: TList] extends TList {
type Head = H
type Tail = T
}

// For infix notation
type ::[H, T <: TList] = TCons[H, T]

which would allow us to build type lists using infix notation, for example:

type SomeTypes = Int :: Boolean :: String :: TNil

This is only the basic definition of TList, in practice you want to add lots of functionality to this type, like append, length, indexing, reverse etc. The full TList code is available in MetaScala. TList can be used for implementing HList, but also heterogeneously typed arrays which I describe below.

Heterogeneously Typed Arrays

With a heterogeneously typed array I mean an array where each element can have a unique type which is tracked at compile time (the length of the array is also tracked at compile time). This ensures that the element values are handled in a type safe way. The values are stored in an Array[Any] and are cast to the correct type when extracted from the array (remember that Java generics are implemented using casts as well so this is no different from using the TupleX classes, or a Java ArrayList). The element types are tracked using a TList.

Here's part of the implementation which shows the prepend, append and nth methods:

final class HArray[L <: TList](private val elems : Array[Any]) extends HSeq[L] {
...

def ::[T](v : T) = {
val a = new Array[Any](elems.length + 1)
a(0) = v
Array.copy(elems, 0, a, 1, elems.length)
new HArray[T :: L](a)
}

def :::[L2 <: TList](l : HArray[L2]) = {
val a = new Array[Any](elems.length + l.elems.length)
Array.copy(l.elems, 0, a, 0, l.elems.length)
Array.copy(elems, 0, a, l.elems.length, elems.length)
new HArray[L2#Append[L]](a)
}

def apply[N <: Nat](implicit nth : INth[N]) : L#Nth[N] = elems(nth.index).asInstanceOf[L#Nth[N]]

...
}

Notice that the method implementations use simple array operations (array copy and indexing) so the performance should be comparable to using the TupleX classes in the Scala library (no benchmarks performed yet though). The full HArray implementation can be found here.

The HArray interface is similar to HList and virtually the same operations are supported. There are some usage examples in MetaScala:

// Create a HArray of an Int, Boolean and a pair
val a1 = 10 :: true :: (10.1, "Hello") :: HArrayNil

// Extract the second element, note that the element type
// information is preserved and we can safely perform a
// boolean and operation
val b = a1(_1) && false

// Create another HArray using alternative syntax (faster)
val a2 = HArray(1.1, "string", false)

// Replace the second element in the list, it used to
// be a String, but now it's an Int
val a3 = a2.removeNth(_1).insert(_1, 14)

// Type information preserved, we can use an Int operation
// on the element
val i = a3(_1) / 2

// Append l2 to l1
val a4 = a1 ::: a2

// Statically check that the length of l4 is 6
type T = Equal[_6, a4.Size]


Other TList Applications

There are other applications for the TList type, for example it can be used for modeling type unions like this:

case class OneOf[TS <: TList](value : Any)

which would be a lot easier to use than Either when you have more than two choices. A problem with this solution is that it wouldn't play well with Scala's pattern matcher, you would to add need unnecessary default cases for example. It should be possible to create type matching functions though to alleviate this problem. More work is needed in this area and it might be material for a future blog post.

If you can think of other applications for type lists and sets I would be interested in hearing them.

Tuesday, April 21, 2009

Equality, Mutability and Products

The thoughts in this post was sparked by a discussion on the scala-user mailing list. The discussion was about the Object.equals() method and what its semantics should be. There are many different types of equality, but IMO the most usable one is if the expression a.equals(b) returns the same value regardless of when it's evaluated (provided the references a and b are constant), i.e. it's time invariant. If this was not the case, it would be quite impossible to implement Object.hashCode() in a useful way and you wouldn't be able to use objects as keys in a hash map or as members of a set.

The Java standard library breaks this practice in a number of places, for example java.util.ArrayList implements equals() as element equality which of course is not time invariant as the list can change over time (unless it's read only). I think the rationale behind this might be that there are no immutable collection classes in the Java libraries, and thus ArrayList is used for both mutable and immutable lists.

Equality for Mutable Case Classes

When using Scala case classes you automatically get an equals() and hashCode() implementation which calls the corresponding class fields methods. The implementation is the same regardless of whether the fields are mutable or not, so if you use vars the implementations will not be time invariant. For example:

scala> case class A(var i : Int)
defined class A

scala> val a1 = A(1)
a1: A = A(1)

scala> val a2 = A(1)
a2: A = A(1)

scala> a1 == a2
res57: Boolean = true

scala> a1.hashCode
res58: Int = -1085252538

scala> a1.i = 2

scala> a1 == a2
res59: Boolean = false

scala> a1.hashCode
res60: Int = -1085252537

How can we fix this? The Scala compiler only generates equals() and hashCode() implementations if we have not already implemented them in a super type (excluding java.lang.Object). So we can simply define a super trait for all mutable case classes:

trait Mutable {
override def equals(v : Any) = super.equals(v)
override def hashCode = super.hashCode
}

Now we can easily define mutable case class with the default, time invariant equals() and hashCode() methods:

scala> case class A(var i : Int) extends Mutable
defined class A

scala> val a1 = A(1)
a1: A = A(1)

scala> val a2 = A(1)
a2: A = A(1)

scala> a1 == a2
res61: Boolean = false

scala> a1.hashCode
res62: Int = 22213838

scala> a1.i = 2

scala> a1 == a2
res63: Boolean = false

scala> a1.hashCode
res64: Int = 22213838

A more elegant solution is to define a Var trait and use this instead of vars in case classes, but this is not always a practical solution and it has some performance drawbacks.

Time Variant Equalily for Products

So, what if I want to compare my case classes for element equality at a given moment in time? Turns out it's quite easy to implement such a function in Scala as all case classes implements the Product trait. Here's an example implementation (not thoroughly tested):

def equalElements(v1 : Any, v2 : Any) : Boolean = (v1, v2) match {
case (p1 : Product, p2 : Product) =>
p1.getClass == p2.getClass &&
(0 until p1.productArity).forall(i => equalElements(p1.productElement(i), p2.productElement(i)))
case _ => v1 == v2
}

This function has some limitations and won't work correctly with nested product and non-product objects. In this case a more type specialized solution is required.

Example usage:

scala> val a1 = A(1)
a1: A = A(1)

scala> val a2 = A(1)
a2: A = A(1)

scala> equalElements(a1, a2)
res68: Boolean = true

scala> a1.i = 2

scala> equalElements(a1, a2)
res69: Boolean = false

Monday, February 23, 2009

New Verson of InfoNode Docking Windows

(This post is not Scala related)

I'm happy to announce the release of InfoNode Docking Windows 1.6.0. This release mainly contains fixes for some bugs which has been discovered since the previous release in 2007.

A short description of InfoNode Docking Windows is that it's a Swing based docking windows library, similar to what you find in Eclipse, Netbeans, Visual Studio etc. but more light weight. It's very easy to integrate into an existing Swing application (heavy weight components are supported as well) and it's highly customizable using a powerful properties system. Much of the window behaviors/looks can be tweaked with very little code. A number of themes are included in the distribution, and it's easy to create custom themes.

Well, a Web Start application says more than a thousand words, so why not try the demos for yourself.

InfoNode Docking Windows is dual licensed under the GPL and commercial licenses. The GPL version can be downloaded from this page.

For more information, see the InfoNode web page.

Saturday, October 4, 2008

Functional Purity in Scala

An important concept in functional programming is pure functions. The properties of pure functions means that they are easily verifiable and allows concurrent execution (an important advantage on today's multi-core machines). You can write pure functions in pretty much all programming languages, but AFAIK there are only two popular programming languages that verifies the purity at compile time: Haskell and D. In Haskell you can't use side effects or externally visible variables if you don't explicitly indicate so using monads in your function signature. D takes a different approach where the default is that you are allowed to use side effects and variables in your functions, and you must explicitly mark them as pure, see this presentation for more information. The difference in the implementations are surely because Haskell was designed from the start to be pure, but in D pure functions was added in version 2.0 of the language.

Pure Functions in Scala

I've been thinking about the best approach to implement pure function verification in the Scala compiler. An approach similar to the one in D would fit a lot better than the one used in Haskell (which would break all existing code and cause some problems due to strict evaluation). A solution using annotations would be quite simple to implement:

class Pure extends Annotation

In practice you want to define this as a runtime annotation in Java, but let's stick to Scala here. Now you can mark a function/method as pure:

@Pure def pureFunc(x : Int, y : Int) = x + y

There are some rules that the compiler must verify for a pure method/function:
  • Only calls to pure functions are allowed. This requires that a large number of functions in the Scala library must be marked using the pure annotation otherwise it will be impossible to write new pure functions.

  • Non-local vars can't be read or written. Local vars can be used in a pure function (as an accumulator for example), but you can't access static variables or variables reachable from the arguments.

  • A pure method can only be overridden by a pure method, and an interface method defined as pure can only be implemented as a pure method.

So far things are simple, but the restrictions imposed are quite severe, for example we can't create an array inside a pure function and use locally as this would result in calls to the non-pure array apply/update methods. Clearly this has to be allowed, which leads to the concept of "semi-pure" functions.

Semi-Pure Functions

Let's define the concept of a semi-pure function/method:

class SemiPure extends Annotation
class Pure extends SemiPure

As you can see a Pure function is a subtype of SemiPure, so anywhere a SemiPure function is required a Pure function can be used, for example a method marked as SemiPure can be overridden by a Pure method (but not the other way around). Here's an example of a semi-pure method:

case class Var(var value : Int) {
@SemiPure def inc = {
value += 1 // Ok, we can modify fields in this object
value
}
}

The following compilation rules applies to a semi-pure function:
  • It can call pure and semi-pure functions.

  • It can use local variables.

  • It can read and write variables reachable from its arguments. This includes the implicit this argument passed for class methods, so a class method can read/write fields of a class instance.

  • It can only be overridden by/implemented as a semi-pure or pure method.

So what use do we have for semi-pure functions? Well, they allow us to loosen the restrictions placed on pure functions somewhat: a pure function may call a semi-pure function if, and only if, all the argument values passed are created locally or are "invariant". "Invariant" is a term I borrowed from D, it's basically a deeply immutable value, i.e. it's an immutable value that only contain invariant values :). For example, List[Int] is invariant, but List[Array[Int]] is not invariant even though List[Array[Int]] is an immutable type.

So with this loosened restriction you can for example define a function that creates an array and updates it in a loop, and it can still be a pure function. A quite powerful concept that blends the imperative and purely functional programming styles.

Here's an example of a legal pure function that calls a semi-pure function (let's assume the method List[T].head is defined as pure):

case class Var(var value : Int) {
@SemiPure def inc = {
value += 1
value
}
}

@Pure def pureFunc(l : List[Var]) = {
val x = l.head // Ok, pure method called
val v2 = Var(10) // Object created locally
v2.inc // Ok, call of semi-pure function with locally created object
}

However, this is not allowed:

@Pure def pureFunc2(l : List[Var]) = {
val x = l.head // Ok, pure method called
x.inc // Error: semi-pure method called on variant external object
}

as it would result in an externally visible modification.

How to verify that a type is invariant is a problem that needs to be explored further. It should be doable with an addition of an immutable/invariant annotation, but there are some complications with subtyping and type parameters.

Function Objects

One more thing needs to be solved: how to declare a pure function parameter for a higher-order function. A quite simple solution is to create traits for pure and semi-pure functions and mix them with the function types. For example if you want to define a pure map function:

trait TSemiPure
trait TPure extends TSemiPure

@Pure def map[T, U](l : List[T], fn : (T => U) with TPure) : List[U] =
if (l.isEmpty) Nil else fn(l.head) :: map(l.tail, fn)

The TPure type would simply mean that the function objects apply method is considered pure by the compiler. There would be some restrictions on how the (semi-)pure traits would be allowed to be mixed in by the programmer. Another, maybe simpler, option is to create a new set of (Semi)PureFunction0-22 traits that extends the existing Function0-22 traits.

For lambda expressions the compiler could automatically infer if the function is pure, semi-pure or impure, and use the correct trait. During eta expansion the correct trait can be used depending on the annotation of the method/function expanded.

Final Words

Using the constructs I've presented in this post I think it would be feasible to implement checking of functional purity in the Scala compiler without too much effort. Hopefully this will result in a SIP in the near future, so that Scala hackers can utilize the powerful tool of statically checked pure functions.

I'm sure I've made some error or missed something along the way, so I'm grateful for any comments/corrections you might have.

Sunday, September 7, 2008

HList in Scala Revisited (or Scala Metaprogramming Works!)

The Scala compiler team listened to my request for supporting recursive type projections (ticket #1291) and has added experimental support for it. If you have a new build of the compiler (latest nightly or Eclipse plugin will do) you can enable it by add the compiler option "-Yrecursion x", where x is the maximum recursion depth allowed. Thanks to the EPFL team for the help, and particularly Geoffrey for implementing the fix.

The HList code I presented in my previous post now compiles without errors. I've setup a project at Assembla with the HList code and other metaprogramming constructs like booleans, natural numbers, integers, units etc. The library is in pre-alpha stage and there's lots of work to be done, for example the HList type is missing many methods found in the Scala List type.

Basically anything you can do with a normal List you can do with a HList, but all operations are type safe on an element by element basis. Here is some example code using HLists:

object HListExample {
import HLists._
import Nats._
import Utils._

// Create a HList of an Int, Boolean and a pair
val l1 = 10 :: true :: (10.1, "Hello") :: HNil

// Extract the second element, note that the element type
// information is preserved and we can safely perform a
// boolean and operation
l1.nth[_1] && false

// Create another HList, note the use of an operator in
// the type expression
val l2 : Double :: String :: Boolean :: HNil = 1.1 :: "string" :: false :: HNil

// Replace the second element in the list, it used to
// be a String, but now there's an Int inserted
val l3 = l2.remove[_1].insert(_1, 14)

// Type information is preserved, we can use an Int operation
// on the element
l3.nth[_1] * 10

// Append l2 to l1
val l4 = l1 ::: l2

// Statically check that the length of l4 is 6
type T = Equal[_6, l4.type#Length]
}

So, what are HLists useful for, besides being a fancy replacement for tuple types? Well, they can for example be used to implement a completely type safe object oriented programming language with support for delegation, subtyping etc. inside Scala. I will get back to that in a later blog entry.

The project code also contains a simple example of a type safe units library. Here's a usage example:

object UnitsTest {
import Units._
import Integers._

val dist : Length = measure(2.3) * m
val time : Time = measure(1.7) * s
val speed : Speed = dist / time
}

All units are checked at compile time, so you can't assign a length to a time variable for example.

There are lots of other use cases for metaprogramming and the library will continue to grow over time. However, there are some issues with metaprogramming in Scala, one being the speed of the compiler. Compiling types which expands to long paths can sometimes take a very long time. I have submitted a ticket about this too. Hopefully some optimizations are made to alleviate the problem. Meanwhile, I think certain applications of metaprogramming, for example compile time checked bounded integers, will put to much stress on the compiler and should probably be avoided.

Sunday, August 31, 2008

HList in Scala

This is my first blog post about Scala, so be gentle in your criticism :).

I've been trying to implement a linked list of heterogeneously typed elements in Scala, similar to HList for Haskell. This data type can for example be used as a replacement for all the TupleX types in the Scala standard library. A simple implementation of the HList data type in Scala might look like this:

object HListTest {
sealed trait HList

final class HNil extends HList {
def ::[T](v : T) = HCons(v, this)
}

val HNil = new HNil()

final case class HCons[H, T <: HList](head : H, tail : T) extends HList {
def ::[T](v : T) = HCons(v, this)
}

type ::[H, T <: HList] = HCons[H, T]
}

A list of arbitrary length can now be created:

val list : Int :: String :: Boolean :: HNil = 10 :: "hello" :: true :: HNil

Fine, now let's try to implement something more advanced like the append function:

case class Appender[L1 <: HList, L2 <: HList, Result <: HList](fn : (L1, L2) => Result)
{
def apply(l1 : L1, l2 : L2) = fn(l1, l2)
}

def append[L1 <: HList, L2 <: HList, Result <: HList](l1 : L1, l2 : L2)
(implicit fn : Appender[L1, L2, Result]) : Result = fn(l1, l2)

Note that the return type of the append function depends on the argument types. To use the append function we must provide two implicit functions that create instances of the Appender class for appending to the empty list and to a cons list:

implicit def nilAppender[L <: HList] : Appender[HNil, L, L] =
Appender((v : HNil, l : L) => l)

implicit def consAppender[H, T <: HList, L2 <: HList, R <: HList]
(implicit fn : Appender[T, L2, R]) : Appender[HCons[H, T], L2, HCons[H, R]] =
Appender[HCons[H, T], L2, HCons[H, R]]((l1 : HCons[H, T], l2 : L2) => HCons(l1.head, fn(l1.tail, l2)))

A simple test case:

append(HNil, HNil)

What? Compiler error, no implicit argument found? Closer examination of the error message indicates that the Scala compiler has bound the Result type parameter of the append function to Nothing, and then it tried to find an implicit matching this type. This will obviously not work. Explicitly passing the nilAppender works:

append(HNil, HNil)(nilAppender)

It seems the return type of a function can't be inferred from implicit arguments, we must be able to specify the return type of append only with the help of the non-implicit parameters. This is unfortunately an example where the Scala type inferer is not as powerful as the one in Haskell.

Game over? No, we have one more ace up our sleeve: abstract types. Let's declare the return type of the append function as an abstract type in the HList trait, and modify the append and implicit functions to use type projections:

sealed trait HList {
type Append[L <: HList] <: HList
}

final class HNil extends HList {
type Append[L <: HList] = L

def ::[T](v : T) = HCons(v, this)
}

final case class HCons[H, T <: HList](head : H, tail : T) extends HList {
type Append[L <: HList] = HCons[H, T#Append[L]]

def ::[T](v : T) = HCons(v, this)
}

def append[L1 <: HList, L2 <: HList](l1 : L1, l2 : L2)
(implicit fn : Appender[L1, L2, L1#Append[L2]]) : L1#Append[L2] = fn(l1, l2)

implicit def nilAppender[L <: HList] : Appender[HNil, L, L] =
Appender((v : HNil, l : L) => l)

implicit def consAppender[H, T <: HList, L2 <: HList, R <: HList]
(implicit fn : Appender[T, L2, T#Append[L2]]) : Appender[HCons[H, T], L2, HCons[H, T]#Append[L2]] =
Appender[HCons[H, T], L2, HCons[H, T]#Append[L2]]((l1 : HCons[H, T], l2 : L2) => HCons(l1.head, fn(l1.tail, l2)))

Now the simple test case works:

append(HNil, HNil)

A more advanced test works as well:

append(true :: HNil, 10 :: HNil).tail.head * 3

Nice! Have we implemented a type safe append function for heterogeneous lists of arbitrary length? Unfortunately not:

append(true :: "hello" :: HNil, 10 :: HNil)

This will cause an "illegal cyclic reference" compiler error. A type path can't contain the same type declaration twice, even though it or the enclosing type has different type arguments. In this case the HCons#Append type will be recursively evaluated twice for the first list type.

I've filed an enhancement ticket to remove this restriction. Hopefully it will be implemented some time in the near future, so the full power of abstract types and type projections can be utilized. In fact, there are many more things that can be implemented if this restriction is removed, see C++ MPL and Boost for some examples.

So, this is the current state of my exploration of Scala's wonderful type system. Unfortunately, at the moment it seems like HList can't be fully implemented in Scala. I'm grateful for comments proving me wrong though :)