2012-03-09

Scalagen - Java to Scala Conversion

Scalagen grew out of our wish to port some of our projects from Java to Scala. Rather than porting all the code manually, we wanted a tool to do most of the bulk work. We did some initial experiments with Jatran, but found it lacking.

After that we investigated options for writing such a tool ourselves. First we needed a parser that turns Java sources into abstract syntax trees (ASTs). We decided to use the Javaparser framework since it supported Java 5 source and had an easy-to-use API.

Initial Architecture

Then work began on Scalagen. The initial parts were written in Java, most notably a visitor implementation for javaparser that prints the tree as Scala syntax. Once that worked, we used this visitor to convert the visitor class itself from Java to Scala. From then on the implementation of Scalagen was fully Scala-based.

Next we wrote visitor implementations that transform the AST. The transformations we needed were extracting static members into companion objects and changing various control statements into a more Scala-esque form. At first the visitors just mutated the original AST in place. This caused problems with Scala's immutable collection types, so we rewrote the transforming visitors to always return new, detached AST versions.
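The difference between mutating a tree and returning a new one can be sketched on a toy AST. The case classes below are hypothetical stand-ins, not the javaparser classes, and the ordering of the generated object and class is an assumption made for illustration. The sketch moves static fields into a companion object and leaves the input tree untouched.

```scala
// Toy AST for illustration only; these are NOT the javaparser node classes.
sealed trait Node
case class Field(name: String, isStatic: Boolean) extends Node
case class ClassDecl(name: String, members: List[Field]) extends Node
case class ObjectDecl(name: String, members: List[Field]) extends Node
case class CompilationUnit(types: List[Node]) extends Node

object CompanionObjectSketch {
  // Returns a new tree; the input unit is never modified.
  def transform(cu: CompilationUnit): CompilationUnit = {
    val newTypes: List[Node] = cu.types.flatMap {
      case ClassDecl(name, members) =>
        // Split the members into static and instance members.
        val (statics, instance) = members.partition(_.isStatic)
        val cls = ClassDecl(name, instance)
        if (statics.isEmpty) List(cls)
        else List(ObjectDecl(name, statics.map(_.copy(isStatic = false))), cls)
      case other => List(other)
    }
    CompilationUnit(newTypes)
  }
}
```

Because every step builds fresh immutable values, the original unit can still be inspected or compared after the transformation has run.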

With the initial architecture in place, further transformations could be written. After a while we had an impressive list of transformations, each of which could be used via either its class or its object variant:

object Primitives extends Primitives

class Primitives extends UnitTransformerBase {
  ...
}
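One way to see why both variants are handy is a small self-contained sketch. The UnitTransformer trait, the two transformers, and the pipeline object below are hypothetical and work on plain strings. They are not Scalagen's UnitTransformerBase, which operates on compilation units. The object variant serves as a ready-made shared instance, while the class variant stays available for subclassing.

```scala
// Hypothetical stand-in for a transformer base type, reduced to String => String.
trait UnitTransformer {
  def transform(source: String): String
}

class StripTrailingSpaces extends UnitTransformer {
  def transform(source: String): String =
    source.split("\n", -1).map(_.replaceAll("\\s+$", "")).mkString("\n")
}
// Object variant: a shared default instance of the class above.
object StripTrailingSpaces extends StripTrailingSpaces

class CollapseBlankLines extends UnitTransformer {
  def transform(source: String): String =
    source.replaceAll("\n{3,}", "\n\n")
}
object CollapseBlankLines extends CollapseBlankLines

object TransformerPipeline {
  // The object variants can be listed directly, without any `new`.
  val default: List[UnitTransformer] = List(StripTrailingSpaces, CollapseBlankLines)

  // Applies each step to the result of the previous one.
  def run(source: String, steps: List[UnitTransformer] = default): String =
    steps.foldLeft(source)((current, step) => step.transform(current))
}
```

A customised step can still be created from the class variant, for example with an anonymous subclass, and passed to run in place of the default list.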

Wrapping the Parser

Then we began to adapt the javaparser API to make it easier to use. We declared short type aliases in a singleton object to strip off redundant suffixes such as Stmt and Expr:

type Annotation = AnnotationExpr
type AnnotationDecl = AnnotationDeclaration
type AnnotationMember = AnnotationMemberDeclaration
type Assign = AssignExpr
type Binary = BinaryExpr
type Block = BlockStmt
type BodyDecl = BodyDeclaration

Next we wrote deconstructors (extractor objects with an unapply method) to make AST pattern matching more concise:

// Each extractor exposes the parts of a node as a tuple for pattern matching.
object FieldAccess {
  def unapply(f: FieldAccess) = Some((f.getScope, f.getField))
}

object For {
  def unapply(f: For) = Some((toScalaList(f.getInit), f.getCompare,
    toScalaList(f.getUpdate), extract(f.getBody)))
}

object Foreach {
  def unapply(f: Foreach) = Some((f.getVariable, f.getIterable,
    extract(f.getBody)))
}

object MethodCall {
  def unapply(m: MethodCall) = Some((m.getScope, m.getName,
    toScalaList(m.getArgs)))
}

A more complex example provides both a new-less constructor via apply and a deconstructor via unapply:

object VariableDeclaration {
  def apply(mod: Int, name: String, t: Type): VariableDeclaration = {
    val variable = new VariableDeclarator(new VariableDeclaratorId(name))
    new VariableDeclaration(mod, t, variable :: Nil)
  }
  def unapply(v: VariableDeclaration) =
    Some((v.getType, toScalaList(v.getVars)))
}

The singleton names matched the type aliases so we had a very Scala-like meta-layer on top of the javaparser AST classes.
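The combination of aliases and same-named singletons can be tried out in isolation. The sketch below uses two hypothetical Java-style classes, NameExpr and MethodCallExpr, in place of the javaparser ones. The toScalaList helper here is a simplified, null-tolerant guess at what such a helper might do, not Scalagen's implementation. It works because Scala keeps types and terms in separate namespaces, so Name can be a type alias and an extractor object at the same time.

```scala
// Hypothetical stand-ins for Java-style AST classes; NOT the javaparser classes.
class NameExpr(val getName: String)
class MethodCallExpr(
  val getScope: NameExpr,
  val getName: String,
  val getArgs: java.util.List[NameExpr])

object AstSketch {
  // Short aliases that drop the Expr suffix.
  type Name = NameExpr
  type MethodCall = MethodCallExpr

  // Copies a Java list into an immutable Scala List; null becomes Nil.
  def toScalaList[T](list: java.util.List[T]): List[T] = {
    val builder = List.newBuilder[T]
    if (list != null) {
      val it = list.iterator()
      while (it.hasNext) builder += it.next()
    }
    builder.result()
  }

  // Extractors named like the aliases.
  object Name {
    def unapply(n: Name): Option[String] = Some(n.getName)
  }

  object MethodCall {
    def unapply(m: MethodCall): Option[(Name, String, List[Name])] =
      Some((m.getScope, m.getName, toScalaList(m.getArgs)))
  }

  // Nested patterns read almost like the source construct they match.
  def describe(node: AnyRef): String = node match {
    case MethodCall(Name(scope), "entrySet", Nil) => "entrySet on " + scope
    case MethodCall(_, name, args) => name + " with " + args.size + " argument(s)"
    case Name(n) => "name " + n
    case _ => "something else"
  }
}
```

Converting the argument list to a Scala List inside unapply is what allows the Nil and :: patterns to be used on it directly.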

Here is a fairly complex example of how the pattern matching can be used in the transforming visitors:

override def visit(nn: Foreach, arg: CompilationUnit): Node = {
  // Transform the children first, then try to rewrite the resulting loop.
  val n = super.visit(nn, arg).asInstanceOf[Foreach]
  n match {
    // Matches: for (T v : scope.entrySet()) body
    case Foreach(
      VariableDeclaration(t, v :: Nil),
      MethodCall(scope, "entrySet", Nil), body) =>
        val vid = v.getId.toString
        // Iterate over scope directly, binding a (key, value) pair.
        new Foreach(
          VariableDeclaration(0, "(key, value)", Type.Object),
          scope, n.getBody.accept(toKeyAndValue, vid).asInstanceOf[Block])
    case _ => n
  }
}

This still looks quite cryptic if you are not familiar with the AST structure of javaparser, but for those who are, it is a fairly intuitive way to match AST patterns.
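To make the intent of that rewrite more concrete, here is a sketch of the loop shape it appears to aim for. The Java input is shown as a comment. The Scala form is inferred from the "(key, value)" variable name and the toKeyAndValue visitor name, so treat it as an assumption rather than verified Scalagen output. The method name, variable names, and the use of a Scala Map are invented so that the block runs on its own.

```scala
object EntrySetLoopSketch {
  // Java input, for reference:
  //   for (Map.Entry<String, Integer> e : counts.entrySet()) {
  //     lines.add(e.getKey() + "=" + e.getValue());
  //   }
  //
  // Assumed target shape: the loop iterates over the map itself and binds
  // a (key, value) pair; e.getKey() and e.getValue() become key and value.
  def render(counts: Map[String, Int]): List[String] = {
    val lines = scala.collection.mutable.ListBuffer[String]()
    for ((key, value) <- counts) {
      lines += key + "=" + value
    }
    lines.toList
  }
}
```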

Packaging

Scalagen provides direct Maven support via a plugin. You can run it from the command line for main sources like this:

mvn com.mysema.scalagen:scalagen-maven-plugin:0.1.3:main \
  -DtargetFolder=target/scala

and for test sources:

mvn com.mysema.scalagen:scalagen-maven-plugin:0.1.3:test \
  -DtargetFolder=target/scala

Here is the snippet for an explicit configuration in a POM:

<plugin>
 <groupId>com.mysema.scalagen</groupId>
 <artifactId>scalagen-maven-plugin</artifactId>
 <version>0.1.3</version>
</plugin>

To convert main sources, run:

mvn scalagen:main

and to convert test sources, run:

mvn scalagen:test

The conversion results should be seen as a starting point for a Java to Scala port. Some elements are not transformed correctly, for various reasons, and will need manual intervention.

Finally

Scalagen is an experimental effort and still has lots of rough edges, but we are open to suggestions for improvements and stylistic changes.