Ravi Mohan's Blog

Tuesday, January 01, 2008

Sliding Into Scala

I've been dissatisfied with Java for a while now, but to give the devil his due, it does hit a sweet spot. When you are looking for a cross-platform, fast, statically typed, easily deployable language with tonnes of libraries, there isn't much else available. On the other hand, the language itself is mind-numbingly verbose, and since most of my coding these days is in a combination of Scheme and C, when I do switch back to Java it is as if I am suddenly running through quicksand, Eclipse notwithstanding.

Generics (the implementation, not the idea) were the first misstep in Java's evolution. They make it easy to write code that is almost impossible to read. For the AIMA code, for example, when I used pure Java 5, generics-heavy code, students complained that they couldn't make out what the code was doing. So for code that other people have to maintain or extend, I end up writing Java in a style I call "1.4 +": Java 1.4 + enums + new for loop + generics *for collections* (only). The actual type system of Java 5 is a fairly simple (even simplistic) one (once you've worked through some books like TAPL), but the syntax is atrocious. The present controversy on "closures" (closure != anonymous function, and I am tired of programming illiterates abusing terms with clearly defined meanings, but that is a rant for another day) convinced me that Java is on the wrong track and is well on its way to obsolescence. "Java is the new Cobol" indeed.

What COBOL never had was an open source cross-platform VM that other languages could target and a few hundred thousand libraries. Java != JVM, in other words. Couldn't I just use another language on the JVM? Well yes, but I don't particularly like Ruby. Jython hasn't caught up with the latest version of Python. Both these languages have dynamic type systems and are slower than compiled Java. So there isn't too much actual choice given the parameters I listed above (fast, statically typed ... ).

Enter Scala. I looked at Scala a year ago, tried some sample scripts, stumbled across a bug and gave up. But things have changed since then. I've been experimenting with Scala over the last few days and I am tremendously impressed. Besides writing small scripts to explore various language features, I've been rummaging through the code for the compiler and type checker (written in Scala of course). Martin Odersky has something that is all too rare in the software world today - a strong sense of design (I am looking at you, Ruby On Rails). Scala has too many brilliant features to go into any great detail here (there are plenty of blogs that do go into detail; see this or this, for example), but in essence code compresses to almost nothing, the type system is brilliant, and pattern matching is something I've wanted in a JVM language for a long time. One can even do monadic programming in Scala! I am VERY impressed.

Now don't get me wrong, there are a few rough edges. For example, one of the things it doesn't have (as of this moment) is a good reflection API. I tried to write an xUnit clone to get my feet wet and got stuck on identifying all the methods that start with "test". (It *can* be done by leveraging the underlying Java, but not from native Scala.) The build system is a mess (try building from source and watch the build fail for lack of heap space - this is unforgivable in 2008) and there is less of a focus on testing/regression than I'd like. And oh, if you are using the interpreter, make sure you use jline ("sbaz jline" should do it - I don't know why this isn't part of the standard distribution).

But these are minor quibbles - Scala is a brilliant accomplishment. I think Sun should just declare Scala to be Java 8 and be done with it.

16 comments:

Sidu said...

"I think Sun should just declare Scala to be Java 8 and be done with it."
+1

Anonymous said...

If you have not seen it yet, you might find CAL interesting too:

http://labs.businessobjects.com/cal/

It is, in essence, Haskell for the JVM.

Matt said...

Thanks for the tip about jline -- the interpreter is awful to use without it. It's a bit strange that it's not included by default.

Michael Nischt said...

"I think Sun should just declare Scala to be Java 8 and be done with it."

+1

Yardena said...

"I think Sun should just declare Scala to be Java 8 and be done with it."

+1

Anonymous said...

I think Sun should just be done with it !

I mean, literally "just done with it !" :)

wilfred said...

+1

Vladimir Levin said...

Ravi,

I've got a question. I am not sure how relevant this is to your post, but you seem to be the kind of person who would know, and it can't be worse than all those annoying +1's.

It seems that closures are the new "hot feature" for upcoming versions of Java. My question is, what kinds of things are closures useful for?

I've used the idea of invoking a function that is passed in as though it is an object in some simple ways. For example, the typical Ruby example where you'd do something like list.find { |item| some criteria } or the idea of passing in event handlers. However, beyond that, this idea of just a function with some stack wrapped around it seems like a kind of primitive and somewhat dangerous ancestor of OO programming. I have done some quick searches on the web, but I haven't found a clear explanation of cases where closures are useful. I read in one discussion that they enable concurrency frameworks to be developed more cleanly... My next step is to start reading a Scheme tutorial, but in the meantime, have you got any comments/tips/examples?

Ravi said...

@Vlad,

"What are closures good for?"

First, closures are not anonymous functions, but going into this would take too long so I'll use the terms interchangeably for this comment. Now as to your question,

This is like asking "What are Objects good for? One can always use structures and functions after all and switch on the type"

Answers tend to be generic like "increased abstraction" or "less boiler plate code".

Fred Brooks said it best

"What does a high-level language accomplish? It frees a program from much of its accidental complexity. An abstract program consists of conceptual constructs: operations, datatypes, sequences, and communication. The concrete machine program is concerned with bits, registers, conditions, branches, channels, disks, and such. To the extent that the high-level language embodies the constructs wanted in the abstract program and avoids all lower ones, it eliminates a whole level of complexity that was never inherent in the program at all." - No Silver Bullet,

So closures are "just an abstraction mechanism" that makes (some kinds of) programs shorter, less error prone and easier to read, write and modify.
Just as a while loop is closer to the way you think about a program than a functionally equivalent condition + a goto, for certain programming tasks (e.g. traversals of tree/graph structures, or sequences), closures provide a closer match with your thought process than objects. And sequences (for example) are more prevalent than most programmers realize. From SICP chapter 2 (emphasis mine),

"Richard Waters (1979) developed a program that automatically analyzes traditional Fortran programs, viewing them in terms of maps, filters, and accumulations. He found that fully 90 percent of the code in the Fortran Scientific Subroutine Package fits neatly into this paradigm. One of the reasons for the success of Lisp as a programming language is that lists provide a standard medium for expressing ordered collections so that they can be manipulated using higher-order operations. The programming language APL owes much of its power and appeal to a similar choice. In APL all data are represented as arrays, and there is a universal and convenient set of generic operators for all sorts of array operations."

Needless to say, there are more interesting data structures than just straightforward sequences.
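To make the "maps, filters, and accumulations" view concrete, here is a minimal sketch in Python. The data, the threshold and the helper name `make_threshold_filter` are made up for illustration; the point is only that the loop and the pipeline compute the same thing, and that the filter is a closure over `threshold`.

```python
from functools import reduce

def make_threshold_filter(threshold):
    # The inner function closes over `threshold`; each call to
    # make_threshold_filter produces a predicate with its own captured value.
    def above(x):
        return x > threshold
    return above

readings = [3, 12, 7, 20, 1]  # illustrative data, not from any real program

# Loop version: the test, the transformation and the accumulation are tangled together.
loop_total = 0
for r in readings:
    if r > 5:
        loop_total += r * r

# The same computation expressed as filter -> map -> accumulate.
above_five = make_threshold_filter(5)
pipeline_total = reduce(
    lambda acc, x: acc + x,                          # accumulate
    map(lambda x: x * x, filter(above_five, readings)),  # map over the filtered items
    0,
)
```

Both versions give the same total; the second names each stage separately, so any stage can be swapped without touching the others.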

What you do with closures in Java, you can do with "standard" objects/classes in Java at the cost of more boilerplate and more of a burden on the programmer using those classes.

Now an object (or for that matter a class) is also just a (set of) "function(s) with a stack wrapped around it" at the implementation level. In simple interpreters, objects are *implemented as* closures, with type and name tags.



Closures are not "more primitive" than objects. Objects and closures are duals and in an emergency one can use one to implement the other (see SICP for details on how this works).
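A rough sketch of that duality, in Python rather than Scheme. The names `make_counter` and `Adder` are invented for this example: the first is an "object" built from nothing but a closure and message dispatch, the second is a "function" built from an ordinary class.

```python
def make_counter(start=0):
    # An "object" implemented as a closure: `count` is the private state,
    # and the message string plays the role of the method name.
    count = start

    def dispatch(message):
        nonlocal count
        if message == "increment":
            count += 1
            return count
        if message == "value":
            return count
        raise ValueError("unknown message: " + message)

    return dispatch


class Adder:
    # A "closure" implemented as an object: the captured variable becomes
    # a field, and __call__ makes instances usable wherever a function is.
    def __init__(self, amount):
        self.amount = amount

    def __call__(self, x):
        return x + self.amount


counter = make_counter(10)
counter("increment")
counter("increment")
current = counter("value")

add_five = Adder(5)
mapped = list(map(add_five, [1, 2, 3]))
```

Neither direction is pleasant to write by hand in bulk, which is the argument for having both constructs available.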

You didn't ask this question but Scala provides a lot more than "closures". If all one wants is closures, you could use JRuby.

Scala's pattern matching and elegant type system, *in combination with* "closures", promote a kind of high-abstraction, concise programming style that would take 10 times as much code in Java, just as (to flog an overused analogy again) abolishing while and for loops from Java and providing a single "goto" construct would increase the size and complexity of Java programs even if the "goto" were logically equivalent to higher level loops.


The best way to get a feel for the kind of programming Scala enables is not to try to figure out abstractly what individual features are good for, but to work in a language that makes radically different design choices than the ones you use regularly.

I would suggest Haskell to get some truly mind bending insights. Write a few thousand line non trivial program in Haskell (should take you only a month or so even working an hour or two a day) and you will "grok" the answer to your question much better than I could convey in a blog comment.

Vladimir Levin said...

Thanks for your comments Ravi. I will have a look at Haskell. I've always had trouble with very pure functional programming in the past, but I will try again. In the past I've always found it to have a quality that I might describe as "annoying." It always seemed very dense and mathematical to me - harder to understand in pieces. For a simple example, I was not very fond of the mutually recursive assign/eliminate functions in Norvig's sudoku example (probably better named as try_to_assign and try_to_eliminate, but anyway...).

The thing of it is that I like to think I can provide nice simple, clear examples of why basic OO features are helpful in reducing code duplication, as well as providing a common model for programming. I wish I could find something similar in the way of explaining closures. Ah well. I will try to get some understanding of Haskell and perhaps write a blog entry about it in a few months. Cheers!

Ravi said...

"I've always found it to have a quality that I might describe as "annoying." It always seemed very dense and mathematical to me - harder to understand in pieces."

You are just unfamiliar with the constructs and idioms. If you don't understand metaclasses, Ruby's method_missing can be "annoying" too.

"For a simple example, I was not very fond of the mutually recursive assign/eliminate functions in Norvig's sudoku example"

Again, you are just unfamiliar with mutually recursive functions. They are a common "pattern" in many languages and are the clearest expression of the underlying idea to programmers who are familiar with the idiom.




For example, in C the "while(*s++ = *t++)" construct would be unfamiliar to, say, a Java programmer (or even a novice C programmer), but is clear and concise to a good C programmer. Most people who write code only in Java (say) use recursion very sparingly (recursion is fundamentally broken in Java, but that is a rant for another day) and so are unfamiliar with the uses of recursion. Someone who is an expert in a language that facilitates easy usage of recursion would use the style when appropriate without a second thought.

The key to understanding recursion is to understand the underlying mathematical idea of a recurrence. I'll write more on this sometime, but yes, the usage of mutual recursion does look unfamiliar to people not familiar with mutual recursion ;-).
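As a toy illustration of reading mutual recursion as a recurrence (this is the textbook even/odd pair, not anything from Norvig's solver), take even(0) = true, even(n) = odd(n - 1), odd(0) = false, odd(n) = even(n - 1). A minimal Python sketch, assuming non-negative integer inputs small enough to stay under Python's recursion limit:

```python
def is_even(n):
    # Base case of the recurrence: 0 is even.
    # Otherwise n is even exactly when n - 1 is odd.
    return True if n == 0 else is_odd(n - 1)


def is_odd(n):
    # Base case of the recurrence: 0 is not odd.
    # Otherwise n is odd exactly when n - 1 is even.
    return False if n == 0 else is_even(n - 1)
```

Each function is a direct transcription of one line of the recurrence, which is why the pair reads clearly once the idiom is familiar.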

You shouldn't judge the "clarity" of language features by how people unfamiliar with the language find a particular expression. The way to decide if a piece of prose in French (say) is clear or turgid is to judge it *in terms of* French as understood by a Frenchman, not by laboriously converting it word by word into English and then looking at the English "translation".



"(probably better named as try_to_assign and try_to_eliminate, but anyway...)."

Better for whom? You are still "thinking in Java", I think.

Returning a distinct value (or type) (null, an empty list, or "False" as in this case) to denote failure is a common idiom in dynamically typed languages. In Python in particular, a non-empty list evaluates to the boolean true in a context which requires a boolean.

Thus assign returning a list of values or False is a common idiom in Python, which allows you to write code like "if not assign(blah):".

(Question to ponder: why does Norvig return a distinct *type* vs a distinct value of the same type, say the empty list? Hint: collecting parameter.)
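A small sketch of the idiom in Python. `assign_once` is a hypothetical stand-in written for this comment, not Norvig's assign: it appends to a list passed in by the caller and returns that list on success, or False when the item is already present.

```python
def assign_once(values, item):
    """Hypothetical helper: add item to the caller's list `values`.

    Returns the (mutated) list on success, or False to signal failure.
    """
    if item in values:
        return False          # failure: a different type from the success value
    values.append(item)
    return values             # success: the same list that was passed in


placed = []
first = assign_once(placed, 3)

# The caller can test the result directly in a boolean context.
if not assign_once(placed, 3):
    outcome = "contradiction"
else:
    outcome = "ok"
```

The second call fails because 3 is already in the list, so the "if not" branch runs without any explicit comparison against a sentinel.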

In Java, of course, the returned values from a method have to have a single predeclared type, and therefore you can't return a List *or* a False from the same method depending on what happens when it executes. Even if you could, you wouldn't be able to write "if not myFunc(blah)" because a list cannot be used where a boolean is required.

When you port the Python code to Java you of course have to adopt Java-specific idioms, and failure has to be explicitly indicated (by returning null, throwing an exception, doing an explicit compare of the returned value to the passed-in collection parameter, returning a custom datatype that is the union of a list and a boolean, or whatever).


*In Java* your proposed name changes make more sense. Not so much in Python. This is an idiom experienced Pythoneers know and use.

To translate accurately from French to English, one needs to be fluent in French *and* English. To convert "je t'aime" to "I you love" and then say "Oh that Frenchman constructs sentences awkwardly" betrays the cluelessness of the translator!

All that said,

"I wish I could find something similar in the way of explaining closures"

Try to write programs that traverse or transform (arbitrarily) complex data structures - a code generator (traverses and transforms an AST), an XML transformer, a SQL query optimizer, a network protocol implementation etc. - and you will see where a functional style provides advantages over an OO style (and vice versa).

This,

"I will try to get some understanding of Haskell and perhaps write a blog entry about it in a few months. Cheers!"

is a great idea! Replace "some" with "a good", though! ;-)

Happy New Year, Vlad

Vladimir Levin said...

Thanks Ravi,

Happy new year to you too!

Sidu said...

Ravi, do you think it would be possible for you to do a session (on anything) at DevCamp? I'm asking because you've registered, and I was hoping you'd be able to talk about some of the stuff you've been working on.

Thanks,
Sidu.

Ravi said...

Sidu,

If I attend (more on this below) I will certainly conduct a session. The whole idea of DevCamp is that there aren't any passive participants (isn't it?) and I look forward to attending an unconference of people who write code.

Now in Bangalore, the barcamps for example are often inundated by vague people (search engine optimization folks, people doing the umpteenth web 2.0/social networking type startups and looking to meet venture capitalists, photography freaks, movie fans and so on).

I am a bit concerned that the same crowd will swamp DevCamp, in spite of the explicit dev focus.

If this happens with DevCamp I probably won't attend.


I don't know how exactly the organizers plan to restrict this "camp" to the dev folks who actually write code on a day to day basis, but knowing Thoughtworkers they probably have this well in hand.


As of this moment I do plan to present at DevCamp, something on machine learning or robotics or type systems - something of that nature, all with accompanying code (of course)!



Thanks for asking,
ravi

Venkatesh Sellappa said...

Hi Ravi,

Slightly lateral to the main topic of the blogpost.

For a programmer coming from an imperative background (i.e. C, C++, Java), is it a prerequisite to understand the maths behind functional programming before attempting to learn a functional language?

Ravi said...

"For a programmer coming from an imperative background (i.e. C, C++, Java), is it a prerequisite to understand the maths behind functional programming before attempting to learn a functional language?"

no.

just download the language of your choice and start hacking!