One Man Hacking: January 2008

Sunday, January 27, 2008

About GRE scores

Ever since I published my GRE scores on this blog, many people have written to me asking how to prepare. After writing the nth email with the same content, I thought I'd write the "official" answer here once and for all.

This is the "official" answer to "What advice can you give me on how to prepare for the GRE?"

The answer is "Nothing"!

I skimmed the vocabulary section of the Barron's guide, and the sections (16, iirc) of the same guide that cover the quantitative section, to refresh my memory one day before the exam. I was overloaded with work then and had no time to prepare.

My exam was scheduled for 8 o'clock in the morning and I was awake till about 05.00 working on a program. Since I got only about 2 hours of sleep, I was in this weird half asleep/awake state, which had the effect that I was totally relaxed and did not have the bandwidth to track the time remaining etc. I just answered each question as it came up. I got the very last question in the quantitative section wrong because, for the first and only time, I checked the clock, found I had like 3 seconds to answer, panicked and randomly clicked an answer (which turned out to be wrong). Oh well.

Beyond the above I have nothing to say on the GRE. Don't bother asking.

Friday, January 25, 2008

Engineering - Some Working Definitions

This is the third in a series of four blog posts. Read parts one and two to get some context.

A talk (warning - mp3) by Dr Douglas Lauffenburger, of MIT's Biological Engineering Dept, provided enough ideas for a working definition of "engineering" (sufficient for my purposes).

(paraphrase begins)

Dr Lauffenburger begins by breaking down engineering into two aspects - science (the study of things that exist) and technology (making things that don't exist).

So,

engineering = science (analysis - studying things that exist, breaking them down into components and methods of combining them) + technology (synthesis - building things by putting together the components identified by analysis).

Engineering further adds a "design principles" (how things get put together) focus to both analysis and synthesis.

All engineers study mathematics. (This is a given).

An engineering discipline has a base of science for its components and methods of combination. A branch of engineering picks a branch of science to base itself on. So, for example, Mechanical Engineering has a base of Physics and Materials Engineering has a base of Chemistry ( + Physics and the omnipresent Mathematics).

Another way of thinking about engineering is

"measure (properties of systems of interest), (use mathematics to ) model, manipulate (components and methods of combination, guided by the model) and make (things that don't exist)". --> (1)

Yet another way to think about engineering,

Engineering = mathematics + science + application area --> (2)

There can be various combinations of (and subcomponents to) each of these three components.

The "science" component in that equation needs to be manipulatable, quantifiable, modellable etc.

(paraphrase ends)

Later in the speech, Dr L goes into why Biology only recently became modellable, and how, before that, the various branches of BioEngineering used Physics, Chemistry etc as the underlying science rather than Biology. Biology was often the "application area", but not the underlying science. Thus at MIT biology would be a minor, and other engineering disciplines like mechanical or electrical engineering (including comp sci like robotics and algorithms) would be applied to a biological domain like pharmaceuticals or prostheses.

Dr L goes on to explain how and why this changed, and how Biology is, these days, a science you can base an engineering discipline on (and you really ought to listen to the full speech), but for the purposes of this post (1) and (2) are what I am interested in, i.e.,

engineering = mathematics + science + application area and

(doing) engineering = measure (properties of systems of interest), ( use mathematics to ) model, manipulate (components and methods of combination, guided by the model) and make (things that don't exist).

I suggest that programming fits into the "model" part of things, complementing mathematics. This is just an insight, not rigorously tested etc, but there are a couple of straws in the wind that make me think I am right.

First, a scientist I work with explicitly identified the combination of programming and mathematical skills as a "force multiplier" that enables someone who has mastered both to zoom past someone who is strong in only one. He explicitly mentioned a programmer (a genius at programming, way better than I am, who couldn't make as much progress as I could because he couldn't wrap his head around the "maths as a modelling tool" idea) and another person, a scientist this time, who gets stuck periodically because he can't write production quality code.

There are analogues in enterprise programming where someone who has mastered a domain *and* programming can provide an order of magnitude more business value (which is the main metric in enterprise programming) than someone who knows only banking or J2EE.

Richard Hamming says in his speech "You and Your Research" (if you haven't read this you really ought to do it right away!)

"... ``How will computers change science?'' For example, I came up with the observation at that time that nine out of ten experiments were done in the lab and one in ten on the computer. I made a remark to the vice presidents one time, that it would be reversed, i.e. nine out of ten experiments would be done on the computer and one in ten in the lab. They knew I was a crazy mathematician and had no sense of reality. I knew they were wrong and they've been proved wrong while I have been proved right. They built laboratories when they didn't need them. I saw that computers were transforming science because I spent a lot of time asking ``What will be the impact of computers on science and how can I change it?'' I asked myself, ``How is it going to change Bell Labs?'' I remarked one time, in the same address, that more than one-half of the people at Bell Labs will be interacting closely with computing machines before I leave. Well, you all have terminals now. I thought hard about where was my field going, where were the opportunities, and what were the important things to do. Let me go there so there is a chance I can do important things. ..."

If that was true in 1986, when Hamming made his speech, how much more true is it likely to be now?

The mistake most scientists make is to consider programming a "blue collar" activity, not worth focusing on (this might be true for the top 1% or so whose native genius will carry them through, but most scientists I know can use all the tools they can get). I hypothesize that in the 21st century a scientist (or engineer) who can't code is handicapped - not so badly as a poor public speaker or a poor writer would be (and both are vital skills for a research career), but handicapped nonetheless. Maybe there is a case for making (say) SICP + Python + algorithm analysis + usage of basic version control tools a part of the science curriculum.

The mistake most programmers (who loathe their cubicle farms and the brain dead enterprise codebases they maintain) make is to think that research and engineering are somehow beyond their ability to tackle. One crucial contributing factor to this perception is that most people learn mathematics as a bunch of formulae to memorize for an exam rather than as a powerful modeling tool that penetrates and simplifies complex systems. It doesn't help that, in India at least, the engineering and science educational system is fundamentally broken and emphasizes rote learning and obedience to authority over curiosity and intellectual rigor.

I write my blog to help me clarify my thinking. I couldn't care less if no one reads it. (Having said that, I have "met" some brilliant people through the blog). This and the two preceding blog entries came about because I have been struggling with nailing down a research statement - something beyond the current "I am interested in Robotics and Compilers". One of my mentors asked me to do this. I think this is good advice. More on this in the next (and last in this series) post.

Thursday, January 24, 2008

So what's wrong if you aren't an engineer?

[This is the second in a series of four blog entries. If you haven't already, you might want to read part one first.]

Nothing at all! To quote Reg Braithwaite (again! But the man has a very eloquent way of using words. I can't resist):

What's wrong with being a clerk? Nothing. It's only a problem if deep in your heart you despise clerks and you spend your life in denial about the career you have chosen. I wouldn't wish that on anyone, so I asked my readers to think about that carefully.

Likewise, we can argue about what activities from programming can or cannot be considered Engineering. But really, even if you don't do any Engineering, what's wrong with that?

Heh!

My last blog post seems to have ignited a mini firestorm. Observing how people react to an idea is sometimes more fun than the original idea itself. Reg gets it. In one of his replies to a comment (do read his blog entry), he says:

"the statement “~p implies ~q” says nothing about whether p implies q." (Typo corrected. Thanks Arne!)

This is a bit more subtle than it seems. Try substituting p = "You use mathematics" and q = "You are an engineer", and try working out "~p => ~q" and "p => q" (=> is implication and ~ is not). You may be surprised!
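For those who would rather let the machine do the working out, here is a minimal Python sketch that enumerates the truth table for the exercise above. It assumes the standard truth table for implication; the helper name `implies` and the variable names are mine, not Reg's.

```python
from itertools import product

def implies(a: bool, b: bool) -> bool:
    """Material implication: false only when a is true and b is false."""
    return (not a) or b

# p = "You use mathematics", q = "You are an engineer"
rows = []
for p, q in product([True, False], repeat=2):
    inverse = implies(not p, not q)   # ~p => ~q
    direct = implies(p, q)            # p => q
    rows.append((p, q, inverse, direct))
    print(f"p={p!s:5} q={q!s:5}  ~p=>~q: {inverse!s:5}  p=>q: {direct}")

# Rows where "~p => ~q" holds but "p => q" does not.
counterexamples = [(p, q) for p, q, inverse, direct in rows if inverse and not direct]
print(counterexamples)
```

The only row where "~p => ~q" is true while "p => q" is false is p true, q false: someone who uses mathematics and is not an engineer. The first statement is perfectly happy with that person; the second is not.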

I am continually astounded at how many people respond to arguments or claims without doing a logical analysis of what's being said. (Note: I am using "argument" and "claim" (and other words like "theory") in the logical/scientific sense, NOT in the "I had this 3 hour argument with my wife. She asked me to wash the car and I refused. At the end I was shouting and she was in tears. My theory is that women are a different species" sense).

When I was a debater, in my younger days, one of the lessons I learned early was that you don't counter an argument from your emotional or "gut" level, by calling your opponent names, or by attributing motives to him (unless you are trying to be a politician, when these tactics do pay off). The way to counter an argument is to dissect its logical structure and show it is invalid (in specific contexts, if required). Rhetoric by itself can be powerful (and many politicians know this), but when layered on top of a logically sound argument it is devastating. There are very precise ways of doing this, going back many centuries, at least as far as Aristotle and Plato.

In my last blog post, I made the claim that most software developers are not engineers. Here, I'll make another claim, even more provocative. Most software developers don't understand logic either. You think I am wrong? Quick, (assuming you are a software developer) what is the difference between the "if .. then" construct in programming languages like Java and the logical "if .. then" (aka implication, often denoted by =>)? If you, a software developer, answered correctly without having to think about it, rest assured, you are in a minority.

Using logical implication you can say (assuming we are talking about this Earth and this time stream) "If Napoleon Bonaparte was born in Europe, the sun rises in the East" and have it evaluate to True. But of course. What's so surprising? "If Napoleon Bonaparte was born in India, the sun rises in the West" or "If Napoleon Bonaparte was born in India, the sun rises in the East" also evaluate to True! :-D. ("If Napoleon Bonaparte was born in Europe, the sun rises in the West" evaluates to False.) What does the birthplace of Napoleon Bonaparte have to do with where the sun rises? ;-)

Confused? Heh! Don't worry, most people get totally zonked when they see this example for the first time. The key is to realize that implication is not causation. (Neither is correlation, but that is another topic. See this debunking of a claim that "research supports the effectiveness of TDD" for an example of correlation vs causation.) The trick I pulled is, of course, that many variants of the English "if .. then" are different from the logical "if .. then" [1]. Many people learn the truth table of implication without really internalizing what it means.

When I teach programmers first order logic, this is a constant stumbling block. The solution is simple. I ask them to think of the logical "if X then Y" construct (where X and Y are booleans or boolean valued expressions) as equivalent to a programmatic "if X then Y else True". The "else True" is key. In other words (thinking programmatically), does X have a value of true? If so, return (the truth value of) Y, else return true. Apply this to the "Napoleon" arguments and you'll get the correct (logical) answer for all possible combinations.
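The "if X then Y else True" reading can be written out almost word for word. A small Python sketch, applied to the four Napoleon statements; the function and variable names are just for illustration.

```python
def implies(x: bool, y: bool) -> bool:
    """Logical 'if X then Y', read as the programmatic 'if X then Y else True'."""
    if x:
        return y
    else:
        return True

# Facts about this Earth and this time stream.
born_in_europe = True
born_in_india = False
sun_rises_in_east = True
sun_rises_in_west = False

europe_east = implies(born_in_europe, sun_rises_in_east)
india_west = implies(born_in_india, sun_rises_in_west)
india_east = implies(born_in_india, sun_rises_in_east)
europe_west = implies(born_in_europe, sun_rises_in_west)

print(europe_east, india_west, india_east, europe_west)
```

This prints `True True True False`, matching the four evaluations above. Whenever the "if" part is false, the `else True` branch is taken and the "then" part is never even looked at.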

Why is such a confusing notion important? Proofs are logical structures using the primitives of FOL. A large part of mathematics (and science) is proofs. From science comes engineering. And if you are not using mathematics you are not an engineer (ducks for cover ;-)).

There are 5 connectives (not, and, or, if (or implication) and iff (or double implication)) and two quantifiers (universal and existential) in First Order Logic, which need to be mastered before one can go on to things like proofs and logical structure. That's the bad news. The good news is that working through a book on logic (and there are PLENTY of those) will teach you how to use logic.

Reg goes on to say (in the same comment stream)

I remind everyone that "exists x such that x ~member of E does not imply that for all x, x ~member of E."

In other words, the fact that not using math means you aren't an Engineer does not imply that using math makes you an Engineer, for whatever definition of math we agree on.

Exactly so!

The point of the last blog post wasn't "you suck you enterprise developer subhuman moron", but "Don't delude yourself". No more, no less. Do what you love. And have fun!

PS: Once you know logic, using it to construct (or deconstruct) an argument is trivial. But for those who want to make sound arguments without necessarily studying "raw" logic (I hope you are not a sw dev ;-)), take a look at "The Craft of Argument" by Joseph Williams and Gregory Colomb.

[1] Wikipedia

Part Three of this series is here.

Wednesday, January 23, 2008

Ratcatchers and Engineers

Ever wondered if what you are doing is (software) engineering? Here is a heuristic.

If you don't use mathematics in your day to day work, you aren't (an engineer). All engineers (say those who build bridges, or spacecraft, or cars) make heavy use of mathematics and/or hard sciences like Physics on a regular basis.

Now, not being an engineer is ok. Being a carpenter or a plumber is a perfectly honorable choice as is being a musician or actor or teacher. If you enjoy being a carpenter/plumber/automobile mechanic, more power to you. You should do what makes you happy and puts bread on the table. That said, a craftsman is not an engineer. The guy in the garage who fixes your engine is not an automobile engineer who could design the next generation car. Not close.

This insight was triggered by Raganwald's "No Disrespect" blog post. I quote

...Let me tell you the cold, hard, truth. You aren’t going to like this, but I ask you to believe me when I say that I am telling you this for your own good:

There is a culture of pretending business programming is more than it is. Some of you calling for more Java in University may take false hope that I am on your side. You may think that the people arguing for Scheme, Haskell, and OCaml are elitists. Wrong. They do not have a problem. You are the one with a problem because you don’t want to tell all your friends you have a job as a clerk.....

We all know what the typical software "engineer" job ad looks like. A job ad for a real engineer would look like this. (Note the absence of "10 years in java/dotNet/Ror" type crap. Note that he explicitly asks for a PhD, and tells you under what circumstances he will waive it.)

The distinguishing trait of an engineer (and Werner's job description explicitly asks for this) is that he builds and works with mathematical models to design a real world effect or system. Engineers also use other tools (simulations, prototypes, experiments etc) but (mathematical) modeling is key. A scientist, as distinct from an engineer, uses roughly the same tools to advance the state of knowledge without necessarily affecting the real world. The borders are fuzzy. You have scientist-engineers and engineer-scientists, as well as people who focus on "pure" science or engineering.

"Modeling" is a deep topic. Read the book I've referred to at the end of this blog for examples of how this works. Suffice it to say, if I am not building (say) algorithmic models to help me decide how to build my software, or to generalize, if I am not using "theory" on a day to day basis, I am not *engineering* anything. Modeling has a very precise meaning in Engineering and Science. (No, UML diagrams or "story cards" are not engineering artifacts, no matter what the methodology vendors say ;-) ).

Enterprise software is the least amenable to the modeling/engineering approach. There are exceptions but most "enterprise" developers are the equivalent of clerks, as Raganwald so eloquently points out. There is nothing wrong with being a clerk as long as you know you are one and are not deceiving yourself. Most enterprise software projects, in keeping with their clerical nature, are life draining. But hey if you like it, go for it.

Another friend of mine, who is a very talented programmer (he recently moved away from enterprise app development), when asked why he changed the focus of his career, told me (paraphrased), "Humanity has only a very limited amount of talented people. It is a crime against humanity to employ that talent to bug fix enterprise applications or futz around with Ruby On Rails deployment issues. I want to do something meaningful". (As you can see I have interesting friends :-) ).

To conclude, the title "Software Engineer" is (most of the time) a particularly deceptive one. To be accurate it should be something like "Software Maintenance Worker" or "Software Handyman", but I guess it is easier to hire someone if his job title is "Rodent Officer" instead of "Ratcatcher".

PS: Before anyone feels offended and sends me hate mail or snarky comments, please read (something like) "The Idea Factory - Learning to Think at MIT". (This was the book that jolted me out of my complacency and set my feet firmly on the research/engineering path). Then actually think about it :-). Either be happy as a rat catcher or do something about it.

[This is the first in a series of four blog posts. Read part two and part three]

Wednesday, January 16, 2008

Dev Camp Bangalore

Some folks at my former employer, Thoughtworks, are planning a DevCamp in Bangalore on February 9th, 2008.

This is an excellent idea, well overdue. Barcamps in Bangalore are overrun by photography and movie clubs and so on, and by people trying to finance the next "social networking" (gag!!) or "web 2.0" (double gag) "startup" by finding some dumb non-technical VC who doesn't know what's going on. And what technical sessions do exist are of the "Introduction to X" variety, where X is an element of {Ruby On Rails, Erlang, Fad-du-jour}, mostly cut and pasted from the web.

The Dev Camp web page has these instructions prominently displayed (emphasis mine).

"However, please assume a high level of exposure and knowledge on the part of your audience and tailor your sessions to suit. Avoid 'Hello World' and how-to sessions which can be trivially found on the net. First hand war stories, in-depth analysis of topics and live demos are best."

This should keep the fakes away (touch wood). Also there will be a few participants from Thoughtworks, and if the past is any guide the Thoughtworkers' sessions will be well worth listening to.

One of the tenets of a "camp" is that there are no passive participants. Given that this is meant for developers, I hope to attend some high quality sessions. And if I attend, in the spirit of XCamp, I'll be presenting too. I will speak on one of

Monads in Depth - what they are, the underlying mathematics, what they are good for and how to use them in your favorite language
Reinforcement Learning - Algorithms and Applications in Robotics
Proof Techniques - a tutorial for Developers
Vajra - a high performance Lisp for Robotics Programming

Of course the topic list is highly fluid and I just might end up speaking on something else, in keeping with the spirit of XCamp, but if you have a preference, send me email!

If any readers of this blog attend, stop by and say hello!

Tuesday, January 01, 2008

Sliding Into Scala

I've been dissatisfied with Java for a while now, but to give the devil his due, it does hit a sweet spot. When you are looking for a cross platform, fast, statically typed, easily deployable language with tonnes of libraries, there isn't much else available. But on the other hand the language itself is mind numbingly verbose, and since most of my coding these days is in a combination of Scheme and C, when I do switch back to Java it is as if I am suddenly running through quicksand, Eclipse notwithstanding.

Generics (the implementation, not the idea) was the first misstep in Java's evolution. It makes it easy to write code that is almost impossible to read. For the AIMA code, for example, when I used pure Java 5, generics-heavy code, students complained that they couldn't make out what the code was doing. So for code that other people have to maintain or extend, I end up writing Java in a style I call "1.4 +" - Java 1.4 + enums + new for loop + generics *for collections* (only). The actual type system of Java 5 is a fairly simple (even simplistic) one (once you've worked through some books like TAPL), but the syntax is atrocious. The present controversy on "closures" (closure != anonymous function, and I am tired of programming illiterates abusing terms with clearly defined meanings, but that is a rant for another day) convinced me that Java is on the wrong track and is well on its way to obsolescence. "Java is the new Cobol" indeed.
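On the "closure != anonymous function" aside: a tiny sketch of the distinction, written in Python rather than Java purely for brevity (all names here are made up for illustration). A closure is a function that captures variables from an enclosing scope; whether it has a name is beside the point.

```python
def make_counter():
    count = 0
    def increment():          # named function, yet a closure: it captures `count`
        nonlocal count
        count += 1
        return count
    return increment

add_one = lambda x: x + 1     # anonymous function, yet not a closure: nothing captured

counter = make_counter()
first, second = counter(), counter()

# CPython records captured variables in the function's __closure__ attribute.
named_is_closure = counter.__closure__ is not None
lambda_is_closure = add_one.__closure__ is not None
print(first, second, named_is_closure, lambda_is_closure)
```

The named `increment` keeps its own `count` alive between calls, which is what makes it a closure. The lambda has no free variables at all, so it is anonymous without closing over anything.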

What COBOL never had was an open source cross platform VM that other languages could target and a few hundred thousand libraries. Java != JVM, in other words. Couldn't I just use another language on the JVM? Well yes, but - I don't particularly like Ruby. Jython hasn't caught up with the latest version of Python. Both these languages have dynamic type systems and are slower than compiled Java. So there isn't too much actual choice given the parameters I listed above (fast, statically typed ... ).

Enter Scala. I looked at Scala a year ago, tried some sample scripts, stumbled across a bug and gave up. But things have changed since then. I've been experimenting with Scala over the last few days and I am tremendously impressed. Besides writing small scripts to explore various language features, I've been rummaging through the code for the compiler and type checker (written in Scala of course). Martin Odersky has something that is all too rare in the software world today - a strong sense of design (I am looking at you, Ruby On Rails). Scala has too many brilliant features to go into any great detail here (there are plenty of blogs that do go into detail, see this or this for example), but in essence code compresses to almost nothing, the type system is brilliant, and pattern matching is something I've wanted in a JVM language for a long time. One can even do monadic programming in Scala! I am VERY impressed.

Now don't get me wrong, there are a few rough edges. For example, one of the things it doesn't have (as of this moment) is a good reflection API. I tried to write an xUnit clone to get my feet wet and got stuck on identifying all the methods that start with "test". (It *can* be done by leveraging the underlying Java but not from native Scala). The build system is a mess (try building from source and watch the build fail for lack of heap space - this is unforgivable in 2008) and there is less of a focus on testing/regression than I'd like. And oh, if you are using the interpreter make sure you use jline ("sbaz jline" should do it - I don't know why this isn't part of the standard distribution).

But these are minor quibbles - Scala is a brilliant accomplishment. I think Sun should just declare Scala to be Java 8 and be done with it.
