Ravi Mohan's Blog

Tuesday, June 27, 2006

Farewell to Windows

I just switched the Thinkpad over to Ubuntu Linux. I am generally not prone to waxing lyrical about Linux distros, but Ubuntu is really awesome: it installed flawlessly on the Thinkpad in about 20 minutes flat (and that includes the Windows partition resize step).

I had forgotten what it was like to mind meld with your computer during the last few years, when I used Windows + Cygwin almost exclusively on my laptop.

I am home again :-)

Thursday, June 22, 2006

SW Methodology - Shaolin Style

One advantage of blogging is that it enables you to articulate your vague, inchoate thoughts more coherently and, more importantly, to develop your thinking. After ranting about the inappropriate use of TDD, I've been able to develop my insights further. Expressing ideas helps to form them, as Paul Graham says.

The most fruitful way to think of any predefined software methodology seems to be that it is an encoding of certain practices intended to teach you certain lessons.

In Karate (or any martial art) there are sequences of fixed movements called kata. In Western music, there are fixed forms called "etudes" (a rough analogue would be the varna in South Indian classical music). Conceptualising a methodology as an "encoding" of what worked for certain people in certain situations is an alternative to "methodology as religion".

The practitioner of a kata (or etude) seeks to perfect each component of the larger structure (a punch in a kata, a particular phrase in an etude) and since absolute perfection is elusive, the practice may go on for years or a lifetime.

If one thinks of the 12 practices of Extreme Programming (say) as forming a "fixed form" that teaches certain principles of software development (vs as some kind of sanctified system of "rules to follow or you'll burn in hellfire") then it becomes easy both to strive for perfection of each component (say unit tests) while practising, and to modify or discard the same component during performance as need demands.

Combat is different from kata practice in a dojo. When you are in a dark alley facing someone who tries to gut you with a knife, the important thing is to win that fight, not conform to the structure of the kata. When you are practising the kata, the idea is not to be creative and do "free flow" but to focus on mastering the form exactly as expressed in the kata. The immediate goal of kata is "perfection of the form" (and the lessons learned therefrom), not "effectiveness in attaining the ultimate purpose".

Guitar etudes have odd fingerings and chord sequences which will never make it into a lead solo. But when you practise the etude you do what the notation says, not what might best express your musical intent or satisfy an audience. While practising, a guitar maestro will be able to perform an etude perfectly as written in the notation, awkward fingerings and all, but when he is on stage, improvising a solo, he really doesn't think of "exactness of fingering". He focusses on musical effect and allows his fingers to go where they will.

Once you accept that a set of rules telling you how to develop software is just a "fixed form", it is easy to discern the two errors that software methodology practitioners (not to mention the "evangelists" who have found the "One True Path") consistently fall into.

The first error is to apply the criteria of practising the form to the performance, in other words, to insist that you should fight only using the movements of the kata you learned. "Ohhh you are not writing a test, you are not Agile... Code without tests is 'legacy code'. Shame on you".

The second, and more subtle, error is to apply the criteria of performance to the practice of the form, i.e. you treat the kata as a combat situation. "Aargh writing tests for javascript is impossible so I won't write any tests because obviously this unit test stuff is impractical" or "Writing tests means I write more code so I will be slower. So I will never write any tests".

Of late there have been some attempts to use the underlying principles of "kata" to improve software development skills. This is the "practise the form perfectly in a non combat situation" approach.

What has been missing is the more important second half -- "combat is not kata". In a hypothetical combat situation, in that dark alley with someone lunging at you with a knife, it isn't very important that you take up the perfect "horse stance" and then do a perfect roundhouse kick. If you end up in hospital, leaking blood from half a dozen stab wounds, it doesn't matter if you "took the stance perfectly". As Bruce Lee once said, "On the street, no one knows you are a black belt".

In the alley, facing a knife thrust, using a garbage can lid to block the knife and knock the attacker out cold will work fine, never mind that in Shotokan Karate there is no "Garbage Can Lid Kata". Of course if you've practised your forms (and other practices like kumite) well for many years, you won't be sweatily flailing around, panting for air and driven by adrenaline, like an untrained person will. You will be in control, using the minimal amount of movement and force, staying balanced all the time, and aware of things like distance, force, balance, vector of attacks and so on.

You can even learn something from the combat situation and incorporate it into your practice ("hmm fighting in the dark in a cluttered alley is quite different from sparring in well lighted uncluttered dojos so how do I learn to fight better in such conditions?"). Who knows, you may be able to distill your experiences to create a "dark room kata" for others to learn from. And the practitioners of this "dark room kata" will in turn diverge from its fixed patterns in their own combat situations. And that is exactly as it should be.

One problem with software development is that "practice" and "performance" are mixed up thoroughly. Both generally happen within the context of a project someone else is paying for. A software developer learns "on the job". It is often impossible to change this state of affairs, but it is helpful to distinguish the "practice" and "performance" aspects, the kata and combat, the etude and the concert.

When you are learning Extreme Programming, do exactly what Kent Beck teaches in the "white book". When you are actually developing software it doesn't matter if you aren't "exact". If you have difficulties in applying a particular practice, make a note and use what works. Later you can think of whether the problem is one of insufficient practice of the technique (in which case you practise more) or whether the technique itself is a misfit for your circumstances (in which case you modify the technique or discard it entirely). There is no intrinsic merit in force fitting the combat situation into the framework of your "school" of fighting. You'll just get stabbed.

There is no "perfect kata" which will reward rigid adherence with universally victorious combat ability. There is no "perfect methodology" which will guarantee success if it is followed "perfectly". Extreme Programming (to use one particular methodology) is not magic. It is just the encoding of what Kent Beck (and others) learned by trying to get better in the projects they worked on. By all means learn from Kent. Don't worship him (or XP). Ignore the high priced consultants. Your situation and requirements are likely different from what Kent faced. Nobody cares what methodology you follow as long as you write code that delivers value. Arguments about "Shotokan Karate is better than Shaolin Kung Fu" are on the same level as "XP is better than RUP (or waterfall for that matter)".

Combat is not practice. A dojo is not an alley. And vice versa.

Tuesday, June 20, 2006

The 'Cult Of the MBA'

In Joel Spolsky's article, there is one sentence that I wish I had written:

The cult of the MBA likes to believe that you can run organizations that do things that you don't understand.

On second thought though, the ideas behind that sentence look a little more complex than they first appear. I have seen managers (used interchangeably with "folks who have MBAs" for the rest of this post) make a (positive) difference. Maybe it is just that competent people, with or without an MBA, do make a difference in the situations they are part of. So the question becomes "Is there a significant differential advantage conferred by an MBA?" Somehow I doubt it. (I am talking of differentials of capability, not differentials of social standing or ability to climb organizational ladders.) I have seen too many clueless morons with MBAs screw up situations beyond repair.

Taking another vantage point, for a company that is essentially about doing cutting edge, innovative things (with or without software) and redefining the way the world works, it would be insane to hand the driving wheel over to a (non engineer) MBA. Which is probably why Larry and Sergey agonized over selecting their CEO for so long, and why Jobs is better for Apple than Sculley. OTOH if a company is involved in "software services" or "offshoring" or whatever, maybe it makes sense to hand over the reins to an MBA who can then "manage 'resources'", "scale up operations" etc.

And anyway "cults" are not confined to MBAs. On the geek side of the fence we have the Cult of Apple, The Cult of Agile etc. Nothing new here folks, move right along.

I do NOT use Yahoo 360

so please don't "add me as a friend" in Y360 :-)

I get 2 or 3 mails every day saying that "Mr/Ms X has requested that you add them as a friend". The thought is appreciated. Seriously. But there is NO point in adding you guys to an account I do not use.

I use practically nothing from Yahoo except a webfacing email id and Yahoo messenger, and the latter is for historical reasons, not because I am impressed with its quality. Those of you who know me well enough to be "added as friends", please use either my gmail id or ping me on yahoo messenger or skype or... :-) .

I hope I didn't come across as arrogant but I really do NOT like yahoo's flashy ad riddled pages or its Y360 service. Please understand. Thanks in advance.

:-)

Monday, June 19, 2006

Compilers, TDD, Mastery

I was talking to a friend yesterday about the upcoming compiler project when he asked me a question "Will you be coding the compiler in an Agile fashion? I mean using TDD etc?". This turns out to be an intriguing question.

I got sick of the "silver bullet" style evangelization of the Agile Religion and formally gave up "agile" some time ago. However I do believe in programmer written unit tests being run frequently and that is part of "agile". What I do not believe in is the notion of the design "emerging" from the depths of the code, like Venus rising from the waters, by just repeating the "write test, write code, refactor" cycle, a practice otherwise known as TDD.

Anyway, I decided to google for compilers and TDD and came up with some absolute gems. Here's the first.

On the comp.lang.extreme-programming mailing list, I came across this absolutely hilarious exchange.

In the midst of a tedious discussion on whether XP scales to large projects, "mayan" asked,

I am not asking what kinds of companies are doing XP work - I believe a lot of them are, and successfully. What I am asking is anyone using XP for large size/high complexity/high reliability problems. To be more specific - stuff like optimizing compilers, OS kernels, fly-by-wire systems, data-base kernels etc

Ron Jeffries, one of the "gurus" of agile, replied,

. I'm not aware of any teams doing compilers or operating systems using XP, but having done both, I'm quite sure that they could be done with XP, and the ones I did (at least) would have benefited from it, even though they were successful both technically and in the market.

Aha, this looked interesting! Someone actually thinks a compiler can be written in an XP (and presumably TDD) fashion. Mayan issued a challenge to Ron which looked like this:

Excellent: lets talk about the following problem: write an optimizing back-end for a C compiler (assume we purchased the front-end from someone). How would we use XP, or adapt XP to it?

Some problems with compiler back-ends:

- its fairly clear what has to be done (core {intermediate-language/instruction selection/register-allocation/code-generation} + set of optimizations); its fairly clear what order they have to be done in (first the core, then a particular sequence of optimizations)

- you don't really need customer input. Having a customer around doesn't help.

- you can't demo very much till you have a substantial portion of the code written - the core (say about 30-60klocs) - no small (initial) release

- you had better get your IL (intermediate language) data-structures right - if you pick ASTs or 3 or 4 address form, you will do fine for the basic "dragon-book" optimizations, but later on you will either run into severe problems doing more advanced optimizations, or you will have to rewrite your entire code base [probably about 100-150klocs at this time]. Is this BUFD?

- you had better think of performance up front. Toy programs are easy; but a compiler has to be able to handle huge 100kloc+ single functions. Many heuristics are N^2 or N^3. Both run-time efficiency and memory usage end up being concerns. You can't leave optimization till the end - you have to keep it in mind always. It also pretty much determines your choice of language and programming style.

- TDD may be applicable for some of the smaller optimizations; on the other hand, for doing something like register-allocation using graph coloring, or cache blocking - I wouldn't even be able to know where to begin.

- The basic granule (other than the core) is the optimzation. An optimization can be small (constant propagation in a single basic block) or large (unroll-and-jam, interprocedural liveness analysis). The larger ones take multiple days to be written. Integrating them "often" is either pointless (you integrate the code, but disable the optimization) or suicidal (you integrate the code, but it breaks lots and lots of tests; see below). Best case: integrate once the optimization is done.

- Its not easy to split an optimization into subproblems; so typically one (programmer/pair) works on an optimization. For the larger ones, if it needs to be tweaked, or fixed, the unit that wrote it is the best unit to fix it. The overhead to grok a couple of thousand lines of code (or more!) vs. getting the original team to fix it is way too high.

- Testing, obviously, is a lot more involved. Not only do we have to check for correctness, we have to check for "goodness" of the produced code. Unfortunately, many optimizations are not universally beneficial - they improve some programs, but degrade others. So, unit testing cannot prove or disprove the goodness of the implementation of the optimization; it must be integrated with the rest of the compiler to measure this. Further, if it degrades performance, it may not be a problem with that optimization - it may have exposed something downstream from it.

- Typical compiler test suites involve millions of lines of tests. They tend to be run overnight, and over weekends on multiple machines. If you have a badly integrated optimization, you've lost a nights run. And passing all tests before integration is, of course, an impossibility. Even a simple "acceptance" set of tests will check barely a small percentage of the total function in the compiler.

Hmmm.....does this still look a lot like XP to you? I can see that at least 1/3rd of the XP practices being broken (or at least, severely bent).

Based on your experience, do you disagree with any of the constraints I outlined?

Mayan

I expected Ron to post a reply explaining how all the above can fit into the Agile/XP framework, but there was only [the sound of crickets chirping].

On the XPChallengeCompilers page, Ron repeats his claim that "I've written commercial-quality compilers, operating systems, database management systems". Hmm. Yeah. Whatever. He doesn't make any useful points about how one would actually go about doing something like this.

On the same page, Ralph Johnson has (as one would expect) a more thoughtful and articulate viewpoint about how XP would apply to writing a (simple) compiler, but he focusses more on the fact that

" DoTheSimplestThingThatCouldPossiblyWork is true for compilers, as well. My problem is that people seem to think that the simplest thing is obvious, where my experience is that it is often not obvious. One of the things that makes someone an expert is that they KNOW the simplest thing that will really work."

Now this makes sense. There is no hint of the design "emerging" from "average" developers grinding through the tdd cycle and there is a strong hint that you do have to understand the domain of a compiler well before you can even conceive of the "simplest thing". He concludes with

"The XP practices will work fine for writing an E++ compiler. However, I think there will need to be some other practices, such as comparing code to specification, as well as appreciating the fact that you sometimes must be an expert to know what simple things will work and which won't." Ahh, the blessed voice of rationality.

Another area where Agile falls flat on its face is when dealing with concurrency. See the XP Challenge concurrency page for the flailing attempts of an "agilist" to "design by test" a simple problem in concurrency. Tom Cargill proposed a simple synchronization problem and Don Wells (and some of the agile "gurus") attempted to write a test exposing the bug and didn't succeed till Tom pointed the bug out.
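To see why such a bug can slip past an ordinary unit test, here is a minimal Python sketch. It is not the problem posed on that page; the `Account` class, `withdraw_steps` and the two runner functions are hypothetical names I made up for illustration. The `yield` marks the point where a thread could be preempted between the check and the update, so the bad interleaving can be replayed deterministically instead of hoping a real scheduler produces it.

```python
class Account:
    """Hypothetical shared state with no locking."""
    def __init__(self, balance):
        self.balance = balance


def withdraw_steps(account, amount):
    """Check-then-act withdrawal, split where a context switch could occur."""
    ok = account.balance >= amount   # check
    yield                            # another thread may run here
    if ok:
        account.balance -= amount    # act on a possibly stale check


def run_sequential(account, amounts):
    """What a plain unit test exercises: one withdrawal at a time."""
    for amount in amounts:
        for _ in withdraw_steps(account, amount):
            pass
    return account.balance


def run_interleaved(account, amounts):
    """Replay the unlucky schedule: every check runs before any update."""
    steps = [withdraw_steps(account, amount) for amount in amounts]
    for step in steps:
        next(step)                   # run each withdrawal up to the yield
    for step in steps:
        for _ in step:               # resume each one to completion
            pass
    return account.balance


print(run_sequential(Account(100), [80, 80]))   # second withdrawal is refused
print(run_interleaved(Account(100), [80, 80]))  # both pass the check; overdrawn
```

The sequential run behaves, so a test written in that style stays green. The bug only shows up once you already know which interleaving to force, which is to say, once someone has found the bug by thinking rather than by testing.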

Coming back to compilers, there are projects like the Pugs project, led by the incredible Autrijus Tang, in which unit tests are given a very high priority. I couldn't find any references to the Pugs design "evolving" out of the TDD cycle, however. It seems as if they build up a suite of programmer written tests and run it frequently. I can see how that practice would be valuable in any project. Accumulating tests and an "automated build" are the practices I retained from my "agile" years. AFAIK the Pugs folks don't do "TDD", expecting the design to emerge automagically.

This whole idea of "emergent design" (through TDD grinding) smacks of laziness, ignorance and incompetence. Maybe if you are doing a web->db->web "enterprise" project you can hope for the TDD "emergence" to give you a good design (well, it will give you a design :-P), but I would be very surprised if it worked for complex software like compilers, operating systems, transaction monitors etc. Maybe we should persuade Linus to "TDD" the Linux kernel. TDD-ShmeeDeeDee! Bah! :P

Update: Via Jason Yip's post, Richard Feynman's brilliant speech exposes what exactly is wrong with the practice of "Agile" today. An excerpt

I think the educational and psychological studies I mentioned are examples of what I would like to call cargo cult science. In the South Seas there is a cargo cult of people. During the war they saw airplanes with lots of good materials, and they want the same thing to happen now. So they've arranged to make things like runways, to put fires along the sides of the runways, to make a wooden hut for a man to sit in, with two wooden pieces on his head like headphones and bars of bamboo sticking out like antennas--he's the controller--and they wait for the airplanes to land. They're doing everything right. The form is perfect. It looks exactly the way it looked before. But it doesn't work. No airplanes land. So I call these things cargo cult science, because they follow all the apparent precepts and forms of scientific investigation, but they're missing something essential, because the planes don't land.

Saturday, June 17, 2006

[DevNote] Interpreter To Compiler

I generally use a wiki (moinmoin in case anyone is interested) on my laptop to keep notes on various projects I work on. However the wiki has not been transferred from my old laptop to my new one. With all the travelling I am doing, this won't get done till the next month or so. Meanwhile I'll use this blog to record (some of) my dev notes (when I can get net access ...sigh.....). Since Blogger doesn't support tagging or categories, I'll prepend a "[DevNote]" string to the title of the post so readers can ignore the post if they wish. These notes are likely to be cryptic and incoherent and most readers should just skip them.

To transform an interpreter to a compiler,

  1. Rewrite interpreter in CPS form.
  2. Identify Compile time and Runtime actions (CTAs and RTAs)
  3. Abstract RTAs into separate 'action procedures'. Use thunks to wrap any weird control structures where necessary
  4. Abstract Continuation Building procedures (CBPs)
  5. Convert RTAs and CBPs into records. Now the eval on the parse tree spits out a datastructure.
  6. Write an evaluator for the resulting data structure making sure to (a) implement register spilling (b) 'flatten' code to generate "assembly like" code.
  7. TODO : investigate the pros and cons of Bottom Up Rewriting Systems (BURS) vs the scheme above vs generating a (gcc) RTL like intermediate language
  8. TODO : investigate the effects on stack discipline and garbage collection when the various schemes are adopted. Which is more convenient?
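
A rough Python sketch of the shape of these steps, for a toy expression language. Everything here is an assumption made for illustration: the tuple-based expression format, the names `eval_cps`, `compile_expr` and `run`, and the instruction tags. It also simplifies heavily: the continuation structure is flattened straight into instruction order rather than kept as explicit continuation records, and the evaluator uses a stack, so there is no register allocation or spilling.

```python
# Toy expressions: an int literal, a str variable name,
# or a tuple ("add" | "mul", left, right).

def eval_cps(expr, env, k):
    """Step 1: interpreter in CPS; k receives the value of expr."""
    if isinstance(expr, int):
        return k(expr)
    if isinstance(expr, str):
        return k(env[expr])
    op, left, right = expr
    return eval_cps(left, env, lambda a:
           eval_cps(right, env, lambda b:
           k(a + b if op == "add" else a * b)))


def compile_expr(expr, code=None):
    """Steps 2-5: walking the tree is the compile time action; each runtime
    action is emitted as a record (a tuple) instead of being executed."""
    if code is None:
        code = []
    if isinstance(expr, int):
        code.append(("const", expr))
    elif isinstance(expr, str):
        code.append(("load", expr))
    else:
        op, left, right = expr
        compile_expr(left, code)
        compile_expr(right, code)
        code.append((op,))
    return code


def run(code, env):
    """Step 6, in part: evaluator for the flat, assembly-like records."""
    stack = []
    for instr in code:
        tag = instr[0]
        if tag == "const":
            stack.append(instr[1])
        elif tag == "load":
            stack.append(env[instr[1]])
        else:
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b if tag == "add" else a * b)
    return stack.pop()


program = ("add", "x", ("mul", 2, "y"))
env = {"x": 1, "y": 5}
print(eval_cps(program, env, lambda v: v))   # interpreted directly
print(compile_expr(program))                 # the emitted records
print(run(compile_expr(program), env))       # same value via compiled code
```

The point of the exercise is that the tree walk happens once, in `compile_expr`, while `run` only ever sees the records.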

Thursday, June 01, 2006

StockHive goes live

I am still internet challenged and will be offline for the next month or so, but this is worth commenting on.

My friend Yogi has launched StockHive, an online tool for folks investing in India's stock markets.

If you are an investor, you should definitely look at StockHive. For the techies amongst you, the site is built on Rails.