Ravi Mohan's Blog

Sunday, September 17, 2006

Spreadsheets and Self Sabotage A Case Study

This is a true story. Names have been changed to protect the innocent (and the guilty).

Once upon a time,long long ago, in a faraway land, there was this company, a typical enterprisey company we'll call BankCo (in reality the company doesn't work in banking, the domain is similar but different, and I am just obfuscating things as explained above). BankCo was running all its "enterprise" operations off a set of excel sheets and at some point decided to redo it "properly". The folks at BankCo were essentially businessmen, not programmers so they decided to get some consultants at the cutting edge of technology. Enter AgileCo. AgileCo decided to rewrite the business logic as a 3 tired OO app using all the latest blub technologies.

The result? the BHS project (in AgileCo most projects have three letter acronyms) has frustrated developers, grumbling clients and absolutely yucky code.

Now there were a lot of things the team did wrong. Fundamentally the project was doomed the moment the team was "volunteered" for such an aggressive schedule (committed to the client by the sales manager, "agile planning" notwithstanding) but the thinking might have been that "hey we just have a bunch of spreadsheets, how hard can it be to convert them into a 3 tiered Blub app" . The (non intuitive) correct answer is "potentially very very hard".

Spreadsheets are very subtle beasts. They just look very simple. There are these rows and columns of cells and even non programmers can quickly create useful "programs" so it should be fairly easy to translate a set of spreadsheets into OO programs right?

Wrong.

For one, a spreadsheet is essentially a functional program.[warning - pdf]

[Excerpt]

"It may seem odd to describe a spreadsheet as a programming language. Indeed, one of the great merits of spreadsheets is that users need not think of themselves as doing “programming”, let alone functional programming — rather, they simply “write formulae” or “build a model”. However, one can imagine printing the cells of a spreadsheet in textual form, like this:

A1 = 3 A2 = A1-32 A3 = A2 * 5/9

and then it plainly is a (functional) program. Thought of as a programming language, though, a spreadsheet is a very strange one. In particular, it is completely flat: there are no functions apart from the built-in ones(*). Instead, the entire program is a single flat collection of equations of the form “variable = formula”. Amazingly, end users nevertheless use spreadsheets to build extremely elaborate models, with many thousands of cells and formulae."

Simon goes on to state how he and his colleagues worked on removing the "procedural abstraction barrier" in the functional language embodied by a typical spreadsheet. (There are folks working on integrating more powerful functional languages and other paradigms into spread sheets. The key point is that a spreadsheet is essentially functional. Besides, a spreadsheet propogates changes in values and formulae automatically, unlike imperative and oo languages (e.g if the cell A1 is set to 5 instead of 3 in the above sample, A2 automatically becomes -27).

Besides value propagation , the 2 dimensional spatial relationships between spreadsheet cells makes it unlike most programming interfaces in existence today.

" One of the most successful end-user programming techniques ever invented, the spreadsheet, uses what is essentially a functional model of programming. Each cell in a spreadsheet contains a functional expression that specifies the value for the cell, based on values in other cells. There are no imperative instructions, at least in the basic, original spreadsheet model. In a sense each spreadsheet cell pulls in the outside values it needs to compute its own value, as opposed to imperative systems where a central agent pushes values into cells. In some sense each cell may be thought of as an agent that monitors its depended-upon cells and updates itself when it needs to. As Nardi (1993)(Nardi 1993) points out, the control constructs of imperative languages are one of the most difficult things for users to grasp. Spreadsheets eliminate this barrier to end-user programming by dispensing with the need for control constructs, replacing them with functional constructs. " [ Link

Thus, when you try to convert the functionality of a complex set of spreadsheets to a blub-dot-net program, you are, in essence translating from a functional language with value propagation to a very different paradigm (OO in this case) and also adding things like persistence and multi user interaction which a normal spreadsheet does not have to deal with. This project needs competent developers who have enough resources (including time) to do a proper job,and are expert in both functional and object oriented programming with a keen awareness of the chasm between the two paradigms and a knowledge of how to bridge it. You wouldn't have much use for a German to English translator who didn't know any German, would you?

Besides this technical misstep there are a host of others that make AgileCo's BHS project an interesting case study.

The first is the premature time commitment to clients, bypassing the essence of AgileCo's favorite planning methodology. This is the killer mistake. The second is an arbitrary team structure with unclear roles and shaky team buy in. The third factor worth discussing is how the dominant methodology at AgileCo discourages deep thought by developers and encourages them to grab a "User Story" and hit the keyboard as soon as possible. This eliminates the chances of success for finding a technical approach that attacks the core of the problem. The fourth is how a totally unnecessary onshore/offshore team split adds dysfunctional management, unnecessary paperwork and communication overhead. The fifth and most crucial insight is about how dumb management can totally screw up a project's chances of success beyond repair.

Postscript:- The team is now on deathmarch mode and works on Saturdays and some Sundays. Of course the managers don't work on weekends. The brighter developers are looking for other jobs.

(*) Most spreadsheets have some kind of extension mechanism. In Excel, for e.g. you can use VB script to write your own extensions. But this means (a) and end user has to learn a programming language and (b) subtle errors can creep in when you mix imperative and functional code. Debugging spreadsheets is a nightmare at the best of times.

8 comments:

Anonymous said...

Completely agree on this.

Leaving aside the self sabotage i am engaged in a very similar project attempting to replace Excel Spreadsheets in a bank.

I am surprised by the number of mistakes in assumptions,requirements and actual execution made by very smart people because they enter the project with the thought "Its only a Spreadsheet".

On another note your link to Linus Torvald on the left points to Peter Norvig.Is that intentional ?!!

Ravi said...

"On another note your link to Linus Torvald on the left points to Peter Norvig.Is that intentional ?!!"

Heh! Cut n' paste error. Good catch. Fixed. Thanks

Anonymous said...

Send this to the bank in question..

Hopefully they'll get back to using Spreadsheets..

Anonymous said...

Will using online spreadsheet like EditGrid be helpful to them?

Keith said...

Excel (like SQL) is an amazing, amazing universe. All my heavy math coding I do in Excel to test back and forth thoroughly before moving it into "real code". Not only, BTW, do you have VBScript, but you can write DLLs that can be called from within a spreadsheet. And of course you can do everything you can do in VBScript through Excel's COM object in any COM capable language.

Here is neat trick. Say you have two column (A1:A20, B1:B20) and you want the sum of the product of A and B. Most people would do a third column C such that Cn = An * Bn and then sum(C1:C20). In fact, you do it in one cell. Write =Sum(A1:A20*B1:20). If you just hit enter after typing the cell, you'll see it doesn't work. But instead, hit shift-ctrl-enter! By magic, it's a whole different thing now!

Anonymous said...

I worked on a similar project in the financial industry some years ago that was used to manage $20 Billion in portfolio assets. The best hope for a useful system result is to design a domain specific language, the requirements for which must be carefully teased out of the spreadsheets by a developer/architect with significant domain expertise.

I took this approach and produced a successful replacement that was even end-user programmable. A useful place to look for inspiration is some of the fantastic spreadsheet software for the NeXT computer in the early nineties. Lotus had a spreadsheet for that platform which was far superior to Excel today and showed how spreadsheets really should be done.

The inability to name columns and rows something meaningful instead of A1/B2/C3 is the biggest problem with creating truely powerful and reusable spreadsheets.

Other related problems with "modern" spreadsheets include data-sourcing, versioning, and copy-and-paste errors. When minor mistakes can cost millions, the justification for re-doing it correctly is clear. However, this is not stuff for standard engineers or consultants. You need a small team of star developers. At least one must have a knack for language design.

blinks said...

Why wouldn't AgileCo (or any other company attempting to engineer a shift from spreadsheets) use a functional language for development, then?

Anonymous said...

Jocelyn Paine and others have done wonderful work in the direction of improving spreadsheets and converting spreadsheets to auditable languages.