- Problem 1 : Transferring Mathematical Intuition

Using AI effectively often demands a deep understanding of the mathematics underlying whatever AI 'paradigm'/algorithm you use. For example, to even think about whether Neural Networks are a good solution for a problem, one needs to understand Linear Algebra at a very intuitive level. To many people, maths == a set of equations or symbols to be manipulated. This kind of manipulation and rearrangement of concepts to achieve precise effects is something programmers are naturally good at, so the tendency is even more deeply rooted in good programmers.

For example, a lot of us have learned the equation F = m * a, where 'F' is Force, 'm' is mass and 'a' is acceleration. To get through our exams we treat this as a kind of pluggable system in which two quantities are known (or can be derived from the given data) and the third has to be calculated. This looks fairly simple. To check whether you understand the reality underlying the equation, ask yourself this: "Does Force cause acceleration? Or vice versa? Or both? Why? When?" Using an equation to *calculate* a quantity is different from being able to think effectively in terms of the concepts underlying the equation.

To use mathematics effectively one has to constantly translate between the world of mathematics and the domain of the problem. Very few people (including mathematicians) feel the need to acquire this skill. People who use mathematics to get work done (e.g. physicists, astronomers, race car designers) acquire it out of sheer necessity. Forcing oneself to translate (and asking students to translate) back and forth between English (without using mathematical terms) and Mathematics is a valuable exercise.
(Try doing this for a simple differential equation and you will see what I mean.)

Thus, it isn't possible for someone to grab a few equations out of the latest AI/Pattern Recognition/Robotics book and apply them straight away to solve real world problems. To even know what is possible takes an understanding of the mathematical underpinnings of the various "families" of AI algorithms, and this intuition takes a long time (at least for folks like me) to acquire. If programmers are just shown some algorithms or mathematical proofs, it is very unlikely they will be able to program an effective system, or even maintain one as its environment changes. This is complicated by the fact that intuition may take a few years to acquire but has to be transferred in a week or less.

While there is no perfect solution to this, I find that one can transfer large "chunks" of intuition through carefully selected (sequences of) motivating examples. In my first "iteration" of teaching, I said (in effect), "O.K., here is the (basic) theory. This is the algorithm in pseudocode. Use this and you will get the results you want". Everyone seemed happy, but the whole effort "thrashed" for quite a while before delivering the (astoundingly effective) results. So these days, I try to instill an understanding of the theory distinct from the programming effort. It is more a matter of "teaching how to fish" while giving the student enough "fish" so he doesn't starve till he learns to fend for himself. Thus instead of just throwing out the algorithms for (say) prediction and monitoring of data streams using Markov models, I start with real world examples of prediction vs monitoring and slowly feed in the maths (equations first, trivial programs next, then proofs, then real world programs). Given sharp "students" (as most of the programmers I interact with on a daily basis are, thank God!) this works surprisingly well.
I sometimes feel I am spinning a rope bridge across yawning chasms but the notion of a "slice" through a system works just as well in mathematics as in agile "story card" based implementation. Once a few cables are thrown across, programmers feel brave enough to explore the abyss a few feet at a time without too much assistance.
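To make the "trivial programs" stage concrete, here is a minimal sketch of what such a program might look like for prediction vs monitoring with a first-order Markov chain. It is an illustration only, not the material actually used in these sessions: the function names (`fit_transitions`, `predict_next`, `surprising_transitions`), the toy "idle"/"busy" data and the 0.2 threshold are all assumptions made up for this example.

```python
from collections import Counter, defaultdict


def fit_transitions(stream):
    """Estimate first-order transition probabilities P(next | current) by counting."""
    counts = defaultdict(Counter)
    for current, nxt in zip(stream, stream[1:]):
        counts[current][nxt] += 1
    return {
        state: {nxt: n / sum(followers.values()) for nxt, n in followers.items()}
        for state, followers in counts.items()
    }


def predict_next(model, current):
    """Prediction: the most probable next state, or None if `current` was never seen."""
    followers = model.get(current)
    if not followers:
        return None
    return max(followers, key=followers.get)


def surprising_transitions(model, stream, threshold=0.2):
    """Monitoring: report observed transitions whose model probability is below `threshold`."""
    alerts = []
    for i, (current, nxt) in enumerate(zip(stream, stream[1:]), start=1):
        # Transitions the model has never seen get probability 0.0.
        p = model.get(current, {}).get(nxt, 0.0)
        if p < threshold:
            alerts.append((i, current, nxt, p))
    return alerts


# Toy, made-up history of a machine alternating between two states.
history = ["idle", "busy", "busy", "idle", "busy", "busy", "idle", "busy", "idle"]
model = fit_transitions(history)

print(predict_next(model, "idle"))
print(surprising_transitions(model, ["idle", "busy", "error", "idle"]))
```

The same transition table serves both tasks: prediction asks "what usually comes next?", while monitoring asks "was what just happened unlikely under the model?". Seeing that one set of numbers answers both questions is the kind of intuition the equations alone tend not to convey.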
- Problem 2 : Getting beyond "right" and "wrong"

In spite of my best attempts at simplifying and communicating, my "students" (I think of them as peer programmers rather than as students, hence the quote marks) sometimes misunderstand and "screw up" the concepts, and write strange programs or advance baffling arguments. In the beginning my response was "That is NOT what I said! THIS is what you need to do. PLEASE look at the examples/notes/code. Aaaargh!!!! Not Again!". A more effective way is to try to figure out what the student's mental model is, faulty though it may be. So now, instead of reacting to a totally off track argument by (mentally) banging my head on the nearest wall, I say to myself "What he says makes sense to him. So what mental model or thought process has he adopted that causes him to see the world in a fashion that makes this argument logical?". The moment I can identify this precisely, I am able to come up with an illuminating example that demonstrates the error. And sometimes it turns out they are on to something important and it is my perceptions that need fixing. "Learn Twice" indeed.
- Problem 3 : Ability to Do != Ability to Teach

Sometimes the "teaching" gets so fatigue-inducing that I think to myself "I could have coded up all this in one tenth the time it takes me to communicate it to someone else". My client (being wiser than I) insisted on the teaching approach. What I have realized is that many people (including Yours Truly) know how to do many things but are often unable to explain how they do them or teach others to do likewise. When you teach others you have to know each concept in a crystal clear fashion and also grok all the interconnections between the concepts. Teaching often involves rearranging concepts in increasing order of complexity and interdependence, seeking real life and programming examples that illustrate each facet of a concept with great clarity, and slowly transitioning into complex real world problems and solutions. After teaching others, I understand many things at a much deeper level than I used to.

That being said, teaching consumes massive amounts of time. I have to work 10 hours or more to prepare for a three hour "lecture". Thus "teaching" conflicts with "doing". For the time being, I'll focus on being a programmer more than a "teacher" type, but I wonder how others manage.

There is a lot more I could write but this post is long enough. In another entry I will look into the difference between "Real world" and "Toy" AI.
Ravi Mohan's Blog
Monday, January 30, 2006
To Teach is To Learn Twice ... And More
I am sometimes asked by clients to implement Artificial Intelligence-based solutions to problems involving massive datasets, real-time requirements etc. A nontrivial part of this work is to transfer the knowledge of AI algorithms and the underlying maths to the client's programmers. After doing this a few times, I now have a clearer idea of some of the difficulties involved and some partial solutions for them.