- the first step in mastering math is to learn to read the notation, just like learning the syntax of a new programming language
- the second is to grasp the reality expressed by the notation at a gut level, like understanding the paradigm and patterns lying underneath a programming language, like, say, beginning to grok "oo"
- the third is to use that understanding to create new possibilities
- the fourth is to use a programming language to embody and refine those possibilities, thus creating programs that do what has never been done before.
Ravi Mohan's Blog
Tuesday, February 21, 2006
Yet Another Math Milestone
A few days ago, I discovered an error in the mathematics of a paper (on neural network optimization) being prepared for publication by a very eminent scientist. My drawing attention to this discrepancy in the proof has led to a total recasting of the approach to the problem and I will now be listed as a co-author of the paper.
Hmm yeah. Whatever. So how is this significant?
Well this is the first time I have used my skill in mathematics (vs my skill in programming) to contribute significantly to a scientific effort. In an old blog post I had theorised that the acquisition of "mathematical thinking" would follow a four step path. I said ..
26 comments:
Interesting post. I have always been interested in math but only to the level of higher secondary calculus and quadratic equations. Can you suggest a book chain (to use your terminology :) ) for some (if not all) of the 4 steps?
chandrakant,
me suggest a book chain in math? bwaa ha ha! To recommend a "book chain" you need an expert. I am still very much a beginner in math. :-)
Since you came out of IIT, I would say that you are in a better (mathematical) position than I. :-) I should be asking you
ok anyway since you asked, these are the books *I* used to learn basic math (take this with a MOUNTAIN of salt; unlike in programming, I am on very shaky ground with maths and far from an expert).
Calculus, Vol. 1: One-Variable Calculus with an Introduction to Linear Algebra (Hardcover)
Calculus, Vol. 2: Multi-Variable Calculus and Linear Algebra with Applications (Hardcover)
Mathematical Analysis, Second Edition (Paperback)
by Tom M. Apostol
Linear Algebra and Its Applications, David Lay
Concrete Mathematics, Knuth et al
Introduction to Applied Mathematics (Hardcover), Gilbert Strang
The rest of the books I used have to do with specific areas of AI and would probably depend on what exactly you are doing
I am fortunate enough to be in a position to have to *use* this mathematics on a daily basis. That gives the impetus to get through steps 2 and 3. I have not yet seen books that do that.
all the above fwiw.
The Nature of Statistical Learning Theory
by Vladimir N. Vapnik
Thanks for the suggestions. Well, let me say that graduating from IIT in no way indicates a better mathematical position. The people who used to do well in mathematics did so because they had a genuine love for it. Others just wanted to pass the JEE ;)
I do admit that I got a rush out of formulating and solving mathematical equations, but the books that were given did not help much. But then again, I don't think many math books are written to keep the reader engrossed. Perhaps it's assumed that the reader is interested enough by himself rather than depending on an external source of motivation like the book. I will take a look at some of the books, thanks again. By the way, congrats on the math paper. Let me know when it's published.
the paper (a neural network paper - the math is "embedded") is going to be submitted by June/July (to some nn conference, I have no clue which) the last I heard. It still has to be accepted etc but the folks doing the work seemed fairly confident. Now the whole thing is being rethought and redone (thanks to me :-P). When it gets published I'll post a link here.
Ravi,
Congratulations!! Guess, this should have been a really satisfactory effort :)
Ananth
Yet Another Math Milestone, hmmm, Mind sharing what the earlier milestones you achieved were ?
Ravi,
Seems to me that there are layers of abstraction that you are not considering. At one level you have mathematical foundations for CS like RDBMS links to the relational algebra. At a higher level you have applications that are built on this like SQL. IMO, the agilists claim that you can get the right solution for a problem at the higher level not at the lower level. I still have to see a way of developing quick sort *algorithm* using refactoring, but once I know the algorithm I can use refactoring to get the right implementation.
These two are complementary.
-- KD
KD,
you say "Seems to me that there are layers of abstraction that you are not considering."
But this is PRECISELY what I am saying. There ARE multiple layers and to be effective you have to master *many* layers.
Like you say, the "pure agilists" and the "mathematicians" focus on different layers (SQL and Relational Algebra, to use your example) and consider them sufficient.
I said (in the entry) "One is not a substitute for other."
You say "These two are complementary."
So, yes, I agree with you that the two skillsets/modes of thinking are complementary. That is precisely what I was trying to say.
Ravi,
Are you not saying that both the skills are needed for getting the best possible solution for a problem? If not, I misunderstood the post.
My point is that whereas the skills are complementary, one does not need both the skills to develop great programs. The conceptual skills are more important for developing good programs and the mathematical foundations can be ignored for the development of most applications. I understand that there are areas where both skills are needed, but they are just a handful.
-- KD
KD,
Ok Now I get you.
I am indeed saying that both skills are necessary to develop "masterpieces vs identikit" programs (to use my own words from the post).
You said
"The conceptual skills are more important for developing good programs and the mathematical foundations can be ignored for the development of most applications."
I agree that you can *develop* programs without maths (or understanding the domain). I never said otherwise.
What I claim is that such programs will be suboptimal compared to what could be developed by a person who understood BOTH the domain AND programming. Imagine someone who knows banking (say) inside out AND is an expert programmer vs someone who is just a good programmer and depends on "conversation" or "documents" for knowledge transfer.
Are you saying there would be no difference in the quality of the programs they write and the solutions they envision?
If so, we'll have to agree to disagree and leave it at that :-)
I claim there will be an order of magnitude difference between someone who knows and understands a domain intimately and *also* programs well as compared to someone who just programs well.
KD,
You also say "The conceptual skills are more important for developing good programs and the mathematical foundations can be ignored for the development of most applications"
This is simply not true in my opinion.
here is a simple (trivial?) example.
A lack of knowledge of basic type theory is at the root of most people not understanding the type systems of even "common" languages like java and c++.
How many people know how to use the new generics feature of java well? Of those who do, how many know the *exact* weaknesses or "holes" in the half assed implementation that comes with java 1.5?
As for type theory, the same goes for concurrency, transactions, security, scalability etc. All have (maths based) theory underlying them. You CAN create programs that embody these concepts without knowing the underlying math. You will NOT be as effective as if you *also* ("also" not "only") knew the math.
And if a programmer doesn't understand (vs being able to use blindly) the basic underpinnings of the language he uses, how can he write "great" programs? This would be like a writer in English who doesn't know the difference between a noun clause and a prepositional clause.
Do such "natural" writers (who know nothing of grammar but write exceptional prose) exist? Of course! Are they a very very small minority ? yes.
Are most great writers conversant with the rules of grammar? yes.
Given someone who has a modicum of writing talent and wants to become a great writer, would we advise him to learn grammar (besides writing regularly etc)? yes.
substitute "programming" for "writing" and "programmer" for "writer" in the sentence above and that is the exact point I am trying to make.
I know too many people who are excellent programmers (in the sense that they give you beautifully factored programs that get the job done) but miss subtleties in the domain (and thus miss the chance to write programs that are 100 times shorter, faster etc) because they have no mental tools besides "refactoring", "test first" etc.
I am NOT saying "learn math and throw away all your knowledge of programming(like refactoring)".
I AM saying if you ADD mathematics (and other meta-domain and domain specific skills) to your programming skills you will be MORE effective IRRESPECTIVE of your domain.
aahh this should perhaps go into another blog entry. Thanks for the comments. Helped me clarify my thoughts.
Ravi,
You said
"A lack of knowledge of basic type theory is at the root of most people not understanding the type systems of even "common" languages like java and c++."
Not really. It has more to do with lack of application of mind and effort. Next time you meet a C++ programmer, ask him/her how many chapters of Stroustrup they studied. Ask them also what "gotw" stands for in the C++ community. You can also ask whether they have used the Boost libraries. My guess is that someone who answers all these questions rightly will be a great programmer. What we need is not basic type theory, but the understanding of the concepts C++ implements using those theories. An understanding of the theory behind ADTs and ASTs doesn't hurt, but is not a necessary condition.
You also said
"As for type theory, the same goes for concurrency, transactions, security, scalability etc. All have (maths based) theory underlying them."
All roads lead to mathematics ;-). Let us take concurrency. A basic method of avoiding deadlocks is to acquire resources in the same order in all threads. I know that there is underlying maths - but where is the use of it in a typical application? Security - avoiding buffer overflows is simply following good practices (like using strn* instead of str* in a C application). In what way is understanding how to exploit a buffer overflow going to make my application more robust? I can give plenty of examples for each one of these - where application of math allows one to arrive at a solution and once the solution is known it is a matter of applying that concept over and over again.
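For readers who have not met the lock-ordering rule mentioned above, here is a minimal sketch in Python. The `Account` class, `transfer` and `run_demo` are made up for illustration and are not from anyone's real code; the only point is that both threads take the two locks in the same global order (by id), so neither can hold one lock while waiting for the other.

```python
import itertools
import threading

class Account:
    """Hypothetical resource guarded by its own lock."""
    _ids = itertools.count()

    def __init__(self, balance):
        self.id = next(Account._ids)   # fixed global ordering key
        self.balance = balance
        self.lock = threading.Lock()

def transfer(src, dst, amount):
    if src is dst:
        return  # nothing to do, and avoids taking the same lock twice
    # Acquire locks in a fixed global order (lowest id first),
    # no matter which account is the source.
    first, second = sorted((src, dst), key=lambda acct: acct.id)
    with first.lock:
        with second.lock:
            src.balance -= amount
            dst.balance += amount

def run_demo(rounds=1000):
    a, b = Account(100), Account(100)

    def move(src, dst):
        for _ in range(rounds):
            transfer(src, dst, 1)

    # Two threads transferring in opposite directions: the classic
    # deadlock setup if each thread locked "its" source first.
    t1 = threading.Thread(target=move, args=(a, b))
    t2 = threading.Thread(target=move, args=(b, a))
    t1.start(); t2.start()
    t1.join(); t2.join()
    return a.balance, b.balance
```

The rule is easy to apply once known; whether it is the right concurrency model for the problem is a separate question.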
Finally, you mentioned that
"Do such "natural" writers (who know nothing of grammar but write exceptional prose) exist? Of course! Are they a very very small minority ? yes.
Are most great writers conversant with the rules of grammar? yes."
Analogies can be misleading. Our field has struggled for long enough because of the 'software engineering' misnomer and comparing building software with skyscrapers. For each and every great programmer who has CS/maths background I am sure you can find another great programmer without such a background. I seriously feel you need to rethink the basis.
That said, it might be better for us to agree to disagree (like you said) and go back to our coding.
BTW, my blog (blog.jaliansystems.com) is up with a first post. Maybe I will add a digression on this subject sometime later today.
-- KD
KD this is a good discussion.
I have a feeling we are speaking past each other. I agree that to be good at programming you should know particular languages (say C++) well, read the canonical books (say Stroustrup) and understand the canonical libraries of that language (say boost) well. Where I differ from you is the notion that this is enough.
Not so (imo). You need MORE than languages, (language) reference manuals and knowledge of libraries. You *also* need to understand theory (and yes, maths).
Let me try once more.
you say (knowledge of) boost/Stroustrup/gotw ..
and then
"My guess is that someone who answer rightly for all these questions will be a great programmer."
fwiw, I disagree. Still this seems to revolve around what makes a programmer "great" . You believe an excellent knowledge of c++ implementation issues makes one a great programmer. I think it needs (quite a bit) more. I won't take this further.
" What we need is not basic type theory, but the understanding of the concepts C++ implements using those theories."
Huh? If you don't know what is being implemented, then how can you judge the quality of implementation? I don't get it. You need to know BOTH (this is my point BOTH not either).
I would say you need to know WHAT is being implemented (basic type theory, which will teach you what the various forms of polymorphism (for e.g.) are) AND have knowledge of how well (or not) these insights are implemented in your favorite language.
"Let us take concurrency. A basic method of avoiding deadlocks is to acquire resources in the same order in all threads. I know that there is underlying maths - but where is the use of it in a typical application?"
Ha! Got you! :-)
See the assumption you make by equating "concurrency" to "thread based concurrency". thread != concurrency.
Both thread based and process based concurrency are instances of a "family" of concurrency mechanisms called "State based concurrency". There are (at least) 2 more major families of concurrency ("declarative" and "message passing"), both of which are, in 90% of situations, far more applicable than thread or process based concurrency.
And the (basic) math behind concurrency is called "pi calculus". And no, if someone says he "understands" concurrency (a very very hard claim to make in the first place) without any notion of pi-calc I'd say he is one of those folks who "knows not and knows not that he knows not".
Understanding concurrency in terms of "lock and release resources in the same order" is a *dangerous* oversimplification.
Let us talk about "typical applications".
In *any* application which needs concurrency, the programmer should be able to select the *appropriate* model of concurrency.
If he doesn't know of these, he has to choose between variants of "thread based", "process based" or "event based" (all subsets of "state based" concurrency), never even knowing there might be other alternatives.
People often use (for e.g.) "Pthreads" because that is the only thing they have seen.
Compare the "performance" of an erlang based webserver (YAWS, using "message passing concurrency") vs that of (say) apache (http://www.sics.se/~joe/apachevsyaws.html).
There is no reason why a similar web server cannot be written in C but it is NOT possible using threads.
Sorry for the "lecture" but if I am in the business of writing heavily concurrent apps, I'll damn well need some knowledge of what is possible and what is not in each form of concurrency. And yes, to do this properly you'll need some math.
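To give a rough flavour of the message passing style, here is a toy sketch in Python. It is not how YAWS or Erlang work internally: Python threads and `queue.Queue` are used only as a convenient carrier, and `counter_actor` and `run_demo` are hypothetical names. What it does show is the discipline: one "actor" owns its state privately, nobody else touches that state, and the only way in or out is a message.

```python
import queue
import threading

def counter_actor(inbox, outbox):
    """Owns `total` privately; other code can only send it messages."""
    total = 0
    while True:
        msg = inbox.get()
        if msg == "stop":
            outbox.put(total)   # reply with the final state, then exit
            return
        total += msg            # no lock needed: only this actor sees total

def run_demo(n_senders=4, per_sender=100):
    inbox, outbox = queue.Queue(), queue.Queue()
    actor = threading.Thread(target=counter_actor, args=(inbox, outbox))
    actor.start()

    def sender():
        for _ in range(per_sender):
            inbox.put(1)        # send a message instead of sharing a variable

    senders = [threading.Thread(target=sender) for _ in range(n_senders)]
    for s in senders:
        s.start()
    for s in senders:
        s.join()

    inbox.put("stop")
    result = outbox.get()
    actor.join()
    return result
```

There is no explicit lock anywhere in the user code, so the lock-ordering question never arises; the cost is that everything must be phrased as messages.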
anyway on to "security"
You say
"Security - avoiding buffer overflows is simply following good practices (like using strn* instead of str* in a C application)."
who said "security" == "buffer overflow" ? *I* didn't .
All roads lead to C++? ;-)
"security" >> "buffer overflow". There are security issues even in software written in languages that don't have buffer overflows. And yes you can't really understand security if you skip the maths.
But that is another post.
"Analogies can be misleading. Our field has struggled for long enough because of the 'software engineering' misnomer and comparing building software with skyscrapers."
Correct. But the answer is not to brush aside all analogies but to point out the exact flaw in the analogy used.
Also, just as dangerous is the error of trying to understand important concepts through the prism of their implementation in a particular language (C/C++ in this case).
This is *exactly* what "just programmers" do. They try to grok complex phenomena through a particular language (say c++) and/or particular paradigm (objects) and/or particular methodology (agile).
Such oversimplification (imo) is dangerous and limits the programmer's horizons, leading to a "frog in the well" syndrome.
" For each and every great programmer who has CS/maths background I am sure you can find another great programmer without such a background. I seriously feel you need to rethink the basis."
I don't think so. It all comes back to the definition of "great". If by "great" you mean someone who understands c++/java/lisp/j2ee whatever enough to get some code written, of course you don't need to know any math (or have any other domain knowledge).
Every single "great" (top 1%) programmer I have seen is well versed in the underlying theory of his "area of greatness" (including the math) . Let us take C/C++ as an example .
From Dennis Ritchie bio
"I was born Sept. 9, 1941 in Bronxville, N.Y., and received Bachelor's and advanced degrees from Harvard University, where as an undergraduate I concentrated in Physics and as a graduate student in Applied Mathematics. The subject of my 1968 doctoral thesis was subrecursive hierarchies of functions."
Kernighan is a Computer Science Prof at Princeton. Take a look at his publications and see how "math influenced" they are (http://www.cs.princeton.edu/~bwk/bwkbib.html).
Prof. Stroustrup is a PhD in Comp Sci (from his home page: "Born in Aarhus Denmark 1950. Cand.Scient. (Mathematics and Computer Science), 1975, University of Aarhus Denmark. Ph.D. (Computer Science) 1979, Cambridge University, England.").
As for Stepanov, here is the list of his papers; see how much "math" he seems to know (http://www.stepanovpapers.com/).
And of course all of C/C++ came out of the Bell (and AT&T) research labs.
Thus the "greatest" programmers in C++ all seem to be pretty solid in their mathematical background. Where are the C++ hackers of equal calibre who are clueless in maths?
Anyway, to a large extent you are countering a strawman. I *never* said a "great programmer" (however you define it) *should* know math (though a lot of them seem to be good at it).
What I said is "A knowledge of maths (OR other meta domain OR domain specific skills) PLUS excellent programming skills will make one a *better* programmer than learning ONLY programming." What exactly is the problem with this position again?
To make that more concrete, if someone knew all the template stuff but could not explain how to estimate if a *newly provided* algorithm is O(n) vs O(n log n), I don't care what he knows about boost/gotw whatever.
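As a rough illustration of what "estimating" can look like in practice, here is a small Python sketch. All the function names are hypothetical. It counts basic steps at doubling input sizes and looks at steps per item: for an O(n) routine that figure stays flat, for an O(n log n) routine it keeps creeping up by a constant with every doubling. This is an empirical sanity check, not a proof; the real explanation comes from the math, e.g. the recurrence T(n) = 2T(n/2) + n solving to n log2 n.

```python
def linear_scan(items):
    """Hypothetical O(n) routine: returns the number of basic steps taken."""
    steps = 0
    for _ in items:
        steps += 1
    return steps

def merge_sort_steps(items):
    """Counts the element moves a top-down merge sort would make.

    It does not actually sort; it only follows the recursion
    T(n) = T(n/2) + T(n/2) + n, with T(1) = 0.
    """
    if len(items) <= 1:
        return 0
    mid = len(items) // 2
    # Merging the two halves moves every element once.
    return merge_sort_steps(items[:mid]) + merge_sort_steps(items[mid:]) + len(items)

def steps_per_item(algorithm, sizes):
    """Steps divided by n, for each input size n."""
    return [algorithm(list(range(n))) / n for n in sizes]

def looks_linear(algorithm, sizes=(1024, 2048, 4096, 8192), tolerance=0.05):
    """True if steps-per-item is (nearly) constant across doubling sizes."""
    per_item = steps_per_item(algorithm, sizes)
    return max(per_item) - min(per_item) <= tolerance * max(per_item)
```

With power-of-two sizes the merge sort counter gives exactly log2(n) steps per item, so the growth per doubling is easy to see by eye.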
Conversely if someone knows all sorts of fantastic theory but can't produce (good) code, I would consider him suboptimal.
Thus, in essence I still stand by my claim that programming and mathematics (in all its forms) (and other domain knowledge like, say, economics) provide complementary views and insights and an aspiring programmer should (imo) attempt to be strong in both.
Anyhow I will take this up with you when I see you on the coming Tuesday. Typing these long replies is too time consuming.
Back to coding.
Hi Ravi,
One question. How do you stop yourself from getting distracted while pursuing the study of maths? I shall cite an example from my experience. I started off studying probability and I encountered situations which convinced me that I needed to go back to old school lessons about permutations and combinations. While I was reading them, I realised I needed to understand more of the binomial theorem, which later took me to higher algebra... in the whole process I have left probability far behind.
Praful,
It might help if you learn math *in order to understand something else*, particularly in the beginning. Most of my maths has been acquired through trying to understand AI algorithms.
For a friend of mine, his interest in equity trading and economics drives an interest in mathematics. "Grinding through" maths doesn't seem to help.
So once you learn probability up to, say, Bayes' theorem, you could write a Bayesian Filter. Learn a bit more and write a Hidden Markov classifier. And so on.
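To show how little code stands between Bayes' theorem and a working filter, here is a minimal naive Bayes sketch in Python. The class name `BayesFilter`, the two labels "spam" and "ham", and the whitespace tokenising are all assumptions made for illustration. It also assumes each label has been trained on at least one message before `spam_probability` is called.

```python
import math
from collections import Counter

class BayesFilter:
    """Toy two-class naive Bayes text filter (labels: "spam" and "ham")."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = Counter()

    def train(self, label, text):
        self.doc_counts[label] += 1
        self.word_counts[label].update(text.lower().split())

    def spam_probability(self, text):
        # Assumes both labels have seen at least one training message.
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_docs = sum(self.doc_counts.values())
        log_score = {}
        for label in ("spam", "ham"):
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            # log P(label) + sum of log P(word | label), add-one smoothing
            score = math.log(self.doc_counts[label] / total_docs)
            for word in text.lower().split():
                score += math.log((counts[word] + 1) / (total_words + len(vocab)))
            log_score[label] = score
        # Bayes' theorem: normalise the two joint scores into a posterior.
        top = max(log_score.values())
        weight = {k: math.exp(v - top) for k, v in log_score.items()}
        return weight["spam"] / (weight["spam"] + weight["ham"])
```

A quick way to try it: train on a handful of messages per label, then call `spam_probability` on a new message and see which side of 0.5 it lands on. The "naive" part is the assumption that words are independent given the label, which is exactly the kind of thing the probability theory lets you reason about.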
Vivek,
I think you overestimate me. Help India out of darkness? Yeow! Ego boosting for sure but not really true! :-)
Nothing so grand my friend. Just trying to lead a meaningful life out of the 9 to 5 grind.
Ravi,
I should have followed PG's style and left the examples out of the comments. Don't flog the examples - flog the concept.
You said:
"fwiw, I disagree. Still this seems to revolve around what makes a programmer "great" . You believe an excellent knowledge of c++ implementation issues makes one a great programmer. I think it needs (quite a bit) more. I won't take this further."
I think you are sticking to the example too literally. I am talking about the attitude here and someone with a flair for learning.
You also referred to examples of concurrency and security from my comment. Boss, these are only examples. I did not set out to enumerate all available forms of security or concurrency. What I am saying is that for the given examples, once a solution is found, applying the solution diligently is what makes someone a great programmer. The diligence requires discipline and that makes a lot of difference.
I still stick to my gut feel that being a great programmer does not need deep understanding of theory. Looks like we are going to have a nice discussion on tuesday. Looking forward to it.
-- KD
kd,
"What I am saying is that for the given examples, once a solution is found, applying the solution diligently is what makes someone a great programmer. The diligence requires discipline and that makes a lot of difference."
I agree 100 % with all this, in the sense that diligence is absolutely required.
The ONLY disagreement I have with you is with "being a great programmer does not need deep understanding of theory."
Here I disagree with you 100 %.
(And that is quite all right. If two people think 100% alike, one of them is unnecessary :-))
"Looks like we are going to have a nice discussion on tuesday. Looking forward to it."
Totally agreed. I DEEPLY appreciate your thoughtful comments, which make me think a lot. (new blog entry coming up on this SPECT SOON).
Ravi, I want to take away some points to ponder from this. I wrote down what I thought you said and also some inferences on my own. Let me know what you think.
Assuming you want to develop truly effective programs,
1. It's better to give as much importance to deepening your knowledge in the domain as deepening your knowledge of programming languages involved in the domain.
2. For someone like a software programmer who hops from domain to domain, this means he/she has to spend enough time in one domain to get enough knowledge about it to write effective programs.
3. Since jumping from domain to domain involves a lot of re-learning from scratch, (say switching from a project in healthcare to supply-chain logistics), learn something like mathematics which pervades both.
4. There may be other fields of study that involve lot of domains but math is the one that covers most.
5. Even though at this point you are not sure how mathematics pervades both, the only way to know for sure is to learn mathematics and find out.
6. Alternatively, you could choose to stick to one or two domains for life, and strive to develop a deep understanding in those just by sheer learning by experience during the vast amount of time you spent on it.
Chandrakant,
Some quick comments.
"1. It's better to give as much importance to deepening your knowledge in the domain as deepening your knowledge of programming languages involved in the domain."
replace "as much importance" with "equal importance" and "programming languages" with "programming skills" and I agree to a first approximation.
"2. For someone like a software programmer who hops from domain to domain, this means he/she has to spend enough time in one domain to get enough knowledge about it to write effective programs."
I am not very sure that domain knowledge/insight is a function of time *only* (though time is a factor too). It is more a function of the mental tools one has in one's repertoire, working with good people (in the domain), the nature of the challenge one takes up within the domain etc etc
"3. Since jumping from domain to domain involves a lot of re-learning from scratch, (say switching from a project in healthcare to supply-chain logistics), learn something like mathematics which pervades both."
Hmm I am not sure whether I agree with this. I don't agree with the "rather than X learn Y" idea.
All I am saying is (a) mathematics, like programming, is a meta domain skill (b) knowing the domain gives you insights that knowing only programming doesn't (c) maths is also an "insight provider" in many domains (d) there is no conflict between learning maths vs understanding a domain
now whether all that can be cast as "learn math *as opposed to* learning a domain", I think not. I guess I need to do more thinking on this one.
"4. There may be other fields of study that involve lot of domains but math is the one that covers most."
I don't remember saying this but I think this is (broadly) right.
"5. Even though at this point you are not sure how mathematics pervades both, the only way to know for sure is to learn mathematics and find out."
hmmm I'm fairly clear as to how maths pervades *some* domains. I also don't get what "both" means in the sentence above.
Regds, Ravi
6. Alternatively, you could choose to stick to one or two domains for life, and strive to develop a deep understanding in those just by sheer learning by experience during the vast amount of time you spent on it.
I think there's something to be gained by expanding the concept of "programmer" somewhat. As it stands--with "programmer" taken to mean roughly "someone who implements things in a programming language"--the analogous mathematical concept is something like "calculus student".
It is true that many problems have elegant foundations in mathematics, but there's another level of structure (meta-structure?) that's essential to understand: abstraction. This often overlaps with what you're calling "implementation," but there's a bit more of interest behind it.
It is a truism that everything in programming is built on abstraction--C++ is an abstraction over some low-level parts of C, C is an abstraction over some of the tedium of assembler, assembler is an abstraction over all manner of hardware activity, and even that is expressed in bits, an abstraction over analog electric circuits. (And, of course, that description was an abstraction over some messier truths.) An excellent programmer has to be good at using and developing abstraction to simplify his thinking, while being aware of what's lost so that he can decipher unexpected (buggy) behavior.
At some point, we close the loop and begin to develop an abstract concept of "abstraction." This is, I think, a more direct approach to the sort of meta-domain thinking you're talking about than mathematics. Mathematics is sufficiently general and foundational that it can often subsume interesting topics from other fields, but by reducing them to math it ends up avoiding general cross-domain meta-thinking.
I recommend you take a look at programming language semantics, or category theory.
You'll find a nice intersection of math/logic and programming.
Most mathematicians are not up for that stuff -- too abstract.
I spent the past few years doing what is largely classified as 'AI', 'Machine Learning' or 'Datamining' programming. Learning the 'syntax' or notation is surely 1st. Working with domain experts (like quants in the finance context) was what really helped me see where I needed to work on my math skills and 'take it to the next level'.
At the end of the day - it is really rewarding to be able to sift through the cutting edge of contemporary AI/ Machine learning THEORY (white papers, etc) and then REALLY synthesize and apply the ideas in a useful context.
Go for the math degree :) But stay as someone who APPLIES the ideas (a software professional perhaps :) ).
-Chris
Ravi,
I think I get the gist of what you are saying. Thanks.
Are you familiar with Lambda the Ultimate? It's a really cool programming languages weblog with links to lots & lots of papers; the folks there speak category theory, a bunch of different logics, etc. (in addition to sometimes being practically minded). It's my absolute favorite web-site, and if you're looking for a formal approach it's very worth your time. You won't see any kernel-based methods or other COLT/ML resources there, but for programming theory it can't be beaten.
Clayton, Chris, Jim,
Thank You for your great comments.
@Clayton
I don't think your notion of abstraction as meta structure contradicts anything I said. Good Point though.
I am still not sure this kind of high level cross domain abstraction is "more direct" than mathematical abstraction. Have to think about that. I *intuit* that they are complementary.
Chris,
I agree totally with the notion that math skills need to be "grounded" by using them in practical contexts. Hence :-)
Jim,
Yes, Lambda the Ultimate is one of my favorite sites as well.
Thanks all for the great comments.
Ravi
Hi Ravi,
Can you suggest a book chain (to use your terminology :) ) for learning Java? I will be moving into a J2EE project soon.
regards,
Praful