Integral Calculus in Haskell
by Elben Shira
We started learning Haskell in Professor Misra’s class. Since this is the first functional programming language I have been exposed to, I decided to write an integrator.
small_dx = 0.0001 -- integrate a function f from a to b integrate f a b = integrateGen f a b small_dx 0 -- Generalize the integrate function to take -- advantage of tail recursion. -- integrate a function f from x to b integrateGen f x b dx sum | x > b = sum | otherwise = integrateGen f (x+dx) b dx (sum + (f x) * dx) f1 x = 1 f2 x = x**2 f3 x = x * exp (x**2) -- normal distribution std = 1 mean = 0 normal x = (1/(std*sqrt (2*pi)))*exp(-(x-mean)**2 / (2*std**2))
integrate takes in three parameters, f is a function, and a and b are doubles. If you are familiar with Haskell types, here is the type of integrate:
*Main> :t integrate
integrate :: (Double -> Double) -> Double -> Double -> Double
integrate calls a general version of integrate called integrateGen. integrateGen takes advantage of tail recursion, which allows the compiler to internally convert the recursive function (which would take a lot of stack space) to an iterative function. The dx in integrateGen is the usual calculus , and
sum holds what we have calculated so far.
The functions we declared are:
where
Now we are ready to test this baby out. Let’s start with an easy one:
*Main> integrate f1 0 5 5.000000000001686
This is pretty close to the actual answer:
Let’s try a more interesting problem:
*Main> integrate f3 0 1 0.8592768342831555
The true answer is:
Next time, we will use this little program to do some useful statistical calculations.
Optimizations
We generalized integrate to take advantage of tail recursion, but for some reason, the GHC interpreter does not figure this out. As a result, we can get stack overflows when we have large limits. So we are forced to compile it with optimizations turned on. First we need to add a main function:
-- replace with whatever you wanted to integrate main = print (integrate funct2 0 999)
Then we compile it with optimizations turned on:
$ ghc -O --make filename.hs
This is one thing that bugs me about functional programming. Thinking about and writing programs recursively may be more elegant and easier to understand, but then we are putting our trust on the compiler for optimizations, which is scary if we don’t understand how the compiler works. But nevertheless, Haskell is a fun and powerful language, and I can’t wait to do some other cool things with it!
Comments
Nifty, you might try using Haskell’s Algebraic datatypes to do some kind of symbolic integration, since if you could lift basic math operations into some kind of datatype, then integration symbolically ought to become a series of pattern matches which break down an equation and apply basic rules.
You might also be interested in Conal Elliot’s Automatic Differentiation stuff, it’s on his blog, some others have done Computer-Algebra type stuff.
Very neat!
“Thinking about and writing programs recursively may be more elegant and easier to understand, but then we are putting our trust on the compiler for optimizations, which is scary if we don’t understand how the compiler works.”
This is a common misconception for newbies to the language. In reality, we aren’t really putting any trust in the compiler to perform certain optimizations. It’s just a matter of understanding Haskell’s evaluation model.
In this case, you believed that making a function tail recursive should prevent stack overflows, but you didn’t realize that laziness can play a factor here. The guards in your integrateGen function only force two parameters to evaluate. The other parameters are not forced until the very end. Without realizing, you have defeated the purpose of tail recursion by creating a long chain of unevaluated thunks instead of accumulating a result as you go, a stack overflow.
The reason that optimizations fixed this is because GHC’s strictness analyzer figured out that the result will always be evaluted at the end anyway, so it causes the binary to go ahead and evaluate it earlier.
Your stack overflows are probably due to the code exceeding the limits in the Double type.
You may want to look at this in Lisp too. Here’s the equivalent code in Lisp –> http://jng.imagine27.com/articles/2009-04-09-161839_integral%20calculus_in_lisp.html
Lisp itself is really just a calculus too(lambda calculus) implemented as a programming language that has the same syntax. This lends well to many types of math coding. There is little to translate.
@Jake McArthur: you are right, my ignorance is the problem, not GHC, but I’m not sure what you mean by “the other parameters are not forced until the end.” Can you provide an example?
@Elben Shira: what Jake M means is that the compiler only evaluates the ‘x’ and ‘b’ parameters with each iteration because it has to in order to perform the comparison and continue the computation. The other parameters aren’t evaluated at all until the very end because you haven’t needed to evaluate them! So they exist as ‘thunks’ (a computation that has yet to be evaluated). Look at
seqto add strictness.Your sum parameter accumulates thunks because you don’t need it until the end. That is, for each recursive call, it might grow something like this:
0
0+(f 0)*0.0001
0+(f 0)*0.0001+(f 0.0001)*0.0001
0+(f 0)*0.0001+(f 0.0001)*0.0001+(f 0.0002)*0.0001
0+(f 0)*0.0001+(f 0.0001)*0.0001+(f 0.0002)*0.0001+(f 0.0003)*0.0001
0+(f 0)*0.0001+(f 0.0001)*0.0001+(f 0.0002)*0.0001+(f 0.0003)*0.0001+(f 0.0004)*0.0001
and so on. This is what overflows your stack. (Notice the similarities to a function that is not tail recursive!)
There are ways to avoid this problem, but since GHC’s strictness analyzer can tell what’s going on anyway it’s easiest to just use optimizations.
Nicely done.
Part of me was hoping for something a little bit more: you should take the time and effort to learn about Automatic Differentiation.
You are spot-on when you say that it’s important to understand how your compiler works. I must admit, Haskell is a challenging one in this regard, but I have a reasonably good understanding at this point, although my understanding of GHC pales in comparison to certain other compilers for other languages. Keep at it!
@Joe Fredette: nothing symbolic here: it’s a purely numerical technique. It could be improved a bit by using the trapezoidal or Simpson’s Rule. I second the AD suggestion, obviously.
@Jake McArthur and jberryman: very erudite comments. One big thing about learning Haskell well is learning when there will be thunks versus when concrete values will be produced.
Basically, common situations where evaluation will be forced, avoiding the creation of thunks, is on conditional branches, pattern matching, and uses of “seq” and friends.
You might like John Hughes’ famous article “Why Functional Programming Matters”, which goes over a few examples like this. It was written in the 1980’s and uses Miranda, an older language that Haskell is descended from, that should be easy to understand. You should be able to find it online easily.