Tuesday, December 2, 2008

Error Propagation: an unexpected beauty

So, say you have a couple of measurements, x and y, with some associated uncertainty; the true value of x might be, say, 3 units above or below the measured value, and the same for y. The problem I'm currently working on required me to propagate error through several stages - and also to maintain a "confidence value" for that error (i.e., a probability that the observed error will not exceed the predicted limit).

It turns out that there is a little mathematical nook crammed full of simple and useful methods for dealing with exactly such a problem; a helpful colleague introduced me to "Error Propagation" (see link for handy formulae). The amount of error expected in x (up to 3 units) is termed dx, and similarly for y. In choosing dx, you are not really saying (contrary to what I first thought) that the actual error will never exceed 3 units; instead, you first decide how sure you want to be that your error won't be large enough to surprise you - say 95% - and then choose a value for dx that will rarely (1 time in 20) be exceeded. For x + y, the expected error of the result, d(x + y), is given by this very familiar formula:

d(x + y) = (dx^2 + dy^2)^(1/2)

Does that formula look familiar? Yes, Pythagoras all over again - dx and dy are now the lengths of the two shorter sides of a right-angled "error triangle", and the result is the length of the hypotenuse... wow, we started with probability, and now we have a result that can easily be expressed geometrically. Unexpected and beautiful!
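
To make that concrete, here's a minimal sketch in Python (the function name propagated_error is just my own, and it assumes the errors in x and y are independent):

    import math

    # Error in x + y, given independent errors dx and dy:
    # the hypotenuse of the "error triangle".
    def propagated_error(dx, dy):
        return math.hypot(dx, dy)  # sqrt(dx**2 + dy**2)

    print(propagated_error(3.0, 4.0))  # 5.0 - a 3-4-5 error triangle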

What if you want an error value that will never, ever be exceeded? Then you'll need a much larger dx; to get, say, a confidence of 99.99%, you will need dx to be at the fourth standard deviation (see Wikipedia on "normal distribution"). The trouble with certainty is that it costs - you might have to increase dx quite a lot to reach that level, assuming your error is normally distributed.
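
If you're curious how a confidence level turns into a number of standard deviations, here's a sketch (assuming a normal distribution, and that scipy is available; sigmas_for_confidence is my own name for it):

    from scipy.stats import norm

    # Number of standard deviations a two-sided interval must span
    # to capture `confidence` of a normally distributed error.
    def sigmas_for_confidence(confidence):
        return norm.ppf(0.5 + confidence / 2.0)

    for c in (0.95, 0.99, 0.9999):
        print("%.2f%%: %.2f sigma" % (c * 100, sigmas_for_confidence(c)))
    # roughly 1.96, 2.58 and 3.89 sigma - certainty costs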

The really neat thing, for my particular application, is that once you've chosen dx and dy with a certain confidence, then d(x + y) will have a matching confidence: set the confidence for the input, and your output - umpteen calculations later - will have the same confidence for its error value. Time to go play with code and see if I really understand all this...
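
Here's roughly what I have in mind - a hypothetical Measurement class (names are my own) that carries a value and its error bound together, assuming the errors at each stage are independent:

    import math

    # A value with an error bound chosen at some fixed confidence level.
    # Adding measurements combines their errors in quadrature, so the
    # sum carries the same confidence as its inputs.
    class Measurement:
        def __init__(self, value, error):
            self.value = value
            self.error = error

        def __add__(self, other):
            return Measurement(self.value + other.value,
                               math.hypot(self.error, other.error))

        def __repr__(self):
            return "%g +/- %.3g" % (self.value, self.error)

    # Three measurements, each with a 95%-confidence error bound:
    total = Measurement(10, 3) + Measurement(20, 3) + Measurement(5, 4)
    print(total)  # 35 +/- 5.83 - still a 95% bound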
