
Why Computers are Bad at Algebra
Season 1 Episode 31 | 14m 24s
The answer lies in the weirdness of floating-point numbers and the computer's perception of a number line.
[MUSIC PLAYING] KELSEY HOUSTON-EDWARDS: People tend to think that computers are infallible when it comes to math.
But computers make mistakes too, except they're deliberate mistakes, ones that computer scientists can exploit to develop faster computer algorithms.
[THEME MUSIC] In 1994, Intel recalled, to the tune of $475 million, an early model of their Pentium processor after they discovered it was making arithmetic errors.
Arithmetic mistakes, like the one Intel's Pentium processors were making, are often rooted in the computer's unusual version of the real number line.
Here's roughly what my picture of the real number line looks like.
It's kind of boring.
You can zoom in, but it'll look exactly the same.
It's a long straight line.
You can move to the right until you're bored.
But it will always look the same.
You'll never hit infinity.
But here's a computer's number line.
There's a beginning, an end, and some non-uniformly spaced points in between.
Let's explore the structure of this computer number line.
You've probably heard that computers use binary, or base 2 numbers.
In good old base 10, the digit positions represent increasing powers of 10: 1, 10, 100, 1,000, and so on.
But in base 2, the digit positions represent increasing powers of 2: 1, 2, 4, 8, 16, 32, and so on.
When we write 391 in base 10, it means 3 times 10 squared, plus 9 times 10 to the first, plus 1 times 10 to 0.
But the same number written in base 2 is 110000111, which means it's 2 to the eighth, plus 2 to the seventh, plus 2 squared, plus 2 to the first, plus 2 to the 0.
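If you want to check a conversion like that yourself, here's a quick Python sketch (my own illustration, not something from the episode):

    # Convert 391 to binary and back, using only built-ins
    n = 391
    print(bin(n))                            # 0b110000111
    print(2**8 + 2**7 + 2**2 + 2**1 + 2**0)  # 391, summing the powers of 2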
Bits, the basic unit of a computer, have two distinct states, usually represented as 0 and 1, which makes it very natural to represent numbers in binary.
Here's a string of 64 bits.
Taken literally in base 2, this string is an integer between 0 and 2 to the 64th minus 1.
But in order to produce negative numbers and fractions, we'll need to interpret it as a floating point number, which is basically binary scientific notation.
You might have learned that it can be useful to write very big and very little numbers in scientific notation.
In base 10, we can write any non-zero number as a number between 1 and 10, which is called the significand or mantissa, times 10 raised to a power.
The exponent is positive for big numbers and negative for little numbers.
Well, the same thing works in base 2.
For example, we can write the number 11 as 1.011 times 2 cubed, where the multiplication and exponentiation are done in base 2.
In this case, the mantissa is also written in binary.
So 1.011 means 1 plus 1/4 plus 1/8.
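A two-line Python check of that arithmetic (again my own illustration):

    mantissa = 1 + 0/2 + 1/4 + 1/8   # the binary 1.011 spelled out in base 10
    print(mantissa)                  # 1.375
    print(mantissa * 2**3)           # 11.0, so 11 is 1.011 times 2 cubed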
Keeping in mind that it's modeled off scientific notation, let's figure out how to decode this 64-bit floating point representation of a number.
The first digit tells us the sign of the number.
It's 0 if the number is positive and 1 if the number is negative.
There are 52 bits that represent the mantissa, and 11 bits for the exponent.
Notice that when we wrote a number in binary scientific notation, the mantissa always started with a 1.
It's 1 point something, so there's no reason to record that one.
It's implicit.
The digits of our mantissa are actually the digits after the binary point.
Convert this mantissa to base 10 and call it m. Also convert the exponent to base 10 and call it E. E is between 0 and 2 to the 11th minus 1, which is 2,047.
So the exponent is always non-negative.
The first bit tells us whether the entire number is positive or negative, essentially whether it's on the right half or left half of the number line.
But in scientific notation, the exponent has a separate sign.
It's positive for numbers with absolute value bigger than 1, and negative for numbers close to 0.
We need to represent positive and negative exponents.
To solve that problem, we subtract 1,023 from E. If E is less than 1,023, the exponent becomes negative after subtracting.
So this string of 64 bits represents the number negative 1 to the sign bit, times 1.m, times 2 to the power E minus 1,023.
In that form, all the math is happening in base 2, and you get a binary number out.
It's more familiar when you convert everything to base 10 first and then get the answer in base 10.
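Here's a minimal Python sketch of that decoding recipe (my own illustration; it assumes a normal, finite, non-zero double and ignores special cases like zeros, infinities, and subnormals):

    import struct

    def decode_double(x):
        # Reinterpret the float's 8 bytes as a 64-bit unsigned integer
        bits = struct.unpack('>Q', struct.pack('>d', x))[0]
        sign = bits >> 63                  # 1 sign bit
        E = (bits >> 52) & 0x7FF           # 11-bit stored exponent, 0..2047
        frac = bits & ((1 << 52) - 1)      # 52 explicit mantissa bits
        m = frac / 2**52                   # the part after the implicit leading 1
        return (-1)**sign * (1 + m) * 2**(E - 1023)

    print(decode_double(6.5))    # 6.5, reconstructed as +1.625 times 2 squared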
This peculiar way of representing numbers means that any 64-bit floating point number has 53 significant binary digits.
Those are all the digits in the mantissa.
The exponent part just tells you where to put the decimal place.
Restricting the number of significant digits and then separately deciding where to place the decimal results in a trade-off between size and precision.
That's the first interesting feature of the computer number line, a trade-off between size and precision.
We'll go back to the binary world, but let's examine this trade-off in the more intuitive base 10.
Here are two mantissas, each with four significant digits: 1.234 and 1.235.
If we multiply them by 10 to a negative exponent, we get two small numbers.
And if we multiply them by 10 to a positive exponent, we get two big numbers.
Notice that the small numbers are much closer together than the big numbers.
Let's look at two numbers that are big and close together, like 123,456,788 and 123,456,789, and rewrite them in scientific notation.
The mantissa is nine digits long.
That's nine significant digits.
It takes a lot more space to store a number which is both big and precise.
That's the trade-off when working in scientific notation.
With a restricted number of significant digits, we can write tiny numbers very precisely or big numbers with a lot less precision.
For example, with four significant digits, we can write all the integers up to 10,000, which is 10 to the fourth.
Here are 1, 12, 9,812, and 9,999.
What happens next?
Well, we can write 1.000 times 10 to the fourth to get 10,000.
But then the next number is 1.001 times 10 to the fourth, which is 10,010.
And the next number after that is 1.002 times 10 to the fourth, which is 10,020.
Suddenly, we can only represent the integers that are a multiple of 10.
There's no way to represent 10,004 using only four significant digits.
Similarly, once we get to 100,000, we can only count by 100 using four significant digits.
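You can simulate this four-significant-digit world in a few lines of Python (my own sketch, which uses scientific-notation string formatting to do the rounding):

    def four_sig_figs(n):
        # Round to 4 significant digits: 3 digits after the point in e-notation
        return float(f'{n:.3e}')

    print(four_sig_figs(9999))     # 9999.0  -- still exact
    print(four_sig_figs(10004))    # 10000.0 -- 10,004 is not representable
    print(four_sig_figs(10016))    # 10020.0 -- now we're counting by tens
    print(four_sig_figs(100049))   # 100000.0 -- and by hundreds past 100,000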
The same thing is happening in a computer.
Since the mantissa in a floating point number is restricted to 53 significant binary digits, which is roughly 15 or 16 significant digits in base 10, it can represent small numbers much more accurately than large numbers.
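In Python you can query both limits directly (assuming, as on virtually all modern hardware, that Python floats are IEEE 754 doubles):

    import sys
    print(sys.float_info.mant_dig)   # 53 significant binary digits
    print(sys.float_info.dig)        # 15 decimal digits always representable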
Alex Townsend, a professor of mine and a valuable resource for this episode, first got me interested in the subject by asking, how high can a computer count?
Applying what we just saw in base 10 to base 2, we can see that computers can represent all integers up to 2 to the 53rd in floating point.
That is, a computer can count to 2 to the 53rd.
After that, it starts skipping numbers.
Between 2 to the 53rd and 2 to the 54th, it can only represent the even numbers.
Between 2 to the 54th and 2 to the 55th, it can only represent the multiples of four.
The numbers a computer knows about are becoming more and more spaced out as they get bigger.
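Here's a short Python illustration of that spacing (my own sketch; Python integers are exact, so converting them to float exposes the rounding):

    print(float(2**53) == float(2**53 + 1))   # True: 2**53 + 1 gets skipped
    print(float(2**53 + 2) - float(2**53))    # 2.0: only evens survive here
    print(float(2**54 + 4) - float(2**54))    # 4.0: multiples of 4 past 2**54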
The opposite happens near zero.
The representable numbers are becoming closer and closer together.
That's why, from a computer's perspective, this is basically what the number line looks like.
And that gets us to the next two surprising properties, both related to exactly which numbers are on this computer version of a number line.
Many numbers that feel intuitively simple are not present.
We tend to think in base 10 and consider numbers like 0.1 and 10 to the 20th to be more important than 0.0625 or 262,144, which are 2 to the negative fourth and 2 to the 18th.
But computers have the opposite bias, causing some well-known errors, like the fact that 0.1 plus 0.2 does not produce exactly 0.3.
The number 0.1 is not representable.
It's simply not present on the computer number line, so it's replaced by a rounded approximation.
In base 10, the number 1/6 is 0.16666 repeating.
It has infinitely many digits.
Similarly, it takes infinitely many digits to write 0.1 in base 2, so the computer is forced to round.
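Python's decimal module will show exactly which nearby number the computer stores instead; a quick sketch:

    from decimal import Decimal
    # Decimal(0.1) prints the exact value of the double closest to 0.1
    print(Decimal(0.1))     # 0.1000000000000000055511151231257827021181583404541015625
    print(0.1 + 0.2 == 0.3) # False: the rounding errors don't cancel
    print(0.1 + 0.2)        # 0.30000000000000004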
And the third property is that the representable floating point numbers include some surprising things which are not on my human number line.
It includes a negative 0 and a positive 0, as well as a negative infinity and a positive infinity, which is kind of weird.
The floating point numbers aren't the numbers you intuitively think should be represented.
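All of these oddball values are easy to poke at in Python (a small sketch; math.copysign is just a convenient way to read the sign of zero):

    import math
    print(0.0 == -0.0)              # True, though the bit patterns differ
    print(math.copysign(1, -0.0))   # -1.0: negative zero remembers its sign
    print(float('inf') > 10**308)   # True: infinity sits past everything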
This tells us a bit about the shape of the computer number line.
And it begins to explain the fourth property, the oddities and errors of rounding floating point arithmetic.
Let's just think about it graphically at first.
If I ask you to add 0.07 to a really big number, you'd probably just imagine shifting the point over a tiny bit on the number line.
But the computer number line has gaps between these big numbers.
Instead of shifting over a little bit, it just rounds and stays where it is.
That can lead to some funny cancellation errors, like 2 to the 54th plus 1 minus 2 to the 54th equals 0, since the computer thinks 2 to the 54th plus 1 is just 2 to the 54th.
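That exact cancellation takes one line of Python to reproduce (writing 2.0**54 keeps the arithmetic in floating point):

    # 2**54 + 1 rounds back down to 2**54, so the subtraction leaves nothing
    print((2.0**54 + 1) - 2.0**54)    # 0.0, not 1.0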
This constant rounding can cause compounding errors when adding lots of numbers together, like when evaluating the sum where we start with 2 to the 54th and then add 1 to it, 2 to the 54th times.
The actual answer is 2 to the 54th plus 2 to the 54th, which is 2 to the 55th.
But if the computer were just to add the terms one by one, it would get 2 to the 54th.
That's because each time we add 2 to the 54th plus 1, it just rounds down to 2 to the 54th.
And that just happens over and over and over again.
All the 1s get consumed by the rounding error.
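Here's a sketch of that absorption in Python; actually looping 2 to the 54th times would take forever, so this runs only a small slice of the additions, which is enough to see the pattern:

    total = 2.0**54
    for _ in range(1_000_000):   # a tiny slice of the 2**54 additions
        total += 1.0             # each +1 rounds straight back down
    print(total == 2.0**54)      # True: every single 1 was absorbed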
Errors like these can make arithmetic with floating point numbers seem weird.
But it becomes a lot less weird when you remember that a computer is working with a restricted number of binary significant digits.
It's constantly rounding and making trade-offs between size and precision.
The IEEE, the Institute of Electrical and Electronics Engineers, sets the precise standard for floating point numbers, called IEEE 754, in an attempt to minimize all this weirdness and maximize efficiency.
Numerical analysts can exploit this funny computer number line to speed things up.
But they also have to be careful to understand the intentional rounding.
Unless you build computer chips for a living, you're probably not worried about it.
But if that does happen to be your job, then you should probably worry about it.
Thanks for all the awesome comments on our episode about probability theory.
So Neroox said, "I've always thought probability isn't real.
If we look at the darts example, we can in theory accurately calculate where the dart will hit if we take into consideration the relevant data, like the player's strength, the dart's direction, the weight distribution, et cetera."
That's a good point.
There is a way in which probability is like a catchall for the unknowns.
And I think that's kind of cool.
So Joshua Sherwin said, "probability is, in my humble opinion, the most difficult of all mathematical subjects.
This is because the topic appears at first to be completely intuitive and very quickly becomes unintuitive."
I don't know whether it's difficult.
I think pretty much all of math is difficult, in a totally awesome way.
But I do agree that probability is very deceptive.
Of all the branches of math, probability really has the most things that are called paradoxes.
And that's because it's the easiest to defy your intuition.
We feel like we interact with random objects all the time, and we sort of do.
But we generally don't understand them very well.
And this can actually be problematic in some cases.
And finally, since we're on a Joshua kick today with the comments, Joshua Hillerup says, "Are there any other theories that have been proven equivalent to measure theory for probabilities?"
They're not necessarily equivalent to measure theory, but there are other types of math one can use to ground probability.
For example, there's one called free probability theory, which is relatively new within the scope of mathematics, that I think is pretty cool.
I'll put some links to other ones in the description.
OK.