Floating Point Numbers
Written by Mike James   


IEEE standard

To try to make floating point arithmetic safer there is a standard, IEEE 754, which is used by nearly all floating point hardware, including all flavours of Intel-derived hardware since the Pentium was introduced.

Single precision IEEE numbers are 32 bits long and use 1 sign bit, an 8-bit exponent with a bias of 127 and a 23-bit fraction with the first bit taken as a 1 by default. Because that leading 1 is implied rather than stored, this gives 24 bits of precision, a relative precision of about 2^-24, or roughly 6 x 10^-8. You should now see why the loop listed above stops working at exactly 7 zeros before the 1.
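To make the layout concrete, here is a minimal Python sketch that splits a value into the three fields described above and rebuilds it. The function names decode_float32 and rebuild_normal are made up for this illustration, and the rebuild step assumes a normal number (it ignores zeros, denormals, infinities and NaNs).

```python
import struct

def decode_float32(x):
    """Round x to IEEE single precision and return (sign, exponent, fraction)."""
    # Pack as a 32-bit float, then reinterpret the same 4 bytes as an integer.
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = bits >> 31                 # 1 sign bit
    exponent = (bits >> 23) & 0xFF    # 8-bit exponent, stored with a bias of 127
    fraction = bits & 0x7FFFFF        # 23-bit fraction, leading 1 not stored
    return sign, exponent, fraction

def rebuild_normal(sign, exponent, fraction):
    """Rebuild the value from its fields (normal numbers only)."""
    significand = 1 + fraction / 2**23   # put the implied leading 1 back
    return (-1) ** sign * significand * 2.0 ** (exponent - 127)

print(decode_float32(1.0))    # sign 0, biased exponent 127, fraction 0
print(decode_float32(-2.5))   # -2.5 is -1.25 x 2^1
```

The stored exponent for 1.0 is 127 because the true exponent is 0 and the bias is added before storing.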

Similar problems arise if you try other arithmetic operations or comparisons between floating point values that are on very different scales.
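The effect can be sketched in Python by rounding every result to single precision. This is not the loop from the earlier page, just an illustration under stated assumptions: to_f32 and add_f32 are hypothetical helper names, and Python's own floats are double precision, so the rounding is forced through the struct module.

```python
import struct

def to_f32(x):
    """Round a Python float (double precision) to the nearest single precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def add_f32(a, b):
    """Add two values as single precision hardware would: round inputs and result."""
    return to_f32(to_f32(a) + to_f32(b))

# A small value added to 1 is simply lost once it falls below the 24-bit precision.
print(add_f32(1.0, 1e-7) == 1.0)   # False: the sum moves to the next value above 1
print(add_f32(1.0, 1e-8) == 1.0)   # True: the addition has no effect

# The same happens at large scales: near 1e8 representable values are 8 apart.
print(add_f32(1e8, 1.0) == 1e8)    # True

# Comparisons suffer too: 0.1 rounded to single precision is not the double 0.1.
print(to_f32(0.1) == 0.1)          # False
```

In each case nothing overflows or raises an error. The smaller operand just falls off the end of the fraction, which is why such bugs are easy to miss.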

 

[Figure 1: The IEEE standard floating point formats are used inside nearly every modern machine.]

 

It isn't so long ago that floating point hardware was an optional extra. Even quite sophisticated microprocessors such as the 386 and the 486SX lacked the hardware. You could buy add-on numeric coprocessors, which were processors optimized for doing floating point arithmetic. From the software point of view, not knowing whether floating point arithmetic was going to be provided by a library or by hardware was a big problem, and it stopped early desktop microcomputers from doing some types of work.

Early numeric coprocessors didn't always get the algorithms right and they weren't necessarily based on the IEEE standard. There have even been failures to implement the hardware that computes with numbers in the IEEE standard, the Pentium division bug of 1994 being the best-known example. Today things have mostly settled down, and we regard floating point arithmetic as being available on every machine as standard and as being reliable.

The exceptions, of course, are the small microcontroller devices of the sort that are used in the Arduino, say. When you come to program any of these, integer arithmetic, and hence fixed point arithmetic, is what you have to use. The only alternative is to use a software-based floating point library, which is usually slow and takes too much memory.

So the art of fixed point arithmetic is still with us and you do need to know about binary fractions.
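As a reminder of what that involves, here is a minimal sketch of fixed point arithmetic using nothing but integers. The choice of 8 fraction bits and the names to_fixed, fixed_mul and to_float are assumptions made for this example. A real microcontroller program would be written in C and would also have to watch for overflow of the integer type.

```python
FRACTION_BITS = 8              # assumed format: 8 bits after the binary point
SCALE = 1 << FRACTION_BITS     # a stored integer n represents the value n / 256

def to_fixed(x):
    """Convert a real value to its scaled integer representation."""
    return int(round(x * SCALE))

def to_float(n):
    """Convert a scaled integer back to a real value, for display only."""
    return n / SCALE

def fixed_mul(a, b):
    """Multiply two fixed point values using integer operations only."""
    # The raw product carries the scale factor twice, so shift one copy away.
    return (a * b) >> FRACTION_BITS

a = to_fixed(1.5)      # stored as 384
b = to_fixed(2.25)     # stored as 576
print(to_float(a + b))             # addition needs no adjustment: 3.75
print(to_float(fixed_mul(a, b)))   # 3.375
```

Addition and subtraction work directly on the stored integers. Only multiplication and division need the extra shift, because the position of the binary point is fixed by convention rather than stored with each number.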

Related Articles

Binary Arithmetic

Binary - negative numbers

Andrew Booth and the ARC

Pre-history of computing

 

Copyright © 2014 i-programmer.info. All Rights Reserved.