# Know your floats better!

In C, the primary floating-point types are `float`, `double`, and `long double`. The range and number of bits for these types can vary by system and compiler, but they generally adhere to the IEEE 754 standard for floating-point arithmetic. Here's a breakdown:

**float**

- Typically 32 bits (4 bytes).
- IEEE 754 standard for single-precision floating-point numbers.
- Range: approximately ±1.4 x 10^-45 (smallest subnormal) to ±3.4 x 10^38.
- Layout: `1 bit` (sign) + `8 bits` (exponent) + `23 bits` (fraction).

**double**

- Typically 64 bits (8 bytes).
- IEEE 754 standard for double-precision floating-point numbers.
- Range: approximately ±5.0 x 10^-324 (smallest subnormal) to ±1.7 x 10^308.
- Layout: `1 bit` (sign) + `11 bits` (exponent) + `52 bits` (fraction).

**long double**

- Size varies: often 80, 96, or 128 bits.
- The format and range vary more than for `float` and `double`.
- May follow IEEE 754 (e.g. binary128) or use a different format, such as x86 80-bit extended precision.

### IEEE 754 Floating Point Representation

In the IEEE 754 standard, a floating-point number is represented by three parts:

- **Sign bit**: Determines if the number is positive (0) or negative (1).
- **Exponent**: Encodes the magnitude of the number. It's stored in a biased format.
- **Fraction (or mantissa)**: Represents the precision of the number.

For normalized numbers, the encoded value is (-1)^sign x 1.fraction x 2^(exponent - bias).

### Example in C

Let's consider a `float` example in C:

```
#include <stdio.h>
#include <stdint.h>

/* Reading a different union member than the one last written
   (type punning) is well-defined in C. */
typedef union {
    float f;
    uint32_t bits;
} FloatUnion;

void printBinary(uint32_t number, int bits) {
    for (int i = bits - 1; i >= 0; i--) {
        printf("%d", (number >> i) & 1);
        /* Space after the sign bit (bit 31) and the last exponent bit (bit 23). */
        if (i == 31 || i == 23) printf(" ");
    }
    printf("\n");
}

int main() {
    FloatUnion fu;
    fu.f = -13.625f; // Example float number
    printf("Floating point value: %f\n", fu.f);
    printf("Binary representation: ");
    printBinary(fu.bits, 32);
    return 0;
}
```

In this code:

- A union is used to reinterpret the bits of a `float` as a `uint32_t`.
- The `printBinary` function prints the binary representation of the number, separating the sign, exponent, and fraction parts for clarity.
- `-13.625` is broken down into the sign bit, exponent, and fraction according to IEEE 754.

When you run this code, it displays the binary representation of `-13.625` as a 32-bit floating-point number: `1 10000010 10110100000000000000000`.

As an actual encoding example, consider the decimal number `123.456`. When encoded into a `float` according to the IEEE 754 standard, the binary representation breaks down into three parts:

- **Sign bit**: `0` (indicating a positive number)
- **Exponent**: `10000101`
- **Mantissa (fraction)**: `11101101110100101111001`

The binary representation is thus displayed as:

```
┌0┐┌10000101┐┌11101101110100101111001┐
```

Here, each box represents a different part of the IEEE 754 floating-point format: the first box contains the sign bit, the second box the exponent, and the third box the mantissa (fraction) of the number.

The exponent `10000101` in the IEEE 754 representation of a floating-point number is not the actual exponent but a "biased" exponent. Biasing lets a single unsigned field encode both positive and negative exponents, which simplifies the design of floating-point hardware.

For a 32-bit `float` (single precision), the bias is 127. Here's how the biased exponent is calculated:

1. **Find the actual exponent**: Determine the actual exponent of the number in binary form. For the decimal number `123.456`, convert it to binary and normalize it so that there is exactly one non-zero digit before the binary point.
2. **Normalize the number**: To convert `123.456` to binary, handle the integer and fractional parts separately. The integer part `123` in binary is `1111011`. The fractional part `.456` is converted to binary by repeatedly multiplying by 2 and taking the integer part of each result; we can stop after a few digits for this example. Combining these, the binary representation starts as `1111011.0111...`.
3. **Adjust to scientific notation**: In binary scientific notation this is approximately `1.1110110111... x 2^6`. Here `6` is the actual exponent, since the binary point moved 6 places to the left.
4. **Apply the bias**: Add the bias (127 for 32-bit floats) to the actual exponent: `6 + 127 = 133`.
5. **Convert to binary**: Finally, convert `133` to binary, which is `10000101`.

Therefore, in the IEEE 754 representation of the number `123.456`, the exponent field `10000101` holds the biased exponent, and it is this biased value that is stored in the binary representation of the floating-point number.

Reading the exact binary representation requires familiarity with the IEEE 754 format, including how the biased exponent and the fraction are derived from the actual number. This example shows the underlying binary form; converting between a floating-point value and its bits, in either direction, takes some practice with binary and floating-point arithmetic.