Integers are perhaps the most commonly used data type in programming, and most programming languages support the integer type in several sizes. For example, the Java programming language provides four signed integer types: byte, short, int, and long. Other languages, such as the C-based languages, also include unsigned integer types. An unsigned integer carries no sign and can only represent non-negative values, while a signed integer can represent both positive and negative values.
To further explain the types of integers supported, a byte is an 8-bit binary string that allows for $2^8 = 256$ distinct values: unsigned, it stores the integers from $0$ to $255$, and signed, it stores the range $-128$ to $127$. A short is a 16-bit binary string that can also be either signed or unsigned. The standard integer type, int, is represented using a 32-bit string. Lastly, a long is twice the size of an int, a 64-bit binary string. Conversion between the integer types is easy to do. A narrowing conversion, for example int to short, can produce an incorrect value unless the integer stored in the int is small enough to be represented using the smaller number of bits. Conversion to a larger integer type of the same signedness, however, always preserves the value. What happens when an arithmetic operation produces a value that is too large for the original type depends on the language: some languages, such as Python, automatically store the result in a larger integer type, while in Java the result silently wraps around within the original type.
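The effect of a narrowing conversion can be sketched in Python. Python has no fixed-width integer types of its own, so the helper below, `to_signed`, is a hypothetical name introduced for illustration; it simulates a cast to an n-bit signed type by keeping only the low n bits and reading them as a two's complement value, which is how a narrowing cast behaves in Java.

```python
def to_signed(value: int, bits: int) -> int:
    """Reinterpret the low `bits` bits of `value` as a signed integer."""
    mask = (1 << bits) - 1
    low = value & mask              # discard everything above the low `bits` bits
    sign_bit = 1 << (bits - 1)
    # If the top remaining bit is set, the pattern stands for a negative number.
    return low - (1 << bits) if low & sign_bit else low


print(to_signed(1000, 16))    # 1000: small enough to fit in a 16-bit short
print(to_signed(70000, 16))   # 4464: the high bits were lost
print(to_signed(200, 8))      # -56: the top bit of the byte is read as a sign
```

Only the first call preserves the original value, because 1000 is the only one of the three inputs that lies inside the target type's signed range.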
The range of values that an n-bit integer can store can be determined using simple math:
| Unsigned | max: $2^n - 1$ |
| Signed | min: $-2^{n-1}$, max: $2^{n-1} - 1$ |
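As a quick check of these bounds, the following sketch computes the ranges for the four sizes discussed earlier. The function names `unsigned_range` and `signed_range` are made up for this example, and the signed bounds assume the two's complement layout used by most hardware.

```python
def unsigned_range(n: int) -> tuple[int, int]:
    """Smallest and largest value of an n-bit unsigned integer."""
    return 0, 2**n - 1


def signed_range(n: int) -> tuple[int, int]:
    """Smallest and largest value of an n-bit two's complement integer."""
    return -(2 ** (n - 1)), 2 ** (n - 1) - 1


for name, bits in [("byte", 8), ("short", 16), ("int", 32), ("long", 64)]:
    print(name, bits, unsigned_range(bits), signed_range(bits))
```

For a byte this gives 0 to 255 unsigned and -128 to 127 signed. Note that both ranges contain exactly $2^n$ values; the signed range simply shifts half of them below zero.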
Most of these integer types are supported directly by the hardware; however, some languages have integer types that are not directly supported by hardware. [1] One example is Python’s long integer type. A value stored in Python’s long integer type can have unlimited length, bounded only by available memory. In Python 2, a long literal is written as an integer with a trailing L, such as 2456785357321546L. In Python 3, the built-in int type behaves this way and the suffix is no longer used.
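A short illustration of this behaviour, written for Python 3, where the ordinary `int` type is the arbitrary-length one. The variable names are only for this example.

```python
# Python 3: the built-in int has no fixed width.
value = 2456785357321546         # the example above, without the Python 2 "L" suffix
product = value * value          # far too large for a 64-bit long, yet still exact

print(product // value == value)     # True: no overflow or rounding occurred
print((2 ** 64).bit_length())        # 65: one bit more than a 64-bit register holds
```

Because the hardware cannot hold such values in a single register, the interpreter carries out the arithmetic in software, which is slower than native integer arithmetic.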
The last piece of information needed to understand the integer data type is the difference between unsigned and signed integers, and how each is stored. An unsigned integer is by far the easier of the two to explain. An unsigned integer represents a natural number (zero or a positive whole number) and is stored directly in binary form, as described above. The conversion of a natural number to its binary form should have been discussed in class.
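As a brief refresher, the conversion can be sketched with repeated division by two, where each remainder becomes the next bit from the right. The helper name `to_binary` is an assumption made for this example; Python's built-in `format(n, "b")` produces the same digits.

```python
def to_binary(n: int, bits: int) -> str:
    """Return the `bits`-wide unsigned binary string for a natural number."""
    if not 0 <= n < 2**bits:
        raise ValueError(f"{n} does not fit in {bits} unsigned bits")
    digits = []
    for _ in range(bits):
        digits.append(str(n % 2))   # the remainder is the next bit, right to left
        n //= 2
    return "".join(reversed(digits))


print(to_binary(5, 4))      # 0101
print(to_binary(200, 8))    # 11001000
```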
Signed integers pose different challenges, and because of this, several methods exist for adding a sign to a natural number. The simplest to understand is the sign-magnitude representation, which uses the most significant bit (the first bit) to store whether the number is positive or negative. The remaining bits represent the magnitude, just as they would in an unsigned integer. This method has many drawbacks, including extra computation for addition and subtraction, because the arithmetic must take both the sign bit and the magnitude into account. It also makes testing for zero harder: zero is simply a magnitude, so it has two representations (positive zero and negative zero), and both must be tested.
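A minimal sketch of sign-magnitude encoding makes the double zero visible. The names `sm_encode` and `sm_decode` are hypothetical helpers written for this illustration.

```python
def sm_encode(value: int, bits: int) -> str:
    """Encode `value` as a sign bit followed by a (bits - 1)-bit magnitude."""
    magnitude = abs(value)
    if magnitude >= 1 << (bits - 1):
        raise ValueError(f"{value} does not fit in {bits} sign-magnitude bits")
    sign = "1" if value < 0 else "0"
    return sign + format(magnitude, f"0{bits - 1}b")


def sm_decode(pattern: str) -> int:
    """Decode a sign-magnitude bit string back to an integer."""
    magnitude = int(pattern[1:], 2)
    return -magnitude if pattern[0] == "1" else magnitude


print(sm_encode(5, 4))                         # 0101
print(sm_encode(-5, 4))                        # 1101
print(sm_decode("0000"), sm_decode("1000"))    # 0 0: two patterns for zero
```

Both `0000` and `1000` decode to zero, so a zero test on the raw bits has to accept either pattern.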
There is a representation that alleviates these issues by using an entirely different approach, known as “two’s complement”. This representation, though harder for a human to read, allows the computer to do basic arithmetic without extra computation, and since there is only a single zero value, it simplifies any test for zero. The most significant bit once again indicates the sign (0 for positive, 1 for negative). We begin by counting normally, as with an unsigned integer. However, when the most significant bit switches to 1, the value jumps to the most negative number, whose magnitude is one greater than that of the largest positive number. From there, each increment of the binary pattern moves the value one step closer to zero, so the magnitude shrinks as the binary pattern grows. Therefore, if we use a nibble (4 bits) to count, our counting would go as follows:
| Decimal | Binary |
|---------|--------|
| 0 | 0000 |
| 1 | 0001 |
| 2 | 0010 |
| 3 | 0011 |
| 4 | 0100 |
| 5 | 0101 |
| 6 | 0110 |
| 7 | 0111 |
| -8 | 1000 |
| -7 | 1001 |
| -6 | 1010 |
| -5 | 1011 |
| -4 | 1100 |
| -3 | 1101 |
| -2 | 1110 |
| -1 | 1111 |
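The table can be reproduced with a small sketch. The helpers `tc_encode` and `tc_decode` are names chosen for this example, not library functions; decoding subtracts $2^n$ from the unsigned reading of any pattern whose leading bit is 1.

```python
def tc_encode(value: int, bits: int = 4) -> str:
    """Encode `value` as a `bits`-wide two's complement bit string."""
    if not -(1 << (bits - 1)) <= value < (1 << (bits - 1)):
        raise ValueError(f"{value} does not fit in {bits} two's complement bits")
    # Masking maps a negative value onto the upper half of the unsigned range.
    return format(value & ((1 << bits) - 1), f"0{bits}b")


def tc_decode(pattern: str) -> int:
    """Decode a two's complement bit string back to an integer."""
    raw = int(pattern, 2)
    return raw - (1 << len(pattern)) if pattern[0] == "1" else raw


# Walk through all 16 nibble patterns in counting order.
for raw in range(16):
    pattern = format(raw, "04b")
    print(tc_decode(pattern), pattern)
```

The loop prints the same sixteen rows as the table, from `0 0000` through `7 0111` and then from `-8 1000` up to `-1 1111`.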
Try playing around with some binary arithmetic to see how easy using two’s complement is with addition and how difficult it would be for a computer to do binary arithmetic using the Sign-Magnitude representation.
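One such experiment, sketched under the assumption of a 4-bit adder that treats its inputs as plain unsigned numbers and drops any carry out of the top bit. The function name `add_patterns` is made up for this example.

```python
BITS = 4
MASK = (1 << BITS) - 1


def add_patterns(a: str, b: str) -> str:
    """Add two bit patterns as unsigned numbers, dropping the final carry."""
    return format((int(a, 2) + int(b, 2)) & MASK, f"0{BITS}b")


# Two's complement: 3 is 0011 and -5 is 1011.
print(add_patterns("0011", "1011"))   # 1110, which is -2 in two's complement

# Sign-magnitude: 3 is 0011 and -5 is 1101.
print(add_patterns("0011", "1101"))   # 0000, but -2 would be 1010 in sign-magnitude
```

The same unsigned adder gives the right answer for two's complement operands and the wrong one for sign-magnitude operands, which is why sign-magnitude hardware would need extra logic to compare signs and magnitudes before adding.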
For further reading on the Integer Data Type, I would recommend (especially the latter):
[1] Sebesta, Robert W. “Concepts of Programming Languages”, Boston: Addison Wesley, 2008. ISBN: 978-0-321-49362-0
[2] Stallings, William. “Computer Organization and Architecture”, Boston: Prentice Hall, 2010. ISBN: 978-0-13-607373-4