The pitfalls with unsigned integers in C

This post was written by eli on June 18, 2026
Posted Under: Linux kernel,Software

Background

These are my notes as I tried to figure out if it’s OK to assume that a multiplication of two signed integers results in something repeatable. Or more specifically, if I multiply two radix-2 strictly positive integers that are defined as “int” in C, is it enough to check that the result is strictly positive in order to ensure that the result hasn’t wrapped? Is this a portable solution?

I’ll give my answer right away: It’s a definite maybe. And since I’m sure that these integers are strictly positive, the simple way out is to change their definition to unsigned int, and call it a day. Which is what I eventually did.
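For reference, the check can be made without depending on what an overflowing multiplication does. The following is a minimal sketch, not taken from any library: the function names are made up for illustration, and it assumes that both operands are strictly positive. Note that looking at the sign of the product isn’t enough even when the multiplication wraps: with a 32-bit int, 4 * 0x50000000 wraps to 0x40000000, which is strictly positive.

```c
#include <limits.h>

/*
 * Returns 1 if a * b would exceed INT_MAX, 0 otherwise.
 * Assumes a > 0 and b > 0. The check only divides, so it
 * can't overflow by itself.
 */
int mul_pos_overflows(int a, int b)
{
  return a > INT_MAX / b;
}

/*
 * Stores a * b in *result and returns 0, or returns -1 and leaves
 * *result untouched if the product doesn't fit in an int.
 * Same assumption: a > 0 and b > 0.
 */
int mul_pos_checked(int a, int b, int *result)
{
  if (mul_pos_overflows(a, b))
    return -1;

  *result = a * b;
  return 0;
}
```

The division is slower than a multiplication followed by a sign test, but it relies only on behavior that the C standard defines.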

The rest of this post consists of pieces of information I randomly gathered as I went along. I was never a big fan of unsigned integers of any sort, and this reminded me why.

-5 is bigger than 2?!

Consider this simple program (try.c), which compares signed integers from -5 to 4 with the constant 2.

#include <stdio.h>

int main(void) {
  int i;
  const unsigned int a = 2;

  for (i=-5; i<5; i++)
    if (i > a)
      printf("%d is bigger than 2\n", i);

  return 0;
}

The result is obvious, right? Only 3 and 4 are bigger than 2. Let’s run this:

$ gcc -Wall -O3 try.c -o try
$ ./try
-5 is bigger than 2
-4 is bigger than 2
-3 is bigger than 2
-2 is bigger than 2
-1 is bigger than 2
3 is bigger than 2
4 is bigger than 2

Say what? How is -5 bigger than 2? The short answer is that “i” is converted to unsigned int before it’s compared with “a”. As a result, every negative value of “i” turns into a huge unsigned number (-1 becomes the largest possible unsigned int), and that’s of course bigger than 2.
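To see what the conversion does to the numbers, here is a small sketch. The function names are made up for illustration. The first function returns the value that actually takes part in the comparison, and the second one is the comparison from try.c in isolation.

```c
#include <limits.h>

/*
 * The value that "i > a" actually uses on its left side: i converted
 * to unsigned int. For a negative i, this is i + UINT_MAX + 1.
 */
unsigned int as_unsigned(int i)
{
  return (unsigned int) i;
}

/* The comparison from try.c, spelled out */
int bigger_than_2u(int i)
{
  const unsigned int a = 2;

  return i > a; /* i is implicitly converted to unsigned int here */
}
```

So as_unsigned(-5) is UINT_MAX - 4, which is 4294967291 if unsigned int is 32 bits wide. No wonder it’s bigger than 2.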

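Compiling with -fno-strict-overflow made no difference (the compiler didn’t issue any warning, and the result was the same). Note that for C code, -Wall doesn’t enable -Wsign-compare, which is the warning intended for this situation. It’s -Wextra that turns it on.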

The same output is obtained when replacing the if-statement with

    if (i > (unsigned int) 2)

And almost needless to say, when this if-statement is applied, one gets the “expected” result:

    if (i > 2)

Compiling and running:

$ gcc -Wall -O3 try.c -o try
$ ./try
3 is bigger than 2
4 is bigger than 2

This is, after all, what we’re all using all the time.

And can you guess what this yields?

    if (i > (unsigned char) 2)

The answer, surprisingly or not, is that it’s the same as the literal “2”: Only 3 and 4 are considered bigger. “unsigned char” didn’t have the same effect as “unsigned int”.

Why this happens is explained under “Mixed arithmetic operations” below. I just wanted to mention this crucial issue first.

And last, since I started with chars:

#include <stdio.h>

int main(void) {
  char i;
  const unsigned char a = 2;

  for (i=-5; i<5; i++)
    if (i > a)
      printf("%d is bigger than 2\n", i);

  return 0;
}

This is exactly like the first example, but with char and unsigned char instead of the respective int types. This gives…

$ gcc -Wall -O3 try.c -o try
$ ./try
3 is bigger than 2
4 is bigger than 2

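So no weird stuff here. “unsigned char” didn’t have any effect on (signed) “char” either. Note that whether a plain “char” is signed is up to the implementation. It is with gcc on x86, which is where I ran this. How come the “problem” went away? It’s called “integer promotion”, also explained briefly below. But first, a small detour.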

Unsigned wraps safer

The C standard says that if an arithmetic operation on two unsigned integers yields a number that exceeds the target’s number of bits, the result is reduced modulo 2^N, where N is the number of bits. In other words, only the lower bits are assigned to the target. This is plain truncation. gcc indeed implements it this way, according to the gcc manual, section 6.3.1.
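Because the wraparound is well-defined for unsigned integers, it’s also legitimate to detect it after the fact. A minimal sketch, with function names that are made up for illustration:

```c
#include <limits.h>

/* Unsigned addition is reduced modulo UINT_MAX + 1, so this is well-defined */
unsigned int wrapping_add(unsigned int a, unsigned int b)
{
  return a + b;
}

/*
 * Returns 1 if a + b wrapped around, 0 otherwise. This is valid for
 * unsigned operands only: a wrapped sum is always smaller than both
 * operands.
 */
int uadd_wrapped(unsigned int a, unsigned int b)
{
  return a + b < a;
}
```

For example, wrapping_add(UINT_MAX, 5) is 4, and uadd_wrapped(UINT_MAX, 5) is 1.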

As for signed integers, the behavior is undefined, according to the C standard. However, the discussion regarding undefined behavior seems to revolve around expressions like:

int x;

if (x > x + 1)
  printf("Oh no! x will explode!\n");

When disregarding the possibility of a wraparound, this if-statement is never true, so the compiler can treat it as an if (0), and optimize away this piece of code. gcc might do that, depending on the flags used.

In particular, gcc’s -fno-strict-overflow flag tells the compiler (among other things) to take the possibility of a signed overflow into account, and not to optimize away the code.

I’ll demonstrate this with this simple program, named try.c:

#include <stdio.h>

int main(void) {
  unsigned int m = ~0;
  int x = m >> 1; // Reduce to maximal signed number

  if (x + 20 < x)
    printf("Oh no! Overflow!\n");
  return 0;
}

And now attempting to compile and run this with and without the flag:

$ gcc -Wall -O3 try.c -o try
try.c: In function ‘main’:
try.c:7:3: warning: assuming signed overflow does not occur when assuming that (X + c) < X is always false [-Wstrict-overflow]
   if (x + 20 < x)
   ^~
$ ./try
$ gcc -fno-strict-overflow -Wall -O3 try.c -o try
$ ./try
Oh no! Overflow!

This pretty much explains itself. And here’s the punchline: The Linux kernel is compiled with gcc’s -fno-strict-overflow flag (as indicated by Linus himself, and I also checked it with a V=1 test compilation).

This kind of if-statement is a quite common way to check if a wraparound will occur. Kees Cook attempted to replace such checks with add_would_overflow() macros (as defined in linux/overflow.h), but Linus didn’t like that at all. It’s worth mentioning that Kees Cook applied his patch in places where the variables in question were unsigned (in some cases or all, I didn’t check), so the behavior is well-defined either way. His goal was to prevent false alarms from automatic code checking tools. In other words, the code was made more difficult to understand in order to please an automatic tool. One can understand why Linus didn’t fall in love. This little incident is covered in this LWN page.
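For signed integers in code that isn’t compiled with -fno-strict-overflow, the check has to be written so that the overflow never actually happens. Below is a sketch of two ways to do that. These are not the kernel’s macros, and the function names are made up for illustration. The first function relies on __builtin_add_overflow(), which is a gcc and clang extension rather than standard C. The second is plain C, but assumes that the added constant isn’t negative.

```c
#include <limits.h>

/*
 * Returns 1 if x + c doesn't fit in an int, 0 otherwise.
 * __builtin_add_overflow() is a compiler extension (gcc, clang).
 */
int add_overflows(int x, int c)
{
  int sum; /* Receives the wrapped result, not used here */

  return __builtin_add_overflow(x, c, &sum) ? 1 : 0;
}

/*
 * A portable variant, which never evaluates x + c.
 * Assumes c >= 0, so INT_MAX - c can't overflow.
 */
int add_overflows_portable(int x, int c)
{
  return x > INT_MAX - c;
}
```

Both return 1 for x = INT_MAX and c = 20, regardless of the optimization level and the compilation flags.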

Signed arithmetic is in the mist

An example of the ambivalent attitude towards signed arithmetic is kernel commit 5a581b367b5, which made sure that the time_after() macro, implemented in linux/jiffies.h, subtracts jiffies values as unsigned long and then casts the result to (signed) long in order to evaluate if their difference is negative.

The jiffies mechanism is central in the Linux kernel, so it’s crucial that it works right. As it turned out, there was a slight delay with the application of the patch. Linus expressed the level of urgency as

… or is this queued up for 3.12 as being “not likely to actually matter”, which is quite possibly true (since we compile with “-fno-strict-overflow”, and thus gcc should hopefully not ever do any transformations that depend on signed integer overflows being undefined)
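The technique that the commit settled on, subtracting as unsigned and only then looking at the sign, can be sketched as follows. This is not the kernel’s macro: ticks_after() is a made-up name, and it’s written as a function for the sake of clarity.

```c
/*
 * Returns nonzero if timestamp a is after timestamp b, even if the
 * counter wrapped around between the two. This works as long as the
 * timestamps are less than half the counter's range apart.
 */
int ticks_after(unsigned long a, unsigned long b)
{
  /*
   * The subtraction is done on unsigned long, so a wraparound is
   * well-defined. Converting a value above LONG_MAX to long is
   * implementation-defined (not undefined); gcc reduces it modulo
   * 2^N, which yields the two's complement interpretation.
   */
  return (long) (b - a) < 0;
}
```

For example, if b is 5 ticks before the counter’s wraparound and a is 4 ticks after it, b - a is a huge unsigned number, which turns negative when it’s converted to long. So ticks_after(a, b) is true, even though a is numerically smaller than b.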

Mixed arithmetic operations: How are they done?

If a binary arithmetic operation is carried out between two operands with different types, one of them is converted to the type having a higher “rank”. Generally speaking, floating-point types have a higher rank than integers, and unsigned integers have a higher rank than signed ones. Within each group, the type that occupies more space has a higher rank. (In the C standard’s own terms, a signed integer type and its unsigned counterpart have the same conversion rank, and the unsigned type wins the tie.)

The catch is that unsigned integers have a higher rank than signed. This is particularly confusing when comparing an unsigned value with a signed one. This is demonstrated in the examples above: If an unsigned int is compared with a signed int, the signed int is first converted to an unsigned int. So -1 is treated as an unsigned int with the value 0xffffffff (where int is 32 bits wide), and that’s surely bigger than 2, as shown above.

But this applies only if both operands have the same size. So if an unsigned char is compared with a (signed) int, it’s the unsigned char that is converted to a signed int, so a signed comparison takes place.

Actually, there is another reason why an unsigned char is converted into a signed int: It’s called integer promotion and is mentioned in the C11 standard section 6.3.1.1, as well as the gcc manual section 24.4: If an integer (char and short in particular, signed or unsigned) can be represented as an “int”, it’s converted to “int”, and then the operation is carried out.
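A few one-liners that show integer promotion at work. This is a sketch with made-up function names, and it assumes that an int can hold all values of an unsigned char, which is the case on any platform one is likely to meet.

```c
/*
 * Both operands are promoted to int before the addition, so with
 * a = 200 and b = 100 the result is 300, not 300 - 256 = 44.
 */
int sum_of_uchars(unsigned char a, unsigned char b)
{
  return a + b;
}

/*
 * The comparison from the example above: (unsigned char) 2 is
 * promoted to int, so this is a plain signed comparison.
 */
int bigger_than_2uc(int i)
{
  return i > (unsigned char) 2;
}

/* The size of the type that the sum of two unsigned chars has */
unsigned long promoted_size(void)
{
  unsigned char a = 1, b = 1;

  return (unsigned long) sizeof(a + b);
}
```

promoted_size() returns sizeof(int), not 1, because the addition’s operands have already been promoted when their sum’s type is determined.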

Note that the variable, to which the result is assigned, plays no role in relation to type conversions within the operation itself.
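This is easy to get wrong with multiplications. In the sketch below (the function names are made up), both functions assign the product to a 64-bit variable, but only the second one actually calculates a 64-bit product. It assumes that int is 32 bits wide or less, so that uint32_t isn’t promoted to a wider int.

```c
#include <stdint.h>

/*
 * The multiplication is done in 32 bits and wraps. Only the already
 * truncated result is widened to 64 bits.
 */
uint64_t mul_narrow(uint32_t a, uint32_t b)
{
  uint64_t r = a * b;

  return r;
}

/* Converting one operand first makes the common type 64 bits wide */
uint64_t mul_wide(uint32_t a, uint32_t b)
{
  uint64_t r = (uint64_t) a * b;

  return r;
}
```

With both arguments set to 0x10000, mul_narrow() returns 0 and mul_wide() returns 0x100000000.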

As for gcc, the rules are (according to the gcc manual, section 24.5):

Arithmetic binary operators (except the shift operators) convert their operands to the common type before operating on them. Conditional expressions also convert the two possible results to their common type. Here are the rules for determining the common type.
If one of the numbers has a floating-point type and the other is an integer, the common type is that floating-point type.
If both are floating point, the type with the larger range is the common type.
If both are integers but of different widths, the common type is the wider of the two.
If they are integer types of the same width, the common type is unsigned if either operand is unsigned, and it’s long if either operand is long. It’s long long if either operand is long long.
These rules apply to addition, subtraction, multiplication, division, remainder, comparisons, and bitwise operations. They also apply to the two branches of a conditional expression, and to the arithmetic done in a modifying assignment operation.
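The rules above can be checked directly with C11’s _Generic, which selects a value according to the type of an expression. The sketch below is just a way to poke at the rules: TYPE_NAME and the function names are made up for illustration.

```c
/* Maps an expression to the name of its type (requires C11) */
#define TYPE_NAME(x) _Generic((x), \
    int: "int", \
    unsigned int: "unsigned int", \
    long: "long", \
    unsigned long: "unsigned long", \
    long long: "long long", \
    unsigned long long: "unsigned long long", \
    float: "float", \
    double: "double", \
    default: "other")

/* Same width, one operand is unsigned: the common type is unsigned */
const char *int_plus_uint(void) { return TYPE_NAME(1 + 1u); }

/* The unsigned char is promoted to int, so everything remains signed */
const char *uchar_plus_int(void) { return TYPE_NAME((unsigned char) 1 + 1); }

/* The wider integer type (by rank) wins */
const char *int_plus_llong(void) { return TYPE_NAME(1 + 1LL); }

/* One operand is floating-point: that floating-point type wins */
const char *int_plus_float(void) { return TYPE_NAME(1 + 1.0f); }

/* Both are floating-point: the type with the larger range wins */
const char *float_plus_double(void) { return TYPE_NAME(1.0f + 1.0); }
```

Not all cases are this portable: TYPE_NAME(1u + 1L) is “long” where long is wider than unsigned int (x86-64 Linux, for example), but “unsigned long” where both are 32 bits wide.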

As I’ve already pointed out, the really annoying thing is that this applies to comparison operations. It’s really easy to get this wrong.
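If a signed int really has to be compared with an unsigned int, one way to get the mathematically correct answer is to handle the negative case separately. A minimal sketch, with function names that are made up for illustration:

```c
/* Returns 1 if i > u in the mathematical sense, 0 otherwise */
int int_gt_uint(int i, unsigned int u)
{
  /*
   * A negative i is never bigger than an unsigned value. Otherwise,
   * converting i to unsigned int doesn't change its value.
   */
  return i >= 0 && (unsigned int) i > u;
}

/* Returns 1 if i < u in the mathematical sense, 0 otherwise */
int int_lt_uint(int i, unsigned int u)
{
  /* A negative i is smaller than any unsigned value */
  return i < 0 || (unsigned int) i < u;
}
```

With these, int_gt_uint(-5, 2) is 0 and int_gt_uint(3, 2) is 1, which is what the first example in this post was expected to print.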
