
The myth of signed arithmetic wraparound

Pop quiz time! Suppose you’re on a machine where int is 32 bits wide and thus INT_MAX, the maximum possible value for an int, is 2147483647. What is the output of the following C program?

#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
 
int main (int argc, char *argv[])
{
        int sum = INT_MAX + 1;
        printf("INT_MAX + 1 == %d\n", sum);
        return EXIT_SUCCESS;
}

If you’re like most C programmers, you probably answered the following:

INT_MAX + 1 == -2147483648

And if pressed to explain how you could add two positive numbers together and wind up with a negative number, you’d explain that it’s because of how two’s complement arithmetic works:

 2147483647 == 0111 1111 1111 1111 1111 1111 1111 1111
+         1 == 0000 0000 0000 0000 0000 0000 0000 0001
-----------    ---------------------------------------
-2147483648 == 1000 0000 0000 0000 0000 0000 0000 0000
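
Incidentally, you can observe this wraparound without invoking undefined behavior, because unsigned arithmetic in C is defined to wrap modulo 2^N. Here is a minimal sketch, assuming the same 32-bit int as above, that reproduces the bit pattern in the table:

#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
 
int main (int argc, char *argv[])
{
        /* Unsigned overflow is well-defined: the result wraps modulo UINT_MAX + 1,
           so with 32-bit ints this reproduces the 0x80000000 bit pattern above. */
        unsigned int u = (unsigned int)INT_MAX + 1u;
        printf("INT_MAX + 1 as unsigned: %u (0x%08x)\n", u, u);
        return EXIT_SUCCESS;
}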

It turns out that you’d be wrong. In C, the behavior of signed arithmetic overflow — that is, what happens when the result of an arithmetic operation is outside the range of what’s representable with that data type — is undefined. You might witness the wraparound behavior you’d expect from two’s complement arithmetic, but it’s certainly not guaranteed, since C doesn’t require the use of two’s complement arithmetic.
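
If you do need to detect overflow, the portable approach is to test whether an addition would overflow before performing it, using only comparisons that cannot themselves overflow. A minimal sketch (the helper name is mine, not part of any standard API):

#include <limits.h>
 
/* Returns 1 if a + b would overflow an int. The comparisons themselves
   cannot overflow: INT_MAX - b is evaluated only when b > 0, and
   INT_MIN - b only when b < 0. */
static int add_would_overflow (int a, int b)
{
        if (b > 0 && a > INT_MAX - b)
                return 1;
        if (b < 0 && a < INT_MIN - b)
                return 1;
        return 0;
}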

But, you might protest, that’s an awfully pedantic point to make, since pretty much any system you’ll encounter these days uses two’s complement arithmetic, with alternative representations like one’s complement or sign-and-magnitude being extremely difficult to find outside of legacy hardware. Two’s complement arithmetic and the wraparound behavior it exhibits may not technically be part of the C standard, but in practice it’s a de facto standard, because the underlying processor uses two’s complement arithmetic, right?

Actually, no, not at all; real-world compilers exploit the fact that the C standard does not define the behavior of signed integer overflow when they optimize your program. As a result, you can’t rely on wraparound happening at all. For instance, consider the following program:

#include <sys/stat.h>
#include <sys/types.h>
#include <fcntl.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
 
int process_something (int fd, int size)
{
        // Catch integer overflow
        if (size > size + 1)
                abort();
 
        char *string = malloc(size + 1);
        int count = read(fd, string, size);
        string[count] = 0;
        int len = strlen(string);
        free(string);
        return len;
}
 
int main (int argc, char *argv[])
{
        int fd = open("/etc/issue", O_RDONLY);
        int len = process_something(fd, INT_MAX);
        close(fd);
        return EXIT_SUCCESS;
}

The function process_something wants to allocate size + 1 bytes and is trying to be careful not to accidentally pass a negative value to malloc, which would happen if size == INT_MAX. The code tries to exploit how two’s complement works by checking size > size + 1 to detect integer overflow, and aborting if that’s the case.

Depending on the optimization level, this code does not do what the programmer intended when compiled with gcc. At -O2 and above, gcc realizes that, mathematically, size > size + 1 can never be true, and so eliminates the entire if block. What if there is integer overflow? The C standard says the behavior of the program is undefined if that happens, so the compiler doesn’t need to worry about it.
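
Note that the guard could instead be written so that the overflow never occurs in the first place, which also leaves the compiler nothing to discard on undefined-behavior grounds. A minimal sketch of such a check (the name check_size is just for illustration; its body could replace the first two lines of process_something):

#include <limits.h>
#include <stdlib.h>
 
/* Reject the one value of size for which size + 1 would overflow,
   plus negative sizes as an extra sanity check, without ever
   computing size + 1. */
static void check_size (int size)
{
        if (size < 0 || size == INT_MAX)
                abort();
}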

When the original program is compiled at optimization level -O0 or -O1, its output is:

Aborted

That is what happens when abort gets called, and it’s the behavior the programmer expected.

But when compiled at optimization level -O2 or -O3, the code that calls abort gets optimized away, and this happens (or at least, could happen) instead:

Segmentation fault

Why? The expression size + 1 wraps around to a negative value, and when that negative int is passed to malloc, whose parameter is the unsigned type size_t, it becomes an impossibly large allocation request. On x86 Linux that allocation fails and malloc returns NULL, so a segmentation fault happens when the subsequent code tries to write through the NULL pointer.
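
To make that concrete, here is a small sketch of what malloc actually sees once the wrapped-around value is converted to its size_t parameter (the printed number assumes a 64-bit size_t, as on typical x86-64 Linux):

#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
 
int main (int argc, char *argv[])
{
        /* The negative int is converted to malloc's unsigned size_t
           parameter. With a 64-bit size_t this prints 18446744071562067968,
           far more memory than can be allocated, so malloc fails and
           returns NULL. */
        int wrapped = INT_MIN;  /* what INT_MAX + 1 wrapped around to here */
        printf("malloc is asked for %zu bytes\n", (size_t)wrapped);
        return EXIT_SUCCESS;
}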

At least, that’s what happens on my computer. But since the program invokes undefined behavior, it could do anything at all and, from a technical standpoint, be considered correct. Other possibilities include successfully reading the input file anyway, failing silently without crashing, sending threatening e-mails to your boss, drinking your coffee when you’re not looking, or making demons fly out of your nose. Undefined behavior is undefined. Anything is possible.

The point is, relying on undefined behavior is a bad idea, even if you think you know what’s actually going to happen. Part of the reason that languages like C leave some behavior undefined, as opposed to languages like Java that try to specify what must happen in every circumstance, is that it gives compilers a lot more flexibility in optimizing code. A compiler can assume, for example, that a < a + 1 without first having to prove that a will never be the maximum possible value for its type; if process_something in our earlier example lived in a separately compiled library, the compiler would have no way of knowing what values size could take at runtime, and would have to assume the worst.
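
As a concrete illustration of the kind of assumption this licenses, consider the following sketch. Because signed overflow is undefined, gcc at -O2 is free to compile this function down to code that simply returns 1, with no comparison performed at runtime:

int always_true (int x)
{
        /* The compiler may assume x + 1 never overflows; for every
           non-overflowing value of x the comparison is true, so the
           whole body can be folded to the constant 1. */
        return x + 1 > x;
}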

For more about the gotchas of undefined behavior in C, read the three-part series What Every C Programmer Should Know About Undefined Behavior on the LLVM Project Blog, which inspired this post and showed me why I would have gotten my own opening question wrong.

