sajad torkamani

A compiler is a program that takes source code (e.g., C, C++, Go) and converts it into executable machine code (binary). Many compilers exist, each handling source code written in one or more programming languages.

As an example, let’s take a closer look at what the GCC compiler does.

Four stages of compilation

There are four stages to GCC’s compilation process:

  1. Preprocessing
  2. Compilation
  3. Assembly
  4. Linking

Figure: Stages of compilation

Let’s consider how these stages work when compiling a simple hello world program like the following:

#include <stdio.h>

int main() {
  printf("Hello world!\n");
  return 0;
}

1. Preprocessing

The preprocessor (cpp) handles preprocessor directives, the lines that begin with the # character. For example, #include <stdio.h> tells the preprocessor to read the system header file stdio.h and insert its contents directly into the program text. The result is another C program, typically saved with the .i suffix (here, hello.i).
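
As a rough sketch of what the preprocessor does, the snippet below goes slightly beyond #include and also uses #define macros, which the preprocessor expands by plain text substitution before the compiler proper sees the code. The names GREETING, SQUARE, greeting_length, and square_of_successor are made up for illustration.

```c
#include <string.h>   /* replaced by the contents of string.h */

#define GREETING "Hello world!"   /* object-like macro */
#define SQUARE(x) ((x) * (x))     /* function-like macro */

/* After preprocessing, the body reads: return strlen("Hello world!"); */
size_t greeting_length(void) {
    return strlen(GREETING);
}

/* After preprocessing, the body reads: return ((n + 1) * (n + 1)); */
int square_of_successor(int n) {
    return SQUARE(n + 1);
}
```

Running gcc with the -E option on a file like this stops after preprocessing and prints the expanded text, which is a handy way to check what a macro really turns into.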

2. Compilation

The compiler (cc1) translates the text file hello.i into the text file hello.s, which contains an assembly-language program. For example, the main function might be translated to something like:

main:
  subq $8, %rsp
  movl $.LC0, %edi
  call puts
  movl $0, %eax
  addq $8, %rsp
  ret
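
The call to puts in this listing may look surprising, since the source calls printf. When the format string is a constant with no format specifiers and ends in a newline, GCC commonly substitutes the simpler puts, which appends the newline itself. The sketch below writes both forms out by hand to show they print the same text; the two function names are hypothetical.

```c
#include <stdio.h>

/* What the source code says. printf returns the number of
   characters written, including the trailing newline. */
int hello_with_printf(void) {
    return printf("Hello world!\n");
}

/* What the assembly listing effectively does. puts adds the
   newline itself and returns a non-negative value on success. */
int hello_with_puts(void) {
    return puts("Hello world!");
}
```

Both functions print the line Hello world! followed by a newline; only their return values differ.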

3. Assembly

The assembler (as) translates hello.s into machine-language instructions, packages them in a form known as a relocatable object program, and stores the result in the object file hello.o. This is a binary file rather than a text file.

4. Linking

The linker (ld) takes hello.o and merges it with any other object files the program needs. For example, the hello world program calls printf, which resides in a separate precompiled object file, printf.o. The linker merges these to create a single executable file, hello, that is ready to be loaded into memory and executed.
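
One way to see the division of labour between compiler and linker is to declare printf by hand instead of including its header. In the sketch below (say_hello is a hypothetical name), the compiler needs only the declaration to generate the call; the function's machine code is supplied later by the linker from the precompiled C library.

```c
/* Declaration only: this tells the compiler the function's type.
   The body lives in a precompiled library object file. */
int printf(const char *format, ...);

/* The compiler emits a call to the symbol printf and leaves its
   address unresolved in the object file; the linker fills it in. */
int say_hello(void) {
    return printf("Hello world!\n");
}
```

Compiling a file like this with gcc -c stops before the linking stage and produces only the object file, in which printf is still an unresolved reference.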

Sources

  • Bryant, R. and O’Hallaron, D., 2015. Computer systems. pp.41-44.