Compiler, Assembler, Linker & Loader

Let us investigate how C/C++ source code is preprocessed, compiled, linked and loaded to become a running program.  The discussion is based on GCC (the GNU Compiler Collection).  When you use an IDE (Integrated Development Environment) such as Microsoft Visual C++ or Borland C++ Builder, the stages discussed here happen behind the scenes.

Compilers, Assemblers and Linkers

Normally, building a C program involves four stages and uses different tools: a preprocessor, a compiler, an assembler and a linker.

At the end there should be a single executable file.  The following stages happen in this order regardless of the operating system or compiler, and are illustrated graphically after the list.
  • Preprocessing is the first pass of any C compilation. It processes include-files, conditional compilation instructions and macros.
  • Compilation is the second pass. It takes the output of the preprocessor (the preprocessed source code) and generates assembler source code.
  • Assembly is the third stage of compilation. It takes the assembly source code and translates it into machine code; it can also produce an assembly listing with offsets. The assembler output is stored in an object file.
  • Linking is the final stage of compilation. It takes one or more object files or libraries as input and combines them to produce a single (usually executable) file. In doing so, it resolves references to external symbols, assigns final addresses to procedures/functions and variables, and revises code and data to reflect new addresses (a process called relocation). 
                      
Figure: Process Address Space ---> Primary Memory (e.g. RAM)

Preprocessor :

The C preprocessor is not part of the compiler proper, but a separate step in the compilation process. In simple terms, the C preprocessor is just a text substitution tool: its directives tell it what pre-processing to do before actual compilation. The preprocessor handles:
  • Include-files
  • Conditional compilation instructions
  • Macros
The C preprocessor modifies a source file before handing it over to the compiler, allowing conditional compilation with #ifdef, defining constants with #define, including header files with #include, and using built-in macros such as __FILE__. All preprocessor directives (commands to the preprocessor) begin with a pound symbol (#). The available directives are:

#include    Inserts the contents of a particular header file
#define     Defines a preprocessor macro
#undef      Undefines a preprocessor macro
#if         Tests if a compile time condition is true
#ifdef      Returns true if this macro is defined
#ifndef     Returns true if this macro is not defined
#else       The alternative for #if
#elif       #else and #if in one statement
#endif      Ends preprocessor conditional
#error      Prints error message on stderr
#pragma     Issues special commands to the compiler, using a standardized method

# (Stringize)         Converts a macro parameter into a string constant
## (Token Pasting)    Combines two arguments within a macro definition

Predefined Macros
ANSI C defines a number of macros. Although each one is available for your use in programming, the predefined macros should not be directly modified.

__DATE__    The current date as a string literal in "Mmm dd yyyy" format
__TIME__    The current time as a string literal in "hh:mm:ss" format
__FILE__    This contains the current filename as a string literal.
__LINE__    This contains the current line number as a decimal constant.
__STDC__    Defined as 1 when the compiler complies with the ANSI standard.
 

Example :

/* predefinedMacros.c */
#include <stdio.h>
int main()
{
        printf("File :%s\n", __FILE__ );
        printf("Date :%s\n", __DATE__ );
        printf("Time :%s\n", __TIME__ );
        printf("Line :%d\n", __LINE__ );
        printf("ANSI :%d\n", __STDC__ );
        return 0;
}

Output :
File :predefinedMacros.c
Date :Sep  8 2015
Time :06:10:21
Line :8
ANSI :1


Preprocessor Operators

The C preprocessor offers the following operators to help you create macros:

Macro Continuation (\)

A macro usually must be contained on a single line. The macro continuation operator is used to continue a macro that is too long for a single line. For example:

#define  fun(a, b)  \
    printf(#a " and " #b " are IT City\n")


Stringize (#)
The stringize or number-sign operator ('#'), when used within a macro definition, converts a macro parameter into a string constant. This operator may be used only in a macro that has a specified argument or parameter list. For example:

#include <stdio.h>
#define  message_for(a, b)  \
    printf(#a " and " #b ": are IT City\n")

int main(void)
{
   message_for(Bengaluru, Hyderabad);
   return 0;
}

Output :
Bengaluru and Hyderabad: are IT City

Token Pasting (##)
The token-pasting operator (##) within a macro definition combines two arguments. It permits two separate tokens in the macro definition to be joined into a single token. For example:

#include <stdio.h>
#define numberAppend(n) printf ("number" #n " = %d", number##n)


int main(void)
{
   int number34 = 40;

   numberAppend(34);
   return 0;
}


Output :
number34 = 40

This works because the preprocessor expands the macro call into the following:

printf ("number34 = %d", number34);

 
This example shows number##n being concatenated into number34; it uses both the stringize and the token-pasting operators.

The defined() Operator
The preprocessor defined operator is used in constant expressions to determine if an identifier is defined using #define. If the specified identifier is defined, the value is true (non-zero). If the symbol is not defined, the value is false (zero). The defined operator is specified as follows:

#include <stdio.h>
#if !defined (MESSAGE)
   #define MESSAGE "Good Luck!"
#endif

int main(void)
{
   printf("Here is the message: %s\n", MESSAGE); 
   return 0;
}

 
Output :
Here is the message: Good Luck!
