Compilation Definition

Introduction

Compiling is the process of translating a program written in a high-level programming language into machine language that the computer can execute. The software that performs this translation is called a compiler.

Compilation Definition

Source code is the form in which programmers write their programs. Before a program can be executed, its source code must go through several steps. First, a compiler converts the high-level language instructions into object code.

Once the compiler has generated object code, passing that code through a linker is the last step in creating an executable program. The linker combines the object modules into a single executable and replaces all symbolic addresses with actual values.

The two components of compilation are analysis and synthesis. The analysis component breaks the source code into its constituent elements and produces an intermediate representation of the source program. The synthesis component builds the target program from that intermediate representation.

Compilation Process

Compilation is the translation of a high-level program into low-level binary machine code. Because of its hardware design, a computer can execute only binary machine instructions. Thus, every program written in a language other than machine code must first be translated into machine instructions.

Compilation is a multi-stage process that transforms high-level programs that humans can read into low-level binary code that machines can execute. Four steps convert a program's source code into an executable file: preprocessing, compilation, assembly, and linking.

Preprocessor

Preprocessing is the initial step in the compilation process. The program's source code file is the input to this step, and the output is a preprocessed file with the .i extension.

The preprocessor searches the source code file for directives such as #include and #define. Every included header file is expanded at this stage, and each macro is replaced by its definition. Comments are also removed by this point, so they never reach the compiler proper.
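
The macro replacement described above can be sketched in a few lines of Python. This is a toy model, not how a real C preprocessor is implemented: the function name `preprocess` is hypothetical, and the sketch handles only simple object-like `#define` macros, with no `#include`, no function-like macros, no conditionals, and no awareness of string literals.

```python
import re

def preprocess(source):
    """Expand simple object-like #define macros in C-like text (toy model)."""
    macros = {}   # macro name -> replacement text
    output = []
    for line in source.splitlines():
        directive = re.match(r"\s*#\s*define\s+(\w+)\s+(.*)", line)
        if directive:
            macros[directive.group(1)] = directive.group(2).strip()
            continue  # the directive itself does not appear in the output
        # Replace whole-word occurrences of each known macro name.
        for name, body in macros.items():
            line = re.sub(rf"\b{re.escape(name)}\b", lambda _m: body, line)
        output.append(line)
    return "\n".join(output)

print(preprocess("#define SIZE 10\nint a[SIZE];"))
```

The word-boundary match keeps a macro named SIZE from rewriting a longer identifier such as SIZES.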

Compiler

The second step in the compilation process is compilation itself. The compiler takes the preprocessed file as input and produces an output file of assembly code with the .s extension. It translates the high-level instructions into corresponding assembly instructions. These instructions target a particular architecture and are therefore platform-dependent.

Assembler

The third step in the compilation process is assembly. The assembler translates assembly instructions, which are human-readable names for the processor's basic operations, into the binary code the processor executes.

The output of the assembler is object code. The object file normally has the same base name as the source file, with a different extension (commonly .o or .obj).

Linker

The fourth and last step in the compilation process is linking. The primary purpose of linking is to combine all object code files into a single executable file (for example, sourcefile.exe). Large programs are usually organized into several manageable files.

User-defined functions are often stored in separate files. Their declarations are made available to the main program file through header files, which C pulls in with the #include directive, and their compiled definitions are combined at link time. Similarly, the programming language offers built-in standard library functions that programs can use immediately to simplify coding tasks.

Because the standard library code is already in object code (a pre-compiled format), the linker can incorporate it directly when it produces the executable file.
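
As a rough illustration of how a linker combines modules and replaces symbolic addresses with actual values, the sketch below lays out modules back to back and resolves each symbol to a final address. The module format (`size`, `defines`, `uses`) and the function name `link` are assumptions made for this example and do not correspond to any real object file format.

```python
def link(modules):
    """Lay out modules back to back and resolve symbols to addresses.

    Each module is a dict in a hypothetical format:
      "size":    number of code words in the module
      "defines": {symbol: offset within this module}
      "uses":    list of symbols this module refers to
    Returns a dict mapping every defined symbol to its final address.
    """
    addresses = {}
    base = 0  # address at which the current module is placed
    for module in modules:
        for symbol, offset in module["defines"].items():
            if symbol in addresses:
                raise ValueError(f"duplicate definition of {symbol!r}")
            addresses[symbol] = base + offset
        base += module["size"]

    # Every referenced symbol must be defined by some module.
    for module in modules:
        for symbol in module["uses"]:
            if symbol not in addresses:
                raise ValueError(f"undefined reference to {symbol!r}")
    return addresses

main_obj = {"size": 8, "defines": {"main": 0}, "uses": ["square"]}
lib_obj = {"size": 4, "defines": {"square": 0}, "uses": []}
print(link([main_obj, lib_obj]))
```

Linking `main_obj` on its own raises an error, because nothing defines `square`. A pre-compiled library module such as `lib_obj` supplies that missing definition.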

Why are Computer Programs Compiled?

Computer science students should fully understand why high-level programs need to be compiled. High-level programs are written in languages such as C, C++, Python, Java, and the .NET languages (for example, C#).

Humans can understand high-level programs. Every high-level programming language has a unique syntax and reserved keywords instructing the machine to perform particular operations.

The ease of programming is a special consideration in developing high-level programming languages. The majority of keywords in high-level programming languages are common English words.

Compiler

Compilers are pieces of software that translate source code into object code. In other words, they transform a high-level language into machine (binary) language. This step is necessary to make the program executable, because the computer understands only binary language.

Some compilers translate the high-level language into assembly language as an initial step, while others translate it directly into machine code. In both cases, compilation refers to the process of transforming source code into machine code.

Types of compiler

Compilers come in a variety of forms, including the following:

  • Single-Pass Compiler: A single-pass compiler scans the source once. As each line is scanned, its tokens are extracted, its syntax is checked, and the tree structure and the tables that record details about each token are built.
    Once the semantic checks pass, the code for that line is generated. Each line goes through the same procedure until the entire program is compiled. The parser, which calls the routines that carry out these tasks, typically serves as the central component of the compiler.
  • Two-Pass Compiler: A two-pass compiler processes the program being translated twice. It has two sections:
    • Front end: It converts valid source code into an Intermediate Representation (IR).
    • Back end: It maps the IR onto the target machine.

    The two-pass approach makes retargeting easier. It also supports multiple front ends.
  • Multi-Pass Compiler: A multi-pass compiler scans the input source once and produces a first transformed form, then scans that form to produce a second one, and so on until the object form is complete.

Phases/Structure Of Compiler

The compilation process consists of several phases, and the output of each phase serves as the input to the next. The phases are as follows:

1. Lexical Analyzer

  • It accepts the source code of high-level languages as input.
  • It scans the source code's characters from left to right, which is why it is also called a scanner.
  • The characters are grouped into lexemes. A lexeme is a sequence of characters with a specific meaning.
  • Each lexeme is mapped to a token.
  • White space and comments are removed.
  • Lexical errors are detected and reported.
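
The scanning steps listed above can be sketched with Python's `re` module. The token categories, the keyword set, and the function name `tokenize` are assumptions chosen for a tiny C-like language. Real lexers are usually generated from a fuller specification.

```python
import re

# Hypothetical token categories for a tiny C-like language.
TOKEN_SPEC = [
    ("COMMENT", r"//[^\n]*|/\*.*?\*/"),
    ("NUMBER",  r"\d+"),
    ("IDENT",   r"[A-Za-z_]\w*"),
    ("OP",      r"[+\-*/=]"),
    ("PUNCT",   r"[();]"),
    ("SKIP",    r"\s+"),
    ("ERROR",   r"."),   # anything else is a lexical error
]
MASTER = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC),
    re.DOTALL,
)
KEYWORDS = {"int", "return"}

def tokenize(source):
    """Scan source left to right and return a list of (kind, lexeme) tokens."""
    tokens = []
    for match in MASTER.finditer(source):
        kind, lexeme = match.lastgroup, match.group()
        if kind in ("SKIP", "COMMENT"):
            continue  # white space and comments are dropped
        if kind == "ERROR":
            raise SyntaxError(
                f"unexpected character {lexeme!r} at offset {match.start()}"
            )
        if kind == "IDENT" and lexeme in KEYWORDS:
            kind = "KEYWORD"
        tokens.append((kind, lexeme))
    return tokens

print(tokenize("int x = 42; // answer"))
```

The order of the entries matters: the comment pattern is tried before the operator pattern, so `//` starts a comment rather than being read as two division operators.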

2. Syntax Analyzer

  • The syntax analyzer is also referred to as a "parser".
  • The lexical analyzer's output serves as its input.
  • It checks the source code for syntax errors.
  • It does this by building a parse tree from the stream of tokens.
  • For the syntax to be correct, the parse tree must follow the grammar rules of the source language.
  • The grammar of a programming language is usually specified as a context-free grammar.
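
To make the role of the grammar concrete, here is a minimal recursive-descent parser for arithmetic expressions. The grammar, the `Parser` class, and the tuple-based tree are assumptions made for this sketch. The tree it builds is an abstract syntax tree, a condensed form of the parse tree.

```python
import re

# Assumed grammar (context-free):
#   expr   -> term (('+' | '-') term)*
#   term   -> factor (('*' | '/') factor)*
#   factor -> NUMBER | IDENT | '(' expr ')'

def tokenize(text):
    """Minimal scanner; characters it does not recognize are silently skipped."""
    return re.findall(r"\d+|[A-Za-z_]\w*|[-+*/()]", text)

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def parse(self):
        tree = self.expr()
        if self.peek() is not None:
            raise SyntaxError(f"unexpected token {self.peek()!r}")
        return tree

    def expr(self):
        node = self.term()
        while self.peek() in ("+", "-"):
            op = self.advance()
            node = (op, node, self.term())
        return node

    def term(self):
        node = self.factor()
        while self.peek() in ("*", "/"):
            op = self.advance()
            node = (op, node, self.factor())
        return node

    def factor(self):
        token = self.advance()
        if token is None:
            raise SyntaxError("unexpected end of input")
        if token == "(":
            node = self.expr()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if token[0].isdigit() or token[0].isalpha() or token[0] == "_":
            return token  # a number or identifier is a leaf of the tree
        raise SyntaxError(f"unexpected token {token!r}")

print(Parser(tokenize("a + 2 * (b - 3)")).parse())
```

Each grammar rule becomes one method, so an input that breaks a rule, such as `2 + * 3`, is rejected with a syntax error at the point where no rule applies.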

3. Semantic Analyzer

  • It checks the parse tree produced by the syntax analyzer.
  • It verifies that the code is valid under the rules of the programming language, such as data type compatibility, variable declaration, and initialization.
  • It produces a verified parse tree, also called the annotated parse tree.
  • It also performs type checking, flow checking, and similar checks.
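
Two of these checks, variable declaration and data type compatibility, can be sketched as follows. The statement format and the function name `check` are hypothetical. A real semantic analyzer walks the parse tree rather than a flat list of tuples.

```python
def check(statements):
    """Return a list of semantic error messages for a toy statement list.

    Each statement is a tuple in a hypothetical format:
      ("declare", name, type)   e.g. ("declare", "x", "int")
      ("assign", name, value)   where value is a Python int, float, or str
    """
    type_of_literal = {int: "int", float: "float", str: "string"}
    symbols = {}  # symbol table: variable name -> declared type
    errors = []
    for number, stmt in enumerate(statements, start=1):
        if stmt[0] == "declare":
            _, name, declared = stmt
            if name in symbols:
                errors.append(f"statement {number}: '{name}' is already declared")
            else:
                symbols[name] = declared
        elif stmt[0] == "assign":
            _, name, value = stmt
            actual = type_of_literal.get(type(value), "unknown")
            if name not in symbols:
                errors.append(f"statement {number}: '{name}' is not declared")
            elif actual != symbols[name]:
                errors.append(
                    f"statement {number}: cannot assign {actual} "
                    f"to {symbols[name]} '{name}'"
                )
    return errors

program = [
    ("declare", "x", "int"),
    ("assign", "x", "hello"),  # type mismatch
    ("assign", "y", 1),        # undeclared variable
]
for message in check(program):
    print(message)
```

Both faulty statements are syntactically well formed, which is why these errors can only be found after parsing.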

4. Intermediate Code Generator (ICG)

  • It produces intermediate code.
  • This code is neither high-level language nor machine language. It is an intermediate form between the two.
  • The phases up to this point are largely independent of the target platform, whereas the final two phases depend on it.
  • The same intermediate code can serve different target platforms. The machine code is then generated for the specific platform.
  • Three-address code is one example of an intermediate code.
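
As a small illustration of three-address code, the sketch below flattens an expression tree into instructions that each have at most one operator and two operands. The tuple-based tree format, the temporary names `t1`, `t2`, and so on, and the function name `to_three_address` are assumptions made for this example.

```python
from itertools import count

def to_three_address(tree):
    """Flatten an expression tree into three-address instructions.

    A tree is either a string operand ("a", "2") or a tuple (op, left, right).
    Returns (instructions, name_holding_the_result).
    """
    code = []
    temp_ids = count(1)

    def walk(node):
        if isinstance(node, str):
            return node  # operands need no instruction of their own
        op, left, right = node
        left_name = walk(left)
        right_name = walk(right)
        temp = f"t{next(temp_ids)}"  # fresh temporary for this result
        code.append(f"{temp} = {left_name} {op} {right_name}")
        return temp

    result = walk(tree)
    return code, result

# a + b * c
instructions, result = to_three_address(("+", "a", ("*", "b", "c")))
for line in instructions:
    print(line)
```

For `a + b * c` this yields `t1 = b * c` followed by `t2 = a + t1`. Nothing in these instructions refers to registers or to a particular processor's instruction set.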

5. Code Optimizer

  • It optimizes the intermediate code.
  • Its purpose is to make the code run faster and consume fewer resources (CPU, memory).
  • It rearranges the code and removes unnecessary instructions.
  • The optimized code keeps the same meaning as the original program.
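
Constant folding is one simple optimization of this kind: any subexpression made up only of constants is computed at compile time. The sketch below applies it to a tuple-based expression tree. The tree format and the function name `fold_constants` are assumptions for illustration, and division is left out so that the sketch avoids division by zero.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def fold_constants(node):
    """Replace subtrees whose operands are all integer constants by their value.

    A node is an int constant, a str variable name, or a tuple (op, left, right).
    """
    if not isinstance(node, tuple):
        return node
    op, left, right = node
    left, right = fold_constants(left), fold_constants(right)
    if isinstance(left, int) and isinstance(right, int) and op in OPS:
        return OPS[op](left, right)  # computed now instead of at run time
    return (op, left, right)

# x + 2 * (3 + 4)
print(fold_constants(("+", "x", ("*", 2, ("+", 3, 4)))))
```

The folded tree computes the same value as the original for every value of `x` but needs one operation at run time instead of three.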

6. Target Code Generator

  • It transforms the optimized intermediate code into machine code.
  • This is the final phase of the compiler.
  • Its output is relocatable machine code.

Compiler Operations

The compiler's crucial operations include the following:

  • It breaks the source program into smaller pieces and imposes a grammatical structure on each piece.
  • It uses the intermediate representation to build the symbol table and the desired target program.
  • It detects errors in the source code and reports them.
  • It organizes and stores information about all variables and code.
  • It supports separate compilation.
  • It reads the entire program, analyzes it, and translates it into a semantically equivalent program in another language.
  • It transforms source code into object code suited to the type of target machine.

Difference Between An Interpreter And A Compiler

A compiler analyzes the entire program before any of it runs. Once the whole program has been checked, it reports all the errors it found together. An interpreter, by contrast, goes through the program line by line, and execution stops as soon as an error is found.
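
The contrast can be made concrete with a toy language whose only statement is `print <integer>`. The language and the function names `compile_program` and `interpret_program` are invented for this sketch. It models only the difference in error handling, not real code generation.

```python
def is_valid(line):
    """A statement in this toy language is 'print <non-negative integer>'."""
    parts = line.split()
    return len(parts) == 2 and parts[0] == "print" and parts[1].isdecimal()

def compile_program(lines):
    """Check every line first; produce output only if there are no errors."""
    errors = [
        f"line {n}: invalid statement"
        for n, line in enumerate(lines, 1)
        if not is_valid(line)
    ]
    if errors:
        return None, errors  # all errors reported together, nothing runs
    return [int(line.split()[1]) for line in lines], []

def interpret_program(lines):
    """Execute line by line and stop at the first invalid line."""
    executed = []
    for n, line in enumerate(lines, 1):
        if not is_valid(line):
            return executed, f"line {n}: invalid statement"
        executed.append(int(line.split()[1]))
    return executed, None

program = ["print 1", "prnt 2", "print 3", "print x"]
print(compile_program(program))
print(interpret_program(program))
```

For this program the compiler-style check reports the errors on lines 2 and 4 and runs nothing. The interpreter-style run executes line 1, stops at line 2, and never reaches the error on line 4.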





