How does the compilation/linking process work?

Question

The compilation of a C++ program involves three steps:

Preprocessing: the preprocessor takes a C++ source code file and deals with the #includes, #defines and other preprocessor directives. The output of this step is a "pure" C++ file without preprocessor directives.
Compilation: the compiler takes the preprocessor's output and produces an object file from it.
Linking: the linker takes the object files produced by the compiler and produces either a library or an executable file.

Preprocessing

The preprocessor handles the preprocessor directives, like #include and #define. It knows nothing about the syntax of C++, which is why it must be used with care.
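To make the "use with care" point concrete, here is a minimal sketch of a macro that goes wrong because substitution ignores C++ operator precedence. The macro and function names are made up for illustration.

```cpp
// Hypothetical macros, for illustration only.
// The naive version pastes its argument in without parentheses.
#define SQUARE_NAIVE(x) x * x
#define SQUARE_SAFE(x) ((x) * (x))

// Expands to: 1 + 2 * 1 + 2, which evaluates to 5, not 9.
inline int naive_result() { return SQUARE_NAIVE(1 + 2); }

// Expands to: ((1 + 2) * (1 + 2)), which evaluates to 9.
inline int safe_result() { return SQUARE_SAFE(1 + 2); }
```

The preprocessor never sees an expression here, only tokens, so the parentheses have to be written into the macro by hand.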

It works on one C++ source file at a time by replacing #include directives with the content of the respective files (which is usually just declarations), replacing macros (#define), and selecting different portions of text depending on #if, #ifdef and #ifndef directives.
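As a small sketch of those three jobs (inclusion, macro replacement, conditional selection), consider the fragment below. BUFFER_SIZE and ENABLE_GREETING are hypothetical names chosen for this example.

```cpp
#include <string>  // replaced by the contents of the <string> header

// Hypothetical macros, for illustration only.
#define BUFFER_SIZE 64
#define ENABLE_GREETING

// After preprocessing, the compiler sees "return 64;".
inline int buffer_size() { return BUFFER_SIZE; }

// Only one of the two branches survives preprocessing.
inline std::string greeting() {
#ifdef ENABLE_GREETING
    return "hello";
#else
    return "";
#endif
}
```

Removing the #define ENABLE_GREETING line would make the compiler see only the second return statement; the first would never reach it.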

The preprocessor works on a stream of preprocessing tokens. Macro substitution is defined as replacing tokens with other tokens (the operator ## enables merging two tokens when it makes sense).
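A short sketch of token merging with ##, assuming a made-up macro called MAKE_GETTER:

```cpp
// Hypothetical macro: pastes the token get_ onto its first argument
// to form a new identifier, then uses it as a function name.
#define MAKE_GETTER(name, value) inline int get_##name() { return value; }

// Expands to: inline int get_width() { return 800; }
MAKE_GETTER(width, 800)

// Expands to: inline int get_height() { return 600; }
MAKE_GETTER(height, 600)
```

The merge only works when the result is a single valid token, such as the identifier get_width here.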

After all this, the preprocessor produces a single output that is a stream of tokens resulting from the transformations described above. It also adds some special markers that tell the compiler where each line came from so that it can use those to produce sensible error messages.

Some errors can be produced at this stage with clever use of the #if and #error directives.
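For instance, a configuration check along these lines stops the build during preprocessing if a setting is out of range. APP_MAX_CONNECTIONS is a hypothetical macro used only for this sketch.

```cpp
// Hypothetical configuration value, for illustration only.
#define APP_MAX_CONNECTIONS 16

// If the value were 0 or negative, preprocessing would stop here
// with the message below, before the compiler proper runs.
#if APP_MAX_CONNECTIONS < 1
#error "APP_MAX_CONNECTIONS must be at least 1"
#endif

inline int max_connections() { return APP_MAX_CONNECTIONS; }
```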

Compilation

The compilation step is performed on each output of the preprocessor. The compiler parses the pure C++ source code (now without any preprocessor directives) and converts it into assembly code. It then invokes the underlying back end (the assembler in the toolchain), which assembles that code into machine code and produces an actual binary file in some format (ELF, COFF, a.out, ...). This object file contains the compiled code (in binary form) of the symbols defined in the input. Symbols in object files are referred to by name.

Object files can refer to symbols that are not defined. This is the case when you use a declaration, and don't provide a definition for it. The compiler doesn't mind this, and will happily produce the object file as long as the source code is well-formed.
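A minimal sketch of such a reference follows. The function names are invented, and the definition is placed in the same file only so that the example links on its own.

```cpp
// Declaration only. While compiling doubled_total below, the compiler
// just records a reference to the symbol compute_total by name.
int compute_total(int a, int b);

int doubled_total(int a, int b) {
    return 2 * compute_total(a, b);
}

// In a real project this definition would usually live in a different
// source file (a hypothetical total.cpp, say) and be matched up with the
// reference above by the linker. It is included here only so that the
// sketch builds as a single file.
int compute_total(int a, int b) {
    return a + b;
}
```

If the definition were moved to another file, this file would still compile to an object file; the reference would simply stay unresolved until link time.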

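Compilers usually let you stop compilation at this point. This is very useful because it lets you compile each source code file separately. The advantage is that you don't need to recompile everything if you change only a single file.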

The produced object files can be put in special archives called static libraries, for easier reuse later on.

It is at this stage that "regular" compiler errors, like syntax errors or failed overload resolution errors, are reported.
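As a sketch of a failed overload resolution, take the two overloads below (the name describe is made up). The ambiguous call is left commented out so that the fragment still compiles.

```cpp
// Two hypothetical overloads, for illustration only.
inline int describe(int) { return 1; }
inline int describe(double) { return 2; }

// Exact matches resolve without trouble.
inline int from_int() { return describe(1); }       // picks describe(int)
inline int from_double() { return describe(1.0); }  // picks describe(double)

// A long argument needs a conversion for either overload, and neither
// conversion is better than the other, so the compiler would reject this
// call as ambiguous:
// inline int from_long() { return describe(1L); }
```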

Linking

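The linker is what produces the final compilation output from the object files the compiler produced. This output can be either a shared (or dynamic) library (despite the similar name, these have little in common with the static libraries mentioned earlier) or an executable.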

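It links all the object files by replacing the references to undefined symbols with the correct addresses. Each of these symbols can be defined in other object files or in libraries. If they are defined in libraries other than the standard library, you need to tell the linker about them.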

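At this stage the most common errors are missing definitions or duplicate definitions. The former means that either the definitions don't exist (i.e. they were never written), or that the object files or libraries where they reside were not given to the linker. The latter is obvious: the same symbol was defined in two different object files or libraries.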
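Both kinds of error can be sketched in a few lines. The names below are hypothetical, and the failing parts are described in comments or commented out so that the fragment itself builds.

```cpp
// Duplicate definitions: imagine this function sitting in a header
// (a hypothetical util.h) that two source files include. Without the
// inline keyword, each object file would carry its own definition of
// answer() and the linker would report the symbol as defined twice.
// Marking it inline permits the same definition in several files.
inline int answer() { return 42; }

// Missing definitions: this declaration is enough for the compiler,
// but no definition exists anywhere.
int never_defined(int value);

// Uncommenting the function below would still compile, yet linking
// would fail with an undefined reference to never_defined:
// int broken() { return never_defined(1); }
```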
