How a C/C++ Linker Works

Ah, the linker!

That mysterious entity that lurks in the build process, waiting for the perfect moment to throw a hundred cryptic error messages your way.

If you’ve ever stared at an “undefined reference” error and questioned your life choices, congratulations—you’ve met the linker.

But why does the linker exist?

What arcane magic does it perform? And why does it seem to enjoy tormenting developers?

Why Was the Linker Created?

Back in the early days of programming, people wrote simple programs, compiled them, and ran them. Life was good.

Then, programs got bigger.

Instead of writing everything in one giant file, developers started splitting code into multiple files. This made things more manageable, but it introduced a new problem:

how do you combine these separate files into a working program?

Enter the linker.

The linker was created to take compiled object files (.o or .obj files), find all the function and variable references, and stitch everything together into a final executable.

Without it, you’d have to manually copy-paste all your code into one massive file, which would make debugging even more of a nightmare than it already was.

Evolution of Linkers

1970s-1980s: Early linkers were simple. They just took object files and smooshed them together. No fancy optimizations, no dynamic linking—just good old brute force.
1990s: With the rise of shared libraries (.so, .dll), linkers became more sophisticated. They had to deal with dynamic linking, symbol resolution, and version compatibility.
2000s-Present: Modern linkers like GNU ld, LLVM lld, and MSVC’s linker have gotten ridiculously complex. They handle static and dynamic libraries, link-time optimizations, position-independent code, and a million other things that ensure your builds will break in new and exciting ways.

The Compilation Process

The process of converting human-readable source code into an executable binary happens in multiple stages.

These stages are broadly categorized into:

Preprocessing
Compilation
Assembly
Linking

Step 1: Preprocessing (`.c` / `.cpp` → `.i`)

The preprocessor handles preprocessor directives (like #include, #define, and #ifdef) before the actual compilation begins. This step:

Expands macros
Includes header files
Evaluates #if conditions

Example:

#include <stdio.h>
#define PI 3.14
int main() {
    printf("PI = %f", PI);
    return 0;
}

After preprocessing (gcc -E file.c -o file.i), the contents of stdio.h are pasted in and PI is replaced by its value, so the end of the output looks like:

int main() {
    printf("PI = %f", 3.14);
    return 0;
}
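To make the "it's just text substitution" point concrete, here is a small C++ sketch. The STR/XSTR helpers, the USE_FAST_PATH switch, and the function names are made up for illustration; only #define, #if, and the # stringizing operator are standard.

```cpp
#include <string>

#define PI 3.14

// Two-level stringizing: XSTR(PI) first expands PI to 3.14,
// then STR turns those tokens into the string literal "3.14".
#define STR(x) #x
#define XSTR(x) STR(x)

// The preprocessor evaluates #if before the compiler proper runs,
// so only one of these two definitions survives into the .i file.
#if defined(USE_FAST_PATH)
inline const char* path_name() { return "fast"; }
#else
inline const char* path_name() { return "default"; }
#endif

// The text that replaced PI.
inline std::string pi_as_text() { return XSTR(PI); }

// A single # does not expand its argument, so this yields the macro's name.
inline std::string pi_macro_name() { return STR(PI); }
```

Compiling with -DUSE_FAST_PATH flips which path_name() the compiler ever gets to see, which is the same mechanism #ifdef-heavy headers rely on.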

Step 2: Compilation (`.i` → `.s`)

The compiler translates the preprocessed source into assembly code specific to your CPU architecture. This step involves:

Syntax checking
Type checking
Converting C/C++ code into a lower-level intermediate representation (IR)
Optimizing code
Generating assembly output

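Example (gcc -S -masm=intel file.i -o file.s), simplified; the exact output depends on the compiler version, target, and flags:

.LC0:
    .string "PI = %f"
main:
    push    rbp
    mov     rbp, rsp
    lea     rdi, .LC0[rip]              # first argument: the format string
    movsd   xmm0, QWORD PTR .LC1[rip]   # the double 3.14 (stored at .LC1, not shown)
    mov     eax, 1                      # one vector register used by the variadic call
    call    printf
    mov     eax, 0                      # return 0
    pop     rbp
    ret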

Step 3: Assembly (`.s` → `.o`)

The assembler takes the assembly output and converts it into machine code, producing an object file (.o or .obj).

Command: gcc -c file.s -o file.o

Step 4: Linking (`.o` → Executable)

The linker combines object files and libraries into a final executable. This step involves:

Resolving symbols
Address allocation
Merging multiple .o files
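The "address allocation" and "merging" steps can be sketched with a toy model. This is an assumption-heavy simplification, not how GNU ld or lld are implemented: ObjectFile, assign_addresses, and the single code section per file are all hypothetical, and real linkers also deal with alignment, multiple sections, and relocations.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical, heavily simplified view of one object file:
// the size of its code section and the offset of each symbol inside it.
struct ObjectFile {
    std::string name;
    std::size_t text_size;
    std::map<std::string, std::size_t> symbol_offsets;
};

// Lay the object files out one after another starting at `base`
// and return the final address of every symbol.
inline std::map<std::string, std::size_t>
assign_addresses(const std::vector<ObjectFile>& objects, std::size_t base) {
    std::map<std::string, std::size_t> addresses;
    std::size_t cursor = base;
    for (const ObjectFile& obj : objects) {
        for (const auto& entry : obj.symbol_offsets) {
            // final address = where this file's code landed + offset within it
            addresses[entry.first] = cursor + entry.second;
        }
        cursor += obj.text_size;  // the next object file starts right after
    }
    return addresses;
}
```

The point of the sketch is that an object file only knows offsets relative to itself; absolute addresses exist only once the linker has decided where each piece goes.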

The Linking Process

Linking can be static (everything included in the binary) or dynamic (uses shared libraries like .so or .dll).

Symbol Resolution

Each .o file contains symbols (functions, variables). The linker matches undefined symbols with their definitions.

Example:

main.o has:

extern printf
call printf

libc.a has:

printf:
    ...

The linker resolves printf by linking it to the actual function.
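Here is a toy version of that matching step in C++. Everything in it (SymbolTable, LinkResult, resolve_symbols) is invented for illustration, and it ignores things real linkers care about, such as archive search order and weak symbols. It does show where the two classic errors come from: a needed symbol with no definition, and a symbol with more than one.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for the symbol table of one object file or library.
struct SymbolTable {
    std::string file;
    std::set<std::string> defined;    // symbols this file provides
    std::set<std::string> undefined;  // symbols this file needs from elsewhere
};

struct LinkResult {
    std::map<std::string, std::string> definitions;  // symbol -> defining file
    std::set<std::string> unresolved;   // would be "undefined reference" errors
    std::set<std::string> duplicates;   // would be "multiple definition" errors
};

inline LinkResult resolve_symbols(const std::vector<SymbolTable>& inputs) {
    LinkResult result;
    // Pass 1: record where every symbol is defined, noting clashes.
    for (const SymbolTable& table : inputs) {
        for (const std::string& symbol : table.defined) {
            const bool inserted =
                result.definitions.emplace(symbol, table.file).second;
            if (!inserted) result.duplicates.insert(symbol);
        }
    }
    // Pass 2: every needed symbol must have a definition somewhere.
    for (const SymbolTable& table : inputs) {
        for (const std::string& symbol : table.undefined) {
            if (result.definitions.count(symbol) == 0) {
                result.unresolved.insert(symbol);
            }
        }
    }
    return result;
}
```

Feeding it a main.o that needs printf and a libc.a that defines printf leaves nothing unresolved; drop libc.a from the inputs and printf shows up in unresolved instead.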

Differences Between C and C++ Compilation

While C and C++ follow the same basic compilation steps, C++ adds complexity:

Name Mangling: C++ allows function overloading, requiring name transformations to differentiate functions.
Object Code Differences: C++ includes classes, virtual tables, and exceptions that C does not have.
Stronger Type Checking: C++ enforces stricter type checking than C.

Name Mangling Example

Consider:

// C++
void func(int);
void func(double);

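The compiler mangles them into something like this (GCC and Clang, which follow the Itanium C++ ABI; MSVC uses a different scheme):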

_Z4funci   ; func(int)
_Z4funcd   ; func(double)

Without mangling, both functions would collide in the symbol table!
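If you want to go the other way and turn a mangled name from a linker error back into something readable, GCC and Clang ship a demangler in <cxxabi.h>. The sketch below assumes one of those toolchains (it will not build with MSVC), and the demangle wrapper is just a convenience name chosen here.

```cpp
#include <cxxabi.h>   // GCC/Clang-specific header; not available with MSVC
#include <cstdlib>
#include <string>

// Turn an Itanium-ABI mangled name back into a readable signature.
// Returns an empty string if the name cannot be demangled.
inline std::string demangle(const char* mangled) {
    int status = 0;
    char* readable = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || readable == nullptr) return std::string();
    std::string result(readable);
    std::free(readable);  // __cxa_demangle hands back a malloc'd buffer
    return result;
}
```

With this, demangle("_Z4funci") gives back func(int) and demangle("_Z4funcd") gives back func(double), matching the two symbols above.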

How to Disable Name Mangling

Use extern "C":

extern "C" void myFunction();

This tells the compiler to use C-style naming, avoiding mangling.
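For more than one function, the block form is the usual idiom. The sketch below uses two hypothetical functions, c_add and c_square; the __cplusplus guard is the standard trick that lets the same declarations be read by both a C and a C++ compiler.

```cpp
// Declarations as they might appear in a header shared between C and C++.
// Inside the block, names get C linkage: the symbol is the plain function
// name, with no parameter types encoded into it.
#ifdef __cplusplus
extern "C" {
#endif

int c_add(int a, int b);
int c_square(int x);

#ifdef __cplusplus
}  // extern "C"
#endif

// Definitions; in a real project these would live in a separate .c or .cpp
// file. They keep the C linkage given by the declarations above.
int c_add(int a, int b) { return a + b; }
int c_square(int x) { return x * x; }
```

One trade-off to keep in mind: because the parameter types are no longer part of the symbol name, functions with C linkage cannot be overloaded.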

Key Ideas

Concept	Summary
Preprocessing	Expands macros, includes headers
Compilation	Converts code to assembly
Assembly	Translates assembly to machine code
Linking	Resolves symbols, merges `.o` files
Name Mangling	C++ changes function names for uniqueness
Static Linking	Includes everything in the final binary
Dynamic Linking	Uses external `.so` / `.dll` files

Basically…

Understanding how all this works is critical for debugging and optimization.

While C and C++ share similarities, C++’s name mangling and object-oriented features make its compilation more complex.

And it makes a newbie far more likely to end up staring at 98 pages of linker errors, wondering how to fix them :)

Tricky Linker Errors and How to Survive Them

Linker errors are like riddles from an evil wizard. Here are some of the most common ones and why they happen.

1. Undefined Reference to `someFunction()`

Why It Happens

You declared a function but forgot to define it.
You forgot to link against the correct library.
You’re compiling a C++ file but forgot to use extern "C" when linking with C code.

How to Fix It

Make sure the function is actually defined somewhere.
Check that you’re linking against the correct .lib, .a, or .so file.
If mixing C and C++, wrap C function declarations in extern "C" {}.

2. Multiple Definition of `someFunction()`

Why It Happens

You accidentally defined the same function in multiple .cpp files.
A header file that is included by several .cpp files contains a non-inline function definition instead of just a declaration.

How to Fix It

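Move function definitions out of header files and into .cpp files, leaving only declarations in the header.
If a definition has to stay in a header, mark it inline. Include guards (#ifndef HEADER_H … #define HEADER_H … #endif) or #pragma once are still worth having, but they only stop a header from being included twice in the same .cpp file; they don’t stop the same definition from landing in several object files.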
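A header laid out to avoid this error might look like the sketch below. The file names and the cube/square functions are hypothetical, and both "files" are shown in one listing for brevity.

```cpp
// ---- math_utils.h (hypothetical header) ----
#ifndef MATH_UTILS_H
#define MATH_UTILS_H

// Declaration only: safe to include from any number of .cpp files.
// The single definition belongs in exactly one .cpp file.
int cube(int x);

// Definition in a header: `inline` allows the same definition to appear in
// every translation unit that includes this header without a clash at link time.
inline int square(int x) { return x * x; }

#endif  // MATH_UTILS_H

// ---- math_utils.cpp (hypothetical source file) ----
int cube(int x) { return x * x * x; }
```

Remove the inline keyword from square, include the header from two .cpp files, and the "multiple definition" error comes right back, include guard or not.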

3. Cannot Open File `someLibrary.lib`

Why It Happens

You’re trying to link against a static library that doesn’t exist.
The library path isn’t set correctly.

How to Fix It

Double-check that the library file exists in the correct directory.
Update your linker settings to include the correct library path.

Bad Things That Can Happen

As compilers and operating systems have evolved, so have the ways in which we compile and link code. This has made linkers more complex, leading to some truly bizarre build failures.

1. Debug vs. Release Builds

Release builds are optimized, debug builds aren’t, and the two often use different runtime libraries and sometimes even different object layouts. If you mix them up, like linking a debug .lib to a release executable, you’re going to have a bad time.
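One way this bites, sketched with a made-up Buffer type: if a struct gains extra members in debug builds, code compiled with NDEBUG and code compiled without it disagree about the struct's size and layout. Nothing here is taken from a real library; it only illustrates the mechanism.

```cpp
#include <cstddef>

// Hypothetical library type whose layout depends on the build configuration.
struct Buffer {
    char* data;
#ifndef NDEBUG
    std::size_t debug_canary;  // extra bookkeeping, present in debug builds only
#endif
};

#ifndef NDEBUG
constexpr bool kDebugLayout = true;
#else
constexpr bool kDebugLayout = false;
#endif

// The size that code compiled with the *current* settings assumes for Buffer.
constexpr std::size_t buffer_size() { return sizeof(Buffer); }
```

A library built one way and an executable built the other way would each be internally consistent, yet pass Buffer objects to each other with different ideas of what is inside them.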

2. DLL Hell

If you’ve ever had to troubleshoot Windows DLL linking issues, you know why it’s called “DLL Hell.”

Version mismatches, missing symbols, and incorrect calling conventions can make your life miserable.

3. Static vs. Dynamic Linking Gone Wrong

Accidentally linking against the static version of a library when you meant to use the dynamic one?

Or vice versa?

Congratulations, you’ve just won a ticket to undefined behavior land.

4. The One Setting That Breaks Everything

One wrong setting in your build system can trigger hundreds of linker errors.

Wrong architecture?

Wrong calling convention?

Wrong runtime library?

Fun Times!

Key Ideas

Concept	Explanation
What does a linker do?	It combines compiled object files into a final executable.
Why do linker errors happen?	Missing definitions, incorrect linking settings, and library mismatches.
Common linker errors	Undefined references, multiple definitions, missing libraries.
Debug vs. Release builds	Mixing them can lead to linker nightmares.
Static vs. Dynamic linking	Getting this wrong can break everything.
