
OpenMP in a Nutshell

Hardcore Parallel Programming ...

Introduction

Ever written a C, C++, or Fortran program and thought, “Wow, this could run faster if I just threw more cores at it”?

That’s exactly what OpenMP was made for.

OpenMP (Open Multi-Processing) is a standard API that makes parallel programming easy in languages like C, C++, and Fortran. It allows you to take advantage of multi-core processors by adding just a few compiler directives (aka “magic comments”) to your code.


The History of OpenMP

OpenMP was first released in 1997 as a way to simplify parallel programming. Before OpenMP, writing multi-threaded code was a nightmare: you had to use low-level threading libraries like POSIX threads (pthreads) or Win32 threads, which meant managing every thread by hand.

Why Was OpenMP Created?

  • Writing parallel code was too complex, and not portable across different platforms.
  • Researchers and engineers needed an easy way to parallelize loops and computations.
  • It needed to work with existing C, C++, and Fortran codebases without major rewrites.

Key Innovations of OpenMP

Simple Parallelism → Just add a #pragma directive, and boom! Parallel code.
Portable and Scalable → Works on any modern CPU with multiple cores.
Automatic Thread Management → No need to manually create or join threads.
Fine-Grained Control → Can handle loop parallelism, task-based execution, and data-sharing policies.


OpenMP vs. Modern Parallel Computing Techniques

| Feature | OpenMP | Modern Equivalent |
|---------|--------|-------------------|
| Parallel Loops | ✅ Yes | ✅ CUDA, OpenCL, TBB |
| Shared Memory Model | ✅ Yes | ✅ POSIX Threads, C++ Threads |
| Automatic Thread Management | ✅ Yes | ✅ Java Threads, Python Multiprocessing |
| Fine-Grained Synchronization | ✅ Yes | ✅ MPI, C++ Concurrency |
| GPU Support | ⚠️ Since 4.0 (target offloading) | ✅ CUDA, OpenCL |

💡 Verdict: OpenMP is still one of the easiest ways to parallelize CPU-based programs.


OpenMP Syntax Table

| Concept | OpenMP Code | Equivalent in Pthreads / C++ |
|---------|-------------|------------------------------|
| Parallel Region | `#pragma omp parallel` | `std::thread` |
| Parallel for Loop | `#pragma omp parallel for` | `std::async` |
| Critical Section | `#pragma omp critical` | `std::mutex` |
| Atomic Operation | `#pragma omp atomic` | `std::atomic` |
| Reduction (Sum, Min, Max) | `#pragma omp parallel for reduction(+:sum)` | Manual loop with locks |

8 OpenMP Code Examples

1. Hello, World! (Parallel Execution)

#include <omp.h>
#include <stdio.h>

int main() {
    #pragma omp parallel
    {
        printf("Hello from thread %d\n", omp_get_thread_num());
    }
    return 0;
}

2. Parallel For Loop

#include <omp.h>
#include <stdio.h>

int main() {
    #pragma omp parallel for
    for (int i = 0; i < 10; i++) {
        printf("Iteration %d executed by thread %d\n", i, omp_get_thread_num());
    }
    return 0;
}

3. Setting the Number of Threads

#include <omp.h>
#include <stdio.h>

int main() {
    omp_set_num_threads(4);
    #pragma omp parallel
    {
        printf("Thread %d is running\n", omp_get_thread_num());
    }
    return 0;
}

4. Critical Section

#include <omp.h>
#include <stdio.h>

int main() {
    int count = 0;
    #pragma omp parallel
    {
        #pragma omp critical
        {
            count++;
            printf("Thread %d increased count to %d\n", omp_get_thread_num(), count);
        }
    }
    return 0;
}

5. Reduction (Sum Calculation)

#include <omp.h>
#include <stdio.h>

int main() {
    int sum = 0;
    #pragma omp parallel for reduction(+:sum)
    for (int i = 1; i <= 10; i++) {
        sum += i;
    }
    printf("Sum = %d\n", sum);
    return 0;
}

6. Barrier Synchronization

#include <omp.h>
#include <stdio.h>

int main() {
    #pragma omp parallel
    {
        printf("Before barrier - Thread %d\n", omp_get_thread_num());
        #pragma omp barrier
        printf("After barrier - Thread %d\n", omp_get_thread_num());
    }
    return 0;
}

7. Private Variables

#include <omp.h>
#include <stdio.h>

int main() {
    int x = 42;
    #pragma omp parallel private(x)
    {
        x = omp_get_thread_num();
        printf("Thread %d has x = %d\n", omp_get_thread_num(), x);
    }
    return 0;
}

8. Parallel Sections

#include <omp.h>
#include <stdio.h>

int main() {
    #pragma omp parallel sections
    {
        #pragma omp section
        printf("This is section 1\n");
        
        #pragma omp section
        printf("This is section 2\n");
    }
    return 0;
}

Key Takeaways

  • OpenMP makes parallel programming EASY compared to raw threads.
  • Works with C, C++, and Fortran without requiring major rewrites.
  • Still relevant today for multi-core CPU workloads.
