Nicholas Dodds

OpenCV CUDA Guide

Building OpenCV 4.12.0 with CUDA Support

A comprehensive guide for Linux systems with NVIDIA GPUs

Introduction

This guide walks through building OpenCV from source with CUDA and cuDNN acceleration on Ubuntu/Debian-based systems. This installation was originally performed on a server named "Serval" running Ubuntu 24.04 with an NVIDIA H200 GPU supporting CUDA architecture 9.0, but the process is generalizable to other Linux systems with NVIDIA GPUs.

System Requirements

  • Ubuntu 24.04 or later (or equivalent Debian-based distribution)
  • NVIDIA GPU with CUDA support
  • Sufficient disk space (~5GB for build files)
  • Root or sudo access for installation

Prerequisites

Before beginning, ensure you have the following installed on your system:

Required Software

  • CUDA Toolkit (version 12.0 or compatible with your GPU)
    Download from NVIDIA CUDA Downloads
  • cuDNN (version 8.9.2 or compatible)
    Download from NVIDIA cuDNN (requires free NVIDIA Developer account)
  • CMake (version 3.15 or later)
    Install via: sudo apt install cmake
  • GCC/G++ Compiler (GCC 12 recommended)
    Install via: sudo apt install gcc-12 g++-12
  • Python 3 with development headers
    Install via: sudo apt install python3 python3-dev python3-numpy
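
Before moving on, it can help to confirm that the command-line tools listed above are actually reachable. The following is a minimal sketch, assuming the tools are exposed under the command names nvcc, cmake, gcc-12, g++-12, and python3; the helper name missing_tools is hypothetical, and the check does not cover cuDNN or the development headers.

```python
import shutil

# Command names taken from the prerequisites above; adjust them for your setup.
REQUIRED_TOOLS = ["nvcc", "cmake", "gcc-12", "g++-12", "python3"]


def missing_tools(tools, which=shutil.which):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools(REQUIRED_TOOLS)
    if missing:
        print("Missing from PATH:", ", ".join(missing))
    else:
        print("All required tools found.")
```

The which parameter exists only so the lookup can be swapped out when testing the helper.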

Recommended Development Libraries

These libraries enable support for various image and video formats:

sudo apt install build-essential cmake git pkg-config \
    libgtk-3-dev libavcodec-dev libavformat-dev libswscale-dev \
    libv4l-dev libxvidcore-dev libx264-dev libjpeg-dev libpng-dev \
    libtiff-dev gfortran openexr libatlas-base-dev libtbb-dev \
    libdc1394-dev libopenexr-dev libgstreamer-plugins-base1.0-dev \
    libgstreamer1.0-dev libwebp-dev

Determining Your GPU Architecture

OpenCV's CUDA compilation requires specifying your GPU's compute capability. To find yours:

  1. Check your GPU model: nvidia-smi
  2. Look up your compute capability at NVIDIA's CUDA GPUs page
  3. Note the value (e.g., 8.6 for RTX 3090, 9.0 for H100/H200, 7.5 for RTX 2080)

Important: The build command in this tutorial uses CUDA_ARCH_BIN=9.0. You must replace this with your GPU's compute capability.
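
On recent drivers the lookup can often be skipped, because nvidia-smi may be able to report the compute capability directly. The sketch below assumes your nvidia-smi accepts the compute_cap query field (older drivers may not); the function names are hypothetical. If the query fails, the script prints nothing and the manual lookup above still applies.

```python
import subprocess


def parse_compute_caps(csv_text):
    """Parse 'name, compute_cap' CSV rows into (name, capability) pairs."""
    gpus = []
    for line in csv_text.splitlines():
        if not line.strip():
            continue
        # Split on the last comma so a comma inside the GPU name is harmless.
        name, _, cap = line.rpartition(",")
        gpus.append((name.strip(), cap.strip()))
    return gpus


def query_gpus():
    """Ask nvidia-smi for GPU names and compute capabilities; [] on failure."""
    cmd = ["nvidia-smi", "--query-gpu=name,compute_cap", "--format=csv,noheader"]
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    except (OSError, subprocess.CalledProcessError):
        return []
    return parse_compute_caps(result.stdout)


if __name__ == "__main__":
    for name, cap in query_gpus():
        print(f"{name}: use -D CUDA_ARCH_BIN={cap}")
```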

Build Instructions

1 Create Working Directory

mkdir opencv
cd opencv

2 Download Source Files

Download both the main OpenCV repository and the contributed modules:

curl -Lo opencv-4.12.0.tar.gz https://github.com/opencv/opencv/archive/4.12.0.tar.gz
curl -Lo opencv_contrib-4.12.0.tar.gz https://github.com/opencv/opencv_contrib/archive/4.12.0.tar.gz

3 Extract Archives

tar xf opencv-4.12.0.tar.gz
tar xf opencv_contrib-4.12.0.tar.gz

4 Create Build Directory

mkdir build
cd build

5 Configure with CMake

CMake is a build system generator that configures the compilation process. The following command sets up OpenCV to build with CUDA support, the contributed modules, and the cuDNN-backed DNN module, using GCC 12 as the compiler:

cmake -D OPENCV_EXTRA_MODULES_PATH=../opencv_contrib-4.12.0/modules \
    ../opencv-4.12.0 \
    -D WITH_CUDA=ON \
    -D WITH_CUDNN=ON \
    -D OPENCV_DNN_CUDA=ON \
    -D CMAKE_C_COMPILER=gcc-12 \
    -D CMAKE_CXX_COMPILER=g++-12 \
    -D CUDA_ARCH_BIN=9.0 \
    -D OPENCV_GENERATE_PKGCONFIG=ON

Key Configuration Options:

  • OPENCV_EXTRA_MODULES_PATH: Includes additional modules from opencv_contrib
  • WITH_CUDA=ON: Enables CUDA support
  • WITH_CUDNN=ON: Enables cuDNN acceleration for deep learning
  • OPENCV_DNN_CUDA=ON: Enables CUDA backend for DNN module
  • CMAKE_C_COMPILER / CMAKE_CXX_COMPILER: Specifies compiler versions
  • CUDA_ARCH_BIN: Replace 9.0 with your GPU's compute capability
  • OPENCV_GENERATE_PKGCONFIG=ON: Installs the opencv4.pc file that the pkg-config command in the C++ test relies on
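
Because CUDA_ARCH_BIN is the one value every reader has to change, it can be convenient to generate the configure command rather than edit it by hand. The sketch below assembles an argument list from the options listed above; the function name cmake_configure_args and its parameters are hypothetical, and the script only prints the command instead of running it.

```python
def cmake_configure_args(arch,
                         source_dir="../opencv-4.12.0",
                         contrib_modules="../opencv_contrib-4.12.0/modules",
                         compiler_version="12"):
    """Build the cmake argument list for the configure step."""
    options = {
        "OPENCV_EXTRA_MODULES_PATH": contrib_modules,
        "WITH_CUDA": "ON",
        "WITH_CUDNN": "ON",
        "OPENCV_DNN_CUDA": "ON",
        "CMAKE_C_COMPILER": f"gcc-{compiler_version}",
        "CMAKE_CXX_COMPILER": f"g++-{compiler_version}",
        "CUDA_ARCH_BIN": arch,  # your GPU's compute capability, e.g. "8.6"
    }
    args = ["cmake", source_dir]
    args += [f"-D{key}={value}" for key, value in options.items()]
    return args


if __name__ == "__main__":
    # Example: an RTX 3090 has compute capability 8.6.
    print(" ".join(cmake_configure_args("8.6")))
```

Passing the list to subprocess.run from inside the build directory would run the same configuration without shell quoting concerns.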

CMake will check your system and display a configuration summary. Review it to ensure CUDA and cuDNN are detected correctly.
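
As a complement to reading the summary, the options you passed can be cross-checked against CMakeCache.txt in the build directory, where CMake stores entries as NAME:TYPE=VALUE lines. This is a minimal sketch with hypothetical helper names. Note its limits: the cache records what was requested, so it catches a mistyped or forgotten option, but only the summary tells you whether CUDA and cuDNN were actually found.

```python
def parse_cmake_cache(text):
    """Map each cache entry name to its value (lines look like NAME:TYPE=VALUE)."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "//")):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if sep:
            entries[key.split(":", 1)[0]] = value
    return entries


def unexpected_settings(entries, expected):
    """Return {name: actual_value} for entries that differ from expected."""
    return {name: entries.get(name)
            for name, wanted in expected.items()
            if entries.get(name) != wanted}


# Values from the configure command in this guide; change CUDA_ARCH_BIN to yours.
EXPECTED = {"WITH_CUDA": "ON", "WITH_CUDNN": "ON",
            "OPENCV_DNN_CUDA": "ON", "CUDA_ARCH_BIN": "9.0"}

if __name__ == "__main__":
    try:
        with open("CMakeCache.txt") as cache:
            problems = unexpected_settings(parse_cmake_cache(cache.read()), EXPECTED)
        print(problems or "All expected options are recorded.")
    except FileNotFoundError:
        print("CMakeCache.txt not found; run this from the build directory.")
```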

6 Compile OpenCV

Build OpenCV in parallel. The example below runs 60 jobs, which suited the original server; adjust -j 60 to match your system's core count:

cmake --build . -j 60

This step will take 15-60 minutes depending on your system. You can use -j $(nproc) to automatically use all available cores.
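
If you drive the build from a script, the job count can be computed the same way nproc does. The sketch below is one way to do that; the function names are hypothetical. On Linux it counts the CPUs the current process is allowed to use, which can be fewer than the machine has (for example inside a container or under taskset).

```python
import os


def default_jobs():
    """Number of CPUs this process may use, similar to `nproc` on Linux."""
    try:
        return len(os.sched_getaffinity(0))
    except AttributeError:  # sched_getaffinity is not available on every platform
        return os.cpu_count() or 1


def build_command(jobs=None):
    """Return the `cmake --build` command with an explicit job count."""
    return ["cmake", "--build", ".", "-j", str(jobs or default_jobs())]


if __name__ == "__main__":
    print(" ".join(build_command()))
```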

7 Install

Install OpenCV system-wide to /usr/local:

sudo cmake --install .

This will place:

  • Binaries in /usr/local/bin
  • Libraries in /usr/local/lib
  • Headers in /usr/local/include/opencv4
  • Python module in /usr/local/lib/python3.12/dist-packages/cv2 (the directory name follows your system's Python version)

8 Update Library Path

Refresh the dynamic linker cache so the system can find the newly installed OpenCV libraries:

sudo ldconfig

Verification

Test Python Installation

python3 -c "import cv2; print(f'OpenCV version: {cv2.__version__}')"

Expected output: OpenCV version: 4.12.0

Verify CUDA Support

python3 -c "import cv2; print(f'CUDA devices: {cv2.cuda.getCudaEnabledDeviceCount()}')"

This should print a number greater than 0 if CUDA support is properly enabled.
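
A device count above zero shows that the CUDA runtime is reachable, but not that data can move to and from the GPU. As a slightly stronger check, the sketch below uploads a small array into a cv2.cuda_GpuMat and downloads it again. The function name gpu_roundtrip_ok is hypothetical, and the function returns None rather than failing when cv2, NumPy, or a CUDA device is unavailable.

```python
def gpu_roundtrip_ok():
    """Upload an array to the GPU and download it again.

    Returns True/False for the comparison, or None if the check cannot run
    (cv2 or NumPy missing, or no CUDA device visible).
    """
    try:
        import cv2
        import numpy as np
    except ImportError:
        return None
    if cv2.cuda.getCudaEnabledDeviceCount() < 1:
        return None
    host = np.arange(16, dtype=np.uint8).reshape(4, 4)
    device = cv2.cuda_GpuMat()
    device.upload(host)  # host memory -> GPU memory
    return bool(np.array_equal(device.download(), host))  # GPU -> host


if __name__ == "__main__":
    print("GPU round trip:", gpu_roundtrip_ok())
```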

Check Build Information

python3 -c "import cv2; print(cv2.getBuildInformation())" | grep -A 5 "NVIDIA CUDA"

This should show your CUDA version and GPU architecture.
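
The same information can be pulled out programmatically, which is handy in a deployment check. The sketch below assumes the build summary contains "key: value" lines labelled NVIDIA CUDA, NVIDIA GPU arch, and cuDNN; those labels are an assumption about the summary layout, and the helper name parse_build_info is hypothetical.

```python
def parse_build_info(text, keys=("NVIDIA CUDA", "NVIDIA GPU arch", "cuDNN")):
    """Pick the 'key: value' lines for the given keys out of the build summary."""
    found = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() in keys:
            found[key.strip()] = value.strip()
    return found


if __name__ == "__main__":
    try:
        import cv2
    except ImportError:
        print("cv2 is not importable; see the troubleshooting section.")
    else:
        for key, value in parse_build_info(cv2.getBuildInformation()).items():
            print(f"{key}: {value}")
```

A missing key in the result means the corresponding line was not present in the summary, which is worth investigating before relying on GPU acceleration.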

Test C++ Installation

Create a simple test file test_opencv.cpp:

#include <opencv2/opencv.hpp>
#include <opencv2/core/cuda.hpp>  // declares cv::cuda::getCudaEnabledDeviceCount
#include <iostream>

int main() {
    std::cout << "OpenCV version: " << CV_VERSION << std::endl;
    std::cout << "CUDA devices: " << cv::cuda::getCudaEnabledDeviceCount() << std::endl;
    return 0;
}

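Compile and run. The pkg-config call needs an opencv4.pc file, which OpenCV 4 installs only when it was configured with -D OPENCV_GENERATE_PKGCONFIG=ON: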

g++ test_opencv.cpp -o test_opencv $(pkg-config --cflags --libs opencv4)
./test_opencv

Troubleshooting

CUDA Not Detected During CMake Configuration

Symptoms: CMake reports "CUDA: NO" in configuration summary

Solutions:

  • Verify CUDA installation: nvcc --version
  • Ensure /usr/local/cuda/bin is in your PATH
  • Check that CUDA libraries are in /usr/local/cuda/lib64
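
The last two checks can be sketched as a small script. It assumes the toolkit lives under /usr/local/cuda, as in the bullets above; the function name cuda_path_report is hypothetical.

```python
import os


def cuda_path_report(path_env, cuda_home="/usr/local/cuda", isdir=os.path.isdir):
    """Report whether the CUDA bin directory is on PATH and lib64 exists."""
    bin_dir = os.path.join(cuda_home, "bin")
    # Ignore empty PATH entries and trailing slashes when comparing.
    entries = [entry.rstrip("/") for entry in path_env.split(os.pathsep) if entry]
    return {
        "bin_on_path": bin_dir in entries,
        "lib64_exists": isdir(os.path.join(cuda_home, "lib64")),
    }


if __name__ == "__main__":
    print(cuda_path_report(os.environ.get("PATH", "")))
```

If bin_on_path is False, adding the directory to PATH in your shell profile and re-running the configure step is usually the next thing to try.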

cuDNN Not Found

Symptoms: CMake reports "cuDNN: NO"

Solutions:

  • Verify cuDNN installation in /usr/local/cuda/include and /usr/local/cuda/lib64
  • Ensure your cuDNN package was built for the major version of your CUDA toolkit (for example, a cuDNN build for CUDA 12.x with CUDA 12.0)
  • Set CUDNN_INCLUDE_DIR and CUDNN_LIBRARY explicitly in CMake

Driver Version Mismatch

Symptoms: Runtime errors mentioning CUDA driver version

Solutions:

  • Check driver compatibility: nvidia-smi shows the installed driver version and the highest CUDA version that driver supports
  • Ensure NVIDIA driver version supports your CUDA toolkit version
  • Update driver: sudo apt install nvidia-driver-xxx (replace xxx with appropriate version)
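
The comparison in the first two bullets can be sketched in code. The parser below assumes the nvidia-smi banner contains fields labelled "Driver Version:" and "CUDA Version:"; the function names are hypothetical, and the comparison is deliberately conservative (it simply requires the driver's CUDA version to be at least the toolkit's).

```python
import re
import subprocess


def parse_smi_header(text):
    """Extract (driver_version, cuda_version) strings from nvidia-smi output."""
    driver = re.search(r"Driver Version:\s*([\d.]+)", text)
    cuda = re.search(r"CUDA Version:\s*([\d.]+)", text)
    return (driver.group(1) if driver else None,
            cuda.group(1) if cuda else None)


def driver_supports(driver_cuda, toolkit_cuda):
    """True if the driver's maximum CUDA version is at least the toolkit's."""
    def as_tuple(version):
        return tuple(int(part) for part in version.split("."))
    return as_tuple(driver_cuda) >= as_tuple(toolkit_cuda)


if __name__ == "__main__":
    try:
        output = subprocess.run(["nvidia-smi"], capture_output=True, text=True).stdout
    except OSError:
        output = ""
    driver, cuda = parse_smi_header(output)
    if cuda is None:
        print("Could not read versions from nvidia-smi.")
    else:
        # "12.0" is the toolkit version used in this guide; substitute your own.
        print(f"Driver {driver}, supports CUDA 12.0: {driver_supports(cuda, '12.0')}")
```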

Import Error in Python

Symptoms: ImportError: libopencv_core.so.412: cannot open shared object file

Solutions:

  • Run sudo ldconfig to update library cache
  • Verify that /usr/local/lib is listed in one of the files under /etc/ld.so.conf.d/
  • Check Python path: python3 -c "import sys; print(sys.path)"
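
The linker configuration check can be automated with a short script. This is a minimal sketch that scans the .conf files under /etc/ld.so.conf.d/ for an uncommented /usr/local/lib entry; the function name lib_dir_configured is hypothetical, and the script does not follow include directives.

```python
import glob


def lib_dir_configured(conf_text, lib_dir="/usr/local/lib"):
    """True if lib_dir appears as an uncommented line in ld.so.conf-style text."""
    for line in conf_text.splitlines():
        entry = line.split("#", 1)[0].strip().rstrip("/")
        if entry == lib_dir:
            return True
    return False


if __name__ == "__main__":
    hits = []
    for conf in sorted(glob.glob("/etc/ld.so.conf.d/*.conf")):
        try:
            with open(conf) as handle:
                if lib_dir_configured(handle.read()):
                    hits.append(conf)
        except OSError:
            continue  # unreadable file; skip it
    print("Configured in:", ", ".join(hits) if hits else "no file found")
```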

Compile Errors During Build

Symptoms: Build fails with compiler errors

Solutions:

  • Ensure all prerequisites are installed
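  • Try a different GCC version (9, 10, 11, or 12); the host compiler must be one that your CUDA toolkit supports
  • Check available disk space (build requires ~5GB)
  • Clear the build directory and reconfigure, running this only from inside the build directory: rm -rf * && cmake ...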
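
The disk space check is easy to script before starting a long build. The sketch below uses the ~5GB figure from this guide as its default threshold; the function names are hypothetical.

```python
import shutil


def free_gb(path="."):
    """Free space on the filesystem containing path, in gigabytes (10**9 bytes)."""
    return shutil.disk_usage(path).free / 1e9


def enough_space(path=".", needed_gb=5.0):
    """True if at least needed_gb gigabytes are free at path."""
    return free_gb(path) >= needed_gb


if __name__ == "__main__":
    print(f"{free_gb():.1f} GB free; enough for the build: {enough_space()}")
```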

Build Configuration Details

The installation includes the following key features:

Enabled Modules: Core CV, DNN with CUDA, CUDA-accelerated image processing, feature detection, object detection, video I/O, and contributed modules (aruco, face, text recognition, etc.)

Image Format Support: JPEG, PNG, TIFF, WebP, JPEG 2000, OpenEXR, GIF

Video Support: FFmpeg, GStreamer, V4L/V4L2

GUI Backend: GTK3

Python Version: 3.12 (matches system Python at /usr/bin/python3)

Notable Disabled Features: Non-free algorithms (patent-encumbered), Java bindings, MATLAB bindings

If you require any disabled features, you can rebuild with additional CMake flags. Refer to the OpenCV CMake options documentation for details.

Additional Resources

This tutorial is based on a successful installation performed on Ubuntu 24.04 with CUDA 12.0, cuDNN 8.9.2, and an NVIDIA H200 GPU (compute capability 9.0).