Building OpenCV 4.12.0 with CUDA Support
A comprehensive guide for Linux systems with NVIDIA GPUs
Introduction
This guide walks through building OpenCV from source with CUDA and cuDNN acceleration on Ubuntu/Debian-based systems. This installation was originally performed on a server named "Serval" running Ubuntu 24.04 with an NVIDIA H200 GPU supporting CUDA architecture 9.0, but the process is generalizable to other Linux systems with NVIDIA GPUs.
System Requirements
- Ubuntu 24.04 or later (or equivalent Debian-based distribution)
- NVIDIA GPU with CUDA support
- Sufficient disk space (~5GB for build files)
- Root or sudo access for installation
Prerequisites
Before beginning, ensure you have the following installed on your system:
Required Software
- CUDA Toolkit (version 12.0 or compatible with your GPU)
  Download from NVIDIA CUDA Downloads
- cuDNN (version 8.9.2 or compatible)
  Download from NVIDIA cuDNN (requires free NVIDIA Developer account)
- CMake (version 3.15 or later)
  Install via: sudo apt install cmake
- GCC/G++ Compiler (GCC 12 recommended)
  Install via: sudo apt install gcc-12 g++-12
- Python 3 with development headers
  Install via: sudo apt install python3 python3-dev python3-numpy
Recommended Development Libraries
These libraries enable support for various image and video formats:
sudo apt install build-essential cmake git pkg-config \
libgtk-3-dev libavcodec-dev libavformat-dev libswscale-dev \
libv4l-dev libxvidcore-dev libx264-dev libjpeg-dev libpng-dev \
libtiff-dev gfortran openexr libatlas-base-dev libtbb-dev \
libdc1394-dev libopenexr-dev libgstreamer-plugins-base1.0-dev \
libgstreamer1.0-dev libwebp-dev
Determining Your GPU Architecture
OpenCV's CUDA compilation requires specifying your GPU's compute capability. To find yours:
- Check your GPU model: nvidia-smi
- Look up your compute capability at NVIDIA's CUDA GPUs page
- Note the value (e.g., 8.6 for RTX 3090, 9.0 for H100/H200, 7.5 for RTX 2080)
Important: the build command in this guide uses CUDA_ARCH_BIN=9.0. You must replace this with your GPU's compute capability.
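As an alternative to the manual lookup, recent NVIDIA drivers can report the compute capability directly. The following is a minimal sketch, assuming your nvidia-smi supports the compute_cap query field (older drivers may not); the function names are illustrative and not part of any tool.

```python
import shutil
import subprocess
from typing import List


def parse_compute_caps(nvidia_smi_output: str) -> List[str]:
    """Return unique values that look like '<major>.<minor>', in order of appearance."""
    caps = []
    for line in nvidia_smi_output.splitlines():
        value = line.strip()
        parts = value.split(".")
        if len(parts) == 2 and all(p.isdigit() for p in parts) and value not in caps:
            caps.append(value)
    return caps


def detect_compute_caps() -> List[str]:
    """Ask nvidia-smi for the compute capability of each GPU; empty list on failure."""
    if shutil.which("nvidia-smi") is None:
        return []
    # Assumption: the driver understands the 'compute_cap' query field.
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=compute_cap", "--format=csv,noheader"],
        capture_output=True, text=True, check=False,
    )
    return parse_compute_caps(result.stdout) if result.returncode == 0 else []


if __name__ == "__main__":
    caps = detect_compute_caps()
    if caps:
        for cap in caps:
            print(f"Detected compute capability {cap}: use -D CUDA_ARCH_BIN={cap}")
    else:
        print("Could not detect a compute capability; look it up manually.")
```

If the query fails or prints nothing, fall back to the lookup steps listed above.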
Build Instructions
1 Create Working Directory
mkdir opencv
cd opencv
2 Download Source Files
Download both the main OpenCV repository and the contributed modules:
curl -Lo opencv-4.12.0.tar.gz https://github.com/opencv/opencv/archive/4.12.0.tar.gz
curl -Lo opencv_contrib-4.12.0.tar.gz https://github.com/opencv/opencv_contrib/archive/4.12.0.tar.gz
3 Extract Archives
tar xf opencv-4.12.0.tar.gz
tar xf opencv_contrib-4.12.0.tar.gz
4 Create Build Directory
mkdir build
cd build
5 Configure with CMake
CMake is a build system generator that configures the compilation process. The following command sets up OpenCV to build with CUDA support, contributed modules, and optimizations:
cmake -DOPENCV_EXTRA_MODULES_PATH=../opencv_contrib-4.12.0/modules \
../opencv-4.12.0 \
-D WITH_CUDA=ON \
-D WITH_CUDNN=ON \
-D OPENCV_DNN_CUDA=ON \
-D CMAKE_C_COMPILER=gcc-12 \
-D CMAKE_CXX_COMPILER=g++-12 \
-D CUDA_ARCH_BIN=9.0
- OPENCV_EXTRA_MODULES_PATH: Includes additional modules from opencv_contrib
- WITH_CUDA=ON: Enables CUDA support
- WITH_CUDNN=ON: Enables cuDNN acceleration for deep learning
- OPENCV_DNN_CUDA=ON: Enables CUDA backend for DNN module
- CMAKE_C_COMPILER / CMAKE_CXX_COMPILER: Specifies compiler versions
- CUDA_ARCH_BIN: Replace 9.0 with your GPU's compute capability
CMake will check your system and display a configuration summary. Review it to ensure CUDA and cuDNN are detected correctly.
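If the summary scrolls past too quickly, the options can also be read back from the CMakeCache.txt file that CMake writes into the build directory. The sketch below parses that file's KEY:TYPE=VALUE lines and checks the flags used in this guide. It only confirms which options reached CMake; the configuration summary remains the place to confirm that CUDA and cuDNN were actually detected. The function names are illustrative.

```python
from pathlib import Path
from typing import Dict, List


def read_cmake_cache(text: str) -> Dict[str, str]:
    """Parse CMakeCache.txt content ('KEY:TYPE=VALUE' lines) into a dict."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        # The cache uses both '#' and '//' for comment lines.
        if not line or line.startswith(("#", "//")):
            continue
        key_and_type, sep, value = line.partition("=")
        if sep:
            entries[key_and_type.split(":", 1)[0]] = value
    return entries


def check_cuda_options(cache: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means the flags look as expected."""
    problems = []
    for flag in ("WITH_CUDA", "WITH_CUDNN", "OPENCV_DNN_CUDA"):
        if cache.get(flag, "").upper() != "ON":
            problems.append(f"{flag} is {cache.get(flag, 'not set')!r}, expected ON")
    if not cache.get("CUDA_ARCH_BIN"):
        problems.append("CUDA_ARCH_BIN is not set")
    return problems


if __name__ == "__main__":
    cache_file = Path("CMakeCache.txt")  # run this from the build directory
    if cache_file.exists():
        issues = check_cuda_options(read_cmake_cache(cache_file.read_text()))
        print("\n".join(issues) if issues else "CUDA-related options look as expected.")
    else:
        print("No CMakeCache.txt here; run cmake first.")
```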
6 Compile OpenCV
Build OpenCV in parallel. The command below runs 60 jobs; adjust -j 60 to match your system's core count:
cmake --build . -j 60
This step will take 15-60 minutes depending on your system. You can use -j $(nproc) to automatically use all available cores.
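On machines with many cores but limited RAM, running one job per core can exhaust memory. The sketch below picks a job count bounded by both cores and memory. The budget of 2 GiB per job is an assumption made for illustration, not a figure from OpenCV or NVIDIA; tune it for your system.

```python
import os


def pick_jobs(cores: int, mem_gib: float, gib_per_job: float = 2.0) -> int:
    """Return a -j value limited by core count and by an assumed per-job memory budget."""
    by_memory = int(mem_gib // gib_per_job)
    return max(1, min(cores, by_memory))


if __name__ == "__main__":
    cores = os.cpu_count() or 1
    # Total physical memory in GiB (these sysconf names are available on Linux).
    mem_gib = os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE") / 2**30
    print(f"cmake --build . -j {pick_jobs(cores, mem_gib)}")
```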
7 Install
Install OpenCV system-wide to /usr/local:
sudo cmake --install .
This will place:
- Binaries in /usr/local/bin
- Libraries in /usr/local/lib
- Headers in /usr/local/include/opencv4
- Python module in /usr/local/lib/python3.12/dist-packages/cv2
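If another OpenCV package is also present (for example one installed with pip), Python may load that copy instead of this build. A minimal sketch for checking which file Python would import, without actually importing it; the helper name is illustrative:

```python
import importlib.util
from typing import Optional


def module_location(name: str) -> Optional[str]:
    """Return the file Python would load `name` from, or None if it is not importable."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


if __name__ == "__main__":
    location = module_location("cv2")
    if location is None:
        print("cv2 is not on the Python path")
    else:
        print(f"cv2 would be loaded from: {location}")
```

A path under /usr/local/lib suggests the freshly installed build is the one being picked up.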
8 Update Library Path
Ensure the system can find the OpenCV libraries:
sudo ldconfig
Verification
Test Python Installation
python3 -c "import cv2; print(f'OpenCV version: {cv2.__version__}')"
Expected output: OpenCV version: 4.12.0
Verify CUDA Support
python3 -c "import cv2; print(f'CUDA devices: {cv2.cuda.getCudaEnabledDeviceCount()}')"
This should print a number greater than 0 if CUDA support is properly enabled.
Check Build Information
python3 -c "import cv2; print(cv2.getBuildInformation())" | grep -A 5 "NVIDIA CUDA"
This should show your CUDA version and GPU architecture.
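For scripted checks, the same section can be extracted in Python rather than with grep. The sketch below assumes the build information lists the CUDA entry as a "NVIDIA CUDA:" line followed by more deeply indented sub-entries; the SAMPLE text is illustrative, not captured output, and the exact layout may differ between OpenCV versions.

```python
from typing import Dict

# Illustrative text only; real output comes from cv2.getBuildInformation().
SAMPLE = """
  NVIDIA CUDA:                   YES (ver 12.0, CUFFT CUBLAS)
    NVIDIA GPU arch:             90
    NVIDIA PTX archs:

  cuDNN:                         YES (ver 8.9.2)
"""


def parse_build_section(build_info: str, header: str = "NVIDIA CUDA") -> Dict[str, str]:
    """Collect the header line and the indented 'key: value' lines that follow it."""
    lines = build_info.splitlines()
    section = {}
    for index, line in enumerate(lines):
        if not line.strip().startswith(header + ":"):
            continue
        header_indent = len(line) - len(line.lstrip())
        section[header] = line.split(":", 1)[1].strip()
        for entry in lines[index + 1:]:
            indent = len(entry) - len(entry.lstrip())
            # Stop at a blank line or at the next entry on the header's level.
            if not entry.strip() or indent <= header_indent:
                break
            key, _, value = entry.strip().partition(":")
            section[key] = value.strip()
        break
    return section


if __name__ == "__main__":
    try:
        import cv2
        text = cv2.getBuildInformation()
    except ImportError:
        text = SAMPLE  # fall back to the illustrative sample
    for key, value in parse_build_section(text).items():
        print(f"{key}: {value}")
```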
Test C++ Installation
Create a simple test file test_opencv.cpp:
#include <opencv2/opencv.hpp>
#include <opencv2/cudaarithm.hpp>
#include <iostream>
int main() {
std::cout << "OpenCV version: " << CV_VERSION << std::endl;
std::cout << "CUDA devices: " << cv::cuda::getCudaEnabledDeviceCount() << std::endl;
return 0;
}
Compile and run:
g++ test_opencv.cpp -o test_opencv $(pkg-config --cflags --libs opencv4)
./test_opencv
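Counting devices does not exercise the GPU itself. As a further check from Python, the sketch below uploads a small array to GPU memory and downloads it again. It assumes the Python bindings of this build expose cv2.cuda_GpuMat, and it returns None rather than failing when OpenCV, NumPy, or a CUDA device is unavailable. The function name is illustrative.

```python
from typing import Optional


def gpu_roundtrip_ok() -> Optional[bool]:
    """Upload an array to the GPU and download it; None means the check could not run."""
    try:
        import cv2
        import numpy as np
    except ImportError:
        return None
    cuda = getattr(cv2, "cuda", None)
    if cuda is None or cuda.getCudaEnabledDeviceCount() < 1:
        return None
    host = np.arange(12, dtype=np.uint8).reshape(3, 4)
    gpu = cv2.cuda_GpuMat()
    gpu.upload(host)  # host -> device copy
    return bool(np.array_equal(gpu.download(), host))  # device -> host copy


if __name__ == "__main__":
    result = gpu_roundtrip_ok()
    if result is None:
        print("Skipped: OpenCV with a usable CUDA device was not found")
    else:
        print("GPU round trip OK" if result else "GPU round trip returned different data")
```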
Troubleshooting
CUDA Not Detected During CMake Configuration
Symptoms: CMake reports "CUDA: NO" in configuration summary
Solutions:
- Verify CUDA installation: nvcc --version
- Ensure /usr/local/cuda/bin is in your PATH
- Check that CUDA libraries are in /usr/local/cuda/lib64
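These three checks can be run in one pass. A minimal sketch, assuming the toolkit lives under /usr/local/cuda as in this guide (pass a different path if yours is versioned, such as a cuda-12.x directory); the function name and report keys are illustrative:

```python
import os
import shutil
from pathlib import Path
from typing import Dict


def cuda_toolkit_report(cuda_home: str = "/usr/local/cuda") -> Dict[str, bool]:
    """Report whether nvcc is reachable and the expected toolkit directories exist."""
    bin_dir = str(Path(cuda_home) / "bin")
    path_entries = [p.rstrip("/") for p in os.environ.get("PATH", "").split(os.pathsep)]
    return {
        "nvcc_on_path": shutil.which("nvcc") is not None,
        "cuda_bin_in_path": bin_dir in path_entries,
        "lib64_exists": (Path(cuda_home) / "lib64").is_dir(),
    }


if __name__ == "__main__":
    for check, passed in cuda_toolkit_report().items():
        print(f"{check}: {'ok' if passed else 'MISSING'}")
```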
cuDNN Not Found
Symptoms: CMake reports "cuDNN: NO"
Solutions:
- Verify cuDNN installation in /usr/local/cuda/include and /usr/local/cuda/lib64
- Ensure cuDNN version matches CUDA toolkit version
- Set CUDNN_INCLUDE_DIR and CUDNN_LIBRARY explicitly in CMake
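To see which cuDNN version is actually installed, the version macros in its header can be read directly. This sketch assumes cuDNN 8 or later, which ships a cudnn_version.h defining CUDNN_MAJOR, CUDNN_MINOR, and CUDNN_PATCHLEVEL, and that the header sits in /usr/local/cuda/include as described above; package-manager installs may place it elsewhere.

```python
import re
from pathlib import Path
from typing import Optional


def parse_cudnn_version(header_text: str) -> Optional[str]:
    """Extract 'major.minor.patch' from the contents of cudnn_version.h, or None."""
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(rf"^\s*#\s*define\s+{name}\s+(\d+)", header_text, re.MULTILINE)
        if match is None:
            return None
        parts.append(match.group(1))
    return ".".join(parts)


if __name__ == "__main__":
    header = Path("/usr/local/cuda/include/cudnn_version.h")  # assumed location
    if header.exists():
        print(f"cuDNN version: {parse_cudnn_version(header.read_text())}")
    else:
        print(f"{header} not found; pass CUDNN_INCLUDE_DIR to CMake if it is elsewhere")
```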
Driver Version Mismatch
Symptoms: Runtime errors mentioning CUDA driver version
Solutions:
- Check driver compatibility: nvidia-smi shows the driver version and the highest CUDA version that driver supports
- Ensure NVIDIA driver version supports your CUDA toolkit version
- Update driver: sudo apt install nvidia-driver-xxx (replace xxx with appropriate version)
Import Error in Python
Symptoms: ImportError: libopencv_core.so.4.12: cannot open shared object file
Solutions:
- Run sudo ldconfig to update library cache
- Verify /usr/local/lib is in /etc/ld.so.conf.d/ configuration
- Check Python path: python3 -c "import sys; print(sys.path)"
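The second check can be automated by scanning the linker configuration directory for the library path. A minimal sketch that reads every .conf file in /etc/ld.so.conf.d and collects the directories listed there; the function name is illustrative:

```python
from pathlib import Path
from typing import Set


def dirs_listed_in_ld_conf(conf_dir: str = "/etc/ld.so.conf.d") -> Set[str]:
    """Return the library directories named in the *.conf files under conf_dir."""
    listed = set()
    for conf in sorted(Path(conf_dir).glob("*.conf")):
        for line in conf.read_text().splitlines():
            entry = line.split("#", 1)[0].strip()  # drop comments
            if entry and not entry.startswith("include"):
                listed.add(entry.rstrip("/"))
    return listed


if __name__ == "__main__":
    if "/usr/local/lib" in dirs_listed_in_ld_conf():
        print("/usr/local/lib is configured; run sudo ldconfig to refresh the cache")
    else:
        print("/usr/local/lib is not listed in /etc/ld.so.conf.d")
```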
Compile Errors During Build
Symptoms: Build fails with compiler errors
Solutions:
- Ensure all prerequisites are installed
- Try using a different GCC version (9, 10, 11, or 12)
- Check available disk space (build requires ~5GB)
- Clear build directory and reconfigure (run from inside the build directory only): rm -rf * && cmake ...
Build Configuration Details
The installation includes the following key features:
Enabled Modules: Core CV, DNN with CUDA, CUDA-accelerated image processing, feature detection, object detection, video I/O, and contributed modules (aruco, face, text recognition, etc.)
Image Format Support: JPEG, PNG, TIFF, WebP, JPEG 2000, OpenEXR, GIF
Video Support: FFmpeg, GStreamer, V4L/V4L2
GUI Backend: GTK3
Python Version: 3.12 (matches system Python at /usr/bin/python3)
Notable Disabled Features: Non-free algorithms (patent-encumbered), Java bindings, MATLAB bindings
If you require any disabled features, you can rebuild with additional CMake flags. Refer to the OpenCV CMake options documentation for details.
Additional Resources
This tutorial is based on a successful installation performed on Ubuntu 24.04 with CUDA 12.0, cuDNN 8.9.2, and an NVIDIA H200 GPU (compute capability 9.0).