TensorFlow failing to detect a GPU is one of the most common setup problems we see in ML engineering. The symptom is consistent: training runs, but only on CPU, silently. The causes are specific and diagnosable. This article covers the verification commands, the common failure modes, and a systematic diagnostic approach to get from "TensorFlow can't find my GPU" to a working configuration.

## Verifying GPU Detection

The definitive check is `tf.config.list_physical_devices`:

```python
import tensorflow as tf

gpus = tf.config.list_physical_devices('GPU')
print(f"GPUs available: {len(gpus)}")
for gpu in gpus:
    print(f"  {gpu}")
```

If this returns an empty list, TensorFlow cannot see any GPU. If it returns GPU devices, TensorFlow has successfully initialized the CUDA runtime and detected hardware.

A secondary check that also shows TensorFlow's CUDA build configuration:

```python
print(tf.test.is_built_with_cuda())   # True if TF was built with CUDA support
print(tf.test.is_gpu_available())     # Deprecated but still informative
print(tf.sysconfig.get_build_info())  # CUDA and cuDNN versions TF was built against
```

To confirm actual GPU execution (not just detection):

```python
import tensorflow as tf

with tf.device('/GPU:0'):
    a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
    b = tf.constant([[5.0, 6.0], [7.0, 8.0]])
    c = tf.matmul(a, b)
print(c)
```

If this executes without error and doesn't fall back to CPU, GPU execution is confirmed.

## Common Failure Modes

### 1. CUDA Version Mismatch

This is the most frequent cause. TensorFlow has strict CUDA version requirements: a TF binary built against CUDA 11.8 will not work with CUDA 12.x installed, and vice versa.

Check what TensorFlow expects:

```python
import tensorflow as tf

build_info = tf.sysconfig.get_build_info()
print(f"CUDA version TF built with: {build_info['cuda_version']}")
print(f"cuDNN version TF built with: {build_info['cudnn_version']}")
```

Check what's installed:

```bash
nvcc --version                   # CUDA toolkit version
nvidia-smi                       # Driver version and CUDA compatibility
cat /usr/local/cuda/version.txt  # Installed CUDA version
ls /usr/local/cuda-*/            # All installed CUDA versions
```

TensorFlow CUDA compatibility matrix (key versions):

| TensorFlow Version | Python   | CUDA | cuDNN |
|--------------------|----------|------|-------|
| 2.13               | 3.8–3.11 | 11.8 | 8.6   |
| 2.14               | 3.9–3.11 | 11.8 | 8.7   |
| 2.15               | 3.9–3.11 | 12.2 | 8.9   |
| 2.16               | 3.9–3.12 | 12.3 | 8.9   |
| 2.17               | 3.9–3.12 | 12.3 | 8.9   |

If there's a mismatch, either install the TF version matching your CUDA, or install the CUDA version matching your TF. The simplest resolution on Linux is installing `tensorflow[and-cuda]` with pip, which pulls in the correct CUDA libraries automatically for TF 2.12+:

```bash
pip install tensorflow[and-cuda]
```

### 2. Driver Version Too Old

The NVIDIA driver must be new enough to support the installed CUDA toolkit. CUDA 12.x requires driver ≥ 525.85 on Linux. Installing a new CUDA toolkit without updating the driver is a common mistake.

Check the driver version:

```bash
nvidia-smi --query-gpu=driver_version --format=csv,noheader
```

Minimum driver versions:

| CUDA Version | Minimum Linux Driver | Minimum Windows Driver |
|--------------|----------------------|------------------------|
| 11.8         | 520.61               | 522.06                 |
| 12.0         | 525.85               | 527.41                 |
| 12.2         | 535.54               | 536.25                 |
| 12.3         | 545.23               | 545.84                 |
| 12.4         | 550.54               | 551.61                 |

### 3. TensorFlow Not Built with GPU Support

The `tensorflow` package on PyPI for some platforms or CPU architectures is the CPU-only build. Verify:

```bash
pip show tensorflow | grep -i "version\|location"
python -c "import tensorflow as tf; print(tf.test.is_built_with_cuda())"
```

If `is_built_with_cuda()` returns `False`, you have the CPU-only package. On Linux, install `tensorflow` or `tensorflow[and-cuda]` from PyPI (GPU builds are the default on Linux x86_64).
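After reinstalling, a quick pass/fail snippet can combine the build check and device detection in one step. This is a minimal sketch using only the APIs shown above; the assertion messages are illustrative:

```python
import tensorflow as tf

# Confirm the installed package was built with CUDA support.
assert tf.test.is_built_with_cuda(), "CPU-only TensorFlow package installed"

# Confirm the runtime can actually see at least one GPU.
gpus = tf.config.list_physical_devices('GPU')
assert gpus, "TF built with CUDA, but no GPU visible (check driver / library path)"
print(f"OK: {len(gpus)} GPU(s) detected")
```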
On Apple Silicon Macs, use `tensorflow-metal` instead.

### 4. CUDA Libraries Not on the Library Path

The CUDA runtime libraries (`libcuda.so`, `libcudnn.so`, `libcublas.so`) must be findable by the linker. If they are installed in non-standard locations:

```bash
ldconfig -p | grep libcuda   # Check if libcuda is in the linker cache
ldconfig -p | grep libcudnn  # Check cuDNN
ls /usr/local/cuda/lib64/    # Check the default CUDA lib location
```

Fix if the libraries exist but aren't found:

```bash
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
```

For a permanent fix, add the path to a file under `/etc/ld.so.conf.d/` and run `ldconfig`.

### 5. GPU Not Visible in a Container

In Docker and other container environments, GPU visibility requires:

- NVIDIA Container Toolkit installed on the host
- `--gpus all` or `--gpus "device=0"` flag in the `docker run` command
- Or `deploy.resources.reservations.devices` in docker-compose

```bash
docker run --gpus all nvidia/cuda:12.2-base nvidia-smi  # Verify container GPU access
```

## Systematic Diagnostic Checklist

1. `nvidia-smi` runs and shows the GPU? (If not: driver issue)
2. `nvcc --version` shows the expected CUDA version? (If not: toolkit not installed or wrong PATH)
3. `tf.test.is_built_with_cuda()` returns `True`? (If not: wrong TF package)
4. `tf.sysconfig.get_build_info()['cuda_version']` matches the installed CUDA? (If not: version mismatch)
5. Driver version meets the minimum for the CUDA version?
6. `ldconfig -p | grep libcuda` finds the library? (If not: library path issue)
7. In a container: `--gpus all` flag present? NVIDIA Container Toolkit installed?
8. `tf.config.list_physical_devices('GPU')` returns devices? (Final confirmation)

A sketch of a script that automates these checks appears at the end of this article.

## Enabling GPU Memory Growth

After confirming GPU detection, enabling memory growth prevents TensorFlow from pre-allocating all available VRAM at startup (which blocks other processes):

```python
gpus = tf.config.list_physical_devices('GPU')
for gpu in gpus:
    tf.config.experimental.set_memory_growth(gpu, True)
```

This is recommended for multi-process or multi-model environments. The broader GPU configuration workflow for AI workloads, including NVIDIA system settings that affect TensorFlow performance, is covered in GPU Performance Settings for AI.

## Where this leaves us

TensorFlow GPU detection failures have four common causes: a CUDA version mismatch (most frequent), a driver too old for the CUDA version, the wrong TF package installed, and CUDA libraries missing from the linker path. Each is diagnosable with specific commands. The systematic diagnostic checklist above resolves the issue in most cases without reinstalling the OS. For containerized deployments, the NVIDIA Container Toolkit and the correct `docker run` flags are the additional requirements.
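As promised above, here is a minimal Python sketch that automates the scriptable items from the checklist (1–4, 6, and 8); the driver-minimum and container checks are left manual. It assumes a Linux host with TensorFlow importable, and the helper names are ours, not part of any TF API:

```python
import shutil
import subprocess

import tensorflow as tf


def run(cmd):
    """Run a shell command; return (succeeded, combined output)."""
    try:
        proc = subprocess.run(cmd, shell=True, capture_output=True,
                              text=True, timeout=30)
        return proc.returncode == 0, (proc.stdout + proc.stderr).strip()
    except Exception as exc:  # missing binary, timeout, etc.
        return False, str(exc)


def main():
    # Checklist 1: the driver works if nvidia-smi runs and reports a version.
    ok, out = run("nvidia-smi --query-gpu=driver_version --format=csv,noheader")
    print(f"[{'ok' if ok else 'FAIL'}] nvidia-smi driver version: "
          f"{out if ok else 'driver issue'}")

    # Checklist 2: CUDA toolkit on PATH.
    print(f"[{'ok' if shutil.which('nvcc') else 'FAIL'}] nvcc found on PATH")

    # Checklist 3: TF package built with CUDA support.
    print(f"[{'ok' if tf.test.is_built_with_cuda() else 'FAIL'}] "
          f"tf.test.is_built_with_cuda()")

    # Checklist 4: CUDA/cuDNN versions TF was built against (compare by hand
    # with `nvcc --version`; the keys are absent on CPU-only builds).
    info = tf.sysconfig.get_build_info()
    print(f"[info] TF built with CUDA {info.get('cuda_version')}, "
          f"cuDNN {info.get('cudnn_version')}")

    # Checklist 6: libcuda present in the linker cache.
    ok, _ = run("ldconfig -p | grep -q libcuda")
    print(f"[{'ok' if ok else 'FAIL'}] libcuda in ldconfig cache")

    # Checklist 8: final confirmation.
    gpus = tf.config.list_physical_devices('GPU')
    print(f"[{'ok' if gpus else 'FAIL'}] {len(gpus)} GPU(s) visible to TensorFlow")


if __name__ == "__main__":
    main()
```

Running it on a healthy setup should print a column of `ok` lines ending with at least one visible GPU; the first `FAIL` from the top is the layer to fix, in the same order as the checklist.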