Archive - Nullify.net

Archive - Historical Articles

You are viewing records from 02/15/2026 23:27:43 to 06/17/2026 20:17:48. I'll be adding support for selecting a date range in future.


[Blog] Nvidia DGX: When you aren't out of memory, but your OS makes itself think it is... - by simon at Wed, 17 Jun 2026 20:17:48 GMT

I kept hitting out-of-memory failures that stopped fine-tuning and quantization jobs from finishing, even though hundreds of GB of video memory appeared to be free. The error I kept getting was:

Failed to load checkpoint: Some modules are dispatched on the CPU or the disk

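It turns out that on a unified-memory Grace Blackwell system you need to drop the OS page cache, or it can consume too much of the unified memory, resulting in paging to disk instead of GPU use. The command below flushes dirty pages to disk and then drops the page cache, dentries and inodes: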

sudo sh -c 'sync; echo 3 > /proc/sys/vm/drop_caches'
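To see whether the page cache really is the culprit before dropping it, the figures in /proc/meminfo can be compared. The following is a minimal Python sketch, not part of the original fix: the helper names `parse_meminfo`, `cache_share` and `read_meminfo` are hypothetical, and it assumes the standard Linux `MemTotal`, `Buffers` and `Cached` fields reported in kB.

```python
from __future__ import annotations


def parse_meminfo(text: str) -> dict[str, int]:
    """Parse /proc/meminfo-style text into a {field name: value} dict.

    Most fields are reported in kB; the unit suffix is ignored here.
    """
    fields: dict[str, int] = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip anything that is not "Name: value"
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def cache_share(fields: dict[str, int]) -> float:
    """Return the fraction of total memory held by buffers and page cache."""
    total = fields.get("MemTotal", 0)
    cached = fields.get("Cached", 0) + fields.get("Buffers", 0)
    return cached / total if total else 0.0


def read_meminfo(path: str = "/proc/meminfo") -> dict[str, int]:
    """Read and parse the live memory statistics (Linux only)."""
    with open(path) as fh:
        return parse_meminfo(fh.read())
```

Calling `cache_share(read_meminfo())` before a job gives a rough idea of how much of the unified memory the cache is holding. At what share it becomes worth dropping the cache is a judgment call and not something measured here.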


[Blog] Add CUDA support to faster-whisper - by simon at Sun, 15 Feb 2026 23:27:43 GMT

Today I tried to get voice recognition working with hardware acceleration on an ARM64 system with an Nvidia GPU. I eventually found the easiest way was to write my own Dockerfile based on the faster-whisper project's own, but using this base image:

FROM nvidia/cuda:12.3.2-cudnn9-runtime-ubuntu22.04

This comes with all the necessary CUDA support libraries pre-installed and allowed me to quickly compile, build and publish a Grace Blackwell-compatible Docker image of faster-whisper. Once I've tested it a bit and set up an automated pipeline I'll publish it.

The difference between CPU and GPU processing was only a second or so, but it does make the recognition feel snappy and you can use the large model easily instead of the tiny one.
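For illustration, switching between the tiny model on CPU and the large model on GPU could look roughly like the Python sketch below. It uses faster-whisper's `WhisperModel` class and its `transcribe` method, but the helper names `pick_settings` and `transcribe_file`, the `large-v3` model name and the `float16`/`int8` compute types are assumptions made for this example, not settings taken from the image described above.

```python
from __future__ import annotations


def pick_settings(use_gpu: bool) -> dict[str, str]:
    """Choose model size, device and compute type (illustrative values)."""
    if use_gpu:
        # Assumption: the large model in half precision on the GPU.
        return {"model_size": "large-v3", "device": "cuda", "compute_type": "float16"}
    # Assumption: the tiny model with 8-bit weights on the CPU.
    return {"model_size": "tiny", "device": "cpu", "compute_type": "int8"}


def transcribe_file(audio_path: str, use_gpu: bool = True) -> list[tuple[float, float, str]]:
    """Transcribe one audio file and return (start, end, text) tuples."""
    # Imported lazily so the settings helper works without the library installed.
    from faster_whisper import WhisperModel

    settings = pick_settings(use_gpu)
    model = WhisperModel(
        settings["model_size"],
        device=settings["device"],
        compute_type=settings["compute_type"],
    )
    # transcribe() returns a lazy generator of segments plus an info object.
    segments, _info = model.transcribe(audio_path)
    return [(seg.start, seg.end, seg.text) for seg in segments]
```

Inside a container built on the CUDA base image, `transcribe_file("sample.wav")` (a placeholder file name) would run on the GPU; passing `use_gpu=False` falls back to the CPU settings.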



nullify

A blog by Simon Soanes
