Archive - Historical Articles
You are viewing records from 01/20/2026 16:54:18 to 06/17/2026 20:17:48. I'll be adding support for selecting a date range in future.
I kept running out of memory and was unable to finish fine-tuning or quantization jobs, even though hundreds of GB of video memory appeared to be free. I kept getting:
Failed to load checkpoint: Some modules are dispatched on the CPU or the disk
It turns out that on a unified-memory Grace Blackwell system you need to drop the OS page cache, or it can consume too much of the unified memory, leaving too little for the GPU and pushing work to the CPU or disk instead.
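Before dropping the cache, it can help to confirm that the page cache really is what is holding the memory. The following is a minimal sketch, assuming a Linux system that exposes /proc/meminfo; the function names are my own and are only illustrative.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB for most keys)."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


def cache_summary(meminfo):
    """Return total, available and cached memory in GiB."""
    kib_per_gib = 1024 * 1024
    # Buffers + Cached is a rough proxy for what drop_caches can reclaim.
    cached = meminfo.get("Cached", 0) + meminfo.get("Buffers", 0)
    return {
        "total_gib": meminfo.get("MemTotal", 0) / kib_per_gib,
        "available_gib": meminfo.get("MemAvailable", 0) / kib_per_gib,
        "cache_gib": cached / kib_per_gib,
    }


def report(path="/proc/meminfo"):
    """Print the summary for the running system."""
    with open(path) as handle:
        summary = cache_summary(parse_meminfo(handle.read()))
    for key, value in summary.items():
        print(f"{key}: {value:.1f}")
```

Calling report() before and after the command below should show cache_gib shrinking if the cache was the culprit.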
sudo sh -c 'sync; echo 3 > /proc/sys/vm/drop_caches'
Permalink
Today I tried to get voice recognition working with hardware acceleration on an ARM64 system with an NVIDIA GPU. I eventually found the easiest way was to write my own Dockerfile based on faster-whisper's own, but using this base image:-
FROM nvidia/cuda:12.3.2-cudnn9-runtime-ubuntu22.04
This comes with all the necessary CUDA support libraries pre-installed, and allowed me to quickly compile, build and publish a Grace Blackwell compatible Docker image of faster-whisper. Once I've tested it a bit and set up an automated pipeline I'll publish it.
The difference between CPU and GPU processing was only a second or so, but it does make the recognition feel snappy and you can use the large model easily instead of the tiny one.
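To compare CPU and GPU timings yourself, here is a rough sketch using the faster-whisper Python API. The compute types (float16 on GPU, int8 on CPU) and the "large-v3" model size are my assumptions, not something tested in this post, and the helper names are made up for illustration.

```python
import time


def load_model(size="large-v3", device="cuda"):
    """Load a faster-whisper model on the chosen device ("cuda" or "cpu")."""
    # Imported lazily so the timing helper works without the package installed.
    from faster_whisper import WhisperModel

    # Assumed compute types: float16 on GPU, int8 on CPU.
    compute_type = "float16" if device == "cuda" else "int8"
    return WhisperModel(size, device=device, compute_type=compute_type)


def timed_transcribe(model, audio_path):
    """Transcribe a file and return (text, seconds taken)."""
    started = time.perf_counter()
    segments, _info = model.transcribe(audio_path)
    # segments is a lazy generator, so the real work happens while joining.
    text = " ".join(segment.text.strip() for segment in segments)
    return text, time.perf_counter() - started
```

Running timed_transcribe(load_model(device="cuda"), "sample.wav") and then the same with device="cpu" gives a like-for-like comparison; sample.wav is a placeholder for any audio file you have to hand.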
Permalink
So I was trying to give my current personal AI project access to Gitea, and had to read the code to get it working. If you want to start it up in Docker and use HTTP as the transport, then your docker-compose file is:-
version: "3.8"
services:
  gitea-mcp:
    image: gitea/gitea-mcp-server:latest
    container_name: gitea-mcp
    # Must be quoted: a bare no is parsed by YAML as the boolean false
    restart: "no"
    ports:
      - "8080:8080"
    environment:
      - MCP_MODE=http
      - GITEA_URL=https://yourgiteainstance
      - GITEA_TOKEN=yoursecrettoken
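Once the container is up, a quick smoke test is to send it an MCP initialize request over HTTP. This is only a sketch: the JSON-RPC message follows the general MCP specification, but the /mcp path and the protocol version string are my assumptions and may not match what this particular server expects, so check its logs or source if the request is rejected.

```python
import json
import urllib.request


def build_initialize_request(request_id=1):
    """Build a JSON-RPC 2.0 'initialize' message in the shape the MCP spec describes."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            # Assumed protocol revision; the server may negotiate another.
            "protocolVersion": "2025-03-26",
            "capabilities": {},
            "clientInfo": {"name": "smoke-test", "version": "0.1"},
        },
    }


def send_initialize(url="http://localhost:8080/mcp", timeout=10):
    """POST the initialize message and return the raw response body."""
    # The /mcp path is an assumption, not taken from the server's documentation.
    body = json.dumps(build_initialize_request()).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json, text/event-stream",
        },
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling print(send_initialize()) from the Docker host should return the server's reply if the port mapping and transport are working.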
Permalink