tag
#ollama
4 posts
- A disk usage query taking 39 minutes isn't slow hardware; it's a misconfigured agent. Here's how to find the real bottleneck and fix it.
- CPU-only inference on an OptiPlex 7070 is painfully slow. Here's a practical comparison of cloud APIs and second-hand GPU upgrades, with real costs in NZD.
- Comparing local models for CPU-only inference on a 16 GB machine, focused on the constraints that actually matter for a Hermes agent: 64k context, reliable tool use, fitting in RAM, and response speed.
- Installing Ollama and Hermes on a headless Ubuntu box, wiring up the Claude gateway, and working through the rough edges, including a gateway install bug and a model that wouldn't use tools.