Intelligence Mining on legacy hardware: Local LLMs and Data Sovereignty

Running Large Language Models (LLMs) locally is the only true path to data sovereignty. For my private projects, keeping data off the cloud is a non-negotiable priority—a challenge that is equally critical for enterprises looking to leverage their proprietary data securely.

However, high-performance AI doesn’t always require the latest enterprise-grade silicon. In this article, I explore how I repurpose legacy hardware to run large-scale LLMs and process huge datasets, proving that robust local AI is possible with the gear you already own.

The Intelligence Rig

I run LLMs on legacy hardware that might be viewed as exotic, but it has done the job for me so far. My LLM rig runs on an AMD Ryzen 5 with 24 GB of RAM (a concession to current memory prices) and a 500 GB SSD (with a 64 GB swap file). The core, however, is a bundle of GPUs:

  • 4 NVIDIA GeForce GTX 1080 Ti
  • 3 NVIDIA GeForce GTX 1070
  • 1 NVIDIA GeForce GTX 1080

Together, that adds up to 76 GB of VRAM!

I won’t go into the details of the setup here; I found Google Gemini very helpful in most cases, both for the setup and sometimes for prompt engineering. My software stack is Ollama with Open WebUI.
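As a sketch of what that stack looks like (assuming a native Ollama install and Open WebUI running in Docker; the model tag below is only an example — pick one that fits your VRAM):

```shell
# Pull a model and start the Ollama server (listens on :11434 by default)
ollama pull qwen2.5:32b
ollama serve

# Run Open WebUI in Docker, pointed at the local Ollama instance
docker run -d -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data \
  --name open-webui \
  ghcr.io/open-webui/open-webui:main
```

Open WebUI is then reachable at http://localhost:3000 and auto-detects the Ollama backend.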

Since the hardware is built on an open rig, it naturally doesn’t run 24/7 at my home. Heat and power consumption are not a huge problem for me, since the cards only draw significant power while they compute — and even then, only one GPU runs at 100% at a time (when “thinking”), while parallel text generation puts around 30% utilisation on each GPU.
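If you want to verify this utilisation pattern on your own setup, nvidia-smi can log per-GPU load, power draw, and memory use once per second:

```shell
# Poll every GPU in the system each second (Ctrl+C to stop)
nvidia-smi --query-gpu=index,name,utilization.gpu,power.draw,memory.used \
           --format=csv -l 1
```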

Processing Large Datasets

I specifically run Qwen 3.5 35B on the rig for processing extensive texts. Thanks to the 76 GB of VRAM, I can push the num_ctx (context window) parameter high enough to load entire books directly into memory. So far, the most reliable method I’ve found is simply pasting the entire text into the prompt.
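A quick sanity check before pasting a book into the prompt is estimating whether it fits the chosen num_ctx. This sketch uses the rough heuristic of ~4 characters per token for English prose — the real count depends on the model’s tokenizer, so leave headroom:

```python
# Rough check whether a text fits into a given num_ctx window.
# Assumption: ~4 characters per token (English prose heuristic).

def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Very rough token estimate for English text."""
    return int(len(text) / chars_per_token)

def fits_in_context(text: str, num_ctx: int, reserve: int = 2048) -> bool:
    """True if the text likely fits, keeping `reserve` tokens
    free for instructions and the model's answer."""
    return estimate_tokens(text) + reserve <= num_ctx

# Example: a ~300-page book is roughly 600,000 characters.
book = "x" * 600_000
print(estimate_tokens(book))           # 150000
print(fits_in_context(book, 131_072))  # False -- needs a larger window
print(fits_in_context(book, 262_144))  # True
```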

Known Difficulties

It isn’t a perfect system. With a mixed GPU setup, some LLMs lack stability. I haven’t yet determined if this is a configuration error on my part or a fundamental limitation of mixing different chips and VRAM sizes within the Pascal generation (1080 Ti vs. 1070/1080).

Workstation HP Z440

For my daily tasks, I rely on an HP Z440 (besides my notebook), a machine that first hit the market in late 2014. While the Intel Xeon CPU and 32 GB of RAM are standard for a workstation of this era, the GPU is the standout feature for AI work.

It’s equipped with an NVIDIA Quadro P5000. Originally optimized for CAD and professional visualization, its 16 GB of VRAM makes it surprisingly capable for modern LLM and image generation tasks.
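A back-of-the-envelope way to judge what the P5000’s 16 GB can hold: a quantized model’s weights take roughly parameter count × bytes per weight, plus overhead for the KV cache and runtime buffers. The bytes-per-weight figures below are approximations for common GGUF quantizations, and the 2 GB headroom is my assumption:

```python
# Approximate bytes per weight for common quantization formats.
BYTES_PER_WEIGHT = {"f16": 2.0, "q8_0": 1.0, "q4_k_m": 0.56}

def weight_vram_gb(params_b: float, quant: str) -> float:
    """Approximate GiB needed just for the model weights."""
    return params_b * 1e9 * BYTES_PER_WEIGHT[quant] / 2**30

def fits(params_b: float, quant: str, vram_gb: float = 16.0) -> bool:
    """Fits the card, leaving ~2 GiB headroom for KV cache etc."""
    return weight_vram_gb(params_b, quant) + 2.0 <= vram_gb

print(round(weight_vram_gb(7, "q4_k_m"), 1))  # ~3.7 GiB of weights
print(fits(7, "q4_k_m"))    # True  -- a 7B Q4 model fits comfortably
print(fits(35, "q4_k_m"))   # False -- a 35B model needs the rig
```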

Text Generation with LM Studio

For local text generation on this machine, I use LM Studio. It’s a clean, user-friendly tool that handles different model formats without fuss. On this setup, I primarily run:

  • Qwen 3.5
  • GPT-OSS

Image generation

To my surprise, the P5000 handles image generation quite well, despite its age. I use two main packages that offer a streamlined experience:

  • Forge (Stable Diffusion): I use the “one-click” installation package, which simplifies the entire backend setup.
  • SwarmUI: It works, but I did not test it in depth.

While image creation isn’t my primary objective, having the 16GB buffer on the Quadro P5000 makes it a fun and educational environment to explore.

Comparison and Showcase

Technical Specifications

This table outlines the specifications of each model across my two setups.

Quality and Speed Performance

For a comparison of quality and speed on the different models I run on the rig and the workstation, I used this prompt with standard parameters:

“Act as a Senior GRC (Governance, Risk, and Compliance) Consultant.

Scenario: A company is fully ISO 27001:2022 certified. They now need to comply with the NIS 2 Directive.

  1. Identify 3 specific areas where ISO 27001 is NOT sufficient for NIS 2 (e.g., Supply Chain Security, Incident Reporting timelines, or Management Liability).
  2. Explain the ‘Reporting Obligation’ under NIS 2: How does it differ from the standard ISO incident management?
  3. Analyze the risk of ‘Personal Liability’ for management under NIS 2 compared to ISO 27001.
  4. Verdict: Can the company claim ‘Automatic Compliance’? Why or why not?”

Conclusion

This showcase suggests that once a model reaches a “large enough” parameter threshold, the output quality remains consistently valid for high-level tasks. I did not check all answers in depth, so I may have missed hallucinations and some logical flaws. In this test I declare GPT-OSS the winner, followed by Qwen 3.5.

Furthermore, this demonstrates why you must pick the right LLM/hardware combination for your use case and why size doesn’t always matter.

I use the workstation for all my personal daily tasks (alongside cloud LLMs like Gemini, ChatGPT, and Copilot for non-critical data), but whenever it comes to processing large amounts of data with LLMs, I use the rig.

Final thoughts

The takeaway is clear: You don’t need state-of-the-art, “Enterprise-grade” hardware to perform meaningful AI tasks. Whether it is a decade-old CAD workstation or a custom-built cluster of legacy mining cards, these machines are still highly capable of high-level intelligence mining.

However, the key to success is understanding the restrictions.

Success with local LLMs isn’t a matter of buying the most expensive GPU; it’s about matching your specific use case to your hardware’s strengths. By managing VRAM limits and selecting models that fit your memory “budget,” you can transform supposedly obsolete hardware into a powerful, sovereign AI environment.

Data sovereignty and high-level computation are accessible to anyone willing to experiment with the hardware they already have.
