Today, we’re taking the latest AI model, DeepSeek-R1, for a spin, but not through their web or mobile app! Our focus on data privacy means we want to avoid our data ending up on a server in China, so we’ll run the model locally on our own servers (self-hosting) and keep full control over our data. This article builds on my previous guide, “Build Your Own Private, Customizable, and Self-hosted AI GPT using Llama2 and Open WebUI with RAG”, which covers that base setup in detail. If you haven’t completed that setup, I recommend doing so before proceeding.
DeepSeek-R1 is yet another AI model, similar to Llama 2, Llama 3, Mistral, Qwen, and OpenAI’s o1. The groundwork is already laid: in this article, we’ll simply download the DeepSeek model using the Ollama API and interact with it via the user-friendly Open WebUI. The same approach lets you test any of the models available in the Ollama library.
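For context, Ollama exposes a local REST API (by default on port 11434) that Open WebUI talks to behind the scenes, and it is this same API we’ll use to pull DeepSeek. As a quick illustration, assuming a default Ollama installation on the server, you can list the models already pulled like this:
# ollama list
# curl http://localhost:11434/api/tags
Any model that shows up here is what Open WebUI can offer in its model selector.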
At this stage, I assume you already have the following components running properly on your server (a quick verification check follows the list):
- Open WebUI: A user-friendly interface for interacting with large language models (LLMs).
- Docker: Used to containerize and deploy Open WebUI and Nginx (our reverse proxy for handling and redirecting HTTPS requests).
- Ollama API: Simplifies the process of pulling and running LLMs locally.
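If you’d like a quick sanity check before moving on, the following commands (assuming a standard Docker and Ollama installation) confirm that your containers are up and that the Ollama service responds:
# docker ps
# ollama --version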
If everything is set up correctly, we can proceed to our Open WebUI GUI. As shown in the screenshot below, the DeepSeek model is currently not available for selection. Our goal in this article is to add the DeepSeek model so we can interact with it through Open WebUI.
![](https://i0.wp.com/techjunction.co/wp-content/uploads/2025/02/image-4.png?resize=640%2C335&ssl=1)
But before we start pulling the new DeepSeek model, it’s good practice to update the package lists and upgrade the OS to the latest software patches. To do this, use the following commands:
# apt update
# apt upgrade -y
Selecting the Right Version of DeepSeek-R1 for Your System
DeepSeek-R1 comes in various sizes: 1.5B, 7B, 8B, 14B, 32B, 70B, and 671B parameters. It’s crucial to select the version best suited to your system and use case, since each version has different VRAM and compute requirements. Smaller models are more efficient and suitable for lightweight applications, while larger models offer better performance and accuracy on complex tasks. Additionally, quantized versions can reduce VRAM requirements, making it easier to run larger models on less powerful hardware (a quick way to inspect a pulled model’s quantization and memory footprint is shown after the list below). By weighing these factors, you can choose the most appropriate version of DeepSeek-R1 for your needs.
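Before committing to a version, it helps to know how much VRAM is actually available on your server. On a machine with NVIDIA GPUs and the drivers installed, a quick check looks like this (the exact output depends on your hardware):
# nvidia-smi --query-gpu=name,memory.total,memory.free --format=csv
Compare the reported free memory against the figures in the list below.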
Hardware Requirement and Use Cases
- 1.5B: Requires around 3.9 GB of VRAM. Suitable for users with mid-range GPUs like the NVIDIA RTX 3060. Ideal for lightweight applications and quick iterations.
- 7B: Needs approximately 18 GB of VRAM. Ideal for high-end GPUs such as the NVIDIA RTX 4090. Suitable for more demanding tasks that require better reasoning capabilities.
- 8B: Requires about 21 GB of VRAM. Also suitable for high-end GPUs like the NVIDIA RTX 4090. Good for tasks that need a balance between performance and resource usage.
- 14B: Needs around 36 GB of VRAM. Best for multi-GPU setups, e.g., two NVIDIA RTX 4090s. Suitable for more complex problem-solving and reasoning tasks.
- 32B: Requires approximately 82 GB of VRAM. Suitable for multi-GPU setups, e.g., four NVIDIA RTX 4090s. Ideal for high-performance applications.
- 70B: Needs around 181 GB of VRAM. Ideal for multi-GPU setups, e.g., three NVIDIA A100 80GB GPUs. Best for research and applications where maximum performance is critical.
- 671B: Requires a substantial 1,543 GB of VRAM. Only feasible with a large multi-GPU setup, e.g., twenty NVIDIA A100 80GB GPUs. Suitable for the most demanding and complex tasks.
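Once a model has been pulled, you can confirm which quantization you actually received and how much memory it consumes while loaded. A rough sketch, again assuming a standard Ollama installation (note that ollama ps only reports memory once the model has been started with ollama run):
# ollama show deepseek-r1:32b
# ollama ps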
For this demo, we are going to run the 32B version of DeepSeek (deepseek-r1:32b). Log in to your server’s CLI and use the Ollama API to pull and run this model with the following command:
# ollama run deepseek-r1:32b
![](https://i0.wp.com/techjunction.co/wp-content/uploads/2025/02/image-5.png?resize=640%2C139&ssl=1)
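Note that ollama run downloads the model if it isn’t present yet and then drops you into an interactive chat session in the terminal. If you only want to fetch the weights, for example as part of a provisioning script, you can pull first and verify afterwards:
# ollama pull deepseek-r1:32b
# ollama list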
Once the model has been successfully downloaded, return to your Open WebUI GUI and refresh the portal. You should now see the new DeepSeek model available for selection. You can then select it and start interacting with it.
![](https://i0.wp.com/techjunction.co/wp-content/uploads/2025/02/image-6.png?resize=640%2C333&ssl=1)
![](https://i0.wp.com/techjunction.co/wp-content/uploads/2025/02/image-7.png?resize=640%2C274&ssl=1)
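If you also want to verify the model outside the GUI, you can query it directly through Ollama’s REST API. A minimal sketch, assuming Ollama is listening on its default port 11434 on the same host:
# curl http://localhost:11434/api/generate -d '{"model": "deepseek-r1:32b", "prompt": "Explain, in one sentence, what self-hosting means.", "stream": false}'
The response is a JSON object whose response field contains the model’s answer.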
In this article, we successfully demonstrated how to self-host the DeepSeek model using the Ollama API and Open WebUI. By following these steps, we ensured that our data remains private and under our control, avoiding the need to rely on external servers. Building on the existing setup of Docker, Open WebUI, and the Ollama API from the previous guide, we downloaded and ran the 32B version of DeepSeek, making it available for interaction through Open WebUI.
This approach not only enhances data privacy but also provides flexibility in experimenting with different AI models. The Ollama library offers a wide range of models, each suited for various applications and hardware capabilities. I encourage you to explore these models and find the ones that best meet your needs.
By following the same procedure, you can easily test and interact with other AI models available in the Ollama library. Whether you’re working on lightweight applications or complex tasks, there’s a model that fits your requirements. Take control of your data and harness the power of AI by self-hosting these models on your server.
Happy experimenting!