Setting up Local LLM
Project URL
https://github.com/waqaskhan137/local-llm
Ollama Docker Compose Setup
Welcome to the Ollama Docker Compose Setup! This project simplifies the deployment of Ollama using Docker Compose, making it easy to run Ollama with all its dependencies in a containerized environment.
Getting Started
Prerequisites
Make sure you have the following prerequisites installed on your machine:
- Docker
- Docker Compose
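A quick way to confirm both tools are on your PATH before continuing is a small check like the one below. This is only a sketch: `missing_tools` is a hypothetical helper name, and it assumes the standalone `docker-compose` binary used by the commands in this guide (with the newer Compose plugin you would run `docker compose version` instead).

```python
import shutil

# Executables that the commands in this guide rely on.
REQUIRED_TOOLS = ("docker", "docker-compose")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the entries of `tools` that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing prerequisites: " + ", ".join(missing))
    else:
        print("Docker and Docker Compose are available.")
```

An empty list means both executables were found; it does not check that the Docker daemon is actually running.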
Project Structure
.
├── CONTRIBUTING.md
├── docker-compose-ollama-gpu.yaml
├── docker-compose.yml
├── Dockerfile
├── LICENSE
├── README.md
├── requirements.txt
└── src
    ├── basic_chain.py
    ├── main.py
    ├── rag.py
    ├── test.py
    └── web
        ├── index.html
        └── local-rag-test.html
GPU Support (Optional)
If you have an NVIDIA GPU and want to use it inside the Docker containers, install the NVIDIA Container Toolkit first. The commands below assume a Debian/Ubuntu system that uses apt:
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
&& curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
# Configure NVIDIA Container Toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
# Test GPU integration
docker run --gpus all nvidia/cuda:11.5.2-base-ubuntu20.04 nvidia-smi
Configuration
Clone the Docker Compose repository:
git clone https://github.com/waqaskhan137/ollama-docker.git
Change to the project directory:
cd ollama-docker
Usage
Start Ollama and its dependencies using Docker Compose:
If GPU support is configured:
docker-compose -f docker-compose-ollama-gpu.yaml up -d
Otherwise:
docker-compose up -d
Visit http://localhost:8000 in your browser to access Ollama-webui.
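The choice between the two compose files can also be scripted. The sketch below is one possible approach, not part of the project: it treats the presence of `nvidia-smi` on the PATH as a rough signal that a GPU is set up (an assumption; it does not prove the NVIDIA Container Toolkit is configured) and builds the matching command from the file names used above. `gpu_available` and `compose_up_command` are hypothetical helper names.

```python
import shutil

GPU_COMPOSE_FILE = "docker-compose-ollama-gpu.yaml"


def gpu_available():
    """Heuristic (assumption): nvidia-smi on PATH suggests a usable GPU."""
    return shutil.which("nvidia-smi") is not None


def compose_up_command(use_gpu):
    """Build the docker-compose command for the GPU or the CPU setup."""
    command = ["docker-compose"]
    if use_gpu:
        # Only the GPU variant needs an explicit file; the CPU setup
        # falls back to the default docker-compose.yml.
        command += ["-f", GPU_COMPOSE_FILE]
    return command + ["up", "-d"]


if __name__ == "__main__":
    # Print the command instead of running it, so the script is safe to try.
    print(" ".join(compose_up_command(gpu_available())))
```

To actually start the containers, the returned list can be passed to `subprocess.run(command, check=True)` from the project directory.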
Model Installation
Navigate to settings -> model and install a model (e.g., llama2). The download may take a few minutes, depending on your internet speed; afterward, you can chat with the model in the browser much as you would with ChatGPT.
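A model can also be installed without the web UI by calling Ollama's HTTP API. The sketch below assumes the Ollama container publishes its default API port, 11434, on localhost; this guide does not state that, so check the `ports` entry in the compose file. `build_pull_request` and `pull_model` are hypothetical helper names, and the `name` field should be verified against the API documentation for the Ollama version you run.

```python
import json
import urllib.request

# Assumption: Ollama's default API port is published on localhost.
OLLAMA_URL = "http://localhost:11434"


def build_pull_request(model, base_url=OLLAMA_URL):
    """Build a POST request asking Ollama to download `model`."""
    payload = json.dumps({"name": model, "stream": False}).encode("utf-8")
    return urllib.request.Request(
        base_url + "/api/pull",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def pull_model(model, base_url=OLLAMA_URL):
    """Send the pull request and return the decoded JSON reply."""
    # With "stream": False the server answers once, after the download
    # has finished, so this call can block for several minutes.
    with urllib.request.urlopen(build_pull_request(model, base_url)) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, `pull_model("llama2")` would request the same model as the web UI step above.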
Some popular open-source models:
- gemma
  - Gemma is a family of lightweight, state-of-the-art open models built by Google DeepMind.
- llama2
  - Llama 2 is a collection of foundation language models ranging from 7B to 70B parameters.
- mistral
  - The 7B model released by Mistral AI, updated to version 0.2.
- mixtral
  - A high-quality Mixture of Experts (MoE) model with open weights by Mistral AI.
- llava
  - 🌋 LLaVA is a novel end-to-end trained large multimodal model that combines a vision encoder and Vicuna for general-purpose visual and language understanding. Updated to version 1.6.
- neural-chat
  - A fine-tuned model based on Mistral with good coverage of domain and language.
- codellama
  - A large language model that can use text prompts to generate and discuss code.
If you want to explore more, you can find additional models in the Ollama model library.
Explore LangChain and Ollama
You can experiment with LangChain and Ollama within the project. A third container, named app, has been created for this purpose. Inside, you’ll find some examples.
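Before working through the examples, it can help to confirm that a model answers at all. The sketch below bypasses LangChain and calls Ollama's `/api/generate` endpoint directly with the Python standard library. It assumes the API is reachable at `http://localhost:11434` (Ollama's default port; from inside the app container the host would instead be the Ollama service name from the compose file) and that `llama2` is already installed. `build_generate_request` and `ask` are hypothetical helper names.

```python
import json
import urllib.request

# Assumption: Ollama's default API port is published on localhost.
OLLAMA_URL = "http://localhost:11434"


def build_generate_request(prompt, model="llama2", base_url=OLLAMA_URL):
    """Build a non-streaming /api/generate request for `prompt`."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        base_url + "/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, model="llama2", base_url=OLLAMA_URL):
    """Send the prompt to Ollama and return the generated text."""
    request = build_generate_request(prompt, model, base_url)
    with urllib.request.urlopen(request) as response:
        reply = json.loads(response.read().decode("utf-8"))
    # In non-streaming mode the generated text is in the "response" field.
    return reply["response"]
```

With the containers running, `print(ask("Why is the sky blue?"))` should print the model's answer.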
Stop and Cleanup
To stop the containers and remove the network:
docker-compose down
Contact
If you have any questions or concerns, please contact us at waqaskhan137@gmail.com.
Enjoy using Ollama with Docker Compose! 🐳🚀