A 10-step guide to running Ollama in Docker and connecting it to a Spring Boot application for secure, cost-effective local LLM inference.
Install and Verify Docker
Ensure Docker and Docker Compose are installed and running on your system.
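A quick way to confirm both tools are available and that the Docker daemon is actually running:

```bash
docker --version
docker compose version   # or `docker-compose --version` on older standalone installs
docker info              # errors out if the daemon is not running
```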
Define the Stack
Create a project folder and define the Ollama service in a `docker-compose.yml` file.
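A minimal `docker-compose.yml` sketch, assuming the official `ollama/ollama` image and a named volume so downloaded models survive container restarts:

```yaml
services:
  ollama:
    image: ollama/ollama           # official Ollama image
    container_name: ollama
    ports:
      - "11434:11434"              # expose the Ollama API on the host
    volumes:
      - ollama-data:/root/.ollama  # persist pulled models

volumes:
  ollama-data:
```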
Launch and Map Ports
Start the container with `docker-compose up`, mapping container port 11434 to the host so the Ollama API is reachable.
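For example, starting it in detached mode and confirming the port mapping:

```bash
docker-compose up -d
docker ps --filter name=ollama   # PORTS column should show 0.0.0.0:11434->11434/tcp
```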
Test Container Health
Verify Ollama is running by sending a simple HTTP request to port 11434 on the host.
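A simple check with curl; the root endpoint typically answers with a short status message:

```bash
curl http://localhost:11434
# Ollama is running
```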
Pull the LLM Model
Run `ollama pull [model]` (e.g., `llama2`) inside the container to download the model and prepare it for use.
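Because Ollama runs inside the container, the easiest way is `docker exec` (assuming the container is named `ollama` as in the compose sketch above):

```bash
docker exec -it ollama ollama pull llama2
docker exec -it ollama ollama list   # confirm the model shows up
```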
Add Spring AI Dependency
Add the `spring-ai-starter-model-ollama` dependency to your Spring Boot project.
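A Maven sketch, assuming the Spring AI BOM manages the version (the version shown is an example; use your current Spring AI release, or the Gradle equivalent):

```xml
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>org.springframework.ai</groupId>
      <artifactId>spring-ai-bom</artifactId>
      <version>1.0.0</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>
  </dependencies>
</dependencyManagement>

<dependencies>
  <dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-starter-model-ollama</artifactId>
  </dependency>
</dependencies>
```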
Configure Ollama Endpoint
Set `base-url: http://localhost:11434` and the desired `model` under `spring.ai.ollama` in `application.yml`.
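A minimal `application.yml` sketch; `llama2` here is just the example model pulled earlier:

```yaml
spring:
  ai:
    ollama:
      base-url: http://localhost:11434   # Ollama container exposed on the host
      chat:
        options:
          model: llama2                  # must match a model already pulled into Ollama
```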
Create a Service Component
Inject `OllamaChatModel` into a service class that encapsulates the LLM calls.
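A sketch of such a service, assuming constructor injection of the auto-configured `OllamaChatModel`; the class and method names are illustrative:

```java
import org.springframework.ai.ollama.OllamaChatModel;
import org.springframework.stereotype.Service;

@Service
public class OllamaService {

    private final OllamaChatModel chatModel;

    // Spring injects the OllamaChatModel auto-configured from application.yml
    public OllamaService(OllamaChatModel chatModel) {
        this.chatModel = chatModel;
    }

    // Sends the prompt to the local Ollama model and returns the text response
    public String ask(String prompt) {
        return chatModel.call(prompt);
    }
}
```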
Build the REST Controller
Expose an endpoint (e.g., `/api/ollama/ask`) to accept prompts from users.
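A controller sketch wired to the service above; the request-parameter name is an assumption:

```java
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/api/ollama")
public class OllamaController {

    private final OllamaService ollamaService;

    public OllamaController(OllamaService ollamaService) {
        this.ollamaService = ollamaService;
    }

    // GET /api/ollama/ask?prompt=... forwards the prompt to the LLM
    @GetMapping("/ask")
    public String ask(@RequestParam String prompt) {
        return ollamaService.ask(prompt);
    }
}
```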
Verify Final Integration
Run the Spring Boot application and call the REST endpoint to confirm responses come back from the local model.
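For example, with the application on Spring Boot's default port 8080 and the controller sketch above:

```bash
curl "http://localhost:8080/api/ollama/ask?prompt=Explain%20Docker%20in%20one%20sentence"
```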