Local AI Power: Ollama + Spring AI Setup

A 10-step guide to running Ollama in Docker and connecting it to a Spring Boot application for secure, cost-effective local LLM inference.

Phase I: Ollama Containerization

1. Install and Verify Docker

Ensure Docker and Docker Compose are installed and running on your system.
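
A quick version check from the terminal is enough to confirm both tools are available; the exact output will vary with your installed versions:

```bash
# Verify Docker and Compose are installed and the daemon is running
docker --version
docker compose version   # Compose v2 plugin; legacy installs use: docker-compose --version
docker info              # fails with an error if the Docker daemon is not running
```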

2. Define the Stack

Create a project folder and define the Ollama service in a `docker-compose.yml` file.
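
A minimal `docker-compose.yml` might look like the following; the service name, volume name, and image tag are illustrative choices, not requirements:

```yaml
services:
  ollama:
    image: ollama/ollama:latest    # official Ollama image
    ports:
      - "11434:11434"              # expose the Ollama API on the host
    volumes:
      - ollama_data:/root/.ollama  # persist downloaded models across restarts

volumes:
  ollama_data:
```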

3. Launch and Map Ports

Start the container in the background with `docker-compose up`, mapping port 11434 so the Ollama API is reachable from the host.
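
For example, from the project folder:

```bash
docker-compose up -d   # -d runs the stack in the background; Compose v2 also accepts: docker compose up -d
docker-compose ps      # confirm the ollama service is listed as Up
```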

4. Test Container Health

Verify that Ollama is running by sending a simple HTTP request to port 11434 on the host.
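
A plain `curl` against the mapped port is sufficient; the root endpoint answers with a short status message, and `/api/tags` lists the models available locally:

```bash
curl http://localhost:11434           # expected response: "Ollama is running"
curl http://localhost:11434/api/tags  # JSON list of pulled models (empty on a fresh container)
```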

5. Pull the LLM Model

Run `ollama pull [model]` (e.g., `llama2`) inside the container to download the model before first use.
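
Because Ollama runs inside the container, the pull command is executed through Compose; this assumes the service is named `ollama`, as in the sketch above:

```bash
docker-compose exec ollama ollama pull llama2  # downloads the model into the mounted volume
docker-compose exec ollama ollama list         # verify the model is now available
```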

Phase II: Spring AI Connection

6. Add Spring AI Dependency

Add the `spring-ai-starter-model-ollama` dependency to your Spring Boot project.
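
For a Maven build, the declaration might look like this; versions are typically managed through the `spring-ai-bom`, so the version shown here is only an example:

```xml
<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-bom</artifactId>
            <version>1.0.0</version> <!-- example version; use the current release -->
            <type>pom</type>
            <scope>import</scope>
        </dependency>
    </dependencies>
</dependencyManagement>

<dependencies>
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-starter-model-ollama</artifactId>
    </dependency>
</dependencies>
```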

7. Configure Ollama Endpoint

Set `base-url: http://localhost:11434` and the desired `model` in `application.yml`.
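
With the starter's auto-configuration, these settings live under `spring.ai.ollama`; a minimal `application.yml` could look like this, assuming the model pulled earlier:

```yaml
spring:
  ai:
    ollama:
      base-url: http://localhost:11434  # the port mapped in docker-compose.yml
      chat:
        options:
          model: llama2                 # must match a model pulled into the container
```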

8. Create a Service Component

Inject the auto-configured `OllamaChatModel` into a service that encapsulates the LLM calls.
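
A minimal sketch of such a service, assuming the starter's auto-configured `OllamaChatModel` bean; the class and method names are illustrative:

```java
import org.springframework.ai.ollama.OllamaChatModel;
import org.springframework.stereotype.Service;

// Thin wrapper around the auto-configured chat model (names are illustrative)
@Service
public class OllamaService {

    private final OllamaChatModel chatModel;

    public OllamaService(OllamaChatModel chatModel) {
        this.chatModel = chatModel; // injected by Spring Boot auto-configuration
    }

    public String ask(String prompt) {
        // call(String) sends a single user message and returns the model's text reply
        return chatModel.call(prompt);
    }
}
```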

9. Build the REST Controller

Expose an endpoint (e.g., `/api/ollama/ask`) to accept prompts from users.
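
One possible controller exposing that endpoint, delegating to the service above; the path and request parameter name are illustrative choices:

```java
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

// REST entry point for user prompts (path and parameter names are illustrative)
@RestController
@RequestMapping("/api/ollama")
public class OllamaController {

    private final OllamaService ollamaService;

    public OllamaController(OllamaService ollamaService) {
        this.ollamaService = ollamaService;
    }

    @GetMapping("/ask")
    public String ask(@RequestParam String prompt) {
        return ollamaService.ask(prompt);
    }
}
```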

10. Verify Final Integration

Run the Spring Boot application and call the REST endpoint to confirm you receive responses from the local model.
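
Assuming the default server port 8080 and the illustrative endpoint above, a quick end-to-end check could be:

```bash
curl "http://localhost:8080/api/ollama/ask?prompt=Say+hello+in+one+sentence"
# Expected: a plain-text completion generated by the local llama2 model
```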