Create Ollama on Docker and Connect It with Spring AI

For developers building AI-powered applications, containerization ensures consistency and reliability. Whether you’re coding on a Mac or deploying on Linux, Docker keeps the environment identical. That’s why we’ll use a docker-compose.yml file to define our Ollama setup.

Why Run Ollama with Docker?

Docker makes life easier for developers. Instead of struggling with local installations, you run applications in containers. These containers hold everything the app needs. That means fewer conflicts, simpler sharing, and faster deployments.

When you run Ollama with Docker, you avoid heavy system dependencies. You spin up the container, and Ollama is ready. No more broken paths or mismatched libraries. Just clean, portable environments.

Additionally, Docker Compose facilitates the management of multiple containers. With one YAML file, you can start Ollama, databases, and supporting services together. It’s like having a universal remote control for your stack.

Step 1: Install Docker and Docker Compose

Before we begin, ensure Docker is installed. If you don’t have it yet, grab it from the official Docker site. Installation is straightforward. Follow the steps specific to your operating system.

Once Docker is running, check the version. Open a terminal and type:

docker --version
docker-compose --version

If both commands return versions, you’re good to go. If not, troubleshoot the installation first. It’s better to confirm before moving ahead.

Docker Compose comes bundled with Docker Desktop. On Linux, you might need to install it separately. Note that newer Docker releases ship Compose as a plugin invoked as docker compose rather than the standalone docker-compose binary; the commands in this article work with either form. Once installed, you’ll use it to orchestrate your containers.
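
A quick way to confirm that Docker can actually run containers is the official hello-world image:

docker run hello-world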

Step 2: Create a Docker Project for Ollama

Let’s organize things. First, make a project folder.

mkdir ollama-docker
cd ollama-docker

Inside this folder, we’ll add our configuration. Creating a clean workspace avoids confusion later. Keep your files together, and version control becomes simple.
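
If you plan to keep the setup in version control, you can initialize a repository now. Excluding the model data directory is a good idea; the ollama-data folder name below is an assumption that matches the volume we’ll define in the next step:

git init
echo "ollama-data/" > .gitignore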

The heart of our setup is the docker-compose.yml file. This file describes the container. You’ll define the image, ports, and any volumes needed. Think of it as a blueprint for Docker.

Step 3: Write the docker-compose.yml File

Now let’s build the YAML file. Open your favorite editor and create docker-compose.yml. Paste the following:

services:
  ollama:
    image: ollama/ollama:latest
    container_name: ollama-container
    ports:
      - "11434:11434"
    volumes:
      - ./ollama-data:/root/.ollama
    restart: unless-stopped

Let’s break this down:

  • services: Lists our containers; modern Compose files no longer need a top-level version field. Here we have only ollama.
  • image: Uses the latest Ollama image from Docker Hub.
  • container_name: Assigns a readable name.
  • ports: Maps container port 11434 to the same host port.
  • volumes: Stores Ollama data outside the container (in ./ollama-data) so it persists across restarts.
  • restart: Restarts the container automatically unless you stop it yourself.

This file alone will run Ollama in a neat, isolated container.
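
Before starting anything, you can ask Compose to validate the file and print the resolved configuration:

docker-compose config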

Step 4: Start Ollama with Docker Compose

Once the YAML file is ready, start the service. Run:

docker-compose up -d

The -d flag runs the container in the background. Docker will download the Ollama image if it’s missing. Then it launches the container.

Check the status with:

docker ps
CONTAINER ID   IMAGE                       COMMAND                  CREATED          STATUS          PORTS                                             NAMES
d57acc3d9909   ollama/ollama:latest        "/bin/ollama serve"      55 minutes ago   Up 55 minutes   0.0.0.0:11434->11434/tcp, [::]:11434->11434/tcp   ollama-container

You should see ollama-container running. If not, check logs with:

docker-compose logs

Congratulations! You now have Ollama running inside Docker. That’s half the journey completed.

Step 5: Test Ollama Locally

Testing ensures everything works. You can curl the Ollama endpoint to check (the output below comes from Windows PowerShell, where curl is an alias for Invoke-WebRequest; on Linux or macOS, plain curl simply prints “Ollama is running”):

curl http://localhost:11434
StatusCode        : 200
StatusDescription : OK
Content           : Ollama is running
RawContent        : HTTP/1.1 200 OK
                    Content-Length: 17
                    Content-Type: text/plain; charset=utf-8
                    Date: Wed, 27 Aug 2025 08:56:52 GMT

                    Ollama is running
Forms             : {}
Headers           : {[Content-Length, 17], [Content-Type, text/plain; charset=utf-8], [Date, Wed, 27 Aug 2025 08:56:52
                    GMT]}
Images            : {}
InputFields       : {}
Links             : {}
ParsedHtml        : mshtml.HTMLDocumentClass
RawContentLength  : 17

If Ollama responds, the container is active. Keep in mind that a fresh container ships without any models, so the logs and the next step will guide you.

At this stage, you have a working Ollama environment. Before connecting it to Spring AI, let’s install a model.

Install the llama2 model in the Docker container. Open a shell inside the running container first:

docker exec -it ollama-container bash

Then check which models are installed:

ollama list
NAME    ID    SIZE    MODIFIED 

If the list comes back empty, you need to pull a model into Ollama.

ollama pull llama2

Recheck the available models.

ollama list
NAME             ID              SIZE      MODIFIED      
llama2:latest    78e26419b446    3.8 GB    7 seconds ago
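
Before touching any Java code, you can sanity-check the model through Ollama’s REST API. This is a quick smoke test for a Linux or macOS shell; with "stream": false the /api/generate endpoint returns a single JSON object containing the response:

curl http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "Say hello in one sentence.",
  "stream": false
}'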

Step 6: Why Connect Ollama with Spring AI?

Spring AI simplifies AI integration into Java apps. It offers abstractions for prompts, models, and APIs. With Spring AI, you don’t need to handle raw HTTP calls manually. Instead, you use friendly interfaces that fit well with Spring Boot.

By connecting Ollama and Spring AI, you bring local AI power into your applications—eliminating the need to rely solely on cloud APIs. You control the model environment. That’s great for privacy, experimentation, and cost management.

Think of it as having your personal AI server, wrapped neatly inside your Java project.

Step 7: Add Spring AI Dependencies

Open your Spring Boot project, or generate a fresh one from the Spring Initializr at https://start.spring.io/. In pom.xml, add the Spring AI dependencies shown below; they allow your project to speak with the Ollama container.

<properties>
   <java.version>21</java.version>
   <spring-ai.version>1.0.1</spring-ai.version>
</properties>
<dependencies>
   <dependency>
       <groupId>org.springframework.boot</groupId>
       <artifactId>spring-boot-starter-web</artifactId>
   </dependency>
   <dependency>
       <groupId>org.springframework.ai</groupId>
       <artifactId>spring-ai-starter-model-ollama</artifactId>
   </dependency>
   <dependency>
       <groupId>org.springframework.boot</groupId>
       <artifactId>spring-boot-devtools</artifactId>
       <scope>runtime</scope>
       <optional>true</optional>
   </dependency>
   <dependency>
       <groupId>org.projectlombok</groupId>
       <artifactId>lombok</artifactId>
       <optional>true</optional>
   </dependency>
   <dependency>
       <groupId>org.springframework.boot</groupId>
       <artifactId>spring-boot-starter-test</artifactId>
       <scope>test</scope>
   </dependency>
</dependencies>
<dependencyManagement>
   <dependencies>
       <dependency>
           <groupId>org.springframework.ai</groupId>
           <artifactId>spring-ai-bom</artifactId>
           <version>${spring-ai.version}</version>
           <type>pom</type>
           <scope>import</scope>
       </dependency>
   </dependencies>
</dependencyManagement>

Step 8: Configure Spring AI to Use Ollama

Spring Boot makes configuration simple. In your application.yml, set Ollama as the provider:

spring:
  application:
    name: spring-ollama
  ai:
    ollama:
      base-url: http://localhost:11434
      chat:
        options:
          model: llama2:latest

Here:

  • base-url: Points to your running Ollama container.
  • chat.options.model: Selects the model Ollama should use. Adjust it to match a model you have pulled.

This slim configuration ties your Spring Boot app to Ollama.
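
If your project uses application.properties instead of YAML, the equivalent flat keys look like this (a minimal sketch; adjust the model name to whatever you pulled):

spring.application.name=spring-ollama
spring.ai.ollama.base-url=http://localhost:11434
spring.ai.ollama.chat.options.model=llama2:latest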

Check the Ollama model name.

curl http://localhost:11434/api/tags
{
    "models": [
        {
            "name": "llama2:latest",
            "model": "llama2:latest",
            "modified_at": "2025-08-27T08:14:50.2608108Z",
            "size": 3826793677,
            "digest": "78e26419b4469263f75331927a00a0284ef6544c1975b826b15abdaef17bb962",
            "details": {
                "parent_model": "",
                "format": "gguf",
                "family": "llama",
                "families": [
                    "llama"
                ],
                "parameter_size": "7B",
                "quantization_level": "Q4_0"
            }
        }
    ]
}

Step 9: Create a Simple Service in Spring AI

Next, write a service to interact with Ollama. Example in Java:

package com.example.spring_ollama.service;

import lombok.RequiredArgsConstructor;
import org.springframework.ai.ollama.OllamaChatModel;
import org.springframework.stereotype.Service;

@Service
@RequiredArgsConstructor
public class OllamaService {

    private final OllamaChatModel ollama;

    public String askOllama(String prompt) {
        return ollama.call(prompt);
    }
}

This service wraps Ollama calls. Your application can now send prompts and receive AI-generated responses.
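
As an alternative to calling the chat model directly, Spring AI also offers the fluent ChatClient API. Here is a minimal sketch of the same idea, assuming the auto-configured ChatClient.Builder bean that the Ollama starter provides (the class name OllamaChatClientService is just an example):

package com.example.spring_ollama.service;

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.stereotype.Service;

@Service
public class OllamaChatClientService {

    private final ChatClient chatClient;

    // ChatClient.Builder is auto-configured once a chat model (here, Ollama) is on the classpath
    public OllamaChatClientService(ChatClient.Builder builder) {
        this.chatClient = builder.build();
    }

    public String askOllama(String prompt) {
        // Set the user message, call the model, and return the response text
        return chatClient.prompt()
                .user(prompt)
                .call()
                .content();
    }
}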

Step 10: Build a Controller for Testing

To make things interactive, expose a REST endpoint. Example:

import com.example.spring_ollama.service.OllamaService;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/api/ollama")
public class OllamaController {

    private final OllamaService service;

    public OllamaController(OllamaService service) {
        this.service = service;
    }

    @GetMapping("/ask")
    public String ask(@RequestParam String prompt) {
        return service.askOllama(prompt);
    }
}

Now you can hit your Spring Boot app and chat with Ollama.

Step 11: Test the Integration

Start your Spring Boot app. With Ollama running in Docker, visit:

http://localhost:8080/api/ollama/ask?prompt=Why is the sky blue?
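
If you prefer the terminal, the same request works with curl once the query string is URL-encoded:

curl "http://localhost:8080/api/ollama/ask?prompt=Why%20is%20the%20sky%20blue%3F"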

If everything is correct, Ollama will respond. That means your integration works. You now have a Java app powered by Ollama inside Docker.

The sky appears blue to us because of a phenomenon called Rayleigh scattering. This is the scattering of sunlight by small particles in the atmosphere, such as nitrogen and oxygen molecules, and tiny dust particles. These particles absorb sunlight in all directions, but they scatter shorter (blue) wavelengths more than longer (red) wavelengths.

When sunlight enters Earth's atmosphere, it encounters these small particles and is scattered in all directions. The blue light is scattered more than the red light, so it reaches our eyes from all parts of the sky, giving the appearance of a blue sky. This effect is more pronounced when the sun is high in the sky, as there are more particles in the upper atmosphere to scatter the light.

The reason why we see the sky as blue and not, say, yellow or purple, is because our eyes are most sensitive to blue light. The human eye has cells in the retina that are most sensitive to light with a wavelength of around 450 nanometers (blue light), which is the same wavelength that is scattered most by the small particles in the atmosphere. So, when the sunlight enters our eyes, it appears blue because of the way the light is scattered and the sensitivity of our eyes.

It's worth noting that the color of the sky can appear different under different conditions. For example, during sunrise and sunset, the sky can take on hues of red, orange, and pink due to the scattering of light by larger particles in the atmosphere. Additionally, pollution and other atmospheric factors can also affect the color of the sky.

Step 12: Troubleshooting Tips

Things may break, and that’s fine. Common fixes:

  • If Ollama doesn’t start, check the Docker logs.
  • If Spring AI cannot connect, confirm the port mapping.
  • Ensure that both Docker and Spring Boot applications run simultaneously.
  • Update versions if APIs mismatch.

These checks usually solve most problems.
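
A few Compose commands cover most of these situations (the service name ollama comes from the docker-compose.yml above):

docker-compose logs -f ollama                   # follow the container logs
docker-compose restart ollama                   # restart just the Ollama service
docker-compose down && docker-compose up -d     # recreate the stack from scratch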

Next Steps: Expand Your Setup

You’ve built a strong base. From here, you can:

  • Add a frontend for better interaction.
  • Secure your API with authentication.
  • Deploy the stack on a server.
  • Build custom models locally with Ollama Modelfiles.

The possibilities are endless. You’ve unlocked the first step toward building AI-enabled Java applications.

For GPU (NVIDIA)

This setup enables GPU acceleration, which is highly recommended for faster model inference.

If you also want a model pulled automatically when the stack starts, extend docker-compose.yml with a one-shot helper container:

services:
  ollama:
    image: ollama/ollama:latest
    container_name: ollama-gpu
    ports:
      - "11434:11434"
    volumes:
      - ollama_data:/root/.ollama
    environment:
      - NVIDIA_VISIBLE_DEVICES=all
      - NVIDIA_DRIVER_CAPABILITIES=compute,utility
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
    restart: unless-stopped
    tty: true
    stdin_open: true
    depends_on:
      - model-puller
    
  model-puller:
    image: ollama/ollama:latest
    container_name: ollama-model-puller
    volumes:
      - ollama_data:/root/.ollama
    environment:
      - NVIDIA_VISIBLE_DEVICES=all
      - NVIDIA_DRIVER_CAPABILITIES=compute,utility
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
    entrypoint: |
      sh -c "
      ollama serve &
      sleep 10
      echo 'Pulling llama2 model...'
      ollama pull llama2
      echo 'Model pulled successfully!'
      pkill ollama
      "
    restart: "no"

volumes:
  ollama_data:
    driver: local

Verify GPU Usage

# Run nvidia-smi inside the running container
docker exec -it ollama-gpu nvidia-smi
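
You can also ask Ollama itself whether a loaded model is running on the GPU; recent Ollama versions include an ollama ps command that reports the processor in use:

docker exec -it ollama-gpu ollama ps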

Troubleshooting

Check NVIDIA drivers:

nvidia-smi

Verify Docker can access the GPU:

docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi

Check Docker daemon configuration (/etc/docker/daemon.json):

{
  "runtimes": {
    "nvidia": {
      "path": "nvidia-container-runtime",
      "runtimeArgs": []
    }
  },
  "default-runtime": "nvidia"
}
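
After editing daemon.json, restart the Docker daemon so the new runtime configuration takes effect:

sudo systemctl restart docker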

This setup will give you GPU-accelerated Ollama running in Docker, which will significantly improve inference speed compared to CPU-only execution.

Finally

You just created Ollama on Docker and connected it with Spring AI. That’s a powerful combo! Docker gave you consistency. Spring AI gave you simplicity. Together, they unlock creative projects without relying on the cloud.

Now it’s your turn. Experiment, break things, and improve them. Share your work with others. The community grows when we learn together.

You’ve got the tools, and you’ve got the steps. Time to build something amazing!

This article was originally published on Medium.
