JPA vs JDBC Performance — 10M Rows (PostgreSQL)

Experiment Summary

This page documents a performance experiment comparing Spring Data JPA and JdbcTemplate on PostgreSQL with a dataset of 10 million rows.

👉 Full narrative and analysis:
Read the Medium article


Test Environment

  • Java 25
  • Spring Boot 4.0.2
  • PostgreSQL 15
  • HikariCP (default)
  • Docker (local environment)
  • Fixed RPS load test

Dataset

  • Table: transactions
  • Rows: 10,000,000
  • Indexes:
    • (created_at DESC)
    • (status, created_at DESC)

Sample Data Generation Script (PostgreSQL)

Table Definition

CREATE TABLE transactions (
  id BIGSERIAL PRIMARY KEY,
  user_id BIGINT NOT NULL,
  status VARCHAR(20) NOT NULL,
  amount NUMERIC(12,2) NOT NULL,
  created_at TIMESTAMP NOT NULL
);

This schema intentionally mirrors a typical production table:

  • Time-based data
  • Status filtering
  • Read-heavy access patterns

Generate 10 Million Rows

This approach uses generate_series, which is far faster and more reliable than inserting rows from an application loop.

INSERT INTO transactions (user_id, status, amount, created_at)
SELECT
  (random() * 100000)::BIGINT AS user_id,
  CASE
    WHEN random() < 0.33 THEN 'SUCCESS'
    WHEN random() < 0.66 THEN 'FAILED'
    ELSE 'PENDING'
  END AS status,
  ROUND((random() * 10000)::numeric, 2) AS amount,
  NOW() - (random() * INTERVAL '365 days') AS created_at
FROM generate_series(1, 10000000);

Notes

  • Data distribution is intentionally realistic
  • created_at follows a time-series pattern
  • Status values are evenly spread
  • Insert time (local SSD):
    • ~2–5 minutes (native PostgreSQL)
    • ~5–8 minutes (Docker volume)

Indexes (Create After Insert)

Always create indexes after loading data.

CREATE INDEX idx_transactions_created_at
ON transactions (created_at DESC);

CREATE INDEX idx_transactions_status_created_at
ON transactions (status, created_at DESC);

Query Pattern

SELECT ...
FROM transactions
WHERE status = ?
ORDER BY created_at DESC
LIMIT ? OFFSET ?

This matches common pagination + filtering queries.

Update Planner Statistics

This step is mandatory for accurate performance testing.

ANALYZE transactions;

Without this, PostgreSQL may select inefficient query plans, skewing your results.

Example Query Used in the Experiment

This is the query pattern tested in both JPA and JDBC:

SELECT id, user_id, status, amount, created_at
FROM transactions
WHERE status = 'SUCCESS'
ORDER BY created_at DESC
LIMIT 50 OFFSET 5000;

This reflects:

  • Dashboard views
  • Admin panels
  • Reporting pages

Why This Dataset Works Well for Performance Testing

  • Large enough to expose hidden ORM costs
  • Still small enough to run locally
  • Reproducible
  • No artificial optimizations

This setup highlights application-side overhead, not database weakness.

Optional: Cleanup Script

DROP TABLE transactions;

JPA Entity + Repository

import jakarta.persistence.*;
import java.math.BigDecimal;
import java.time.LocalDateTime;

@Entity
@Table(name = "transactions",
       indexes = {
           @Index(name = "idx_created_at", columnList = "created_at"),
           @Index(name = "idx_status_created_at", columnList = "status, created_at")
       })
public class TransactionEntity {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    @Column(name = "user_id", nullable = false)
    private Long userId;

    @Column(nullable = false, length = 20)
    private String status;

    @Column(nullable = false, precision = 12, scale = 2)
    private BigDecimal amount;

    @Column(name = "created_at", nullable = false)
    private LocalDateTime createdAt;

    protected TransactionEntity() {
        // JPA only
    }

    // getters only (read-only use case)
    public Long getId() { return id; }
    public Long getUserId() { return userId; }
    public String getStatus() { return status; }
    public BigDecimal getAmount() { return amount; }
    public LocalDateTime getCreatedAt() { return createdAt; }
}

Why this matters

  • Full entity mapping
  • Managed persistence context
  • Dirty checking enabled (even for reads)
  • Object hydration cost is real

This is exactly how JPA is used in most production systems.

Spring Data JPA Repository

import org.springframework.data.domain.Page;
import org.springframework.data.domain.Pageable;
import org.springframework.data.jpa.repository.JpaRepository;

public interface TransactionRepository
        extends JpaRepository<TransactionEntity, Long> {

    Page<TransactionEntity> findByStatusOrderByCreatedAtDesc(
            String status,
            Pageable pageable
    );
}

JPA Service Method (Used in the Test)

import org.springframework.data.domain.PageRequest;
import org.springframework.data.domain.Pageable;
import org.springframework.stereotype.Service;

@Service
public class JpaTransactionService {

    private final TransactionRepository repository;

    public JpaTransactionService(TransactionRepository repository) {
        this.repository = repository;
    }

    public void fetchPage(String status, int page, int size) {
        Pageable pageable = PageRequest.of(page, size);
        repository.findByStatusOrderByCreatedAtDesc(status, pageable);
    }
}

⚠️ Important for fairness

  • No projections
  • No read-only hints
  • No native queries
  • Default behavior only

JdbcTemplate Implementation (Comparison Baseline)

Row Mapper

import org.springframework.jdbc.core.RowMapper;

import java.math.BigDecimal;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.time.LocalDateTime;

public class TransactionRowMapper implements RowMapper<TransactionDto> {

    @Override
    public TransactionDto mapRow(ResultSet rs, int rowNum) throws SQLException {
        return new TransactionDto(
                rs.getLong("id"),
                rs.getLong("user_id"),
                rs.getString("status"),
                rs.getBigDecimal("amount"),
                rs.getTimestamp("created_at").toLocalDateTime()
        );
    }
}

DTO (Not an Entity)

import java.math.BigDecimal;
import java.time.LocalDateTime;

public record TransactionDto(
        Long id,
        Long userId,
        String status,
        BigDecimal amount,
        LocalDateTime createdAt
) {}

Key difference:

  • No persistence context
  • No lifecycle tracking
  • No dirty checking
  • No proxies

JdbcTemplate Service

import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.stereotype.Service;

import java.util.List;

@Service
public class JdbcTransactionService {

    private final JdbcTemplate jdbcTemplate;
    private final TransactionRowMapper rowMapper = new TransactionRowMapper();

    public JdbcTransactionService(JdbcTemplate jdbcTemplate) {
        this.jdbcTemplate = jdbcTemplate;
    }

    public List<TransactionDto> fetchPage(
            String status,
            int limit,
            int offset
    ) {
        String sql = """
            SELECT id, user_id, status, amount, created_at
            FROM transactions
            WHERE status = ?
            ORDER BY created_at DESC
            LIMIT ? OFFSET ?
            """;

        return jdbcTemplate.query(
                sql,
                rowMapper,
                status,
                limit,
                offset
        );
    }
}

REST Controller for Load Testing (JPA vs JDBC)

Controller

import org.springframework.web.bind.annotation.*;

@RestController
@RequestMapping("/load-test")
public class LoadTestController {

    private final JpaTransactionService jpaService;
    private final JdbcTransactionService jdbcService;

    public LoadTestController(
            JpaTransactionService jpaService,
            JdbcTransactionService jdbcService
    ) {
        this.jpaService = jpaService;
        this.jdbcService = jdbcService;
    }

    /**
     * JPA endpoint
     *
     * Example:
     * GET /load-test/jpa?status=SUCCESS&page=100&size=50
     */
    @GetMapping("/jpa")
    public void jpaTest(
            @RequestParam String status,
            @RequestParam int page,
            @RequestParam int size
    ) {
        jpaService.fetchPage(status, page, size);
    }

    /**
     * JDBC endpoint
     *
     * Example:
     * GET /load-test/jdbc?status=SUCCESS&limit=50&offset=5000
     */
    @GetMapping("/jdbc")
    public void jdbcTest(
            @RequestParam String status,
            @RequestParam int limit,
            @RequestParam int offset
    ) {
        jdbcService.fetchPage(status, limit, offset);
    }
}

JMeter Configuration (Recommended)

Thread Group

  • Threads: 10–200 (start small)
  • Ramp-up: 30–60 sec
  • Loop count: Forever (with test duration)
  • Durations: 300-600 sec
  • Startup delay: 30 sec

Result Summary

MetricJPAJDBCDifference
Average latency~4,000 ms~5 ms~800× slower
P95 latency~18,000 ms~12 ms~1,500× slower
ThroughputHundreds req/minMillions req/minOrders of magnitude
CPU usage (DB)Low–moderateLow–moderateSimilar
CPU usage (App)Very highLowApp bottleneck
Memory pressureHighLowPersistence context
GC activityFrequentMinimalObject churn

Why the Difference Is So Huge (Not 2×, But 1000×)

The gap isn’t about SQL speed.

It’s about where the work happens.

  • JDBC pays once per query
  • JPA pays per row, per request, per transaction

For each row, JPA:

  • Creates entity objects
  • Tracks them in the persistence context
  • Performs dirty checking
  • Increases heap usage and GC pressure

At a small scale, this cost is invisible.
With millions of rows and concurrent traffic, it explodes.

That’s why JDBC stays in milliseconds
while JPA drifts into seconds — or worse.


How to Keep JPA — Without Falling Into the Trap

JPA isn’t slow because the database is slow.
It’s slow because it does a lot of hidden work per row.

To keep JPA and avoid performance collapse:

  • Use entities for writes, not for heavy reads
    Reads should return DTOs or projections.
  • Mark read paths as @Transactional(readOnly = true)
    This disables unnecessary dirty checking.
  • Clear the persistence context for large reads
    Prevents memory buildup and GC pressure.
  • Avoid OFFSET pagination on large tables
    Switch to keyset (cursor-based) pagination.
  • Use native queries or JDBC selectively
    Especially for reporting and dashboards.

JPA should model the domain, not serve every query.


Key Observations

  • Pagination cost increased with the dataset size
  • Database CPU remained stable
  • Application-side overhead dominated latency
  • JPA object lifecycle added measurable cost

Reproducibility

  • SQL data generation script
  • Load test configuration
  • Application configuration

Github


Reference

This experiment supports the analysis described in the Medium article and is intended as a technical reference, not a tutorial.