A practical backend lesson on summary tables, mutable transaction data, and why historical reports need snapshots or data lineage.
“Why does the summary not match the transaction details?”
That question usually starts as a reporting issue.
The API still works. The database is still running. The batch job still finishes. No service is crashing.
But users open the same monthly report and see different numbers.
That is where the system starts failing in a more serious place: trust.
In this scenario, the report generator was technically working. The real problem was that the system did not preserve the exact data used to create the summary.
A reporting snapshot fixes that by saving the dataset used for a report. Instead of only saving the final total, the system also saves which records were included, when the report was generated, and which report run produced the result.
The Real Problem
There were two reports.
The first report was a summary report.
It showed values like:
- Total transactions
- Total amount
- Monthly totals
- Grouped numbers by customer, merchant, product, or status
The second report was a transaction detail report.
It showed the individual records behind the summary.
Users expected both reports to match.
For example, if the summary report showed 10,000 transactions, the detail report should also show 10,000 transaction rows.
If the summary report showed a total amount of 5,000,000, the transaction details should add up to 5,000,000.
That expectation is fair.
But the result was different.
For the same month, the summary showed one number and the detail report showed another. If the report was regenerated later, the number could change again.
Same month.
Same report.
Different numbers.
First Things Developers Usually Check
When summary and detail reports do not match, the developer usually checks the SQL first.
That is the right first step.
Common causes include:
- Wrong SQL query
- Different filters between summary and detail
- Timezone issues
- Different transaction statuses
- Missing joins
- Incorrect date range
- Duplicate records
- Excluded cancelled transactions
- Different rounding rules
For example, the summary query may include only COMPLETED transactions, while the detail query includes both COMPLETED and PENDING.
That kind of mismatch is common.
But in this scenario, the deeper problem was not only SQL.
It was the data model.
The Real Root Cause
The system stored aggregated summary data.
But it did not store which transaction records were used to create that summary.
In simple terms, the summary table stored the answer, but not the evidence behind the answer.
The transaction table kept changing after the summary was generated.
Some records were updated after month-end.
Some transactions were cancelled later.
Some adjustments arrived late.
Some corrections came from operations.
Some records were inserted after the monthly summary had already been calculated.
So the summary and detail report were looking at different versions of reality.
The summary was correct when it was generated.
The detail report was correct when the user opened it.
But the underlying data changed between those two moments.
Without a snapshot, the system could not reproduce the original report.
Why Both Reports Can Be “Correct” But Still Not Match
This is the confusing part.
The summary report may be correct for Monday at 9 AM.
The detail report may be correct for Monday at 3 PM.
But if transaction data changed between 9 AM and 3 PM, they will not match.
For example:
09:00 AM Summary generated: 10,000 transactions Total amount = 5,000,000 03:00 PM 200 old transactions are adjusted or cancelled 03:30 PM User opens detail report: 9,800 transactions Total amount = 4,920,000
Both reports are using valid data.
But they are not using the same data at the same point in time.
From the user’s perspective, this looks broken.
From the system’s perspective, it is doing exactly what the design allows.
The missing concept is a fixed reporting dataset.
Example Setup
Assume the backend uses:
- Java
- Spring Boot
- SQL
- PostgreSQL
- A monthly transaction report
- A live transactions table
- A summary report table
- A detail report screen
The same idea applies to MySQL, Oracle, SQL Server, or other relational databases.
The database engine is not the main issue.
The main issue is whether the report can reproduce the same historical result later.
Example Transaction Table
A simple transaction table may look like this:
CREATE TABLE transactions (
id BIGINT PRIMARY KEY,
transaction_date DATE NOT NULL,
amount DECIMAL(18, 2) NOT NULL,
status VARCHAR(30) NOT NULL,
updated_at TIMESTAMP NOT NULL
);This table stores live transaction data.
If the system allows updates to old rows, this table is mutable. That means the data for January can still change in February, March, or later.
That is risky for historical reporting if the report reads directly from this table every time.
Example Summary Table Without Lineage
A basic summary table may look like this:
CREATE TABLE monthly_report_summary_old (
id BIGINT PRIMARY KEY,
report_month DATE NOT NULL,
total_transactions BIGINT NOT NULL,
total_amount DECIMAL(18, 2) NOT NULL,
generated_at TIMESTAMP NOT NULL
);This table stores monthly totals.
But it does not store which transactions were included.
The developer can see the final count and amount, but cannot explain exactly which transaction IDs created those numbers.
That is the weakness.
A summary without lineage is hard to verify later.
The Missing Concept: Reporting Snapshot
A historical report needs a fixed dataset.
The system should know:
- When the report was generated
- Which transactions were included
- Which version of data was used
- Who or what generated the report
- Whether the month was still open or already closed
- Whether the report is draft, verified, or final
Without this information, the report is just a query result from a moving database.
That may be acceptable for operational dashboards.
It is not enough for historical monthly reports where users expect the same result every time.
Better Design Option 1: Store A report_run_id
The developer can create one unique ID for each report generation.
For example:
report_run_id = 2026-01-monthly-report-v1
Then the summary and detail data can both reference the same report run.
Start with a report_run table:
CREATE TABLE report_run (
id BIGINT PRIMARY KEY,
report_month DATE NOT NULL,
report_name VARCHAR(100) NOT NULL,
status VARCHAR(30) NOT NULL,
generated_at TIMESTAMP NOT NULL,
generated_by VARCHAR(100)
);This table represents one report generation.
For example, January monthly report version 1 can have one report_run record. If the report is regenerated later, the system can create another run.
This gives the report a clear identity.
Add Summary Data With report_run_id
Now the summary table can reference the report run:
CREATE TABLE monthly_report_summary (
id BIGINT PRIMARY KEY,
report_run_id BIGINT NOT NULL,
total_transactions BIGINT NOT NULL,
total_amount DECIMAL(18, 2) NOT NULL,
FOREIGN KEY (report_run_id) REFERENCES report_run(id)
);This table no longer stores only “January summary.”
It stores “the summary for this specific report run.”
That matters because January may have more than one report version if data changes or corrections are needed.
Better Design Option 2: Store Report Detail Snapshots
The developer can also store the transaction rows used for the report.
CREATE TABLE monthly_report_detail (
id BIGINT PRIMARY KEY,
report_run_id BIGINT NOT NULL,
transaction_id BIGINT NOT NULL,
transaction_date DATE NOT NULL,
amount DECIMAL(18, 2) NOT NULL,
status VARCHAR(30) NOT NULL,
FOREIGN KEY (report_run_id) REFERENCES report_run(id)
);This table is a snapshot of the detail records used during report generation.
The report detail screen should read from this table by report_run_id, not directly from the live transactions table.
That way, opening the same historical report later returns the same detail rows.
This design uses more storage, but it gives much better reproducibility.
Example Reconciliation Query
After saving summary and detail snapshot data, the developer can reconcile them.
SELECT
s.total_transactions AS summary_count,
COUNT(d.id) AS detail_count,
s.total_amount AS summary_amount,
COALESCE(SUM(d.amount), 0) AS detail_amount
FROM monthly_report_summary s
JOIN monthly_report_detail d
ON d.report_run_id = s.report_run_id
WHERE s.report_run_id = 1001
GROUP BY s.total_transactions, s.total_amount;This query compares the summary count and amount against the detail snapshot.
If summary_count matches detail_count, and summary_amount matches detail_amount, the report is internally consistent.
The important part is that both sides use the same report_run_id.
Example Spring Boot Flow
A Spring Boot report generation flow can look like this:
public void generateMonthlyReport(YearMonth month) {
ReportRun run = reportRunService.create(month);
List<Transaction> transactions = transactionRepository.findForReportMonth(month);
reportDetailSnapshotService.save(run.getId(), transactions);
reportSummaryService.calculateAndSave(run.getId(), transactions);
reconciliationService.verify(run.getId());
reportRunService.markVerified(run.getId());
}This flow creates a report run, loads the transaction data for the month, stores the detail snapshot, calculates the summary from the same transaction list, verifies the result, and marks the report as verified.
The summary and detail now come from the same dataset.
That is the main design improvement.
Example Repository Query
The transaction query may look like this:
@Query("""
select t
from Transaction t
where t.transactionDate >= :startDate
and t.transactionDate < :endDate
and t.status in :statuses
order by t.id
""")
List<Transaction> findForReportPeriod(
LocalDate startDate,
LocalDate endDate,
List<TransactionStatus> statuses
);This query loads transactions for the report period.
The developer should make sure this query is used consistently during report generation.
If the detail screen later uses a different query against live data, the mismatch can return.
That is why the detail report should read from the snapshot table after generation.
Better Design Option 3: Use Immutable Transactions
Another option is to make transaction records immutable.
Instead of updating old transaction rows directly, the system stores corrections as new adjustment records.
For example:
Original transaction: TXN-1001 amount = 100.00 Adjustment transaction: TXN-1001-ADJ amount = -20.00 Final business amount: 80.00
This design is common in ledger-like systems.
It makes historical reconstruction easier because the system does not overwrite the past. It records what changed.
The developer can still build snapshots, but immutable transactions reduce the risk of silent historical changes.
Better Design Option 4: Add A Monthly Closing Process
A monthly closing process defines when a report period becomes fixed.
For example:
January report period: OPEN -> transactions can still change VERIFYING -> reconciliation is running CLOSED -> direct changes are blocked ADJUSTED -> corrections must use adjustment records
Once the month is closed, the system should prevent direct updates to closed-period data.
If a correction is needed, it should go through an adjustment process.
This makes “month-end” a system rule, not just an assumption in someone’s head.
Current Workaround: Recalculate Before Generating
A common workaround is to recalculate or sync summary and transaction data before generating the report.
This can make the report consistent at that moment.
For example:
1. Read current transaction data. 2. Recalculate summary. 3. Generate summary report. 4. Generate detail report from the same current data.
This helps with today’s mismatch.
But it has trade-offs:
- Report generation becomes slower.
- The process becomes more operationally complex.
- Historical numbers can still change over time.
- Users may see a different January report depending on when they open it.
- The system still may not explain which records were included in the old version.
This workaround is useful, but it is not the same as historical reproducibility.
Step-By-Step Implementation Plan
The developer can improve the design gradually.
- Define whether the report must be reproducible.
- Add a report_run table.
- Save report generation time, month, status, and generated user or job.
- Save summary data with report_run_id.
- Save included transaction IDs or copied transaction details.
- Make the detail report read by report_run_id, not live date range only.
- Add reconciliation between summary and detail.
- Add a period closing rule.
The developer does not always need the most complex solution on day one.
But the system should at least know which dataset produced which report.
Run Or Test The Design
The developer can test the issue with a small example.
First, create one report run:
INSERT INTO report_run (
id,
report_month,
report_name,
status,
generated_at,
generated_by
)
VALUES (
1001,
DATE '2026-01-01',
'MONTHLY_TRANSACTION_REPORT',
'VERIFYING',
CURRENT_TIMESTAMP,
'system'
);Then insert detail snapshot rows:
INSERT INTO monthly_report_detail (
id,
report_run_id,
transaction_id,
transaction_date,
amount,
status
)
VALUES
(1, 1001, 501, DATE '2026-01-05', 100.00, 'COMPLETED'),
(2, 1001, 502, DATE '2026-01-06', 200.00, 'COMPLETED'),
(3, 1001, 503, DATE '2026-01-07', 300.00, 'COMPLETED');Then insert the summary:
INSERT INTO monthly_report_summary (
id,
report_run_id,
total_transactions,
total_amount
)
VALUES (
1,
1001,
3,
600.00
);Now run the reconciliation query.
The summary and detail should match.
Even if the live transactions table changes later, this report run still has its own saved detail records.
That is the value of the snapshot.
Expected Result
After adding snapshots or lineage:
- The summary and detail report match because they use the same report_run_id.
- Developers can explain which transactions were included.
- Opening the same historical report later returns the same result.
- Late updates do not silently change old reports.
- Users can trust monthly report numbers.
- Reconciliation becomes easier and more automatic.
The report becomes explainable.
That is a big difference.
Performance Notes
Snapshot tables use extra storage.
If the monthly report has millions of transaction rows, copying every detail row can be expensive.
The developer should think about:
- Indexes on report_run_id
- Partitioning snapshot tables by report month
- Storing only required report fields
- Compressing archived partitions if supported
- Keeping snapshot retention rules clear
- Using batch inserts for large detail snapshots
For example:
CREATE INDEX idx_monthly_report_detail_run_id ON monthly_report_detail (report_run_id);
This index helps the detail report load records for a specific report run faster.
For very large reports, the developer may need pagination, streaming exports, or asynchronous file generation.
Security And Audit Notes
Reports can contain sensitive financial, customer, or operational data.
The developer should track who generated, verified, approved, downloaded, or regenerated a report.
A report run table can be extended like this:
ALTER TABLE report_run ADD COLUMN verified_at TIMESTAMP, ADD COLUMN verified_by VARCHAR(100), ADD COLUMN finalized_at TIMESTAMP, ADD COLUMN finalized_by VARCHAR(100);
This gives the team a better audit trail.
For compliance-heavy systems, this may be required.
Even for normal business systems, it helps during debugging.
Important Notes
A summary table alone is not enough for historical reporting.
Live transaction data is risky if records can change after month-end.
If users expect stable monthly reports, use snapshots, versioning, or period closing.
Recalculating every time may fix today’s mismatch but can make history unstable.
Do not assume “month ended” means data stopped changing.
Store enough information to explain how the report number was created.
Reporting consistency is not only about correct SQL.
Summary and detail must point to the same dataset at the same point in time.
Recommendation
Before designing a monthly report, ask one important question:
Does this report need to be reproducible?
If the answer is yes, the developer should not rely only on live transaction data and a summary table.
Use one or more of these patterns:
- report_run_id
- Detail snapshot table
- Immutable transactions
- Period closing
- Audit trail
- Reconciliation query
- Report versioning
The exact design depends on the business requirement.
But the principle is simple.
If the report needs to explain the past, the system must preserve the past.
Conclusion
The same monthly report keeps showing different numbers when the system does not preserve the dataset behind the report.
In this case, the summary could be correct when it was generated, and the detail report could be correct when it was opened. But both reports were reading different versions of changing transaction data.
The fix was not only better SQL.
The better design was to add report lineage: a report_run_id, detail snapshots, reconciliation, audit fields, and clear reporting boundaries.
A report is not only a query.
It is a statement about data at a point in time.
If the system cannot answer when the number was calculated, which transactions were included, and whether opening it again will show the same result, then the report is not fully reliable yet.



