Logs Are Stories. We Built Evidence.

May 11, 2026 | 13 min read | Engineering

Every system produces logs. Terabytes of them. Structured, unstructured, shipped to Splunk, archived to S3, forgotten. But here's the thing about logs: they're stories told by the system being audited. And stories can be edited.

This is not a metaphor. It is the literal operational reality of every logging system in production today. Your application writes a line to a log file. That line says "user 47293 authenticated successfully at 14:23:17." How do you know that line is real? Because the application wrote it. The same application that you are trying to audit. The same application that could have been compromised, misconfigured, or instructed to write whatever its operators wanted it to write.

The entire compliance industry -- SOC 2, HIPAA, PCI DSS, FedRAMP, CMMC -- rests on a foundation of logs. And that foundation has a structural flaw that everyone acknowledges and nobody fixes: logs are written by the entity being audited, stored by the entity being audited, and interpreted by tools purchased by the entity being audited. The fox does not just guard the henhouse. The fox writes the incident report.

82%: Breaches with altered logs (Verizon DBIR)
0: Traditional logs with integrity proof
3: PQ signatures per Cachee entry

What Logs Actually Are

Strip away the vendor marketing, the compliance frameworks, the SIEM dashboards with their color-coded severity levels, and look at what a log actually is. It is text. Written to a file. By a process. On a machine. That is the entire trust model.

The process decides what to write. If the process has a bug, it writes the wrong thing. If the process is compromised, it writes whatever the attacker wants. If the process is misconfigured, it writes nothing at all. The log is not an independent record of what happened. It is a first-person narrative written by one of the participants in the event being recorded. In any other domain, this would be called testimony, not evidence. And testimony is only as reliable as the witness.

Mutable

A sed command rewrites history. sed -i 's/FAILED/SUCCESS/g' /var/log/auth.log turns every failed authentication into a successful one. The file modification timestamp changes, but timestamps are mutable too. touch -t 202605110000 /var/log/auth.log resets the timestamp to whatever you want. On a compromised system, there is no reliable way to distinguish an authentic log from a rewritten one. The bits on disk look the same either way.
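The same rewrite can be sketched in a few lines of Python. This is a minimal illustration on a throwaway temporary file, not a real log; the helper name rewrite_preserving_mtime is hypothetical. It changes the content and then puts the modification time back, which is what the sed and touch pair above does.

```python
import os
import tempfile


def rewrite_preserving_mtime(path: str, old: str, new: str) -> None:
    """Replace text in a file, then restore its original timestamps."""
    before = os.stat(path)
    with open(path, "r", encoding="utf-8") as f:
        content = f.read()
    with open(path, "w", encoding="utf-8") as f:
        f.write(content.replace(old, new))
    # Put access and modification times back to their pre-edit values.
    # The inode change time (ctime) is set by the kernel and is not reset here.
    os.utime(path, ns=(before.st_atime_ns, before.st_mtime_ns))


# Demo on a temporary file only.
with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "auth.log")
    with open(log_path, "w", encoding="utf-8") as f:
        f.write("14:23:17 user 47293 authentication FAILED\n")

    mtime_before = os.stat(log_path).st_mtime_ns
    rewrite_preserving_mtime(log_path, "FAILED", "SUCCESS")
    mtime_after = os.stat(log_path).st_mtime_ns

    with open(log_path, encoding="utf-8") as f:
        rewritten = f.read()
```

After the call, the file says SUCCESS and reports the same modification time it had before the edit. Nothing in the file itself records that it was changed.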

Centralized log aggregators (Splunk, Elastic, Datadog) receive logs over the network and store them in a separate system. This adds a layer of difficulty for an attacker, but it does not solve the fundamental problem. The aggregator stores what it received. If the source sends modified logs, the aggregator stores modified logs. If the source stops sending logs during a breach, the aggregator has a gap -- but gaps are common in noisy production environments, and most teams do not investigate every gap. The aggregator trusts the source. The source is the entity being audited.

Deletable

rm -rf /var/log/ removes every log on the machine. Selective deletion is equally straightforward: remove specific lines from specific files, truncate the file to remove recent entries, or overwrite the file with a sanitized version. On systems with centralized logging, an attacker who compromises the network path between the source and the aggregator can intercept and filter logs in transit. On systems with agent-based log shipping, compromising the agent gives control over what is shipped.

Log retention policies compound the problem. Most organizations delete logs after 90 or 365 days. If a breach is not detected within that window, the evidence is already gone. And detection is slow: IBM's Cost of a Data Breach report put the mean time to identify a breach at 204 days in its 2023 edition, more than twice a 90-day retention window. Not because the attacker deleted it. Because the victim's own retention policy did.

Unverifiable

How do you prove that log line 47,293 is authentic? That it was written at the time it claims? That it was written by the process it claims? That it has not been modified since it was written? You cannot. There is no cryptographic binding between a log line and the event it describes. There is no hash chain that detects insertions, deletions, or modifications. There is no signature that proves authorship. The log line is text. Its authenticity depends entirely on your trust in the system that produced it.

This is not a theoretical concern. In breach investigations, log integrity is one of the first things forensic teams attempt to verify -- and one of the most common things they find compromised. Attackers who gain root access routinely modify logs to cover their tracks. The 2024 Verizon Data Breach Investigations Report found that in 82% of breaches involving insider threats, log data was altered or deleted. Logs are the first casualty of a sophisticated attack because attackers understand that logs are the primary evidence used to investigate them.

Disconnected

Nothing binds a log entry to the actual event it describes. A log line says "computed fraud score 0.73 for transaction TX-8829." But what connects that log line to the actual computation? What proves that the inputs were what the log claims? What proves that the model version was what the log claims? What proves that the result was actually 0.73 and not 0.37? Nothing. The log line is a claim. The computation is a separate event. The connection between them is the same process that wrote the log -- the process being audited.

In a properly instrumented system, you might have correlated request IDs, distributed traces, and structured log formats that link log entries together. These are better than unstructured logs. But they still depend on the application to emit correct, complete, and unmodified data. Correlation IDs do not prove integrity. Distributed traces do not prove authenticity. They organize the story. They do not prove the story is true.

The Fundamental Problem with Logs

Logs are first-person narratives written by the system being audited. They are mutable (a sed command rewrites history), deletable (rm removes the evidence), unverifiable (no hash, no signature, no proof of authenticity), and disconnected (nothing binds the log to the actual event). An audit built on logs is an audit built on testimony. Testimony from the entity being audited. In any other domain, this would be considered insufficient. In infrastructure, it is considered standard practice.

What Evidence Requires

Evidence is not a better log. It is a fundamentally different thing. Where a log is a story, evidence is a mathematical artifact. Where a log's reliability depends on trust in its source, evidence's reliability depends on trust in mathematics. The distinction is not semantic. It is structural. Evidence requires three properties that logs do not have and cannot be retrofitted to provide.

Property 1: Integrity

Integrity means mathematical proof that nothing was modified. Not "we believe nothing was modified because the file timestamps look right." Not "we have no evidence of modification." Mathematical proof. A hash chain where every entry includes the hash of the previous entry, making any modification detectable. A content-addressed store where the entry's location is determined by its contents, making modification equivalent to relocation. A digital signature that binds the entry to a specific key at a specific time, making unauthorized modification equivalent to signature forgery.

Integrity is not about preventing modification. An attacker with sufficient access can modify anything. Integrity is about making modification detectable. A hash chain does not prevent someone from changing an entry. It ensures that changing an entry breaks the chain, and the break is discoverable by anyone who verifies the chain. The modification might succeed. The concealment will not.
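Of the three mechanisms named above, the content-addressed store is the easiest to sketch. The class below is a hypothetical in-memory illustration (ContentAddressedStore is not a Cachee API): the address is the SHA3-256 digest of the bytes, so a read can re-hash what it finds and refuse anything that no longer matches.

```python
import hashlib


class ContentAddressedStore:
    """Toy in-memory store where each value's address is its own hash."""

    def __init__(self) -> None:
        self._blobs = {}

    def put(self, data: bytes) -> str:
        address = hashlib.sha3_256(data).hexdigest()
        self._blobs[address] = data
        return address

    def get(self, address: str) -> bytes:
        data = self._blobs[address]
        # Re-hash on every read: modified bytes no longer match the address.
        if hashlib.sha3_256(data).hexdigest() != address:
            raise ValueError("stored bytes do not match their address")
        return data


store = ContentAddressedStore()
address = store.put(b"computed fraud score 0.73 for transaction TX-8829")

# Simulate an attacker editing the stored bytes in place.
store._blobs[address] = b"computed fraud score 0.37 for transaction TX-8829"

try:
    store.get(address)
    tamper_detected = False
except ValueError:
    tamper_detected = True
```

The edit itself succeeds, as the text says it can. What fails is the attempt to pass the edited bytes off under the original address.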

Property 2: Completeness

Completeness means proof that nothing was deleted. A hash chain provides this: each entry references the previous entry by hash, so deleting an entry creates a gap that is detected during chain verification. Sequence numbers provide an additional layer: when every entry carries the next consecutive sequence number, a deleted entry leaves a gap in the sequence that is detectable even without checking the hash chain.

Completeness is harder than integrity. You can prove that an existing entry has not been modified (check its hash). Proving that a missing entry ever existed requires a structure that references it. This is why Cachee uses both hash chains and sequence numbers: the hash chain detects deletions by breaking the reference chain, and the sequence numbers detect deletions by creating gaps in the expected sequence. Both mechanisms must be present because each catches cases the other might miss.
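A minimal sketch of the two checks running side by side, under assumptions of our own: the Record layout, the GENESIS placeholder, and the audit function are hypothetical names for illustration, not Cachee's format. The verifier reports sequence gaps and broken hash links separately so it is visible which check fired.

```python
import hashlib
from dataclasses import dataclass

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first record


@dataclass(frozen=True)
class Record:
    seq: int
    payload: str
    prev_hash: str

    def digest(self) -> str:
        # seq and prev_hash have fixed formats and payload comes last,
        # so the "|" separator cannot be confused with payload content.
        material = f"{self.seq}|{self.prev_hash}|{self.payload}".encode()
        return hashlib.sha3_256(material).hexdigest()


def build(payloads):
    records, prev = [], GENESIS
    for seq, payload in enumerate(payloads):
        record = Record(seq, payload, prev)
        records.append(record)
        prev = record.digest()
    return records


def audit(records):
    """Return (sequence_gaps, broken_links) as lists of list positions."""
    gaps, broken = [], []
    for i in range(1, len(records)):
        if records[i].seq != records[i - 1].seq + 1:
            gaps.append(i)
        if records[i].prev_hash != records[i - 1].digest():
            broken.append(i)
    return gaps, broken


chain = build(["login", "read", "revoke", "logout"])

# Delete the "revoke" record from the middle.
deleted = chain[:2] + chain[3:]

# Rewrite the "read" record in place, keeping its sequence number.
modified = [chain[0], Record(1, "write", chain[1].prev_hash)] + chain[2:]
```

Here audit(deleted) flags position 2 in both lists, while audit(modified) reports no sequence gap and one broken link, a case only the hash check catches. Neither check notices audit(chain[:3]), a truncated tail, because nothing left in the list references the missing record. Catching that requires the verifier to know the expected latest sequence number or head hash from somewhere else.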

Property 3: Independence

Independence means verification without trusting the source. This is the property that separates evidence from testimony. A witness's testimony is only as reliable as the witness. Evidence is reliable regardless of who presents it, because the verification depends on mathematics, not on trust.

Independence requires self-contained verification artifacts. The evidence must include everything needed to verify it: the data, the signature, the public key, the algorithm identifier, the hash chain context. A verifier should not need to contact the source system to verify the evidence. A verifier should not need to trust the source system. A verifier should not need anything beyond the artifact itself and the mathematical operations to check it.

This is the standard applied to physical evidence in legal proceedings. A DNA sample does not require the laboratory that collected it to interpret it. Any qualified laboratory can re-test the sample and arrive at the same conclusion. The evidence is independent of its source. Digital evidence should meet the same standard.

How Cachee Produces Evidence, Not Logs

Cachee does not have a logging subsystem. It has an evidence production pipeline. Every operation produces a cryptographic artifact that satisfies all three properties: integrity, completeness, and independence. Here is how each property is implemented.

Every Value Is Signed

When a value enters the cache, it is signed by three independent post-quantum signature families: ML-DSA-65 (FIPS 204, based on module lattices), FALCON-512 (selected by NIST in Round 3, based on NTRU lattices), and SLH-DSA-SHA2-128f (FIPS 205, stateless and hash-based). Three families. Three distinct hardness assumptions, two of them lattice-based and one hash-based. An attacker forges a fully verified entry only by breaking all three, and must break at least two to pass the 2-of-3 verification threshold.

The signatures are not applied to the raw value. They are applied to a content hash that includes the value, the computation fingerprint, and the metadata. This means the signature attests not just to "what the value is" but to "what computation produced this value, with what inputs, using what parameters, on what software version, on what hardware class." The signature covers the full provenance, not just the bytes.

Every Computation Is Fingerprinted

The computation fingerprint is SHA3-256(input_hash || computation_hash || parameter_hash || version || hardware_class). This binds the cached value to the exact computation that produced it. Change the input, and the fingerprint changes. Change the model version, and the fingerprint changes. Change a hyperparameter, and the fingerprint changes. Deploy to different hardware, and the fingerprint changes.

The fingerprint is not a log entry that says "we computed this." It is a mathematical binding between the result and the computation. Given the fingerprint and the source data, anyone can verify that the fingerprint matches. Given a value and a mismatched fingerprint, anyone can prove the value did not come from the claimed computation. The fingerprint is not a story about the computation. It is a hash of the computation's identity.
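The fingerprint formula can be sketched with Python's standard hashlib. Two things here are assumptions rather than Cachee's documented encoding: the function name fingerprint, and the 4-byte length prefix placed before each field. The prefix is added because version and hardware_class are variable-length, and plain concatenation would let "1.1" followed by "0x" hash the same as "1.10" followed by "x".

```python
import hashlib


def sha3(data: bytes) -> bytes:
    return hashlib.sha3_256(data).digest()


def fingerprint(input_data: bytes, computation: bytes, parameters: bytes,
                version: str, hardware_class: str) -> str:
    """SHA3-256 over the five fields named in the text, in order."""
    fields = [
        sha3(input_data),         # input_hash
        sha3(computation),        # computation_hash
        sha3(parameters),         # parameter_hash
        version.encode(),         # version
        hardware_class.encode(),  # hardware_class
    ]
    h = hashlib.sha3_256()
    for field in fields:
        # Assumption of this sketch: length-prefix each field so that
        # adjacent variable-length fields cannot run into each other.
        h.update(len(field).to_bytes(4, "big"))
        h.update(field)
    return h.hexdigest()


# Illustrative values only; none of these names come from Cachee.
base = fingerprint(b'{"tx": "TX-8829"}', b"fraud_model", b'{"threshold": 0.5}',
                   "1.4.2", "x86_64")
new_input = fingerprint(b'{"tx": "TX-8830"}', b"fraud_model", b'{"threshold": 0.5}',
                        "1.4.2", "x86_64")
new_version = fingerprint(b'{"tx": "TX-8829"}', b"fraud_model", b'{"threshold": 0.5}',
                          "1.4.3", "x86_64")
new_hardware = fingerprint(b'{"tx": "TX-8829"}', b"fraud_model", b'{"threshold": 0.5}',
                           "1.4.2", "aarch64")
```

Recomputing with the same five fields reproduces base exactly. Changing any one of them produces a different digest, which is how a verifier holding the source data can check a claimed fingerprint.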

Every State Change Is Hash-Chained

Every event in the system -- every write, every read, every verification, every state transition -- is appended to a hash-chained audit log. Each entry's hash is SHA3-256(prev_hash || timestamp || sequence || event_type || event_data). The chain provides integrity (modification changes the hash, breaking the chain) and completeness (deletion removes an entry referenced by the next entry, breaking the chain). Every state transition -- Active to Superseded, Active to Revoked, Active to Expired -- is recorded with the authority that performed the transition, the reason, and a cryptographic proof.

The hash chain is not a log that records state changes. It is a mathematical structure that makes state changes tamper-evident. A log says "entry X was revoked at 14:23:17." A hash chain proves that if you remove or modify the revocation record, the chain breaks, and the break is detectable by anyone who walks the chain. The chain does not prevent tampering. It makes tampering visible. And visibility is what auditors need.
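The revocation example can be walked through in code. This is a sketch under stated assumptions, not Cachee's wire format: the Entry layout, the 8-byte big-endian encoding of timestamp and sequence, the length prefix on event_type, the all-zero genesis hash, and the function names are all hypothetical. Only the list of hashed fields comes from the text.

```python
import hashlib
from typing import List, NamedTuple, Optional

GENESIS = bytes(32)  # assumed all-zero "previous hash" for the first entry


class Entry(NamedTuple):
    prev_hash: bytes
    timestamp: int  # Unix seconds
    sequence: int
    event_type: str
    event_data: bytes
    entry_hash: bytes


def compute_hash(prev_hash: bytes, timestamp: int, sequence: int,
                 event_type: str, event_data: bytes) -> bytes:
    etype = event_type.encode()
    return hashlib.sha3_256(
        prev_hash
        + timestamp.to_bytes(8, "big")
        + sequence.to_bytes(8, "big")
        + len(etype).to_bytes(2, "big") + etype  # length-prefixed (assumption)
        + event_data
    ).digest()


def append(chain: List[Entry], timestamp: int, event_type: str,
           event_data: bytes) -> None:
    prev = chain[-1].entry_hash if chain else GENESIS
    seq = len(chain)
    digest = compute_hash(prev, timestamp, seq, event_type, event_data)
    chain.append(Entry(prev, timestamp, seq, event_type, event_data, digest))


def first_break(chain: List[Entry]) -> Optional[int]:
    """Walk the chain; return the position of the first bad entry, or None."""
    prev = GENESIS
    for i, e in enumerate(chain):
        recomputed = compute_hash(e.prev_hash, e.timestamp, e.sequence,
                                  e.event_type, e.event_data)
        if e.prev_hash != prev or e.entry_hash != recomputed:
            return i
        prev = e.entry_hash
    return None


chain: List[Entry] = []
append(chain, 1778509390, "write", b"key=score:TX-8829")
append(chain, 1778509397, "state_transition",
       b"Active->Revoked authority=admin reason=key-rotation")
append(chain, 1778509401, "read", b"key=score:TX-8829")

without_revocation = [chain[0], chain[2]]
edited_revocation = [chain[0],
                     chain[1]._replace(event_data=b"Active->Active"),
                     chain[2]]
```

Walking the untouched chain returns None. Removing the revocation record or editing its event_data both return position 1, the point where the walk stops agreeing with the stored hashes. The sketch covers only the hashing step; it does not address signing entries or anchoring the head of the chain.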

Every Bundle Is Self-Contained

Every cached value has a corresponding Cache Attestation Bundle (CAB): a 24KB package that contains the value hash, all three PQ signatures, the computation fingerprint, the lifecycle state, and the public keys needed for verification. The CAB is self-contained. It does not require a network connection to verify. It does not require access to the Cachee service. It does not require trusting H33. The math is in the bundle. If the signatures check out, the result is authentic.
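The 24KB figure can be sanity-checked against the published signature and public-key sizes for the three parameter sets. The arithmetic below assumes the bundle carries one signature and one public key per family. The FALCON-512 signature size used is the commonly cited 666-byte fixed-length form; compressed FALCON signatures vary in length.

```python
# Published sizes in bytes for each parameter set.
SIZES = {
    "ML-DSA-65": {"signature": 3309, "public_key": 1952},
    "FALCON-512": {"signature": 666, "public_key": 897},
    "SLH-DSA-SHA2-128f": {"signature": 17088, "public_key": 32},
}

signature_bytes = sum(s["signature"] for s in SIZES.values())
public_key_bytes = sum(s["public_key"] for s in SIZES.values())
crypto_bytes = signature_bytes + public_key_bytes

print(f"signatures:  {signature_bytes} bytes")
print(f"public keys: {public_key_bytes} bytes")
print(f"total:       {crypto_bytes} bytes")
```

Signatures and keys alone come to just under 24,000 bytes, most of it the SLH-DSA signature. That leaves little room for the value hash, the fingerprint, and the lifecycle state before the total reaches roughly 24KB, which is consistent with those being small fixed-size fields.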

This is independence in its purest form. Hand someone a CAB. Tell them to verify it. They need a computer, a PQ signature library, and nothing else. They run bundle.verify(). The function checks the ML-DSA-65 signature against the content hash. It checks the FALCON-512 signature. It checks the SLH-DSA signature. It evaluates the 2-of-3 threshold. It returns a result: verified, partially verified, or failed. The verifier does not need Cachee. They do not need H33. They do not need a subscription, an API key, or a network connection. The bundle carries its own proof.
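The threshold step of that verification can be sketched on its own. The code below takes the three per-family check results as booleans, which in practice would come from a PQ signature library and are not shown here. The mapping from pass counts to outcomes is an assumption of this sketch: all three passing is treated as verified, exactly two as partially verified, and fewer than two as failed. The names Outcome and evaluate_threshold are hypothetical.

```python
from enum import Enum
from typing import Dict


class Outcome(Enum):
    VERIFIED = "verified"
    PARTIALLY_VERIFIED = "partially verified"
    FAILED = "failed"


FAMILIES = ("ML-DSA-65", "FALCON-512", "SLH-DSA")
THRESHOLD = 2  # the 2-of-3 threshold from the text


def evaluate_threshold(results: Dict[str, bool]) -> Outcome:
    """Combine per-family signature check results into one outcome.

    A family that is missing from `results` counts as a failed check.
    """
    passed = sum(1 for family in FAMILIES if results.get(family) is True)
    if passed == len(FAMILIES):
        return Outcome.VERIFIED
    if passed >= THRESHOLD:
        return Outcome.PARTIALLY_VERIFIED
    return Outcome.FAILED


all_pass = evaluate_threshold(
    {"ML-DSA-65": True, "FALCON-512": True, "SLH-DSA": True})
one_bad = evaluate_threshold(
    {"ML-DSA-65": True, "FALCON-512": False, "SLH-DSA": True})
two_bad = evaluate_threshold(
    {"ML-DSA-65": False, "FALCON-512": False, "SLH-DSA": True})
```

Treating a missing family as a failure is a deliberate choice in the sketch: a bundle with a signature stripped out should never score higher than one with a signature that fails to verify.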

The Verification Test

Hand someone a CAB. Tell them to verify it. They do not need Cachee. They do not need H33. They do not need a network connection. The math is in the bundle. If the signatures check out, the result is authentic. Period. That is the difference between evidence and a log. A log requires you to trust the source. Evidence requires you to trust mathematics.

Why This Distinction Matters Commercially

The distinction between logs and evidence is not academic. It maps directly to cost, risk, and compliance outcomes. Every organization with a compliance obligation -- SOC 2, HIPAA, PCI DSS, FedRAMP, CMMC -- pays for logs. The question is whether those logs are actually evidence, or whether they are expensive stories.

Logs Are a Cost Center

The average enterprise spends $2.5-4 million per year on log infrastructure: SIEM licenses (Splunk, Elastic, CrowdStrike), storage (S3, Glacier), analyst time to review and interpret logs, and incident response when logs are found to be incomplete or compromised. This is the cost of maintaining stories. You pay to write the stories, ship the stories, store the stories, index the stories, search the stories, and pay analysts to read the stories and decide if they believe them.

And when a breach occurs, the first question the forensic team asks is: "can we trust the logs?" If the attacker had root access -- and if they breached your system, they probably did -- the answer is "maybe." The stories might be real. They might be edited. The cost of the log infrastructure does not help you answer that question. The logs have no integrity proof. They have no completeness guarantee. They have no independence from the compromised system. You paid millions for stories you cannot verify.

Evidence Is a Compliance Asset

Evidence proves you did what you said you did. This is not a nuance. It is the difference between a clean SOC 2 report and an exception. It is the difference between a HIPAA audit that closes in two weeks and one that drags on for six months. It is the difference between a PCI assessment that accepts your controls and one that requires remediation.

SOX auditors do not want logs. They want proof that financial computations were performed correctly, that the results were not modified, and that the audit trail is intact. They want to see that the value reported in the financial statement was produced by a specific computation, with specific inputs, at a specific time, and that nothing has changed since. Cachee's CABs are exactly this proof. The computation fingerprint binds the result to the computation. The signatures prove the result was not modified. The hash chain proves the audit trail is intact. The auditor can verify all of this independently, without trusting the system being audited.

HIPAA does not require "logging." It requires "audit controls" -- 45 CFR 164.312(b). The regulation says: "Implement hardware, software, and/or procedural mechanisms that record and examine activity in information systems that contain or use electronic protected health information." The key word is "examine." Not "store." Examine. An audit control that cannot be examined -- because the logs might be modified, because the logs might be incomplete, because the logs cannot be independently verified -- is not meeting the regulation's intent. It is meeting the regulation's letter while violating its spirit. Cachee's evidence artifacts meet both: they record activity (the audit chain) and they can be examined (self-contained CAB bundles with independent verification).

The Insurance Angle

Cyber insurance underwriters are beginning to differentiate between organizations that produce logs and organizations that produce evidence. The distinction matters actuarially. An organization with verifiable evidence of its security controls has a lower risk profile than an organization with logs that might or might not be authentic. The evidence does not prevent breaches. But it does prove the state of controls at the time of the breach, which affects liability, which affects payouts, which affects premiums.

The market signal is early but clear. Underwriters who ask "do you have logging?" are being replaced by underwriters who ask "can you prove the integrity of your audit trail?" The organizations that can answer yes will pay lower premiums. The organizations that cannot will pay more -- or find coverage harder to obtain.

Dimension	Logs	Evidence
Cost classification	Cost center (storage + SIEM + analysts)	Compliance asset (proves controls)
Integrity	None (mutable text files)	Hash-chained + signed
Completeness	No guarantee (gaps common)	Sequence numbers + chain references
Independence	Requires trusting the source	Self-contained verification
Breach value	May be compromised with the system	Tamper-evident regardless of system state
Audit outcome	"We believe controls were in place"	"We can prove controls were in place"
Retention value	Depreciates (stale stories)	Appreciates (stronger proof over time)

What Changes When You Have Evidence

The shift from logs to evidence changes three operational realities that every engineering and compliance team deals with.

Incident response becomes forensically sound. When a breach occurs and you have hash-chained, signed evidence, the forensic team does not start by asking "can we trust the logs?" They start by verifying the chain. If the chain is intact, the evidence is authentic -- regardless of whether the attacker had root access. If the chain is broken, the break point tells them exactly where the attacker intervened. Either way, the forensic team has a foundation of mathematical certainty that log-based investigations never achieve.

Compliance becomes demonstrable. When an auditor asks "prove that this computation was performed correctly and the result was not modified," you hand them a CAB. They verify it independently. The audit finding is "controls are operating effectively" -- not "controls appear to be operating effectively based on logs we cannot independently verify." The word "appear" is the difference between a clean report and a qualified one. Evidence eliminates "appear."

Dispute resolution becomes deterministic. When two parties disagree about what a system did -- "we sent the correct data" versus "the data we received was corrupted" -- evidence resolves the dispute. The CAB for the sent data has a computation fingerprint and three PQ signatures. The receiving party can verify the bundle and determine, with mathematical certainty, whether the data they received matches what was sent. No he-said-she-said. No log comparison. No ambiguity. The math settles it.

The Line Between Stories and Proof

The infrastructure industry has spent two decades optimizing the wrong thing. We have built extraordinary systems for collecting, shipping, storing, indexing, and querying logs. We have SIEM platforms that ingest terabytes per day. We have log aggregation pipelines that correlate events across thousands of services. We have dashboards that visualize log data in real time. And all of it rests on a foundation of mutable, deletable, unverifiable text files written by the systems being audited.

Cachee does not produce better logs. It does not produce more logs. It does not produce faster logs. It produces evidence. Every value is signed by three post-quantum families. Every computation is fingerprinted. Every state change is hash-chained. Every bundle is self-contained and independently verifiable. The audit chain is tamper-evident. The verification artifacts are portable. The math does not require trust.

If you can't prove what your system did, you don't have an audit trail. You have a story. We stopped telling stories. We started producing evidence.

The Difference in One Sentence

A log says "this happened." Evidence proves it. Cachee produces cryptographic evidence -- signed values, fingerprinted computations, hash-chained state transitions, and self-contained verification bundles -- as a byproduct of normal cache operations. Not as an add-on. Not as a separate system. As the architecture itself. If the signatures verify, the data is authentic. If the chain is intact, the history is complete. If the bundle is present, the proof is portable. That is what evidence looks like.

Stop producing logs. Start producing evidence. Every cached value signed, fingerprinted, and hash-chained.
