Structured Logs, JSON Lines, and Retention: From grep to Centralized Search

Apr 5, 2026 · Written by: Netspare Team

Operations & support

Structured logs (JSON fields like `level`, `request_id`, `user_id`, `duration_ms`) let you query and alert in centralized systems instead of regex-guessing prose lines.
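As a minimal sketch of what "structured" means in practice, here is a JSON Lines formatter built on Python's stdlib `logging`; the field names (`request_id`, `user_id`, `duration_ms`) follow the examples above, and the logger name `app` is a placeholder:

```python
import json
import logging
import time

class JsonLineFormatter(logging.Formatter):
    """Render each log record as one JSON object per line (JSON Lines)."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "ts": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime(record.created)),
            "level": record.levelname,
            "message": record.getMessage(),
        }
        # Merge structured fields passed via `extra=...` on the log call.
        for key in ("request_id", "user_id", "duration_ms"):
            if hasattr(record, key):
                entry[key] = getattr(record, key)
        return json.dumps(entry)

handler = logging.StreamHandler()
handler.setFormatter(JsonLineFormatter())
log = logging.getLogger("app")
log.addHandler(handler)
log.setLevel(logging.INFO)

log.info("checkout completed", extra={"request_id": "req-123", "duration_ms": 42})
```

Because every line is a self-describing JSON object, a centralized system can index `level` or `duration_ms` directly instead of re-parsing free text with regexes.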

Retention and cardinality drive cost: high-cardinality labels (per-user IDs as metric tags) explode storage—know what belongs in logs vs traces vs metrics.

Standard fields and correlation

Propagate a request ID from edge load balancer through app and outbound HTTP clients; include it in every log line for one-click trace reconstruction.
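One way to carry that ID through a service is a context variable set once per request; the header name `X-Request-ID` and the function names here are illustrative, not a prescribed API:

```python
import contextvars
import uuid

# One context variable per request; safe across threads and asyncio tasks.
request_id_var = contextvars.ContextVar("request_id", default=None)

def handle_request(headers: dict) -> dict:
    """Reuse the edge-assigned ID if present, otherwise mint one."""
    rid = headers.get("X-Request-ID") or str(uuid.uuid4())
    request_id_var.set(rid)
    return call_downstream()

def call_downstream() -> dict:
    # Outbound HTTP clients read the same context variable, so the ID
    # survives every hop without threading it through function arguments.
    return {"X-Request-ID": request_id_var.get()}
```

A log formatter can read `request_id_var.get()` the same way, so every line emitted while a request is in flight carries its ID automatically.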

Log levels should mean something operationally: ERROR requires human action or paging; INFO is normal business events; DEBUG stays off in production unless sampled.

PII, secrets, and redaction

Never log raw passwords, session cookies, or full payment PANs—use tokenized references.
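A simple guard is to redact sensitive keys before an entry is serialized; the key set below is a hypothetical example and should match your own field schema:

```python
# Hypothetical sensitive field names; adapt to your log schema.
SENSITIVE_KEYS = {"password", "session_cookie", "pan", "authorization"}

def redact(entry: dict) -> dict:
    """Replace sensitive values in a structured log entry before emission."""
    return {
        key: ("[REDACTED]" if key in SENSITIVE_KEYS else value)
        for key, value in entry.items()
    }
```

Running redaction inside the application, before lines reach the collector, means secrets never land on disk or in the pipeline at all.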

GDPR/CCPA deletion requests must reach log pipelines if you store identifiable fields; retention policies are legal requirements, not only disk limits.

Retention tiers

  • Hot storage (7–30 days) for incident response; warm/cold for compliance archives.
  • Sampling for DEBUG in prod at high QPS—100% DEBUG can double infrastructure cost for little insight.
  • Test restore of log archives if auditors expect proof of integrity.
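The DEBUG sampling mentioned above can be sketched as a stdlib `logging.Filter` that passes all INFO-and-above records and keeps only a configurable fraction of DEBUG records; the 1% default is an arbitrary example:

```python
import logging
import random

class DebugSampler(logging.Filter):
    """Pass every record at INFO or above; sample DEBUG records."""

    def __init__(self, sample_rate: float = 0.01):
        super().__init__()
        self.sample_rate = sample_rate

    def filter(self, record: logging.LogRecord) -> bool:
        if record.levelno > logging.DEBUG:
            return True  # never drop INFO/WARNING/ERROR
        return random.random() < self.sample_rate
```

Attach it with `logger.addFilter(DebugSampler(0.01))` to keep a representative trickle of DEBUG in production without paying for all of it.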

Agents vs stdout in containers

Twelve-factor style stdout/stderr collection simplifies container rotation; sidecar agents add features (parsing, batching) but also failure modes.

Clock skew across nodes produces misordered or plainly wrong timestamps, which breaks cross-node correlation—enforce NTP/chrony discipline on every host.

Frequently asked questions

Are JSON logs bigger than plain text?
Slightly, because field names repeat on every line—but compression and columnar storage often win back the difference, and the queryability pays for itself quickly at scale.
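A quick way to see why repeated keys are cheap: gzip the same hypothetical event rendered as prose and as JSON Lines. The sample lines below are made up for illustration:

```python
import gzip
import json

# The same event, 10,000 times, as prose vs JSON Lines (hypothetical samples).
plain = b"2026-04-05 INFO checkout completed req-123 42ms\n" * 10_000
jsonl = (json.dumps({
    "ts": "2026-04-05", "level": "INFO", "message": "checkout completed",
    "request_id": "req-123", "duration_ms": 42,
}).encode() + b"\n") * 10_000

# Repeated field names compress extremely well, so the raw-size
# penalty of JSON largely disappears on the wire and on disk.
jsonl_ratio = len(gzip.compress(jsonl)) / len(jsonl)
```

The compressed JSON Lines stream ends up a small fraction of its raw size—far closer to the plain-text cost than the raw byte counts suggest.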
Who owns log pipelines?
Platform/SRE typically operates collectors; app teams own field schemas and documentation.