
Why TimescaleDB

Time-series data without the complexity

Logs are time-series. Metrics are time-series. Traces are time-series. Errors are time-series.

Almost everything in observability is "what happened, when."

You could use a specialized time-series database. InfluxDB. Prometheus. ClickHouse. Each with its own query language, ops requirements, and learning curve.

Or you could use PostgreSQL with TimescaleDB.

The problem with specialized databases

Every new database is overhead:

  • New query language to learn
  • New backup procedures
  • New monitoring to set up
  • New failure modes to understand
  • New scaling patterns to master

For a solo founder building 16 products, that overhead compounds fast.

TimescaleDB is just PostgreSQL

-- It's SQL
SELECT time_bucket('1 hour', created_at) AS hour,
       count(*) AS log_count
FROM log_entries
WHERE created_at > now() - interval '24 hours'
GROUP BY hour
ORDER BY hour;

Same SQL you already know. Same psql you already use. Same pg_dump for backups. Same everything.

TimescaleDB adds time-series superpowers without changing the fundamentals.

Hypertables

The core concept:

CREATE TABLE log_entries (
  id uuid NOT NULL,
  created_at timestamptz NOT NULL,
  level varchar(10),
  message text,
  data jsonb,
  -- the partitioning column must be part of any unique constraint
  PRIMARY KEY (id, created_at)
);

SELECT create_hypertable('log_entries', 'created_at');

That's it. Your table is now a hypertable.

Behind the scenes, TimescaleDB partitions by time automatically. Recent data in one chunk. Older data in others. Queries only scan relevant chunks.

A table with 10 billion rows queries like a table with 10 million—because you're only touching recent data.
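You can inspect those chunks yourself. Assuming the log_entries hypertable from above, TimescaleDB's built-in functions show what's behind the curtain:

```sql
-- list every chunk backing the hypertable
SELECT show_chunks('log_entries');

-- only the chunks older than a week (these are what
-- compression and retention policies operate on)
SELECT show_chunks('log_entries', older_than => interval '7 days');
```

Each chunk is a regular PostgreSQL table under the hood, which is why the standard tooling keeps working.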

Continuous aggregates

Pre-compute rollups:

CREATE MATERIALIZED VIEW hourly_log_counts
WITH (timescaledb.continuous) AS
SELECT time_bucket('1 hour', created_at) AS hour,
       level,
       count(*) AS count
FROM log_entries
GROUP BY hour, level;

With a refresh policy in place, the view stays up to date as new data arrives. Queries against hourly_log_counts return instantly: the work is already done.

Dashboard showing logs per hour? Reads from the aggregate. No scanning millions of rows.
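One caveat worth knowing: the materialization is driven by a refresh policy you set up once. A sketch, with offsets you'd tune to your ingest pattern:

```sql
-- refresh the aggregate every hour, materializing data
-- between 3 hours and 1 hour old
SELECT add_continuous_aggregate_policy('hourly_log_counts',
  start_offset     => interval '3 hours',
  end_offset       => interval '1 hour',
  schedule_interval => interval '1 hour');
```

Queries still see fresh data in the meantime: by default, TimescaleDB combines the materialized rows with not-yet-materialized raw rows at query time (real-time aggregation).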

Compression

Old data compresses dramatically:

ALTER TABLE log_entries SET (
  timescaledb.compress,
  timescaledb.compress_segmentby = 'level',
  timescaledb.compress_orderby = 'created_at DESC'
);

SELECT add_compression_policy('log_entries', interval '7 days');

Data older than 7 days compresses automatically. 10x compression is typical. 20x isn't unusual.

Keep years of logs without breaking the budget.
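You can check what compression actually bought you. Assuming the log_entries hypertable above:

```sql
-- before/after sizes across compressed chunks
SELECT pg_size_pretty(before_compression_total_bytes) AS before,
       pg_size_pretty(after_compression_total_bytes)  AS after
FROM hypertable_compression_stats('log_entries');
```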

Retention policies

Delete old data automatically:

SELECT add_retention_policy('log_entries', interval '90 days');

90-day retention. Automatic. No cron jobs.

Different tables can have different policies. Logs: 30 days. Metrics: 1 year. Errors: 90 days.
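The per-table policies are just repeated calls. A sketch, with hypothetical table names matching the split above:

```sql
-- one policy per hypertable (table names are illustrative)
SELECT add_retention_policy('log_entries',   interval '30 days');
SELECT add_retention_policy('metric_points', interval '1 year');
SELECT add_retention_policy('error_events',  interval '90 days');
```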

How we use it

Recall (logs):
- Hypertable on created_at
- 7-day compression
- 30-day default retention
- Hourly aggregates for dashboards

Pulse (traces):
- Hypertable on started_at
- Continuous aggregates for percentiles
- 14-day default retention

Flux (metrics):
- Hypertable on recorded_at
- Pre-computed aggregates for all metric types
- Configurable retention per metric

Signal (alerts):
- Hypertable for incident timeline
- Long retention (compliance)
- Aggregates for trend analysis

Performance

Real numbers from our stack:

  • 100M log entries — Dashboard loads in 200ms
  • 1B metric points — Query last hour in 50ms
  • Compression ratio — 12x average
  • Storage cost — 90% less than uncompressed

This is with default PostgreSQL. No special hardware. No distributed clusters.

The Rails integration

ActiveRecord works normally:

class LogEntry < ApplicationRecord
  # Just a normal model
end

# Queries work as expected
LogEntry.where(created_at: 1.hour.ago..).count

TimescaleDB is invisible at the application layer. All the optimization happens in the database.

Why not ClickHouse?

ClickHouse is faster for analytics. Significantly faster.

But:

  • Separate database to operate
  • Different query patterns
  • No transactions
  • Limited UPDATE/DELETE

For a product where time-series is part of a larger system—with users, organizations, relationships—PostgreSQL + TimescaleDB keeps everything unified.

One database. One set of tools. One operational model.

Try it

# docker-compose.yml
services:
  postgres:
    image: timescale/timescaledb:latest-pg16
    environment:
      POSTGRES_PASSWORD: postgres
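Then, inside psql (the Timescale image ships with the extension preloaded; on a plain PostgreSQL install with the extension package you'd enable it yourself):

```sql
CREATE EXTENSION IF NOT EXISTS timescaledb;
```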

Same PostgreSQL. Superpowered for time-series.

— Andres
