Signal is Live
Alerting that doesn't cry wolf
My phone used to vibrate constantly.
Error rate above 1%? Buzz. Response time above 500ms? Buzz. CPU above 80%? Buzz. Disk usage above 70%? Buzz.
Eventually, I started ignoring them all.
That's alert fatigue. And it's dangerous. Because when everything is urgent, nothing is urgent.
The problem with traditional alerting
Every monitoring tool sends alerts. They're all dumb about it.
Static thresholds. Simple comparisons. No context.
"Error rate is 2%."
Cool. Is that bad? Was it 0.1% before? Did something just deploy? Are these real errors or test traffic? Who knows. The alert doesn't tell you.
So you investigate. Usually it's nothing. Eventually you stop investigating.
And then one day it's something real, and you miss it.
Signal thinks first
Signal doesn't just threshold and notify. It actually analyzes.
Correlation. Error spike and slow response times at the same moment? That's one incident, not two alerts. Signal correlates events across Recall, Reflex, and Pulse. Multiple symptoms, one notification.
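Here's roughly how that grouping could work. A minimal Python sketch, assuming events arrive with a source and a timestamp (the Event shape and the 120-second window are my illustration, not Signal's actual internals):

from dataclasses import dataclass, field

@dataclass
class Event:
    source: str       # "recall", "reflex", or "pulse"
    kind: str         # e.g. "error_spike", "slow_responses"
    timestamp: float  # unix seconds

@dataclass
class Incident:
    events: list = field(default_factory=list)

def correlate(events, window=120):
    # Events within `window` seconds of each other belong to one incident.
    incidents = []
    for event in sorted(events, key=lambda e: e.timestamp):
        if incidents and event.timestamp - incidents[-1].events[-1].timestamp <= window:
            incidents[-1].events.append(event)  # same incident, one notification
        else:
            incidents.append(Incident(events=[event]))
    return incidents

An error spike from Reflex and a latency event from Pulse landing in the same window come back as one incident, not two pages.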
Context. Instead of "Error rate is 2%", Signal tells you:
"Error rate jumped from 0.1% to 2% in the last 10 minutes. The errors are NoMethodError in OrdersController. They started after deploy #847. 234 users affected so far."
That's actionable. That tells you what to do.
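Mechanically, a message like that is just a join across data the stack already has: the baseline, the error class, the deploy, the affected users. A sketch (the field names are mine):

def alert_message(metric, before, now, window_min, error_class, location,
                  deploy_id, users_affected):
    # Baseline, culprit, trigger, and blast radius in one message.
    return (
        f"{metric} jumped from {before}% to {now}% in the last {window_min} minutes. "
        f"The errors are {error_class} in {location}. "
        f"They started after deploy #{deploy_id}. "
        f"{users_affected} users affected so far."
    )

print(alert_message("Error rate", 0.1, 2, 10, "NoMethodError",
                    "OrdersController", 847, 234))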
Smart baselines. A 500ms response time might be fine for your app. Or it might be terrible. A static threshold can't tell the difference.
Signal learns your baselines. Alerts when things deviate from your normal, not some arbitrary number.
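One standard way to learn a baseline (a sketch, not necessarily what Signal does) is a rolling mean and standard deviation: alert when a value drifts several deviations from its own history.

import statistics
from collections import deque

class Baseline:
    # Learns "your normal" from recent samples instead of a fixed threshold.
    def __init__(self, window=288, threshold_sigmas=3.0):
        self.history = deque(maxlen=window)  # 288 samples = 24h at 5-minute intervals
        self.threshold = threshold_sigmas

    def observe(self, value):
        anomalous = False
        if len(self.history) >= 30:  # don't alert until the baseline is trustworthy
            mean = statistics.fmean(self.history)
            stdev = statistics.pstdev(self.history) or 1e-9
            anomalous = abs(value - mean) / stdev > self.threshold
        self.history.append(value)
        return anomalous

The same 500ms gets flagged for the app whose normal is 80ms and ignored for the app whose normal is 450ms.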
The AI angle
Ask Claude: "What should I be worried about right now?"
Claude queries Signal. Reviews recent alerts. Checks current metrics. Gives you a situation report:
"One active incident: checkout conversion dropped 15% in the last hour. Likely cause: Stripe API timeouts (P95 response time 4.2 seconds, normally 200ms). Recommended action: check Stripe status page, consider failing over to secondary processor."
Not just alerts. Situation awareness.
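Plumbing-wise, all that integration needs is Signal exposing incidents as structured data the model can read. A hypothetical sketch; the endpoint and fields below are assumptions, not a documented API:

import json
import urllib.request

def situation_report(base_url="http://signal.localhost"):
    # Hypothetical endpoint: fetch active incidents for the assistant to summarize.
    with urllib.request.urlopen(f"{base_url}/api/incidents?status=active") as resp:
        incidents = json.load(resp)
    return [
        {
            "summary": i["summary"],
            "likely_cause": i.get("likely_cause"),
            "recommended_action": i.get("recommended_action"),
        }
        for i in incidents
    ]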
Escalation that makes sense
Signal doesn't just blast your phone.
- First alert: Slack notification
- No response in 15 minutes: Email
- No response in 30 minutes: PagerDuty
- No response in 1 hour: Call the backup
Configurable paths. The right urgency for the right situation.
Night-time alerts for critical issues only. Business hours for everything else. You define what matters when.
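Evaluating a path like the one above is simple state. A sketch using the timings from that list (the structure is illustrative):

import time

ESCALATION_PATH = [
    (0,    "slack"),        # first alert: Slack notification
    (900,  "email"),        # no response in 15 minutes
    (1800, "pagerduty"),    # no response in 30 minutes
    (3600, "call_backup"),  # no response in 1 hour
]

def current_channel(alert_started_at, acknowledged):
    # The most-escalated channel due right now, or None once acknowledged.
    if acknowledged:
        return None
    elapsed = time.time() - alert_started_at
    due = [channel for delay, channel in ESCALATION_PATH if elapsed >= delay]
    return due[-1] if due else None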
How it integrates
Signal pulls from all Brainz Lab products:
alerts:
  - name: "High Error Rate"
    source: reflex
    condition: "error_rate > 1% for 5 minutes"
    severity: critical
  - name: "Slow Responses"
    source: pulse
    condition: "p95_response_time > 2s for 10 minutes"
    severity: warning
  - name: "Unusual Log Volume"
    source: recall
    condition: "log_volume > 3x baseline"
    severity: info
One place to manage all rules. One place to see all incidents.
The on-call experience
Signal isn't just about sending alerts. It's about making on-call bearable.
Incident timeline. Everything that happened, in order.
Runbooks. Linked documentation for common issues.
Quick actions. One-click acknowledge, resolve, snooze.
Handoff notes. Context for the next person on-call.
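The quick actions are small state transitions under the hood. A hypothetical acknowledge call (the endpoint is an assumption):

import urllib.request

def acknowledge(incident_id, base_url="http://signal.localhost"):
    # Hypothetical API: mark the incident as being handled, stopping escalation.
    req = urllib.request.Request(
        f"{base_url}/api/incidents/{incident_id}/acknowledge", method="POST"
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status == 200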
On-call doesn't have to mean anxiety. It can just be... manageable.
Try it
docker-compose up -d signal
# Open http://signal.localhost
Configure your first alert. Watch it work.
The stack is complete
Recall + Reflex + Pulse + Signal = Complete observability.
Four products that work together. One AI interface to query them all.
Ask Claude anything about your running application. Get answers. Fix problems. Sleep better at night.
This is what I've been building toward.
— Andres