
DEV Community

# reliability

General discussions on building and maintaining reliable software systems.

Posts

How to Build AI Agents That Fail Safely: Circuit Breakers, Health Checks, and Graceful Degradation

2 min read
I Tracked Why AI Agent Projects Fail. 80% of the Time, It's Not the Agents.

8 min read
Chaos Engineering for Teams That Aren't Netflix

3 min read
Why Your Database Is Lying to You (And How to Catch It)

5 min read
Intermittent outages: causes, detection and solutions

3 min read
FaultRay: Why We Formalized Cascade Failure Propagation as a Labeled Transition System

7 min read
Recurring VPS Hosting Issues: How Switching Providers and Negotiating Contracts Restores Trust and Reliability

8 min read
Exponential Backoff & Idempotency: The Unsung Heroes of Reliable Systems

2 min read
The <final> Tag That Ate Your Response

2 min read
Why deployments break production systems

4 min read
Addressing Overconfidence in REST API Reliability: Implementing Resilience Patterns Like Polly

8 min read
Two Channels, One Brain, Zero Isolation

2 min read
The 429 That Poisoned Every Fallback

2 min read
How to Monitor Background Jobs in Production (and Stop Losing Data)

7 min read
The Release That Broke Everything

4 min read