Join us as Aiven, RisingWave, and StreamNative team up for an exclusive event diving into the real-time infrastructure behind the next generation of AI! Here is what we’ll be covering:
🎤 Panel: The Future of Kafka + AI – Featuring Sijie Guo (StreamNative), Yingjun Wu (RisingWave), and Filip Yonov (Aiven).
✅ Streaming as the Backbone of Autonomous AI Agents – Sijie Guo breaks down event-driven agent architectures.
✅ Designing Real-Time Systems for the Age of Agents – Yingjun Wu explores architecture patterns and real-time serving.
✅ Live Demo: Your Kafka Topics Already Know What They Are – Filip Yonov shows how Claude can diagnose a production anomaly from raw Kafka topics in under 30 seconds.
✅ OpenSearch + AI: Smarter Search for the Agentic Era – Dmitry Kan shares practical patterns for RAG and real-time indexing.
Come talk shop, see live demos, and learn how to build the "nervous system" for autonomous AI.
📅 Save your spot here: https://luma.com/ub9sq0u5
StreamNative
Software Development
Sunnyvale, California 5,810 followers
StreamNative offers fully-managed cloud-native event streaming and messaging powered by Apache Pulsar.
About us
StreamNative is the Streaming Intelligence Platform for the AI era. Founded by the original creators of Apache Pulsar, we’ve evolved into a "Lakestream" company—unifying multi-protocol streaming, real-time processing, and event-driven AI on a single cloud-native foundation.
- Native Kafka & Pulsar: Run both protocols natively without compromise or rewrites.
- 95% Lower TCO: Powered by the award-winning Ursa Engine (VLDB 2025 Best Industry Paper), we eliminate cross-AZ costs and expensive SSDs.
- Lakehouse-Native: Stream directly into Apache Iceberg and Delta Lake—no ETL, no connectors, no duplicate storage.
- Agent-Ready: The backbone for event-driven LLMs and autonomous AI actions.
From mission-critical messaging to massive-scale data pipelines, we’re dissolving the boundaries between streaming and the lakehouse.
- Website: https://streamnative.io
- Industry: Software Development
- Company size: 51-200 employees
- Headquarters: Sunnyvale, California
- Type: Privately Held
- Founded: 2019
- Specialties: pub/sub, messaging, event streaming, streaming, Apache Pulsar, Apache Flink, and stream processing
Locations
- Primary: 440 N Wolfe Rd, Sunnyvale, California 94085, US
Updates
Autonomous AI agents aren't just reading static data anymore—they need real-time context to act. 🤖⚡ Join our upcoming webinar to explore how streaming data infrastructure powers the next generation of autonomous AI systems.
In this session, we’ll cover:
🔹 Why real-time data is critical for agentic AI
🔹 How streaming platforms enable responsive, context-aware agents
🔹 Architectural patterns for building AI-ready data pipelines
🔹 A look at the infrastructure behind modern autonomous systems
Whether you're building AI applications, data platforms, or event-driven systems, this webinar will give you a practical foundation for what’s next.
🔗 Save your spot: https://hubs.ly/Q04d2qRZ0
🚀 Turning CSVs into real-time data streams—without the heavy lifting. Our partner CocoIndex walks through how to stream CSV data directly into Kafka in real time, unlocking faster, more responsive data pipelines. Powered by StreamNative’s native Kafka service, this approach makes it easy to move from batch to streaming—without the operational overhead. Great example of what modern data ingestion should look like 👇 https://hubs.ly/Q04cTqVn0 #ApacheKafka #RealTimeData #DataStreaming
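The core of the CSV-to-Kafka pattern is small enough to sketch. The snippet below is a generic illustration, not CocoIndex's actual implementation: the `csv-events` topic name and broker address are placeholders, and the producer shown in the usage comment is the open-source kafka-python client.

```python
import csv, io, json

def rows_to_events(csv_text: str) -> list:
    """Parse CSV text into JSON-serializable event dicts, one per row."""
    return [dict(row) for row in csv.DictReader(io.StringIO(csv_text))]

def publish(events, producer, topic: str = "csv-events") -> None:
    """Send each event to the given Kafka topic as UTF-8 JSON."""
    for event in events:
        producer.send(topic, json.dumps(event).encode("utf-8"))
    producer.flush()  # block until all buffered records are delivered

# Usage against a real broker (requires kafka-python):
#   from kafka import KafkaProducer
#   producer = KafkaProducer(bootstrap_servers="localhost:9092")
#   with open("data.csv") as f:
#       publish(rows_to_events(f.read()), producer)
```

Keeping the parsing step pure makes it testable without a broker; the producer is only touched at the edge.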
Streaming and the lakehouse have lived in two separate worlds for too long — two copies of the data, two governance models, two operational stacks. Lakestream was our answer to collapsing that divide. This partnership with Starburst is a major proof point. Our Native Kafka service (powered by #Ursa) handles real-time streaming. Starburst's Managed Ingestion and Trino-based engine turn those tables into governed, queryable assets for analytics and AI — the moment the data lands. No custom pipelines. No stream-vs-batch tax. Just Kafka topics becoming AI-ready data. Great work from Kundan Vyas and Ahmed Niyaz pulling this together. Read the full breakdown below. #Kafka #StreamingData #DataEngineering #ApacheIceberg #DataLakehouse #EnterpriseAI #Starburst
We’re excited to announce a new partnership between Starburst and StreamNative to simplify how streaming data becomes ready for AI and analytics. In this post, Ahmed Niyaz and Kundan Vyas walk through how StreamNative Kafka and Starburst Managed Ingestion come together to deliver a production-grade path from real-time streams to Apache Iceberg and from there to immediate analytics and AI. With this approach, teams can move from Kafka topics to governed, queryable data without building and maintaining custom ingestion workflows. If you’re working with Kafka or building real-time data architectures, this is worth a look. Read the full post: https://okt.to/RWekKG #Kafka #StreamingData #DataEngineering #ApacheIceberg #DataLakehouse #EnterpriseAI #Starburst
Solving the Kafka Storage Bottleneck 🛠️ Scaling Apache Kafka is powerful, but managing its storage layer and leader bottlenecks can quickly become an operational headache. It’s time to look under the hood. Today, we’re kicking off a 3-part technical deep-dive series into Ursa for Kafka! In Part 1, we break down the fundamental limitations of traditional Kafka architectures—what we call "The Kafka Problem"—and introduce how Ursa fundamentally changes the game with a groundbreaking leaderless storage architecture. If you are a data engineer or architect managing streaming data at scale, this is a must-read. 📖 Dive into Part 1 of 3 here: https://hubs.ly/Q04ctMNW0 Stay tuned for the rest of the series! #ApacheKafka #DataEngineering #DataStreaming #StreamNative #Ursa #DataArchitecture
Heading to Current London 2026? Kick off the week with StreamNative, alongside our friends at Factor House and NetApp Instaclustr, at an exclusive Happy Hour in Canary Wharf.
Before the sessions and keynotes begin, join us for an evening of:
✨ Great conversations with the streaming & observability community
✨ A relaxed chance to connect with peers across the data streaming ecosystem
✨ Drinks, food, and a fantastic London backdrop
No agenda—just good people and good vibes before a busy week. Link to register in the comments. See you there! 🇬🇧
Bridging the Gap: From Real-Time Streams to Analytics-Ready Iceberg Tables 🚀
For data teams, the path from raw Kafka streams to a queryable lakehouse is often paved with fragile connectors and manual schema management. Not anymore! We’re excited to highlight our collaboration with Starburst!
By combining StreamNative’s cloud-native Kafka service with Starburst Managed Ingestion, we’ve created a friction-free workflow to power the modern lakehouse:
✅ Automated Ingestion: Move data from Kafka topics directly into Apache Iceberg.
✅ Schema Integrity: Native integration with the StreamNative Schema Registry ensures data stays governed as it evolves.
✅ Zero Maintenance: Automated table compaction and maintenance so your SQL queries stay fast.
Ready to build an open, AI-ready foundation? Read the full technical breakdown on the Starburst blog:
🔗 https://hubs.ly/Q04chpW90
#DataStreaming #ApacheIceberg #DataLakehouse #Kafka #UFK #StarburstData #StreamNative
🤝 Meet the StreamNative team tomorrow at #DEOF 2026!
We are incredibly proud to be sponsoring this year's event, organized by Data Engineer Things, and can't wait to connect with the brightest minds in the data community. If you are attending, make sure to stop by the #StreamNative booth! Come meet our team of experts and chat about the future of data streaming and the lakehouse.
Ask us about:
⚡ Ursa for Kafka: Our newly launched Native Apache Kafka 4.2+ Service.
📉 Massive Cost Savings: How teams are reducing streaming costs by up to 95%.
✅ Zero Code Changes: How to instantly turn every Kafka topic into a queryable lakehouse table (Iceberg) with no application rewrites.
Whether you want to dive deep into Lakestream architecture, talk about real-time analytics, or just pick up some great swag, we’d love to see you.
📍 Event Details: https://hubs.ly/Q04c7ZFw0
#DataEngineering #DataEngineeringOpenForum #ApacheKafka #Lakestream #Lakehouse #RealTimeAnalytics
Since launching Ursa and Ursa for Kafka, one question kept coming up: “What are you open-sourcing—and when?”
We’ve shared our answer, and it starts with a different approach. Instead of open-sourcing code first, we started with something more fundamental:
👉 the formally verified specification behind Ursa’s storage engine: the Leaderless Log Protocol.
Why? Because in the age of AI, code is cheap; design is not.
In this deep dive, we cover:
🔹 Why we’re taking a specs-first approach to open source
🔹 How formal verification uncovered a real design bug (after years in production)
🔹 What “harness engineering” means for building reliable distributed systems with AI
🔹 A full working example (S3-Queue) generated directly from the spec
🔹 What this means for the future of building distributed systems
This isn’t just about open source. It’s about rethinking how distributed systems are designed, verified, and built in the AI era.
📖 Read the full article below. 👇
💻 Check out the GitHub: https://lnkd.in/gQGyvtHJ
We’d love feedback from the community: review it, challenge it, and build on it!
#OpenSource #ApacheKafka #AIEngineering #Lakestream #EventStreaming
StreamNative reposted this
Oxia is a modern ZooKeeper replacement for Kafka-like systems that scales 50x+ beyond it. 🔥 It was released in 2023 by StreamNative and is currently incubating in the CNCF.
Why replace ZK? Because writes do NOT scale.
• ❌ It cannot scale horizontally. Any write has to reach a quorum - the more nodes you add, the slower it becomes.
• ❌ Vertical scaling also has its limits due to the snapshotting process.
Here's how Oxia scales: 💡
𝐒𝐡𝐚𝐫𝐝𝐢𝐧𝐠
Systems like ZK/etcd do not partition their data - they store everything in each node. This is intentional because it makes global linearizability (i.e., multi-key transactions) easier. But it also limits their scalability - etcd, for example, officially recommends a max state of 2-4GB for optimal performance.
The key insight is that Kafka does not need this type of guarantee. Oxia was designed from day one to focus on 𝐩𝐞𝐫-𝐤𝐞𝐲 linearizability rather than global linearizability. This is the secret that lets it scale so much 💡
Conceptually, this is analogous to how Kafka can scale to many GB/s whereas a single database table cannot. Kafka has many WALs; the DB table has one. 🔥
𝐒𝐜𝐚𝐥𝐞
It's said the first production cut of Oxia let Pulsar scale to 10 million topics (10x the previous limit). It's also said a single Oxia shard leader can deliver ~100k ops/s with an 80/20 read/write mix. Oxia scales linearly by adding more storage nodes and shards, so it's not unthinkable to imagine it scaling to ~1M ops/s across 100s of GBs of state.
🤔 𝐇𝐨𝐰 𝐢𝐭 𝐰𝐨𝐫𝐤𝐬
It resembles Kafka a bit. The system consists of a single Coordinator and many Data Nodes that host Shards (controllers, brokers, and partitions in the Kafka analogy).
A Shard has a single leader and multiple followers, depending on its replication factor. A single Data Node can store many shards - it can act as a leader for some and as a follower for others. For a write to be acknowledged, a quorum is required - i.e., a majority of a shard's nodes have to acknowledge the write.
👌 Shards store both a log (WAL) and an LSM-backed key-value store (KV) called Pebble (from Cockroach Labs). The WAL in the leader receives the new write first, and once it's confirmed to be durably replicated, it updates the KV.
The Coordinator is a stateless control plane. It relies on Kubernetes as its external consensus protocol, maintaining the single source of truth for overall cluster membership/topology in a k8s ConfigMap.
☁️ Yes, a simple Kubernetes ConfigMap is used as a linearizable metadata store. It's really etcd under the hood. I find it really elegant in its simplicity. The key reason this works is that the ConfigMap is updated only on rare events - failures, rebalancing, and the like.
The Coordinator continuously health-checks the Data Nodes and handles shard assignments, shard rebalances, etc., by both persisting the state in the ConfigMap and taking action to reconcile the actual state with the desired one!
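The per-key-linearizability idea above boils down to deterministic key-to-shard routing: every key maps to exactly one shard, so all operations on that key are serialized by that shard's single leader, with no cross-shard coordination. A minimal illustrative sketch (not Oxia's actual code; the shard count and hash scheme are assumptions for the example):

```python
import hashlib

NUM_SHARDS = 4  # illustrative; real deployments pick this at provisioning time

def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Deterministically map a key to a shard.

    Because the mapping never changes for a given key, every read and
    write for that key hits the same shard leader, which orders them -
    per-key linearizability without any global (cross-shard) agreement.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

# All metadata operations for "topic-42" route to one shard; operations on
# other keys can proceed in parallel on other shards.
```

The trade-off is exactly the one the post describes: multi-key transactions across shards are off the table, which is acceptable for Kafka-style metadata.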