
Storage Reimagined for a Streaming World

Pravega introduces a new storage abstraction, the stream, for continuously generated and unbounded data. A Pravega stream stores unbounded parallel sequences of bytes in a durable, elastic and consistent manner, while delivering high performance and automatically tiering data to scale-out storage.

Why Pravega

Distributed messaging systems such as Kafka and Pulsar provide modern Pub/Sub infrastructure well suited to today’s data-intensive applications. Pravega extends this popular programming model with a cloud-native streaming infrastructure that supports a wider range of applications. Pravega streams are durable, consistent, and elastic, and natively support long-term data retention. Pravega addresses architecture-level problems that earlier topic-based systems such as Kafka and Pulsar have not solved, such as auto-scaling of partitions and maintaining high performance with a large number of partitions. It broadens the range of supported applications by efficiently handling both small events, as in IoT, and larger data, as in video for computer vision and video analytics. By providing abstractions beyond streams, Pravega also enables replicating application state and storing key-value pairs.

The following table compares some of the key features of Pravega to Kafka and Pulsar:

Pravega Kafka Pulsar
Transactions
Event streams
Long-term retention
Durable by default
Auto-scaling
Ingestion of large data such as video
Efficient at high partition counts
Consistent state replication
Key-value tables

Key Features

Exactly-Once Semantics

Ensure that each event is delivered and processed exactly once, with exact ordering guarantees, despite failures in clients, servers or the network.

Auto-Scaling

Unlike systems with static partitioning, Pravega can automatically scale individual data streams to accommodate changes in data ingestion rate.
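As a rough illustration of what scaling an individual stream can mean, the sketch below models a stream as a set of segments that each own a slice of the routing-key space [0, 1), and splits any segment whose observed rate exceeds a target. This is a toy in-memory model, not the Pravega client API or the actual scaling algorithm; `Segment`, `scale_up`, and the rate figures are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """Hypothetical segment owning the routing-key range [low, high)."""
    low: float
    high: float


def scale_up(segments, rates, target_rate):
    """Return a new segment list in which every segment whose observed
    rate exceeds target_rate is replaced by two halves of its key range.

    `rates` maps Segment -> observed events per second (assumed input).
    """
    result = []
    for seg in segments:
        if rates.get(seg, 0.0) > target_rate:
            mid = (seg.low + seg.high) / 2
            # The hot segment is replaced by two successors that
            # together cover exactly the same key range.
            result.append(Segment(seg.low, mid))
            result.append(Segment(mid, seg.high))
        else:
            result.append(seg)
    return result


def segment_for(segments, key_position):
    """Find the segment that owns a routing-key position in [0, 1)."""
    for seg in segments:
        if seg.low <= key_position < seg.high:
            return seg
    raise ValueError("key position outside [0, 1)")
```

In this model, a stream with two segments where only the first is overloaded ends up with three segments after `scale_up`, and the whole key space is still covered, so every routing key continues to map to exactly one segment.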

Distributed Computing Primitive

Pravega is great for distributed computing; it can be used as a data storage mechanism, for messaging between processes and for other distributed computing services such as leader election.

Write Efficiency

Pravega shrinks write latency to milliseconds and scales seamlessly to handle high-throughput reads and writes from thousands of concurrent clients, making it ideal for IoT and other time-sensitive applications.

Unlimited Retention

Ingest, process and retain data in streams forever. Use the same paradigm to access both real-time and historical events stored in Pravega.

Durability

Don't compromise between performance, durability and consistency. Pravega persists and protects data before the write operation is acknowledged to the client.

Storage Efficiency

Use Pravega to build pipelines of data processing, combining batch, real-time and other applications without duplicating data for every step of the pipeline.

Transaction Support

A developer uses a Pravega Transaction to ensure that a set of events is written to a stream atomically: either all of the events become visible to readers, or none do.
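The atomic-write behavior can be sketched with a small in-memory model. This is only an illustration of the semantics, not the Pravega client API: `ToyStream`, `ToyTransaction`, and their methods are hypothetical names, and a real deployment would also have to handle durability, timeouts, and concurrent writers.

```python
class ToyStream:
    """Hypothetical in-memory stream: an append-only list of events."""

    def __init__(self):
        self.events = []

    def begin_txn(self):
        return ToyTransaction(self)


class ToyTransaction:
    """Buffers events privately and publishes them all at once on commit."""

    def __init__(self, stream):
        self._stream = stream
        self._buffer = []
        self._open = True

    def write_event(self, event):
        if not self._open:
            raise RuntimeError("transaction is no longer open")
        # Events stay invisible to readers of the stream until commit.
        self._buffer.append(event)

    def commit(self):
        if not self._open:
            raise RuntimeError("transaction is no longer open")
        # All buffered events are appended together, in write order.
        self._stream.events.extend(self._buffer)
        self._open = False

    def abort(self):
        # Discard everything; the stream is left untouched.
        self._buffer.clear()
        self._open = False


stream = ToyStream()
txn = stream.begin_txn()
txn.write_event("order-created")
txn.write_event("payment-captured")
visible_before_commit = list(stream.events)  # still empty here
txn.commit()
```

After `commit()`, both events appear in `stream.events` together; had `abort()` been called instead, neither would appear.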

Architecture


Solutions and Use Cases

Predictive Maintenance for IoT

Harnessing wind power at commercial scale requires a large number of wind turbines distributed over a large area. Each wind turbine generates thousands of data points per second (e.g. temperature, rotation speed, wind direction, energy output).

Real-time Billing

A typical billing system will collect billable events from a variety of sources such as online purchases, server usage metrics, network usage metrics, electrical meters, and much more. Traditional billing systems process these events once a month to provide monthly bills.

Real-time Cybersecurity Threat Detection

To detect cybersecurity threats in real time, huge volumes of streaming data from servers, network infrastructure, and application logs must be analyzed as they arrive, using AI to identify possible threats. Event-driven applications must act quickly

Recent Posts

Latest News

Blogs

Pravega Byte Stream Client API 101

Introduction: Pravega is an open-source distributed storage system implementing streams as a first-class primitive for storing/serving continuous and unbounded data [1]. A Pravega stream is a durable, elastic, append-only, and

Presentations

Stream is the New File (English)
Srikanth Satya, Flink Forward Asia: Dec. 2020
Stream is the New File (Chinese)
Teng Yu, Flink Forward Asia: Dec. 2020