Extremely fast Query Engine for DataFrames, written in Rust
arrow
Scalable datastore for metrics, events, and real-time analytics
Search infrastructure for AI
Production-grade Rust-native trading engine with deterministic event-driven architecture
A high-performance observability data pipeline.
📊 Cube Core is an open-source semantic layer for AI, BI, and embedded analytics
dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
TensorZero is an open-source LLMOps platform that unifies an LLM gateway, observability, evaluation, optimization, and experimentation.
Cloud-native search engine for observability. An open-source alternative to Datadog, Elasticsearch, Loki, and Tempo.
Visualize, query, and stream to train on multimodal robotics data.
PRQL is a modern language for transforming data — a simple, powerful, pipelined SQL replacement
🐚 A powerful relational ORM for Rust
Data Agent Ready Warehouse: one for Analytics, Search, AI, Python Sandbox. Rebuilt from scratch. Unified architecture on your S3.
Apache DataFusion SQL Query Engine
Sui, a next-generation smart contract platform with high throughput, low latency, and an asset-oriented programming model powered by the Move programming language
Open Lakehouse Format for Multimodal AI. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, Pyarrow, and PyTorch with more integrations coming.
The open-source Observability 2.0 database. One engine for metrics, logs, and traces — replacing Prometheus, Loki & ES.
The live data layer for apps and AI agents. Create up-to-the-second views into your business, just using SQL
High-performance data engine for AI and multimodal workloads. Process images, audio, video, and structured data at any scale
One SQL interface over APIs, files, and live sources — built for agents.
Apache Iggy: Hyper-Efficient Message Streaming at Laser Speed
Restate is the platform for building resilient applications that tolerate all infrastructure faults without the need for a PhD.
DORA (Dataflow-Oriented Robotic Architecture) is middleware designed to streamline and simplify the creation of AI-based robotic applications. It offers low-latency, composable, and distributed dataflow capabilities. Applications are modeled as directed graphs, also referred to as pipelines.
Official Rust implementation of Apache Arrow
A native Rust library for Delta Lake, with bindings into Python
A portable accelerated SQL query, search, and LLM-inference engine, written in Rust, for data-grounded AI apps and agents.
Drop-in Apache Spark replacement written in Rust, unifying batch processing, stream processing, and compute-intensive AI workloads.
Fastest library to load data from DB to DataFrames in Rust and Python
Parseable is an observability datalake built from first principles.
Apache Mahout - an environment for quickly creating scalable, performant machine learning applications.
Stream your Postgres data anywhere in real-time. Simple Rust building blocks for change data capture (CDC) pipelines.
📺(tv) Tidy Viewer is a cross-platform CLI CSV pretty printer that uses column styling to maximize viewer enjoyment.
Apache DataFusion Ballista Distributed Query Engine
The Feldera Incremental Computation Engine
Apache Kafka® compatible broker with S3, PostgreSQL, SQLite, Apache Iceberg and Delta Lake
The Auron accelerator for distributed computing frameworks (e.g., Spark) leverages native vectorized execution to accelerate query processing
A cloud-native open source distributed time series database with high performance, high compression ratio and high availability.
Apache Iceberg
Lightning fast data version control system for structured and unstructured machine learning datasets. We aim to make versioning datasets as easy as versioning code.
Ergonomic bindings to duckdb for Rust
Intuitive Data Workflows
Rust-based WebAssembly bindings to read and write Apache Parquet data
Scalable graph analytics database powered by a multithreaded, vectorized temporal engine, written in Rust
Local-first ETL/ELT studio: a drag-and-drop visual pipeline designer that compiles to SQL and runs on DuckDB. Tiny desktop app, no servers, git-friendly workspaces.
Analytical database for data-driven Web applications 🪶
A SQL extension for declarative data visualisation based on the Grammar of Graphics.
A single-node analytical database engine with geospatial as a first-class citizen
Nyx is a high fidelity, fast, reliable and validated astrodynamics toolkit library written in Rust and available in Python
OpenData is a collection of open source databases built on a common, object-native storage and infrastructure foundation.
Fusio provides file operations on multiple storages across various async runtimes.