Extremely fast Query Engine for DataFrames, written in Rust
Scalable datastore for metrics, events, and real-time analytics
Search infrastructure for AI
Production-grade Rust-native trading engine with deterministic event-driven architecture
A high-performance observability data pipeline.
📊 Cube Core is an open-source semantic layer for AI, BI and embedded analytics
dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
TensorZero is an open-source LLMOps platform that unifies an LLM gateway, observability, evaluation, optimization, and experimentation.
Cloud-native search engine for observability. An open-source alternative to Datadog, Elasticsearch, Loki, and Tempo.
Visualize, query, and stream to train on multimodal robotics data.
PRQL is a modern language for transforming data — a simple, powerful, pipelined SQL replacement
🐚 A powerful relational ORM for Rust
Data Agent Ready Warehouse: One for Analytics, Search, AI, Python Sandbox. Rebuilt from scratch. Unified architecture on your S3.
Apache DataFusion SQL Query Engine
Sui, a next-generation smart contract platform with high throughput, low latency, and an asset-oriented programming model powered by the Move programming language
Open Lakehouse Format for Multimodal AI. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, Pyarrow, and PyTorch, with more integrations coming.
The open-source Observability 2.0 database. One engine for metrics, logs, and traces — replacing Prometheus, Loki & ES.
The live data layer for apps and AI agents. Create up-to-the-second views into your business, just using SQL
High-performance data engine for AI and multimodal workloads. Process images, audio, video, and structured data at any scale
One SQL interface over APIs, files, and live sources — built for agents.
Apache Iggy: Hyper-Efficient Message Streaming at Laser Speed
Restate is the platform for building resilient applications that tolerate all infrastructure faults w/o the need for a PhD.
DORA (Dataflow-Oriented Robotic Architecture) is middleware designed to streamline and simplify the creation of AI-based robotic applications. It offers low latency, composable, and distributed dataflow capabilities. Applications are modeled as directed graphs, also referred to as pipelines.
Official Rust implementation of Apache Arrow
A native Rust library for Delta Lake, with bindings into Python
A portable accelerated SQL query, search, and LLM-inference engine, written in Rust, for data-grounded AI apps and agents.
Drop-in Apache Spark replacement written in Rust, unifying batch processing, stream processing, and compute-intensive AI workloads.
Fastest library to load data from DB to DataFrames in Rust and Python
Parseable is an observability datalake built from first principles.
Apache Mahout - an environment for quickly creating scalable, performant machine learning applications.
Stream your Postgres data anywhere in real-time. Simple Rust building blocks for change data capture (CDC) pipelines.
📺(tv) Tidy Viewer is a cross-platform CLI CSV pretty printer that uses column styling to maximize viewer enjoyment.
Apache DataFusion Ballista Distributed Query Engine
The Feldera Incremental Computation Engine
Apache Kafka® compatible broker with S3, PostgreSQL, SQLite, Apache Iceberg and Delta Lake
The Auron accelerator for distributed computing frameworks (e.g., Spark) leverages native vectorized execution to accelerate query processing
A cloud-native open source distributed time series database with high performance, high compression ratio and high availability.
Apache Iceberg
Lightning fast data version control system for structured and unstructured machine learning datasets. We aim to make versioning datasets as easy as versioning code.
Ergonomic bindings to DuckDB for Rust
Intuitive Data Workflows
Rust-based WebAssembly bindings to read and write Apache Parquet data
Scalable graph analytics database powered by a multithreaded, vectorized temporal engine, written in Rust
Local-first ETL/ELT studio: a drag-and-drop visual pipeline designer that compiles to SQL and runs on DuckDB. Tiny desktop app, no servers, git-friendly workspaces.
Analytical database for data-driven Web applications 🪶
A SQL extension for declarative data visualisation based on the Grammar of Graphics.
A single-node analytical database engine with geospatial as a first-class citizen
Nyx is a high fidelity, fast, reliable and validated astrodynamics toolkit library written in Rust and available in Python
OpenData is a collection of open source databases built on a common, object-native storage and infrastructure foundation.
Fusio provides file operations on multiple storages across various async runtimes.
Fluree database library
Protocol and libraries for sending and receiving OpenTelemetry data using Apache Arrow
Grammar of Graphics in Rust
A SQL transformation engine that type-checks your whole pipeline and catches breaking changes before they run — branches, replay, column-level lineage, compile-time contracts, per-model cost. Adapters: Databricks, Snowflake, BigQuery, DuckDB. Single static Rust binary. Apache 2.0.
High-resolution, low-overhead systems and service telemetry
View parquet files online
Oxygen OS is the operating system for AI deployment
EVM transactions querying CLI/TUI powered by Revm
Apache Paimon Rust: the Rust implementation of Apache Paimon.
A timeseries database created for events, logs, traces and metrics. Speaks the Postgres dialect and stores data in S3 via the Delta Lake protocol
A command-line tool for querying databases
Run Graph Queries with Lance
Real-time data processing/feature engineering tailored for modern AI/ML systems.
DuckLake took Flight. Welcome to SwanLake.
Document, columnar, KV, Graph, Vector, Array - The only all-in-one database that you need.
Postgres protocol frontend for DataFusion
Comparing performance-oriented string-processing libraries for substring search, multi-pattern matching, hashing, edit-distances, sketching, and sorting across CPUs and GPUs in Rust 🦀 and Python 🐍
A Rust-native DuckLake engine built on Apache DataFusion
We're back! Now firing notebooks out of a t-shirt gun.
An AI-native multi-model database unifying SQL, vector, full-text, graph, and sandboxed Python — for transactional, analytical, and agent workloads.
Next Generation Machine Learning, Statistics and Deep Learning in PURE Rust
Sub-millisecond cache for ML/AI workloads. Parquets in, Arrow-Flight out.
Apache Arrow and Polars compatible, Rust-first columnar data library for real-time and systems workloads
Next Generation Database Client
Rust Client for Apache Fluss (Incubating)
Scalable Observability
Drop-in Acceleration of SQL/PromQL queries
A high-performance, low-cost vector database that can run either embedded inside your Python process or as a standalone HTTP service.
Rust crate for Envio's HyperSync client
Open-source streaming SQL engine written in Rust using Apache Arrow and DataFusion. Supports continuous queries, temporal stream joins, tumbling/session windows, and CDC/Kafka connectors. Lightweight and embeddable, with sub-microsecond latency
Uni is a modern, embedded database that combines property graph (OpenCypher), vector search, and columnar storage (Lance) into a single, cohesive engine. It is designed for applications requiring local, fast, and multimodal data access, backed by object storage (S3/GCS) durability.
A benchmark for assessing geospatial SQL analytics query performance across database systems
Offline-first in-memory data grid with CRDT sync and a Rust server
SochDB is a high-performance embedded, ACID-compliant vector database purpose-built for AI agents and memory
KalamDB — a lightweight, real-time, storage-efficient SQL database. Designed for per-user data isolation and scalable performance — ideal for the AI era.
Vibe-coded NIST compatible database in Rust
PostgreSQL Lance Table Extension
A high-performance, in-memory vector database written in Rust, designed for semantic search and top-k nearest neighbor queries in AI-driven applications, with binary file persistence for durability.
A Rust ingester for GreptimeDB, which is compatible with GreptimeDB protocol and lightweight.
Orbit, aka the GitLab Knowledge Graph, is a project that aims to provide a unified context API for AI systems and human users. This project has both a local Knowledge Graph for your code and a backend service for the entire SDLC.
A fast and modern combat parser for Star Wars: The Old Republic written in Rust