Spacedrive is an open source cross-platform file explorer, powered by a virtual distributed filesystem written in Rust.
Scalable datastore for metrics, events, and real-time analytics
A high-performance observability data pipeline.
dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
Incremental engine for long horizon agents 🌟 Star if you like it!
Data Agent Ready Warehouse: one for Analytics, Search, AI, and Python Sandbox. Rebuilt from scratch. Unified architecture on your S3.
Event streaming platform for agentic AI. Continuously ingest, transform, and serve event streams in real time, at scale.
Simple, Elastic-quality search for Postgres
Apache DataFusion SQL Query Engine
Open Lakehouse Format for Multimodal AI. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, PyArrow, and PyTorch, with more integrations coming.
The open-source Observability 2.0 database. One engine for metrics, logs, and traces — replacing Prometheus, Loki & ES.
High-performance data engine for AI and multimodal workloads. Process images, audio, video, and structured data at any scale
DORA (Dataflow-Oriented Robotic Architecture) is middleware designed to streamline and simplify the creation of AI-based robotic applications. It offers low latency, composable, and distributed dataflow capabilities. Applications are modeled as directed graphs, also referred to as pipelines.
Official Rust implementation of Apache Arrow
A native Rust library for Delta Lake, with bindings into Python
An extensible, state-of-the-art framework for columnar compression, and the fastest FOSS columnar file format. Formerly at @spiraldb, now an Incubation Stage project at LFAI&Data, part of the Linux Foundation.
A portable accelerated SQL query, search, and LLM-inference engine, written in Rust, for data-grounded AI apps and agents.
Drop-in Apache Spark replacement written in Rust, unifying batch processing, stream processing, and compute-intensive AI workloads.
Parseable is an observability datalake built from first principles.
The Feldera Incremental Computation Engine
The Auron accelerator for distributed computing frameworks (e.g., Spark) leverages native vectorized execution to accelerate query processing
A cloud-native open source distributed time series database with high performance, high compression ratio and high availability.
Communication infrastructure for the AI era — one binary, one broker, one storage layer, any protocol
Tonbo is an embedded database for serverless and edge runtimes.
Apache Iceberg
Postgres Foreign Data Wrapper development framework in Rust.
AI-Native & Cloud-Native FS: A high-performance file semantic layer for cloud object storage, integrated with high-speed cache
Intuitive Data Workflows
Rust-based WebAssembly bindings to read and write Apache Parquet data
Scalable graph analytics database powered by a multithreaded, vectorized temporal engine, written in Rust
Local-first ETL/ELT studio: a drag-and-drop visual pipeline designer that compiles to SQL and runs on DuckDB. Tiny desktop app, no servers, git-friendly workspaces.
Analytical database for data-driven Web applications 🪶
A single-node analytical database engine with geospatial as a first-class citizen
GeoArrow in Rust, Python, and JavaScript (WebAssembly) with vectorized geometry operations
Protocol and libraries for sending and receiving OpenTelemetry data using Apache Arrow
Lakehouse native graph engine with git-style workflows
View parquet files online
Apache Paimon Rust: the Rust implementation of Apache Paimon.
A time-series database built for events, logs, traces, and metrics. Speaks the Postgres dialect and stores data in S3 via the Delta Lake protocol
A command-line tool for querying databases
Run Graph Queries with Lance
DuckLake took Flight. Welcome to SwanLake.
Postgres protocol frontend for DataFusion
On-device property graph database. Schema-as-code. One CLI → One Folder. No Server. Think: DuckDB for graphs.
The power of Rust for the STAC ecosystem
Apache Arrow and Polars compatible, Rust-first columnar data library for real-time and systems workloads
Databricks's Zerobus Ingest SDKs
Manage Multimodal Agentic Context Lifecycle with Lance
High-performance, DSL-free stream processing
Geometry and Geography Support for Apache DataFusion
Rust Client for Apache Fluss (Incubating)
Open-source streaming SQL engine written in Rust using Apache Arrow and DataFusion. Supports continuous queries, temporal stream joins, tumbling/session windows, and CDC/Kafka connectors. Lightweight and embeddable, with sub-microsecond latency
Uni is a modern, embedded database that combines property graph (OpenCypher), vector search, and columnar storage (Lance) into a single, cohesive engine. It is designed for applications requiring local, fast, and multimodal data access, backed by object storage (S3/GCS) durability.
A benchmark for assessing geospatial SQL analytics query performance across database systems
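🌿 StoryForge (草苔): an AI-director-style novel writing system. A desktop writing application powered by Tauri and Rust, integrating a knowledge graph, foreshadowing tracking, the StyleDNA style engine, collaborative editing, and a 7-stage fully automated writing workflow. Makes AI your creative partner, one that understands you better the more you write.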
Multi-driver TUI database client with a built-in MCP server. Six databases (postgres, mysql, sqlite, duckdb, clickhouse, mssql), vim editing, Lua + WASM plugins, schema diff, audit log.
RDF.rs is a Rust framework for working with RDF knowledge graphs.
Offline-first in-memory data grid with CRDT sync and a Rust server
caro: fast Rust CLI that turns natural‑language tasks into a safe POSIX command. Built for macOS (MLX/Metal) with a built‑in model; supports vLLM/Ollama/LM Studio. JSON‑only output, safety checks, confirmation, multi‑step goals, devcontainer included.
SochDB is a high-performance embedded, ACID-compliant vector database purpose-built for AI agents and memory
KalamDB — a lightweight, real-time, storage-efficient SQL database. Designed for per-user data isolation and scalable performance — ideal for the AI era.
Building block library for using Apache Arrow in Rust WebAssembly modules.
Full AI context layer for coding agents — code-map, document RAG, shared memory, web crawl, git history. 300+ languages, one MCP server.
One Memory. All CLIs. Never Compacted. Exact Search.
A book about how SQLite works. Rewriting SQLite in Rust for learning and fun, and writing the book I wish I had when I started.
Ported from the original MikeOSS and expanded into a desktop application
PostgreSQL Lance Table Extension
Self-hosted control plane for long-lived AI coding agents — structured ACP timelines, multi-agent team workflows, and remote execution nodes in one surface.
A lightweight Rust ingester for GreptimeDB, compatible with the GreptimeDB protocol.