smallpond

Smallpond is a lightweight distributed data processing framework. It uses DuckDB as its compute engine and stores data in Parquet format on a distributed file system (e.g. 3FS).

Why smallpond?

  • Performance: Uses DuckDB to deliver native-level data processing performance.

  • Scalability: Keeps intermediate data on a high-performance distributed file system rather than in memory, enabling PB-scale data handling without memory bottlenecks.

  • Simplicity: No long-running services or complex dependencies, making it easy to deploy and maintain.