• bobburger@fedia.io
    8 months ago

    I love duckDB, my usual workflow is:

    • initially read my data from whatever source (CSV, relational database somewhere, whatever)
    • write it to one or more parquet files in a directory
    • tell duckdb that the directory is my data source

    Then duckdb treats the directory just like a database that you can build indexes on, and since they’re parquet files they’re hella small and have static typing. My workflow was already pretty fast and efficient before, but duckdb has really sped up my data wrangling and analysis a ton.