Skip to content

Some bunch of test scripts to generate a SQLite DB with 1B rows in fastest possible way

License

Notifications You must be signed in to change notification settings

avinassh/fast-sqlite3-inserts

Repository files navigation

Fast SQLite Inserts

To find out the fastest way to create an SQLite DB with one billion random rows.

Read this blog post for the more context - Towards Inserting One Billion Rows in SQLite Under A Minute

Leaderboard

(for 100M insertions)

Variant Time
Rust 23 seconds
PyPy 126 seconds
CPython 210 seconds

Current Benchmark

Python

These are the current fastest CPython and PyPy numbers.

$ ./bench.sh

Sat May  8 19:42:44 IST 2021 [PYTHON] running sqlite3_opt_batched.py (100_000_000) inserts
      517.53 real       508.24 user         7.35 sys

Sat May  8 20:03:04 IST 2021 [PYPY] running sqlite3_opt_batched.py (100_000_000) inserts
      159.70 real       153.46 user         5.81 sys

Rust

These are the current fastest Rust numbers

Mon Nov 22 18:47:26 IST 2021 [RUST] basic_batched.rs (100_000_000) inserts

real   0m23.826s
user   0m21.685s
sys    0m2.057s

Mon Nov 22 18:47:50 IST 2021 [RUST] threaded_batched.rs (100_000_000) inserts

real   0m23.070s
user   0m27.512s
sys    0m2.465s

In Memory

Instead of writing to disk, I used a :memory: DB, these are the numbers

Mon May 10 17:40:39 IST 2021 [RUST] basic_batched.rs (100_000_000) inserts
       31.38 real        30.55 user         0.56 sys

Mon May 10 17:39:39 IST 2021 [RUST] threaded_batched.rs (100_000_000) inserts
       28.94 real        45.02 user         2.03 sys

Busy loop time

The amount of time these scripts were taking in just to run the for loops (and no SQL insertion)

$ ./busy.sh

Sun May  9 13:16:01 IST 2021 [PYTHON] busy_loop.py (100_000_000) iterations
      351.14 real       347.53 user         3.39 sys

Sun May  9 13:21:52 IST 2021 [PYPY] busy_loop.py (100_000_000) iterations
       81.58 real        77.73 user         3.80 sys

Sun May  9 13:23:14 IST 2021 [RUST] busy.rs (100_000_000) iterations
       17.97 real        16.29 user         1.67 sys

Sun May  9 13:23:32 IST 2021 [RUST] threaded_busy.rs (100_000_000) iterations
        7.18 real        42.52 user         7.20 sys

Community Contributions

PR Author Result
#2 captn3m0 Reduced the CPython running time by half (from 7.5 minutes to 3.5 minute)
#12 red15 saved 2s from Rust's running time (bringing it to 30s)
#19 kerollmops saved 5s from Rust's running time (bringing it to 23s)

Contributing

All contributions are welcome. If you have any ideas on increasing the performance, feel free to submit a PR. You may also check the current open issues to work on.

License

Released under MIT License. Check LICENSE file more info.