Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CMU Databases Systems / Fall 2019 #98

Open
Sunt-ing opened this issue Nov 6, 2021 · 1 comment
Open

CMU Databases Systems / Fall 2019 #98

Sunt-ing opened this issue Nov 6, 2021 · 1 comment

Comments

@Sunt-ing
Copy link
Owner

Sunt-ing commented Nov 6, 2021

https://www.youtube.com/watch?v=oeYBdghaIjc&list=PLSE8ODhjZXjbohkNBWQs_otTrBTrjyohi

@Sunt-ing
Copy link
Owner Author

Sunt-ing commented Nov 7, 2021

01 - Course Introduction & Relational Model

  • Web browser is important but there is no course for how to build it.
  • Database is important but it's hard to work with it correctly.

02 - Advanced SQL

  • SQL-92 is the minimum that a DBMS has to support in order to claim they support SQL.
  • SQL is based on bags (duplicates) not sets (no duplicates).

03 - Database Storage I

  • Still, a Jean Dean list:
    • 0.5 ns L1 Cache Ref
    • 7 ns L2 Cache Ref
    • 100 ns DRAM
    • 150,000 ns SSD
    • 10,000,000 ns HDD
    • ~30,000,000 ns Network Storage
    • 1,000,000,000 ns Tape Archives
  • Why not use mmap in DB?
    • OS know nothing about DB
    • You can give OS some hints by using following commands
      • madvise: Tell the OS how you expect to read certain pages
      • mlock: Tell the OS that memory ranges cannot be paged out.
      • msync: Tell the OS to flush memory ranges out to disk.
    • Problems:
      • What if we allow multiple threads to access the mmap files to hide page fault stalls
      • This works good enough for read-only access. It is complicated when there are multiple writers
    • Which DBMS use mmap?
      • ElasticSearch, ravendb, MonetDB, LevelDB (but no RocksDB)
    • Which DBMS partially use mmap?
      • MemSQL, InfluxDB, MongoDB, SQLite
    • DBMS is better than OS in:
      • Flushing dirty pages to disk in the correct order.
      • Specialized prefetching.
      • Buffer replacement policy.
      • Thread/process scheduling.
  • There are three different notions of "pages" in a DBMS:
    • Hardware Page (usually 4KB)
    • OS Page (usually 4KB)
    • Database Page (512B-16KB)

04 - Database Storage II

  • Blob
    • Some systems allow you to store a really large value in an external file. Treated as a BLOB type.
    • The DBMS cannot manipulate the contents of an external file.
      • No durability protections.
      • No transaction protections.
    • Someone who maintained SQLite said that many users store up to 1 MB blob to SQLite since it's faster. The database already has the file open, rather than fetch the file pointer to the file system and then fetch the blob.
    • Database storage is super expensive since to ensure the data is safe you may use high-end storage.

05 - Buffer Pools + Memory Management

  • Locks and Latches
    • Locks are held for transaction duration. Need to be able to rollback changes.
    • Latches protect the critical sections of the DBMS's internal data structure from other threads. Held for operation duration.
  • Optimization
    • Prefetch (based on what the query wants)
    • The DBMS can also prefetch pages based on a query plan
      • Sequential Scans (for example, we need page 1, page 2, page 5...)
      • Index Scans (similar to the case in B+ tree)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant