
Database

November 29, 2023
August 3, 2015

TODO: merge caravan/database/ here
split database-mysql, database-redis, database-tikv-tidb, database-graph, database-as-a-service, database-transactional, database-kv, database-document, database-lightweight, database-multimodal, database-vector

Database - Wikiwand
Databases 101 - Thomas LaRock
Introduction :: LearnDB

Why We Disable Linux's THP Feature for Databases - DZone Database

Theory

ACID vs. BASE: The Shifting pH of Database Transaction Processing | Big Data Articles | DATAVERSITY
ACID - Wikiwand
Eventual consistency - Wikiwand
Myth: Eric Brewer on Why Banks are BASE Not ACID - Availability Is Revenue - High Scalability -
Building Robust Systems With ACID and Constraints — Brandur Leach
Relational Database ACID Transactions (Explained by Example) - YouTube

XML database - Wikiwand

NoSQL databases should not give up Consistency (article in Chinese)
Don't Get Stuck in the CON Game (V3) - by Pat Helland

Database Theory - MariaDB Knowledge Base
Linearizability versus Serializability | Peter Bailis

VoltDB and the Jepsen Test: What we learned about data accuracy and consistency - VoltDB

CAP Theorem

CAP theorem - Wikiwand
Brewer's CAP Theorem <= :julianbrowne
CAP Twelve Years Later: How the "Rules" Have Changed
Blog | Plan setup, pause-minority, mirrored nodes and the CAP theorem - CloudAMQP, RabbitMQ as a Service
Spanner, TrueTime and the CAP Theorem – Google AI

Episode 227: Eric Brewer: The CAP Theorem, Then and Now : Software Engineering Radio

RUM Conjecture

Read, Update, Memory amplification

RUM Conjecture Series' Articles - DEV Community
The RUM Conjecture | Codementor
EDBT-RUM-Conjecture-public

Data Modeling

Making The Invalid Impossible - Choosing The Right Data Model - DEV Community
Developer: Data Modeling - Neo4j Graph Database (Neo4j)
Database Design - Introduction
A beginner's guide to database table relationships - Vlad Mihalcea

Intro, Data Modeling, Databases | Prisma's Data Guide

Database Keys Made Easy - Primary, Foreign, Candidate, Surrogate, & Many More - YouTube
Third normal form - Wikiwand
An Introduction to Database Normalization | Mike Hillyer's Personal Webspace
The Basics of Database Normalization
Database Normalization Explained - DEV Community 👩‍💻👨‍💻
Database Normalization Explained in Simple English - Essential SQL
Learn Database Normalization - 1NF, 2NF, 3NF, 4NF, 5NF - YouTube

Learn Boyce-Codd Normal Form (BCNF) - YouTube

Boyce-Codd Normal Form:
every attribute should depend on the key, the whole key, and nothing but the key

BCNF is stronger than 3NF, but in practice nearly every table that is in 3NF is already in BCNF.
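A minimal sketch of checking BCNF mechanically: a table is in BCNF when the left side of every non-trivial functional dependency is a superkey. The relation and dependencies below (student, course, teacher) are a made-up textbook-style case that is in 3NF but not in BCNF; the function names are illustrative, not from any library.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already know the whole left side, we learn the right side.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


def bcnf_violations(relation, fds):
    """List non-trivial dependencies whose left side is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and closure(lhs, fds) != relation
    ]


# Hypothetical example: each teacher teaches exactly one course.
RELATION = frozenset({"student", "course", "teacher"})
FDS = [
    (frozenset({"student", "course"}), frozenset({"teacher"})),
    (frozenset({"teacher"}), frozenset({"course"})),
]

for lhs, rhs in bcnf_violations(RELATION, FDS):
    print(sorted(lhs), "->", sorted(rhs), "violates BCNF")
```

Here teacher -> course is the offender: teacher alone does not determine student, so it is not a key, even though every attribute is part of some candidate key (which is why 3NF still holds).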

The Troublesome Active Record Pattern

DBML

DBML - Database Markup Language | DBML
dbdiagram.io - Database Relationship Diagrams Design Tool

Concurrent Update

How To Build a High-Concurrency Ticket Booking System With Prisma - DEV Community Optimistic Concurrency Control (OCC)

What is SELECT FOR UPDATE in SQL (with examples)? SQL has built-in support for this (row locking)
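A minimal sketch of optimistic concurrency control with a version column, using Python's built-in sqlite3 module. The seat table and the book_seat helper are assumptions made up for illustration. SQLite has no SELECT ... FOR UPDATE, so this shows only the optimistic side: the UPDATE succeeds only if the row still carries the version the client read.

```python
import sqlite3


def book_seat(conn, seat_id, user, seen_version):
    """Claim a seat only if nobody changed the row since we read it."""
    cur = conn.execute(
        "UPDATE seat SET owner = ?, version = version + 1 "
        "WHERE id = ? AND version = ? AND owner IS NULL",
        (user, seat_id, seen_version),
    )
    conn.commit()
    # rowcount is 0 when the version check failed, i.e. we lost the race.
    return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE seat (id INTEGER PRIMARY KEY, owner TEXT, version INTEGER NOT NULL)"
)
conn.execute("INSERT INTO seat VALUES (1, NULL, 0)")
conn.commit()

# Two clients read the same row version before either of them writes.
seen_by_alice = conn.execute("SELECT version FROM seat WHERE id = 1").fetchone()[0]
seen_by_bob = seen_by_alice

alice_ok = book_seat(conn, 1, "alice", seen_by_alice)
bob_ok = book_seat(conn, 1, "bob", seen_by_bob)
owner = conn.execute("SELECT owner FROM seat WHERE id = 1").fetchone()[0]
print(alice_ok, bob_ok, owner)
```

The losing client gets a clean failure and can re-read and retry, instead of holding a lock while it decides.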

Use case

How does Stack Overflow do pagination? - Meta Stack Overflow

CRDT

Conflict-free replicated data type - Wikiwand
Readings in conflict-free replicated data types ❗!important
A Look at Conflict-Free Replicated Data Types (CRDT) – Medium
ljwagerfield/crdt: CRDT Tutorial for Beginners (a digestible explanation with less math!)
Summary of CRDTs

SE-Radio Episode 252: Christopher Meiklejohn on CRDTs : Software Engineering Radio
Decentralized Objects with Martin Kleppman | Software Engineering Daily

dominictarr/crdt: Commutative Replicated Data Types for easy collaborative/distributed systems. is this the same?

CRDTs are an alternative to operational transformation (OT) for collaborative editing.
Operational transformation - Wikiwand
Operational Transformation – OT Explained
Operation Transformation - Google Slides
Operational Transformation or How Google Docs Works - David Chu @CocoaHeads Taipei - YouTube
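For the CRDT side, a minimal sketch of a grow-only counter (G-Counter), one of the simplest state-based CRDTs. Each replica only increments its own slot, and merge takes the per-replica maximum, so merges can happen in any order and any number of times. The class and replica names are illustrative.

```python
class GCounter:
    """Grow-only counter: one slot per replica, merged by element-wise max."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counts = {}

    def increment(self, amount=1):
        # A replica only ever writes to its own slot.
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        # max is commutative, associative and idempotent, so replicas converge.
        for rid, n in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), n)


a, b = GCounter("a"), GCounter("b")
a.increment(2)   # happens on replica a while offline
b.increment(3)   # happens on replica b concurrently

a.merge(b)
b.merge(a)
b.merge(a)       # merging twice changes nothing
print(a.value(), b.value())
```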

Comparisons

DB-Engines - Knowledge Base of Relational and NoSQL Database Management Systems ❗!important
Database of Databases - Home
Explore Databases - GitHub Reviews

7 Database Paradigms - YouTube
15 futuristic databases you’ve never heard of - YouTube

Did I Pick The Right Database??? - YouTube

DB-Engines Ranking - popularity ranking of database management systems
DB-Engines Ranking - popularity ranking of document stores

Seven Databases in Seven Days - a Cloud Data Services journey

How To Choose The Right Database? - YouTube
How to Choose the Right Database? - MongoDB, Cassandra, MySQL, HBase - Frank Kane - YouTube
Hitchhiker's guide to database types - DEV Community 👩‍💻👨‍💻

A Comparison of Advanced, Modern Cloud Databases — Brandur Leach
Comparing databases for Vercel and Netlify

Cassandra vs MongoDB vs CouchDB vs Redis vs Riak vs HBase vs Couchbase vs Hypertable vs ElasticSearch vs Accumulo vs VoltDB vs Scalaris comparison -- Software architect Kristof Kovacs
CouchDB Vs MongoDB
Couchbase vs CouchDB | Couchbase
NoSQL - MongoDB vs CouchDB - Stack Overflow
NoSQL grudge match: MongoDB vs. Couchbase Server | InfoWorld
nosql - When to use CouchDB over MongoDB and vice versa - Stack Overflow
Riyad Kalla - Google+ - -Should I use MongoDB or CouchDB (or Redis)--…
Riyad Kalla's answer to How does MongoDB compare to CouchDB- What are the advantages and disadvantages of each- - Quora

Benchmarking LevelDB vs. RocksDB vs. HyperLevelDB vs. LMDB
LevelDB vs. RocksDB Comparison

RocksDB: log-structured merge-tree (LSM tree), fast for write- and append-heavy workloads
InnoDB: B-tree, fast for read- and update-heavy workloads
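A toy sketch of why an LSM tree favors writes: a write only touches an in-memory table that is occasionally flushed as a sorted, immutable run, while a read may have to look through several runs. This is a simplified illustration under made-up names, not how RocksDB is implemented (no write-ahead log, compaction, or bloom filters).

```python
import bisect


class TinyLSM:
    def __init__(self, memtable_limit=2):
        self.memtable = {}
        self.runs = []  # sorted lists of (key, value); newest run is last
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        # Writes never modify old runs; they go to memory and are flushed in bulk.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.runs.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        # Reads check runs from newest to oldest until the key is found.
        for run in reversed(self.runs):
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None


db = TinyLSM(memtable_limit=2)
db.put("a", 1)
db.put("b", 2)   # triggers the first flush
db.put("a", 3)   # newer value for "a"; the old one stays in the first run
db.put("c", 4)   # triggers the second flush
print(db.get("a"), db.get("b"), db.get("z"), len(db.runs))
```

The stale value of "a" in the older run is the kind of garbage that compaction exists to clean up.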

Storage Engines

algorithm#Database Data Structures

Database engine - Wikiwand
Comparison of MySQL database engines - Wikiwand
MySQL Storage Engines » ADMIN Magazine

Should you move from MyISAM to Innodb ? - MySQL Performance Blog
MySQL Engines - MyISAM vs Innodb

InnoDB - Wikiwand Relational
The physical structure of InnoDB index pages – Jeremy Cole
TokuDB - Wikiwand Relational
Percona TokuDB
TokuDB Introduction

WiredTiger - Wikiwand Document

RocksDB - Wikiwand KV

Database Pages — A deep dive. The Physical storage of rows and… | by Hussein Nasser | Medium

How Discord Stores Trillions of Messages | Deep Dive - YouTube

SQL family

BretFisher/sysbench-docker-hpe: Sysbench Dockerfiles and Scripts for VM and Container benchmarking MySQL

SQLite vs MySQL vs PostgreSQL: A Comparison Of Relational Database Management Systems | DigitalOcean
What are pros and cons of PostgreSQL and MySQL? - Quora
MySQL vs PostgreSQL: Why MySQL Is Superior To PostgreSQL

Uber's migration from PostgreSQL (back) to MySQL
Project Mezzanine: The Great Migration at Uber Engineering - Uber Engineering Blog
Why Uber Engineering Switched from Postgres to MySQL - Uber Engineering Blog
Opening Old Wounds - Why Uber Engineering Switched from Postgres to MySQL - YouTube

Timezones

MySQL :: MySQL 8.0 Reference Manual :: 5.1.13 MySQL Server Time Zone Support
time - Should MySQL have its timezone set to UTC? - Stack Overflow
timezone - How do I set the time zone of MySQL? - Stack Overflow

jdbc:mysql://localhost:3306/dbname?serverTimezone=UTC also works


RDBMS/Transactional Database

Relational database management system - Wikiwand

Most people call this category of DBMS a SQL database.

Provides ACID transaction guarantees.

Why SQL is neither legacy, nor low-level, nor difficult, nor the wrong place for (business) data logic, but is simply awesome!
8 no-bull reasons why SQL Server on Linux is huge for Microsoft | InfoWorld

Using SQL for Lightweight Data Analysis | School of Data - Evidence is Power

prahladyeri/VisualAlchemist: Open source web-based database diagramming and automation tool

SQL Server Performance Achieving Massive Scalability with SQL Server
Advanced scaling strategies: Achieving massive scale with SQL
Is 20M of rows still a valid soft limit of MySQL table in 2023? – Yisheng's blog

A Deep Dive in How Slow SELECT * is - YouTube
How Slow is SELECT * ? (A deep dive) | by Hussein Nasser | Apr, 2023 | Medium

Why Does EVERYONE Still Do This To Their DBs??? - YouTube migration files

SQL

sql

Indexes

Database Indexing for Dumb Developers - YouTube

Database Indexing Explained (with PostgreSQL) - YouTube
Indexing in PostgreSQL vs MySQL - YouTube

The effect of Random UUID on database performance - YouTube random UUIDs cause many page lookups and page splits
How Shopify’s engineering improved database writes by 50% with ULID - YouTube

UUIDs are Popular, but Bad for Performance — Let’s Discuss - Percona Database Performance Blog
UUIDs are Bad for Performance in MySQL - Is Postgres better? Let us Discuss - YouTube
MySQL (InnoDB) clusters rows by primary key, so the ordering of the primary key affects data-page I/O; Postgres tables are heap-organized, so the table itself does not have this issue
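A minimal sketch of the idea behind ULID-style keys: put a 48-bit millisecond timestamp in front of 80 random bits, so IDs created later sort later and inserts land near the end of a B-tree index instead of on random pages. The function name is made up, and the ULID text encoding (Crockford base32) and same-millisecond monotonicity rules are left out.

```python
import os
import uuid


def time_ordered_id(timestamp_ms, random_bytes=None):
    """128-bit ID: 48-bit big-endian millisecond timestamp + 80 random bits."""
    if random_bytes is None:
        random_bytes = os.urandom(10)
    return timestamp_ms.to_bytes(6, "big") + random_bytes


# Five IDs created one millisecond apart (timestamps are hard-coded for the demo).
timestamps = [1_700_000_000_000 + i for i in range(5)]
ordered_ids = [time_ordered_id(ts) for ts in timestamps]
random_ids = [uuid.uuid4().bytes for _ in timestamps]

print("time-ordered IDs arrive in index order:", ordered_ids == sorted(ordered_ids))
print("random UUIDs arrive in index order:    ", random_ids == sorted(random_ids))
```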

Datalog

Datalog - Wikiwand

Datalog: Deductive Database Programming
pyDatalog

google/mangle

R2DBC

R2DBC Reactive Relational Database Connectivity, reactive variant of the JDBC API

Unleash the Power of Reactive Programming with R2DBC and MariaDB

Schemaless SQL

Designing Schemaless, Uber Engineering's Scalable Datastore Using MySQL - Uber Engineering Blog
The Architecture of Schemaless, Uber Engineering's Trip Datastore Using MySQL - Uber Engineering Blog
Using Triggers On Schemaless, Uber Engineering's Datastore Using MySQL - Uber Engineering Blog
Code Migration in Production: Rewriting the Sharding Layer of Uber’s Schemaless Datastore

rbastic/go-schemaless: An open-source sharded database framework based on Uber's Schemaless

Scaling

Rise of Globally Distributed SQL Databases - Redefining Transactional Stores for Cloud Native Era - The Distributed SQL Blog

LONG LIVE SQL - YouTube
When should you shard your database? - YouTube
Horizontal vs Vertical Database Partitioning - YouTube
Avoid premature Database Sharding - YouTube
sharding should be the last resort, consider partitioning first
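A small sketch of part of what sharding costs: the application (or a middleware) has to route every key to a shard, and with naive modulo routing most keys change shards when the shard count changes. The shard_for function and the key names are assumptions for illustration, not any particular product's routing scheme.

```python
import hashlib


def shard_for(key, shard_count):
    """Map a key to a shard with a stable hash (Python's hash() is salted per process)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


keys = [f"user-{i}" for i in range(1000)]

# Growing from 4 to 5 shards with modulo routing relocates most keys.
moved = sum(1 for k in keys if shard_for(k, 4) != shard_for(k, 5))
print(f"{moved} of {len(keys)} keys would have to move")
```

Partitioning inside a single server avoids this routing and data-movement problem, which fits the advice above to try it first.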

ProxySQL
Load balancing with ProxySQL

GitHub Engineering Adopts New Architecture for MySQL High Availability
MySQL High Availability at GitHub | GitHub Engineering

MariaDB MaxScale | Database Proxy - Database Security, HA works with MySQL
MariaDB MaxScale | MariaDB datasheet

Postgres-XL | Open Source Scalable SQL Database Cluster
PgBouncer - lightweight connection pooler for PostgreSQL
levkk/pgcat: Meow. PgBouncer rewritten in Rust, with sharding, load balancing and failover support.

Citus Data PostgreSQL extension, no need for application level sharding
Scalable PostgreSQL with Real-Time Analytics | Citus Data
Scaling PostgreSQL with Citus Data's Ozgun Erdogan - Software Engineering Daily
Citus: Scale-Out Clustering and Sharding for PostgreSQL

Presto | Distributed SQL Query Engine for Big Data
Presto: The Definitive Guide
Presto_SQL_on_Everything.pdf
Presto replaces Hive: SQL on anything

Vitess | A database clustering system for horizontal scaling of MySQL MySQL wrapper middleware, no need for application-level sharding, developed at YouTube (by then already part of Google) to scale its MySQL fleet
Vitess: Scaling MySQL with Sugu Sougoumarane - Software Engineering Daily
Vitess: Scaling MySQL Through Distributed Sharding

PlanetScale built on Vitess
PlanetScale: Sharded Database Management with Jiten Vaidya and Dan Kozlowski - Software Engineering Daily

HAProxy - The Reliable, High Performance TCP/HTTP Load Balancer
HAProxy recipes

Deploying Active-Active PostgreSQL on Kubernetes
PostgreSQL: Documentation: 10: Chapter 26. High Availability, Load Balancing, and Replication
How to Set Up PostgreSQL for High Availability and Replication with Hot Standby | Google Cloud Platform Community
Scaling Postgres with Read Replicas & Using WAL to Counter Stale Reads — Brandur Leach
An Easy Recipe for Creating a PostgreSQL Cluster with Docker Swarm

Understanding Database Failover in the Cloud and Across Regions - Heimdall Blog
Understanding Database Failover: Part 2 - Amazon - Heimdall Blog
Understanding Database Failover: Part 3 - MySQL - Heimdall Blog
Understanding Database Failover: Part 4 - PostgreSQL - Heimdall Blog
Database Scaling with Read/Write Split - Heimdall Blog

ClusterControl

ClusterControl | Open Source Database Management System
severalnines/docker: ClusterControl docker image

ClusterControl on Docker | Severalnines

Java issues

Problems with MySQL master/slave allowMasterDownConnections · Issue #625 · brettwooldridge/HikariCP a replica (slave) going down can take the server down
Failover and High availability with MariaDB Connector/J - MariaDB Knowledge Base

Replication Lag

What Causes Replication Lag? | Oracle Learning MySQL Blog
Goodbye Replication Lag! | MariaDB
How to identify and cure MySQL replication slave lag

Database replication lag | Dries Buytaert

Context aware MySQL pools via HAProxy | GitHub Engineering
Mitigating replication lag and reducing read load with freno | GitHub Engineering

OctoBase

OctoBase - Local-first, yet collaborative database
toeverything/OctoBase: 🐙 OctoBase is the open-source database behind AFFiNE, local-first, yet collaborative. A light-weight, scalable, data engine written in Rust.

MySQL

MySQL
MySQL - Wikiwand

Course introduction — MySQL for Developers — PlanetScale

Top 5 open source tools for MySQL administrators | InfoWorld

# connect to server
mysql -u${user} -p${password} -h ${server} --database ${database}

# within mysql client
use {database};
desc {table};
explain select * from {table};

with Docker

MySQL on Docker | Severalnines from basic to setting up cluster

MySQL on Docker: Swarm Mode Limitations for Galera Cluster in Production Setups | Severalnines
MySQL on Docker: Running ProxySQL as a Helper Container on Kubernetes | Severalnines
MySQL on Docker: Running ProxySQL as Kubernetes Service | Severalnines
MySQL on Docker: Running Galera Cluster on Kubernetes | Severalnines

Basics:

MySQL Docker Containers: Understanding the basics | Severalnines intro to Docker
MySQL on Docker: Building the Container Image | Severalnines building image and pushing to Docker Hub
MySQL on Docker: Single Host Networking for MySQL Containers | Severalnines
MySQL on Docker: Introduction to Docker Swarm Mode and Multi-Host Networking | Severalnines

10 Tips for Building Resilient Payment Systems (2023)
How Shopify’s engineering improved database writes by 50% with ULID - YouTube

Dolt

DoltHub Home |DoltHub MySQL + Git
dolthub/dolt: Dolt – Git for Data

Drop-in replacements

MariaDB.org - Continuity and open collaboration may not be a true drop-in replacement, since it adds its own proprietary and open source extensions
MariaDB TX | Enterprise Open Source Database Platform (DBMS)
Library - MariaDB Knowledge Base
MariaDB on Vimeo
Sessions | M|18
Sessions | M|17
MariaDB vs MySQL: What is the Difference Between MariaDB and MySQL

Percona Server– An enhanced, drop-in MySQL Replacement

PostgreSQL: The world's most advanced open source database

H2 Database Engine
HSQLDB


NoSQL

sql#JSON support

NoSQL - Wikiwand
NOSQL Databases
NoSQL: Past, Present, Future
Visual Guide To NoSQL Systems
Why SQL Database? - VoltDB
The basics of NoSQL databases — and why we need them
SQL vs NoSQL: The Differences — SitePoint
When SQL Isn’t the Right Answer - Better Programming - Medium
NoSQL Databases: Why You Don’t Need Them

Next Generation Databases mostly addressing some of the points: being non-relational, distributed, open-source and horizontally scalable.

Picking SQL or NoSQL? – A Compose View
SQL vs NoSQL: when to use?
Mathias Meyer - 'Don’t Use NoSQL' on Vimeo
Elasticsearch as a NoSQL Database | Elastic
NoSQL standouts: The best document databases | InfoWorld
NoSQL standouts: The best key-value databases | InfoWorld

NoSQL vs. NewSQL_ Evaluating Database Technologies for 2019 on Vimeo

Usually provides BASE eventual consistency (eventual convergence may be a better term).

MongoDB vs. PostgreSQL vs. ScyllaDB: Tractian’s Experience - The New Stack

Don't Get Stuck in the CON Game (V3) - by Pat Helland

Category in data type/arrangement:

Category in architecture:

The first three support relationships through a second index lookup, similar to a JOIN in SQL.

SQL is Dead, Hail to Flux – devconnected
Monitoring systemd services in realtime with Chronograf – devconnected

UI Client

DbGate | Open Source (no)SQL Database Client ❗!important
dbgate/dbgate: Database manager for MySQL, PostgreSQL, SQL Server, MongoDB, SQLite and others. Runs under Windows, Linux, Mac or as web application

FastoNoSQL - cross-platform GUI Manager for Redis, Memcached, SSDB, LevelDB, RocksDB, LMDB, Unqlite, ForestDB, Pika, Dynomite and KeyDB databases.
fastogt/fastonosql: FastoNoSQL is a crossplatform Redis, Memcached, SSDB, LevelDB, RocksDB, UnQLite, LMDB, ForestDB, Pika, Dynomite, KeyDB GUI management tool.


Key-Value Database

Redis

Redis
Redis documentation
Get Started with Redis Modules on AWS - Redis
Redis 7.0 Is Near With "Significant Performance Optimizations" - Phoronix

Learn Redis with Free Online Courses | Redis University
RU203: Querying, Indexing, and Full-Text Search | Redis University

Redis In-memory Database Crash Course - YouTube
I've been using Redis wrong this whole time... - YouTube

rbmkio/radish: Desktop client for Redis (Windows, MacOS, Linux)

The Little Redis Book at the time of v3.0.3, for core concept only
AppsInTheOpen

编程技术宇宙 - YouTube Redis animations

Introduction to Redis - DEV Community 👩‍💻👨‍💻
Introduction to Redis: Installation, CLI Commands, and Data Types
An introduction to Redis data types and abstractions – Redis
Redis Crash Course - YouTube
Redis In-Memory Database Crash Course - YouTube

Redis persistence demystified
Scaling a High-traffic Rate Limiting Stack With Redis Cluster — Brandur Leach
How to Use Redis With Python – Real Python
Using Redis with docker and docker-compose for local development a step-by-step tutorial

Redis Sentinel Documentation – Redis monitoring, automatic failover, provides HA
Redis cluster tutorial – Redis distributed data store, automatic failover, data sharding
projecteru/redis-trib.py: Redis Cluster lib in Python

How to use Redis Streams | InfoWorld
How to build a Redis Streams application | InfoWorld

Introduction · Hydra using Redis as message queue
fetlife/redis-analyzer: Redis Memory Analyzer written in Rust

arq — arq v0.25.0 documentation
samuelcolvin/arq: Fast job queuing and RPC in python with asyncio and redis.

OptimalBits/bull: Premium Queue package for handling distributed jobs and messages in NodeJS.
nodeca/idoit: Redis-backed task queue engine with advanced task control and eventual consistency

twitter/twemproxy: A fast, light-weight proxy for memcached and redis

Our failure story with Redis operator for K8s (+ a brief look at Redis data analysis tools)

Redis single-threading is no longer enough: a tour of VM, BIO, and I/O multithreading (with source code) - InfoQ (article in Chinese)

KeyDB

KeyDB - The faster Redis Alternative
JohnSully/KeyDB: A Multithreaded Fork of Redis
KeyDB - Database of Databases

KeyDB as a [possible] replacement for Redis - Flant - Medium
Redis Should Be Multi-threaded - John Sully - Medium

Dragonfly

Dragonfly
dragonflydb/dragonfly: A modern replacement for Redis and Memcached
Dragonfly - Database of Databases

Riak

Key Value Database | NoSQL Key Value Database | Riak KV | Basho
Riak - Database of Databases

Badger DB

Golang Key-Value Store - Badger DB | Dgraph
dgraph-io/badger: Fast key-value DB in Go.

LMDB

LMDB: Lightning Memory-Mapped Database Manager (LMDB) B+ tree, faster than log based for small tree size
Lightning Memory-Mapped Database - Wikiwand
Kolab Now Blog: A short guide to LMDB
LMDB: Getting Started
LMDB/lmdb: Read-only mirror of official repo on openldap.org.
LMDB - Database of Databases

Symas Lightning Memory-mapped Database | Symas Corporation

Yugabyte DB

YugaByte DB built on Postgres, multicloud
Introducing YugaByte DB | YugaByte DB
Architecture | YugabyteDB Docs

YugaByte review: Planet-scale Cassandra and Redis | InfoWorld
Review: YugabyteDB does PostgreSQL proud | InfoWorld

Strong Consistency with YugabyteDB - Vlad Mihalcea

LevelDB

see #PouchDB for higher level API

LevelDB.org
google/leveldb: LevelDB is a fast key-value storage library written at Google that provides an ordered mapping from string keys to string values.
Database of Databases - LevelDB

Node.js LevelDB GitHub org
Resources · Level/levelup Wiki
workshopper/levelmeup: Level Me Up Scotty! An intro to Node.js databases via a set of self-guided workshops.
LevelDB Review (in 18 parts, seriously) « Another Word For It
substack/leveldb-handbook: how to modularly database with leveldb
[r.va.gg] Should I use a single LevelDB or many to hold my data?

syndtr/goleveldb: LevelDB key/value database in Go.
oodrive/leveldb.net: LevelDB for Windows and .NET standard

NodeUp: A Node.js Podcast - fortyeight - the first nodebase show
Poor Man's Firebase: LevelDB, REST, and WebSockets

Ecosystem

LevelUp Ecosystem
Modules · Level/levelup Wiki
[r.va.gg] All the levels!

Level/levelup: A node.js wrapper for abstract-leveldown compliant stores
Level/leveldown: Pure C++ Node.js LevelDB binding serving as the back-end to LevelUP
Level/level-js: An abstract-leveldown compliant store on top of IndexedDB.
Level/memdown: In-memory abstract-leveldown store for Node.js and browsers.

levelgraph/levelgraph: Graph database JS style for Node.js and the Browser. Built upon LevelUp and LevelDB.
levelgraph/levelgraph-jsonld: The Object Document Mapper for LevelGraph based on JSON-LD

mafintosh/hypergraph: Yet another Merkle DAG
substack/level-create-batch: insert a batch of keys if and only if none of the keys already exist
substack/level-lock: in-memory advisory read/write locks for leveldb keys

hxoht/lev: The complete REPL & CLI for managing LevelDB instances.
maxogden/superlevel: a minimalist cli utility for leveldb databases

hxoht/levelui: A GUI for LevelDB management based on atom-shell.
ricardobeat/levelhud: Graphical front-end for exploring data stored in LevelDB.

Presentations

A Real Database Rethink
JavaScript Databases II

Optimizing LevelDB for Performance and Scale (RICON East 2013) - Speaker Deck
How to Cook a Graph Database in a Night
Build your own database with LevelDB and Node.js

PouchDB

PouchDB, the JavaScript Database that Syncs!
Compatible with CouchDB, LevelDB and browsers

pouchdb/pouchdb: - PouchDB is a pocket-sized database.

Ecosystem

#LevelDB

PouchDB Community
Adapters
Plugins and External Projects

pouchdb-community/pouchdb-load: Load documents into CouchDB/PouchDB from a dumpfile

RocksDB

RocksDB | A persistent key-value store | RocksDB forked from LevelDB 1.5, allows multiple writers, reduces stalls and write amplification

Getting started | RocksDB
Home · facebook/rocksdb Wiki
RocksDB Basics · facebook/rocksdb Wiki
Features Not in LevelDB · facebook/rocksdb Wiki
Administration and Data Access Tool · facebook/rocksdb Wiki ldb CLI tool

Under the Hood: Building and open-sourcing RocksDB
Facebook rocks an open source storage engine for MySQL | InfoWorld

facebook/rocksdb: A library that provides an embeddable, persistent key-value store for fast storage.
warrenfalk/rocksdb-sharp: .net bindings for the rocksdb by facebook
Welcome to python-rocksdb’s documentation! — python-rocksdb documentation

MyRocks: A space- and write-optimized MySQL database | Engineering Blog | Facebook Code
facebook/mysql-5.6: Facebook's branch of the Oracle MySQL v5.6 database. This includes MyRocks.

SEHException thrown after disposal and reopen · Issue #5 · curiosity-ai/rocksdb-sharp C# binding's library may crash on CPU without AVX2 (now fixed)

YottaDB

YottaDB | Rock Solid. Lightning Fast. Secure.
Documentation - YottaDB

YottaDB · GitLab

TiKV

#TiDB

GunDB

GUN - Graph Database
A key-value database written in Node.js that supports:

amark/gun: A realtime, decentralized, offline-first, graph database engine.
The Changelog #236: GunDB, Venture Backed and Decentralized with Mark Nadal | Changelog
232 JSJ GunDB and Databases with Mark Nadal

Skytable

Skytable: A free and open-source realtime NoSQL database for building modern apps
Introduction | Skytable Documentation
skytable/skytable: Skytable is an extremely fast, secure and reliable real-time NoSQL database with automated snapshots and TLS


Document Database

Why the Document Model Is More Cost-Efficient Than RDBMS - The New Stack

The article argues that an RDBMS is a poor fit for high-velocity online transaction processing (OLTP) workloads because its data is normalized, but that modern processors are powerful enough that we get away with it anyway.
RDBMS solutions rely on cheap CPU cycles to enable efficient solutions. NoSQL solutions rely on efficient data models to minimize the amount of CPU required to execute common queries.

CouchDB

Apache CouchDB
#bbuzz: Jan Lehnard "The CouchDB Implementation" - YouTube

robertkowalski/learnyoucouchdb: Learn you CouchDB for great good!

RethinkDB

RethinkDB: the open-source database for the realtime web
Frequently asked questions - RethinkDB

RethinkDB - Wikiwand
rethinkdb/rethinkdb: The open-source database for the realtime web.
Ten-minute guide with RethinkDB and JavaScript - RethinkDB
Jepsen: RethinkDB 2.1.5
Jepsen: RethinkDB 2.2.3 reconfiguration

Originally designed as a database for SSDs, later retargeted at realtime applications
Push model: a live query forms a pub/sub stream (changefeeds)
Document store with table joins
ACID guarantee, durable by default (commits to disk before acknowledging)
Value consistency, unless the client explicitly requests stale data
ReQL: a Python-like, chainable query language

The company behind RethinkDB shut down in 2016-10. CNCF bought the rights to the open source project and donated it to The Linux Foundation in 2017-02.
RethinkDB is shutting down - RethinkDB
RethinkDB: why we failed
RethinkDB joins The Linux Foundation - RethinkDB

Rob Conery | RethinkDB 2.0 Is Amazing
Rob Conery | Optimizing a Big RethinkDB Query, and a Correction

#114: RethinkDB with Slava Akhmechet - Changelog
#181: RethinkDB, Databases, and the Realtime Web With Slava Akhmechet - Changelog
SE-Radio Episode 243: RethinkDB with Slava Akhmechet : Software Engineering Radio

Horizon

Horizon is a JavaScript backend built with RethinkDB

Xodus

JetBrains/xodus: JetBrains Xodus is a Java transactional schema-less embedded database used by JetBrains YouTrack and JetBrains Hub.
Home · JetBrains/xodus Wiki

How to use the Xodus database in Kotlin applications

RavenDB

RavenDB - ACID NoSQL Document Database
ravendb/ravendb: A linq enabled document database for .NET

Elasticsearch

elastic-elasticsearch

Crate.io

Crate.IO

SQL over Elasticsearch, works for both operational and analytic workloads (OLTP to OLAP).
CrateDB packs NoSQL flexibility, SQL familiarity | InfoWorld
Containerized deployment, meant for scale.

MongoDB

mongodb

ToroDB

MongoDB protocol and APIs backed by PostgreSQL.

ToroDB | ToroDB
torodb/server: ToroDB Server is an open source NoSQL database that runs on top of a RDBMS. Compatible with MongoDB protocol and APIs, but with support for native SQL, atomic operations and reliable and durable backends like PostgreSQL
torodb/stampede: The ToroDB solution to provide better analytics on top of MongoDB and make it easier to migrate from MongoDB to SQL Transform your NoSQL data from a MongoDB replica set into a relational database in PostgreSQL.

torodb/mongowp: Mongo Wire Protocol layer to create server applications Wraps data source as Mongo compatible server

EdgeDB

EdgeQL/GraphQL and REST backed by PostgreSQL.

EdgeDB—The next generation database
edgedb/edgedb: A graph-relational database with declarative schema, built-in migration system, and a next-generation query language


Columnar

Column vs Row Oriented Databases Explained - YouTube

Apache HBase

Apache HBase – Apache HBase™ Home
Apache HBase - Wikiwand
implements Google's BigTable with Hadoop and HDFS

Configuring and deploying HBase [Tutorial] | Packt Hub
How to interact with HBase using HBase shell [Tutorial] | Packt Hub

Apache Cassandra

The Apache Cassandra Project
Apache Cassandra - Wikiwand

Top 5 reasons to use Apache Cassandra Database | IT Svit Blog

#bbuzz: Sylvain Lebresne "On Cassandra's evolutions" - YouTube
How Discord Stores Trillions of Messages | Deep Dive - YouTube

DataStax Academy: Free Cassandra Tutorials and Training

Running Cassandra in Kubernetes: challenges and solutions

Scylla

ScyllaDB
Scylla (database) - Wikiwand
Cassandra compatible with higher throughputs and lower latencies

scylladb/scylla: NoSQL data store using the seastar framework, compatible with Apache Cassandra

Scylla Care-Pet Example | ScyllaDB Docs
scylladb/care-pet: Care Pet IoT ScyllaDB example
Build your First ScyllaDB Application: New Rust, Python & PHP Tutorials - ScyllaDB

Cassandra Compliant ScyllaDB with Dor Laor | Software Engineering Daily
Column store without joins or atomic transactions (no two-phase commit, not strongly consistent); schemaful, highly scalable, queried with CQL
Acknowledges the client once a quorum of replicas in the cluster has responded
Append-only: fast writes at the expense of reads
Java tuning and GC pauses are a pain point in Cassandra
Sharding per core: memory stays local to the core and access is lockless
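A small sketch of the quorum arithmetic behind "acknowledge after reaching quorum". A quorum is a majority of the replicas, and a read is guaranteed to overlap the latest acknowledged write when read replicas plus write replicas exceed the replication factor. The function names are illustrative and this is not driver code.

```python
def quorum(replication_factor):
    """Smallest majority of the replicas."""
    return replication_factor // 2 + 1


def write_acknowledged(replica_acks, replication_factor):
    """The coordinator answers the client once a quorum of replicas has acked."""
    return sum(replica_acks) >= quorum(replication_factor)


def reads_see_latest_write(read_replicas, write_replicas, replication_factor):
    """R + W > N means every read set shares a replica with every write set."""
    return read_replicas + write_replicas > replication_factor


rf = 3
print(quorum(rf))                                    # replicas needed for a quorum
print(write_acknowledged([True, True, False], rf))   # one slow replica is tolerated
print(reads_see_latest_write(quorum(rf), quorum(rf), rf))
print(reads_see_latest_write(1, 1, rf))              # ONE/ONE style reads can be stale
```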

Seastar - Seastar
C++ framework for high-performance server applications on modern hardware

BigTable

Bigtable - Scalable NoSQL Database Service | Google Cloud
Bigtable - Wikiwand

Bigtable: A Distributed Storage System for Structured Data – Google AI


Full-text search

mongodb#Text Search

SQLite FTS5 Extension full-text search

Lyra

Lyra
LyraSearch/lyra: 🌌 Fast, in-memory, typo-tolerant, full-text search engine written in TypeScript.

Lyra: Disrupting full Text Search industry with JavaScript - YouTube


In-memory/Lite Database

#LevelDB

In-memory database - Wikiwand

Lightweight javascript in-memory database: LokiJS Mongo API
techfort/LokiJS: javascript embeddable / in-memory database
LokiJS-Forge/LokiDB: blazing fast, feature-rich in-memory database written in TypeScript LokiDB is the official successor of LokiJS, but updated less frequently than LokiJS

louischatriot/nedb: The JavaScript Database, for Node.js, nw.js, electron and the browser not actively maintained, Mongo API
Database of Databases - NeDB

TerminusDB an open-source in-memory document graph database
TerminusDB Internals - Part 1: Smaller, Faster, Stronger
TerminusDB Internals - Part 2: Change is Gonna Come

Galaxy - Parallel Universe in-memory data grid for horizontal scaling
SpaceBase - Parallel Universe a real-time spatial database

node.js - What (in_memory) graph DB if modeling data is focused - Stack Overflow


Lightweight Database

Single file/folder, cross-platform, in process database with persistence.

KV:

Database of Databases - LMDB
Database of Databases - LevelDB
Database of Databases - RocksDB
Database of Databases - BadgerDB
Database of Databases - ForestDB

Database of Databases - Sled
sled - Rust
KodrAus/rust-csharp-ffi: An example Rust + C# hybrid application

SQL:

sqlite
DuckDB - An in-process SQL OLAP database management system
Firebird: The true open source database for Windows, Linux, Mac OS X and more
LiteDB :: A .NET embedded NoSQL database

Bolt

boltdb/bolt: An embedded key/value database for Go. 😴inactive
etcd-io/bbolt: An embedded key/value database for Go. active fork
asdine/storm: Simple and powerful toolkit for BoltDB


Graph Database

Why relationships are cool but "join" sucks
10 good reasons to use graph databases - NaNLABS
Why you should use a graph database | InfoWorld
What is a graph database? A better way to store connected data | InfoWorld
On Graph Databases | The Backend Engineering Show - YouTube

Dgraph

Dgraph — A Distributed, Fast Graph Database

eBay Akutan

Akutan: A Distributed Knowledge Graph Store
eBay/akutan: A distributed knowledge graph store archived

Apache TinkerPop

A standard API for interacting with graph databases.

Apache TinkerPop
TinkerPop3 Documentation

Neo4j

Neo4j: The World's Leading Graph Database
Review: Neo4j supercharges graph analytics | InfoWorld
Learn to Build Graph Databases with Neo4j (Full Course)

Memgraph

Memgraph - Open Source Graph Database
memgraph/memgraph: Open-source graph database, built for real-time streaming data, compatible with Neo4j.

FlockDB

twitter-archive/flockdb: A distributed, fault-tolerant graph database

Crux

Crux general purpose database with bitemporal SQL & Datalog
JUXT Blog - Kotlin Adventures With Crux

AnzoGraph

AnzoGraph™
AnzoGraph review: A graph database for deep analytics | InfoWorld
AnzoGraph: A W3C Standards-Based Graph Database – Towards Data Science

TigerGraph

Home - TigerGraph free for non-commercial use
TigerGraph review: A graph database designed for deep analytics | InfoWorld


Vector Database

WTF Is a Vector Database: A Beginner's Guide! - DEV Community
The Power of Vector Databases For Knowledge Search - YouTube
Why are they suddenly so popular? - YouTube
Vector Databases and the Future of AI-powered Search - Sam Partee - YouTube Redis Vector Search

All You Need to Know about Vector Databases and How to Use Them to Augment Your LLM Apps | by Dominik Polzer | Sep, 2023 | Towards Data Science
Which Vector Database Should I Use? A Comparison Cheatsheet | by Navid Rezaei | Medium
Vector Database Comparison Cheatsheet - Google Sheets

Document-Oriented Agents: Vector Databases, LLMs, Langchain, FastAPI, and Docker | Towards Data Science
Explaining Vector Databases in 3 Levels of Difficulty | by Leonie Monigatti | Towards Data Science
How to implement a vector database for AI - LogRocket Blog

【上集】向量数据库技术鉴赏 - YouTube
【下集】向量数据库技术鉴赏 - YouTube
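
Vector databases are built around similarity search over embeddings. As a point of reference, here is a minimal sketch of exact (brute-force) nearest neighbor search with cosine similarity in plain Python. The `docs` vectors and the `top_k` helper are made up for illustration; real engines typically replace the linear scan with an approximate index.

```python
import math


def cosine_similarity(a, b):
    # Cosine of the angle between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def top_k(query, vectors, k=2):
    # Score every stored vector against the query: an O(n) linear scan.
    scored = [(cosine_similarity(query, vec), name) for name, vec in vectors.items()]
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]


if __name__ == "__main__":
    # Toy 3-dimensional "embeddings" (hypothetical values).
    docs = {
        "cat": [0.9, 0.1, 0.0],
        "dog": [0.8, 0.2, 0.1],
        "car": [0.0, 0.1, 0.9],
    }
    print(top_k([1.0, 0.0, 0.0], docs, k=2))
```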

Top lists

Top 5 Vector Database Solutions for Your AI Project - The New Stack
The 5 Best Vector Databases | A List With Examples | DataCamp
6 open-source Pinecone alternatives for LLMs
12 Vector Databases For 2023: A Review
Vector Database - A Comprehensive Guide | by Navid Rezaei | Towards Data Science
Best Vector Database Software in 2023 | G2
Vector databases - a look at the AI database market with a comprehensive comparison matrix

Qdrant

Qdrant - Vector Database
Qdrant Documentation - Qdrant Similarity search
qdrant/qdrant: Qdrant - High-performance, massive-scale Vector Database for the next generation of AI. Also available in the cloud https://cloud.qdrant.io/

Milvus

Vector database - Milvus
Milvus documentation Similarity search
milvus-io/milvus: A cloud-native vector database, storage for next generation AI applications

Weaviate

Welcome | Weaviate - vector database
Introduction | Weaviate - vector database Similarity search
weaviate/weaviate: Weaviate is an open source vector database that stores both objects and vectors, allowing for combining vector search with structured filtering with the fault-tolerance and scalability of a cloud-native database, all accessible through GraphQL, REST, and various language clients.

Vespa

Vespa - the big data serving engine
Vespa Documentation nearest neighbor search, approximate nearest neighbor search
vespa-engine/vespa: The open big data serving engine. https://vespa.ai

Faiss

Welcome to Faiss Documentation — Faiss documentation similarity search, approximate similarity search
facebookresearch/faiss: A library for efficient similarity search and clustering of dense vectors.

Chroma

Chroma
chroma-core/chroma: the AI-native open-source embedding database

NucliaDB

Nuclia vector database
nuclia/nucliadb: NucliaDB, The vector database optimized for documents and video search

LlamaIndex

LlamaIndex 🦙
run-llama/llama_index: LlamaIndex (formerly GPT Index) is a data framework for your LLM applications

Pinecone

Vector Database for Vector Search | Pinecone
【人工智能】爆肝万字介绍向量数据库和 Pinecone | 向量搜索的演化过程 | LLM 是人类的大脑,向量数据库就是海马体 | Pinecone 的发展历程 | Pinecone 直接和潜在竞争对手有哪些 - YouTube


NewSQL

datebase-postgresql

NewSQL - Wikiwand
NoSQL vs. NewSQL: Choosing the Right Tool - VoltDB

NoSQL Is Dead
Thank You for Your Help NoSQL, but We Got It From Here - DZone Database

NewSQL databases are relational databases that scale out horizontally with ease
What is Distributed SQL? - The Distributed SQL Blog
Distributed SQL vs. NewSQL - The Distributed SQL Blog

Better performance than a traditional RDBMS
ACID guarantees of an RDBMS
Better scaling strategy than an RDBMS
=> Performance and Scale Without Compromise

VoltDB

In-Memory Database | VoltDB a NewSQL database that combines the scale + performance of NoSQL with immediate consistency + ACID transactions
VoltDB - Wikiwand

Jepsen: VoltDB 6.3

VoltDB and In-Memory Databases with John Hugg - Software Engineering Daily
Episode 199: Michael Stonebraker on Current Developments in Databases : Software Engineering Radio

Intelligent Real-Time Decisions with VoltDB and Apache Kafka

Row-based: each row is stored as a record in a proprietary format on storage
Adjacent rows are stored consecutively on the block device
Write-ahead log to support rollback and crash recovery
Query language of choice is SQL
B-tree index
Query optimizer and query executor
Multithreaded, with locks on the B-tree

HA is a byproduct of scaling out to multiple nodes

X record-level locking
Multiversion concurrency control
Optimistic: write to a new location assuming the transaction will succeed, and resolve conflicts if any exist
Timestamp ordering
Requires time sync
X log for crash recovery
Command logging, slower recovery
HA by default, so recovery from the command log is a rare case
X buffer pool
Not needed in a main-memory database
X multithreading
Sharding by core
Latch-free data structures
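
A minimal sketch of the optimistic, multiversion idea noted above: writes go to a new version instead of overwriting in place, and conflicts are only checked at commit time. This is a toy single-process model, not VoltDB's or any other product's implementation. `VersionedStore`, `ConflictError` and the first-committer-wins rule are assumptions chosen for illustration, and a logical counter stands in for synchronized timestamps.

```python
class ConflictError(Exception):
    """Raised when a transaction fails commit-time validation."""


class VersionedStore:
    def __init__(self):
        self._versions = {}  # key -> list of (commit_ts, value), oldest first
        self._clock = 0      # logical clock standing in for synchronized time

    def begin(self):
        # A transaction reads from the snapshot as of its start timestamp.
        return {"start_ts": self._clock, "writes": {}}

    def read(self, txn, key):
        if key in txn["writes"]:
            return txn["writes"][key]
        # Newest version committed at or before the transaction started.
        for commit_ts, value in reversed(self._versions.get(key, [])):
            if commit_ts <= txn["start_ts"]:
                return value
        return None

    def write(self, txn, key, value):
        # Buffered optimistically; nothing is locked or overwritten in place.
        txn["writes"][key] = value

    def commit(self, txn):
        # Validate: abort if another transaction committed a newer version
        # of any written key after this transaction started.
        for key in txn["writes"]:
            history = self._versions.get(key, [])
            if history and history[-1][0] > txn["start_ts"]:
                raise ConflictError(key)
        self._clock += 1
        for key, value in txn["writes"].items():
            self._versions.setdefault(key, []).append((self._clock, value))
        return self._clock


if __name__ == "__main__":
    store = VersionedStore()
    t1, t2 = store.begin(), store.begin()
    store.write(t1, "x", 1)
    store.commit(t1)
    store.write(t2, "x", 2)
    try:
        store.commit(t2)
    except ConflictError as exc:
        print("conflict on key:", exc)
```

Readers never block writers in this model, at the cost of aborting and retrying the losing transaction when two writers touch the same key.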

Criticism of NoSQL:

HDFS, MapReduce, Hive and Pig (SQL-like)
Hive: long-running jobs, can resume interrupted computation

MemSQL

MemSQL: The Database For Real-Time Applications
MemSQL - Wikiwand

CockroachDB

Cockroach Labs
Cockroach Labs - Wikiwand

CockroachDB is a cloud-native SQL database for building global, scalable cloud services that survive disasters.

Open source SQL database CockroachDB hits 1.0 | InfoWorld
Be Flexible & Consistent: JSON Comes to CockroachDB | Cockroach Labs

TiDB

Home| PingCAP
TiDB | PingCAP
pingcap/tidb: TiDB is a distributed HTAP database compatible with the MySQL protocol

TiDB Academy | PingCAP
How we build TiDB| PingCAP
TiDB 社区技术月刊 | TiDB Books

TiDB Operator 1.0 GA: Database Cluster Deployment and Management Made Easy with Kubernetes | TiDB
How to save time with TiDB | Opensource.com
5 key differences between MySQL and TiDB | Opensource.com
Implementing Distributed Transactions the Google Way: Percolator vs. Spanner - The Distributed SQL Blog

Kubernetes Podcast from Google: Episode 121 - TiKV, TiDB and PingCAP, with Ed Huang
TiKV | A distributed transactional key-value database contributed to CNCF

TiDB 的后花园

TiKV

pingcap/tikv: Distributed transactional key value database powered by Rust and Raft uses Rust, Raft, RocksDB
A Deep Dive into TiKV| PingCAP
A TiKV Source Code Walkthrough - Raft in TiKV| PingCAP
RocksDB in TiKV| PingCAP

TiKV 源码初探

TiKV - building a distributed key-value store with Rust A transactional key-value store powered by … - YouTube

Google Cloud Spanner

Cloud Spanner | Automatic Sharding with Transactional Consistency at Scale | Google Cloud
Spanner (database) - Wikiwand

Google's Cloud Spanner: how does it stack up? | ZDNet
Spanner vs. Calvin: Distributed Consistency at Scale


Multi Model

Data modeling with multi-model databases - O'Reilly Radar
SE-Radio Episode 353: Max Neunhoffer on Multi-model databases and ArangoDB : Software Engineering Radio

SurrealDB

written in Rust; SQL; ACID; schemaless or schemafull; relational, graph and document

SurrealDB | The ultimate serverless cloud database
surrealdb/surrealdb: A scalable, distributed, collaborative, document-graph database, for the realtime web
SurrealDB in 100 Seconds - YouTube
Beyond Surreal? A closer look at NewSQL Relational Data - YouTube
Rust Powered Database SurrealDB (It's Pretty Ambitious) - YouTube
Getting started with SurrealDB!! Future of cloud databases (maybe)? - YouTube

OrientDB

OrientDB - Distributed Graph/Document Multi-Model Database
OrientDB - Wikiwand

Multi-model database supporting graph, document, key/value, and object models.
Relationships are managed by graph.

Why OrientDB | Open Source NoSQL Multi-model Database | OrientDB
OrientDB - Getting Started | Udemy

ArangoDB

ArangoDB - highly available multi-model NoSQL database
ArangoDB - Wikiwand
Document, Graph, KV
Supports JavaScript (V8 Engine)

Data modeling with multi-model databases - O'Reilly Radar

FoundationDB

FoundationDB | Home
Announcing The FoundationDB Record Layer SQL/FoundationDB

FoundationDB
apple/foundationdb: FoundationDB - the open source, distributed, transactional key-value store

Cosmos DB

Azure Cosmos DB – Globally Distributed Database Service (formerly DocumentDB) | Microsoft Azure
Cosmos DB - Wikiwand
Introduction to Azure Cosmos DB | Microsoft Docs

Multi-model, multi-API database as a service by Microsoft, with multiple consistency levels

FaunaDB

Fauna | The data API for client-serverless applications
Define a GraphQL schema and it will handle the rest
distributed ACID document DB

Fauna is rethinking the database with Evan Weaver, Co-founder and CTO at Fauna (The Changelog #461) |> Changelog
FaunaDB Basics - The Database of your Dreams - YouTube


Time Series

What Are Time Series Databases, and Why Do You Need Them? - The New Stack
4 Best Time Series Databases To Watch in 2019 – devconnected
Time Series Analysis For Beginners - Towards Data Science

Time series databases are also a good fit for logging

Time Series Analysis Introduction — A Comparison of ARMA, ARIMA, SARIMA Models | by Destin Gong | Nov, 2022 | Towards Data Science

Time Series Database | NoSQL Time Series Database | Riak TS | Basho
OpenTSDB - A Distributed, Scalable Monitoring System

Timescale | an open-source time-series SQL database optimized for fast ingest, complex queries and scale. Postgres extension
Solving one of PostgreSQL's biggest weaknesses. - YouTube

LinkedIn 開源時間序列預測函式庫 Greykite | iThome
Greykite: A flexible, intuitive, and fast forecasting library | LinkedIn Engineering
linkedin/greykite: A flexible, intuitive and fast forecasting library

Greptime: Cloud-scale, Fast and Efficient Time Series Data Infrastructure | Greptime
GreptimeTeam/greptimedb: GreptimeDB, an open-source, cloud-native, distributed time-series database.

Apache IoTDB

Telegraf Open Source Server Agent | InfluxData

elastic-elasticsearch

InfluxDB

Home InfluxDB | InfluxData
InfluxDB - Wikiwand
influxdata/influxdb: Scalable datastore for metrics, events, and real-time analytics
InfluxDB 2.0 is cloud-based; battle-tested features go to the open source version, which is more like a downstream
IOx (InfluxDB 3.0?) separates the database part as an open source product; the control plane is the closed source commercial part

The Definitive Guide To InfluxDB In 2019 – devconnected
Getting Started with Python and InfluxDB – The New Stack

InfluxData Documentation

Query data stored in InfluxDB | InfluxDB Cloud Documentation
Use the Flux Visual Studio Code extension | InfluxDB Cloud Documentation

InfluxDB (TICK Stack) — Part1. Overview | by Nidhin kumar | CodingTown | Medium
InfluxDB (TICK Stack) — Part2. Overview | by Nidhin kumar | CodingTown | Medium

M3DB

M3: Open Source Metrics Engine wire compatible with InfluxDB, built for Prometheus and Grafana

Real Time

RethinkDB
GunDB
MemSQL
VoltDB

Introduction · RxDB - Documentation

Fluid Framework
Solving real time collaboration using Eventual Consistency

Druid | Database for modern analytics applications
Introduction to Apache Druid | Apache® Druid

AIDB

Machine Learning In Your Database Using SQL
mindsdb/mindsdb: A low-code Machine Learning platform to help developers build AI solutions

Hacky DB

mapbox/hubdb: a github-powered database
jadeallencook/gDoc.js: Use Google Spreadsheets as your CMS
Sheetsee.js Uses Google Spreadsheets

DBaaS

PlanetScale Pricing MySQL-compatible, with free tier

Pricing | Cockroach Labs distributed SQL, with free tier

Pricing | Railway can host SQL, Redis, Mongo, $5 credit per month

Turso | Pricing SQLite on the edge, generous free tier

Pricing and Plans | Fauna distributed ACID document DB

Neon — Serverless, Fault-Tolerant, Branchable Postgres built on Postgres

Pricing | MongoDB with free tier

Upstash: Serverless Data for Redis® and Kafka® with free tier, cheap

Pricing & fees | Supabase Firebase alternative, with free tier

Fauna | Pricing generous free tier

The open-source alternative to Vercel Storage | JavaScript in Plain English
Vercel Storage | Vercel Docs
Vercel Postgres | Vercel Docs powered by Neon
Vercel KV | Vercel Docs Redis-compatible KV store, replaceable by Upstash
Vercel Blob | Vercel Docs S3 compatible storage powered by Cloudflare R2