Remote Dictionary Server (Redis) was created by Salvatore Sanfilippo in 2009. It was born out of the necessity to store viewership metrics for a few websites and display them in real time on a web page.
It was originally prototyped in Tcl. After a successful proof of concept, he rewrote it in C, added a fork-based persistence feature and open-sourced it on GitHub.
The earliest adopters were GitHub and Instagram.
In 2011, Ofer Bengal created Redis Labs, the current lead sponsor company behind Redis. In 2015, Salvatore joined as the open source development lead.
As of May 2020, it is used by GitHub, Twitter, Stack Overflow, Flickr and many others, with adoption heavily driven by cloud providers and enterprise offerings.
Name | Storage Type | Storage Options | Query Types | Extras |
---|---|---|---|---|
Redis | In-memory NoSQL database | Strings (Text, Numbers, Datetime, Boolean), Lists, Sets, Hashes, ZSets, Streams, customizable types | CRUD commands, bulk operations, partial transaction support | Master/Follower replication, disk persistence, sharding, Publish/Subscribe, stored procedures, pluggable module system |
Memcached | In-memory Key-Value cache | Key-Value mappings (Strings) | CRUD commands | Multithreaded server |
PostgreSQL | On-disk Relational Database (RDBMS) | Tables of rows and columns (Text, Numbers, Datetime, Boolean), views over tables, customizable types (XML, JSON, etc.) | CRUD commands, custom stored procedures, full transactional support | ACID operations, Master/Follower replication, Multi-Master replication, extensible |
MongoDB | On-disk NoSQL database | Collections of schemaless Binary JSON (BSON) documents (Text, Numbers, Datetime, Boolean) | CRUD commands, conditional queries, full transactional support | ACID operations, Map-reduce support, Master/Follower replication, sharding, spatial indices |
The web is the primary use case
Functionality is closer to a NoSQL database than just an in-memory cache
Everything is in-memory
Sticks to the basics - data structures are standard C implementations
Below are some benchmarks and comparisons; for the caveats, refer to the links
More benchmarks on redis.io/topics/benchmarks
JSON Blob | Redis | PostgreSQL | Speedup |
---|---|---|---|
Get | 0.53ms | 8.66ms | 16x |
Set | 0.44ms | 8.59ms | 20x |
1mil GET/SET | Redis | Memcached |
---|---|---|
User time | 8.95s | 8.64s |
System time | 20.59s | 19.37s |
Redis Ops/sec | Xeon E5520 (2.27GHz) |
---|---|
Set | 552,028.75 |
Get | 707,463.75 |
List Push | 767,459.75 |
List Pop | 770,119.38 |
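Throughput figures like the ones above are typically produced with the bundled redis-benchmark tool. A minimal sketch, assuming a local Redis instance on the default port (results will vary with hardware and configuration):

# Run 1 million requests per test and print one summary line per command
redis-benchmark -t set,get,lpush,lpop -n 1000000 -q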
Type | Contains |
---|---|
String | Strings (Encoding Agnostic), Integers (32/64bit), Floats (IEEE 754) |
List | Linked list of Strings |
Set | Unordered collection of unique strings |
Hash | Unordered hash table of key-values |
ZSet (Sorted Set) | Ordered map of string to float, sorted by score |
Stream | Append-only log, different consumers for one data stream (Kafka-like) |
HyperLogLog | Counts unique items in a space-efficient manner (Bloom filter-like)
Bitmaps | String (char[]) with bit-oriented commands |
Geospatial Indices | Encodes latitude and longitude (ZSet with Geohash algorithm) |
// String Representation (Text, Integer, Float)
{
"key_to_text": "Hello World!",
"key_to_int": 14,
"key_to_float": 3.34,
}
// List of Integers
{
"key_to_list": [1,2,3,4]
}
// Set of Text Strings
{
"key_to_set": {"foo","bar","baz"}
}
// Hashmap Representation
{
"key_to_hashmap": {
"some_text": "hello world",
"a_number": 42
}
}
// ZSet Representation
{
"key_to_highscore": {
"Ivo": 10.99999,
"John": 9.8888,
"Peter": 7.8888
}
}
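For reference, the structures represented above could be created with commands along these lines (the key names are just the illustrative ones from the examples):

// String, list, set, hash and sorted set equivalents of the JSON-like examples above
SET key_to_text "Hello World!"
RPUSH key_to_list 1 2 3 4
SADD key_to_set foo bar baz
HSET key_to_hashmap some_text "hello world" a_number 42
ZADD key_to_highscore 10.99999 Ivo 9.8888 John 7.8888 Peter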
A common misconception is that inside a hash, you can have another data structure like a list or set. Unfortunately, you can't nest data structures in Redis.
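A common workaround is to flatten the hierarchy into the key namespace instead; a sketch with hypothetical keys and values:

// Instead of nesting a hash inside a hash, encode the nesting in the key name
HSET user:1 name "Ivo" age 30
HSET user:1:address city "Sofia" country "Bulgaria"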
2 Shortcomings
Redis system reliability
Data transmission reliability
Clients can subscribe to channels and listen for published messages on these channels.
TL;DR: this is not Redis's strong side; for a proper pub/sub system, look at Redis Streams or Apache Kafka instead.
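A minimal pub/sub sketch (the channel name is arbitrary):

// Client A: listen for messages on the "news" channel
SUBSCRIBE news
// Client B: publish a message to the "news" channel
PUBLISH news "hello subscribers"
// Client A receives: 1) "message" 2) "news" 3) "hello subscribers"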
A few examples
Everything is done through commands
The entire list is available on redis.io/commands
// Adding a string with key "ivo" and value "1"
SET ivo 1
// Getting the created key:
GET ivo
"1"
// Appending to a list:
RPUSH mykey a b c d
// Indexing a list:
LINDEX mykey 0
"a"
All commands are atomic, i.e. a change is visible to all clients immediately.
However, there is partial support for transactions.
They are not the same as transactions in relational databases.
A transaction is just a collection of commands that are executed without interruption, wrapped between MULTI and EXEC:
MULTI
COMMAND 1
COMMAND 2
....
EXEC
1) The main goal is to remove race conditions
2) A secondary use case is reducing client-server round trips
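A concrete sketch, using hypothetical account keys whose counters must change together:

// Queue both commands, then execute them without interruption
MULTI
DECRBY account:alice 10
INCRBY account:bob 10
EXEC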
The full list is available on redis.io/clients
Python
NodeJS
C++
C#
Lua
Scala
OCaml
C
Java
Rust
Ruby
PHP
The breadth of client integrations speaks volumes about Redis's popularity and the range of problems it solves across software engineering.
2 ways of storing data on disk, both compressible: point-in-time snapshots (RDB) and an append-only file (AOF)
If Redis runs out of memory, it will start using swap and performance will degrade.
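A minimal redis.conf sketch showing both options (the thresholds shown are common defaults, not recommendations):

# RDB snapshot: dump the dataset if at least 1 key changed in 900 seconds
save 900 1
rdbcompression yes
# AOF: log every write command, fsync to disk once per second
appendonly yes
appendfsync everysec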
Fully discussed on redis.io/topics/persistence
1) The master starts a background snapshot and begins holding a backlog of writes made since the snapshot began
2) The follower gets wiped out entirely
3) The follower starts syncing from the snapshot
4) Once the follower is synced, the master sends it the backlog of writes
5) Master and follower are up to date
The replication process in a nutshell
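Pointing a follower at its master is a single command (REPLICAOF in Redis 5+, SLAVEOF in older versions); the host and port below are placeholders:

// On the follower: start replicating from the given master
REPLICAOF 10.0.0.1 6379
// Verify the replication state
INFO replication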
Diskless replication
During a resync, the snapshot file (*.rdb) is written to disk and then read back from disk by the replica; if your disk is slow, replication speed suffers. Diskless replication streams the snapshot directly over the wire to the replica, skipping the disk, which increases replication speed and alleviates load on the master.
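Diskless replication is controlled from redis.conf; a minimal sketch (the delay value is illustrative):

# Stream the snapshot directly to replicas instead of writing it to disk first
repl-diskless-sync yes
# Wait a few seconds so several replicas can share the same transfer
repl-diskless-sync-delay 5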
Followers can have their own followers, resulting in replication chaining.
This is useful if you have many followers and want to avoid slowing down your master with constant snapshotting, as each new follower that joins kicks off the creation of a new snapshot; chained followers all sync up to the same master snapshot.
Fully discussed on https://redis.io/topics/replication
Redis Sentinel provides high availability for Redis as a distributed system
Sentinel Configuration Advice
Always deploy multiple Sentinel instances (at least three), so they can reach a quorum on failures
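A minimal sentinel.conf sketch, with placeholder names, addresses and timeouts:

# Monitor a master called "mymaster"; 2 Sentinels must agree before declaring it down
sentinel monitor mymaster 127.0.0.1 6379 2
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 60000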
Fully discussed on https://redis.io/topics/sentinel
Redis Cluster offers automatic sharding across nodes
Increases cluster resiliency in case of node failure
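A sketch of bootstrapping a cluster with redis-cli (Redis 5+); the addresses are placeholders for six locally running instances:

# Create a 6-node cluster: 3 masters, each with 1 replica
redis-cli --cluster create \
  127.0.0.1:7000 127.0.0.1:7001 127.0.0.1:7002 \
  127.0.0.1:7003 127.0.0.1:7004 127.0.0.1:7005 \
  --cluster-replicas 1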
Cache keys on the client side
The server keeps a list of the keys each client has requested, so it knows which clients to send a key invalidation message to when a value changes.
Key | Clients caching the key |
---|---|
foo | A, B |
bar | C |
baz | A, D |
This is inefficient when there are many client connections, each fetching millions of keys, because it has a large memory footprint on the server side.
The alternative keeps an invalidation table with a limited number of caching slots, reducing the server-side memory overhead.
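A sketch of enabling tracking from redis-cli, assuming a Redis 6 server (the feature was still preliminary at the time):

// Switch the connection to the RESP3 protocol, required for server push messages
HELLO 3
// Ask the server to remember which keys this client reads
CLIENT TRACKING ON
// Reading a key now registers it; the client gets an invalidation push when it changes
GET foo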
A ranked list of popular modules is available on https://redis.io/modules
NB! Some of the concepts mentioned are still subject to change, like the client-side caching implementation, which is still in its preliminary stage.
Redis is significantly better documented than Memcached. If you've found anything you would like to understand in more depth, go to the official documentation.
Thank you for listening!