Over the years Microsoft researchers have exhibited an unbridled enthusiasm for doing interesting things around data storage. Synthetic DNA? Not quite ready for primetime, but denser than flash memory, and once you make the DNA polymer, it doesn’t consume any energy.
Quartz glass? Using lasers, Microsoft’s researchers managed to store the original Superman film on a coaster-shaped piece of it, boasting that it could be “boiled in hot water, baked in an oven, microwaved, flooded, scoured, demagnetized” without losing data. (Just don’t drop it, OK?)
Now a slightly less avant-garde but more immediately deployable project has just landed. Redmond has open-sourced (under the permissive MIT Licence) a new cache-store called Garnet, which its researchers describe as having “state-of-the-art performance on both Linux and Windows” as tested against Redis, Dragonfly and KeyDB. (Benchmarking details here.*)
It is being deployed by Microsoft in production for Azure Resource Manager among several other projects, Redmond said on March 18.
Garnet’s server can talk to any existing Redis client owing to its use of the latter’s open source RESP protocol. Its storage layer is a customised fork of Microsoft’s own open source FASTER project, with tiered storage support (memory, SSD, and cloud storage), “fast non-blocking checkpointing, recovery, operation logging for durability, multi-key transaction support, and better memory management and reuse.”
Badrish Chandramouli, a partner research manager in the data systems group at Microsoft Research, told The Stack by email that “Garnet has been built from the ground up with performance (especially thread scalability in throughput, and low latency at high percentiles) in mind.”
Asked where he sees Garnet being deployed, Chandramouli said any “existing application that uses Redis (or KeyDB or Dragonfly) as its cache, but needs better throughput, lower latency, reduce costs by hosting fewer shards of the cache-store, needs a cache to be larger-than-memory, i.e., spill data to local disk or SSD. [Or] any new application that is looking for an extremely high performance caching tier to accelerate performance and reduce costs of going to a backend storage server or database.”
Chandramouli has a track record of real innovation here, having led the creation of streaming analytics engine Trill, used in the public-facing Azure Stream Analytics, and more recently FASTER and FishStore, a fast storage and querying layer for flexible-schema data, both of which are used in-house. (FASTER alone has seen over half a million downloads on NuGet.org.)
“We would love to get feedback on how the system does for various [other] real-world applications. Also, we have a powerful C# based stored procedure model for custom transactions that users might find interesting. Finally, we see the project as a vehicle for future research innovations such as optimized disk IO, kernel-bypass networking, and vector database application scenarios,” Chandramouli added by email.
Getting started with Garnet
With a typical deployment of this kind of cache-store, an end-user deploys a server that hosts the cache (e.g., Redis, KeyDB, or Dragonfly) and a client library that applications embed, to interact with the server using a wire protocol (RESP in this case). At its most basic level, think of Garnet as an alternative server to such servers, Chandramouli explained.
“You can deploy Garnet, then continue to use your favorite client in your favorite programming language to interact with the new server… To get started, you would simply run Garnet as described in our ‘getting started’ section. Then you would perform your cache operations using your favorite Redis client (there are many, for example, see here for a list),” he said, adding: “We like to use StackExchange.Redis for C# applications.”
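To make the wire-protocol point concrete, here is a minimal Python sketch of what any RESP client sends and receives, whichever RESP-speaking server sits on the other end. It is illustrative only and is not taken from Garnet or from any particular client library: the function names `encode_command` and `decode_reply` are hypothetical, and the decoder covers only the simplest reply types (no arrays, and it assumes the whole reply is already in memory).

```python
def encode_command(*args: str) -> bytes:
    """Encode a command as a RESP array of bulk strings.

    Hypothetical helper for illustration; real clients do this internally.
    """
    parts = [f"*{len(args)}\r\n".encode()]  # array header: number of elements
    for arg in args:
        data = arg.encode("utf-8")
        # each element is a bulk string: $<byte length>\r\n<bytes>\r\n
        parts.append(b"$" + str(len(data)).encode() + b"\r\n" + data + b"\r\n")
    return b"".join(parts)


def decode_reply(raw: bytes):
    """Decode a single non-array RESP reply held fully in memory."""
    kind, rest = raw[:1], raw[1:]
    line, _, remainder = rest.partition(b"\r\n")
    if kind == b"+":  # simple string, e.g. +OK
        return line.decode()
    if kind == b"-":  # error reply
        raise RuntimeError(line.decode())
    if kind == b":":  # integer reply
        return int(line)
    if kind == b"$":  # bulk string; a length of -1 means "no value"
        length = int(line)
        if length == -1:
            return None
        return remainder[:length].decode()
    raise ValueError(f"unsupported reply type: {kind!r}")


if __name__ == "__main__":
    print(encode_command("SET", "greeting", "hello"))
    print(decode_reply(b"$5\r\nhello\r\n"))
```

Because the server only ever sees bytes in this format, an application using an existing Redis client library should, in principle, need nothing more than a change of host and port to point at a different RESP-compatible server.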
Microsoft describes Garnet as having “a fast and pluggable network layer” that supports TLS as well as basic access controls; it also “supports a wide range of APIs including raw string, analytical, and object operations described earlier. It also implements a cluster mode with sharding, replication, and dynamic key migration,” a blog explains.
Garnet’s documentation is here. Releases can be found at https://www.nuget.org/packages/Microsoft.Garnet, which contains Garnet as a library for users to self-host in an application. This can be based on GarnetServer application code available here.
The research team is also providing what it admits are “very basic” Dockerfiles on GitHub, adding “we would appreciate contributions to help make our Docker support more comprehensive. Official Docker builds are on our radar for the future.” For more details and testing, see here.
*We generally treat performance benchmarks with a degree of caution. As Redis has earlier highlighted, rivals love to tweak their benchmarks to bump up their performance versus Redis. Here's the link [at bottom] to Microsoft's benchmarking tool for this set of tests.