In the post-pandemic era of overnight iteration and change, monolithic traditional databases are one of the last holdouts of unnecessary operational complexity, wasting time, money and opportunity. Adrian Bridgwater spoke to Cockroach Labs' Peter Guagenti for The Stack to understand how an enterprise can reinvent its own data fabric with a next-gen database.
Software has a half-life. Over time, almost every piece of software (it’s hard to think of an exception) starts to deteriorate, lose robustness and surrender its ability to integrate and operate the way it did when it was first driven out of the showroom on release day.
Because software quite naturally erodes in this way - and of course hardware does too - the IT industry is perpetually driven to create what it likes to call next-generation technologies. Not always representative of an entirely new era in computing, so-called next-gen tech is often a means of delivering compatibility and functionality at a tier that is at least one solid step forward.
While browsers, word processors, entertainment apps and perhaps even games enjoy comparatively long shelf lives, if there is one subset of the total IT fabric most prone to perpetual reinvention, it might be databases.
A leg-up past scale-up
Relational databases have long been built to an architectural paradigm designed to let them scale up over their lifespan. The technique has proven useful, but not immortal, and many in the industry now argue that its era has passed.
They say that many of today’s commonly deployed relational databases simply weren’t built for modern applications.
Why is this so? Because these are databases that date back to the last millennium, to that misty period of compute structures before cloud, before Software-as-a-Service, before e-commerce and before our contemporary ability to harness Machine Learning (ML) and generative AI. Most of all, these are databases built before the era of data-intensive applications subject to real-time streaming and massively broad webscale scope and reach.
See also: Bloomberg, Man Group team up to develop open source “ArcticDB” database
In the post-pandemic era of overnight iteration and change, monolithic traditional databases are now one of the last holdouts of unnecessary operational complexity that wastes time, money and opportunity. This is the opinion of Peter Guagenti in his role as chief marketing officer at Cockroach Labs. He asserts that next-gen databases - exemplified by the latest generation of distributed SQL databases - are easier to live and work with every day.
So how do we pick the right one?
“It depends, of course, on the velocity and scale of your application and how much it matters to your business,” said Guagenti. “However, there are three things that cannot be compromised for data-intensive applications, each of which is driving architects and IT decision makers to the latest generation of database technology.”
#1 Performance & scale
One thing that hasn’t changed in databases over the years is the need for high performance and the ability to scale. Scaling up (and indeed out) should be a solid process that does not introduce any ‘brittle’ elements into the database’s operational structure.
Cockroach’s Guagenti reminds us that today’s most important applications serve massive numbers of concurrent users, whose reads and writes generate staggering volumes of data. “Compounding this, if users don’t get responses nearly instantaneously, they go elsewhere [i.e. to other apps, services or sources]. Elastically scalable distributed application design makes operating at this scale almost trivial,” he said.
#2 Resilience & correctness
For Guagenti and team, resilience and correctness are key attributes for any next-gen database technology proposition. This is simply because users expect data services to be available right now, always right and always on.
“For example, if a banking application goes down, or if there are discrepancies in balances in a user's account, that’s a fatal problem. With the ability to distribute data across nodes (or datacentres, or clouds) joined with ACID-compliant consistency, distributed SQL databases make both planned and unplanned downtime a thing of the past,” proposed Guagenti.
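To make that concrete, here is a minimal, hedged sketch (not Cockroach Labs' own code) of the pattern Guagenti describes: a banking-style balance transfer run as a single ACID transaction against a PostgreSQL-wire-compatible distributed SQL database, retried if concurrent activity forces a serialization failure. The connection string, table and column names are illustrative assumptions.

```python
import time
import psycopg2  # CockroachDB and its peers speak the PostgreSQL wire protocol

def transfer(conn, from_id, to_id, amount, max_retries=5):
    """Move funds between accounts in one atomic, consistent transaction."""
    for attempt in range(max_retries):
        try:
            with conn:  # commits on success, rolls back on any error
                with conn.cursor() as cur:
                    cur.execute(
                        "UPDATE accounts SET balance = balance - %s WHERE id = %s",
                        (amount, from_id),
                    )
                    cur.execute(
                        "UPDATE accounts SET balance = balance + %s WHERE id = %s",
                        (amount, to_id),
                    )
            return
        except psycopg2.errors.SerializationFailure:
            # Contention with a concurrent transaction: back off and retry.
            time.sleep(0.1 * (2 ** attempt))
    raise RuntimeError("transfer failed after retries")

# Illustrative connection string; any node in the cluster can accept the write.
conn = psycopg2.connect("postgresql://app@localhost:26257/bank?sslmode=require")
transfer(conn, from_id=1, to_id=2, amount=50)
```

The point of the sketch is that the application sees one logical database: the transaction either commits or it doesn’t, even if a node drops out mid-flight, so balances never drift.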
#3 Data locality
Thirdly, we come to the issue of data locality. Organisations across every vertical today need to serve customers in various geographies around the world. This means they want (for the sake of speed) or are required (increasingly by law) to keep data in close proximity to the user.
Guagenti explains that doing this in a way that requires sharding the database and forking applications increases the cost to operate dramatically - potentially doubling operational expenses for each new market you need to support. Globally distributed databases are designed to fundamentally counter and conquer this challenge.
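A hedged illustration of that claim: rather than standing up a separate sharded database per market, a multi-region distributed SQL database such as CockroachDB lets data homing be declared in SQL. The statements below use CockroachDB's multi-region syntax; the database, table and region names are illustrative assumptions.

```python
import psycopg2  # illustrative connection; object names and regions are assumptions

ddl = [
    'ALTER DATABASE bank SET PRIMARY REGION "europe-west1"',
    'ALTER DATABASE bank ADD REGION "us-east1"',
    'ALTER DATABASE bank ADD REGION "asia-southeast1"',
    # Pin each row to the region recorded in its (hidden) crdb_region column,
    # keeping user data close to the user without sharding or forked apps.
    'ALTER TABLE users SET LOCALITY REGIONAL BY ROW',
]

conn = psycopg2.connect("postgresql://app@localhost:26257/bank?sslmode=require")
conn.autocommit = True
with conn.cursor() as cur:
    for stmt in ddl:
        cur.execute(stmt)
```

One application, one logical database: each new market becomes an ADD REGION statement rather than a duplicated stack.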
Switching from legacy to new
“Making the database switch isn’t easy. In fact, it’s hard. Probably one of the last things companies want to do. That’s why Oracle just sits there and grows 10% a year through price increases,” suggested Guagenti, somewhat cheekily but perhaps realistically.
No doubt, there's always a cost to switch, a cost to change and a momentum cost as you re-skill people. But if a business switches from a legacy relational database that requires manual sharding, manual updates (which means lots of scheduled downtime) and hand-built backups and redundancy, the Cockroach CMO claims a distributed SQL database can eliminate upwards of 70% of its operational cost compared with a traditional relational database deployed in the cloud.
See also: 2023’s first mega tech deal is in — and a sign of things to come in the database world
He admits that firms looking to make this leap may need to increase their IT operations effort for an initial period, but the suggestion throughout is that this cost is an investment that pays dividends. On paper if not yet in practice, a more functional next-gen database enables an organisation to tap into new markets and create new products with new features.
The risk of inertia
“Migrating away from legacy infrastructure isn't without risk, but the greater risk is of inaction,” said Guagenti. “This is where logic must trump momentum. If you already know you need to modernise your data stack for any reason, it’s time to pull the bandaid or sticking-plaster off.”
Speaking from a C-suite position with the word marketing in his job title, Guagenti is clearly not going to pull many punches when tabling ideas that promote the type of technology Cockroach is known for. What is inarguably true, whether or not we go along with his ebullient evangelism for distributed SQL wonderfulness, is that the pre-cloud era had comparatively little of this type of architecture, which - even if it does sound cheesy - does make the new era next-generation.
Cloud is distributed, companies are distributed, supply chains are distributed and users are now more widely geographically distributed than ever - so perhaps the argument for consistently distributed database architectures is easier to validate after all.
What are your database challenges and priorities? What has worked and not worked for you? How do you tackle database sprawl driven by every developer and their pet favourite DB? Share your thoughts.