What are your dreams when you launch an application? To have millions of users, to deliver something that meets a need, to earn money? Or all three of those things? When you launch, you join the 38,000 other apps starting up that month on Apple’s App Store and the 110,000 opening up on Android.
Say you are successful, and smash your goals around downloads, sign-ups and usage. Now the real adventure begins - you have to keep up with the sheer amount of data that your application generates.
With application development and deployment, we have proven techniques that allow us to scale.
Microservices application design breaks apps down into smaller components that connect to each other through APIs. Need to update or scale up one of those components? That can all take place behind the API, so users don’t see the changes and still get the service they want.
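To make the "behind the API" idea concrete, here is a minimal Python sketch. Every name in it (`ProfileBackend`, `InMemoryBackend`, `ShardedBackend`, `profile_api`) is hypothetical and invented for illustration. It only shows that a component can be swapped for a scaled-up version while the caller sees the same response.

```python
from typing import Protocol


class ProfileBackend(Protocol):
    """Hypothetical contract that every version of the component must honor."""

    def get_display_name(self, user_id: str) -> str: ...


class InMemoryBackend:
    """First version of the component: a plain dictionary."""

    def __init__(self, names: dict[str, str]) -> None:
        self._names = names

    def get_display_name(self, user_id: str) -> str:
        return self._names[user_id]


class ShardedBackend:
    """Scaled-up version: same contract, data split across several shards."""

    def __init__(self, names: dict[str, str], shard_count: int = 4) -> None:
        self._shards: list[dict[str, str]] = [{} for _ in range(shard_count)]
        for user_id, name in names.items():
            self._shard_for(user_id)[user_id] = name

    def _shard_for(self, user_id: str) -> dict[str, str]:
        # Sum of code points is stable across runs, unlike the built-in hash().
        return self._shards[sum(map(ord, user_id)) % len(self._shards)]

    def get_display_name(self, user_id: str) -> str:
        return self._shard_for(user_id)[user_id]


def profile_api(backend: ProfileBackend, user_id: str) -> dict[str, str]:
    """The API that callers see; its response shape stays the same."""
    return {"user_id": user_id, "display_name": backend.get_display_name(user_id)}


users = {"u1": "Ada", "u2": "Grace"}
before = profile_api(InMemoryBackend(users), "u1")
after = profile_api(ShardedBackend(users), "u1")
print(before == after)
```

The caller only ever touches `profile_api`, so replacing the backend is invisible to it. In a real deployment that boundary would typically be a network API, not a function call.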
The development of Kubernetes as a container orchestration platform has made it easier to support those applications, whether that means elastically scaling instances when workloads peak or healing individual container failures. In the cloud or on-prem, the same deployment methodologies work. Scaling apps wasn’t always easy, but now there are tools in place that Site Reliability Engineers (SREs) and developers can use.
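The scaling and healing behavior described above comes down to repeatedly comparing a desired state with the actual state and closing the gap. The toy Python model below illustrates that loop. It is not the Kubernetes API: `ToyCluster`, `reconcile`, and the `app-N` instance names are assumptions made up for this sketch.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class ToyCluster:
    """Toy stand-in for an orchestrator: tracks desired vs. running instances."""

    desired_replicas: int
    running: set[str] = field(default_factory=set)
    _ids: count = field(default_factory=count, repr=False)

    def reconcile(self) -> None:
        """Start or stop instances until the running count matches the desired count."""
        while len(self.running) < self.desired_replicas:
            self.running.add(f"app-{next(self._ids)}")
        while len(self.running) > self.desired_replicas:
            self.running.pop()


cluster = ToyCluster(desired_replicas=3)
cluster.reconcile()
initial_count = len(cluster.running)

# Healing: one instance fails, and the next reconcile pass replaces it.
crashed = next(iter(cluster.running))
cluster.running.discard(crashed)
cluster.reconcile()
healed_count = len(cluster.running)

# Elastic scaling: raise the target for a peak, then lower it afterwards.
cluster.desired_replicas = 5
cluster.reconcile()
peak_count = len(cluster.running)

cluster.desired_replicas = 2
cluster.reconcile()
quiet_count = len(cluster.running)

print(initial_count, healed_count, peak_count, quiet_count)
```

The operator only declares how many instances should exist, and the loop does the rest. That is the same division of labor the paragraph above describes, in a much simplified form.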
Managing and using data is a different problem. For applications that have to support millions of users, the sheer scale of the data problem is massive. Any infrastructure put in place here has to scale, has to run across multiple locations or availability zones, and has to connect with the front-end app or service easily.
There are databases that provide the scalability side - the likes of Apple, Netflix, eBay, and Instagram all use Apache Cassandra, for example - but distributed databases require specific skills and understanding. There has been a strong desire to make the data side as easy as the application side.
Thinking cloud native around data
The move to data on Kubernetes is on.
There are several new open source projects starting up to combine cloud-native applications and infrastructure with data. The first is K8ssandra, a combination of Apache Cassandra and Kubernetes, which provides a single distribution for those that want to make it easier to scale around data and applications in a single control plane.
Distributed systems have to keep working through failures, and both Cassandra and Kubernetes are designed for this problem. Each can automatically restart clusters or containers that fail. However, because both systems do this, a lot of coordination needs to take place between them. K8ssandra provides the automation and administrative tools needed to make that coordination easy for both new and experienced users. It has everything needed for daily operations, such as backup, along with hooks for observability services to monitor running databases. SREs can focus more on the bigger picture of application deployments and less on running a Cassandra cluster.
Alongside this is another open source project around data and APIs called Stargate. This provides a data gateway that acts as an API and connects front-end frameworks for application development through to the database at the back-end. By abstracting the database layer and automating management around this, developers can avoid some of the problems associated with picking and running databases alongside systems like GraphQL or React. Rather than having to use specific databases that are not built for scale, this approach can ensure that using a distributed database is easy while letting developers concentrate on building applications.
Behind all these projects, there is a bigger movement. Developers are driving the decisions by choosing to work with simple yet powerful APIs. Their work is responsible for the new services or applications that customers want to use, and speed counts. For many, the developer experience is the be-all and end-all of those decisions. This should make the back-end of these applications, where scalability and availability have to take place, more important to the application. SREs can now deliver the services that developers are asking for, with a lot less ramp-up time and far more repeatability.
The goal is to make it easier for developers to use what they are familiar with. This should involve using any data store for any data application by adding support for new APIs, data types, and access methods. Developers should no longer need to work with different databases and different APIs to power modern data apps. Instead, we have to concentrate on making the default settings provide more security, availability and scalability, without developers having to invest time in learning specific data structures or models.
For everyone building applications, thinking about a database should be one of the furthest things from our minds when it comes to launching. New projects coming up around cloud-native data and Kubernetes are creating that reality. Making it easier to scale up and keep services running is essential for any new application or service launch, so implementing cloud-native data without trade-offs will build our future.