Warren Buffett’s GEICO repatriates work from the cloud, continues ambitious infrastructure overhaul

Cue Kubernetes, OpenStack, and a LOT of on-premises storage, says platform and infrastructure VP Rebecca Weekly in an exclusive interview with The Stack.

Rebecca Weekly. VP, Infrastructure, GEICO.

Editor's note: We followed up on this popular story with a deeper dive into the infrastructure layer and Open Compute Project adoption.

GEICO, a large insurance firm owned by Warren Buffett’s Berkshire Hathaway, is repatriating many workloads from the cloud, The Stack can confirm – as it embarks on a sweeping, ambitious architectural overhaul.

That’s according to Rebecca Weekly, GEICO’s VP of platform and infrastructure engineering, who is leading the infrastructure rebuild at what is the United States’ third largest automotive insurer by volume.

(GEICO reported revenues of $39.6 billion and pre-tax profits of $3.6 billion in 2023. It offers coverage in 50 states for cars, motorcycles, all-terrain vehicles, boats and more.)

In an interview with The Stack she confirmed the shift, saying “we have a lot of data – and it turns out that storage in the cloud is one of the most expensive things you can do in the cloud, followed by AI in the cloud…”

The company had started moving to the cloud in 2013 for its 600+ applications but the journey was sub-optimal. (Weekly does not name the provider but public documentation suggests that GEICO went in heavily on Azure.) The ultimate result? Costs rose and availability declined.

Weekly previously held senior roles at Intel and Cloudflare. She is now leading the infrastructure element of a major IT overhaul – which includes a large OpenStack private cloud deployment and heavy use of Kubernetes to manage containerised compute and storage.

“Bills went up 2.5x”

She is blunt about the previous cloud migration effort.

“Ten years into that [cloud] journey GEICO still hadn’t migrated everything to the cloud, their bills went up 2.5X and their reliability challenges went up quite a lot too – because if you spread your data and your methodology across so many different vendors you are going to spend a lot of time recollecting that data to actually serve customers.”

GEICO had shifted to the cloud, like so many organisations, in a bid to exit expensive on-premises data centres, she says; the insurer at the time was grappling with “a lot of vendors, a lot of fixed function appliances, a lot of complexity; separate SANs, explicit L2-L3 switch layers… all the ways using hardware and software that you can make a data center expensive.”

But the migration actually reduced availability, putting the firm “at the cumulative mercy of our clouds”. With “no consistent data strategy, no consistent hybrid stack”, it was hardly an improvement. That, in part, was the result of a lift-and-shift approach that moved applications and stacks to the cloud as they were, instead of refactoring them for what the cloud actually does well.

As Weekly sums up: “Just running legacy applications in the cloud is prohibitively expensive. Our use case just highlights that…”

Lots of data…

“We are a very large storage company, we have a LOT of data (most insurers do)... we do a lot of predictive analytics to understand risk,” she explains. Pushed on whether the migration is going to be the kind of CapEx-heavy approach that so many CIOs shied away from as the cloud became popular, Weekly says “there’s ways of doing ‘CapEx-heavy’ in an OpEx-fashion on-prem; and most companies in the financial services space are going to be willing to make CapEx investments over a period of time.”

Compliance is among the drivers for the overhaul. She notes that data also has to be retained due to state regulatory requirements, “because at any time, we could be asked by a given state to produce information that proved we didn't have bias, or we didn't violate any of the terms [of an insurance policy].”

Ease of access to data, and the ability to pull it out with minimal latency and at minimal cost, was a major factor driving cloud repatriation, she says (data “exfiltration” costs from public cloud are a well-worn concern).

On that investment conversation, Weekly says: “[At GEICO] we're looking at more and more advanced ML and AI techniques. The cost and structure of how the cloud is charging for those kinds of instances [is changing].” She cites “three year reserve pricing upfront… and it’s nine months to GET that capacity?”

“All of a sudden that ‘OpEx model’ is looking like a CapEx model…”

GEICO's cloud transformation: Building the team

Weekly has been hiring widely as GEICO builds out its team across a sweeping range of functions. A recent LinkedIn post by a colleague, for example, noted that GEICO was hiring across “nearly all domains [including] IaaS roles (dc, networking, compute core), PaaS roles (databases, storage, search, queues, caches), data platform (spark, iceberg, data lakehouse), billing + finance tech, marketing tech, or AI/ML…”

The company is heavily focused on building with open-source toolkits and using Kubernetes to provision compute. GEICO’s team is “building a homegrown placement solution leveraging CAPI” (Cluster API, an open-source Kubernetes project for managing cluster lifecycles). Application observability is also handled via open-source tooling, in this case Prometheus with Grafana using OpenTelemetry exporters, Weekly says.
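
For readers unfamiliar with that observability stack: Prometheus collects metrics by scraping a plain-text format from each service. The sketch below is purely illustrative and is not GEICO's code; it uses only the Python standard library rather than a real Prometheus or OpenTelemetry client, and the `Counter` class, the metric name and the `state` label are all hypothetical. It shows roughly what a counter looks like once rendered for scraping.

```python
from dataclasses import dataclass, field


@dataclass
class Counter:
    """Hypothetical stand-in for a client-library counter metric."""
    name: str
    help_text: str
    values: dict = field(default_factory=dict)

    def inc(self, amount: float = 1.0, **labels: str) -> None:
        # Each distinct combination of label values is its own time series.
        key = tuple(sorted(labels.items()))
        self.values[key] = self.values.get(key, 0.0) + amount

    def render(self) -> str:
        # Emit HELP and TYPE comment lines, then one sample line per series.
        # Note: this sketch does not escape special characters in label values.
        lines = [
            f"# HELP {self.name} {self.help_text}",
            f"# TYPE {self.name} counter",
        ]
        for key, value in sorted(self.values.items()):
            label_str = ",".join(f'{k}="{v}"' for k, v in key)
            suffix = f"{{{label_str}}}" if label_str else ""
            lines.append(f"{self.name}{suffix} {value:g}")
        return "\n".join(lines) + "\n"


# Hypothetical metric: count quote requests per state.
requests = Counter("quote_requests_total", "Hypothetical count of quote requests.")
requests.inc(state="MD")
requests.inc(state="MD")
requests.inc(state="VA")
print(requests.render())
```

In a production setup a maintained client library would handle this rendering, and Grafana would query the stored series from Prometheus rather than read this text directly.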

“Obviously we have many vendor stacks in these places today, and are working to migrate from legacy to the net new” she adds.

"Right-sizing patterns to an enterprise use-case"

We ask how she is finding it, getting people with the right skills in place (including for OpenStack). “I LOVE what we’re doing” she starts.

“I’m living case studies that are ten years of my life in terms of products and services I've built – right-sizing those patterns to an enterprise use case, and mapping dependency, complex legacy to new design pattern.”

Weekly adds: “Most engineers I talk to are [also] excited…when ‘Big Tech’ is enforcing less flexible working conditions, coming to a place where we want to build together [without those conditions – she is hiring for jobs that include the option to be fully remote] then that’s very attractive…”

Business leadership is firmly behind the shift. As the head of Berkshire’s insurance business, Ajit Jain, noted at its annual meeting back in May 2023, before Weekly and many of her peers leading the overhaul were hired: “Geico’s technology needs a lot more work than I thought it did.”

Despite its earlier “cloud migration” it still has “more than 600 legacy systems that don’t really talk to each other” he said; the company wants to cut this down to 15 or 16 systems in a major consolidation.

To nail that, Weekly continues to hire engineers who will, for example, “contribute to the upstream Linux community, and optimize runtimes for Java, GoLang, Node.js, Python, and .NET environments.”

Other open roles aim to have engineers in place using “ArgoCD or Flux to automate application deployment and lifecycle management in Kubernetes clusters… build the load testing harness using tools such as JMeter, K6, Locust, provision infrastructure using K8, public cloud, and engage with engineering teams to understand and dig into performance bottlenecks in the client, server, and database layers.”
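
To give a sense of what a “load testing harness” does at its core, here is a minimal sketch using only the Python standard library. It is not JMeter, K6 or Locust, and it is not GEICO's harness; the `run_load_test` function and its parameters are hypothetical names chosen for illustration. It calls a target function concurrently, then reports error counts and latency percentiles, which is the kind of output engineers use to start digging into bottlenecks.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from statistics import quantiles


def run_load_test(target, total_requests: int, concurrency: int) -> dict:
    """Call `target` total_requests times across `concurrency` threads."""

    def timed_call(_):
        # Time a single call and record whether it raised an exception.
        start = time.perf_counter()
        try:
            target()
            ok = True
        except Exception:
            ok = False
        return ok, time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(timed_call, range(total_requests)))

    latencies = sorted(elapsed for _, elapsed in results)
    # 99 cut points between percentiles; needs at least two samples.
    cuts = quantiles(latencies, n=100)
    return {
        "requests": total_requests,
        "errors": sum(1 for ok, _ in results if not ok),
        "p50_s": cuts[49],
        "p95_s": cuts[94],
        "max_s": latencies[-1],
    }


if __name__ == "__main__":
    # Hypothetical target: a stand-in for an HTTP request to a service.
    report = run_load_test(lambda: time.sleep(0.01), total_requests=50, concurrency=5)
    print(report)
```

Dedicated tools add what this sketch leaves out, such as ramp-up schedules, distributed load generation and realistic request scripting.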
