The Big Interview: Elastic CTO Shay Banon on suing AWS, returning to OSS, and GenAI

"... even saying this sentence makes me happy because it was so weird to compete with 'Elasticsearch'"

Elastic founder and CTO Shay Banon, hiking with his family.

Elastic creator Shay Banon took out a loan to register the “Elasticsearch” trademark as a solo open-source developer back in 2012. The search and analytics engine was permissively licensed, but like many optimistic free, open-source software (FOSS) developers, Banon hoped that a trademark would help protect against opportunistic abuse of the project’s brand.

Banon was disabused of that hope in 2015, when Amazon decided to offer a managed Elasticsearch service on AWS, with the hyperscaler’s CTO Werner Vogels even describing this in a social media post as a “great partnership between Elastic and AWS.” Banon was left fuming: it was the “most atrocious” example of AWS’s behaviour, he recalled this month.

Why? “There was no partnership” as he posted on X. “The problem was never on AWS taking Elasticsearch and providing it [as a managed service] it was calling it AWS Elasticsearch and implying that it’s their service (including stating it explicitly), it was a clear trademark infringement, but regardless of how much we tried, we had 1000 lawyers thrown at us…”

Elastic, where Banon is now CTO, threw some lawyers back. 

To cut a long story short, Elastic abandoned its open source licence and hunkered down for a fight with AWS. The two settled the trademark spat in 2022, with a “mutually agreeable” resolution – AWS would stop using the “Elasticsearch” term for services it had built on an Elasticsearch fork, OpenSearch, that it spun up after Elastic controversially ditched its permissive Apache 2.0 licence in favour of the SSPL and Elastic License.

Nine long years ago, Werner, but an elephant never forgets.

Ancient history, perhaps, but a live topic again after Elastic in late August reintroduced an "official" open-source licence option for customers in the form of the AGPL, an Open Source Initiative (OSI)-approved licence that is used by the likes of observability firm Grafana, which moved away from Apache 2.0 in 2021. It’s hardly the most popular licence, but it is nonetheless “officially” open-source.

As Stefano Maffulli, executive director of the OSI, said at the time: “We are delighted to welcome Elastic back into the Open Source ecosystem… 

“Their choice of a strong copyleft license signals the continuing importance of that license model and its dual effect: one, it’s designed to preserve the user's freedoms downstream, and two, it also grants strong control over the project by the single-vendor developers… one big question the ecosystem will ask is whether or not Elastic's software will be packaged by Debian and Fedora once again? That’s going to be a wait-and-see situation, as Elastic does have some trust to re-establish.

"That said, this licensing move is a big step in that direction.”

Sitting down to chat with The Stack in September, Banon said: “I had no problem with Amazon taking Elasticsearch and providing it as a service. It was part of the licence; we signed up for it when we chose Apache 2.0.

"I was just very surprised that they called the service Amazon Elasticsearch; I was very surprised that they were going around saying that it's in partnership with us. We were shocked by it, to be completely honest. Our customers were saying: ‘Amazon salespeople are coming in and saying that Amazon Elasticsearch is in partnership with you, so we should just move to Amazon.’ There was a lot of confusion in the market. 

“The worst thing that really hurt me was that their product sucked! They took Elasticsearch, and they didn't know how to provide it as a service; our engineers were getting really frustrated…” The issue clearly still grates.

Elastic and AWS: Forking good friends

AWS and Elastic have now established a partnership – as Banon put it over the summer, “Amazon is fully invested in their fork, the market confusion has been (mostly) resolved, and our partnership with AWS is stronger than ever. We were even named AWS partner of the year!”

Elastic, lest anyone need reminding, is a pretty big technology provider.

There have been over four billion downloads of the company’s search analytics software and as an enterprise it now has 20,000+ subscription customers. Not bad for a business started by someone trying to help their wife (a chef) develop an application focused on cooking recipes. 

Elastic has a potent reputation for search. It’s a recognised “Leader” in a Magic Quadrant for that category, yet also a Magic Quadrant “Visionary” for Application Performance Monitoring and Observability (“can transform, analyze, visualize and gain insights from increasingly complex, heterogeneous datasets while owning data”) – and what CISA describes as an “Advanced” provider of visibility, threat hunting, automated detection, and Security Operations Center (SOC) workflows via an increasingly popular Elastic SIEM platform in a competitive segment. 

Banon took Elastic public as CEO in 2018, then “got promoted back to CTO”, as he puts it, adding, to be clear: “As a CTO, I don't manage anybody."

“I’m, to a degree, like an individual contributor that has obviously the weight or the features of a founder and someone that has been around for a very long time. I enjoy trying to tackle big problems. I enjoy complicated systems, and the process of making them simple…”

Were you pressured out as CEO by investors, The Stack asks, perhaps tactlessly? “I think the board, investors were surprised that I decided to step down as a CEO!” he responds cheerfully. “I don't want to sound too egomaniacal or something, but I think I was an okay-plus CEO!” 

So why become a roving CTO again?

“I felt like someone else can do, you know, the earnings calls… all of these things. You end up just suddenly finding out that your week or your month goes away with tons of meetings. [This way] I can go back to focus on areas [like] the Elastic vision and strategy; on GenAI; on our huge change in architecture, in our serverless offering… There were a lot of chunky things that I think if we didn't do them, we would have problems, like, five years from now – and I really wanted to figure out how to do it!”

(Recent and ongoing work at Elastic has included the ground-up rebuild of its query language, as The Stack discussed with CPO Ken Exner here.)

Back to that new relationship with AWS. How is it, now, for Elastic and other companies that have built their own services around OSS they developed, working with the hyperscalers? And does he think Amazon has changed?

“I think they're learning how to engage with the open source community. I think they can do more. I think Google and Microsoft are still by far, much more advanced in their thinking around how to engage with open source companies versus Amazon. But I think they're improving as well. 

“We work really closely with them,” he says.

“We also compete the hell with them on OpenSearch – even saying this sentence makes me happy because it was so weird to compete with Elasticsearch – but then EVERYBODY competes with the large hyperscalers, because they do everything; we compete with Microsoft on security, we compete with Google on… I don’t know; Logstore. And that’s fine. We integrate with the [AWS] marketplace. We go to market together. We have joint customers that we sell to. It's become significantly simpler.”

So where’s his focus currently?

“The generative AI efforts are where we spend a lot of time,” he admits.

“In this world there are two big buckets. One of it is large language models and how fast they move and how capable they are.

“The other side of the equation is all the private data that organizations have, and what to do with it if you want to marry the two together and try to figure out, how do they work together… Like ‘here’s a Human Resources wiki, so I can ask the LLM what my pension plan is in the UK’. 

“Taking all that private data and making it accessible to an LLM is a really tough problem. We’ve doubled down on making Elasticsearch a great vector database but Elasticsearch is even better than a vector database because you can do hybrid search,” he says. What does this mean exactly?

Elastic, he says, can do multiple types of retrieval (text, sparse vector, dense vector and hybrid, for example; or more simply “vector search and non-vector search”) and users can “marry these results together; they can do security filtering, for example; role-based access control. When I ask my HR wiki ‘what's the salary of my manager?’ it should probably not answer that.”
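To make the idea of "marrying" results concrete, here is a minimal sketch in plain Python. It is not Elastic's API or implementation: the toy corpus, the hand-written embeddings, the role names and every function name are assumptions made for illustration, and reciprocal rank fusion is used simply as one common way to merge two ranked lists. The access filter runs before ranking, so a restricted document never reaches the fused list.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class Doc:
    doc_id: str
    text: str
    embedding: tuple          # hand-written toy vector, stands in for a model's output
    allowed_roles: frozenset  # hypothetical role-based access control list


# Hypothetical HR wiki, invented for this sketch.
DOCS = [
    Doc("pension", "uk pension plan contribution rules", (0.9, 0.1, 0.0), frozenset({"employee", "hr"})),
    Doc("holiday", "annual leave and holiday policy", (0.2, 0.9, 0.1), frozenset({"employee", "hr"})),
    Doc("salaries", "manager salary bands and pay", (0.7, 0.0, 0.7), frozenset({"hr"})),
]


def visible(docs, role):
    """Security filter: keep only documents the caller's role may read."""
    return [d for d in docs if role in d.allowed_roles]


def lexical_rank(docs, query):
    """Non-vector retrieval: rank by number of shared terms, dropping non-matches."""
    terms = set(query.lower().split())
    scored = [(len(terms & set(d.text.split())), d.doc_id) for d in docs]
    return [doc_id for score, doc_id in sorted(scored, key=lambda s: (-s[0], s[1])) if score > 0]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))


def vector_rank(docs, query_vec):
    """Dense vector retrieval: rank every visible document by cosine similarity."""
    scored = [(cosine(d.embedding, query_vec), d.doc_id) for d in docs]
    return [doc_id for _, doc_id in sorted(scored, key=lambda s: (-s[0], s[1]))]


def rrf(rankings, k=60):
    """Reciprocal rank fusion: sum 1 / (k + rank) for each list a document appears in."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda d: (-scores[d], d))


def hybrid_search(query, query_vec, role):
    candidates = visible(DOCS, role)  # filter first, then rank
    return rrf([lexical_rank(candidates, query), vector_rank(candidates, query_vec)])


if __name__ == "__main__":
    salary_query_vec = (0.7, 0.0, 0.7)  # pretend embedding of the question
    print(hybrid_search("manager salary", salary_query_vec, role="employee"))
    print(hybrid_search("manager salary", salary_query_vec, role="hr"))
```

In this toy setup an ordinary employee asking about their manager's salary only ever sees documents their role permits, while an HR user gets the salary document ranked first because both the lexical and the vector ranking agree on it.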

“Our effort”, he concludes, “is to take the hundreds of thousands, if not millions, of Elasticsearch developers and help make them vector database developers without them even knowing about it. So you can take the same APIs… and we'll do all the work for you.”

(Elastic released support for so-called “approximate nearest neighbor” search in 8.0 and added something called “Elastic Learned Sparse Encoder” in 8.8 for AI search that helps mitigate what is known as the vocabulary mismatch problem; i.e. even if the query terms are not present in the documents, the company says, it will be able to return relevant documents if they exist.)
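The vocabulary mismatch problem can be illustrated with a small Python sketch. This is not the Elastic Learned Sparse Encoder or any Elastic API: the expansion table below is hand-written and hypothetical, standing in for the weighted term expansions that a learned sparse model would produce, and the function names are invented for the example. It shows only the general principle that a query and a document sharing no literal terms can still match once both are expanded into weighted sparse vectors.

```python
# Hypothetical, hand-written expansion weights. A learned sparse model would
# produce weights like these from training data; here they are assumptions.
EXPANSIONS = {
    "car": {"car": 1.0, "automobile": 0.8, "vehicle": 0.6},
    "automobile": {"automobile": 1.0, "car": 0.8, "vehicle": 0.6},
    "repair": {"repair": 1.0, "fix": 0.7, "maintenance": 0.5},
    "maintenance": {"maintenance": 1.0, "repair": 0.5},
}


def expand(text):
    """Turn text into a sparse {term: weight} vector, including related terms."""
    weights = {}
    for token in text.lower().split():
        # Unknown tokens are kept as themselves with weight 1.0.
        for term, weight in EXPANSIONS.get(token, {token: 1.0}).items():
            weights[term] = max(weights.get(term, 0.0), weight)
    return weights


def sparse_score(query_weights, doc_weights):
    """Dot product over the terms the two sparse vectors have in common."""
    return sum(w * doc_weights.get(term, 0.0) for term, w in query_weights.items())


def exact_overlap(query, doc):
    """Plain keyword matching: count literal terms shared by query and document."""
    return len(set(query.lower().split()) & set(doc.lower().split()))


if __name__ == "__main__":
    query = "car repair"
    relevant = "automobile maintenance guide"
    unrelated = "holiday policy"

    print(exact_overlap(query, relevant))                    # no literal terms in common
    print(sparse_score(expand(query), expand(relevant)))     # still scores above zero
    print(sparse_score(expand(query), expand(unrelated)))    # unrelated text scores zero
```

The relevant document shares no words with the query, so literal keyword matching finds nothing, yet the expanded vectors overlap on terms such as "automobile" and "maintenance" and give it a positive score.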

As with most providers in this space, Elastic is galloping at a dizzying pace; the generative AI tool chain continues to evolve rapidly but, as Banon says, bridging the gap between sandboxed pilots and larger production use cases is all about managing data properly. Elastic’s heritage here, he thinks, will see it shine, and the former CEO – mercifully free of having to manage staff or earnings calls – seems happiest with sleeves rolled up, wrenching away with OSS community members on the challenge ahead.
