In this edition of the Cloudify Tech Talk Podcast, we talk with the Head of Cloud Products at Elastic, Uri Cohen. We explore the journey of moving from an on-prem to SaaS, and what specific steps are needed to accomplish this difficult task.
Intro: Welcome to the Cloudify Tech Talk podcast, taking a deeper dive into all things DevOps, all things toolchains, all things automation and all things orchestration.
Jonny: Guys, welcome to another episode of the Cloudify Tech Talk podcast. My name is Jonny, I’m your moderator. We have a very special use case, very special guests in this episode, but without further ado, I’m going to hand over to Nati Shalom our CTO to tell you a little bit more about it.
Nati: Hello everybody and welcome to the Cloudify Tech Talk podcast. With me is an old friend, Uri Cohen, who is now at Elastic and was also a product manager at Cloudify. Uri and I have spent a lot of time talking about product strategies, both in his days at Cloudify and in his days at Elastic, and I continue to learn from his experience and lessons on his journey and dilemmas, and I would say his decision-making process, which I find quite inspiring when I look at the Elastic journey. The discussion today will focus specifically on moving a product that was primarily delivered on-prem, Elastic, to SaaS, the challenges of that move and the lessons learned from it. I thought that journey itself raises some other interesting questions, like why you would need to be open source if you’re moving to SaaS. We’ll end with that, so we’ll open that discussion. So hopefully it’s going to be a very interesting discussion today. Without further ado, for those who are less familiar with Uri, I’ll let him introduce himself. Uri, go ahead.
Uri: Hey folks. Thanks, Nati. Great to be here. So my name’s Uri. I spent the last five years running the product management for Elastic Cloud Products and prior to that, I was a partner to Nati in various spaces and at Cloudify. My background is engineering; before becoming a product manager I was an engineer and a solution architect and kind of found my way into product management, a lot thanks to Nati by the way. So at Elastic, just a bit of background, we obviously have the core stack that most people know, Elasticsearch and Kibana. I think something that some people may not know is that in the last three, four years, we’ve been focusing also on building more curated experiences for specific use cases and we have a few of those now. These are basically dedicated teams that build into the stack and on top of it to cater for specific use cases like logs and infrastructure monitoring and APM, security like SIEM and endpoint protection, and website search. So for example, for website search, it’s not just about using the Elasticsearch APIs, but also how do you crawl websites and pull the data and provide a much more user-friendly interface for website admins to run search applications on their websites.
So we have a few of those solution teams and then the fifth team, which I’m a member of, is really the platform team, and our primary goal is to take everything that’s built upstream and deliver it as a service to our customers. This service is called Elastic Cloud, and it’s been operating for the last six years or so. Right now we’re located in 45 regions across three cloud providers, all the major cloud providers, and it’s a pretty intensive operation with tens of thousands of customers and clusters that we manage.
Nati: Excellent. This is definitely quite impressive and I’m sure many of the listeners will definitely learn from that experience. So, I think that the journey of getting to the point where you mentioned that you were six years in that journey since it started. Maybe let’s start with the beginning and kind of work through that process. So Elastic was delivered as an open-source, meaning that you download the product, install it and run it on your own environments, whether it’s cloud or on-prem environment, why worry about SaaS in the first place? Where did that come from?
Uri: Yeah, that’s a good question. Obviously, you know all the trends, people are going to cloud, they don’t want to mess with installing, purchasing, managing and accounting for the costs that are associated with owning infrastructure. So this is a wide industry move, we’ve all seen the growth of all the major cloud providers in recent years. Elasticsearch has been tremendously popular, and a lot of its users just say, hey, I don’t want to mess with running hosts and thinking about infrastructure and upgrading OSs and security vulnerabilities and things along those lines. It would be great if someone could do that for me. By the way, Amazon was the first one to offer Elasticsearch as a service. There is a lot of history there and then some interesting developments recently. So that’s the motivation for having that service: I download Elasticsearch and I like it and I use it, and then I realize that I don’t want to invest in managing the infrastructure that powers it. So I want to delegate it to a third party that will do that. Maybe the best way to do that is to use the actual vendor that knows the software inside out and knows how to run it best. So that was the thinking. Obviously back in 2015, when we launched Elastic Cloud, this was a much smaller portion of our customers, but today it’s a pretty massive business, with more and more customers, even the larger ones, finding their way into the hosted service.
Nati: Now what’s interesting, and I think you mentioned it and we’ll probably talk about it later in the discussion, is that the consumption model, as you mentioned, started with: it’s the same Elastic that I’m using, the same APIs and so on, but you’re taking care of the operational aspects. Like how do I manage it, how do I ensure security and the other aspects that you mentioned. But I think the trend is also to change the consumption model, because you mentioned a number of value-added services that you added and other things. When people move to SaaS, I’m expecting that they also expect to get more, from what I would say is a built-in and value-add perspective, than they would in an on-prem environment. Is that the case?
Uri: Yes. So definitely people have different expectations. When you hear the term managed service, what does it mean, right? How far does it go? There’s a spectrum of offerings and products. I think the best analogy that I like to use is if you look at Amazon, there’s EC2, and with EC2 you kind of offload to Amazon all the hardware management, but then everything beyond that is on you. Then you can go to a bit of a higher-level offering like RDS, which includes the infrastructure but also includes some aspects of managing the database, so you don’t need to install it. They give you tools to upgrade and manage it and it’s a more streamlined experience for a DBA. Then you can go all the way to a fully managed offering like, for example, Aurora Serverless, where all you get is a database API and the rest is really done for you. It scales automatically, it upgrades automatically, and you don’t need to do anything. So there’s a spectrum of offerings and it really depends on what customers are looking for, and I think as we evolved our own offering we realized that there are two types of users. The first type is our historical open source users, who know the platform very, very well, and when they come and deploy on a managed service, they still want the high degree of control that they had on-prem. They kind of tuned their application to Elasticsearch, so there are parameters that they want to set. They want to control the instance size and the number of replicas and shards for each index, and they expect that level of control.
On the other hand, there is a different type of user who is looking at things from a solution perspective. Like I want an APM solution, I want a monitoring solution, and then you start looking at products like CloudWatch and Splunk, for example. All these products basically have more of a fully managed offering, so you don’t need to think about all these aspects, and I think that the tricky thing is how you balance the two. You probably want to do both, but if you do the second one, does it stop you from doing the first one, and if you’re allowing that level of control, can you really be turnkey? If you serve RDS, do you need a separate product to serve Aurora? So this is kind of the question that we’re dealing with right now: how do we still retain the high degree of control for our traditional users, and there are tons of those still, but on the other hand, how do we become more turnkey for those users that don’t want to mess with managing shards, for example.
Nati: Very interesting. So basically what you’re saying is that it’s not just taking the same on-prem version and putting it in a managed fashion. There’s also this different set of value-added services that people are expecting. You mentioned Splunk, and you mentioned others, from which I think people get a better idea of what you mean, and in your case it will probably be doing both, because you do cater for both types of users. That in itself probably poses challenges, because it’s probably easier to take one of the options and not two, and taking two would probably make it difficult for the SaaS to be very efficient while also adding a lot of those value-added services. We’re not going to get into that. I think this touches other areas on the business side and the technical side. But one thing that I wanted to discuss is the approach that you’ve taken and the lessons learned since you started the journey, you mentioned 2015, six years into the journey. First of all, what is, from your perspective, the difference between running on-prem and running as SaaS? What are the changes from a product perspective and from, I would say, a sizing perspective? Where are the differences in terms of efficiency, which is very important in SaaS but was not as important when you’re running on-prem?
Uri: First of all, I think there’s a process. You start on-prem, and there are many companies that are in a similar situation, like HashiCorp for example, Grafana and a few others that basically started with an on-prem self-managed product and are now moving at scale into a hosted offering. So I think the first challenge is how do you structure your organization? Do you have a separate team that owns the hosted offering? If you do, what are the boundaries, how do you draw the lines between that team and the upstream product teams? What is the level of ownership the upstream product teams have over the deployment into the hosted service, and how far are you willing to go in adjusting the on-prem product to the hosted model? And again, here there’s a spectrum as well. You can say, for example, that every user is single-tenant. In the case of Elasticsearch, for example, they get their own cluster, which is more similar to how they would experience the product on-prem. But you can go all the way to a multi-tenant model, where you completely change or revamp the underpinnings of your product so that customers basically share the same environment at the product level. For example, Grafana is doing that with some of their offerings. And again, there’s a spectrum here and it really depends on how far you want to go. I think another important aspect is when you do need to make changes.
For example, one thing that happened with Elastic, which I really liked, and this is something that took us a while to get to from a cultural perspective: the original hosted service was based on an acquisition we made back in 2015 of a very small company in Norway called Found, a super talented group of individuals that were very, very proficient and understood search very well. In fact, they contributed some code to Elasticsearch and they really, really knew the product inside out. But because they were an external company, their mindset was: okay, there’s an upstream product that we have little influence over, they will release versions every once in a while, and what we need to do is to take those versions and do whatever we can to sell them as a service the best way we can. Sometimes, for example, certain features or certain capabilities of the upstream product do not lend themselves well to a hosted offering. It starts with various configuration options that, if you let users set them on a hosted service, just break things. For example, Elasticsearch has a discovery mechanism for peer nodes, and if you let customers control that then they may break their own cluster and shoot themselves in the foot without even intending to.
So you need to kind of think about what level of functionality you expose, but it goes deeper. How far do you want to go in adjusting the product to be able to run on a hosted service? Say the upstream team creates a certain feature and that feature has certain networking requirements, but the way you’ve structured your architecture really doesn’t allow for that feature to be operated. In case you’re an external company, you have no choice, you have to create workarounds or just block the feature. But I think where the synergy came in is that because the team became part of Elastic, it took us a while, but gradually we started influencing the upstream product and working in tight collaboration with the core Elasticsearch and Kibana teams to make sure that they build their products in a way that is optimized for a hosted service, and specifically our own hosted service.
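To make the point about exposed configuration concrete, here is a minimal Python sketch of how a hosted platform might separate user-supplied settings into ones it accepts and ones it keeps for itself. The function name and the list of reserved prefixes are hypothetical and chosen only for illustration; they are not Elastic Cloud’s actual policy. The setting keys in the example are ordinary Elasticsearch setting names.

```python
# Hypothetical list of setting prefixes that a hosted platform might reserve.
# Illustrative only; not Elastic Cloud's actual policy.
PLATFORM_MANAGED_PREFIXES = ("discovery.", "network.", "path.")


def split_user_settings(user_settings):
    """Split user-supplied settings into (accepted, rejected) dictionaries."""
    accepted, rejected = {}, {}
    for key, value in user_settings.items():
        # str.startswith accepts a tuple and matches any of the prefixes.
        if key.startswith(PLATFORM_MANAGED_PREFIXES):
            rejected[key] = value
        else:
            accepted[key] = value
    return accepted, rejected


requested = {
    "discovery.seed_hosts": ["10.0.0.5"],  # peer discovery: kept by the platform
    "action.auto_create_index": False,     # low-risk setting: passed through
}
accepted, rejected = split_user_settings(requested)
print("accepted:", accepted)
print("rejected:", rejected)
```

A blocklist is the simplest form of this idea; a service that wants tighter guarantees could invert it into an allowlist so that new upstream settings are hidden until someone reviews them.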
Nati: Yes, I think you mentioned a couple of things. Let me maybe wrap up what you just said, just to make sure that we’re clear on that. So the first approach, and I think we’ve seen it also with other companies like HashiCorp, is to build a separate team that will deal with the SaaS part of the product, and by definition that team will look at the product from the outside, not from the inside, and will try to work around, maybe that’s one way to call it, some of the operational aspects to make it work in a SaaS kind of model. And yes, they will influence the product, but it would still be more an outside look into the product than an inside look into the product, and we’ll touch on that later. The second point that I think you mentioned is the differences in terms of control. Obviously, one thing that changes when you own the operation versus the customer owning the operation is that you have to think about which knobs you want to open and which not, because basically you’re on the front line of support to deal with those issues. I think you mentioned peer discovery and other items where, if you open that up, the chances that people will damage themselves would probably be higher, and that is an exposure to you as an operator of that service. That’s probably something you don’t want people to mess with.
You also mentioned, if I recall correctly, some of the network aspects and cost sensitivity. We didn’t really talk about it yet, so maybe you can touch on that. But I think those are the main points: one, it’s very typical to start with an external team building the SaaS, but as soon as you start that type of journey, you’re looking at the product from the outside, which has its own limits, and we’ll touch on that later. And the other thing that you mentioned is the control level, the difference between what happens when you own the product and the operation versus the customer owning the operation. Obviously, the degree of flexibility that you want to open up is going to be lower when you, meaning Elastic, own the operation. So first of all, did I summarize it well?
Uri: Yes, I think so. One comment on teams: inevitably you have to create a team that builds a platform, because there’s much more to it than just running Elasticsearch and ECP, for example. There’s metering and billing and customer onboarding and customer analytics and security, and a bunch of other platform-level concerns that you have to account for, like just operating the thing and making sure you have enough capacity with the providers in each of the regions. These are all hard problems that you need to solve. So you have to have a dedicated function within the org to make sure these things are handled the best way you can. I think the interesting question is really where you draw the line between the product team, or the upstream teams, in our case the Elasticsearch and Kibana teams, and the platform team, and how turnkey it is, or how much you can abstract away for that product team in terms of APIs and interfaces to allow them to manage the product on their own. The ideal situation is that the upstream team has a set of APIs and interfaces that they can use to go and deploy on the platform.
If you think about something like Kubernetes, for example, there’s a very clear contract between the platform and the people who run on it, and you don’t need to make code changes to Kubernetes when you’re running a pod, because as long as you conform to the technical standards that are imposed by the Kubernetes platform, you’re good. You have YAML files, you have kubectl, there’s a very neat way to extend it with CRDs and whatever, and there is a very clear separation between whoever is creating and operating Kubernetes and whoever is running applications on top of it. So that’s kind of the holy grail, if you can get to that point where the upstream team just has a set of interfaces and they use those and rarely require help from the platform team. I think that’s a great model because then the ownership lines are very, very clear.
Nati: But one thing, I think, especially for products that grew from an on-prem environment, is that there are nuances that are different and that will affect the core product. I think you mentioned cost sensitivity. I remember last time I spoke with Aton from optic glass about it as well. Even, for example, the time it takes for a container to boot up was a key factor that he needed to optimize, something that obviously, when you’re running as a SaaS and you’re measured on an end-to-end, say, solution, becomes a factor. I think you mentioned also the footprint and the effect of bandwidth costs. For example, when you run in an on-prem environment you care less about it, but when you’re running as a SaaS that starts to pile up, especially when you start to accumulate multiple instances at the scale that I think you mentioned. So what are the clear differences that you need to think about from a core product perspective as you move from on-prem into SaaS? I’ll come back to the Kubernetes example, which I think is very good. You mentioned, I think, that at Elastic it was the footprint and the bandwidth, so a lot of those cost-sensitive types of concerns. Can you touch on that? And also maybe other things that product teams would need to think about as they do that type of transition?
Uri: Yes, sure. So cost is a major one, but beyond cost, for example, when customers get started with you as a hosted service, you want to make it as easy as possible. Part of that is the product experience: how easy it is to sign up, how much friction you have in the process, and in our case the key measure is how fast customers can ingest and get value from their own data. This is the key kind of motion for us when we acquire customers. Now part of that is really, can you do a free trial? Can you give them time to experiment and use the product, even as a free tier, which a lot of vendors do? You can think of it like this: you want the top of your funnel to be as big as possible, and most of those customers would stay in the 10, 20, $30 a month, even a hundred dollars a month type of range. But if out of every hundred customers you get the one customer that would pay you a hundred thousand or a million, then this is what it’s all about. So it’s really, really important to remove all these obstacles and to make sure that you can actually cater for that massive number of customers coming from the top of the funnel and allow them to experience the product the best way they can, but still in a way that would be cost-efficient for you.
If the minimum size of an Elasticsearch node were 64 gigs, then every trial customer would cost you tons of money and the customer acquisition costs would just not make sense. So in this case you need to make sure that you can actually run very small nodes of Elasticsearch and still give the user a reasonable experience, so they can experience the product in a good way, and some of those that are happy will actually grow and become your big paying customers. This kind of touches on the previous point; this is where the boundaries are not clear. When you’re building an on-prem product, of course, 64 gigs is a non-issue, even when you are deploying on your own, and most people would start with 64 and then 128 gigs and just deploy this. But when you have this motion of thousands of customers coming from the top of your funnel who want to experience the product for free, then this becomes a really, really big concern, and at a cultural level you need to make sure that the product team, the upstream team, really understands that this is important, and that they optimize for that and they test those configurations.
When they add a new feature, they think about how much memory it’s going to consume and how much disk it is going to consume, and the same thing with data transfer. These are things that get charged by the cloud provider and they are a headwind to your costs. So if you create a feature that is massively using the network and replicating data, this will cost you in terms of data transfer charges. So you need to be aware of all of these cost vectors, and you need to optimize those configurations so that your SaaS or cloud selling motion will be as efficient as possible.
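The arithmetic behind the trial-cost argument can be sketched in a few lines of Python. Every number below (trial count, price per GB of RAM, data transfer volume and price) is a made-up placeholder rather than a real cloud or Elastic price, and the function name is hypothetical; the only point is how the minimum node size multiplies across the top of the funnel.

```python
def monthly_trial_cost(trials, node_ram_gb, price_per_gb_month,
                       transfer_gb_per_trial, price_per_transfer_gb):
    """Rough monthly cost of hosting free trials: memory plus data transfer."""
    memory_cost = trials * node_ram_gb * price_per_gb_month
    transfer_cost = trials * transfer_gb_per_trial * price_per_transfer_gb
    return memory_cost + transfer_cost


# All figures are hypothetical placeholders.
TRIALS = 1000
for ram_gb in (64, 4, 2):
    cost = monthly_trial_cost(
        trials=TRIALS,
        node_ram_gb=ram_gb,
        price_per_gb_month=5.0,
        transfer_gb_per_trial=10,
        price_per_transfer_gb=0.10,
    )
    print(f"{ram_gb:>3} GB minimum node: about ${cost:,.0f} per month for {TRIALS} trials")
```

With these placeholder prices the memory term dominates, which is why shrinking the smallest viable node matters so much; a feature that replicates a lot of data would grow the transfer term instead.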
Nati: Yes, the free tier is definitely stretching it to the limit, I think that’s pretty clear, but I think the effect of it on the product is also very important. You also mentioned multi-tenancy versus a single tenant per user, and we’ll also talk about the introduction of Kubernetes. The reason why I think it’s interesting, and I’m going through that thought process myself, is that we have different ways to split. Multi-tenancy is a way for us to split the product between users and still give isolation, but in a relatively more efficient way, because we’re basically splitting the product, not the hardware, and that gives us a little bit more granularity in how we can optimize the usage per user, again touching on the cost aspect. But in the world of containers and microservices, part of that job is done for us by the platform, because the Kubernetes scheduler, in this case, can split workloads on the same machines between users. So do we have to optimize for multi-tenancy in the traditional way, or, when we run in containers, is it enough to let the Kubernetes scheduler deal with that and not worry about it from, I would say, a SaaS perspective?
Uri: So I think, Nati, this is a great question. I think the underlying question is where do you draw the line and what type of multi-tenancy you’re implementing. Kubernetes can allow you to be multi-tenant at the hardware level. So while you’re still using containers and every customer has their own container, the multi-tenant aspect is at the hardware level: you can share hosts between multiple customers. Another level of multi-tenancy is really application-level multi-tenancy, right? So if I’m using Elasticsearch as an example, if you’re multi-tenant at the container level then every customer has their own cluster or their own nodes, and they run in containers. Yes, they share hosts, but Kubernetes and Docker give us a reasonable level of isolation and the ability to constrain resources so that you can minimize the noisy neighbor effect. When you go to the application level, it gets more interesting, because at that point you need to build and cater for multiple users running on the same instance, and in the case of Elasticsearch this would be like a shared cluster that you host multiple customers on. I don’t think there’s a right answer here. There are pluses and minuses for each of those models.
If I’m relating this to the previous point about costs, when you are operating in a fully multi-tenant environment at the application level, you can make the cost of acquiring an additional customer and onboarding them to the platform a lot smaller. On the flip side, it’s much harder to give customers a consistent experience because they’re much more exposed to noisy neighbor issues. So some vendors, like Mongo for example, actually do both: they have a lower tier, which is free or very cheap, that is multi-tenant, and then as soon as you go above a certain level, you get a dedicated environment. In our case at Elastic we’re still multi-tenant at the host level, but every customer gets their own cluster, and we’re optimizing Elasticsearch and Kibana to be able to operate even at very small instance sizes like four gigs or even two gigs of RAM. So it gives us the ability to cater for the lower end of the market and give customers a reasonable free trial experience.
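Host-level multi-tenancy, where each customer keeps a dedicated cluster but clusters share hosts, can be illustrated with a small first-fit placement sketch in Python. This is a toy under stated assumptions: it looks only at RAM, the function name and sizes are hypothetical, and it is not the algorithm used by the Kubernetes scheduler or by Elastic Cloud.

```python
def place_clusters(cluster_ram_gb, host_ram_gb):
    """First-fit placement of per-customer clusters onto shared hosts.

    Returns a list of hosts, each represented as the list of cluster
    sizes (in GB of RAM) placed on it.
    """
    hosts = []
    for size in cluster_ram_gb:
        if size > host_ram_gb:
            raise ValueError(f"cluster of {size} GB does not fit on any host")
        for host in hosts:
            # Put the cluster on the first host with enough free RAM.
            if sum(host) + size <= host_ram_gb:
                host.append(size)
                break
        else:
            # No existing host had room, so open a new one.
            hosts.append([size])
    return hosts


# Hypothetical mix: mostly small trial clusters plus one large customer.
requests = [2, 4, 2, 8, 4, 2, 64, 4]
hosts = place_clusters(requests, host_ram_gb=64)
print(f"{len(requests)} clusters placed on {len(hosts)} shared hosts")
for index, host in enumerate(hosts):
    print(f"host {index}: {host} ({sum(host)} GB used)")
```

The smaller the minimum cluster, the more trial customers fit on each host, which is the efficiency lever described above; application-level multi-tenancy would go further by removing the per-customer cluster altogether, at the price of more noisy neighbor exposure.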
Nati: Very interesting. So I think the comparison to Mongo is also very interesting. One aspect that you mentioned is that before Kubernetes and before containers, if you were running as SaaS there was no option, you had to build things as multi-tenant, and that was a pretty big undertaking on the product side, to change from non-multi-tenant to multi-tenant, because you now had to split many resources that weren’t really shareable before. That was a big headache and required a lot of refactoring in many products, and SaaS became a big barrier. In a containerized world, especially with Kubernetes, there is the option to let the platform deal with some of the separation at the hardware level, in which the resources can be shared between multiple users on the same, if you like, VM or hardware, but still be completely isolated from one another. Still, the degree of efficiency in terms of maximizing the number of users that you can put on the same, let’s say, infrastructure can grow even in that model if you’re making your product itself closer to multi-tenant, and the difference between what you said about Mongo and Elastic is interesting. One option is to make, let’s say, the containerized version of the product efficient enough that it can fit into those different sizes, and then you don’t have to deal with fine-grained multi-tenancy; you only need to deal with the different sizes in terms of SLAs, the sizes that you give to, let’s say, large, medium or small customers. Versus, in the context of Mongo, they offer the lower tier with fine-grained multi-tenancy and then the higher tier on, I would say, more platform-level multi-tenancy, which means that you have more control over, or isolation of, the resources that you’re going to get, and obviously also less of the noisy neighbor kind of effect.
So it’s interesting, but it’s not as clear-cut as it was before. You could also look at that from a timing perspective, meaning that if you’re just starting with that transition, maybe you don’t have to be efficient to the final degree of multi-tenancy and it’s good enough to rely on the platform, which reduces the barrier, and only then start to worry about fine-grained multi-tenancy. So my take from this is that it’s less obvious that you have to go through this exercise of fine-grained multi-tenancy, and there are at least two options for how you deal with that, which I think you mentioned here, which is interesting. I hadn’t thought about that. Did I get it right?
Uri: Yes, I think so, and at the end of the day it’s a trade-off. You want to be more cost-effective for your new users, and this will require you to make a pretty large investment at the product level.
Nati: And at the end of the day, it all comes down to how many customers you can put on a certain infrastructure. So from what you mentioned, there’s more than one way to do that. One of them is to make the product itself efficient in terms of its ability to fit into different footprints. The other one is to make it efficient at the fine-grained multi-tenancy level. Obviously there’s a different degree of efficiency that you could get from each one of them. But it might be that, at least up to a certain point, it’s good enough to go with efficiency at the footprint level before you have to get into fine-grained multi-tenancy. Obviously, fine-grained multi-tenancy does add much more complexity to the product architecture, and if you could avoid that then why do it? Obviously if that’s not available, then you’ll have to go through that exercise. Okay, cool. So we started talking about Kubernetes from the multi-tenancy perspective, kind of in reverse, but originally, when you started the journey in 2015, Elastic wasn’t running on Kubernetes. So maybe share with us the journey: what was there before you moved to Kubernetes, and why or when did you move to Kubernetes?
Uri: So first of all, we’re still not running on it at the hosted service level; it’s still a journey. So yes, even before 2015, when the original company we acquired started their service, [inaudible 36:32] was kind of getting momentum, along with the whole container movement. These folks decided to take a bet on it and just build everything around Docker. So in a sense, they built their own kind of Kubernetes that is optimized for Elasticsearch and Kibana. And Kubernetes, I think just in like 27, there was a lot of confusion in the community and the container orchestration wars about who would emerge as the winner, and I think only in 2017 or even 2018 was it clear to everyone that Kubernetes is the winner in that respect.
Nati: But yes, there were multiple platforms… sometimes we forget that journey, and that things that maybe look obvious today were not as obvious back then, and even today I’m starting to see companies questioning Kubernetes to some degree. But that’s a different type.
Uri: A different discussion, yes. There’s Nomad for example, and Consul, the HashiCorp stack, which is pretty good. So there are other options still today, but it’s still Kubernetes that’s the de facto standard and everyone is measuring themselves against it. Another thing that made us hesitate about the move is that Kubernetes itself was not really ready for stateful workloads. Only later versions started introducing things like persistent volume claims and StatefulSets, constructs that allow you to manage stateful applications properly on Kubernetes. By the way, there are still things that are missing today, and if you go and ask even heavy Kubernetes users how many of their stateful workloads, like databases and data stores, they run in Kubernetes, I suspect you will find that not a lot of them do that. So there are still things to do, even at the Kubernetes level, to cater for those applications. But it’s getting there; there’s the Container Storage Interface and things along those lines that make it simpler for stateful workloads to run there.
Nati: It’s important to note, again, for those with a short memory, is that when Kubernetes started, it was really because of the dynamic nature of Kubernetes, it kind of assumed that your workload is can be relocated and shut down during the process and help to actually, make sure that your program can cater to that. It was very much catering to stateless applications and in many cases, the stateful side was living outside of Kubernetes. The later evolution was to add a persistent volume and storage interface, as I think you mentioned, and that’s still kind of a partially in progress. They’re still, I think majority are running I would say persistent services outside of Kubernetes, like RDS and others but getting the assistant services running in Kubernetes is a big challenge for someone who has been managing a lot of clusters of data. It’s a pretty big challenge. So I can understand that the decision here wasn’t moving into SaaS and whether to manage it in Kubernetes, it’s also how much the platform itself can deal with the type of persistent volume that you’re describing which is another dimension that not everyone has to deal with but definitely something that is a hardware to carry. So, where is Kubernetes today in that regard, how much it is, how much you could rely on it to run high volume stateful services?
Uri: There are still some things to do, but I think that if you’re operating in a cloud environment and can use, for example, attachable volumes like EBS I think there’s pretty good support. When we kind of decided to go there one thing, what was scripted on us is that we can take a platform that’s been built outside of Kubernetes and been continuously evolving over a few years and just magically retrofitted into Kubernetes. So it was clear that we’re going to need to do things in a very native way to Kubernetes and luckily at that point the old kind of operator design pattern started to emerge, and it just made a lot of sense to kind of do it this way.
Nati: I’m going to double click on that.
Uri: So Operator is a design pattern that is geared towards, let’s call it like this: there’s day zero, where you install the thing and run it on Kubernetes. Initially it was just your collection of YAMLs, and then Helm came up and catered to a lot of those complexities and simplified things. But this is really just how do I take a workload and install it in Kubernetes in a consistent way. Now that is just the start of your journey. There are day one and day two operations: how do I upgrade? How do I scale? How do I move things around? And this is where the operator pattern comes into play. At the very basic level it’s just another workload that you run on Kubernetes, but it has provisions to manage other workloads. As you install it, it adds extensions to the Kubernetes API, and it ties into the Kubernetes event loop, where you evaluate the state and converge over time; it just does it for a very specific type of technology. So it started to emerge. I think it was CoreOS, which was later acquired by Red Hat, that initially came up with that, and now there’s a very clear way of building operators. There are frameworks for doing that and there are marketplaces for operators. So it was clear that this is probably the way we want to go, and if you’re a Kubernetes user, that’s how you would expect to manage Elasticsearch on Kubernetes.
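The "evaluate the state and converge over time" idea can be illustrated with a minimal sketch of a reconcile loop. This is plain Python with no Kubernetes client involved, and it is not how ECK or any real operator is implemented; the names `ClusterSpec`, `ClusterStatus`, `reconcile` and `run_until_converged` are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ClusterSpec:
    """Desired state, as a user might declare it (hypothetical fields)."""
    nodes: int
    version: str


@dataclass
class ClusterStatus:
    """Observed state of the managed workload (hypothetical fields)."""
    nodes: int = 0
    version: str = ""
    log: list = field(default_factory=list)


def reconcile(spec: ClusterSpec, status: ClusterStatus) -> bool:
    """Take at most one step toward the desired state.

    Returns True when observed state already matches desired state.
    """
    if status.version != spec.version:
        status.log.append(f"upgrade {status.version or 'none'} -> {spec.version}")
        status.version = spec.version
        return False
    if status.nodes < spec.nodes:
        status.nodes += 1
        status.log.append(f"scale up to {status.nodes}")
        return False
    if status.nodes > spec.nodes:
        status.nodes -= 1
        status.log.append(f"scale down to {status.nodes}")
        return False
    return True


def run_until_converged(spec: ClusterSpec, status: ClusterStatus, max_steps: int = 100) -> int:
    """Call reconcile repeatedly; return how many corrective steps were taken."""
    steps = 0
    while not reconcile(spec, status):
        steps += 1
        if steps >= max_steps:
            raise RuntimeError("did not converge")
    return steps


if __name__ == "__main__":
    spec = ClusterSpec(nodes=3, version="1.0")
    status = ClusterStatus()
    run_until_converged(spec, status)
    print(status.log)
```

The same loop handles installation, upgrades and scaling, which is the day one and day two point: the user only edits the desired state, and the controller works out the steps.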
Nati: So it’s a way to manage a service through, let’s say, kubectl and the like, rather than having each product provide its own, if you like, operational API and interfaces that are completely different. The design part here was to provide some sort of a framework, so that the way you manage services within Kubernetes, external services that is, would be more consistent and use, I would say, at least a common framework, so that it wouldn’t be too complex for users to learn a completely new set of tools for each product. I think that’s kind of the idea. It kind of sounds like the JMX days; JMX was mostly on the monitoring side, less on the operations side, but it was a similar idea of having some consistency in how you manage the service, not just how you deploy the service, which is interesting. So that’s one thing that you needed to change.
Uri: Yes, because then it was clear that we’re going to basically start greenfield, like you can build an operator from scratch that is geared toward Kubernetes, and we did that. We released a product called ECK, it stands for Elastic Cloud on Kubernetes. Right now it’s mostly an operator, and obviously in the future we will add a lot more capabilities to go beyond the operator layer. Right now we’re hard at work looking at a unified architecture that will allow us to leverage Kubernetes, again, with the benefits there, even for powering our hosted service. It may sound simple, but when you’re running tens of thousands of clusters with tens of thousands of users across 45 regions, across, again, tens of thousands of hosts, you can’t just rip it all out and replace it with Kubernetes. So it needs to be done very carefully. Talking about changing the wheels while the car is moving, this is a really extreme version of it. So we need to make sure that we have the right constructs in the software to do that, but as I said, operationally, we feel comfortable with that. It’s definitely a direction that we’re going towards, but it’s going to take time.
Nati: So that touches, I think, on another motivation that is unique to those who are providing both on-prem and SaaS versions. One of the challenges that I think you have, and many still need to support both models, is consistency between the SaaS and the on-prem environment in terms of the operational aspect. I mean, basically what happens is that when you move to SaaS, people see, oh, this is managed very nicely, and they want not just the Elastic part, they also want the management of Elastic to be part of their on-prem environment. How do I get there? And this is where it’s becoming an interesting choice: if you have everything in Kubernetes, it’s easier to at least create that consistency, also at the operational level, between the on-prem and SaaS environments. Definitely not to the degree of cost efficiency that I think you’ve touched on earlier, but very close to that. So that’s another motivation to move to Kubernetes?
Uri: Not quite, because; so two pieces here. One is we have a product called Elastic Cloud Enterprise, and this is a self-managed product that we released back in 2017, and this product really uses the exact same code base that we use to power our hosted service. And for customers, it’s a spectrum. There are some customers who cannot go to the cloud for various reasons, a lot of that is related to regulation and how compliant you need to be. For example, if you’re selling to the DOD in the US, they may not even be able to connect to the internet for some workloads. So you need to have a solution for them as well, and between a fully managed solution and a totally self-managed solution there are things in the middle, and Kubernetes and our ECK offering is one such offering. But ECE, Elastic Cloud Enterprise, is the other product that we also allow you to purchase: essentially installing it in your own environment and basically having your own copy of Elastic Cloud that you manage. So between ECK and ECE, the Kubernetes version and the self-built orchestration version, over time we intend to converge the two, as I mentioned, but today it’s really the choice of the customer. If you’re standardizing on Kubernetes and you want things to run there in a consistent manner, then probably ECK is the right choice for you. If you’re not, and you want a more turnkey solution that wouldn’t force you to learn another technology or rely on Kubernetes knowledge, then you can take ECE and basically install a fully featured and complete solution that includes both the orchestration and the operational pieces.
Nati: So right now, basically what you’re saying is that, first of all, from an evolution perspective, the on-prem version of the managed product is different. It’s basically the older version of the product that evolved over time, and that’s what’s available right now, and you’re probably starting the journey to move to Kubernetes. Where are you in that journey? And also, if you would make that decision today, given the maturity cycle that I think we discussed earlier, would you have it all in Kubernetes, or would you still run it separately as two different products, because there are still big differences? So let’s start with the first part, which is where you are on the journey with Kubernetes, and second, let’s talk about what your choice would be if you made that same decision of supporting on-prem and SaaS today.
Uri: Yes, sure. So in terms of Kubernetes, as I mentioned, in the self-managed category we have two orchestration products that are available for customers. One is the Kubernetes version, it’s called ECK, and the other one is the non-Kubernetes version, it’s called ECE, and customers have the choice. They can use one or the other, or both, whatever suits them. So that’s fine, at least from a customer perspective, you have the choice. Now over time we want to converge the two, so we want there to be a Kubernetes-based offering that both powers our own hosted service and is also available to customers, and it’s just one converged product. So where we are today is that we have started the journey and are working on a few core architectural pieces that will be the foundation for this transition, and this will take time. This is an effort measured in years, not in months. I’d say we’re not at the very start, but we’re not in the middle either. We’re closer to the start than to the middle. Then what was the other question?
Nati: Yes, I was asking, in retrospect, what would be the decision? Like if I’m starting my journey right now and I want to support both on-prem and SaaS, would going only with Kubernetes be the right choice, or do I still need to think about on-prem differently?
Uri: So if I look at the trade-offs, if you’re going with Kubernetes, then obviously a lot of the work is already done for you. You don’t need to mess with things like container management and networking and resource isolation and all that stuff. There’s a very clear and established and proven framework to work with, which means that you can focus on the differentiators of your business. So this is a great benefit. I think there’s also another benefit in terms of skillsets: because Kubernetes is very popular, it will be very easy to find talent that will help you build that. On the flip side, if you’re talking about a hosted offering, I think there’s a pretty obvious choice here. You should use container orchestration if you’re running on containers, and probably Kubernetes would be your choice because it’s the most mature and popular framework. It gets a bit trickier when it comes to self-managed products, because now you need to think about how complete you want your offering to be. Do you want to rely on the customer having a Kubernetes environment, so that only if they do can they run your on-prem offering, or do you want to go and bundle or embed Kubernetes in your self-managed product and provide a more turnkey experience? They install your product, and yes, Kubernetes is there under the hood, but they don’t have to know about it if they don’t want to.
Nati: Yes, that’s a good point. So basically, you’re saying that today, if you want to support both on-prem and SaaS, running on Kubernetes gives you that level of consistency between the two environments, and you don’t necessarily have to worry about running a completely separate product, which is obviously a big, big investment. The second part is what do I do with customers that don’t have Kubernetes, especially on the on-prem side, where I now need to introduce them to Kubernetes or bring it with the product. For that you could have a pre-packaged Kubernetes within your product offering, for customers who just want to run your product. You could still leverage Kubernetes, but you don’t have to expose that to customers; you have it bundled within your product offering, so it becomes just another piece of software for them and they only see Elastic being managed. In that case, you don’t necessarily get to the point where you have to interface with the customer’s Kubernetes, and the differences there and all the complexity that comes with it. So that’s a good trade-off, and for that part you’d probably want a lighter version of Kubernetes, because it only runs Elastic. There are choices in the market today for a Kubernetes version that caters to that standalone mode, and it’s not necessarily OpenShift as the only choice.
Uri: At the end of the day I think it’s a business decision: how much market are you losing by not having the fully bundled offering? At least in cloud environments, there are managed Kubernetes offerings, EKS, GKE.
Nati: How much of the Elastic users are now running on Kubernetes? Like in ratio, not on business.
Uri: Yes, I can’t expose exact ratios. I’ll tell you one thing: the most popular binary format that’s downloaded from Elastic is container images, and if you think about that, most people would probably use a container orchestrator to run containers.
Nati: And in terms of trend, how do you see the trend towards that? Maybe that is something that we could share.
Uri: Yeah, I think so, at least for self-managed.
Nati: Again, the reason why I’m asking is not to pick on Elastic. I’m just trying to correlate; I’m seeing differences in how much organizations are adopting Kubernetes and to what degree, and obviously stateful services are a very big sign of maturity of the platform in this case. So that trend could give a clear indication from the source in terms of where we are in the maturity cycle. That’s why I’m asking.
Uri: Yes, so as far as Kubernetes adoption goes, I think the notion people have that it’s everywhere is maybe a bit rosier than the actual reality. Every large organization has a Kubernetes initiative and containers, and they want to run on it, but then you start to dig one layer deeper and it’s like, okay, what kind of workloads do you run on it? Are these customer-facing or business-critical? Are you running stateless or stateful workloads? What is the size of the deployment you plan? And then, do you have a centralized cluster, or does every team have their own, and what tooling do you use to make it available to them? You start digging deeper and you see that a lot of those organizations are still in the very early phases of their journey. They’re not mature enough to go and run a business-critical Elasticsearch deployment on that environment. So for us, as a company, I think one thing we really, really want to do well is be there for our customers wherever they are. If they’re on Kubernetes and standardized on it, they have a great option. If they’re on cloud, regardless of which cloud they’re on and which region they’re on, we want to be there for them. That’s why we have 45 regions and three cloud providers, and in every major region that exists, we’re there. If they want to run things on their own completely, they can download Elasticsearch, and if they want the more orchestrated version, but they’re not really standardized on Kubernetes, they have another option. So there’s a wide range of options that we want to have, because we believe that every customer is in a different place and we want to meet them there.
Nati: Very interesting. So that brings me to, by the way, very interesting discussion and definitely experience, I think. I personally kind of learned a lot from that. So let’s move on to marketplace versus SaaS, cloud providers now offering the marketplace that is becoming a popular choice by customers. I think I’ve heard you talking about it as well. The question that I wanted to provoke here is, if I have a marketplace, why should I worry about building my own SaaS, maybe as a starting point, and then we can talk about some of the nuances and differences between a marketplace and SaaS, but let’s start with that provoking message or question.
Uri: Yes, sure. So first of all, you should separate billing and metering, and I think if I have to quantify it, the bulk of the work is actually in metering, not in billing: measuring customers’ usage of your software.
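The separation between metering and billing can be sketched as two independent steps: one that only aggregates usage, and one that only prices the aggregated totals. This is an illustrative sketch, not how Elastic does it; `UsageRecord`, `meter`, `to_line_items`, the resource names and the prices are all made up for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageRecord:
    customer: str
    resource: str    # hypothetical usage dimension, e.g. "ram_gb_hours"
    quantity: float


def meter(records):
    """Metering: aggregate raw usage per (customer, resource). No prices here."""
    totals = defaultdict(float)
    for record in records:
        totals[(record.customer, record.resource)] += record.quantity
    return dict(totals)


def to_line_items(totals, price_per_unit):
    """Billing input: turn metered totals into priced line items.

    The output is what would be handed to a billing system or a marketplace.
    """
    items = []
    for (customer, resource), quantity in sorted(totals.items()):
        items.append({
            "customer": customer,
            "resource": resource,
            "quantity": quantity,
            "amount": round(quantity * price_per_unit[resource], 2),
        })
    return items


if __name__ == "__main__":
    records = [
        UsageRecord("acme", "ram_gb_hours", 10),
        UsageRecord("acme", "ram_gb_hours", 5),
        UsageRecord("acme", "storage_gb", 100),
    ]
    prices = {"ram_gb_hours": 0.5, "storage_gb": 0.01}  # made-up prices
    for item in to_line_items(meter(records), prices):
        print(item)
```

Only `to_line_items` would change if the charge went through a different channel; `meter` stays the same either way, which is why metering remains your own work.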
Nati: Basically what you’re saying is that the function of a marketplace is mostly on the billing side, but there’s still a large majority of the work.
Uri: Exactly, you still need to meter; no matter how you bill, you still need to meter the usage. So that is most of the work, and you have to do it anyway. Then with billing, by the way, there are great billing solutions, Stripe and Zuora and others. They give you a lot of options and basically allow you to offload that entire functionality to a third-party vendor, which probably does it way better than you and gives you multiple currencies and credit card processing. Again, even if you’re doing self-managed, most of the work will be done around metering, and the billing side will probably be done by a third party. Again, not to say that that’s exactly how we do it, but just to frame the discussion. Now in terms of marketplaces, the main benefit that you have there is discoverability and reduced friction. So if you look at cloud customers, they all have an account with Amazon, and the bigger ones actually have an enterprise program; it’s called EDP on Amazon, for example.
You can assume that anyone spending more than a few hundred thousand dollars on a cloud provider will have an enterprise agreement, and when you transact through the marketplace, the customer doesn’t need to establish a separate billing relationship with you as a vendor. So you just go around procurement and contract negotiation, the entire lengthy process they typically have to go through when you onboard a new customer, and this is very significant. I’ll give you another point on why this is important. When customers are on an enterprise agreement, the way these agreements work is that there’s a certain spend commitment that those customers commit to. So if I’m an Amazon EDP customer, then I have an agreement, and the agreement says I need to spend, say, $5 million over a period of a year, and in exchange I get, whatever, a 20% discount. Again, this is just an example, these are not the exact numbers. You then start consuming and you still get invoiced on a month-to-month basis. At the end of the term, if you consumed more, great, you benefited because you got the discounts. If you consumed less, you need to pay the difference.
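The commitment arithmetic in that example can be worked through in a few lines. This is a sketch under stated assumptions, not the terms of any real agreement: it assumes the discount applies to list-price usage on each monthly invoice and that the commitment is measured against the discounted invoices. The function name `settle_commitment` is hypothetical, and the $5 million and 20% figures are the illustrative numbers from the conversation.

```python
def settle_commitment(monthly_list_usage, commit, discount_pct):
    """End-of-term settlement for a spend commitment (whole dollars).

    monthly_list_usage: list-price usage per month.
    commit: committed spend for the term.
    discount_pct: discount assumed to apply to every monthly invoice.
    """
    # Month-to-month invoices at the discounted rate.
    invoices = [usage * (100 - discount_pct) // 100 for usage in monthly_list_usage]
    invoiced = sum(invoices)
    # If the customer consumed less than committed, they pay the difference.
    shortfall = max(0, commit - invoiced)
    return {
        "invoiced": invoiced,
        "shortfall": shortfall,
        "total_paid": invoiced + shortfall,
    }


if __name__ == "__main__":
    under = settle_commitment([500_000] * 12, commit=5_000_000, discount_pct=20)
    over = settle_commitment([600_000] * 12, commit=5_000_000, discount_pct=20)
    print(under)
    print(over)
```

In the first case the customer is invoiced $4.8 million and owes a further $200,000 at the end of the term; in the second they are invoiced $5.76 million and owe nothing extra.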
Nati: So, just to summarize what you said in layman’s terms, I think what you’re basically saying is that customers, in order to get a discount from Amazon, commit to a certain purchase size, I should say, and that incentivizes them to actually buy more from Amazon. So in the context of buying direct from Elastic, in this case, or some other company, versus buying through the marketplace, that’s probably what increases the demand towards the marketplace, because it fits that model. Obviously it caters to those customers, not necessarily to the end customers. Not all customers are on that program, but definitely within that program, that is becoming a big incentive.
Uri: It’s a big incentive, but just to make it clear, even if you’re not an enterprise customer with an EDP agreement, if you go through the marketplace, you just see the vendor charges as part of your Amazon bill. You don’t need to input a credit card, nothing of that sort. You just use the product through the marketplace and you get billed by Amazon.
Nati: So you’re basically saying the entire activation and purchasing process becomes very streamlined into your own billing system, and you don’t have to go through a very complex, I would say, new-vendor type of process. What’s the process of getting into the marketplace? Is that an easy thing or a complex thing?
Uri: So there are various marketplaces. There’s an AMI marketplace; that one is pretty easy. I’m not referring to that, I’m talking about the Amazon SaaS marketplace. By the way, there’s a serverless application marketplace now and there’s a data marketplace. So I’m referring to the SaaS marketplace, which all three big cloud providers have, and to get there, obviously you need to apply to be listed, and you need to implement the integration. There are very specific definitions for how you define your products and how you can bill for them. There are a few models, and then you build the integration and activate it. At the end of the day it’s a set of APIs and the listing on the Amazon marketplace or the GCP marketplace.
Nati: And is it like an app store, where you also have to do your own marketing around it?
Uri: So each of those vendors obviously has promotions and joint marketing programs. As soon as you go above a certain scale, you get noticed, and then those vendors have a vested interest in promoting you, because they make money. For example, if you’re purchasing Elastic Cloud through the GCP or the Azure marketplace, then, taking Azure, Microsoft wins both ways. Microsoft gains the instance usage that your users are incurring, because they run on Azure, and then also when Elastic pays Microsoft the fees.
Nati: I was asking more on the product marketing side, whether that requires a separate operation. How do you promote Elastic on a marketplace versus how you promote Elastic globally? Going to the marketplace, do I need to operate differently from a marketing perspective, or is it the same?
Uri: You can have various more targeted campaigns for the marketplace, and the benefit of those is that in some cases you can do them jointly with a cloud provider. So you get to tap into their funnel, which is great. But you can also continue to market just the way you did before, and when customers go in and sign up, you can give them the option to go through the marketplace, or even prioritize that if that makes sense. It really depends on your business strategy: how big do you want to be on the marketplace? The best analogy I can think of is Spotify, for example. They have a direct model and they have a marketplace model, with the app store: you can subscribe through their website and pay one fee, or subscribe through the Apple App Store and pay another fee, and some customers prefer it this way, some prefer the other way. From a Spotify perspective, they made a conscious business decision to have both. Some vendors say, I’m just cool with doing everything through the app store. That’s fine.
Nati: So basically what you’re saying is that it’s not an either-or; you probably have to do both in any case. The things that you need to do to build your SaaS are not covered by the marketplace. It’s mostly the billing aspect that is going to be different, but 90% of it is still going to be something that you deal with yourself. So that answers part of the question, and probably the right model is to support both, meaning give the customer the choice of where to buy from. Not all customers will want to go through the marketplace; some would want to go direct, some would want to go through the marketplace. I think the Spotify example is a good example of that.
Uri: Just one comment here, Nati: the integration can be deeper, not just on the billing side. Actually, just yesterday we announced a deeper integration with Azure. This is part of a program that Microsoft is running, and it basically exposes our hosted service almost like a first-party service in Azure. In the Azure portal, you go, you provide…
Nati: What does deeper mean? It means that the monitoring…?
Uri: It means that you can use it directly. You can go in and you don’t need to sign up separately. Everything is done automatically for you. You create a cluster or multiple clusters right from within the Azure console without ever visiting the Elastic Cloud console, you can ship data from your Azure VMs and resources with a single click, and obviously there’s integrated billing. So from a customer perspective, the user experience is very, very similar to just consuming a native Azure service.
Nati: So deeper meaning that it would look like just another Azure service provided by Microsoft in this space, or almost as close to that, and that’s what you mean by deeper. It’s not just covering the billing, which means that the billing can be native, but the rest of the experience, how you monitor, how you access it, whatever, wouldn’t look different from the others. Okay, got it. So that gets me to, I would say, the last question, but for me probably the most interesting one, which we’re kind of circling around all the time, and we’re seeing changes in licenses and so on: it’s the open-source business stuff. So the whole idea of being open source was that you have an open platform, people can access the code, contribute, do a lot of those sorts of things. As we move to SaaS, that changes the motivation quite a bit, and we’re moving to, we don’t want to worry about the software, we just want to get the value out of the software. So obviously that creates a very different dynamic in how people look at the value of open source when you’re running on SaaS versus running on-prem. Tell me a little bit about that part, your thoughts, obviously what you can say about it. I’m not going to get too much into the license battle around that. We can maybe talk about it, but I don’t want to put you on the spot here, so share with me your thoughts. I think you’ve gone through the entire cycle, at both ends of the spectrum.
Uri: I think having open source, and again, open source is really a contentious term, like what is open source? Do you go by the OSI definition or some other definition? Let’s call it open code. Having open code and a freely available version that you can install on your own, I still think it’s very important, and it’s a very powerful tool to get your product adopted at a massive scale. So I still think it’s important to maintain that and just give customers options, so that they can go and experience your product and use it to their benefit, even for free. Now, if you think about it, it’s not very different from having a free tier in your product or free trials; it’s just a way of doing it in a self-managed world. And again, I don’t want to get into religious discussions about what type of license is good or bad; that’s a whole other discussion, probably multiple podcasts, and there are people smarter than me on that. So I still think it’s important. I think it also helps, so for example, you look at what Google did with Kubernetes, it’s really a way to…
Nati: That’s an excellent example actually.
Uri: Yes, it’s a really great way to get customers to standardize on a way of doing a certain thing, and Kubernetes is a great example. It’s now the de facto standard way of running containers, and people come to expect that because of the open source, free version of Kubernetes. Now when they want to deploy it on a cloud service, there are multiple options, but they know the APIs and they know the contract, and they expect things to work relatively similarly to the way they work on-prem. So I think in that sense, open source, free products are a very important aspect. But again, if you go back to what I said about those types of customers, the customer who is looking for a solution, I think they couldn’t care less. Like, I want a solution that protects my endpoints and my laptop; I don’t care if it’s running on Elasticsearch, even if Elasticsearch is the best technology ever. I just want it to fulfill the function that I bought it for in the best way possible. So I think in that sense it really depends on who you are selling to and what their needs are, and that informs how important the free slash open source version becomes to them.
Nati: And I think it touches on the fact that when it comes to consuming a product, probably the value of being open source is less important, but there are still a lot of cases in which we need to integrate with a product, maybe build security aspects and do deeper integration with certain products, and that still happens quite a lot in many companies. For doing those types of integration, there is no better way of doing it than when it’s open source: it’s much easier to understand how to do it, and second, to do the things that in a SaaS model you probably don’t even have access to. So that, I think, still remains a critical piece here, and since integration is not a negligible use case, I think that we’ll still need to keep things open. Interestingly enough, I think Amazon is also starting to realize that, and I’m seeing them releasing the EKS version of a product as open source. They’ve been, I would say, taking open source products and making money out of them, but now they’re also starting to contribute back, which is an interesting trend. We can probably talk about it in yet another podcast, and that’s going to be a very interesting topic. But I think we’ve already covered quite a lot of ground here. So I wanted to thank you, Uri, for this discussion; as I expected, it was quite interesting. I hope the audience will also find it interesting. So thank you very much, Uri.
Uri: Thanks Nati, it was great to be here.
Jonny: Thanks guys, once again, for a fantastic episode. As usual, all supporting materials for this can be found at cloudify.co/podcast. If there is anything you feel that we need to be talking about on this podcast, please reach out to us at email@example.com. In the meantime, stay happy, healthy, safe, and we look forward to catching you on the next session.
Outro: This episode has been brought to you by Cloudify, the one platform you need to accelerate, migrate, and manage your multi-cloud environment for faster deployments, optimized costs, and total compliance, head to cloudify.co to learn more.