Episode 16: Kelsey Hightower, Kubernetes and Google Cloud

Kelsey Hightower, Beyang Liu

As an engineer at Puppet, CoreOS, and Google Cloud, Kelsey Hightower has been at the forefront of new deployment technologies over the past decade. Along the way, he has built tools like confd, created learning resources like Kubernetes The Hard Way, co-founded KubeCon, and taught multitudes of people about containers, infrastructure as code, service meshes, and the operating system of the cloud.

In this conversation, Kelsey talks about how he learns new technologies, shares stories over the course of Kubernetes history, and explains how one might make sense of the varied ecosystem of infrastructure tools ("engineering organizations are like restaurants").

Show Notes

Kelsey Hightower: https://twitter.com/kelseyhightower

Kubernetes: https://en.wikipedia.org/wiki/Kubernetes

Kelsey's talk at the first GopherCon (2014): https://www.youtube.com/watch?v=wyRbHhHFZh8

Brian Ketelsen: https://twitter.com/bketelsen

Erik St. Martin: https://twitter.com/erikstmartin

Billie Cleek: https://twitter.com/bhcleek

Rob Pike: https://en.wikipedia.org/wiki/Rob_Pike

Comedian Ronnie Jordan: https://www.youtube.com/watch?v=S0HOY_SMseM

Docker: https://en.wikipedia.org/wiki/Docker_(software)

Serverless computing: https://en.wikipedia.org/wiki/Serverless_computing

AWS Lambda: https://aws.amazon.com/lambda

Google Cloud Run: https://cloud.google.com/run

LAMP stack: https://en.wikipedia.org/wiki/LAMP_(software_bundle)

Linux Pro Magazine: https://www.linuxpromagazine.com

Envoy Proxy: https://www.envoyproxy.io/

Google App Engine: https://cloud.google.com/appengine

Cloud Native: https://en.wikipedia.org/wiki/Cloud_native_computing

Prometheus: https://prometheus.io/

PaaS, Platform as a Service: https://en.wikipedia.org/wiki/Platform_as_a_service

Leslie Lamport: https://en.wikipedia.org/wiki/Leslie_Lamport

etcd: https://github.com/etcd-io/etcd

Raft protocol: https://raft.github.io/

T-shaped engineer: https://medium.com/making-meetup/t-shaped-engineering-on-meetup-pro-1e0a38df7f5b

A+ certification: https://www.comptia.org/training/by-certification/a

Linux+ certification: https://www.comptia.org/training/books/linux-study-guide

Infrastructure as Code: https://en.wikipedia.org/wiki/Infrastructure_as_code

Kubernetes the Hard Way: https://github.com/kelseyhightower/kubernetes-the-hard-way

Google Kubernetes Engine, GKE: https://cloud.google.com/kubernetes-engine

Chen Goldberg: https://www.linkedin.com/in/goldbergchen/

Tim Hockin: https://www.linkedin.com/in/tim-hockin-6501072/

Brian Grant: https://www.linkedin.com/in/bgrant0607/

Eric Tune: https://www.linkedin.com/in/eric-tune-3033693/

kubeadm: https://kubernetes.io/docs/reference/setup-tools/kubeadm

KubeCon: https://events.linuxfoundation.org/kubecon-cloudnativecon-north-america/

Joseph Jacks: https://twitter.com/asynchio

Patrick Reilly: https://twitter.com/oreilly___

Kismatic: https://www.crunchbase.com/organization/kismatic, https://github.com/apprenda/kismatic

Apache Mesos: http://mesos.apache.org

Mesosphere (now D2iQ): https://en.wikipedia.org/wiki/Mesosphere,_Inc., https://d2iq.com

Docker Swarm: https://github.com/docker/classicswarm

Container Vendor Wars: https://codefresh.io/containers/age-container-wars/, https://www.redapt.com/blog/how-kubernetes-won-the-container-war

Red Hat OpenShift: https://www.openshift.com/

Container Runtime Interface: https://kubernetes.io/blog/2016/12/container-runtime-interface-cri-in-kubernetes/

OpenStack: https://www.openstack.org/

Custom Resource Definition (CRD): https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/

Brendan Burns (creator of CRDs): https://twitter.com/brendandburns, https://www.linkedin.com/in/brendan-burns-487aa590/

WebAssembly ("Wasm"): https://webassembly.org

Java runtime: https://en.wikipedia.org/wiki/Java_virtual_machine

Service mesh: https://www.redhat.com/en/topics/microservices/what-is-a-service-mesh

Nginx: https://en.wikipedia.org/wiki/Nginx

Lua: https://www.lua.org

Rust: https://www.rust-lang.org

Go ("Golang"): https://golang.org

AVX-512 extensions to x86 instruction set: https://en.wikipedia.org/wiki/AVX-512


Beyang Liu: Alright, I'm here with Kelsey Hightower, principal engineer at Google Cloud, perhaps best known as the explainer-in-chief of Kubernetes, and more broadly, new technologies in the general realm of deployment and cloud infrastructure. Kelsey, welcome to the show.

Kelsey Hightower: Hey, thanks for having me.

Beyang: Awesome. I think we have a lot of ground to cover. I like to kick things off by asking people how they got into programming. How did you get into programming, Kelsey?

Kelsey: Like many people, I started my career in tech support and system administration. The first things I ever wrote were bash scripts, I mean, probably more bash than I should have been using. I guess my first fully featured programming language would have been Python, where I started to build what most people would consider backend systems. That was my journey into programming.

Beyang: I want to take a brief trip down memory lane and going back to the first GopherCon, I don't know if you remember this, but I actually met you briefly at the first GopherCon. It was back in 2014.

Kelsey: I don't remember that initial meeting, but I do remember one meeting where I gave y'all some really good feedback about the product at the airport. Is that the same timeframe?

Beyang: Yeah. That was at the Denver Airport. I was on the way out. I bumped into you at the airport, I think we were both flying out of the same terminal. This was a really early version of Sourcegraph, so V-negative-1, and you gave really good feedback on the product. I remember seeing you at the conference. If my memory serves correctly, the first GopherCon, it was this like scrappy initial conference. Brian Ketelsen and Erik St. Martin, the organizers did a great job, but it was super scrappy. They didn't have an emcee. You volunteered yourself to be the emcee. Do you remember that?

Kelsey: Yeah, I'll tell that story. I didn't really know Erik and Brian that well. I knew I had made a submission to speak at GopherCon. That year, pretty much everything was a keynote; it was still a single-track conference. I think I was scheduled to be maybe the fifth or sixth speaker. I'm sitting in the audience in that beautiful convention center. The stage is well done. They picked a great location. I'm sitting there next to my best friend, Billie Cleek. He's the maintainer of vim-go now. We were just kind of watching Brian come up there and say, "Hey, this is all for the community."

Kelsey: They were the scheduled emcees. They were the hosts. But of course, we were trying to figure this thing out. I remember them bringing up Rob Pike. I was sitting there like, I don't know if the intro was as strong as it could have been or maybe there's a way we can get the audience levels a little bit higher. I had never done emceeing before. That was going to be my first time. I saw how it was going. I said, you know what, I think I might be able to help here. I went backstage and said, "Hey, Brian, Erik, I'm Kelsey. Can I try being the emcee?" They're like, "Sure." Then I went up there and I was just like, "Hey, I'm some guy. My name is Kelsey." I just had a bunch of fun on stage and it seemed to work out well.

Beyang: Yeah, I remember you were a fantastic emcee. It's actually quite surprising to me that that was your first time emceeing because it felt like you had done it before. I think you had mentioned that you've done some stuff in the standup comedy world. I recently read this article on you that said that you were someone's agent before getting into--or maybe during your--initial foray into sysadmining and DevOps and that sort of thing. Is that right?

Kelsey: Yeah. I had a previous life before joining the corporate world. I ran my own computer consultancy for a while, had a small computer store. One of my best friends that I went to high school with, his name is Ronnie Jordan, he was an aspiring comedian at the time. He was like, "Hey, I'm doing this comedy thing. You seem to be very business-minded, maybe you can consider being my manager." I went on the road with him, or at least went to a few of his early comedy performances. I was like, wow, this person is actually good.

Kelsey: When you hear someone who's a musician or a comedian, you're like, ah, are you any good? You never want to ask that question. He was phenomenal. I spent some time on the road. When you're managing a comedian, it's not just booking dates and collecting payment. A lot of times, you're also ... I don't write jokes, but you're also a cowriter in some ways. You're testing that material. You're giving feedback on the order. You're paying attention to the crowd's feedback to some jokes, which ones are working well, which ones should be extended.

Kelsey: It's more of a collaboration on the whole art. I guess I did learn a little bit from that about how to pay attention to the crowd, how to make sure that you set the tone for what you want it to be. As a good host, you always want the energy level as high as it can be, and that just sets off the speaker to have a really good shot at doing a good job.

Beyang: Yeah, that makes sense. I really feel like the skills you've developed doing that have served you quite well over the years. I think in technology, there's a lot of insanely smart people. The number of people who are insanely smart, but who can also explain technologies well to a broad audience is much smaller. You've played an essential role in explaining all these various new technologies to many, many people, including myself over the years.

Kelsey: I think a lot of times in science and engineering, we often forget that one of the core skills is communication. Everyone knows E equals MC squared. The succinctness of that, the ability to describe what's happening, to articulate the world, and then to make it work. Communication is fundamental for all of these things. You have to write down the formula, so someone else can follow it. You have to describe what happens if you don't follow the rules. You have to understand the rules.

Kelsey: I think a lot of times, some of us jump into engineering and the sciences thinking that we don't need to be able to communicate. Someone else will do that, the product manager will do that, the marketing team will do that. That shows up in our UX. When you don't explain how things work and some of our UX designs are a form of communication, here's what you're supposed to do now, here's what you're supposed to do next. We don't invest in our communication skills, in my opinion, at equal pace that we do with some of the, you know, how to write programs with the proper syntax.

Kelsey: What happens, I think, is then we struggle to communicate our intentions. So for me, when I start to understand--some of the best engineering work comes before you start touching the keyboard: the whiteboard sessions, getting buy-in, understanding the actual problem. All those things can be far more important. Because then once you get all of that worked out, then a lot of times it just becomes kind of syntax, then you go to communicating your intentions to the computer. For a lot of people, it's a little bit more rigid. There's syntax. There's a certain way you have to do that in order to make the compiler happy. Communication is the thing that we do before and after we do those things in the middle.

Beyang: Yeah. I think the infrastructure world is a world that needs explainers, because it's almost overwhelming to someone who's not familiar with all the different technologies and tools being used. I remember in the early days, it seemed like there was Docker and then there was Kubernetes. Now all of a sudden, there's all these other tools like Rancher, there was CoreOS, Nomad, Crossplane, Istio, Linkerd. To the people who are listening who, like myself, are just confused at the sheer volume and diversity of tools, what is your brief like, here's the landscape of DevOps. Do you have an explanation of that in your back pocket that you can share?

Kelsey: I think the way you should think about these things is all of these are really just a serialization of workflows or culture. That's really what they are, if you think about it. For example, there are millions of restaurants across the globe. This doesn't make anyone panic. It's like, oh my god, how can I figure out how to eat at all of these restaurants? That's never the goal. The goal is that these restaurants are here to serve particular meals, maybe there's something that you desire, and these restaurants pop up to feed that.

Kelsey: Then ideally, you go to the one that's close to you or the one that meets your requirements for what you want to eat. Then we go along. Sometimes you'll discover a new restaurant. A lot of times, if you have your top three or four restaurants, you may not go explore other ones because you're totally satisfied with the three or four that you go to. Maybe sometimes it's great to venture out to say, is there something better? If it is, then maybe that becomes your fourth best restaurant.

Kelsey: When I think about the DevOps landscape, we have so many people just like chefs in a restaurant that are experimenting with different ways of doing things. Once they get it, then they create those recipes. Those recipes in our world are source code. Then when we take those recipes and we put them on a menu, aka the projects, then we advertise to people what's available. When you go to GitHub, I look at that as like a big menu filled with so many dishes and those dishes have specific recipes.

Kelsey: You might be vegetarian, you're going to focus on certain restaurants who cater to that. In the DevOps world, it's very similar. Lots of people are starting with mainframes. Some people have VMs. Some people are on Windows. Some people are on Linux. Those become the baseline and they form your appetite. Then you go seek out projects that can then fill some void or address some problem or concern you have. That's why we will always have duplicates and similar projects, because there's going to be one ingredient that's going to be slightly different to make you prefer one over something else.

Beyang: Yeah, kind of running with that analogy. In the restaurant world, there are certain types of restaurants. There's like the fast food, fast casual, there's the fancy where they have the prix fixe menu. There's these different categories that you fit restaurants into. From your view, are there such archetypes in the DevOps world? Is there a category called simple CRUD web apps that are typically all deployed and managed in the same way? Or is it a bit more fragmented than that?

Kelsey: Yeah. I think what you start to think about is scale, McDonald's scales. McDonald's makes some tradeoffs. Some of those tradeoffs are good for you. Some of those tradeoffs could be bad for you. They have to cater to serving millions of customers per day globally. They have to be able to address all the palates of the world. I remember when I first started doing international travel, one thing that intrigued me was going to a local McDonald's in that particular country, South Korea or Japan. It's slightly different, but you know what to order for some reason.

Kelsey: There's a lot of consistency. Then they just do just enough to cater to that particular market. Now that's very different than a gourmet restaurant where that dish is only made in that one country and that's the only authentic thing that you're going to get or that's the only way to get it in authentic form. I think in our world, the Cloud tries to play the role of scale. We try to be as high quality as we can. I think the Cloud would be more like the grocery store. We're going to sell high quality ingredients, because we're open for people who are building things.

Kelsey: Sometimes we may ... If you go to your local grocery store, there's going to be ready to go food that you can get from the hot aisle. Whole Foods is notorious for this. There's a hot bar where the food is already made. Let's say you really have a custom thing that you're trying to do that I can't cater buffet style, then we're going to have the aisles filled with high quality ingredients. Now some of them are very complex to use. Some of these ingredients that if you cook them too long, it just won't taste right.

Kelsey: You have to have those things because there's a lot of things that some businesses are trying to do and they may be on the edge of innovation where I can't just use something that's already ready to go. I need a few ingredients because I have a perfect dish that I'm trying to serve to my customers. That's where all of this gets complicated. We're not necessarily building products for the absolute end user, the person with the iPhone device. We're trying to build the things for the chefs that will eventually prepare the meal that will be served to their customers.

Beyang: That makes a lot of sense. From my vantage point, it seems like this restaurant industry of ours has changed drastically over the past decade. You think back to when we first met back in 2014, that was when Docker was just getting off the ground. I don't think I had heard of Kubernetes then. Now those things are mainstream. Given that you've had a firsthand, not just seat, but you've been at the forefront of these changes, can you tell us about how things were done back in the olden days, so to speak? I bet there are a lot of listeners now who don't even remember a time before Docker. Can you talk about how things used to be done and how they've changed with the advent of all these new tools and technologies?

Kelsey: I would say the fundamentals are roughly the same. People always ask me, how can you learn these new technologies so fast? Like, Kelsey, Lambda announced container image support for Amazon's Lambda Serverless offering. Then the next day, you have something that works on Cloud Run, which typically only requires containers and not functions. Like, how? It's like, well, because the fundamentals are roughly the same. When I started in tech maybe a little after, around 1999, 2000-ish, at that time, I think people understood the UNIX model pretty well.

Kelsey: We had tools like Linux, so things being free and open source increase accessibility, because all of these things are just building on top of maybe on the shoulders of giants. Now that Linux is accessible, people like me at that time didn't necessarily have to pay for a big UNIX license or buy some special hardware to run it. I remember I bought mine from a magazine. I bought a Linux Pro Magazine and there was a CD in the back with maybe two or three Linux distros on it. I would put it into a computer. It's like, would you like to install Linux?

Kelsey: I would flip the pages and it would walk me through it. At that point in time, the three tier architecture was like the staple: you have a web server. The web server would give you all the logic for dealing with like rewriting URLs, routing traffic to different servers, but the front end had all of that logic, Apache, Nginx. Then you would pick your language of choice, so Perl, Python. Most people would be familiar with the LAMP stack, Linux, Apache, MySQL, PHP, or Perl, or Python, whatever your jam was at the time. The LAMP stack was the way we walked around.

Kelsey: We talked about that set of patterns like, oh, I'm using the LAMP stack. Maybe you meant Postgres instead of MySQL. We knew what the architecture looked like. We could scale that pretty well horizontally, but we didn't necessarily have all the automation. To be honest, in 2020, that's what most people are still doing. The difference is, what's doing it? Instead of Apache, people are looking at Envoy. Instead of PHP, maybe you're looking at Ruby on Rails, or Golang, or Node.js. You're still using Linux, that's 30 years old. We're still doing that.
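
Editor's sketch (not part of the conversation): the three-tier pattern Kelsey describes can be shown in a few lines. This is a minimal illustration under stated assumptions, not how any particular LAMP site was built. The in-memory SQLite database stands in for MySQL or Postgres, and the names `make_database`, `list_posts`, `ROUTES`, and `handle_request` are hypothetical.

```python
import sqlite3


# Data tier: an in-memory SQLite database stands in for MySQL or Postgres.
def make_database():
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT)")
    db.execute("INSERT INTO posts (title) VALUES (?)", ("Hello, LAMP",))
    db.commit()
    return db


# Application tier: the logic that would have been written in PHP, Perl, or Python.
def list_posts(db):
    rows = db.execute("SELECT title FROM posts ORDER BY id").fetchall()
    return "\n".join(title for (title,) in rows)


# Web tier: maps URL paths to application handlers, the routing job
# that Apache or Nginx performed in front of the application.
ROUTES = {"/posts": list_posts}


def handle_request(path, db):
    handler = ROUTES.get(path)
    if handler is None:
        return 404, "not found"
    return 200, handler(db)
```

Swapping any one tier (Envoy for the router, Go for the handlers, Postgres for the store) leaves the shape of the other two unchanged, which is the point being made about fundamentals.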

Kelsey: Now that we have the Cloud, we're taking a lot of the infrastructure and putting it behind an API. I think that part is very different. Back in my day, you would rack a server. You would make sure that you line it correctly on that 42-inch rack, screw it, and make sure it didn't lean too far off the side, make sure you haven't done the power supplies and RAID 5 versus RAID 0 or something like that. We just had to know so many low level details just to get off the ground. Now it's more like click a button and 90% of the things we've learned in the last 30 years are one API call away.

Beyang: What would you say to the people who pine for the simpler olden days? If everything is the same, then why do I have to learn all these new technologies?

Kelsey: I think a lot of people didn't learn the previous ones. I think the reason a lot of people are confused right now, especially if they've been in the business for a while, is, like I always say, some people have 20 years of one-year experience. They never learned the fundamentals of programming. They think that there's a stark difference between Python and Golang, for example. Now there's some concurrency features baked in. For what most people are doing, you can accomplish most of your goals with either of those.

Kelsey: Because the fundamentals are there, you're going to make some system calls, you're probably going to deal with some data coming in and out through some protocol to exchange it. Those are fundamentals that just carry over. If that new language ecosystem has something to offer for mobile, iOS, and all the things in that language ecosystem, Objective-C may be the right thing to learn if you want to capitalize on that ecosystem. I think a lot of people that are looking for simpler times, not only do you still have it, it's far easier to use than ever before.
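
Editor's sketch (not part of the conversation): the fundamentals named here, system calls plus data exchanged over an agreed protocol, look much the same in any language. The example below is a minimal illustration in Python and assumes a toy protocol of one JSON document per line; the helper names `send_message`, `receive_message`, and `round_trip` are hypothetical.

```python
import json
import socket


def send_message(sock, payload):
    # Protocol: one JSON document per line, UTF-8 encoded.
    data = json.dumps(payload).encode("utf-8") + b"\n"
    sock.sendall(data)  # wraps the send system call


def receive_message(sock):
    # Read until the newline that terminates a message.
    buffer = b""
    while not buffer.endswith(b"\n"):
        chunk = sock.recv(1024)  # wraps the recv system call
        if not chunk:
            raise ConnectionError("peer closed before a full message arrived")
        buffer += chunk
    return json.loads(buffer.decode("utf-8"))


def round_trip(payload):
    # A connected pair of sockets stands in for a client and a server.
    left, right = socket.socketpair()
    try:
        send_message(left, payload)
        return receive_message(right)
    finally:
        left.close()
        right.close()
```

A Go or Node.js version would use different syntax but the same two ideas: ask the operating system to move bytes, and agree on how those bytes are framed and decoded.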

Kelsey: If you still want to go with the LAMP stack, and remember, there are still millions of people who carry on using the LAMP stack and don't make much noise about it because there's nothing to see here. They have their servers. There's lots of customers that have 10 beefy servers, processing millions of requests per day, and building a viable business on top of that. For them, they're good. I was at a conference once and someone was like, I was at a KubeCon. We were standing in the circle of people, it's probably about 30 people in this nice circle.

Kelsey: Everyone was talking about how big their cluster was and I got this version of etcd and all of this stuff. Then there was this small group of people, there were like three or four of them. They're from the same company. They just had this look on their face like, what the hell are y'all talking about? I asked him, I said, "What do you think about all this?" He's like, "Look, we've been using App Engine for eight years and I want no part of what you all are talking about. We build our apps. We host them on App Engine. They get traffic and we've been doing fine."

Beyang: Yeah, I feel like that might have been me, although we don't host on App Engine. We do use Kubernetes. Certainly, I feel like at conferences there's a social pressure to sound like you know what you're talking about. Then everyone has this experience of like you join any circle of conversation with engineers who don't work directly with you or at companies of your size. Then it's like, what language are they speaking?

Kelsey: Yeah. I would also say one thing. Right now, where we are in the Cloud Native world, Prometheus, Kubernetes, Docker, Golang to some extent, you're in the kitchen. You're watching the chef prepare the dish. You're watching the chef dip their spoon, hopefully washing their hands and the spoon afterwards, and tasting the meal and asking you how do you think it tastes and willing to adjust the ingredients before they put it on the plate. That is a very complex thing.

Kelsey: Because when you go to a restaurant, you don't demand the chef use simpler tools because you're not the chef. You just say, hey, I want to order this thing that I have no idea how you would even go about making it, but I love the way it tastes. The Kubernetes world in many ways can be like that for some people. You don't necessarily need to learn how to build Kubernetes from scratch, that information is available. That isn't the requirement. All of this stuff that we're doing, we're all trying to ... We always say this, everyone is just trying to create a PaaS, a platform as a service, where you can just be intentional.

Kelsey: I have this app. Here's how I'd like you to run it. That's the end game for any platform. Sometimes it does require complex ingredients. Just remember that a lot of these tools are industrial tools for industrial chefs, so they can eventually offer you such a simple experience.

Beyang: Yeah, that makes sense. I really like this restaurant analogy. Taking it a bit further, would it be safe to say that the approach that you like is all these different companies, the different restaurants, and you, as an engineer, you're either a chef or maybe a cook. What you got to do is you just got to find a kitchen, join a kitchen, and then figure out how that kitchen produces things and then go from there. Don't necessarily try to adopt the processes and tools and methodologies of some other kitchen that's not cooking the meals that you're cooking and serving the customers that you're serving. Would you say that's a reasonable analogy?

Kelsey: Yeah. Remember, there's lots of fusion kitchens like Korean barbecue. I think that analogy is spot on. Then you also have to know what your role is in that. Even inside of a kitchen, there's a big system. It's just not people cooking for themselves. They're cooking for the customer. One thing you want to do, if you want to preserve your customer base, you got to make sure that you're using the right ingredients. Sometimes they don't really care about the name of the company that produces the plates and the forks.

Kelsey: I've never gone to a restaurant and said, hey, who makes the plates here? Because unless it's made by ... I can't eat here. That tends not to happen. If you're starting to put the infrastructure ahead of the meal, then you may be wasting time. You need to worry about what your food tastes like, what's the experience when you come into the restaurant. Once these tools start to add to the experience, so as a chef or as a line cook, it's in your best interest to dibble dabble in other flavors and spices to see if there's an opportunity to make your menu a little bit better.

Kelsey: It's okay to make sure that you at least have a menu and you have a baseline. Then it's not always about transformation. We hear this a lot, digital transformation or change everything. You don't always have to change everything, but it does make sense to know where you are and what's possible.

Beyang: Mm-hmm (affirmative). Would it be safe to say that your role is you're like this advisor that goes around to various kitchens and understands their needs and then you go back and work with the teams producing equipment that then you can sell to a wide range of restaurants to help them improve their efficiency. Is that how you view your role?

Kelsey: Yeah, like that. Because when you think about some of those food cooking shows, the best hosts are not necessarily actors that are now hosting cooking shows. It's typically people who have been chefs, people who own restaurants. When you put them in front of the camera, while their acting may not be superb, they do know what they're talking about. They may also go on to design their own aprons. They may also go on to design their own pots and pans, because they know what those tools should do, because they've been using them so long.

Kelsey: I think anyone owes it to themselves in their career to make that progression. Maybe you start off cooking what they tell you to cook and you follow the steps, then you may progress to someone who redesigns the menu through experimentation. At some point, you may decide that you want to open your own restaurant. Maybe at some point, you've done all of those things and you become a food critic. Sometimes being a food critic could lead to just better dishes. Someone comes to rate my food and they give me tangible feedback, well, I can go on in and improve the menu, and then I can go help a lot of other people discover new things that they should try that may also fulfill something in their life like host a birthday party or add something new to their diet.

Beyang: Yeah, that makes sense. What is your take on what the role of education should be in this kind of world? I'm talking about more formal education. Because in the culinary world, there's culinary school. Obviously, not all chefs or cooks go to culinary school. A lot of people just join a kitchen and figure things out. I've often heard in this industry, people say like, oh, if you want to learn good fundamentals, you should find a good computer science program and go do that. That'll give you a really good foundation. The more I work in the industry, I don't know if that's necessarily the case.

Beyang: In your role, you've been more hands on practical educator. Do you have a take on what the role of formal education ought to be? Is that something that you think people should pursue to learn the fundamentals? Or do you think it's possible to learn the fundamentals on the job as you're cooking, so to speak?

Kelsey: Yeah. I just think all of these things, the fact that we have all these lanes to swim in is wonderful. There are some people who prefer the structure, in some ways the predictability, of formal education. This many years, you'll get this degree. This many years, you'll get that degree. Also, sometimes it's the right forum for some very complex ideas that require patience to acquire. When I think about the work, if we were to think about the work of building a website, that may not require the same patience to understand how electricity moves through a system.

Kelsey: I'm not required to learn that first. I can just jump straight into a Ruby on Rails tutorial and launch a website. That doesn't mean learning how to understand how electricity moves through this whole thing isn't valuable, it just isn't required. Now this same person, because I think just having access to knowledge, depending on what they want to do, let's say they want to redesign the CPU. Well, guess what? You're going to appreciate the ability to have a formal lane.

Kelsey: Because think about it, how many people ever have the ability to work at Apple on the M1 chip, work at Arm, or work at a company doing real semiconductor work? That is a very, very small part of our population. The people who can be exposed to that in some setting, well, thank God, there's Georgia Tech and MIT and some of these other universities and community colleges that do present at least an on ramp into that world. I think that part is necessary.

Kelsey: The last thing I'll say here is the biggest drawback I see to any of these formal things is when we start seeing gatekeeping. When someone says you can't even learn how to do this work unless you can afford to get a PhD from Stanford, that's not healthy. Because as we spoke about earlier, most of the tasks that we're doing don't require a PhD to perform. Nothing is worse than applying for a job with an impressive set of requirements to get there, JavaScript, that only changes the color of buttons every day. That's a mismatch. It doesn't require a PhD to do that particular job.

Kelsey: I do like the fact that the people with PhDs are sharing their information through books, and podcasts, and conference talks, local meetups, IRC channels, Slack channels. That ecosystem of education, to me, just makes it all accessible to everyone. You can pick and choose your preference or you can actually choose multiple of these lanes. I think, for me, I remember giving a crash course on distributed systems to a PhD program at Stanford University. The students were wanting to understand how all of the things they'd been learning, the things they read from Leslie Lamport, the things they learned about all of these incredible complex algorithms that work in the vacuum, how do they work in real life.

Kelsey: I remember giving them a live demo of etcd and how the Raft protocol behaves in the real world, how it's broken, how the white paper doesn't describe all the things that can go wrong with cluster membership (in many ways it was undefined), and how it's just layered behind something like Kubernetes where all the action is. I think for them, they appreciated the fact that I was able to use ... I don't want to say simpler terms, but I had a lot of real world experience that was able to augment their formal education.

Beyang: Yeah. What was your own educational backstory? You're kind of a domain expert in the world of containers and Kubernetes now. How did you first encounter those technologies? How did you go about building an understanding of the fundamentals of what made those technologies tick?

Kelsey: In our industry, we talk about the T-shaped engineer, horizontal with depth. You ask yourself, how does one become horizontally versed in all of these technologies or disciplines where there's storage, security, networking, programming, et cetera? I think the way I started my career, I remember when I was learning for my A+ certification and learning just the basics of hardware and how things fit together, and then getting my Linux+ certification, understanding how the kernel works, and how to load modules, and how file systems work, and what inodes were, I was going pretty deep on each of those subjects one at a time.

Kelsey: Every job opportunity that I had, when I worked in web hosting, I learned everything I could about HTTP and headers and MySQL and how WordPress and all these things work together. Each of those stints, I was able to go super, super deep in a particular thing. Every time I switch roles, when I joined the Open Source World, I think, is when I had a chance to express my expertise. There's a lot of experts who don't use social media. There's lots of experts that don't give conference talks. We don't necessarily know about them, because they're not necessarily in the public.

Kelsey: I just happen to be in the public, so people will recognize me. I think that's just happenstance. The way I worked myself into the containers and the Kubernetes world, all of those fundamentals from managing Linux servers and then building apps that ran on top of them using things like the LAMP stack and then progressing to the Java world with JBoss, I spent some time at Puppet Labs around configuration management, the dawn of DevOps where we wanted to start treating infrastructure as code. Again, the fundamentals from Promise Theory and config management.

Kelsey: When I saw Kubernetes, I instantly saw the combination of all of those things, treating servers as a group of resources and then attempting to automate them away through higher level objects or resource definitions. It looks like Kubernetes takes an imperative world and puts a type system on top and says, this is a container and this is how you can expect it to behave. Then we'll move all that other stuff into the platform so you can work at this higher level of abstraction. It was the perfect combination for me, and that's why I went all in.

Beyang: Yeah. One of the things I want to dive into is some of the specific tools and projects that you've created, and one of those is Kubernetes The Hard Way, which is almost going the opposite direction. It's like after the abstraction is built and you no longer have to interact with the lower-level primitives that the abstraction hides from you, there's still this need to break through the abstraction barrier to understand how that abstraction works underneath the hood. Because in case something goes wrong and you need to debug or peek under the hood, you're going to want to know what's going on underneath that layer of abstraction. Can you talk a bit about Kubernetes The Hard Way and what your motivations were in creating it?

Kelsey: Yeah. I think we spoke about this earlier: Kubernetes has this duality to it. On one hand, it presents somewhat of this attractive interface for describing a complex distributed system. We call it YAML. You declare everything that you want to happen in terms of running your application and you give it to the system. If you're on that side of the field, maybe that's all you need to know, assuming there's someone who can keep that system up, secure, up to date, and running. For those people, Kubernetes The Hard Way tends to be a requirement.
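
To illustrate the declarative model described here, the following is a minimal sketch of a reconcile step: you state the number of replicas you want, and the system works out which actions close the gap. It is not Kubernetes code. The `Action` type, the `reconcile` function, and the `app-N` naming scheme are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """A single step the platform would take (hypothetical type)."""
    kind: str  # "start" or "stop"
    name: str


def reconcile(app: str, desired_replicas: int, running: list[str]) -> list[Action]:
    """Compare desired state with observed state and return the actions
    that would move the observed state toward the desired one."""
    actions: list[Action] = []
    if len(running) < desired_replicas:
        # Too few instances: start new ones under names not already in use.
        taken = set(running)
        index = 0
        while len(running) + len(actions) < desired_replicas:
            candidate = f"{app}-{index}"
            if candidate not in taken:
                actions.append(Action("start", candidate))
            index += 1
    else:
        # Enough or too many instances: stop everything past the desired count.
        for name in running[desired_replicas:]:
            actions.append(Action("stop", name))
    return actions


if __name__ == "__main__":
    # Desired: three replicas of "web". Observed: only one is running.
    for action in reconcile("web", 3, ["web-0"]):
        print(action.kind, action.name)
```

The caller never says "start web-1"; it only states the target, which is the part of the interface that lives on the YAML side of the field.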

Kelsey: I remember when I came up with the idea for Kubernetes The Hard Way, we were in what I call empathy sessions at Google. I just joined Google from CoreOS. I remember having what we call an empathy session. In Developer Relations at Google, it's part of the engineering org. I remember getting all of the GKE team, the Google Kubernetes Engine, our commercial offering on top of open source Kubernetes, into one room. You have people like Chen Goldberg, who's now VP of Engineering. We have Tim Hockin, Brian Grant, Eric Tune. You have all of these people in this room.

Kelsey: I have them break up into teams of four. I was like, you're all the people that work on the scheduler, the storage system, there are no excuses in this room. These are the experts. I said, guess what? All you have to do ... You guys are ready? All you have to do is install Kubernetes without using the kube-up bash script that was part of the project. Break up into teams of four, and you have an hour. Boy, boy, boy, were people struggling. Most people couldn't figure out what VM type to use, or whether you should install Docker first. Where do you get the kubelet binary from again?

Kelsey: Oh my God, what do I do for networking? Do I really need Weave? To me, what that really represented was, it's great when you can just use a tool and the tool should be available to everyone. This is where kubeadm came from. After that session, some people decided that we should have a tool that would allow you to make all of these decisions in a more formal and automated way than just running one big bash script. That was a net improvement. For me, that didn't address the original problem. People need to know how this thing works.

Kelsey: Because after you provision the cluster, I remember people hitting me up saying, "Hey, my cluster isn't working. Everything seems to be down." I jumped around, I said, "Oh wow, your TLS certificates between your worker nodes and the API server are expired." They're like, "What TLS certificates?" I was like, "That's a problem." I decided to create Kubernetes The Hard Way to say let's break down every moving part, so that the chefs, the system administrators, all the people who are responsible for the other side of those YAML files can keep the system running.
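
The expired-certificate story above comes down to comparing a certificate's expiry time with the current time. Here is a minimal sketch of that check, assuming you already have the expiry as a `notAfter` string in the format Python's `ssl` module reports for a peer certificate. The function names and the sample date are hypothetical; `ssl.cert_time_to_seconds` is the standard-library call that parses the string.

```python
import ssl
import time
from typing import Optional


def seconds_until_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Return seconds remaining before the certificate expires.

    `not_after` uses the "Jan  5 09:34:43 2018 GMT" style that the ssl
    module reports for a certificate's notAfter field. `now` is injectable
    so the check can be tested without depending on the wall clock.
    """
    expires_at = ssl.cert_time_to_seconds(not_after)
    current = time.time() if now is None else now
    return expires_at - current


def is_expired(not_after: str, now: Optional[float] = None) -> bool:
    """True when the certificate's notAfter time has already passed."""
    return seconds_until_expiry(not_after, now) <= 0


if __name__ == "__main__":
    # Sample value for illustration only; not taken from a real cluster.
    sample = "Jan  5 09:34:43 2018 GMT"
    print("expired" if is_expired(sample) else "still valid")
```

Running a check like this on a schedule, for every certificate between the worker nodes and the API server, is one way to notice the problem before everything appears to be down.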

Beyang: Awesome. I have to say, as a user of that project myself, it was immensely valuable as our own team was adopting Kubernetes. It helped me get comfortable with all the things that were happening underneath the hood and feel confident that whatever happened in production, I would be able to at least know where to start, so to speak, in terms of how to debug the problem. Thank you.

Kelsey: You're welcome.

Beyang: Another project that you kicked off was KubeCon. You were one of the founders of KubeCon, which has now grown to ... I don't know what the numbers are, but it must be huge. Can you talk about what it was like to start something like that, a big technology conference and get it off the ground?

Kelsey: This is the opportunity to give credit where credit is due. Joseph Jacks and Patrick Reilly had the idea for KubeCon. Joseph Jacks and Patrick Reilly had a company called Kismatic. Kismatic was one of the first pure Kubernetes distros. Their goal was to go out and help customers. I think they both had previous experience working at Mesosphere. They really understood the community potential around Kubernetes. I think when I met them, I met them at GopherCon as well. This is after attending DockerCons and seeing the power of community firsthand.

Kelsey: They came up with this idea to have a community conference and they approached me. They said, "Hey, would you want to be a part of this?" At that time, I had already been giving most of the keynotes around Kubernetes and workshops, and had become a public figure in the space by that time. I was like, "Yeah, of course. What are we doing?" I think Patrick financed a lot of this stuff. He booked the initial hotel in San Francisco. At the early time, there wasn't a lot of people on board with KubeCon. You'd think it would have been a natural slam dunk, but it wasn't.

Beyang: Yeah, that's surprising.

Kelsey: It totally wasn't. Those logos you see now were not interested back then. Look, I was doing it for the people. For me, I didn't care what logos were involved. I was like, not only will I be a part of it, I'll help organize it, I'll help emcee it, I'll give a keynote, use my name however you want. I remember we were going back and forth deciding what logo to pick. We landed on that logo. That came from Patrick and Joseph, JJ, working with a designer to come up with that whole ... the wheel and the logo. That was their idea. We approved it.

Kelsey: I think on the official paperwork, I am included technically as a founder. I just wanted to make sure that they got proper credit. Then once we had the conference and it was coming up and we started to get like Red Hat was one of the first big logos to jump in. CoreOS was one of the big logos to jump in. Of course, once you start to have those logos on board, then comes the Googles, the IBMs, and the others. I think people were still unsure because remember, at the time, Kubernetes wasn't ... It wasn't that great. People were like, I don't know, this thing looks clunky.

Kelsey: It doesn't have this, it doesn't have that. Look at Docker Swarm, it has all of the things and it's easier to use. Not a lot of people were sold on Kube. Then when we did the first KubeCon, I don't know, 600 people, we piled up in a hotel. It was so much fun, because there were no vendor wars yet. There was just ... What is this? How does it work? How is it different than Docker? Where are we going? I remember asking anyone running Kubernetes in production to get on stage. I promise maybe ... I don't know, 10 people maybe got on stage.

Kelsey: I was like, so sad that our project was not necessarily where it needed to be. Man, did it blow up for the next one, KubeCon EU blew up. We did one in London, and that's right before the CNCF was formed officially or maybe around the same time it was formed. Things happened and they took over the conference. It's preserved its roots in many ways of a community first, even though there's 20,000, 30,000 people that go to those things. When you walk the floor, it still has the same vibe. It has the same energy. I hope that continues going forward.

Beyang: That's awesome. You mentioned, that's just a sign of how fast our industry moves. There was a time where Kubernetes was not the clear winner in the container orchestration field. Why do you think Kubernetes ended up winning among a field of very good potential competitors?

Kelsey: I think Kubernetes wasn't trying to win, is why it did well. Now, that's very different from there were people who wanted it to win, very different. The community was just so excited to have a place to learn and capture their learnings going forward. For vendors, like Red Hat that were very early on, in partnering with Google to redefine what OpenShift is, their container platform that they had previously. The Docker community depending on if you're working at Docker or not, or your affinity to Docker, but you did see this as a level up in some ways.

Kelsey: You were convinced that containers were the right way to go. You didn't necessarily have good answers to how do you deal with multiple systems, how do you deal with networking, where do load balancers fit into this equation. I think the expertise that Google brought to the table based on experience and, again, we had Mesos to learn from, we had Docker Swarm, in some ways, to learn from. Taking all of those things and then saying, we're going to take the best ideas and just bring them together and try to wrap something that works for people on top.

Kelsey: Etcd was chosen for the data store. Could we have written something from scratch? Probably. Would it have made sense? No. Docker was chosen initially, actually was the only thing you could use at the time. It took several years before we had the container runtime interface that would loosen that tie and allow any other compliant runtime to be there. We chose the things that the community had already chosen. Then we layered on expertise about what a Pod definition meant. What does it mean to scale horizontally?

Kelsey: Then the last thing I will say here is that the biggest reason is we make every component separate, but first class. The scheduler is an independent thing. I remember doing a demo, I think, at a GopherCon, where I show people how to write a Kubernetes scheduler from scratch and run it from their laptop. In many ways, unlike OpenStack. OpenStack was also to me a very successful infrastructure project. What made Kubernetes different, Kubernetes was an infrastructure project focused at the last mile of the development process, not the beginning.

Kelsey: VMs and networking are just the beginning of the process. When you start talking about deploying actual applications and how they run, that's very different than giving people a bunch of virtual machines. That really set the stage that Kubernetes could just focus in that area, bring these things together, and that API enabled everyone else to layer on what was missing in the form of a custom resource definition. Big shout out to Brendan Burns, because he believed that Kubernetes needed to provide an interface for everyone else to build first class objects, just like we have DaemonSets and volumes, and Pods and Deployments.

Kelsey: He was really convinced that everything that was going to come later needed to also be first class. He pushed that thing through with the help of the community. Now we have an ecosystem, not just a project.
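
For readers curious what "a scheduler from scratch" can look like, here is a toy sketch of the filter-then-score pattern: drop the nodes that cannot fit the workload, rank the rest, and bind to the best one. It is not the code from the GopherCon demo and not the real kube-scheduler. The `Node` and `Pod` classes, their fields, and the scoring rule (most CPU left over after placement) are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """A machine with some unclaimed capacity (hypothetical fields)."""
    name: str
    free_cpu_millis: int
    free_memory_mib: int


@dataclass
class Pod:
    """A workload with resource requests (hypothetical fields)."""
    name: str
    cpu_millis: int
    memory_mib: int


def schedule(pod: Pod, nodes: list[Node]) -> Optional[str]:
    """Pick a node for the pod, reserve its resources, and return the node name.

    Returns None when no node has enough free CPU and memory.
    """
    # Filter: keep only the nodes that can fit the pod's requests.
    feasible = [
        n for n in nodes
        if n.free_cpu_millis >= pod.cpu_millis
        and n.free_memory_mib >= pod.memory_mib
    ]
    if not feasible:
        return None

    # Score: prefer the node with the most CPU left after placement;
    # break ties by name so the result is deterministic.
    best = min(feasible, key=lambda n: (-(n.free_cpu_millis - pod.cpu_millis), n.name))

    # Bind: record the reservation on the chosen node.
    best.free_cpu_millis -= pod.cpu_millis
    best.free_memory_mib -= pod.memory_mib
    return best.name


if __name__ == "__main__":
    cluster = [Node("a", 500, 1024), Node("b", 2000, 4096), Node("c", 4000, 256)]
    print(schedule(Pod("api", 1000, 512), cluster))
```

Because the scheduler is a separate component, a loop like this can in principle run anywhere that can read unscheduled workloads and write back a placement, which is the point of the laptop demo.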

Beyang: Now fast forward to today, Kubernetes is ... I think it was you who said that Kubernetes is like the operating system for the Cloud. I think there are people who love that it's the operating system of the Cloud. I think there are people who hate it. There are people who both love and hate it at the same time. The question I want to ask you is looking to the future, what things do you see? Do you think, in 5, 10 years' time, Kubernetes will remain in its current place as this operating system for the Cloud? Do you think something might supplant it?

Beyang: Or do you think there might be additional things that are built on top of it that potentially take the place in terms of mindshare, like maybe Kubernetes is so low level that only a small number of people will have to know about it and interact with it? What are your thoughts there?

Kelsey: I think it's important to understand why Kubernetes exists at all. You can look back at previous operating systems, if we want to stick with that analogy. Like why does Linux exist at all? Why do we even need an operating system? Because we don't want everyone trying to figure out how to write bits to a hard drive. We don't want everyone trying to figure out how to convert data to electricity and put it on a networking card. That doesn't work, that doesn't scale.

Kelsey: We have this operating system that tries to normalize and give us common interfaces like TCP/IP or IP addresses and file systems, directories and folders. When you think about the Cloud, it felt like that, where we said, hey, here's 10,000 pizza box servers. Here's a bunch of network switches and ports. Here's a bunch of load balancers. Do your thing. Then Kubernetes comes along because it's necessary to say, do we need everyone trying to figure out how to glue all of these subcomponents together?

Kelsey: Now Kubernetes sits on top of that and presents a new set of things, Pods, volume mounts. A lot of these things are very similar. It just up-levels it to say, we're going to turn the data center into the computer and Kubernetes is going to be its operating system. That's great. Linux had a great run, and it's still here. 30 years in, Linux is at the bottom of most of these stacks. It didn't go anywhere. It's gotten bigger. It's on your phone. It's gotten bigger, but we don't talk about it anymore because it's so stable.

Kelsey: It's so reliable, that it doesn't deserve all the attention or hype, but it's still there. Kubernetes is going to follow the same trajectory, because it has to. That's the only way we get to where we need to go. Right now, Kubernetes is going to be entering year 10 soon. It's one-third of the way there where Linux was. I think the adoption is not as big as we think it is. It just gets a lot of attention right now, because the problem is not necessarily completely solved. We still have random network cards with the drivers that don't quite work.

Kelsey: Linux still doesn't work great on the desktop. Although if you have a Chromebook, I would say the experience is vastly improved. I think Kubernetes still has a way to go to normalize the data center as a computer concept. It will stick around and be front and center until we achieve that goal. Now in parallel, just as Kubernetes, and the same for Docker, appeared during the evolution of the stacks below, there are things that are evolving above Kubernetes.

Kelsey: There's lots of platforms that are saying, why don't we come in and hide Kubernetes and present a new set of abstractions? Even if you're ... Hopefully you don't have a Kubernetes tattoo, don't do that.

Beyang: Ouch.

Kelsey: If you're listening to this, don't go out and get a Kubernetes tattoo. Trust me, all projects either get boring or die. Just know that going in, but we should all hope that something better comes along. Because this is definitely not the best we can do, but it's really a good place to land. It serves as a technology checkpoint of all the things we did before, serialized into this project. It just opens up the door for someone to come in and not have to start from scratch. They can now start from declarative infrastructure and decide, do they do something different? Or do they build on top?

Beyang: What about new virtualization technologies? When I think of the Cloud today and infrastructure more broadly, I think of Linux containers as the basic building block. Containers are fundamentally a virtualization technology. These days, I'm hearing more and more about WebAssembly as this new rising technology that works on both the client and the server that could allow universal language runtimes and things like that. How up-to-date are you with that world? What are your thoughts about WebAssembly as a potential new disrupting technology in the space?

Kelsey: Yeah. For listeners, you may have heard this described or called Wasm. The thing that's interesting about WebAssembly, we have a few places to look to see something similar, like Java is probably the biggest example. Java's attempt was to say, there's a big difference between Windows, Linux, various Unixes, phones. What if we can give people the Java runtime to normalize all those system calls and present our own interface so that you can just write code? If you do it right, you can write code in multiple languages as long as they generate the Java bytecode.

Kelsey: Sure, write it in Python, or Clojure, or Ruby, doesn't matter. Because you're writing to the Java bytecode. Guess what? Write once, run anywhere. Remember, that was the slogan for Java. As long as the JVM was everywhere, including your web browser, then we'd all be Java programmers. Didn't necessarily pan out that way, but Java is still wildly successful. I think another area where this probably panned out pretty well is JavaScript. Very distinctly different. Given the wide usage of web browsers, it's almost like if you learn web technologies, you also get the privilege of running everywhere.

Kelsey: Let's talk about Wasm for a second. Where Wasm starts to shine is it seems to build on top of the same position that JavaScript took. Meaning there are certain environments, constrained environments, where if a runtime was there as a target, then you have the opportunity to simplify things. Do people really need to be making all the Linux syscalls? No. Do people really need to be running Docker in their web browser? No. That means you have the opportunity to say, well, how about we just present a smaller, for lack of a better term, instruction set and say, if you can spit out one of these, then we can run it in a browser.

Kelsey: We can run it on the edge. We can also give you some security promises, since we're not exposing so many things that can be exploited. Wasm gives us this opportunity to run things that typically run in a web browser, but safer. Wasm is also used as a plug-in architecture for things like Envoy. If you're familiar with service mesh, Envoy is a proxy that you typically see in the data plane that's processing that traffic. Well, think about a great extension interface. If you come from the Nginx world, Lua was something that we used to embed into a lot of tools and still do.

Kelsey: What Wasm tries to do is say, look, we can probably have something purpose built for performance, works in the browser, can run Rust, Golang, various other languages that can target it. I look at it as more of an evolution of a pattern and a set of fundamentals about constraining the runtime environment, so we can promise more security and performance as long as it's worth the tradeoff and the niche that Wasm has found is appropriate for that. Is it an appropriate interface for generic compute? That's where I got to say, I'm not quite sure.

Kelsey: Because there's a whole lot of workloads like machine learning that do require advanced instruction sets like AVX-512. You can't try to constrain the whole world to a very limited runtime. That doesn't mean that it shouldn't exist, because it can't handle all the things.

Beyang: Makes total sense. We are almost at time, which is unfortunate, because there's a lot more stuff I wanted to get to. As a final question, if there's one piece of technology or one tool that you want to ask listeners to try out or check out, what would it be?

Kelsey: I'm going to say something probably weird, but maybe your mind. Seriously, I think a lot of people are trying to look for external solutions. Just sometimes, I've just seen it so many times, but just a little bit of thought. You can probably avoid the problem altogether. I've seen people trick themselves into thinking they need to replace their CI/CD solution, versus sitting down with a piece of paper and writing out all of their build steps and then fixing it on paper first, and then going back and refactoring those build steps, and ending up in a great place.

Kelsey: I think a lot of times, we just don't believe that that tool should be used as often. Now what we try to do is we try to have other people do it for us. We look for five different Medium posts. We go on Twitter, we go on Hacker News, we watch 100 videos. Keep doing those things. That doesn't necessarily mean that you should be delegating your thought to someone else. You can use those things as input. At some point, I think it's healthy to look at all these projects as inputs, and then sit down with just a piece of paper and just write out what you actually need. You might just surprise yourself that you don't even have those problems to begin with. To me, I think that's a tool that we've got to remind people about. It's a great tool that goes underutilized.

Beyang: My guest today has been Kelsey Hightower. Kelsey, thank you so much for being on the show.

Kelsey: Awesome. Thanks for having me.
