On this episode we cross over with the Cross Cutting Concerns podcast for a special dual interview show. Matt Groves talks about CAP theorem and the challenges of distributed databases. Ed Charbeneau shares his perspective on why it's important as a full-stack developer to understand machine learning.
Matthew D. Groves is a guy who loves to code. It doesn't matter if it's C#, jQuery, or PHP: he'll submit pull requests for anything. He has been coding professionally ever since he wrote a QuickBASIC point-of-sale app for his parents' pizza shop back in the 90s. He currently works as a Developer Advocate for Couchbase. His free time is spent with his family, watching the Reds, and getting involved in the developer community. He is the author of AOP in .NET (published by Manning), and is also a Microsoft MVP.
Ed is a Microsoft MVP and an internationally recognized online influencer, speaker, writer, design admirer, a Developer Advocate for Progress, and expert on all things web development. Ed enjoys geeking out to cool new tech, brainstorming about future technology, and admiring great design.
00:00 Ed Charbeneau: This podcast is part of the Telerik Developer Network. Telerik, by Progress.
00:20 EC: Hello and welcome to Eat Sleep Code, the official Telerik podcast.
00:25 Matt Groves: And welcome to Cross Cutting Concerns podcast.
00:29 EC: I'm your host, Ed Charbeneau, and...
00:31 MG: I'm your host, Matt Groves.
00:33 EC: So today, we're doing a little crossover episode, we're both here at Stir Trek, we thought it'd be fun to do a little dual interview/crossover podcast reporting, right?
00:45 MG: That's right. And we're going to try to interview each other about some cool stuff that we're up to and maybe talk about Stir Trek if we have time to.
00:53 EC: Absolutely. Stir Trek's been a fun event, you've probably heard other shows that I've recorded about what sessions people are giving and this will be yet another one. So Matt, what are you here at Stir Trek talking about today?
01:09 MG: So, my session was about chaos and how we would deal with chaos in our web applications by using a distributed database, and what would happen if one of those nodes of the database goes down or is split from the network. And so, I had to talk a lot about CAP theorem, and that theorem that applies to distributed databases. And what about yourself? What are you up to?
01:33 EC: So, I'm here at the event doing some speaking on dotNET. So I'm talking about all of the changes that have happened in the ASP.NET Core space over the course of it being called ASP.NET vNext, up until today, where it's called ASP.NET Core 1.1. So there's a lot of changes in that category, it's a very fast-paced talk, I will probably be losing my voice by the end of the day today.
02:02 MG: I peeked in on your session and I couldn't find a seat and it was wall-to-wall, packed house there. It's a very interesting topic at Stir Trek.
02:11 EC: People are trying to get up to speed, hopefully what they've done is buried their head in the sand for the last two years and just tried to ignore all the churn and noise that's happened, and that's not as a slight to anybody that's worked on the product, it's amazing. It's just the growing pains of what you need to do when you're building something of that scale and of that severity of change. We're talking about going from Microsoft-only product to one that runs cross platform and is a big framework that's rewritten from the ground up and a lot of new ideas were introduced, some didn't make it and some did. So again, it's not a slight at anybody of any shape or form, but there was a lot of painful changes throughout that process. So I kinda try to keep everybody up to speed as to what actually did make it, make sure I include enough tidbits about what didn't make it, in case you encounter that kind of project in the wild where somebody may have been an early adopter of something and then left the company or whatever and you open up this ASP.NET Core project and you find a project.json file and kind of don't know what that is and where it came from.
03:47 EC: That's right. When we talk about Angular, there's AngularJS, which is Angular 1.x, the Angular you're referring to, they like to call just Angular now. So this was Angular 2, they released a new version, they skipped and they went straight to Angular 4. So, Angular 3, you're not gonna find a release for that, now we have Angular 4. So that adds a lot of confusion and again, it reminds me a lot of what's happening in the dotNET Core space. So a lot of similar things happen there where we have ASP.NET MVC, which was version 3, 4 and then 5, and we were expecting ASP.NET Core to be MVC 6 and they decided to reversion that as 1. So we're counting 3, 4, 5, 1 and then with Angular, we're doing 1, 2, 4. So, you start combining these technologies together in a single application, which is very plausible, things get confusing really, really quick.
04:50 MG: It seems like people are having trouble counting, we went from Windows 8 to Windows 10, Angular 2 to Angular 4 and so on. So it doesn't help when you're trying to Google these things after the fact. [chuckle]
06:02 MG: Definitely. And Anders, I'm gonna butcher his last name, Anders H., let's say, who worked on C# is also working on TypeScript, so you definitely see a lot of C# influence on that TypeScript language.
06:14 EC: There's a few things that if you can mentally map the syntax in your head, they blend very easily together. So you can learn very quickly if you're a C# developer, the basics of TypeScript, there's just a few small things like, for example, I believe, I'm trying to code in my head here, so don't shoot me if I'm wrong. Declaring a class as public in C#, a similar thing would be to export in TypeScript. So once you learn those few little syntactical things, it's pretty easy to pick up on.
06:49 MG: The whole thing you just said with MVVM and XAML, that's where Angular really started to click for me, is when I started thinking of it more like the web forms or XAML approach and then it started to feel a lot more familiar to me than trying to approach it from, "Oh, this was an MVC framework and now it sort of changed and I'm a little confused." So that is a very good analogy, I think.
07:11 EC: The other things that I've been working on along with... And it kinda fits into these things because those are the front end technologies; at least, dotNET Core you can use as a front end technology, and also the server layer as well in this scenario. But I always did full stack development as a developer, it's kinda like the lone wolf, and I see the trend of machine learning picking up on and being a very big contender in the software development stack. I've been exploring how difficult that technology is to learn as a human being [chuckle] and implement in that full stack. So, for example, if you wanted to add a feature to your application to maybe predict whether somebody was gonna pay off their loan or not, if they're filling out a credit form for a loan application, what does that UI look like and what process does that have to go through to get to the machine learning algorithm? How is that machine learning algorithm built if you're on a team that has a data scientist person and a machine learning expert? What does that mean for you as a developer? What terminology do you need to know to interact with these teammates?
08:22 EC: I find that if you go through the process of what they have to do to build one of these things, then you can at least learn some of the terminology, learn some of their pain points and then, by doing that, be able to communicate better with them, if you're gonna have to interact in that scenario. So I've gone to Azure Machine Learning Studio and physically built a machine learning or a training experiment in their machine learning studio to do the credit prediction and then learned how to attach a web API endpoint to it, and the next phase of this project is going to be to build the ASP.NET Core server component that will talk to it and then furthermore frontend that with various Telerik UI technologies.
09:11 EC: So we have UI controls for ASP.NET Core, so we'll be using those. There's still a lot of developers out there that use web forms, so I'm gonna give that a shot. We also have our brand new Angular UI, called Kendo UI for Angular, and that is components that are built on native Angular, and they are components with no dependency on jQuery, so I'm gonna try to frontend it with that as well. So that's been a big ongoing project of mine. It's been a lot of learning. I never thought it was gonna be easy, right? This is machine learning we're talking about and data science. Building something like that without peers to work with is very difficult, because you can't validate yourself. So thankfully, I had some friends that had coworkers that they got me in touch with, after I built my project they were able to validate that I actually did a good job.
10:06 EC: I'm looking forward to releasing all of that content. We're slowly putting it up on developer.telerik.com as it's written. Looking forward to everybody being able to learn from that.
10:15 MG: Cool, sounds good. So I used a little bit of ASP.NET Core with my demo today. It didn't work out all that well. If you were here at Stir Trek you saw what went wrong, but the idea was I wanted to use a web application to demonstrate the CAP theorem a little bit with a NoSQL database, and I happened to use Couchbase, which is my employer but this is... The CAP theorem applies to distributed databases in general and there's a guy, Eric Brewer, who did the original white paper on the CAP theorem. He also invented terms like ACID and BASE, so if you're familiar with database world you've heard these terms before, but the idea is that, if you have a distributed database, you have multiple machines acting together as one database, but that introduces some problems, because those databases need to connect to each other over a network, and what happens if the network partitions, or one of those machines goes down, or something happens to a part of the distributed database? And the CAP theorem governs what you can actually do in those situations.
11:18 MG: So the CAP theorem is an acronym. It's C for Consistency, A for Availability and P for Partition tolerance. The partition tolerance is kind of a given, because if your distributed database is not partition-tolerant, then it's not really a distributed database. You have to then design your database to favor either one or the other. So that's what the CAP theorem says, is that you can have those three attributes, but you can only guarantee to have two of them consistently, all the time. And so just talk about what does it mean? Consistency means that across your distributed database, there's only one copy of a piece of data, one recognized canonical copy of that piece of data, no matter what happens to the partition.
12:03 MG: On the other end of the spectrum, availability says you're always going to have the ability to read and write a piece of data, even if a node goes down or the partition is split, if the network partitions. And so, just as an example, if I were using Couchbase, for instance, and I had three nodes, and I create a document, that document lives on one of those nodes and if that node goes down, I can no longer access that document. Whereas if I look at something like Riak, or DynamoDB, or Cassandra, or something like that that's on the other end of the spectrum, if that node goes down with that document, I'm still able to make writes to that document. So the database will still take those writes, but the trade-off there is, of course, if that node comes back up then I'll have two conflicting sibling documents that I have to resolve somehow.
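The two ends of the spectrum described above can be sketched with a toy in-memory model. This is a minimal illustration under stated assumptions, not how Couchbase, Riak, DynamoDB, or Cassandra are implemented: the class names `ConsistentStore` and `AvailableStore`, the `NodeDown` exception, and the byte-sum placement rule are all hypothetical.

```python
class NodeDown(Exception):
    """Raised when the node that owns a document is unreachable."""


class ConsistentStore:
    """Toy CP-style store: each key has exactly one copy, on one owning node."""

    def __init__(self, node_count=3):
        self.nodes = [{} for _ in range(node_count)]
        self.up = [True] * node_count

    def _owner(self, key):
        # Hypothetical placement rule; real systems use a proper partition map.
        return sum(key.encode()) % len(self.nodes)

    def write(self, key, value):
        owner = self._owner(key)
        if not self.up[owner]:
            raise NodeDown(f"node {owner} is down, refusing write to {key!r}")
        self.nodes[owner][key] = value
        return owner

    def read(self, key):
        owner = self._owner(key)
        if not self.up[owner]:
            raise NodeDown(f"node {owner} is down, cannot read {key!r}")
        return self.nodes[owner][key]


class AvailableStore(ConsistentStore):
    """Toy AP-style store: any reachable node accepts the write."""

    def write(self, key, value):
        owner = self._owner(key)
        # Try the owner first, then the next nodes in order until one is up.
        for offset in range(len(self.nodes)):
            node = (owner + offset) % len(self.nodes)
            if self.up[node]:
                self.nodes[node][key] = value
                return node
        raise NodeDown("every node is down")

    def siblings(self, key):
        """Every copy of the key across nodes; copies may disagree after an outage."""
        return [docs[key] for docs in self.nodes if key in docs]
```

Taking the owning node down makes `ConsistentStore` refuse reads and writes for that document, while `AvailableStore` keeps accepting writes and ends up with sibling copies once the node returns.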
12:54 MG: And there are some ways to do that. These databases do address both of these concerns, right? So they both might sound dire, but the CAP theorem is really more of a spectrum in my mind. Couchbase has one document on that one node, but we can compensate by saying, "Okay, the node is down, let's fail it over. All those documents are now dead to me and we'll take some replicas on other nodes and promote those to the active copy." So there's a period of time where you're not available, and this is consistent with the CAP theorem, that you don't have availability guaranteed all the time, but you do have a way to deal with that when the network partitions. And similarly, with the other end of the spectrum, if you have multiple documents that are siblings or in conflict, then you can have the database say, "Okay, we'll take the most recent document and we'll go with that one. Or the one with the latest revision, the highest revision number, we'll go with that one."
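The failover idea, where replicas on surviving nodes are promoted to the active copy, might look roughly like the sketch below. It assumes one replica per document stored on the next node in the list; the `Cluster` class and its methods are hypothetical and do not reflect any real Couchbase API.

```python
class Cluster:
    """Toy cluster: every document has one active copy and one replica."""

    def __init__(self, node_names):
        self.nodes = list(node_names)
        self.down = set()   # nodes currently unreachable
        self.active = {}    # key -> (node, value)
        self.replica = {}   # key -> (node, value)

    def write(self, key, value):
        i = sum(key.encode()) % len(self.nodes)
        owner = self.nodes[i]
        backup = self.nodes[(i + 1) % len(self.nodes)]
        self.active[key] = (owner, value)
        self.replica[key] = (backup, value)
        return owner

    def read(self, key):
        node, value = self.active[key]
        if node in self.down:
            # The window of unavailability before failover happens.
            raise LookupError(f"{key!r} is unavailable: {node} is down")
        return value

    def fail_over(self, dead_node):
        """Promote the replica of every document whose active copy was on dead_node."""
        promoted = []
        for key, (node, _value) in list(self.active.items()):
            if node == dead_node and key in self.replica:
                self.active[key] = self.replica.pop(key)
                promoted.append(key)
        # Replicas that lived on the dead node are lost along with it.
        self.replica = {k: v for k, v in self.replica.items() if v[0] != dead_node}
        self.nodes.remove(dead_node)
        return promoted
```

Between marking a node as down and calling `fail_over`, reads of its documents raise `LookupError`, which mirrors the period of unavailability mentioned above.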
13:54 MG: Or you could allow a user to interact with it and say, "Okay, I will merge these documents by hand, like we do with source code when we have to merge two branches together, there might be some conflicts." And so I was hoping to demonstrate that in the Stir Trek session today, I wanted to demonstrate it in a very interesting way, but my CouchCase demo didn't cooperate, so I had to fall back to using a cluster inside Docker so I could shut down individual Docker containers, show the results on the screen, show how the database deals with it.
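The three ways of resolving sibling documents mentioned here (most recent, highest revision, merge by hand) can be sketched as plain functions. The `Sibling` type, its `revision` and `updated_at` fields, and the `choose` callback that stands in for a person are assumptions made for illustration, not the metadata model of any particular database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sibling:
    body: dict
    revision: int
    updated_at: float  # hypothetical timestamp, seconds since epoch


def latest_timestamp(siblings):
    """Keep the most recently written copy."""
    return max(siblings, key=lambda s: s.updated_at)


def highest_revision(siblings):
    """Keep the copy with the highest revision number."""
    return max(siblings, key=lambda s: s.revision)


def merge_by_hand(siblings, choose):
    """Merge field by field, asking choose(field, candidates) when copies disagree."""
    merged = {}
    for field in sorted({f for s in siblings for f in s.body}):
        candidates = []
        for s in siblings:
            if field in s.body and s.body[field] not in candidates:
                candidates.append(s.body[field])
        if len(candidates) == 1:
            merged[field] = candidates[0]
        else:
            merged[field] = choose(field, candidates)
    return Sibling(
        body=merged,
        revision=max(s.revision for s in siblings) + 1,
        updated_at=max(s.updated_at for s in siblings),
    )
```

The automatic strategies silently discard one copy, whereas the manual merge keeps agreed fields and only asks about the ones in conflict, much like a source control merge.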
14:25 EC: You actually have some custom hardware type of a rig that you've set up. Can we talk about that?
14:31 MG: Yeah. I don't know if I'd go so far as to call it custom hardware, but it's a selection of three tiny computers I wanted to put into a portable device, so I could carry it around to conferences and user groups. And if I carry around three servers with me that's gonna be a little heavy and hard to carry around. So I wanted to get three tiny... And they're called Intel Compute Sticks and they're meant for Netflix, really, is what they're meant for.
14:55 MG: They're not meant for running a distributed database. So sometimes they don't cooperate that well, but I put those in a box, I put a router in a box, I connect them all together, get a power source and then use some software to connect to that cluster and demonstrate the CAP theorem.
15:10 EC: So you have lots of wires, lots of computer components, lots of little black boxes, jammed in a bigger black box that you have to travel with, that has to be [15:21] ____.
15:22 MG: And not only that, but I want you to picture the little box that this is in. It has a handle, it has a combination lock on the side, and it has the word "freedom" stamped on the side in large letters because...
15:35 EC: Oh, boy.
15:36 MG: It is a handgun case.
15:38 MG: Which is the perfect size and it has the padding for traveling. But yes, it is a little... It might be suspicious a guy walking through TSA with a handgun case full of wires and electronics.
15:52 EC: All you're missing from that is a digital clock and you are in a whole world of hurt.
15:57 MG: So far so good, I haven't been bothered by the TSA yet.
16:00 EC: Yet. 'Til they hear this podcast, assuming they listen. Oh, boy. With the CAP theorem stuff, I think now that you've talked about it, it kind of refreshes my memory a little bit that I've actually seen some of this presented before. Now, this is something that you'd really wanna test and intentionally break, right? So you'd wanna get your solution set up to where you're happy with it and then purposely pull one of the servers down and break it, right?
16:28 MG: Absolutely. As I mentioned, you can deal with some of these challenges on the server side, but there is a period of time where the kind of requests you're making from your web application to the database may not return the results you expect. The document might be missing, there might be a read-only copy, there might be a conflict. And so you need to deal with that contingency, right? So your code needs to deal with that possibility, even though hopefully your network is reliable, your servers are reliable, if you're running inside Azure or AWS or something, they're not gonna come down very often, but you do have to plan for that eventuality that they might come down.
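Application code that plans for those three contingencies (missing document, read-only copy, conflict) could be shaped like the sketch below. The exception classes and the `save` callable are hypothetical placeholders for whatever a real database client raises and exposes; the retry count and delay are arbitrary.

```python
import time


class DocumentMissing(Exception):
    """Hypothetical: the document cannot be found on any reachable node."""


class ReadOnlyCopy(Exception):
    """Hypothetical: only a replica is reachable, so writes are rejected."""


class WriteConflict(Exception):
    """Hypothetical: another copy of the document was changed concurrently."""


def save_with_contingencies(save, doc, retries=3, delay=0.0):
    """Call save(doc) and turn database trouble into outcomes the app can act on."""
    for attempt in range(1, retries + 1):
        try:
            save(doc)
            return "saved"
        except ReadOnlyCopy:
            # A failover may be in progress, so back off and try again.
            time.sleep(delay * attempt)
        except WriteConflict:
            # Retrying blindly would overwrite someone else's change.
            return "conflict"
        except DocumentMissing:
            return "missing"
    return "unavailable"
```

The caller can then decide what each outcome means for the user, for example showing a merge screen on `"conflict"` or a temporary error on `"unavailable"`.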
17:08 EC: By planning for that, I've heard of people purposely inflicting network outages or taking servers down to make sure that nothing happens while this is going on.
17:20 MG: Absolutely. So Netflix famously has the Chaos Monkey that just shuts down random machines. This is one of the ways you can deal with that Chaos Monkey is by having distributed data in a cluster or multiple clusters, perhaps. Maybe if for some crazy circumstance the entire Amazon Eastern Data Center goes down, which we all know could never happen, you still have a backup in the Amazon Western Data Center that you can switch over to.
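The spirit of that kind of test, where something is shut down at random and the data must still be reachable elsewhere, can be sketched in a few lines. This is not Netflix's Chaos Monkey; the `chaos_test` function and the cluster names used with it are hypothetical, and each cluster is modeled as a plain dictionary of documents.

```python
import random


def chaos_test(clusters, key, rounds=10, seed=0):
    """Shut down one random cluster per round and check the key is still readable.

    clusters maps a cluster name to a dict of documents (a toy stand-in).
    Returns the names of the victims whose loss made the key unreadable.
    """
    rng = random.Random(seed)
    failures = []
    for _ in range(rounds):
        victim = rng.choice(sorted(clusters))
        survivors = [docs for name, docs in clusters.items() if name != victim]
        if not any(key in docs for docs in survivors):
            failures.append(victim)
    return failures
```

With the document replicated to both a hypothetical `"east"` and `"west"` cluster the function reports no failures; with a single cluster, every round does.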
17:47 EC: Awesome. So how have you enjoyed the event so far?
17:51 MG: I've been going to Stir Trek since the very first one. I may have missed one in there, but I love Stir Trek, it's a great event put on by some really great people, they work very hard and I enjoy this event immensely.
18:05 EC: Same here, I think it's very well-run. If it wasn't, I wouldn't be back year after year. I did miss the first year but I've been here three consecutive years, so I've got to enjoy a lot of excellent speakers, a lot of cool events, lots of great movies. We're usually treated to a Marvel Avengers movie, this year we have Guardians of the Galaxy 2. I'm looking forward to watching it with everybody and that experience in itself is different than just a normal movie experience, right? 'Cause we're all geeks. We understand some of the nuances in the movie that your general audience doesn't. Like I've seen like Captain America movies with normal people and...
18:47 MG: Normal people, he says.
18:48 EC: I know, I'm gonna get it for that one, right? [chuckle] Geeky people like myself. When we see the cameo of...
18:58 MG: Stan Lee?
18:58 EC: Yes. Stan Lee pop up. There's a little more laughter than if you're with a general public audience, right?
19:06 MG: Yeah. There's just little details and this conference is called Stir Trek. They've shown the Star Trek movies a couple times and I think the same holds true with that if you're a Star Trek fan from way back, you understand what a red shirt means, so if they make reference to it in the movie, you absolutely know what's going on there, and so those sorts of bits of subtext and references to the source material or comic books, in the case of Marvel movies, definitely something that this audience tends to appreciate more than others.
19:37 EC: Yeah. It's an experience like no other. I've never been able to watch a movie with this many people that I've felt a connection to before, that's great.
19:47 MG: Absolutely. Have you got to go to any sessions so far?
19:50 EC: I have not. I've been too busy working. [chuckle] Progress and Telerik, we've been running a booth, so we've been talking to lots of amazing attendees, which is always great, that we get to hear about lots of cool things that people are building and using our tools for. And there's always something that's new that I've never heard of before and you're like, "Wow, I didn't know we could actually build something like that." We have people that integrate our stuff with F# in all kinds of different ways of powering their UIs, and it's pretty amazing. But the only session I've gotten to attend is my own, [chuckle] gotten to record, this will be the fifth podcast today, so.
20:34 MG: Oh, my. This is my fourth. So I'm one behind, I'm gonna have to catch up. But I do appreciate Telerik and Progress coming to sponsor Stir Trek, seems like whenever I go to a great event, you guys are there sponsoring, so I definitely appreciate that.
20:48 EC: Well, thank you. Yeah, we enjoy coming out and talking with the community, we'd like to hear how you guys use our products, right from the source, meet you in person, so that's why we come to a lot of these events. I think I've seen you at quite a few, as well, so I know Couchbase guys like to get out and talk to the folks as well. We'll have some events coming up in the future as well. We will be at Code PaLOUsa. We have some events where we're getting on board for the fall, so I'd probably rather not say what they are yet, just in case the plans fall through. As we're recording, we are still pre-build, so Microsoft Build is coming up, this will probably air much after that event is over. So I'm looking forward to some of the announcements that come up at Build, we'll be there in person as well, so I hope we get to meet a lot of cool people there.
21:42 MG: Any Build predictions?
21:44 EC: I'm not inclined to say too many, so without any inside knowledge, I'd like to see some announcements around dotNET and a solid schedule of what dotNET standard is going to be released on. We've heard some rumors about when in... It may be in the fall or something like that, that it's official. I think those, they're not very much rumors exactly, but there's been like soft announcements, I think, is more of a correct way of phrasing it. I think it was actually on MSDN they said, "Expect it in the fall." But I'd like to see like a hard timeline come out on standard two and a real detailed outline of what APIs are gonna be supported by that. And I'd like something to surprise me as well, right? We went into an event a couple years ago not understanding that something like the HoloLens was gonna come out. I'm not saying another HoloLens should happen, but something to just blow our minds would be cool, I'm ready for that. Since I've been working with Progress I haven't been at a Build event where there's been like one of those huge things dropped on the audience. So I wanna be part of that.
22:58 MG: I'm hoping for some more progress in the open source arena, so some of the things they've announced in the past like SQL Server for Linux have kind of really blown my mind, or Bash on Windows 10, for instance. So I'm hoping to see something along those lines at Build coming up.
23:15 EC: And they've been open sourcing quite a bit of stuff, so we'll definitely see what they have to offer this time around. It's in Seattle this year, so I'm looking forward to going back to Seattle. I know that since it's in Seattle there's gonna be a lot of blue shirts there, I get to talk to some really awesome folks.
23:32 MG: So listen to Eat Sleep Code podcast for some interviews with them, hopefully?
23:36 EC: We definitely have some stuff lined up, there may or may not be some things already in the can that haven't been released yet.
23:43 MG: Intriguing. And you can listen to my podcast. I'm not gonna have a lot of blue shirts on in the future, I don't think. But crosscuttingconcerns.com.
23:51 EC: What are some of like the past few guests you've had on, what are some of your favorites?
23:56 MG: I had Sophie Wilson on an episode. I'm sure some of your listeners are familiar with her, she invented the ARM architecture and some cool things like that. I just like to meet people at conferences and try to have interesting conversations and there just happened to be microphones nearby. So I just did a couple here at Stir Trek with Brett Whittington, who's one of 'em, one of the speakers here at Stir Trek. And Charles Huelsman, who I used to work with at a consulting company here in Columbus. So it's great to get in touch with those guys.
24:25 EC: So between Eat Sleep Code and Cross Cutting Concerns, you probably will hear a good chunk of Stir Trek speakers. The reason for that is just 'cause it's really awesome people that come and do this and it's a great event. Thank you for coming on my show.
24:40 MG: Yes, and thank you for being on Cross Cutting Concerns. I always like to give my guests a chance to plug their contact info or websites, things like that, so please go ahead.
24:49 EC: You can find all of my info, podcasts, show notes, all that good stuff at developer.telerik.com. And you can find me on Twitter @edcharbeneau, E-D-C-H-A-R-B-E-N-E-A-U. Where can we find your information, Matt, for our listeners?
25:08 MG: So check out crosscuttingconcerns.com for my podcasts. I'm also on Twitter, mgroves, that's G-R-O-V-E-S. And please check out my blogs at blog.couchbase.com. Lots of cool stuff there, including some details about the CAP theorem and the CouchCase I mentioned.
25:24 EC: And we'll post all that in our show notes as well. Thank you again and we'll have to meet up at some more events.
25:33 MG: My guest today has been Ed Charbeneau. Thank you.
25:36 EC: Thank you.