Django Chat

Official Django MongoDB Backend - Jib Adegunloye

Episode Summary

Jib is a Senior Software Engineer at MongoDB working on the newly-released official Django MongoDB backend. We discuss building out this new support, future plans, and his previous career at Meta.

Episode Notes

Sponsor

This episode was brought to you by Buttondown, the easiest way to start, send, and grow your email newsletter. New customers can save 50% off their first year with Buttondown using the coupon code DJANGO.

Episode Transcription

Carlton Gibson 0:00
This episode is brought to you by Buttondown, the easiest way to start, send, and grow your email newsletter.

Hi. Welcome to another episode of Django Chat, a podcast on the Django web framework. I'm Carlton Gibson, joined as ever by Will Vincent. Hello, Will.

Will Vincent 0:16
Hi, Carlton.

Carlton Gibson 0:18
Hello, Will. Today we've got with us Jib Adegunloye, who's a senior software engineer at MongoDB. Hi, Jib. How are you?

Jib Adegunloye 0:25
Hey, hey. How are you doing? I'm doing great.

Carlton Gibson 0:29
I'm marvelous. I'm really excited you've come on. You've got some really exciting news, because Mongo just released the preview of a new database backend for Django, right?

Jib Adegunloye 0:41
Yes. We just did the public preview, the beta, of our release called the Django MongoDB Backend, and it quite literally is what it sounds like: a MongoDB backend for Django.

Carlton Gibson 0:56
Okay, well, that sounds super. I'm going to pause us there on talking about that, because we're going to nerd out on that a lot. Before we get to that, I want to know a little bit about you: who you are, and how come you get to be building this thing. So, do tell us.

Jib Adegunloye 1:12
Yeah. So I guess I'd start at the beginning of my software engineering journey. I came into CS as an undergrad in 2014. I'd always known I'd like CS, and I was actually really interested in database systems. So a large bulk of my CS education was actually based in SQL databases.

Carlton Gibson 1:42
right?

Jib Adegunloye 1:45
And by the time I had graduated, though, the first thing I did, as I think a lot of computer scientists did at the time, was try to build something of my own. And from trying to build something of my own, literally in a bikeshedding conversation with my ragtag team of friends, they said, hey, can we use this thing called MongoDB? And I was like, what is it? And they said, NoSQL. And I was like, absolutely not. Literally, yeah. But eventually I actually got battered down, and I went ahead and used MongoDB. And I will say I am now a MongoDB evangelist. So really, once I graduated, I started seeing it more and more. And when I got to my first corporate job, at Meta, I started seeing more NoSQL and more pieces like that. And after doing the gamut around the tech scene, I'd seen an opening for the very thing that had been the defining factor of me leaving academia and going into the business side of things. There was a position open to be a senior Python dev over at MongoDB. And I was like, yeah, absolutely, you will see me there.

Carlton Gibson 3:11
So are you the lucky man doing his dream job? Is that what you're saying, basically?

Jib Adegunloye 3:16
You know, I'd hate to romanticize things, but it feels that way for sure.

Will Vincent 3:23
Can I ask, what were you doing at Facebook, at Meta? Were you working on databases as well, or no?

Jib Adegunloye 3:30
No. So at Meta, I did a lot of things. I first started off at what was known at the time as Oculus Research. It's now Facebook AR/VR, or, I think, Reality Labs. So I did a lot of the stuff around what's currently seen as the headsets. From there, I transferred over to what they termed the Blue app, Facebook proper, where I did quite a bit of machine learning ranking work for the Facebook Groups activity feed. And then from there, I decided to do yet another pivot, and I worked deeply in the traffic organization, specifically dealing with live-stream video, network protocols, and so on across the Meta family of apps: Instagram, Facebook, even Oculus. So if it had live, we were very, very much in tune. And like I said, I started out in a life of databases, found myself in testing infrastructure, then machine learning, and then network protocols. And I was like, let's go back home.

Will Vincent 4:58
Well, I have to ask just a little bit about machine learning, because it's now public that DjangoCon Europe is happening in April in Dublin. Carlton's giving a talk, and I'm giving a keynote on Django and data science and machine learning. So I'm fascinated to hear from people who've actually done it in a professional capacity, because I'm certainly new to it. I know the web stuff a little bit, but it definitely feels like two separate realms: people who do data science don't really do web, and vice versa. Does that match at all with your experience?

Jib Adegunloye 5:31
Yeah, I'll say that one thing I was introduced to by jumping into this machine learning space was truly the vastness of computer science. When I did the transition to the team, I had almost openly said, I have no experience yet. And I thought, hey, maybe I would have some relevant experience, given that I understand scaling large-scale databases and things like that. There are definitely some notes that you can pick up, in terms of how you ingest all this data to feed it into a system that is getting trained, or that is using the information as training data in order to generate informed decisions. So there's definitely an aspect of: if you know a lot of really complex SQL, that's a great place to get started, right? But I'd say the big thing that I learned when dealing with applied machine learning was understanding that the components you learn about in theory, the things about hypotheses and building nodes and such, all map to a larger thing: what is the genuine machine learning pipeline? What are the machine learning components that stack up in this pipeline to actually give me the result? I'm not sure if you're familiar with the current generative AI things, but similar principles map there. I say machine learning to be specific, because generative AI and machine learning have kind of hit a fork in the road now. But yeah, the machine learning aspect is: what part do I need to apply machine learning functionality to, to then pass it back off to what I'd like to call the bread-and-butter, meat-and-potatoes engineering aspects of still parsing, processing, and generating output for the information? And so what that looks like sometimes is, hey, I've got 1.7 billion users, right?
And I want to suggest them a group. What can I look at? Well, maybe they've got some terms they use in public-access content. And I can do basically an unsupervised learning step where I just match up keywords that are commonly placed together. The output of that is tuples of two to three keywords, or word pairings. And you actually can't make any sense of that yet, but you now know that people talk about the words dog and chicken a lot in a post, right? That's the thing the clustering algorithm told you: dogs and chicken, they're doing something there. So now we have that raw, new piece of information, and we're like, okay, we know that we want to look for posts that say dogs and chicken. So we then actually run it through some SQL, where we're querying for posts or information that specifically use both the terms dog and chicken simultaneously. And then we feed it through yet another machine learning algorithm to understand, okay, of these phrases that we've now isolated, how can we piece out sentences that actually make sense? And then finally we take the sentences that make sense and feed them back into our usual ranking algorithm: what's getting the most engagement? We score that against engagement, and then finally, boom, you've got the output you expected. Two out of four of those stages were machine learning. The other two stages were just, again, use of the machine learning output. And so it's that fine-tuning, where you build a machine learning algorithm, apply it, and build a sense of confidence around that model, that becomes the crux, at least, of my application of it.
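
The four-stage flow Jib describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: plain co-occurrence counting stands in for the unsupervised clustering step, the "make sense of the sentences" stage is skipped, and the sample posts, stopword list, and engagement scores are all hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sample posts: (text, engagement score).
posts = [
    ("my dog chased a chicken today", 40),
    ("chicken recipes for dinner", 15),
    ("the dog and the chicken are friends", 90),
    ("new dog park opened", 25),
]

STOPWORDS = {"a", "the", "and", "are", "for", "my", "new"}


def keywords(text):
    """Return the set of non-stopword tokens in a post."""
    return {word for word in text.lower().split() if word not in STOPWORDS}


# Stage 1 (stand-in for the unsupervised step): count keyword pairs
# that appear together in the same post.
pair_counts = Counter()
for text, _ in posts:
    pair_counts.update(combinations(sorted(keywords(text)), 2))

top_pair, top_count = pair_counts.most_common(1)[0]

# Stage 2 (the plain query step): keep only posts containing both keywords.
candidates = [
    (text, score) for text, score in posts if set(top_pair) <= keywords(text)
]

# Final stage: rank the surviving posts by engagement, highest first.
ranked = sorted(candidates, key=lambda item: item[1], reverse=True)
```

Only the first stage here plays the role of a learned component; the filtering and ranking are ordinary data processing, which matches the point that half of the stages are not machine learning at all.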

Will Vincent 9:59
Well, that's one of the... we often make the point that people talk about internet scale with some things. And if you're working at Facebook with billions of users, that counts. A few times over, it counts. So, wow.

I don't want to get too off track, but that's all super interesting. And yeah, we could talk about that the whole time, but we're going to talk about other things. So Carlton, go ahead.

Carlton Gibson 10:24
Okay, let's go ahead and move on to MongoDB. So tell us about what you do at MongoDB. You're a senior software engineer there.

Jib Adegunloye 10:33
So, like I said, I'm a senior software engineer over at MongoDB, and if I were to describe my role right now, it's, on paper, clear-cut. I work on the database experiences Python team, also known as the drivers Python team, or Python drivers team. And what a driver is, essentially, is the library or set of mechanisms that allows a user, namely a developer, to easily connect to their MongoDB database instance. So in this case, our mainstay, our core library, is PyMongo, which is the Python driver for MongoDB, right? Or the MongoDB driver in Python.

Carlton Gibson 11:28
As a Python developer, that's what I install: I pip install pymongo when I want to get going.

Jib Adegunloye 11:32
Exactly. And so the mainstay thing we focus on and maintain is that library. There are several subsidiary libraries that we either contribute to heavily or maintain: MongoEngine, which we contribute to, or, as an example of one we maintain ourselves, PyMongoArrow. Additionally, there are several others, especially in the AI/ML space, like LangChain, where we've had several contributions and built a relationship and rapport. And then we've also got the web frameworks that we've done our best to integrate with, namely things like FastAPI, Flask, and now Django. So we try our best as a team to integrate there.

Carlton Gibson 12:23
Okay, so that's super. I've got to ask, because, hang on, querying Mongo is nothing at all like using the Django ORM. So what's going on here? How on earth are you integrating with the Django ORM?

Jib Adegunloye 12:39
Great question. So one thing I've been saying, and I'm actually eager to get checked on it, is that when I've viewed Django, I've always been like, oh, this is a SQL-like framework. That's what it is. But when you start piecing apart the layers under the hood, you realize that the way the system works is: yes, there's the Django QuerySet API, the framework componentry that allows you to actually issue queries to the database. At some point that gets converted to explicit SQL, Structured Query Language, and then that gets fed to the database. What you realize is, okay, so there's a set of API calls that actually generate the SQL. So technically, if I just remove the parts that generate the SQL and replace that with MQL, the MongoDB Query Language, it should work, right?

Carlton Gibson 13:45
I've got to... you just said, technically, if I just replace. Yeah, sure, sure, sure, we'll just do the Mongo query language in there and carry on.

Jib Adegunloye 13:57
Yeah, of course, I'm definitely being more laissez-faire in my explanation. But to be more specific: let's say you're doing a standard lookup query. Say I want something that starts with, I don't know, the letter Z, right? Forgive my SQL here, but essentially you'd be doing SELECT whatever FROM the database table WHERE it starts with the letter Z. That's how it would be structured in SQL, right? So for us, when you make that lookup call, instead of calling the conversion method as_sql, we have embedded a callback that mirrors that same behavior, where instead of calling as_sql, we're calling as_mql. And this as_mql is essentially taking that step where it sees the startswith and produces our MongoDB analog. We actually do have a predicate that lets you regex match against the beginning of a string. So that's the one mini component. But the thing that wraps it is that we've created our own customized compiler, because what Django is actually speaking to is a SQL compiler. And so we've made our customized SQL compiler with an embedded Mongo query generator, so that at the same steps where you would call the SQL WHERE clause to generate your SQL, we call that WHERE and it starts generating the syntactic equivalent of a WHERE in MQL. And then there's the stuff that's wrapping it, similar to how it's wrapped in Django, where I was talking about SELECT * FROM. That's standard information that you get from having the model exist. You already know what you're selecting from: that's the model you defined. And you already know what your columns are, because you defined the model fields. So if you want x field, we already know that by virtue of having the model.
So it's very easy for us to now construct our MQL equivalent and say, all right, we don't have a SELECT * FROM, but we do our $match stage on the startswith, and then we do what's known as a $project stage to say, actually, just give me the specific fields that I specified. And so $project in MQL works in a very similar way to SELECT, you know, to SELECT * FROM.
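
As a rough illustration of the translation Jib describes, here is a minimal Python sketch that turns a Django-style `startswith` filter into a `$match` stage followed by a `$project` stage. Everything in it is an assumption for illustration: the `LOOKUPS` table and `build_pipeline` function are hypothetical names, and the real backend does this work inside Django's compiler machinery through `as_mql` methods and may emit different operators.

```python
import re

# Hypothetical lookup table: lookup name -> function building an MQL condition.
# These names are illustrative and are not the backend's internals.
LOOKUPS = {
    "exact": lambda value: value,
    "gt": lambda value: {"$gt": value},
    # Anchor the regex at the start of the string, escaping special characters.
    "startswith": lambda value: {"$regex": "^" + re.escape(value)},
}


def build_pipeline(filters, fields):
    """Translate Django-style filter kwargs into a $match + $project pipeline."""
    match = {}
    for key, value in filters.items():
        # "title__startswith" -> field "title", lookup "startswith".
        field, _, lookup = key.partition("__")
        match[field] = LOOKUPS[lookup or "exact"](value)
    # The model already knows its fields, so the projection comes for free.
    project = {name: 1 for name in fields}
    return [{"$match": match}, {"$project": project}]


pipeline = build_pipeline({"title__startswith": "Z"}, ["title", "year"])
```

A list shaped like `pipeline` is what PyMongo's `collection.aggregate()` accepts, so the ORM call and the database call can stay decoupled in the way described above.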

Carlton Gibson 16:49
Right. And the ORM has only(), where you can select specific fields. I guess you've implemented that as well?

Jib Adegunloye 16:58
Yes. And so what this leads to is an almost completely obfuscated layer of MQL that a Django developer does not need to know in order to get work done, right? And we've understood that this has been a plight for NoSQL developers for some time.

Carlton Gibson 17:23
Well, I mean, there have been attempts at this, right? Based around the Google App Engine datastore, originally. I can't remember what it was called, but it kind of worked, but only kind of. Whereas you've gone a step further, right? You've got pretty much all of it working now, right?

Jib Adegunloye 17:43
Yeah. And it's great you brought up the Google App Engine, because I would say that, one, that was a product of almost 10 years ago now, right? A lot's happened in 10 years. At that time, MongoDB was, I believe, operating on a server version below 4. We are currently trailblazing on MongoDB server version 8. We have a wealth more querying predicates that mirror a lot of the things and expectations over in SQL land, but we too have built a unique set of things that have made us a significantly more powerful database than the one the implementation strategies of 10 years ago were working with. And then, moreover, those implementation strategies were using basically a different paradigm. Right now we're leveraging the unique power of the MongoDB aggregation pipeline. I'm not sure if folks are familiar, but just to walk through it: the aggregation pipeline is not necessarily new, but it's MongoDB's premier service for creating very intuitive and complex queries that can really stand up to production-level work. And so how it works is, you've heard me say the term predicate, right? So essentially, there are these aggregation operators and this aggregation pipeline, and I want you to imagine an assembly line of instructions, the traditional assembly line, right? The MongoDB aggregation pipeline works kind of like a traditional assembly line, where each predicate is its own isolated augmentation onto the collection or the database that you're working against, and so you can quite literally intuit what's going to happen next, because it's procedural, right? So, using that starts-with-Z example... actually, let's say I want to find something super complex. Like, I want to find every movie that made over $2 billion in the last 12 years, right?
You could spend a bunch of time trying to think about, how do I do this in an increasingly nested structure? Or you could say, first, let me match on every movie, right? Then I want to group every movie by its total sales. Great, that's one step. Now that I've grouped every movie by its total sales, I want to do a greater-than operation on 2 billion: are they greater than 2 billion? Great. Now I've found all the ones greater than 2 billion. And now let's say I want to do something like rank them. That's a third procedural step. All of these are isolated buckets that can be swapped around, interchanged, and tested in real time.

Carlton Gibson 20:47
And that kind of fits with how, when you're using a Django QuerySet, you add a filter() call, and under the hood it's doing an add_q, which is exactly the same kind of predicate within.

Jib Adegunloye 21:01
So that's why I've been championing the point that Django isn't actually a SQL framework.
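
The assembly-line idea from the movie example can be sketched as a list of stages, each consuming the previous stage's output. The pipeline below uses real MQL stage names (`$match`, `$group`, `$sort`), but the evaluator is a toy in-memory stand-in written for illustration, not MongoDB itself, and the sample data, field names, and the fixed cutoff year are all hypothetical.

```python
from collections import defaultdict

# Hypothetical sample data: one document per sales record.
sales = [
    {"movie": "A", "year": 2019, "gross": 1_500_000_000},
    {"movie": "A", "year": 2020, "gross": 800_000_000},
    {"movie": "B", "year": 2018, "gross": 900_000_000},
    {"movie": "C", "year": 2005, "gross": 2_500_000_000},
    {"movie": "D", "year": 2022, "gross": 2_100_000_000},
]

# Each stage is an isolated step that can be reordered or tested on its own.
pipeline = [
    {"$match": {"year": {"$gte": 2013}}},                         # recent records
    {"$group": {"_id": "$movie", "total": {"$sum": "$gross"}}},  # total per movie
    {"$match": {"total": {"$gt": 2_000_000_000}}},                # over $2 billion
    {"$sort": {"total": -1}},                                     # rank them
]

OPERATORS = {"$gt": lambda a, b: a > b, "$gte": lambda a, b: a >= b}


def matches(doc, condition):
    """Check a document against a {field: {operator: operand}} condition."""
    return all(
        OPERATORS[op](doc[field], operand)
        for field, ops in condition.items()
        for op, operand in ops.items()
    )


def run_stage(docs, stage):
    """Apply a single pipeline stage to a list of documents (toy subset)."""
    ((name, spec),) = stage.items()
    if name == "$match":
        return [doc for doc in docs if matches(doc, spec)]
    if name == "$group":
        key = spec["_id"].lstrip("$")
        out_field, accumulator = next(
            (k, v) for k, v in spec.items() if k != "_id"
        )
        source = accumulator["$sum"].lstrip("$")
        totals = defaultdict(int)
        for doc in docs:
            totals[doc[key]] += doc[source]
        return [{"_id": k, out_field: v} for k, v in totals.items()]
    if name == "$sort":
        ((field, direction),) = spec.items()
        return sorted(docs, key=lambda d: d[field], reverse=direction == -1)
    raise ValueError(f"unsupported stage: {name}")


def run_pipeline(docs, stages):
    """Feed the documents through every stage in order, like an assembly line."""
    for stage in stages:
        docs = run_stage(docs, stage)
    return docs


result = run_pipeline(sales, pipeline)
```

Because every stage only sees the output of the one before it, dropping the final `$sort` or inspecting the documents after `$group` is a one-line change, which is the "swap around and test" property described above.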

Carlton Gibson 21:08
Oh, good. I love that

Will Vincent 21:10
That sounds like a conference talk for DjangoCon US. It's in September in Chicago.

Jib Adegunloye 21:19
Yeah. And the team will be at DjangoCon US, and we will actually also be giving a talk at DjangoCon Europe. Which actually jumps into my third point about why this is different. I hate to put it out there so plainly, but in the past solutions, MongoDB may have been a passive member, or we may have helped influence the solution. But in today's iteration, MongoDB has put its full force behind it. We've got dedicated engineers working on it, and we've enlisted the consulting power of Tim Graham, a former Django Fellow, so all of our decisions are couched in both Django specialization and MongoDB specialization. So there will be times where we truly come together as a team, butt heads, and have a real discussion about, okay, in this part of the implementation, are we more Django here? Are we more MongoDB? Or can we develop, right? And because we're now approaching it from two deeply specialized angles, we're able to say more concretely whether or not a decision will stand the test of time, because we know we've got somebody here, two people, in fact, who can speak very acutely to what a typical Django developer looks for. And then we've got employees who deal with MongoDB every day. And so I think in that relationship we've formed something richer, and we've folded it into our quarterly plans, our yearly plans. It is something that we're committed to continually iterating on.

Carlton Gibson 23:11
Well, that's super important. I think the trouble with third-party backends historically has been: who's maintaining them? Who's developing them? Where's the engineering time to make sure they keep up with the changes in the ORM as Django evolves? For Mongo to allocate that engineering time to it gives us some degree of confidence that it will continue to evolve, compared to, say, other backends from companies that we won't bother naming.

Will Vincent 23:41
Can I ask about the... so now there's support for the three big web frameworks, right? There's already support for Flask and FastAPI, and now Django. That's correct, right? How much overlap is there? Because there's the Python space and then the web space. Can you reuse the Python stuff at all? Or are the drivers completely different for each of the three?

Jib Adegunloye 24:08
So they all use the same PyMongo driver; they all leverage that same core driver. So in terms of that, there's definitely overlap. But in terms of how we've viewed or treated each one, I would say that, from my experience, Flask and FastAPI are closer in wheelhouse, based on their implementation. Flask is the least opinionated, as we know: hey, you just want to start something up, here you go, right? FastAPI has a bit of an opinion; specifically, Sebastián's template generator is a little more opinionated, because it helps you scaffold things. But in terms of letting you choose your own adventure, FastAPI still allows that leverage. Django is very much on the other end. Well, the other two don't have the ORM, right?

Will Vincent 25:09
But they have SQL. Yeah, I guess that'd be a question. You know, SQLAlchemy. I mean, okay, they don't have their own thing, but with Flask, certainly, you use SQLAlchemy in most of the cases I'm familiar with. Or disagree with me, please, if that's not the case.

Jib Adegunloye 25:26
Oh no, no, no, you're right. I'd say the thing is, the usage of the ORMs is not as tightly coupled. And because they're not as tightly coupled, for us the decision is that it's just better to make it easy to use or leverage our driver directly. Whereas in Django, that's kind of antithetical to the experience, right?

Carlton Gibson 25:52
Yeah, no, historically, that's it. You've got your Django view, and then all of a sudden you're bringing in the library for whichever NoSQL database you're using, and it's like, well, hang on, this just doesn't feel like Django. The whole point of integrating with the ORM is that it will still just feel like Django.

Jib Adegunloye 26:12
Exactly. And that's where we understood the thesis was different, right? When you go to Django, you're not just there for something that allows you to spin up a website. You're there because you want a solid authentication system. You want solid administrative management. You want solid session management. You want to know that when you build a form, it's all automagically able to take the information and write it to the database, in such a way that you're not thinking about this 24/7. In the prior solutions, those things are still going to be somewhat non-trivial. And that's not to their detriment. It's just...

Carlton Gibson 26:58
You're building it yourself, right?

Jib Adegunloye 27:03
You're building it yourself. And we understood that in Django, if we're making users build it themselves, then we're kind of going against what Django says on its website.

Carlton Gibson 27:15
So I have to ask, are there corners yet that you aren't happy with, that aren't finished?

Jib Adegunloye 27:24
Absolutely. If there's one thing I stand by, it's this. One, we're in beta. But two, even after we go for our general availability release later this year, there's still going to be more work to be done, because at the end of the day, fundamentally, SQL and NoSQL... there's a reason it's called NoSQL. But to answer the question in the present: there are a number of MongoDB-specific things that we are aching to get in. We're just trying to get them in the right way. So one example: in a traditional SQL structure, you're looking at the usage of foreign keys. That's second nature, right? We also support that ability. If you want to use your foreign keys, go ahead. But we understand that what makes MongoDB performant is moving away from that paradigm and using what we call embedded documents. In this library, they're known as embedded models, and they're viewed as embedded model fields. The power of MongoDB is that you can have a document with hundreds or thousands of subdocuments in it, and then query against that. This is not something that's really been done in SQL. I know there's hstore, there are sub-objects and things like that, but the degree to which it's done in MongoDB is a little bit different. And so we've been working on ways to get this nested embedded model structure to be airtight, but also work very intuitively in the Django ecosystem. That's one area that we're actually barreling down right now, and we are absolutely set on making sure that it is a fluid experience for folks. There are other things where we've run into corner-case situations. Usually, if it's just an issue around SQL, then we ignore it. Like, well, I can't support these SQL functions. It's okay.
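
To make the foreign-key versus embedded-document contrast concrete, here is a small Python sketch of the two document shapes and a dot-notation filter over the embedded one. The documents and field names are hypothetical, and `values_at` and `any_gte` are a toy approximation of how a dotted path fans out over an array of subdocuments, not MongoDB's query engine or the backend's embedded model API.

```python
# Foreign-key style: two separate documents linked by an id.
author = {"_id": 1, "name": "Ada"}
book_referencing = {"_id": 10, "title": "Notes", "author_id": 1}

# Embedded style: the related data lives inside the parent document.
book_embedded = {
    "_id": 10,
    "title": "Notes",
    "author": {"name": "Ada"},
    "reviews": [
        {"user": "x", "rating": 5},
        {"user": "y", "rating": 2},
    ],
}

# An MQL-style filter using dot notation into the subdocuments.
query = {"reviews.rating": {"$gte": 4}}


def values_at(doc, path):
    """Collect every value reachable at a dotted path, fanning out over lists."""
    current = [doc]
    for part in path.split("."):
        next_level = []
        for item in current:
            candidates = item if isinstance(item, list) else [item]
            for candidate in candidates:
                if isinstance(candidate, dict) and part in candidate:
                    next_level.append(candidate[part])
        current = next_level
    return current


def any_gte(doc, path, threshold):
    """True if any value at the dotted path is >= threshold."""
    return any(value >= threshold for value in values_at(doc, path))


(path, condition), = query.items()
book_matches = any_gte(book_embedded, path, condition["$gte"])
```

With the embedded shape, one read returns the book, its author, and its reviews together, whereas the foreign-key shape needs a second lookup for the author.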
But, yeah, I'd say we're running into things where we want to improve more, iterate, and introduce more MongoDB-specific aspects. And as we are personally going through making sure those work, we're also hoping that during this beta phase, people tell us what isn't working, or what they find to be weird, so we can immediately capture those.

Carlton Gibson 30:07
This episode is brought to you by Buttondown. That's buttondown.com. Email software for developers like you. There are hundreds of email marketing software services out there, and they will pretty much offer the same thing: collect and clean addresses, send out broadcasts or drip campaigns, get analytics so you can see what's resonating and what's not. Buttondown is designed to hook into the tools that you already care about, everything from static site generators like Jekyll or Hugo to payment platforms like Stripe and Memberful. You can hook your site up to Buttondown with just a form element or a simple REST call, write emails in Markdown, and then get on with the actual work you're supposed to do. New customers can save 50% off their first year with Buttondown using the coupon code DJANGO, and if you email support, they'll white-glove migrate your existing subscribers and archives for free.

Will Vincent 30:52
I saw this in the docs. I know async is, you know, a new frontier in Django, and I believe there isn't really Mongo support for that yet? Is that correct?

Jib Adegunloye 31:03
Yeah, there isn't Mongo support for that yet, but that is also an area that we want to introduce. In our driver, we are also currently improving our asynchronous functionality support. So alongside this, you'd see an improved and richer version of asynchronous functionality in the baseline driver, and we hope it complements Django as well. So we aim to have support. But for now, we figured it's best to get that bread-and-butter synchronous ability good to go. And as we push towards general availability, where more production use cases are leveraged, we want to be confident about saying, hey, we can support your asynchronous case as well.

Carlton Gibson 32:06
what was the hardest thing, you know? What was the bit where, what was the moment in the project where you're like, Oh, this isn't gonna work.

Jib Adegunloye 32:17
Oh, yeah. So I think there are actually so many of these. Let me start off with what was easy. What was actually shockingly easy was doing that initial replacement of the SQL with MQL. It was like, hey, here are the lookups, and here are the functions that the lookups will generate instead, right? It's quite literally a dictionary. Where things got very interesting was when we had to start supporting things like annotations, or supporting our version of grouping, supporting GROUP BY in a MongoDB-specific way. Because, like I said, we have a very procedural, procedurally generated SQL statement, I mean, MQL. It's an array of Python objects, if you want to think about it like that. And so one big thing is, in order for a later stage to refer back to something that happened at, let's say, step number four, we need to capture metadata at that point and then make sure that the metadata hasn't been mutated too much from step number four to step number seven, to ensure that the output is the same. And then another challenging one was that MongoDB, at the database layer, handles nulls differently, right? So where Django would expect a null value, MongoDB is like, bro, we don't have the value at all. We're not beholden to that. And so we would run into those quirks where it's like, okay, what do we need to do? How do we get smart about this? And how do we really get things working? And then, honestly, personal to me, one of the biggest necessary headaches: tests.
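
The null-handling quirk Jib mentions comes down to a document being able to omit a field entirely, which a SQL row cannot do with a column. The Python sketch below shows the three cases using plain dictionaries; the sample documents and helper names are hypothetical, and treating a missing field like NULL is shown as one possible reconciliation, not as a statement of what the backend actually does.

```python
# Hypothetical documents from one collection.
docs = [
    {"_id": 1, "name": "a", "nickname": "x"},
    {"_id": 2, "name": "b", "nickname": None},  # field present, value is null
    {"_id": 3, "name": "c"},                    # field absent entirely
]


def is_null_sql_style(doc, field):
    """SQL-style IS NULL: an absent field is treated the same as a null one."""
    return doc.get(field) is None


def is_explicit_null(doc, field):
    """True only when the field exists and holds null."""
    return field in doc and doc[field] is None


def is_missing(doc, field):
    """True only when the document has no such field at all."""
    return field not in doc


sql_style_ids = [d["_id"] for d in docs if is_null_sql_style(d, "nickname")]
explicit_null_ids = [d["_id"] for d in docs if is_explicit_null(d, "nickname")]
missing_ids = [d["_id"] for d in docs if is_missing(d, "nickname")]
```

In MQL terms, a filter of `{"nickname": None}` behaves like the first helper and matches both cases, while `{"nickname": {"$exists": False}}` corresponds to the last one.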

Carlton Gibson 34:24
I was gonna ask you about tests.

Jib Adegunloye 34:26
So the first thing we ran into with the tests was, well, when the Django test suite runs its tests, clearly it's going to check against an AutoField, an integer, as its key. MongoDB traditionally uses an ObjectId as the primary key, which is, let me not get the bit encoding wrong, but basically it's a specialized unique ID. And whenever you submit, let's say, a document, or a row in this case, we don't usually supply that ObjectId on creation, because we'll automatically generate an ObjectId and then submit it. And so what would happen in the tests is the test case would expect, hey, this object that's coming back should have the ID of 1, 2, 3, or 4. And instead we'd have the ObjectId of something like 1a3735, like, crap, right? And so we had to basically fork the tests to override the natural test structure, to allow us to just use ObjectIds, or to kind of fake-use a different ID implementation. And the thing is, this sounds, again, like a hand-wavy and easy solution, but when there are hundreds, thousands of tests that run into this minute issue, and you're trying to debug whether it's this thing that we've identified or something completely different, or whether it's something we can even solve at all, it is something where you have to quite literally look at each test and deduce it yourself. For me, it felt like the most daunting thing, because I had to look at such a tenured framework's test cases and say, okay, we're going to try for this, this, this, this, right? And then present that to the team. And then have Tim, who's extremely well informed, say, actually, no, not this one, not this one, not this one. It felt like, is this ever actually going to end?
And thankfully, it did, like we managed to triage every single test, figure out the ones that don't necessarily matter for the sake of what we're doing and the ones that we absolutely want to make sure matter based on this for I think it comes out to the number of like, 82 test suites with what you know well into the hundreds in terms of tests. But once everything went green, that was honestly like Christmas day for me. Yeah, it objectively works.
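
The mismatch Jib describes can be illustrated with a minimal Python sketch. Everything here is hypothetical: `FakeCollection` and `fake_object_id` are stand-ins invented for illustration, not part of Django or of the MongoDB backend, and the 24-character hex string only imitates the shape of an ObjectId.

```python
import itertools
import secrets


def fake_object_id() -> str:
    """Stand-in for an ObjectId: 12 random bytes rendered as 24 hex characters."""
    return secrets.token_hex(12)


class FakeCollection:
    """Hypothetical in-memory store that assigns a key on insert, as a database would."""

    def __init__(self, id_factory):
        self._id_factory = id_factory
        self.rows = {}

    def create(self, **fields):
        pk = self._id_factory()
        self.rows[pk] = fields
        return pk


def brittle_check(collection) -> bool:
    # Hard-codes the assumption that the first row gets integer key 1.
    pk = collection.create(name="first")
    return pk == 1


def portable_check(collection) -> bool:
    # Relies only on whatever key the store handed back.
    pk = collection.create(name="first")
    return collection.rows[pk]["name"] == "first"


int_backed = FakeCollection(itertools.count(1).__next__)  # 1, 2, 3, ...
oid_backed = FakeCollection(fake_object_id)

print(brittle_check(int_backed), brittle_check(oid_backed))
print(portable_check(int_backed), portable_check(oid_backed))
```

The brittle check only holds for the integer-keyed store, while the portable check holds for both, which is roughly why each affected test had to be looked at individually.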

Carlton Gibson 37:37
And how are you keeping your test suite up to date with changes that come into Django's main branch, the development branch?

Jib Adegunloye 37:43
So right now there are two sort of things. One, because Django's branching is like 5.1, 5.2, 4.2, we've got our individual branches, like four branches, that keep up with whatever changes may manifest on those specific branches. And clearly, for main, we also do our best to test against that. So in terms of how we keep that up: hey, we check with 5.1. Are we in alignment with the 5.1 tests? We align with the 5.1 changes, great. Are we aligned with the 5.0 changes, the 5.0 tests? Great. With main, we see something's been added. Is this something that we want to get ahead of now? Yes. And so we even have testing coming in in later bits, just to continually make sure that we are up to date. And then we again triage the issue to understand whether it's something we need to tackle now, or something we can just skip, and things of that nature.

Carlton Gibson 38:48
I think that's the key. Historically, back ends have sort of matched the Django test suite, maybe at a point in time, but then they just haven't been able to keep the engineering going to keep it up to date. Yes, Django keeps moving, right, all the time, and so it's that bit: yeah, we need to keep bringing in the new tests, keep bringing in the new features as they're developed, to the extent it matches the data model.
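
The triage step described above (run it now, or skip it with a reason) can be sketched as a small lookup table. This is a hypothetical illustration, loosely similar in spirit to the reason-to-tests skip mapping that Django's database feature classes carry; the test names and reasons below are made up.

```python
# Hypothetical triage table: reason -> dotted paths of upstream tests to skip.
TEST_SKIPS = {
    "Test asserts an integer primary key": {
        "app_a.tests.ModelTests.test_pk_is_one",
    },
    "Feature does not map onto the document model": {
        "app_b.tests.QueryTests.test_raw_sql",
    },
}


def skip_reason(test_id, skips=TEST_SKIPS):
    """Return the recorded reason for skipping a test, or None if it should run."""
    for reason, test_ids in skips.items():
        if test_id in test_ids:
            return reason
    return None


def triage(upstream_tests, skips=TEST_SKIPS):
    """Split upstream test IDs into those to run and those skipped (with reasons)."""
    to_run, skipped = [], {}
    for test_id in upstream_tests:
        reason = skip_reason(test_id, skips)
        if reason is None:
            to_run.append(test_id)
        else:
            skipped[test_id] = reason
    return to_run, skipped


upstream = [
    "app_a.tests.ModelTests.test_pk_is_one",
    "app_a.tests.ModelTests.test_save",
    "app_b.tests.QueryTests.test_raw_sql",
]
to_run, skipped = triage(upstream)
print(to_run)
print(skipped)
```

Keeping one such table per supported Django branch is one way to notice when a newly added upstream test needs a decision.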

Will Vincent 39:17
Yeah, well, you can continue. I was just saying, I just realized this is now, I think, the third episode we've done related to MongoDB, because we've had two developer advocates before. We had Mark Smith, who's based in the UK, two years ago, I think, and then we had Aaron Bassett before that, who I don't think is at MongoDB anymore, but he was on the Django board with me. So this is a rich vein of discussion.

Carlton Gibson 39:46
When we had Aaron on, this idea of a native integration with the Django ORM was no more than a twinkle in somebody's eye.

Will Vincent 39:55
Yeah. I mean, that was almost five years ago, yeah. I think there was a third, there was a, it wasn't anything official, I don't think.

Carlton Gibson 40:02
Okay, sorry, go ahead. I've got one more question: migrations. How does that look? Because MongoDB is schemaless, right? NoSQL is schemaless. So how do migrations fit in? Do I just create them as usual, run them as usual, and it's all magic?

Jib Adegunloye 40:19
Yes. So one, yes, I'll say this.

Will Vincent 40:23
Next question.

Jib Adegunloye 40:30
But no, it's a great question. So again, MongoDB is schemaless. I will say that, one, having a lack of a schema does not mean that you don't need to have a schema. As a person who's done a lot of database work: please, always, as you're codifying your work, establish a schema, because you don't want developers or anybody saying, oh, there's no schema, I'm just going to throw in a random key every now and again. And then, yeah,

Will Vincent 40:58
that's later down the road.

Carlton Gibson 41:01
Yeah, I've got bad memories from about 2000

Jib Adegunloye 41:07
But yeah, the way migrations work right now is you quite literally call it, and it'll create your database, it'll create the indexes that you specify. Right now we've chosen not to enforce a schema. Again, we're in beta, and most people understand the rapid prototyping and development of MongoDB, so enforcing the schema at this stage feels like we may be jumping the gun, right? But we still recognize the beauty that comes from having something called migrations. One, it's now codified what your schema is, regardless of whether or not it's enforced at the database layer. And two, when you make alterations, migrations will also capture those alterations and make them as necessary on the fly, even going so far as supplying whatever the necessary default is. So in terms of migrations, it works. It's just not enforcing schema, but that does not remove the level of importance we understood and found in it.

Carlton Gibson 42:27
And so that's something I could do at a later date or not, but one could do at a later date.

Jib Adegunloye 42:31
And even now, we do have ways, we like to call them escape hatches, where if there is something super unique to MongoDB that, at its current iteration, we haven't exposed through the Django feature set, we've documented a way to just grab the underlying driver, or the underlying client connection that's used. Then you can configure the client right there, directly, and go right back to your Django development, and it'll propagate, because you're using the same client configuration, the same connection. Yeah.

Carlton Gibson 43:11
And like, you know, even in normal Django land, you occasionally grab the connection and grab the cursor,

Jib Adegunloye 43:18
Yeah. And so if somebody does want to enforce a schema, we've documented how you can go do that: grab the client, use the $jsonSchema validation, and you're good to go.

Carlton Gibson 43:33
Wow, you've answered all my questions. I'm just itching to go and play now. Thank you. Will?
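
For readers who want a picture of the schema-enforcement escape hatch Jib mentions, here is a minimal sketch of building a `$jsonSchema` validator and applying it with MongoDB's `collMod` command. The helper names, the field names, and the `RecordingDatabase` stand-in are assumptions made for illustration; how to obtain the real database handle from the Django connection is covered in the backend's documentation and is not shown here.

```python
def build_validator(required, types):
    """Build a $jsonSchema validator document from simple field specs."""
    return {
        "$jsonSchema": {
            "bsonType": "object",
            "required": list(required),
            "properties": {
                name: {"bsonType": bson_type} for name, bson_type in types.items()
            },
        }
    }


def apply_validator(database, collection_name, validator):
    # `database` is expected to behave like a PyMongo Database object.
    # collMod changes options, including the validator, on an existing collection.
    return database.command("collMod", collection_name, validator=validator)


class RecordingDatabase:
    """Hypothetical stand-in so the sketch runs without a MongoDB server."""

    def __init__(self):
        self.calls = []

    def command(self, name, value, **kwargs):
        self.calls.append((name, value, kwargs))
        return {"ok": 1.0}


validator = build_validator(
    required=["title", "pages"],
    types={"title": "string", "pages": "int"},
)
db = RecordingDatabase()
result = apply_validator(db, "library_book", validator)
print(db.calls[0][0], db.calls[0][1], result)
```

With a real PyMongo database in place of the stand-in, documents that violate the schema would then be rejected by the server rather than by Django.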

Will Vincent 43:40
Well, I mean, what should we be asking you? I know there's a whole lot of press that's just come out, and we've touched upon a number of things. Is there anything we haven't mentioned that you're specifically excited about, or think people should know about, with this new driver?

Jib Adegunloye 43:59
Yeah, that in and of itself is a great question. So first, I think one of the biggest things that I'm really proud we put out in this iteration is what's called raw aggregate. Now, I understand there's literally a predicate for aggregation in SQL. This raw aggregate is harkening towards the MongoDB aggregation pipeline, right? So if you remember .raw() in Django, which allows you to pass a string that has structured SQL in it, we've made our own analog. But it's not just a string. It's a list of Python dictionaries, and in this list of Python dictionaries you can quite literally construct a normal MongoDB aggregation pipeline. And similar to how .raw() works, it will give you a Django queryset object: you get a queryset object back from a query you've done by basically executing raw MQL. And I think that's a really powerful thing, because no other implementation has done that. No other implementation has really looked through the API for all of its richness and said, hey, this is something that would be really powerful to use, especially at this stage. And the reason it's even more powerful is that MongoDB has things that come natively, right? We've got vector search, we've got our full-text search capabilities, we've got our geospatial queries, and we've got even more interesting search predicates, like $graphLookup and things like that. By using that raw aggregate framework, you're immediately able to interface with that and still get Django like you'd expect, and use it very seamlessly in the flow. In the future, we definitely want to improve upon this and potentially have some more integrated solution for search and geospatial.
But for now, it's a great way, if you want to do more advanced things that you understand MongoDB can do really well. You can. Yeah.
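
The "list of Python dictionaries" Jib describes might look like the sketch below. The stage operators (`$match`, `$sort`, `$limit`) are standard MongoDB aggregation stages; the field names, the `Book` model, and the exact manager call in the trailing comment are assumptions for illustration and should be checked against the backend's documentation.

```python
def recent_in_genre_pipeline(genre, limit):
    """Build an aggregation pipeline as a plain list of dictionaries."""
    return [
        {"$match": {"genre": genre}},    # keep only documents in this genre
        {"$sort": {"published": -1}},    # newest first
        {"$limit": limit},               # cap the number of results
    ]


pipeline = recent_in_genre_pipeline("fantasy", 5)
for stage in pipeline:
    print(stage)

# Assumed usage, not verified here: a model whose manager exposes the
# raw aggregate feature would receive the pipeline roughly like this.
# books = Book.objects.raw_aggregate(pipeline)
```

These particular stages filter and order documents without reshaping them, which keeps each result recognizable as a model instance; stages that change the document shape would need more care.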

Carlton Gibson 46:30
I think that's a nice selling point. One of the complaints about an ORM, a multi-database ORM in particular, is that it ends up being the lowest common denominator. That's the criticism, because you don't get the advanced features of database A, and you don't get the advanced features of database B, and you don't get the advanced features of database C. So to have made space to say, you know what, if you're using Mongo, you can still reach out for those advanced features, that's a nice addition. Of course, you can always get the driver, you can always do the things, but if it's nicely exposed in the API, that makes it all the more enticing.

Jib Adegunloye 47:05
Right. And then the second thing: I think one big question I remember getting asked at DjangoCon US last year was, hey, you've got to do joins. And I want to say very succinctly: yes, we do. We actually have introduced this operator called $lookup. It's something that's existed in MongoDB for some time, and without getting too in the weeds, that is our join. You can execute a left join, a right join, an outer join through the $lookup operation. Now, my big cautioning point is, like I said earlier, third normal form, and leveraging that, is not what makes MongoDB MongoDB. So for now, please go ahead and use it. Create a foreign key, create a one-to-one, a one-to-many field, a many-to-many field. But really, in the coming weeks and months, look forward to seeing things like nested embedded models, and we will also provide documentation on how to transform that traditional foreign key usage into a representation in MongoDB that leverages that more traditional document structure that we as MongoDB know and love. Yeah, right.
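
As a rough picture of what a `$lookup` stage expresses, the sketch below builds the stage and emulates it over plain Python lists. The stage keys (`from`, `localField`, `foreignField`, `as`) are MongoDB's own; the collection and field names are made up, and `emulate_lookup` is a simplified stand-in for the equality form, which MongoDB's documentation describes as a left outer join.

```python
def lookup_stage(from_collection, local_field, foreign_field, as_field):
    """Build a $lookup aggregation stage (equality-match form)."""
    return {
        "$lookup": {
            "from": from_collection,
            "localField": local_field,
            "foreignField": foreign_field,
            "as": as_field,
        }
    }


def emulate_lookup(local_docs, foreign_docs, stage):
    """Simplified in-memory emulation: every local document is kept, and the
    matching foreign documents are attached as a list (possibly empty)."""
    spec = stage["$lookup"]
    joined = []
    for doc in local_docs:
        matches = [
            other
            for other in foreign_docs
            if other.get(spec["foreignField"]) == doc.get(spec["localField"])
        ]
        joined.append({**doc, spec["as"]: matches})
    return joined


books = [
    {"_id": 1, "title": "First", "author_id": 10},
    {"_id": 2, "title": "Second", "author_id": 99},  # no matching author
]
authors = [{"_id": 10, "name": "Ada"}]

stage = lookup_stage("authors", "author_id", "_id", "author")
for row in emulate_lookup(books, authors, stage):
    print(row)
```

The second book keeps its place in the output with an empty `author` list, which is the behavior a foreign key traversal has to account for when the data lives in separate collections rather than in embedded documents.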

Carlton Gibson 48:40
Ultimately, the SQL model and the document model, the relational and document models, aren't the same, right? So,

Jib Adegunloye 48:48
And the third piece that I want to touch on is the third-party integrations, right? Third-party libraries. So, like we've chatted about, we're trying to make this work with Django, period. And part of Django's power is all of the libraries that have come out, things like Wagtail, django-allauth, Django REST framework, django-filter. We are currently doing the work to make sure those are just as seamless as we're making this framework integration. That is on our roadmap for the months to come. And if there is a third-party library, a third-party framework with Django, that as a developer you find important, we are all ears. We are listening, we are open, and we'd love to know sooner rather than later what those are, so we can evaluate whether or not we can support them, because we'd most likely want to. And so to conclude those three points: we're still growing, we're still evolving, we're still learning. I say this as the sort of lead engineer on this project: we are here for real. We are fully invested, with open ears, and we are a team of enthusiastic Django developers writing code for MongoDB. So don't ever think you can't just check out the MongoDB community forum, or post something in the Django forum, or even post something on Reddit in the MongoDB subreddit, and let us know. We've got folks just looking, waiting, checking, because we want to do this right, and we want to do this permanently. That's kind of the recurring motif. Yeah.

Carlton Gibson 50:50
No, I really support that, because so many companies put a Django library out and don't really do the job, and they don't support it in the medium term. And people get burnt by that. So that you're here saying that Mongo is here for the medium and long term, that's a really nice thing to hear. Yeah.

Will Vincent 51:13
I mean, I see you also on the repo. I have the repo up, and you were just yesterday putting in commits. So I think this is really great. I mean, selfishly, for Django, right, to have a powerful NoSQL story. Because people have used Mongo over the years, but there hasn't been that all-in support the way there is now. And so, obviously, hopefully Django clearly makes sense for Mongo. But just for Django to be broader, in terms of Python web, all-encompassing, it opens up so many areas. One of the big reasons why people use Flask is because they want NoSQL, and they feel like they can't do it in Django. I would say that's a top three, top five reason I hear. So that would be pretty powerful for Django, if that shifts a bit.

Jib Adegunloye 52:04
And I can even add an anecdote myself. I've been doing a project, like I said, a project I started off on MongoDB some time back with my friends, and we were like, okay, we've been using Flask, and this is not a production-level site. Let's actually get some authentication and paneling that we can stand behind. The first thing we looked towards was, okay, what does it take to move to Django? And ultimately we had to pass, because the upkeep for maintaining both MongoDB and Django at that time was not something that we, as a sort of ragtag group of developers, felt like handling. And so this really feels like a full-circle moment for me, as well as for those same friends, who I am currently messaging like, guys, look what I've done.

Will Vincent 53:01
It's happening. It's happening.

Jib Adegunloye 53:05
And so it's really a moment where we're rejoicing, like, wow, I wish we had this, you know, eight years ago.

Will Vincent 53:12
Yeah, no. I mean, again, thinking about what I just said, I think it's definitely a top three reason why people choose Flask. So why would someone choose Flask? They would say it's more lightweight and you can get started on it faster. There's been a bunch of work, not least Carlton's given talks, and there are these nanodjango repos and stuff showing that, hey, you don't have to use all the batteries of Django right off the bat. NoSQL support is a huge one. I can't think what the third is. But to leverage what we have in this community, the third-party packages, the forum, all these things. I can't speak as much to FastAPI, but with Flask, the whole point is it's a tool, in part, that you use with other things, whereas Django is an all-in-one code and community in a way that Flask is not. And that's not to take anything away from Flask. They're just different things, different tools for different jobs.

Jib Adegunloye 54:05
And yeah, I definitely resonate with that. With Django, when you're implementing something for Django, you're clearly implementing something for a community. It was my first time at a DjangoCon, and I was, you know, grinning ear to ear, just chatting with everyone. And I really, really look forward to going to another and being engaged, because I understand it's not just code that's being written. It's the impact on the communities, the workflows, how it's influencing so many things, and how even engaging with the community can help inform decisions itself. So I will say, it's a full package deal, and that's what I do really love about Django.

Will Vincent 54:56
Well, I look forward to your colleagues at DjangoCon Europe. And I just want to mention, you said something that I've done a couple of times and have been corrected on by Europeans, which is that I've referred to it as DjangoCon EU. There is a difference, apparently, between the EU and Europe, especially when we were in Scotland two years ago. So I'm attuned to that, because I was corrected on it several times. So I'll just mention that to you: it's DjangoCon Europe, not EU.

Jib Adegunloye 55:24
Thank you so much. Thank you. These things are important.

Will Vincent 55:29
As a fellow American, you know, it's like, whatever, it's less important.

Jib Adegunloye 55:37
No, thank you. That is a very good clarification. Great.

Carlton Gibson 55:44
So, Jib, I've got one question we often ask people, and I think it'd be really nice to ask you, because you've come to Django as somebody implementing a NoSQL back end for the ORM. If you had a magic wand and you could fix one thing in Django, what would it be? If Django could make one change, what would your magic wand fix? A new feature or a change?

Jib Adegunloye 56:09
Yeah, let's see. A magic wand feature fix for Django. Oh, I would say that I would really like,

Will Vincent 56:26
It's a tough one. Sorry if we sprung it on you. It's something we've gotten in the habit recently of asking guests, because, yeah, it's a tough one.

Carlton Gibson 56:38
And maybe now you've worked around all the bad things, you don't want to change anything, because you've already worked around them.

Will Vincent 56:43
Right. I mean, for example, just to get your brain going, I think Carlton and I have talked about one thing, which would be to have, when you start a new Django project, some way that you can have, like, in brackets, local or production, to kind of get over that cliff. Because it defaults to local, and then to get to production is just a lot of steps for anyone. And I think that's off-putting to people, so that seems like something that could be done. Yeah.

Jib Adegunloye 57:16
Actually, that's a great point. I'd say in that same vein, when we were creating our quick start for the Django project, I was like, how quick can I make this quick start? And I feel like there were several things. If there was a way to basically go from project to first app in one command-line argument, that would be beautiful. Because right now you startproject, and after you startproject you startapp, and then you go in, you edit the app, and then you also have to edit the main line to link URLs. So there we go, I found my answer: URL link behavior.

Carlton Gibson 58:06
Good. So there is a DEP in progress, a Django Enhancement Proposal, in process to simplify the default project and the tutorial, to have a kind of single-file thing. So startproject, and then that's already an app that you would then carry on with, create your view, and things.

Will Vincent 58:21
I mean, and to give a shout-out: Eric Matthes, who's the author of Python Crash Course, has a package he's been working on for a while, a repo called django-simple-deploy, that integrates a number of back ends, I think Heroku, Fly, Platform.sh, to try to have one place to handle all this. He's been working on that a lot over the last couple of years. There's also a package I maintain called Lithium now, that was DjangoX. It doesn't have full deployment yet, but it just gets you started: it gives you django-allauth, gives you crispy forms, gives you a couple of things so you can just get going. I would check that out. It's basically a simpler cookiecutter-django, whatever that is called. So that's been around for a while. But eventually, if there's a DEP and there's an officially supported way to do it, I think it would get over that idea in people's heads that Django is this beast that can't be tamed, when, you know, it's customizable, but we can have an easier path if you want to just get going with something. Yes, Carlton?

Carlton Gibson 59:26
I think as well, part of the DEP is the idea to promote project templates, to make them a bit more of a thing. So if you're working for MongoDB, you can create a MongoDB starter project, and people just go startproject with the link, and it's already got the right bits set up. I think that's possible now, but it's kind of hidden and a bit arcane, whereas if we can promote that a bit more, maybe it becomes normalized that each shop would have their own start project. Yeah.

Will Vincent 59:59
Yeah, and just as a last thought, it's great to see a database back end getting involved with Django. If you look at who has sponsored things historically, like conferences and other things, there have been platform providers, so Platform.sh and some other ones over the years, and then consultancies. There isn't really a big next place. So Mongo seems very well placed, just to slip in there. And also because, yeah, if we can just do away with the "NoSQL doesn't work on Django" story, that's such a huge win.

Carlton Gibson 1:00:35
And yeah, and to have, you know, as a headline, another back end for the Django ORM. You know, that's a,

Will Vincent 1:00:44
Can we finally kick out Oracle? Like, let's just swap it.

Carlton Gibson 1:00:49
You can put that to the board, Will.

Will Vincent 1:00:53
I know. Now that I'm not on the board, I can just, I was actually asking about some other stuff, I can just be like, wouldn't it be nice if someone did XYZ? All right, so we're going to link to the post and to the repo. If people do have suggestions around third-party packages and other things, what's the easiest way to reach out? You mentioned you're available on the forum, on Reddit. Should they open an issue on the project? Or what would be the preferred way?

Jib Adegunloye 1:01:17
So I'd say the preferred way right now is, if you check the repo, we've got a section called Issues and Help. The fastest response we'll give is if you provide feedback on the MongoDB community forums or file an issue, a Jira ticket, because that'll get right to developers almost immediately. Also, definitely, if the Django forum is the next most comfortable one, I'd say go there. But yeah, the MongoDB community forum and our Issues and Help section, filing a Jira ticket. Okay.

Will Vincent 1:02:00
Well, thank you. This was a pretty amazing chat, I think, and timely. I'm sorry that it's going to wait two weeks to come out, but that's okay. The other things will come out in the news, in the Django News newsletter and other things, and then people will have this to look forward to, to have more explanation.

Jib Adegunloye 1:02:18
And thank you. Thank you for having me. I feel so honored and humbled to be here.

Carlton Gibson 1:02:24
It's awesome having you on. Really, really, really good chat.

Will Vincent 1:02:27
So to wrap it up: djangochat.com, links to everything in the show notes, and we'll see everyone next time. Bye-bye.

Carlton Gibson 1:02:34
This episode was brought to you by Buttondown, the easiest way to start, send, and grow your email newsletter.