Jake is a Senior Systems Engineer at Torchbox and the author of DEP 14, django.tasks, the highlight feature in Django 6.0. We discuss his work on the Django security team, work with Wagtail, AI dabblings, and more.
Sponsor
This episode was brought to you by Buttondown, the easiest way to start, send, and grow your email newsletter. New customers can save 50% off their first year with Buttondown using the coupon code DJANGO.
Carlton (00:00)
Hello, and welcome to another episode of Django Chat, a podcast on the Django web framework. I'm Carlton Gibson, joined as ever by Will Vincent. Hello, Will. And today we've got Jake Howard, who's a Django and Wagtail superstar. How are you doing, Jake?
Will (00:07)
Hey Carlton.
Jake Howard (00:13)
Yeah, not too bad, thank you very much.
Carlton (00:14)
Well, that's good. Good. Thank you for coming on with us. I guess we should dive right into it. You're super famous now because you're the author of the whole Django tasks framework.
Jake Howard (00:23)
Yeah, my CV, my Django CV as it were, is getting longer and longer as the months tick on. So, for people that have never met me before: professionally, I'm a Senior Systems Engineer at Torchbox. Torchbox are the people who make the Wagtail content management system, which by extension is built on Django. Wagtail is used in very, very many places:
NHS, Mozilla, Google, NASA, loads and loads of people using it. Yeah, everyone's using it. On the Django side, I'm a Django Software Foundation member, but as of about this time last year, I also joined the Django security team, dealing with all of the security things. So if you've seen an "oh God, here's a security thing, I need to go and do something", it's probably my fault, sorry. On top of that,
Carlton (00:54)
my house.
Jake Howard (01:18)
I'm a new parent. I occasionally speak at conferences, and I've been sort of dabbling in the Django ecosystem for the best part of a decade, but more prominently over the last two years or so.
Carlton (01:30)
I'll be leaning into it more heavily.
Jake Howard (01:31)
Exactly, yeah. Ever since the DEP 14 stuff, which we'll get into, everything's gone from "I'm a person who works with Django" to "there are other Django developers who know my name", which feels really weird.
Carlton (01:45)
Yeah, because I mean, it's not exactly the highest-profile thing in the world, being a Django developer, is it? Normally.
Jake Howard (01:50)
As you well know, yes.
Carlton (01:52)
So go on, which do you want to pick first? Because you mentioned tasks, you mentioned the security team. I want to talk to you about both of those topics. So I'll let you choose.
Jake Howard (01:59)
I mean, the docs got tasks first. Should we start with that?
Will (02:02)
All right. Well, sorry, we're referring to these docs we use to guide the discussion. I'll say, let's talk about tasks. So one of the public-facing things is you gave a talk at DjangoCon Europe in 2024. I suspect that wasn't the start of you thinking about tasks and what we can do about things. Like, where did it start, this idea of "something should change and I should lead it"? How'd you get sucked into being the leader?
Jake Howard (02:26)
Yeah, so how I got sucked into it, I don't entirely remember. That's kind of lost to time a little bit. The story of where Django tasks came from: it sort of jumped on the scene, I would say, in early 2024, when Carlton popped up and said, hey, this thing you've been doing in Wagtail, let's put it in Django. And I went, okay, having no idea how to contribute to Django at all. I don't think even at that point I had an account on Trac, let alone knew that Trac existed.
So long, long ago, a lot of the DEP stuff started in Wagtail. Wagtail being a content management system, it does a lot of things when you click publish. For example, one of the things it might do is go and build a search index. It might yell at your CDN to purge caches. Currently, or at least at the time this was being developed, around 2022,
when you click publish, all of that stuff happened inside the request-response lifecycle. And that works and that's fine. But as a site scales, it starts to get really, really slow. And so we were looking at, okay, we want to shift some things into the background. As any developer knows, when you're trying to add in task things, you can Google "Django background tasks" and you'll come up with basically a thousand different options, and you've got to weigh them.
Wagtail had a slightly more unusual problem in that Wagtail is in itself a package, a framework you install. It can't really go, we're going to depend on Celery and everyone is going to live with that. Because if your dev team are more familiar with RQ or Django Q2 or something like that, then you are sort of forcing a developer's hand as to which framework they use. So
the original start was that we wanted to build something inside Wagtail based on something that we like, but very Wagtail-specific. Early on, we realized that's not really gonna work, and we are gonna need something that is more of an API layer around the existing ones, where you can plug and play with them and go, okay, Wagtail just has this shim and it knows how to talk to this API layer. When you install it in your project, if you are comfortable using Celery,
you just point it at Celery and it just works. If you want to point it at RQ, you change like one or two lines, make sure you've got a Redis instance, and you're good to go. Wagtail itself, on the development side of things, doesn't need to care. And that's where this sparked. We had some proposals, some ideas, and then we just kind of went, here's a good idea. And then it sat gathering dust a little bit. There was an RFC open for 18 months or so, but no one ever had
the time or the energy to sit down and go, okay, I'm actually gonna write this. And then someone chimed in on Mastodon, spotted this thing, going, this is really cool, this should not be just in Wagtail. And then this Carlton Gibson person, who I'd heard of in the abstract, appeared and went, yeah, you should do that. And I went, okay.
And I messaged some people at work going, this Carlton Gibson bloke, who's really important, thinks what I did was really good. I'm going to sink half a day into this. And they went, okay. And out of that came the draft, draft, draft that eventually became DEP 14, Django background tasks.
Will (05:51)
Well, that's a sign of a good workplace: you say the name Carlton Gibson and it gives permission, but also, you know, you can work on stuff, right? I mean, that's the thing, the best things come out of a real-world need. So Carlton's Neapolitan, tasks, right? They're not academic projects. Like Django itself, they solve a real-world problem.
Carlton (06:11)
And I remember that from the other side, right? I remember someone mentioning it on Mastodon and seeing it and being like, wow, this is really cool. Because it's not just that if you go to Google and type in "Django background task" you get a thousand options; it's that we would discuss this every six months or so on, you know, whatever the social media of the day was. And there'd be a different option for every single Django developer. Now that's kind of fine. But if you're writing a third-party package like Wagtail or any other one,
it doesn't really give you much of a target for, do you know what, we want to enable background tasks in an agnostic way. And what I really liked, and what I really like that it has maintained, is that it was gonna be a shim above the different concrete implementations.
Jake Howard (06:55)
Yeah, that's where the real benefit was. That's what lit that light bulb of, this is a big deal. Like, this is going to quote-unquote change the Django world, as it were: you've got this single unified contract where, particularly for package maintainers, they can go, I can now put stuff in the background and it just works. You look at other ecosystems, like JavaScript, Go, things like that, and a lot of that stuff is built into the language.
Python, you can kind of do that if you are like fully native async, you can trivially kick stuff off into the background, but you don't get the same kind of guarantees that you get from something with a proper queue store and multiple workers and things like that. Putting this kind of common thing sat there just handles everything nicely. And as a developer, you don't need to relearn an API if you're pivoting between projects. If you move company and go from using RQ to Celery or vice versa.
Carlton (07:37)
with a reach.
Jake Howard (07:53)
you don't need to relearn all the different APIs. You go @task and it kind of does the rest.
Carlton (07:58)
You change it, yeah, and the backend is sorted out at deployment time. As well, there was one more thing: at the time, I think they'd just announced the Solid Queue implementation for Rails. And it was another one of these "why haven't we got what Rails has got, we're doomed" moments. And, you know, your proposal came along nicely at the same time. And so actually, you know, there is a community idea in play already that, if we just advanced it, would...
But there was a big gap between that first conversation and then the final thing.
Jake Howard (08:32)
Well, everything takes a lot of work. I think the problem is that, as a developer, you go, this is great: Django quote-unquote moves really slowly. It's solid, it's stable. But what that means is that when you're trying to develop it, you have to put in quite a lot of work and quite a lot of pushing to make anything happen. And like I say, it's all well and good going, I found a bug, raise a Trac ticket,
Carlton (08:48)
It reads.
Jake Howard (08:59)
progress things through. This is, I want to add... I think the final PR ended up being like two and a half thousand lines. I can't really raise a Trac ticket and go, I want to give you two and a half thousand lines that you're going to maintain, good luck. I can, but Natalia, Sarah, or Jacob will shoot it down very quickly. Rightly so, to clarify: things like this need thought. They need an actual
Carlton (09:12)
Well you can but it's not going to get received right now.
Jake Howard (09:25)
review to go on. I know, Carlton, you in particular have been trying to think quite a lot about where the line sits between a Trac ticket, the new new-features repo, a DEP, maybe some kind of fourth thing, and there is no definitive line.
Carlton (09:42)
Well, if I'm going to just squeeze in there, what I think we really need is easier DEPs. But I don't think tasks needed an easier DEP. It's a very full-featured, major-scale proposal. It's got its own, you know, it's django.tasks. It's got its own top-level part of the namespace, right? It's a feature of that kind of size and importance. You know, the formal DEP process was no problem for that. It's where the formal DEP process is needed for something which is a bit more than just a Trac ticket, but
the impedance of that full DEP does seem to be an impedance.
Jake Howard (10:14)
Yeah, but for tasks, as you say, it really helped. I suspect if I looked back at that initial commit for DEP 14, what we have now is massively different to it. And there's a good reason for that: what I had then was terrible, because I was the only person who had really looked at it. Having some insights from people who have either worked with background tasks or have been maintainers of other libraries and have used them, people that have been burnt by other things and go, okay,
this library does it in this way, it's not great. Let's bring it back to Django and do it right.
Carlton (10:47)
So just from that review process and that maturing process and that brewing process almost, can you call out a few people that perhaps, not saying that other people weren't, but perhaps a few people that stick out in your mind as, that was really helpful, they said this and that.
Jake Howard (11:01)
Yeah, so I've actually put a dedicated post on my own website calling out the various different people who were involved in all manner of parts of things, whether it's you, Carlton, being the shepherd on the DEP, pushing things forwards. One of the people who sits very much in the background, but was actually really, really pivotal in this progressing, was my colleague Tom Usher.
He was the one that looked at the Wagtail proposal and went, we can't just implement a new background task thing, we need a generic API. And quite frankly, without that one piece of insight, none of this would happen. What Wagtail would have done is gone, we're gonna make our own thing, we're gonna make you use this. And it would have been fine for Wagtail's use case, but it doesn't progress the ecosystem quite as much.
Carlton (11:50)
You had this in your talk at DjangoCon, you put up the XKCD for the competing standards thing. And I think the important thing is not just to add another one that was the same, you know, another option, RQ, Celery, and now Jake's one. It was that you were offering the interface above to abstract against them. That's the key change, right?
Jake Howard (12:10)
Yeah, I think that's the other thing as well, calling it "Jake's one". Like, I don't want to be the BDFL for Celery Mark II kind of thing. It's the last thing I need. In researching the ways we could implement DEP 14 background tasks, working out what the different API layers
Carlton (12:19)
Yeah, right. Yeah, the last thing you need.
Jake Howard (12:30)
should be, what the interfaces are, you come across a thousand different people that have gone, I've written a background queuing thing, and what they've got works perfectly for their use case. And that's great. One of the nice things is, if you've written it, it's easy to maintain. It's easy to bug-fix. You can go and work with it. That starts falling apart when you need to build something on the scale of Django itself. The number of edge cases that you have to account for
just escalates massively when you're working not just at app scale, but at Django scale. There are extra things, like, okay, one of the big ones that really, really tripped me up was that there are lots of these handy things that work really, really well on Linux. I need Django to work 100% on Windows as well, which means with all these nice little handy things, people say, yeah, it's great, you just add this one line. And my answer is, yes, but what about Windows?
And it just falls apart then. There are so many things that just don't work nicely on Windows. They might. Someone who knows the Win32 APIs can probably come in, point me in the right direction, and we're good. But I'm not a Windows developer. I think there is one Windows machine in this house, and that's it.
Carlton (13:42)
Yeah.
No, I mean,
I remember when I started as a Fellow, neither Mariusz nor I had a Windows machine available. So the DSF provided us with one, so that we could at least make sure that, you know, the test suite was working on Django on Windows and all the rest of it.
Jake Howard (14:04)
Yeah, and so, speaking of the differences on Windows, one of the features that isn't in background tasks yet, but will be, is the ability to time out a task. So if a task takes more than 30 seconds, kill it. Implementing that, if you look up how to kill Python code, lots of people have gone, oh, just do this, just do that.
But I need solutions that work on Windows, on Linux, that work reliably, that are maintainable, that you can depend on. And it's really, really hard. It got so hard I actually ended up doing a conference talk on this at PyCon UK back in September, about how to kill Python code and all of these different weird APIs, and the "this works for 90% of the cases, apart from these 10%". And you can kind of overlap the Venn diagrams and find two that work nicely, but it's still a real pain.
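To make the timeout problem concrete, here is a minimal sketch of one portable approach: run the work in a separate Python process and give up after a deadline. This is an illustration only, not how `django.tasks` does or will implement timeouts; the helper name `run_with_timeout` is hypothetical, and it relies only on the standard library's `subprocess.run()` with its `timeout` argument.

```python
import subprocess
import sys
from typing import Optional


def run_with_timeout(source: str, timeout_seconds: float) -> Optional[str]:
    """Run Python source in a child process.

    Returns the child's stdout, or None if the deadline passed first.
    (Hypothetical helper for illustration; not part of Django.)
    """
    try:
        completed = subprocess.run(
            [sys.executable, "-c", source],
            capture_output=True,
            text=True,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child process before re-raising,
        # so the runaway work is no longer executing at this point.
        return None
    return completed.stdout.strip()


if __name__ == "__main__":
    print(run_with_timeout("print(6 * 7)", timeout_seconds=30))
    print(run_with_timeout("import time; time.sleep(60)", timeout_seconds=1))
```

Killing a whole process is blunt but behaves the same on Windows and Linux. The harder cases Jake alludes to, such as stopping code running inside a thread of the worker itself, are where the platform-specific approaches start to diverge.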
Will (14:57)
So that's the side quest you didn't think you wanted to do, that just emerges?
Jake Howard (15:00)
It...
yeah, it was one of those things. I mean, my personality type can end up being quite... magpie-ish. It will be a "so how would I solve this?" and I will dive for like a week into "how does this work?". Exactly. Yeah. It can be, it's a really...
Will (15:11)
Yeah, curiosity run amok. I can relate. Yeah. Yeah. It's a dilettante. It's
a dilettante. The dilettante's curse, right?
Jake Howard (15:19)
Yeah, it's a really like, it's a really fun way of doing things. It's just that I will just dive deep, I'll learn a thing. And after a week, I will probably never touch that knowledge again. But for that week, I'm having a lot of fun. And that's
Will (15:30)
No, it's in the neural
net. It's in the neural net, you know. It's like, 'cause then five years later, something comes up and you're like, oh, connection.
Jake Howard (15:37)
Exactly, yeah, that's my hope: picking up a lot of these little bits of knowledge so I can go, oh, I did that like two years ago. I should probably get better at writing some of these notes down, but that's a problem for another day.
Will (15:48)
Well, this is
the Simon Willison thing for me, right? Like, yeah, blog for yourself so you don't forget what, you know... I mean, back when people read tutorials, I'd sometimes forget something, type it in and be like, wow, me three years ago really knew something. You know, like, what have I been doing since then?
Carlton (16:07)
So I want to ask you about tasks, because when it was released, people were a bit confused about the lack of a backend in core and the messaging around that. We've just talked about the why, but from the horse's mouth, so to speak, how would you describe that situation?
Jake Howard (16:26)
Yeah,
the... I don't think I did myself any favors when I initially released Django tasks upon the world, as it were. The initial django-tasks release was a PyPI package that contained three things, let's call it. The first one is the API contract, the bits of scaffolding that are needed: the task class, the task result class,
those bits, and the bits that deal with switching out backends. That's the stuff that required the most time, making sure we got it right. The second thing it contained was the development backends, as I've called them: the dummy backend and the immediate backend, both of which are really, really useful for local development, unit testing, things like that. The dummy backend doesn't run anything. It just stores the tasks in a list so you can interrogate them in your unit tests.
The immediate one still behaves like a backend, but it runs your functions immediately rather than spinning them off into the background. So it means that if you're using a package that isn't fully tasks-aware, or you don't want to go through the effort of deploying things, your code will still work, just a bit slower, rather than these really important functions never getting run.
Those two things, the basic API contract and the development backends, were from day one intended to go into Django in the initial push. That two and a half thousand lines of code, that's what we wanted to get in. The third thing that django-tasks contained was a database backend. This required a huge amount of work because, as I say, building task systems is really complicated,
and that's when I had to build one from scratch on my own. That's where the confusion came from, because someone looked at Django tasks and went, but it's an API contract and it's some dev backends and it's this production backend, but I already have Celery that does all those things. Why do I care? And so I ended up shooting myself in the foot there. In the initial releases, just trying to get things out, it was the right thing at the time.
But as time went on, particularly once the code landed in Django core and was released in 6.0, people got really confused. They looked at django-tasks and went, what's this? Why is some of it not in Django? Why is some of it there? How does this all work?
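The difference between the two development backends can be modelled in a few lines. The sketch below is a toy model written for this transcript, not Django's implementation: the class names `ToyDummyBackend`, `ToyImmediateBackend`, and `ToyResult` are invented, and the real backends have a richer result object. It only shows the behaviour Jake describes: one backend records work without running it, the other runs it on the spot.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class ToyResult:
    """What the caller gets back from enqueue() in this toy model."""
    func: Callable[..., Any]
    args: tuple
    kwargs: dict
    finished: bool = False
    return_value: Any = None


class ToyDummyBackend:
    """Records enqueued work in a list but never runs it (handy in unit tests)."""

    def __init__(self):
        self.results = []

    def enqueue(self, func, *args, **kwargs):
        result = ToyResult(func, args, kwargs)
        self.results.append(result)
        return result


class ToyImmediateBackend:
    """Runs the function straight away, in the caller's own thread."""

    def enqueue(self, func, *args, **kwargs):
        result = ToyResult(func, args, kwargs)
        result.return_value = func(*args, **kwargs)
        result.finished = True
        return result


def add(a, b):
    return a + b


if __name__ == "__main__":
    dummy = ToyDummyBackend()
    dummy.enqueue(add, 1, 2)
    print(len(dummy.results), dummy.results[0].finished)

    immediate = ToyImmediateBackend()
    print(immediate.enqueue(add, 1, 2).return_value)
```

Because both classes expose the same `enqueue()` method, calling code does not need to know which one it has been given, which is the point of the shared contract.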
Carlton (18:48)
The other part was, why doesn't Django have an actual backend, a production backend, for tasks?
Jake Howard (18:52)
Yeah, it ended
up being some kind of "why have you only done half the job?" thing. And I'm like, no, what we have here is great. It's really useful. And everyone's like, but it's missing this thing, I don't care. And so what I ended up doing, which I should probably have done about a year or so prior, was I wrote down my thoughts and went, how can this actually work? How should this work? And then I shot an email to the Steering Council and went, here's what I think we should do. Am I insane?
And conveniently, and probably for the first time in my life, everyone went, no Jake, you're not insane. We've been thinking that too, exactly. And so that shone this light on, okay, let's sink some time in, let's make this much, much clearer. And so particularly now as I'm recording, but also probably for the last two months or so, things are a lot more sensible. So to give the outline, if you are going, what on earth is this Django tasks? There are
Carlton (19:24)
We've been thinking that too.
Jake Howard (19:45)
realistically three things you need to care about. django.tasks exists in Django 6.0 forward. It contains two things: the API contract, built into Django itself, and the development backends. Those are there, those are ready to use. If you're building packages that target only 6.0 forwards, use django.tasks and pretend nothing else exists. That is the dream, that's what we wanted, and that's what we've achieved.
The second thing we have is django-tasks. django-tasks is for people that want all those nice things that are in Django 6.0, but can't upgrade yet for whatever reason. Maybe their company policy is sticking to LTSs. Maybe they're just on a really, really big code base and it takes a huge amount of time to move. The reasons don't matter; I want to support those people and get them using some shiny new things. If you're one of those people, you can pip install django-tasks,
change a couple of your imports to make sure that you're pulling from django-tasks, not django.tasks, and you're good to go. The third thing is, as I say, that database backend. That is now separate from django-tasks. You pip install it separately. It's django-tasks-db. And currently it still has a dependency on django-tasks, just because I haven't written the bit of code that swaps between Django
and the shim layer. Eventually, that package will have no dependency on django-tasks at all. If you're on Django 6.0, it'll pull the things it needs from django.tasks, and you can keep using it in Django 6.0. If you're on, for example, 5.2, you pip install django-tasks and django-tasks-db, configure your settings, write your tasks, and you're good to go.
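The "swap between Django and the shim layer" step can be sketched as a small module lookup. This is a guess at the shape of such a helper, not code from any of the packages mentioned: `pick_tasks_module` is a hypothetical name, and `django_tasks` is assumed to be the import name of the django-tasks backport. It uses only `importlib.util.find_spec()` from the standard library, so it runs whether or not Django is installed.

```python
import importlib.util
from typing import Optional, Sequence


def pick_tasks_module(
    candidates: Sequence[str] = ("django.tasks", "django_tasks"),
) -> Optional[str]:
    """Return the first importable module name from `candidates`, or None.

    Defaults: prefer the built-in django.tasks (Django 6.0+), then fall back
    to the backport package (import name assumed to be `django_tasks`).
    """
    for name in candidates:
        try:
            spec = importlib.util.find_spec(name)
        except ModuleNotFoundError:
            # Raised when the parent package (e.g. `django`) is not installed.
            continue
        if spec is not None:
            return name
    return None


if __name__ == "__main__":
    print(pick_tasks_module())
```

In a real project the more common spelling is a `try: from django.tasks import task` / `except ImportError:` pair around the two imports; the lookup above just makes the same decision visible and testable.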
Carlton (21:29)
Perfect, perfect. And you've got a Redis backend as well.
Jake Howard (21:33)
Yeah, there is also the... So one of the things I wanted to do is, it's all well and good having 100% of the code written by me, because I can change everything I want to work exactly how I want. But the whole point of this, without going power mad, is that I need to be able to support third-party backends, things that are not written mostly by me. And so, because RQ is the tool I'm most familiar with outside of the stuff I've written, I thought, okay,
Jake Howard (22:00)
I'm going to write an API contract against RQ. And so what also exists is django-tasks-rq. It fits in exactly the same place in the model I just talked about as the database backend. But it means if you want to use Redis and RQ, you just change the one line in your backend config that says, rather than database, use RQ. You change that one line and nothing else in your code needs to change.
You'll go from using the database to using RQ and you're done. And this is kind of the next step: I've been working on this API contract which shims between Django tasks and RQ. Well, one of those things already exists. It's called django-rq. It's the shim between Django and RQ. So I've got an issue open against the django-rq repo to try and get django-tasks-rq
upstreamed into django-rq, and eventually that means django-tasks-rq can completely go away, and that's the dream. It means if you want to use RQ, you install django-rq and you do the things. You don't need to care about "I need this for the caching bits and the RQ bits, but because I want to use django.tasks I need this extra package". Everything is just there.
Carlton (23:17)
Yeah, and that's the dream, right? Because the Django Q2 people could integrate their backend for Django tasks, you know, the Celery people can have their backend, and for each option they're just implementing against the backend interface, rather than you having to swap your development code for the particular queue that you've picked.
Jake Howard (23:33)
Exactly, yeah, I think.
Yeah, that's I think the main benefit: you can go, okay, for local development, I don't want to run an entire Redis cluster and RQ, I want a database. So I use the database one, and it just works. Then when you scale, you change an environment variable that says, actually, this is RQ now, give it the various credentials for connecting to Redis, and it just works. Your business logic doesn't change.
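As a sketch of what "change an environment variable" could look like in a settings module: the `TASKS` setting with a `"default"` alias and a `"BACKEND"` dotted path follows Django 6.0's documented shape, and the immediate backend path is the one that ships with Django. The environment variable name `TASKS_BACKEND` and the helper `build_tasks_setting` are assumptions made for this example; the dotted path for a production backend would come from whichever backend package is installed.

```python
import os
from typing import Mapping

# Ships with Django 6.0; runs each task inline, which suits local development.
DEFAULT_BACKEND = "django.tasks.backends.immediate.ImmediateBackend"


def build_tasks_setting(environ: Mapping[str, str]) -> dict:
    """Build the TASKS setting from the environment.

    `TASKS_BACKEND` is a hypothetical variable name chosen for this sketch.
    """
    return {
        "default": {
            "BACKEND": environ.get("TASKS_BACKEND", DEFAULT_BACKEND),
        }
    }


# In settings.py this would be the only tasks-related line that varies
# between a laptop and production; task definitions and call sites stay put.
TASKS = build_tasks_setting(os.environ)
```

The design choice is that the backend is a deployment decision: application code only ever talks to the task contract.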
Carlton (24:03)
Okay, super. And I've just got one more question, then. What are the bits of the Django tasks interface where you think, oh, it would be nice if we could grow this, or nice if we could grow that, or actually, I want to keep this exactly at the scope it is? Because with an abstraction layer, you know, people always say of the ORM, well, it's the lowest common denominator, because it doesn't do all the advanced features that
you might need: your Oracle can do that, or your Postgres can do that. Well, it's going to be the same here. There'll be features from individual queues that you can't use in Django tasks.
Jake Howard (24:36)
Yes. What we've got at the moment is good for the simple things you need to do. If you need to run a task in the background, that's fine. It'll do that. The things it doesn't do, which become really important as you grow and start needing to depend on your tasks, aren't there yet. And I do want to stress "yet". This is not an "I've written Django tasks, I'm going to disappear off into the sunset". This is an ongoing effort. Some of the things that
don't exist, that it would be really nice to have at some point, are going to be things like retries, so you can say, okay, if this task failed for whatever reason, wait a bit, try again, things like that. Timeouts, as I've already mentioned, making sure that, okay, if this takes too long, kill it, start again. Maybe we want to do some other really complicated things. Maybe I've got a lot of work to do, I want to run five different tasks, I want all five finished, and then I need to run one final one to do a bit of cleanup.
That is possible to do in fancy things like Celery. It's really hard to do in other languages and frameworks that don't have this as a native concept. You can shoehorn around it by spinning up other tasks, and it doesn't feel very nice.
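The two patterns mentioned here, retrying a failed job and running one final step after several jobs finish, can be shown with the standard library alone. This is a sketch of the ideas, not a proposal for the `django.tasks` API: `call_with_retries` and `fan_in` are hypothetical names, and a thread pool stands in for a real task queue with separate workers.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def call_with_retries(func, attempts=3, delay_seconds=0.0):
    """Call func(); if it raises, wait and try again, up to `attempts` calls."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            time.sleep(delay_seconds)


def fan_in(jobs, cleanup):
    """Run every job concurrently, then hand all results to one final step."""
    with ThreadPoolExecutor() as pool:
        # pool.map() yields results in the same order as `jobs`.
        results = list(pool.map(lambda job: job(), jobs))
    return cleanup(results)


if __name__ == "__main__":
    squares = [lambda n=n: n * n for n in range(5)]
    print(fan_in(squares, cleanup=sum))
```

Even in this toy form the open questions are visible: which exceptions should trigger a retry, how long to wait, and what the final step should do if one of the five jobs fails. Those are the design decisions that would need settling before something like this could live in a shared contract.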
Carlton (25:45)
It starts to grow into a sort of fully featured workflow engine at that point.
Jake Howard (25:49)
Yeah, so there is going to be a fine line. This is me channeling my inner, basically, going full feature creep of what could this thing do? Because it's fun to work out how these APIs could look, what they could do, what the edge cases are. That's the engineering mindset: what can we make it do? At some point, someone needs to be the bad guy and go, no, you can't have that.
Jake Howard (26:16)
I think, unfortunately, that bad-guy hat needs to be the Steering Council's. They have to be the bad guy in some of this. Or they at least need to be saying, we can't do it yet. That's the main thing: A, it needs more thought. It's probably not big enough quite yet. And I think the big one, which is one that the Steering Council rightly lean on, is the "we don't know how important this is. Build it outside of core. Once we know it's really important,
then we can include it.
Carlton (26:45)
Yeah, that's absolutely it. And the flip side, the other point, is, well, it's all right building it, but what about maintaining it? You know, a fully featured workflow engine, the whole shebang, with dependencies and retries and branching and loops and those kinds of control flows within a task flow. Wow, there are whole products that do that. And are we going to import one of those into Django,
Carlton (27:14)
even if it's just the API for it? Or, you know, are we going to have the capacity to do that properly? I think it's much better to have half a product than a half-assed product, if that makes sense.
Jake Howard (27:26)
Exactly, yeah. And so that's why I went with, I'm going to build the basic bits, let's get those in, because it's much more obvious that these are necessary. It's very obvious that you need some way of getting the return value, of passing arguments. That's fairly obvious. Do you need retries? Does everyone need retries? Probably not for certain cases. If you're building a small internal backend tool or a pet project,
Carlton (27:31)
Yeah.
Jake Howard (27:52)
you might be alright without it. It would be good if you had it, and if it was one argument away, fantastic. But let's wait until we know the demand is there.
Carlton (28:00)
I'd go much further than that as well. The vast majority of projects don't operate at a scale where they see failures on a regular basis. The failures only come up when you're running things at scale, and then the uncommon things become common just because you're doing it a lot. But if you're sending emails out, say, as a classic example, you could be sending thousands an hour, which is much more than almost all projects are ever going to send, and you'll never see a failure.
So what's the sense in saying we can't have any background tasks because we couldn't have retries for the 2% of projects that are doing things that hit those numbers?
Jake Howard (28:41)
Yeah, and then you're taking on a large amount of code complexity for 2% of users. And for certain things, if that 2% is 10%, it's really useful. If that 2% are a vocal 2%, it might also be useful. At the moment, that 2% is a hypothetical 2%, and at that point, 2% is basically zero.
Carlton (28:47)
Yeah, exactly.
Yeah, wouldn't it be nice if, well yes it would be nice.
Jake Howard (29:04)
Yeah. And again, I think another big thing is, this is Django. It's an open source project. If you want it, come and build it. I do need to work out what the contribution process looks like for, how do we design retries? How do we design these more complicated bits? I haven't yet sat down and discussed that with the Steering Council and worked out where that line sits. But for everything else, this is where open source wins. If you have
Carlton (29:11)
Yeah.
Jake Howard (29:30)
a weird bug, a weird edge case. Open an issue, submit a PR. Let's make these things better.
Carlton (29:37)
I'm, yeah, I mean, this is a question. I firmly believe that three or four goes in the ecosystem show the right way forward because, you know, people have go one, go two, go three, go four, and then it becomes clear what the right route is, whereas you could never have seen that without the four attempts. And it might take a while for those four attempts to play out, but the end result is something that's lasting and correct. How does that fit your kind of philosophy?
Jake Howard (30:01)
Yeah, and I think,
yeah, I think so. And we do need that kind of falling on our face a bit, going, okay, that doesn't work, let's try again. That's why I put the initial DEP 14 proposal out there. I think at the time it had sort of 150 comments on one PR, like GitHub got really, really slow trying to load that PR. And that's what I wanted, I want that.
It's all well and good building something based on either my experiences or what I think other people's experiences are. I can't know what other people's experiences actually are unless they tell me. So I make a proposal, people shoot it down. I make a different proposal. We try again.
Carlton (30:43)
Yeah, yeah. Okay, good. Will, you've been silent for a while, so let's...
Will (30:48)
No, this is my dream.
My dream podcast, I always tell Carlton, is I say hello and then I say goodbye, and I just let Carlton and the guests carry on.
Carlton (30:57)
So was there anything we didn't cover about tasks in Django, before we move on to a different topic?
Jake Howard (31:02)
Not massively. I do want to address one thing which people occasionally pop up and go, huh, that's weird. And I end up saying the same thing again and they go, okay, that makes sense. With Django tasks, you create a task, you create a function, and when you call it, you pass arguments into it. With Django tasks, django.tasks, django-tasks, whatever, those arguments must be JSON serializable. There is absolutely no pickling in
any part of Django tasks, and it's very, very much by design. I'm sure we'll get into my security team background in a second, but it was very much chosen on a let's-do-things-right basis. And if I can shoehorn people down the don't-touch-pickle, JSON-is-fine route, then it solves a huge number of problems. Anyone who's done weird things and sort of gone, I'll just pass this request session object
Carlton (31:49)
Thank
Jake Howard (31:58)
into my task, it'll work fine on, say, the immediate backend because it's the same process, but as soon as it kicks off, it could be on a separate machine, it could be halfway around the world. That TCP session that is being represented by the request.session doesn't exist anymore because it's on a machine a thousand miles away. That's where stuff just falls over. Whereas again, as you say, Carlton, you want to implement things based on the lowest common denominator. You can do basically everything you need to do
with JSON. The big one that comes up is passing models between tasks. It's a really, really obvious use case, but you can't JSON serialize a model. What you can JSON serialize is a primary key. So if you, exactly, just a lookup. So all you do is you pass that through and then in your task, you just retrieve it again. That's got a bunch of nice benefits. It means you're not
Carlton (32:40)
And then look up the model.
Jake Howard (32:49)
passing around stale data where things might change between enqueuing and running. It means the data that's stored in the task queue is also really, really small. It's a number or a UUID versus an entire fat model with all of its columns. It means you haven't got weirdness of, maybe when you loaded that model, you deferred some of the fields. Maybe you've only got one of these keys, but the task assumes another one exists. So you've got a bunch of weird performance issues.
Those kinds of things you don't have to think about when you're just passing IDs around, or you're just passing these primitive data types. It just works. And so for the, it seems to be a very vocal minority that are like, why is this not pickle? There are lots of reasons it's not pickle. I've just talked about all of them. And so by forcing you to think about things in JSON, you avoid a bunch of weirdness for
a slight little bit of, I need to grab its primary key and then do a database query to pick it up. Doing that little bit of work to avoid a bunch of extra issues felt worth it to me. It felt worth it to everyone else who reviewed the DEP, and I'm pretty happy with that stance.
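A minimal sketch of the pattern Jake describes: enqueue a primary key rather than the object, and look the row up again inside the task. It uses only the standard library and is not the django.tasks API; `Article`, `ARTICLES`, `QUEUE`, `enqueue` and `run_next` are hypothetical stand-ins for a model, a database table, a broker and a worker.

```python
import json
from dataclasses import dataclass


@dataclass
class Article:
    """Hypothetical stand-in for a Django model instance."""
    pk: int
    title: str


# Hypothetical stand-ins for a database table and a task broker.
ARTICLES = {1: Article(pk=1, title="Hello")}
QUEUE = []


def enqueue(task_name, *args):
    # Arguments must survive a JSON round trip; json.dumps raises
    # TypeError for anything that is not JSON serializable.
    QUEUE.append(json.dumps({"task": task_name, "args": list(args)}))


def publish_article(article_pk):
    # Look the row up again at run time, so the task sees current data.
    article = ARTICLES[article_pk]
    return f"published {article.title}"


TASKS = {"publish_article": publish_article}


def run_next():
    # A worker only ever sees the decoded JSON payload.
    message = json.loads(QUEUE.pop(0))
    return TASKS[message["task"]](*message["args"])
```

Calling `enqueue("publish_article", 1)` stores a few bytes of JSON, and an edit to the article made before `run_next()` is picked up by the task. Calling `enqueue("publish_article", ARTICLES[1])` fails immediately with `TypeError`, which is the early, loud failure the JSON-only rule is meant to give you.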
Carlton (33:55)
Yeah, no, it's absolutely the right design decision. We see similar things with people working with Channels or async code. Like they're trying to pass a model back from a coroutine. It's like, hang on a second, that could be crossing a thread boundary because it's crossing a sync-to-async boundary. You can't pass models across threads. You know, just don't do that. If you need a model in a particular place, look it up there. So I think that's a great decision.
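The same idea can be sketched for the async case Carlton mentions. This is an analogy using `sqlite3` and `asyncio` from the standard library, not Django or Channels code: only the ID crosses the thread boundary, and the worker thread opens its own connection and does the lookup there. The table name, file name and function names are all made up for illustration.

```python
import asyncio
import os
import sqlite3
import tempfile


def make_db(path):
    # Create a tiny database with one hypothetical "article" row.
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT)")
    conn.execute("INSERT INTO article (id, title) VALUES (1, 'Hello')")
    conn.commit()
    conn.close()


def load_title(path, article_id):
    # Runs in a worker thread: open a connection here, look up by ID here.
    conn = sqlite3.connect(path)
    try:
        row = conn.execute(
            "SELECT title FROM article WHERE id = ?", (article_id,)
        ).fetchone()
        return row[0] if row else None
    finally:
        conn.close()


async def fetch_title(path, article_id):
    # Only plain values (a path and an integer) cross the thread boundary.
    return await asyncio.to_thread(load_title, path, article_id)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.sqlite3")
        make_db(path)
        return asyncio.run(fetch_title(path, 1))
```

By default a `sqlite3` connection refuses to be used from a thread other than the one that created it, which is a stricter version of the same rule: keep thread-bound objects where they were made and pass identifiers instead.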
Will (34:20)
Well, just briefly, yeah, security. And maybe the interesting point here is how Wagtail and Django overlap there. Because I think we've talked a little bit about the security team, but maybe you could shed some more light on that, right? Like, do things sometimes come in to Django via Wagtail, vice versa? How does that play out?
Carlton (34:23)
So you mentioned... go on, Will. Go on, Will, you do it.
Jake Howard (34:42)
So obviously the poster child for this relationship is Django tasks, which would not exist without the Wagtail CMS. Wagtail is quite a large project that has quite a large number of installations. And so it exercises a lot of edge cases that a standard Django project might not. You're doing a lot of things in Wagtail. The content management system pokes at a lot of things in the Django internals because it's
trying to do some of those things. Sometimes that works well, sometimes it doesn't work well. And so I suspect that a lot of the core Wagtail team have raised a lot of issues upstream towards Django, and even to other projects, going, this weird thing happens, is it a bug, is it a feature? And so that relationship works really well. We have also had the relationship work the other way,
whereby Wagtail goes, we need this sort of thing to work a little bit nicer. Maybe we put it up less as a, like we've hit a bug, more a, let's change this so it works a little bit better, and then it can cascade downstream. It's unfortunate that both projects have quite long release cycles. It's not a like, we open a ticket and a week later it's available on PyPI ready to go. Like it's gonna take kind of three, six months sometimes to actually be able to use the thing.
but by contributing upstream, they can come back down. It's useful to us and hopefully it's useful to someone else as well.
Carlton (36:06)
Yeah, I think so. I mean, I've been a Wagtail user for the last few years and it took me a while to get accustomed to the Wagtail release cadence. It's all good, it's brilliant, it's just that Wagtail will use dependencies that are then pinned because of incompatibilities and whatnot. And I'm busy trying to update my dependencies and finding conflicts between what Wagtail had pinned and what I wanted to use. And how am I going to work around that?
Jake Howard (36:33)
The
number of times a Carlton Gibson has appeared in some discussions going, can you please unpin the upper bound on this dependency, please? I think we're at least three or four by now.
Carlton (36:37)
you
Yeah, yeah,
yeah, we're getting there. We're getting there. But I've kind of come to terms with it. I think it's only, what's the model tree library that you use? Yeah, it's that one that you've got pinned, but that's fine.
Jake Howard (36:52)
Treebeard.
Will (36:56)
Okay.
Carlton (36:57)
So the security team, Jake, you joined a year or so ago, you said?
Jake Howard (37:00)
Yeah, so the story of how that started is kind of a long one. From very, very early in my professional career, two things have stuck out as being particularly interesting, and that's security and performance. And so naturally Django has a security team that do quite a lot of really cool things around maintaining the security of quite an important software package. There's no kind of...
It's not obvious how to get involved in that team, how to join that team. And quite frankly, it still isn't. That is an ongoing problem. People are thinking about how we can deal with that. There are some things happening internally to try and make that a bit clearer. I just happened to basically be in the right place at the right time. At the time, Thibaud Colas, the then president of the Django Software Foundation,
had mentioned that the security team was looking for some more people. And I went, I can do some of these things. At the time I was on the security team for Wagtail, and anything internal and security related at Torchbox generally passes my desk. And so I was like, I have some experience, am I helpful? And at some point I assume he went away, talked to some people, and I got a message in Discord three or four weeks later going,
would you like to join the security team? And I had to kind of double take that message a bit, because I was like, you want me? Because it's really weird. Django is used by so many different organizations, so many different individuals, and there's a pool of what, eight, 10 people in charge of making sure it doesn't have massive security holes. And that I am now one of those people is a
Will (38:27)
Yeah.
Carlton (38:28)
Ha ha ha ha ha.
Jake Howard (38:47)
really, really weird and humbling feeling. It's like, maybe I am kind of good at what I do. There is a list on the Django website that has security team, and there is my name next to it, and I can point to that and go, I am making this thing better. And that feels pretty damn great.
Carlton (39:03)
Yeah, like that. Can you give us an insight into what the security team does? Like how does it, how does it sort of function?
Jake Howard (39:11)
It's...
Carlton (39:11)
It's a mystery, It's this secret cabal that no one knows anything about.
Will (39:15)
You gotta
wear hoods when you look at the issues that come in.
Jake Howard (39:18)
Yeah, I mean, conveniently, none of us are the hoodie-up, sitting at a bright screen just tapping away at stuff type. We are human beings. I have met the majority of the other people on the team, we are human beings, we're not supercomputers sitting in the background. The start to finish on how things work is, if you are sitting, for whatever reason, listening to my voice at the moment going, I have a security hole in Wagtail, but I don't know what to do about it.
The thing to do, at the time of recording anyway, is email security@djangoproject.com. That goes to all the members of the security team. We'll triage it, we'll talk to you, we'll work out, is this a bug? How severe is it? Can we reproduce it? Is it a case of, are you holding it wrong? Things like that. All these things that are really important. The security team gets a lot of emails. Most of them are...
spam, unfortunately. I would say, by volume, maybe a third of them are people legitimately trying to help. What that means is that two thirds of the time when I open an email, it's garbage.
Carlton (40:32)
Yeah, and you have to read every one, right?
Jake Howard (40:33)
And I have to, this is the thing I think people forget is if you're looking at an email from a Nigerian prince, for example, it's pretty obvious. I have not actually won a million pounds by having done nothing. If someone goes, I have found a bug in Django. Here's a bunch of code. Here's some bits of stack trace. Here's why it's a thing. It can take a lot of brain power and a lot of time to go, no, this is nothing. We get a lot of things varying from you're calling APIs that don't exist.
or you're calling weird bits of internal stuff that is not intended to be used in this way or is not publicly accessible or even a like, we've had people report that for example, when you call cursor.execute and you pass it arbitrary user input, you get SQL injection. And the response is, yes, if you take a user's input and pass it to the database, that is SQL injection. That is kind of, you have to sanitize your user input. And so,
Carlton (41:22)
It goes to the database.
Jake Howard (41:28)
I'd say nine months or so ago, a bunch of the security team got together and we added some extra things into our security policies to try and make some of those you're-holding-it-wrong cases more obvious, so that someone can self-police certain things. So for example, you need to make sure that if you're passing random stuff into the database, you sanitize it. In most cases, the ORM will do that for you.
If you're passing it into raw SQL, that's on you. You need to be dealing with that. The same goes for some of the stuff you're passing through the Django template language. If you're passing in the entire works of Shakespeare and complain that it's a little bit slow, we can't do much about that. We're limited by certain things. We can make things faster, and we're always up for, this is a bit slow, I want to make it faster. Absolutely, let's do that. But does it constitute a security issue?
Probably not. And finding that line, again, unfortunately, there is no, if it takes more than this amount per character or something, it's a security problem. It doesn't work that way. It's a feeling. It's a, is this too slow? Could this be a problem? If it's a yes, we'll triage things through. Ideally, the person who's reported it has gone, if you change things like this, it gets faster or better. Great. We love people who actually submit a full kind of
Carlton (42:24)
Yeah.
Jake Howard (42:50)
proof of concept or patches, things like that. It makes our lives so much better and so much easier. But some people don't. And we also don't want to make that a requirement. We'd want to know about any vulnerability, even if you've gone, I don't really understand the internal bits. We get reports about some of the weird internals of the ORM. I have never dug into the weird internals of the ORM. So I go, I can reproduce this, but I have no idea why it's a problem.
And there's someone else on the security team that does know the internals of the ORM. They can look at it, they know that side of it. I can help by going, this is legitimate, and sort of shield them from the garbage, as it were. And then they can dig into the internals.
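To make the `cursor.execute` point from earlier in this answer concrete, here is a small sketch of the difference between splicing user input into SQL text and binding it as a parameter. It uses the standard library's `sqlite3` with a made-up `users` table, not Django's database layer, and the function names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, is_admin INTEGER)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [("alice", 0), ("root", 1)])


def find_unsafe(name):
    # Vulnerable: the input becomes part of the SQL statement itself.
    query = f"SELECT name FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_safe(name):
    # The input is bound as a parameter and is only ever treated as data.
    return conn.execute(
        "SELECT name FROM users WHERE name = ?", (name,)
    ).fetchall()


# A classic payload that turns the WHERE clause into "always true".
payload = "nobody' OR '1'='1"
```

With the payload above, `find_unsafe` returns every row in the table while `find_safe` returns nothing, because no user has that literal name. Django's own `cursor.execute(sql, params)` accepts parameters in the same spirit (with `%s` placeholders), which is why unparameterized raw SQL is treated as the caller's responsibility rather than a framework vulnerability.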
Carlton (43:30)
Yeah, that's good. That's good. I guess we need to talk about AI and the effect that that's had on the process and the team.
Will (43:39)
This is, can I say this is a new record to make it 47 minutes before the words AI were uttered.
Jake Howard (43:39)
Yeah
Carlton (43:45)
No, we got through the whole
episode without mentioning it once, no? Anyway, let's not argue the details.
Will (43:49)
Did we? OK. Apologies, Adam. Yeah.
Jake Howard (43:52)
Yeah, there's... AI is definitely making it worse, and I think one of the things that people don't really realize about reports like this is that AI isn't magically creating more reports. It's not like there are bots out there just submitting stuff. One of the big things is that people will go, I think I have found something, and dump it into an LLM. Most LLMs are intentionally designed to make you feel good about yourself. They'll go...
So if you ask it, is this a vulnerability? They'll go, yes, you're absolutely right. It's definitely a thing. Well done you. And sort of exactly, just a massive amount of ego stroking.
Carlton (44:26)
Wonderful insight, you're ever so clever.
Will (44:28)
Honestly, you should report it to the...
Jake Howard (44:30)
Yeah, they will just support that kind of model, which, for some people, sometimes they need that bit of a confidence boost, and we're all for that. But then what happens is the LLM will naturally just vomit out a large amount of text. And so what starts as, if you call this function with this input, it's really slow, comes out at about a thousand words, and they'll
talk about things like, well, it affects these versions of Django, and if it's really slow it could have these knock-on effects. Really, at a certain stage we stop caring about all of those details. The thing we need to know is, if you do this, this happens. That's the important bit. If you want to go into some more detail and go, okay, it's really slow because it's in this part of Django, this has this knock-on effect that makes this bit particularly slow,
those are all really, really helpful bits of context, but that can be a sentence or two. It doesn't need to be a paragraph of stuff. And that's the bit that really, really slows things down. It goes from, I can fit this in half a screen and know this is all the context I need, to, I need to sit and read and think and digest, because it's the same amount of information in about 10 times the amount of text. And unfortunately
that's not something that is just plaguing the security team. I know in the last week or two, Natalia in particular has been getting really, really frustrated at the low quality pull requests that have been coming through to the Django repo, where they are a lot of noise for something that is either not actually an issue, or it's someone going, this could be better, without any justification. And that gets to be a real problem. Security, kind of...
It's not unique to just us, it's not unique to just security, but it adds a huge amount of time and effort to reviewing every single report.
Carlton (46:24)
You know, I stepped down from the security team at the end of last year, after eight years I think, and it was always kind of, yeah, this is maintainable. This is something we can do. You know, okay, we're probably understaffed slightly, but it was fine. It was a comfortable pace. But the last year or so, I think it's just picked up, and it was like every day, multiple issues, you know, and it becomes too much. You know, you need 20 people on the team instead of 10, and
Well, we don't have that capacity.
Jake Howard (46:55)
Yeah, so I've been a bit slow over the last couple of days at reading. The Django security team, it's shrouded in mystery, but it's also one of the oldest teams that exists in the kind of Django foundation, I guess, and it runs in a really, really old way. Almost everything is done by email. And so I've not looked at my emails for probably almost a week or so. I've got 16 unread emails in that inbox.
Most of them are probably the internal team going, this is garbage. But that means that some of them are the initial report giving us garbage. And so I need to then sit down and go through 16 emails' worth of following a discussion, working out, is this garbage? Is this not? For some cases, I can just go, they concluded it's garbage, it's garbage. Occasionally, and this has happened a couple of times, we'll look at something and go,
actually, I don't think this is garbage. There could be something here. Nine times out of ten, the person who reported it hasn't seen it that way. They have submitted garbage. But if you look at it from a slightly different angle, you go, okay, well, if they've done this, it's actually this other thing. Unfortunately, because of some of the specifics of things, I can't talk about the, there's this API that if you call it in this way, it's a massive problem. And also my memory sucks, so I can't remember most of that detail.
But things like this happen. There's a lot of nuance around security of massive frameworks like this, because it's Django. The code base itself is gigantic, and the security team has a remit over literally all of the code, but just some of the symptoms of things. If it's slow, that's a problem. If it's too slow, it's a security problem.
Carlton (48:34)
Yeah. And just one more question then. So Jacob Walls, the Fellow, put a post on the blog recently about some of the recent work in the security team, because there were a number of issues that have been patched over the last few months. And some of those were SQL injections. They were perhaps questionable, in that you have to go very, very wrong to do this, but they were legitimate
SQL injections, and by the standards there, an SQL injection is category high. You know, it's a serious, severity-high issue.
And it's kind of this problem where he's like, well, we've got a release coming up. It is a high issue. And then people put in the time to deal with it. And then they're like, but we would never just pass these keyword args to filter in this way. It's kind of difficult. How do we balance that when there's so many reports coming in and we're trying to do the best and we've got our procedures?
Jake Howard (49:31)
Yeah, so this reminds me of, many, many years ago, there was the release that hit a large number of web frameworks, around the email issues with the dotless i in certain things. Really, really famous. The account hijack one, exactly. It was a problem around password reset, where emails that to a human aren't the same, the database would look at and go, these are the same, and you can take over accounts. It's been fixed for like
Carlton (49:44)
Yes, yes. Yes, it was a hijack potential.
Jake Howard (49:59)
eight years now, it's not a problem. But what happened is, that affected basically everyone. Because it was very well known, the time between the announcement that there was going to be a security release and it being released upon the world was expedited. It was one day rather than the standard seven. And so naturally, getting that email going, there's a release tomorrow, I lost my mind, because I was like, what the hell is going on? About a month, a month and a half later,
there was another email that came around. Again, it was a security release. It was marked high, but it had the standard seven day embargo. I was like, oh God, is this going to be another, I need to spend a day patching all of my projects? Somewhat conveniently, I happened to be out of the office on the day that it was due to be released. So I was there sitting in the passenger seat of a car, panicking, checking my emails, going, okay, is this a big thing? Am I going to need to triage some things in Slack?
And it was a small SQL injection specifically on Oracle, which, if you're using Oracle, is a big deal. We weren't using Oracle, we were using Postgres. And so to us, it was not even a low, it was a complete non-issue. But if you see an email saying, there is a security vulnerability, it's rated high, you just see that, and you have to react based on that. And I don't actually think there's an easy way of
fixing that. You can't give too much information, because someone will try and reverse engineer it or work out what it is or try and beat that embargo. We also can't sneak little bits of information to certain people. For example, Django has got, I forget the formal name of it, but basically if you are a big enough, important enough organization, you can ask us very, very nicely,
and we will tell you the specific details about a security release when we send that early warning, rather than when we do the final release. And what that means is that for certain companies where it's really, really important they get patched as soon as that information is out, they need a little bit of warning, and we want to accommodate that. But we can't give that to everyone. At the time I was working at a small Django agency. Our projects are not
big enough and important enough to need that kind of information. But where does that line sit? Who is big enough and important enough? Is it based on the size of the team, the number of developers? Is it how important the project is? That's generally the way we've gone. So we've got...
Will (52:26)
And do they sponsor
Django, too?
Carlton (52:28)
Oh, but that's pay for security. Oh, we can't have that. To give you a concrete example, the standard example would be Linux distributions redistributing Django, right? So they need some lead time to be able to get the patch into their version of Django, because they're not necessarily shipping Django 6.0. They're shipping 5.2-point-something with all the patches and, you know, they need time to be able to update their repository. So when you apt-get,
Jake Howard (52:30)
Yeah, that... that gets really spicy really quickly.
Will (52:31)
Yeah, I know, I know, well.
All right, yeah, okay.
Carlton (52:57)
or apt update, or unattended upgrades, it automatically pulls in the Django update for you.
Jake Howard (53:02)
Yeah, those are the easy examples. So we have security contacts at Debian, at Red Hat, they get the emails. There are some organizations that get a similar thing. It's really, really ad hoc. We don't have rules for it. There are some listed in the docs, but they're not perfect. It's very much a, if you ask, we'll tell you. In the last three weeks or so, we had two companies come to us, whose names I will not include.
one we said yes, one we said no. And it just comes up that way sometimes.
Carlton (53:34)
But to sort of swing back to the original point, if you're not using an API in the way that's affected, then even a high severity issue is a non-issue.
Jake Howard (53:44)
Exactly, yeah, and there is no way around this. I did learn recently that in the Go ecosystem, they have a kind of security release configuration, basically, that encodes, as part of a security release, that the issue is in this function, if you are calling it in this way, and it's programmatically defined. So it means you can run basically static analysis on your project
and know not only, like the standard one, am I using dependencies that have vulnerabilities in, I can also know, am I vulnerable to it? And that would be massively helpful in an ecosystem like Python. But the reason you get some of that in Go is it's got static types. Python does not have quite so static types, let's call it.
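As a rough illustration of what "am I actually calling the affected function?" could look like in Python, here is a toy, purely syntactic check built on the standard library's `ast` module. Jake does not name the Go tooling (Go's `govulncheck` is one tool that works along these lines); `calls_to`, `legacy_lib` and the sample source are all invented for this sketch.

```python
import ast


def calls_to(source, dotted_name):
    """Return the line numbers where `dotted_name` is called in `source`.

    Purely syntactic: it matches the text of the call target, so it
    cannot see through aliases, re-exports or dynamic dispatch.
    """
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call) and ast.unparse(node.func) == dotted_name:
            hits.append(node.lineno)
    return sorted(hits)


# Hypothetical project code that uses a made-up library.
PROJECT_SOURCE = '''
import legacy_lib

def handler(data):
    return legacy_lib.parse(data)

def other(data):
    return legacy_lib.dump(data)
'''
```

If an advisory named `legacy_lib.parse` as the affected function, `calls_to(PROJECT_SOURCE, "legacy_lib.parse")` would flag the call inside `handler` and ignore `other`. The limits are the interesting part: write `from legacy_lib import parse as p` and this check misses the call entirely, which is the gap that static types help close in Go and that makes the same guarantee much harder to offer for Python.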
Will (54:22)
Hmm.
Carlton (54:35)
Indeed,
Will (54:37)
Well, yeah, we have other things in our notes around self-hosting, like you write quite a lot on your blog, but I do want to respect our roughly one hour time limit. And so that leaves time for books and projects, the important things. I believe you have one queued up, Jake. Is there a book you want to recommend?
Jake Howard (54:54)
There's…
yeah, I wouldn't say I necessarily recommend it. My brain works in very, very weird ways. I'm not actually a massive reader, sort of from, I would say in the last 15 years, I have probably read two fiction books at all, and I don't think any non-fiction books from start to end. I've tried to trick myself into reading, like if I buy myself a Kindle, I'm kind of playing with some tech, but also reading.
It worked for a couple of months and then I got bored. I'm trying to pick that up again. And one of the things that's helped is, conveniently at the same time, there's been a tech reading group that's formed at work, of people that are interested in reading and learning stuff. And so the first book, and the book we're reading at the moment, is Designing Data-Intensive Applications. And so conveniently, that's also very well aligned to the systems programming, performance,
background tasks stuff that is interesting to me at the moment. And so I've quite enjoyed it. I'm a couple of chapters behind the rest of the group, but slowly catching up. Being involved, just talking about some of this stuff, has been really interesting, to me at least.
Carlton (56:02)
Tech reading groups are really good. I had a group with some people from the Django community reading Rust in Action a few years back, and it was really good. We'd meet up, I think it might have been during the pandemic time. So it was nice to have some sort of social interaction as well as, you know, how are we going to get through this book?
Will (56:19)
Carlton, you have one. β
Carlton (56:19)
Will.
I do, I do. Well, I picked it back up off the shelf: Solar Power Finance Without the Jargon by Jenny Chase. Jenny Chase is an analyst that works at Bloomberg, and with oil back over $100 a barrel, I thought I'd pick it up. It says it's about solar power finance. It's sort of more really about solar power. There's a little bit about finance and prices and supply and demand and gluts and things, but it's a really interesting book about how...
Well, about the economics of solar power, and how renewable energy has grown and will continue to grow when alternatives are inflated due to geopolitical events. I'm reading that. Solar Power Finance Without the Jargon.
Will (57:02)
Well, and I believe this is more recent. There was just some study showing that solar panels can, they thought they had a lifespan that was quite short, but in fact, almost indefinitely for decades, they can continue to pump out.
Carlton (57:12)
Yeah. Yeah. So a lot of
the estimates were that, you know, the solar panel has a lifespan of 10 years and it will take you five years to recoup your initial investment, and then there's the second five years. It turns out that those estimates were a lot too short, exactly as you say. So it gets a lot better.
Will (57:29)
Yeah, which is a nice,
some nice news for a change. I mean, I'll just say one interesting thing as an American, with what's happening. You know, the U.S., we're the largest producer of oil and gas. And because of that, we can't make our mind up about what we want, you know. China can't produce enough oil, so it's forced to invest more in alternatives, even though it seems fine to deal with,
Carlton (57:38)
you
Will (57:54)
including things like... it just knows it has to. The US doesn't really have to, and so we're much more prey to, do we want high prices, do we want low prices? There's a whole discussion around that, but it's a little more complicated in the US, I guess, which is one of the reasons why we're so reticent, at least this year, to invest in alternatives. I'll just say that. Yeah, okay, I'll check that out. For me, quickly, just fiction: The Passage by Justin Cronin. This is actually a reread
Carlton (58:11)
Yeah, no, it's interesting.
Will (58:21)
for me, I think it's 10 years old. And this is catnip for me, because he's like a literary author who won a bunch of awards and then decided to write something that people wanted to read. It's essentially a vampire book. And it's actually a trilogy. Yeah, right? It's, you know, it's just really good. I just really like, you know, a plot. To me, you know, quick sketches and portraits and, yeah, I need that.
Carlton (58:33)
It's a literary page-turner.
Will (58:47)
So highly recommend it. Projects, I'll just keep going quickly. So this is, well, as we record, which is March 10th, this will come out a little bit later, but Andrej Karpathy just released a gist, microgpt.py, which is an entire way to run a GPT in under 200 lines of code. So train, inference, tokenizer, autograd, initialize parameters, optimizers and buffers,
repeat in sequence, and then do inference. And it's not optimized at all for scale, but it's meant as a teaching tool. And I've been sort of circling around, how do you actually build these things? I recommended recently, there's a book, Build Your Own LLM from Scratch. This is really great to just see in one file. This is academically how it's all happening. So I would highly recommend, check it out. It's commented. Very cool.
Carlton (59:41)
Check.
Jake Howard (59:41)
Yeah, I mean, side projects for me are few and far between. Originally, when I was much younger and had a lot more energy, I'd be having thousands and thousands of different things. And slowly my GitHub is filling with these things I've built. I'll go, yeah, I'll maintain that. And then I realize it's been like two years since I've had a commit. That's been not so great. One of the things that's kind of,
Carlton (59:51)
Yeah, yeah, yeah.
Jake Howard (1:00:08)
particularly in the last week or so, been on my mind is actually one of the things that, Carlton, you've been working on, which I'm gonna hand over to you quite nicely, and it's around type safety and your project Mantle. I'm not gonna steal your thunder, I'm gonna let you talk about it, but it's a way I've been writing typed Python code quite a lot, in that by its nature a lot of Python and Django
Carlton (1:00:23)
Yeah, well, that was going to be my project of the week, so go on, you talk about it.
Jake Howard (1:00:37)
is hard to type, and so writing a type-safe layer around an untyped API is a really, really nice thing. I do want to kind of rain on your parade a little bit, in that it's a really, really nice idea, but I've also seen it in a couple of other places. Not in Python, actually; funnily enough, it's also a really, really common model in Rust.
Carlton (1:00:50)
Yeah, come on, come on, come on.
That's okay.
Jake Howard (1:01:00)
So, I mean, at least you can say you're copying this thing from Rust and pulling it into Python, which is something.
Carlton (1:01:06)
Okay, so let me talk about it then. Django Mantle is my project of the week. Also its partner project, Mantle DRF, which is a DRF add-on for it so that you can use it with your existing DRF project now. But what it is: it builds on attrs. So you talk about a type-safe layer. You've got your Django model, and you're never gonna make that type safe. And there's some new PEP about,
I can't remember, adaptive typing; I can't remember exactly what the word is. Even if that comes in, we're not gonna be able to retrofit that into the ORM. We're just not. Django is dynamic to the core. It's built on, you know, Python's dynamic capabilities, and we should just embrace that. But for my work code, I want to have modern, simple, plain typed Python classes. So I use attrs for that. I can talk about why
some other time. So I use attrs for that. And then the challenge isn't writing the attrs class, right? Anyone can write a data class: it's got a name and that's a string, and it's got an age and that's an int. Those classes aren't hard to write. The problem is mapping that from the ORM. And so what Mantle provides, and it uses django-readers, which I've talked about on the show a billion times, I love it. It uses django-readers to create efficient ORM queries from your attrs class.
And then those queries will have just the fields that were in it. And it works with nested data, doing prefetch_related automatically. And then it serializes back. So you just do a query: you give it a queryset and the shape you want, and it gives you back typed objects that already did all the fetches. And then you can pass them to your API, or you can pass them to your templates, or you can add the domain logic that you want. And it's all independent from the ORM. And so it's a way of decoupling
your business logic or your domain logic from the ORM models. ORM models are busy enough already. They're busy dealing with the relationship to the database. And one of the big problems we have is when we slap loads and loads of business logic onto our models and they become this big, difficult-to-deal-with monstrosity, essentially. Well, it's a way of cutting cleanly between Django's dynamic core and the type-safe layer on top. Now,
the bit that Mantle brings is that flow. attrs: not original to me. cattrs, the serialization tool I'm using: not original to me. django-readers: not original to me. It's a synthesis of how I've been using those tools over the last few years, basically. And Mantle DRF, as I say, gives you versions of DRF's generic views, its mixins, its viewsets, so that you can just change your imports and, you know,
use a Mantle Shape class in your DRF API. So anyway, go check that out. It's on my website.
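As a rough illustration of the flow Carlton describes (declare a typed shape, then build typed objects from untyped ORM-style rows, keeping only the declared fields and handling nested data), here is a minimal sketch. It is not Mantle's API: it swaps in stdlib dataclasses for attrs and plain dicts for queryset rows so it runs without third-party packages, and `BookShape`, `AuthorShape`, and `to_shape` are hypothetical names.

```python
from dataclasses import dataclass, fields, is_dataclass
from typing import Any, get_args, get_origin, get_type_hints


@dataclass(frozen=True)
class BookShape:
    title: str


@dataclass(frozen=True)
class AuthorShape:
    name: str
    age: int
    books: list[BookShape]


def to_shape(cls: type, row: dict[str, Any]) -> Any:
    """Build a typed `cls` from an untyped row, reading only the declared fields."""
    hints = get_type_hints(cls)
    kwargs: dict[str, Any] = {}
    for field in fields(cls):
        hint = hints[field.name]
        value = row[field.name]
        args = get_args(hint)
        if get_origin(hint) is list and args and is_dataclass(args[0]):
            # Nested collection: convert each child row to its own shape.
            value = [to_shape(args[0], item) for item in value]
        elif is_dataclass(hint):
            # Nested single object.
            value = to_shape(hint, value)
        elif get_origin(hint) is None and isinstance(hint, type):
            # Plain type such as str or int: check it at the boundary.
            if not isinstance(value, hint):
                raise TypeError(f"{field.name}: expected {hint.__name__}")
        kwargs[field.name] = value
    return cls(**kwargs)


# Stand-in for a row fetched from the database; extra keys are ignored.
row = {
    "name": "Ada",
    "age": 36,
    "email": "ada@example.com",
    "books": [{"title": "Notes", "isbn": "x"}],
}
author = to_shape(AuthorShape, row)
```

In this sketch the shape class doubles as the list of fields to fetch, which is the idea behind deriving the query from the class; the real project delegates that part to django-readers and the conversion to cattrs.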
Will (1:03:56)
And we'll put a link to it. And this should be out.
Carlton (1:03:59)
I'm glad you've been looking at it. I haven't really talked about it yet.
Jake Howard (1:04:02)
Yeah, to the best of my knowledge you've talked about it in one place, which happens to be the one place that I happened to see it, which was the PyTV conference last week. Yeah, exactly, and conveniently the two talks I happened to see between meetings and other bits were yours and Sarah's, and they were both fantastic. And Mantle in particular was just that kind of, it's nice that
Carlton (1:04:13)
Right,
Will (1:04:14)
yeah, yeah,
good for you. Yeah, yeah, I'll put a link. Yep. That was the world premiere, yeah.
Carlton (1:04:26)
Right.
Jake Howard (1:04:31)
for a while I've found this type safety on top of untyped stuff a really good way of thinking about things. You can hide a bunch of weirdness, particularly if you need to do a bunch of weird things to make it type safe. You can hide that stuff in a function and just use this public API that makes things nice. We do that for operating system abstractions, Rust does it for...
Carlton (1:04:52)
Yeah, and it returns.
Jake Howard (1:04:57)
safe and unsafe code, so why can't we do it for safe and unsafe types? Mantle in particular obviously is the Django way of doing it, but that thought process of facading stuff, I think, is a tool that works for developers in general, even outside of Python.
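The facade idea Jake describes can be sketched in a few lines: the untyped oddities live inside one function, and callers only ever see a typed public API. This is a minimal illustration under assumptions, not code from any project mentioned here; `fetch_config_untyped`, `Config`, and `load_config` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Any


def fetch_config_untyped() -> Any:
    """Hypothetical stand-in for a dynamic API a type checker cannot see into."""
    return {"DEBUG": "true", "WORKERS": "4"}


@dataclass(frozen=True)
class Config:
    debug: bool
    workers: int


def load_config() -> Config:
    """Typed facade: the key lookups and string parsing stay hidden in here."""
    raw = fetch_config_untyped()
    return Config(
        debug=str(raw["DEBUG"]).lower() == "true",
        workers=int(raw["WORKERS"]),
    )


config = load_config()
```

Everything past `load_config` works with `Config` and never touches the raw dictionary, much as safe Rust code calls a safe function that wraps an `unsafe` block.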
Carlton (1:04:58)
Yes, yes, yes.
Yeah, no, exactly.
But it also means that you can make use of Python's dynamic powers when those are of use. You don't have to abandon all that power and leave it behind, and I think that's really important as well. And you remember that thread on the forum about how Django needs a REST story? And I said, well, we need a serialization story. Well, that's attrs and cattrs, for me. But the point I made was, well, it's got to be ORM aware. And that's what Mantle gives you as well: the type safety and the modern serialization story, but ORM aware. Anyway, I'm putting it out now, so I'm glad you've seen it.
Will (1:05:53)
And you're gonna be at DjangoCon Europe and PyCon, wait, PyCon Italia. Do I have that right? Are you going, Carlton?
Carlton (1:05:58)
Yeah, I'm going to be keynoting at DjangoCon Europe. I'm going to do a long version, the director's cut, talking about typing and Mantle in a bit more depth than I was able to do in the PyTV event. And then I'm going to give a shorter version again at PyCon Italia. So if you're anywhere in Europe this spring, you can come and see me talk.
Will (1:06:19)
All right, Jake, I think we're almost out of time. Is there anything we didn't ask you or that you want to highlight before we head out? Oh, yeah. All right. I'll hand it to you. Digitally hand it to you. Here's your magic wand. You can't change tasks, but something else in Django.
Carlton (1:06:25)
Magic wand, magic wand.
Jake Howard (1:06:34)
I mean, I can change tasks, but it would be very weird for me to go, "Actually, I want to rewrite all of this, it's terrible, I'm going to redo it."
Carlton (1:06:35)
Well, you don't even have to answer that.
Will (1:06:41)
Rewrite it in Rust. Yeah, yeah.
Jake Howard (1:06:43)
Yeah, I mean, you're going to have to explain the magic wand to me a little bit.
Will (1:06:45)
Well, that counts.
Well, it's just this: generally we don't have guests who've actually changed something in Django. So it's just this idea of code, community, wouldn't it be nice if we could just change one thing.
Carlton (1:06:57)
one thing you could fix.
Magic.
Jake Howard (1:06:59)
Yeah, I think one of the big ones is that the Django core codebase seems to be a little bit allergic to dependencies. That's possibly a spicy hot take. Well, it's a fact that it's allergic to dependencies. My hot take is I don't think that's good. I don't know if I'd go as far as to say it's bad, but I think if we were less allergic to things,
Carlton (1:07:10)
Hot take, spicy!
Will (1:07:11)
Hahaha
Jake Howard (1:07:25)
some things could be made a lot better. I'm not even thinking of the massive things like, oh, we should just depend on Django REST framework and be done with it, and we can solve the REST story once and for all. I don't think that actually solves anything. But there are lots of things inside Django where we've had to, not so much reinvent the wheel, but kind of write some stuff ourselves
because we can't use a third-party library, or we've had to vendor little bits of it to work for what we need. If we could instead pull in a dependency, that works out great. It was really restrictive in Django tasks, because that's meant to be a backport. Django tasks has no dependencies. It would be really nice if for certain things I could pull in various dependencies.
But I know that if, as part of shipping Django tasks, I also shipped five additional dependencies for Django, it would be a real problem. I don't really know what the solution is to this, but I think possibly the easier way is making a good argument that this dependency in particular brings us a bunch of benefits, particularly around some of the utils and the core HTTP bits of Django.
They're not unique to Django. They are used by basically every web framework out there. Why can't we have... We need one unifying implementation to bring them all together. I don't want to start this meme again that basically everything I do is making a unified implementation, but at least if we, if there are implementations already out there, maybe we can use them. Maybe we can remove a bunch of stuff.
inside Django, maybe we can shrink the code base a bit, shrink the amount of work that the Fellows need to do, and use trusted third-party tools, not just pulling in some random person's thing, ideally as hard dependencies rather than the optional ones we've already got, that bring in those extra benefits without giving ourselves a ton of extra work.
Carlton (1:09:30)
No, that's a really good one. That's a really good one. There's the whole hyper project around HTTP in Python, which, you know, they've got some good code in there, and it's trustworthy, and you used the word trusted. The issue is, you know, the JavaScript project that brings in 400 dependencies and then they all break and so you can't update, right? That's the historic pushback: very few projects have the stability that Django would need in order to pull them in as dependencies.
But yeah, those kind of core cases, I think there's a real good case. I like that. It's a good spicy hot take. Lovely.
Jake Howard (1:10:05)
All the best takes are the spiciest.
Will (1:10:07)
All right, well, Jake, thank you so much for coming on. Thank you for all the work on tasks. I mean, Carlton was sort of whispering to me, like, oh, there's this thing coming, this thing coming. And it's still coming. I'm glad you had a chance to address some of the questions. And tasks is not going away. In fact, it will be easier and more powerful to use going forward.
Carlton (1:10:13)
day.
Jake Howard (1:10:26)
Yeah, it's not going away and it's not done either. We've done a chunk of it. It's usable today. If you're on 6.0, use Django tasks, get on with nice easy background tasks. As time goes on, it'll just get better and better and better. Also, I really hope it's not just me working on this for the next 10 years. Like, if you are interested in background tasks, API design, the kind of
the systems programming that you need to do for building background workers, please sort of appear, contribute to things, get involved in discussions. Like, I don't want to have to maintain this entire thing on my own, please.
Carlton (1:11:03)
call to action box.
Will (1:11:04)
Yep. All right. So we are DjangoChat.com. We're on YouTube and we'll see everyone next time. Bye bye.
Jake Howard (1:11:04)
Exactly.
Carlton (1:11:10)
Bye bye.
Jake Howard (1:11:10)
See you