Django Chat

Coverage.py - Ned Batchelder

Episode Summary

Ned is the creator of `coverage.py`, a longtime organizer of the Boston Python Group, and works at edX. We discuss what’s changed in Django over the years, his thoughts on testing best practices, and managing a massive codebase.

Episode Notes

Support the Show

This podcast does not have any ads or sponsors. To support the show, please consider visiting LearnDjango.com, Button, or Django News.

Episode Transcription

Will Vincent 0:05
Hello, and welcome to another episode of Django Chat, a podcast on the Django web framework. I'm Will Vincent, joined by Carlton Gibson. Hello, Carlton. Hi, Will. This week we're joined by Ned Batchelder. So Ned, welcome to the show. We're gonna talk a lot about testing, but also your career in Python and Django, coverage, and all sorts of things. So really happy you could come on. You're someone we've wanted to have on for a long time now, so thank you for making the time.

Ned Batchelder 0:32
Sure. I think we've been exchanging emails about this possibility for a while.

Will Vincent 0:36
I think from the beginning. I mean, the shortlist I had when we started this, you were on it. So

Ned Batchelder 0:43
Nice. Well, I'm honored. I'm honored to be early on the list and late on the recording.

Carlton Gibson 0:48
do you? You'll notice his practice, we are now.

Ned Batchelder 0:53
Very smooth.

Will Vincent 0:54
Yes. So, as a brief intro, so you don't have to say it yourself: you're a pillar of the Boston Python community, you run coverage.py, you've given lots of talks at PyCons. You've been most recently at edX. I think most people in the Boston Python community know who you are, and many of the people at PyCons know who you are. But for those listening who perhaps don't, how do you describe yourself these days, in terms of your background and the community?

Ned Batchelder 1:25
Yeah, that's a good question. I often say I'm deeply embedded in the Python community. You touched on Boston Python. So I'm here in Boston, Will, as you are, or at least last I heard; who knows what has happened in the 18 months of the lockdown, people seem to move all over the place. But yeah, I've been organizing the Boston Python user group for over a decade, I guess, now. You mentioned PyCon; I have done a lot of PyCon talks. But with the pandemic and a few missed PyCons, it's actually been a while since I've been to a PyCon. And I haven't done any of the virtual PyCons, so I need to get back into that. But I have been maintaining coverage.py for a very long time, as well as some other much less interesting side projects. And I work at edX as a day job, which is an open source project built on Django, and I work on the open source team at edX. So my whole life is open source Python, pretty much.

Will Vincent 2:30
So I think we need to be more specific. We both live in Brookline, because I saw your mapping project. And I thought I saw you and your son, like two years ago, walking along the Riverway, but I didn't want to interrupt you. And you have this very cool mapping project I was looking at on your site, which we'll have a link to. So I think I have a pretty good idea where you live, but I live in Brookline as well.

Ned Batchelder 2:55
That's right. So during the pandemic, my exercise of choice has been walking, because it keeps me apart from people and I can do it from home. And my son was at home with me, and he needed to walk too. And to keep it interesting, I've been trying to walk on new streets every time. I've done now, I think, 302 walks from the house to the house, and only two of them have missed getting any new streets. So it's been a fun, nerdy way to get exercise.

Carlton Gibson 3:21
So this is reminiscent of Kant's Seven Bridges of Königsberg. The problem is,

Ned Batchelder 3:27
the topology is much, much worse. And we don't have time to get into all of the nerdy details that go through my head when planning walks, and kicking myself for having missed a street I could have walked on, all of that. Okay, and

Carlton Gibson 3:42
you put it I saw an image go past this week, I think I didn't know about this. And then you've blogged about it on your site.

Ned Batchelder 3:48
That's right, there have been two blog posts, well, three blog posts technically, about it so far. But yeah, when I completed my 300th walk, I did another kind of visualization about how hard it is to get to new streets on walks 201 through 300, because you have to walk over all of the old streets you've already walked on to get to the new streets, et cetera, et cetera.

Carlton Gibson 4:07
And you end up going ever further from home because well, yes, both

Ned Batchelder 4:11
because I'm getting more fit, so I can do a six and a half mile walk and not collapse. But also, if I want to get to a new street I have to walk farther, because I've already walked on all the streets within the five mile radius. Or the two and a half mile radius, I guess. So yeah, it's been a lot of fun.

Carlton Gibson 4:31
You mentioned that it keeps you away from people. Would you say that's part of a programmer personality type thing there?

Ned Batchelder 4:38
Yeah. It sits well with my personality. I don't mind the social distancing.

Will Vincent 4:43
Well, that's a beautiful thing to do during COVID I'm, I'm jealous. I was stuck here at a time.

Ned Batchelder 4:50
That's right. And I've walked on streets during these walks that are within 50 yards of my house that I never had a reason to visit before. So it is it

Will Vincent 5:01
Well, that's great. So Python and Django, maybe I could just ask. You have a long history with Django. I think I've seen you say it goes back to 2006. I'm curious if you could talk about that, and how you've seen the project develop, since that's pretty early.

Ned Batchelder 5:17
That is early. So actually, my earliest touchpoint with Django was part of Boston Python. In, I think, November of 2005, Boston Python, as one of their events, said, hey, let's do sort of a shootout of some web frameworks. And different people volunteered for different frameworks. I actually got TurboGears as the thing to try out, and someone else got this other thing, that new thing called Django, to try out. We chose an app, and we each built the same app and compared. But shortly after that, in January 2006, I actually got a new job working for a startup called Tabblo, T-A-B-B-L-O. Not Tableau, the billion dollar database tool; the photo sharing and storytelling website that was eventually purchased, and then disappeared, by Hewlett Packard. But that was all built in Django and Python. And the reason I got the job is that the founder of the company had asked around in Boston, like, how can I find some Python expertise? And that eventually led him to me, and we had a coffee, and, you know, the rest is history. And that was on Django 0.91, in 2006. And actually, my first week on the job: I had given three weeks notice at the old job instead of two weeks, and so through bad planning, my first week on the job was a week that my boss, the founder of the startup, was going to be on the other side of the country for a business trip. But, you know, Monday morning I showed up in the office, and he said, can you build some, like, ACL stuff? We need some access controls on this app we're building. See you later. And I spent the week digging into Django and hacking around and building ACLs. And he showed up, like, Friday afternoon, and I showed him what I had built. And, you know, we were both happy with each other from then on. So it was kind of a trial by fire, but it worked out. Well, that's like the dream

Carlton Gibson 7:13
first week. No, yeah, that's a perfect go do some programming, we're not gonna,

Ned Batchelder 7:18
In some ways, yeah, it was great. I mean, the size of the startup was, I think, three engineers, including me and the founder. And so there was just a lot of heads-down time, just hacking on stuff; there was not a lot of overhead or difficulty organizationally. So that was great. It was very early on. I remember we had one problem with our homepage, that it was doing 9009 queries to build the homepage or something. And the reason was that if you got a related object, and then asked the related object for the parent, that would do a new query to get the parent, even though the parent is how you got to the object in the first place. And so I submitted a patch to Django that would just hold on to that object for later, so that we only needed 100 queries or something. It's just one of those things: early on in the ORM, there hadn't been that much optimization done, so there was a lot of low-hanging fruit. And so in the early days you could dig into those kinds of things and make those kinds of patches. The great thing about Django now, in some ways, is there isn't that low-hanging fruit, because it's so mature. But in those days, there was still a lot of work that needed to be done to bring it up to speed.
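
The caching idea described above can be sketched in plain Python, without Django. This is only an illustration of the pattern, not the actual patch: the FakeDB, Album and Photo names are hypothetical stand-ins, and the "database" just counts how many lookups it serves. When each child is handed a reference to the parent it was loaded from, asking the child for its parent no longer costs a query.

```python
class FakeDB:
    """Hypothetical stand-in for a database; it only counts queries."""

    def __init__(self):
        self.query_count = 0
        self.albums = {1: "Vacation"}
        self.photo_ids = {1: [10, 11, 12]}  # album id -> photo ids

    def get_album(self, album_id):
        self.query_count += 1
        return Album(self, album_id, self.albums[album_id])

    def get_photo_ids(self, album_id):
        self.query_count += 1
        return list(self.photo_ids[album_id])


class Album:
    def __init__(self, db, album_id, title):
        self.db, self.id, self.title = db, album_id, title

    def photos(self, cache_parent):
        ids = self.db.get_photo_ids(self.id)
        # With cache_parent=True each photo remembers the album it came from.
        parent = self if cache_parent else None
        return [Photo(self.db, pid, self.id, parent) for pid in ids]


class Photo:
    def __init__(self, db, photo_id, album_id, parent=None):
        self.db, self.id, self.album_id = db, photo_id, album_id
        self._album = parent

    @property
    def album(self):
        # Only go back to the database if the parent was not handed to us.
        if self._album is None:
            self._album = self.db.get_album(self.album_id)
        return self._album


def count_queries(cache_parent):
    db = FakeDB()
    album = db.get_album(1)                   # one query for the parent
    for photo in album.photos(cache_parent):  # one query for the children
        _ = photo.album.title                 # one extra query each, unless cached
    return db.query_count
```

With three photos, count_queries(False) issues five lookups and count_queries(True) issues two; on a page with thousands of related objects the same difference is what turns thousands of queries into a handful.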

Carlton Gibson 8:27
Yeah, no select related, or any of those, you know, conveniences for me

Ned Batchelder 8:32
remember, select related was an option at

Carlton Gibson 8:34
the time. Yeah, no

Will Vincent 8:35
No class-based views either; you had to really write functions.

Ned Batchelder 8:38
But listen, from my point of view, class-based views are still sort of mysterious and new and confusing, and almost a poster child for what's wrong with multiple inheritance. But yeah, definitely, we don't have to get into that I'm sure if I know co

Will Vincent 8:54
are now on Django, like the whole point.

Ned Batchelder 8:58
So we're gonna touch on this theme a bunch, I imagine, but I tend to start doing a thing once and then keep doing that thing for a very long time and not consider new things. And the difference between function-based views and class-based views fits neatly into that narrative. I know how to write a function as a view. And class-based views, to me, still seem weird.

Carlton Gibson 9:23
So if later you fire up a blank views.py file, and you need to write a view, you're going to write a function-based view.

Ned Batchelder 9:31
That's my go-to. Yeah, and I mean, since those early days of, like, spending a week writing access controls, I haven't done that much hardcore, direct Django development. So in some ways, you know, it's really moved on without me, and that's one of the challenges. You know, being a long-time adopter of a technology can sometimes mean that you're an expert in the way that technology was 15 years ago, and you're stuck, because you've just been spending your time doing something else rather than keeping up with all the amazing work that Django has been putting out. So, like, don't even talk to me about async. I

Carlton Gibson 10:13
Will was asking me earlier on about something: how do I organize, where do I put my settings files or something? And I described a little file system layout to him. And he said, oh, old school. Probably,

Will Vincent 10:23
Yeah, I know, it's hard. I mean, people call me old school now. But yeah, you were describing, you know, running manage.py with --settings. And I said, you know, you don't have to; it's what you're used to, and you can still do it. But you used to have multiple settings files: you'd have, like, a base, and local, production, test. And I would say these days you can use environment variables. So you have one file, and then you swap in values with the environment variables. But it still works for Carlton. So, you know, although

Carlton Gibson 10:51
I still, you know, I still specify my settings on the command line, because who knows where they might be otherwise, you know? Even if it's one file, it's still,

Will Vincent 11:00
I think that saves me as I'm constantly teaching beginners, so I feel like I'm obligated to find out if there's a better way, whereas for my own stuff, I mean, it's painful. It's like, what's the point? Like, I know what I want to build, I just want to build it, you know, it's sort of like installing Python. If you like, you know, we have our everyone has a different way to do it. Unless, you know, actually, I do want to say that like with running Boston, Python, unusually, you get a lot of exposure to a wide spectrum. I mean, both you make because you have talks, and there's project nights, you get total beginners, and then you get, you know, postdocs at, you know, weed science places. So, I want to just broadly ask you about that, because you probably see more spectrum than I do. Whereas most people are in their little lane with a, you know, work what's in front of them?

Ned Batchelder 11:46
Yeah, it's interesting. I mean, you mentioned teaching beginners. The great thing about teaching beginners is you don't have to bring them up to speed from the way things were in 2006, right? You can just pretend that the timeline starts now: now we do it like this, and the old stuff doesn't matter. Carlton and I are burdened with all that luggage, so we will probably carry some of that around. And we may touch on another recent thing, which is me redoing my personal website as an actual Django project, which, you know, full disclosure, I think has seven or eight different settings files with different names, like base, and the server name, et cetera, et cetera. And I know there's a better way to do it. But I also know that if I go and look for the better way to do it, I'm going to be presented, you know, in classic Python style, with a dozen different options hacked together by a dozen different people, each of whom thinks their way is the best, and it's hard to sort it all out. Again, one of the nice things about Django is the out-of-the-box, Django-will-tell-you-how-to-do-it kind of mentality, which is a rare island in the chaos of the Python world, where we love the fact that people can hack together something and get it out there and people will start using it. But that makes for a lot of competition among all these solutions, which can be really overwhelming for beginners, or not even beginners, but people for whom it's not their passion. Like, I don't want to make settings files, I want to make a website. So please just tell me how to make my settings file so I can get on with the thing I'm trying to do.
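
The single-settings-file approach Will mentions earlier, where values are swapped in from environment variables, might look roughly like the sketch below. It is one of the many possible options, not an official Django recipe. The variable names DJANGO_DEBUG, DJANGO_SECRET_KEY and DJANGO_ALLOWED_HOSTS are assumptions chosen for this example (Django itself does not read them), and in a real settings.py the values would be assigned to module-level names such as DEBUG rather than returned in a dict.

```python
import os


def env_bool(name, default=False, environ=os.environ):
    """Read a boolean flag such as "1", "true" or "yes" from the environment."""
    raw = environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def build_settings(environ=os.environ):
    """Build a few settings from environment variables (names are hypothetical)."""
    hosts = environ.get("DJANGO_ALLOWED_HOSTS", "")
    return {
        # Default to the safe value so a forgotten variable never enables debug.
        "DEBUG": env_bool("DJANGO_DEBUG", default=False, environ=environ),
        # The fallback key is only meant for local development.
        "SECRET_KEY": environ.get("DJANGO_SECRET_KEY", "insecure-dev-only-key"),
        # A comma-separated list becomes a Python list; empty entries are dropped.
        "ALLOWED_HOSTS": [h.strip() for h in hosts.split(",") if h.strip()],
    }
```

The trade-off against the seven-or-eight-files layout is that the differences between machines live in each machine's environment instead of in the repository, so there is one file to read but the deployed values are no longer visible in version control.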

Will Vincent 13:19
Right? But it's not JavaScript, right? I mean, because I feel that but then when I whenever I it's not just no JavaScript, I'm always like, wow, compared to JavaScript, Python is gloriously.

Ned Batchelder 13:29
Yeah, the problem with JavaScript is that they don't, and I shouldn't talk trash about JavaScript. But my impression of the JavaScript world is that every time I dip in there, which is about once a year, there seems to be a new way to do things that is the right way to do things. So they haven't embraced the idea that there are many right ways to do things. They've sort of embraced the idea that there's always a right way to do things, but it's different every single year. And that's almost more difficult to keep up with, because now it's not just, oh, I chose A and you chose B; it's, I'm doing it right and you're doing it wrong, because you're still doing it last year's way. That's the way it feels to me, at least. Maybe if I were more embedded in the JavaScript world, it wouldn't feel quite so foreign to me. But you asked me about Boston Python and the beginners, and it is true, Boston Python has a wide range of people attending. You mentioned talks and project nights; we've actually slowed down quite a bit during the lockdown. We haven't had a presentation night in a while. One of the interesting things about running a geographically based meetup during a pandemic is that everything goes virtual, and then all sorts of people who live nowhere near you start showing up to your events. And that's okay, I guess. I'm not sure, like, what does the word Boston mean in Boston Python if everything's on Zoom now? I don't know.

Will Vincent 14:51
San Francisco one two is had a number of May I think that's the other large one that I'm aware of.

Unknown Speaker 14:56
And they've had a couple New York is very big, I guess they are not as active Right,

Will Vincent 15:00
exactly. Yeah,

Ned Batchelder 15:01
that's right. I think they put a lot of their energy into PyGotham, which is their annual. You know, Python like thing.

Ned Batchelder 15:10
So yeah, Boston Python, we do get a lot of beginners. It's interesting: we don't get that much discussion from sort of Django beginners; we get a lot of Python beginners. Yeah. And a lot of data beginners. Yeah, well, there

Will Vincent 15:23
is that I'm not sure. There is a Django Django doesn't grow much true, which isn't as large. But I mean, I agree. I, for a while I went, I tried to go to a number of the Boston Python ones and almost always saw you there. Which is a good I was like, man, I don't know if I should go all the time. And yet, you found the time. But it's a lot of memories, a lot of grad students or it's, you know, are people getting a graduate degree in whatever and realize they want to, you know, script something. And so Python, so yeah, not much of a web focus.

Ned Batchelder 15:54
Exactly right. And the new thing we've been doing in Boston Python during the lockdown is weekly office hours. So Mondays at noon, we just get on Zoom for an hour. And that's one of those things that we couldn't have done in person, because who's gonna travel for an hour-long thing at lunchtime? But it's easy to do weekly, and there's no prep for it: you just show up, and people just start talking about whatever, and maybe someone knows the answer or wants to talk about it, and you talk. So that's been very easy. You all

Will Vincent 16:27
are saints to do that. I feel maybe it's just because I get a lot of emails from people, which I do respond to every one eventually. But I feel like, I couldn't do that. It would drive me nuts. But I wish I could.

Ned Batchelder 16:40
On the flip side, I do get emails from people, and I have every intention of replying to them, and then they're six months old, and I just have to declare bankruptcy. Well,

Will Vincent 16:49
you know, it's important just because someone emails you doesn't mean you're obligated to respond.

Ned Batchelder 16:54
Right? No, I know. But I mean, I want to, and then I didn't. I mean, that's the open source dilemma, right? No one is owed your attention, but you get into it because you want to give them attention. And how do you strike that balance and feel good about it, and still make progress on the things you want to do, and stuff like that? So I mean,

Carlton Gibson 17:17
that's a good question. Do you have wisdom on it? I mean, you're an experienced open source contributor; what would you

Ned Batchelder 17:22
say? I go back to Will's point, which is that you are not obligated. It is very easy when you put a project out there. So as Will mentioned in the introduction, I'm the maintainer of coverage.py, which is the coverage measurement tool for Python.

Will Vincent 17:36
Explain now Congrats. By the way that was yesterday.

Ned Batchelder 17:39
6.0 came out. Yep. Yes, it was, actually. Well, there's a funny story about that. I was being kept awake in an Airbnb in Brooklyn by our next door neighbor, who was having an electronic dance party until about two in the morning. That's why

Will Vincent 17:52
you can't leave Boston, you got to stay shuts down at night. And

Ned Batchelder 17:56
and while I couldn't go to sleep, I figured, well, I might as well push out a release of coverage.py. I meant to do it to coincide with 3.10 anyway. And if I do it in the middle of the weekend, then maybe it won't crash too many Travis jobs, as major releases often do. So I did it then. So yeah, coverage.py, right. It's a big project in the sense that lots of people use it, because there really isn't any competition for coverage measurement in the Python world. Most people don't even realize that the standard library has a coverage measurement module in it, called trace.py; they just come and get a third-party module called coverage.py, which I maintain. But as a result of that, there's a lot of issues that get written that I can't debug, because they tend to be about esoteric execution environments. Like, oh, my TensorFlow something-or-other didn't get coverage measured. And I'm like, I only understood half of those words, and I don't have the time to go and figure that out. So it sits there. It's very easy to feel like, well, I put this out in the world, so, you know, I need to make it work, and if it doesn't work, then that's me being a bad person, not fixing it. And you have to get past that feeling. You have to allow yourself to take time off, or just turn away from it for a while. There are months where I'm just not interested in working on coverage.py, so I just don't. And occasionally it'll feel like, well, maybe I'm never going to get back to it, and maybe I'm just done with it. And then something comes up, an interesting bug or a release, and it's like, oh, coverage.py, that's a cool thing, I'll do some stuff. And if it didn't come back, all right, then it wouldn't. You know, maybe there'll be a whole year where it never comes back to me, and that'll be it. I don't know.

Will Vincent 19:45
I see Carlton nodding. I mean, you two are both uniquely in this bucket. I think Carlton, aside from Django, has a number of projects that he maintains over many years as well. The big

Carlton Gibson 19:55
one that I'm working to maintain at the moment is the Channels project. And yeah, you talked about esoteric execution environments; it's exactly that. Someone will post a bug, and it'll be like, you know, an nginx config and a few lines of some logging file. And it's like, I'm sorry, I don't even know where to begin to respond to this question. And so I've been trying to click "move to discussion" for those, which is a new thing. It's like, well, I don't just want to click close, right? But I can't answer this; it's not really an issue. So maybe I move it to discussions, and it keeps the issues list down. One of the sort of demotivators for me is when, you know, the issues are piling up, and I feel a pressure, and it's hard; you know, I haven't got the emotional space to even look at the repo. Whereas if I can move those issues over to discussions, those unaddressable ones, then maybe someone else finds it searching, and they're like, oh, I had this problem, and I did this. And that does happen. But then the issues can be like, oh, actually, there's addressable code, I don't have this identified.

Ned Batchelder 21:06
That's one thing I don't mind. The coverage.py repo, I think, has 200 open issues right now, going back years. I'm okay with that. Whatever. Some people say you should close them if you're not going to fix them. The author can close it; if they want to close it, they can close it. You know, I think the signals are pretty clear in that repo: your issue might not get looked at. And one thing that has been great is that people maybe are getting better at making reproducible test cases. A few times, I've gotten a Docker image with the bug report, like, here is the Docker image showing it happening. And I'm like, great, now I can see it. Because, you know, too often you get people saying, look, it failed like this. Can you give me a reproducible case? Well, here's a link to my Jenkins job that failed. How far in do I have to click before I can even maybe find the error message you're talking about? You know, if you can't put work into this, why am I putting work into this? So, listeners out there, when you write a bug, make it really, really reproducible, right? Write the instructions for that art school friend of yours who doesn't know how to use computers, and you are going to be doing the maintainer a huge favor.

Carlton Gibson 22:26
Well, because the maintainer will will run it, and then they'll get to the breakpoint. And they'll be like, oh, okay, I'll give you a few. Exactly. Oh, that's my code. Oh, why is that? You've got a chance.

Ned Batchelder 22:38
And I understand that the TensorFlow people, they don't know anything about how coverage works, right? And I don't know anything about how TensorFlow works. So we have to meet in the middle somehow. Yeah. That's fine. I'm willing to dig into it; it's interesting to dig into. There was a bug report about Pony, or Pony ORM, which is one of the tools where, I think it's the Pony ORM, you write a query in Python code, and then they rewrite that Python code. And so when your query is running, it's not actually your code running. It's a rewritten version of your code running. So when you try to do coverage measurement, coverage doesn't think your code is running, because your code isn't running. It's been moved over into a parallel universe. And that was very interesting to see. I think that was

Will Vincent 23:26
Yeah, it's called the smartest Python ORM. Okay,

Ned Batchelder 23:30
So I have a pull request against Pony ORM to make that a little bit clearer to people. And they haven't bothered to merge it yet, even though it's really simple. But there you go. Turnabout is fair play, I guess, right? It's perhaps not

Carlton Gibson 23:43
reproducible? You'll see.

Will Vincent 23:48
Can I ask about? So you mentioned that Python has its own internal sort of coverage testing tool? Has there been any talk of so we have this in the Django community with third party packages, and Django is kept deliberately quite small. You know, you only pull in things when you really need to, has there been any talk of pulling coverage into Python? And I know the PSF has fellows and stuff. I mean, it's pretty much a standard tool for anyone I know who does Python professionally?

Ned Batchelder 24:16
Yes. Um, so there's two things you might have meant by that. One is pull it into the standard library, and the other is move it under the PSF org. I guess I meant, just as a question, either. Yeah. So there hasn't been any discussion of either of them. These days, I think Python is doing the right thing by being very reluctant to put things in the standard library, under the theory that third-party packages can evolve more frequently, and it will keep the bloat down on the Python releases. And it's just better to have things decoupled than to have everything piled into the box. You know, we should all get better at installing and using third-party packages, rather than hoping everything's going to go into the standard library. And the PSF thing, the moving: the two projects that I know off the top of my head that are under the PSF organization on GitHub are Requests and Black. And I'm not quite sure. I don't know, that one always struck me as an odd thing. I'm not sure why the PSF, which doesn't even own the Python repo, would own third-party package repos. But

Will Vincent 25:26
I don't know the story on Black. I mean, on Requests, it was because the maintainer was stepping away. And, you know, there was some other stuff. So I think they wanted to keep it going. I'm not sure about Black, but I know they have some

Ned Batchelder 25:39
Black was started by Łukasz Langa, right? So he's well in place there in the PSF, and very involved. He's hardly walking away from it. So, right, I don't quite know what it means when something is under the PSF organization, because it's not the PSF that's maintaining it.

Will Vincent 25:59
Right. I suppose it just means maybe a fun? Well, yeah, we always I mean, we Django look to the PSF is sort of a big brother sister terms of size on these things, because many of the same issues around.

Ned Batchelder 26:13
Yep. Organizing. Yeah. And edX is going through a similar transition right now as well, so we often look to the PSF for how to do things. For instance, the PSF has PEPs, Python Enhancement Proposals. Open edX has OEPs, Open edX Proposals, I guess we call them. Not quite the same acronym. But so no, moving coverage.py hasn't been suggested. And maybe that's just because I keep maintaining it enough. Probably,

Will Vincent 26:43
I mean, I guess, right, it's not a squeaky wheel. So

Ned Batchelder 26:48
it's not a squeaky wheel, right. I mean, it's interesting maintaining coverage.py, because I get involved with new Python releases, because coverage is intimately aware of how different Python releases run the trace function to indicate which lines got run and which didn't. And 3.10: there's two reasons why I coordinated the coverage 6 release with the 3.10 release. Mark Shannon was doing a lot of work in 3.10 to fix how the trace function traces things; there are lots of weird edge cases, and he was doing a lot of work to fix that. And when he fixed it, coverage.py would break, because coverage.py was used to the old broken way that things were getting traced. And so I think I wrote 10 different bug reports against Python 3.10 during its alpha phase, saying, it looks like this now; is this what you meant? And about half of them were, yes, it's what I meant, and about half of them were like, oh, no, that's not what I meant. And so we were kind of going back and forth, 50/50: does coverage have to change now, or does Python have to change now? And we finally got that all straightened out, I think by the time of at least beta two, maybe even beta one. So I was really happy to be able to push that out there. But the other thing that coverage.py had to do for 3.10 is that 3.10 now has the match/case syntax for doing pattern matching. And that is a change in execution and, ironically, a change in syntax highlighting, which is something that coverage.py does along the way, too. So it was a very involved release to get coverage.py synced up with 3.10.
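
The trace function mentioned above is a real CPython hook, installed with sys.settrace. The sketch below is a toy line recorder built on it, to show the kind of per-line events a coverage tool depends on; it is not how coverage.py itself is implemented. The lines_run helper and the classify example function are hypothetical names made up for this illustration.

```python
import sys


def lines_run(func, *args):
    """Return the line offsets (relative to the def line) executed inside func."""
    code = func.__code__
    seen = set()

    def local_tracer(frame, event, arg):
        # "line" events fire just before each new source line executes.
        if event == "line":
            seen.add(frame.f_lineno - code.co_firstlineno)
        return local_tracer

    def global_tracer(frame, event, arg):
        # Only follow frames that are running the function we care about.
        if event == "call" and frame.f_code is code:
            return local_tracer
        return None

    previous = sys.gettrace()
    sys.settrace(global_tracer)
    try:
        func(*args)
    finally:
        sys.settrace(previous)  # always restore whatever tracer was active
    return sorted(seen)


def classify(n):
    if n < 0:
        return "negative"
    return "non-negative"
```

Calling lines_run(classify, 5) reports offsets 1 and 3 (the if line and the second return), while lines_run(classify, -1) reports 1 and 2. Which line numbers the interpreter reports for trickier constructs is exactly the kind of detail that shifted between Python releases in the story above.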

Carlton Gibson 28:36
That's cool. I mean, the new syntax in 3.10 is cool. In Django, we don't get to use it for years, because obviously, you know, we're still on 3.6, right? Exactly. So we don't get to use the new match/case for quite a long time yet, but quite

Ned Batchelder 28:50
Yeah, I totally understand. There was a time when coverage.py would run on everything from Python 2.3 up to, I think, 3.4 at the time, or something like that. Coverage 6 is mostly called 6 because I dropped Python 2 support. So now I can use f-strings, for instance. So we're on 3.6 plus now as well.

Carlton Gibson 29:11
I mean, what's your policy following Python? Because Python 3.5 was end of life at the end of last year, and 3.6 is going to be end of life this December, I think. So what's your policy about dropping support now, with the new release cycle?

Ned Batchelder 29:27
I tend to be very, very accommodating. I know some people are like, "Oh, Python 2 is out of support on January 1, 2020; I am pushing my new version that drops Python 2 on January 2, and I'm doing it for the good of everybody, because everybody should move up," et cetera. My feeling is, I'm building this tool so that people can use it, and if keeping Python 2 support isn't a pain, then why not just keep it? With coverage 6, really, how it happened was that there was a different change in coverage that felt like it was going to need a major version bump, because it changed the behavior. Coverage 5 and before would often accidentally measure third-party packages that you'd installed in your site-packages. In coverage 6, we made a change so that it was clever about where the code was coming from, and it would automatically exclude things that had been installed where third-party packages go. And that's great. But some people said, "You know, actually, that kind of broke my coverage, and I have to go in and change my configuration." So that felt like it should be a major version bump. And since I was doing a major version bump, and it's been 18 months since Python 2 went away, I figured I might as well: let's clean this up now and drop Python 2.

Carlton Gibson 30:40
Folks can still pin to the old version.

Ned Batchelder 30:47
And they can pin to 5.5 if they want. Yeah, that's always the consolation: by dropping support, you're not dropping anyone using your code, because they can just keep using whatever code they were using.

Carlton Gibson 30:59
I mean, super. So there was an article in the news this weekend, you know, the tech news, about, what was it, that coverage itself isn't a good marker of the test base. And what they dug into was that it's the number of tests that have like, you know, coverage attorney. And I remember asking you a question on Twitter about this a few months ago: is there a way of marking which tests are meant to cover what? Because I might have a few Selenium tests, which are kind of big end-to-end ones, and just by running those I get quite a high coverage number. So I guess I'd ask: what's your thought on that, and how coverage works, and how you build a quality suite?

Ned Batchelder 31:42
Yeah, and that is a tricky problem. Because, for instance, even if you write one test and then run your project under coverage, you're probably going to get like 35% coverage, just because all of your import statements will have run and all of your definition statements will have run, and all of that gets measured. It doesn't mean you're one-third tested yet. And by the way, depending on what you do in your asserts, you might literally not be testing: you might be running lots of code and not testing any of it, right? I think I saw that headline go by about the coverage problem, but I didn't go and read the article. But your point is a good one: you can write a test which exercises lots of code but can't actually assess the results of all that code. And if you had a way of marking a test, saying that when I run this test, I don't want you to count the coverage, or only count the coverage on this part of it... yeah. There have been a few ideas about that in the coverage issue tracker, none of which we've implemented. I keep saying "we," but it's really just me.

Carlton Gibson 32:51
You and your alternate person at least.

Ned Batchelder 32:54
I code up whatever my Rice Krispies tell me to. The big thing in coverage 5 was that the package can tell you, for each line of code, which tests ran that line, which kind of gets at that same issue. Right, so you can take a look at the coverage report, and you can see, oh, that line was actually only run by the integration test, for instance.

Carlton Gibson 33:22
It was that change in coverage 5, though, which prompted my question, which was: could I somehow work this backwards, such that I could say, I'm writing this test, and it's for that function?

Ned Batchelder 33:33
Right. And one of the things that keeps me interested in coverage is those new kinds of ideas. That's a whole other interesting question: what can I say about a test that could tell coverage to focus more accurately on what that test is doing? Some people will say, well, it should only measure the coverage of the immediate functions it calls, but it shouldn't measure the coverage of any function that those functions call. But that doesn't feel right; that's too crude a measure. And I don't know if anyone's going to want to put the work in to say, well, for this test, only measure the functions in that module, and then for that test... and you've got 1,000 tests; you're not going to go and decorate 1,000 of them. So what's the right balance there between the effort from the developer to say what they know and what they want about the coverage, and how do you say that in a way that coverage can make use of? That's a very interesting thing to me. And that might be the next big thing, right? So coverage 5 had a lot of big changes: we switched to SQLite, and we put in the context measurements so you could say which lines were measured by which tests. Coverage 6 had a lot to do with fixing the tracing with Python 3.10. What's next? I don't know what the big feature for coverage 7 is; maybe it's something like that. But I want to make sure that it's something that people will actually get benefit out of. You know, we can invent esoteric, strange features, and if no one uses them, then what was the point? So, yeah, it's good to be guided by the questions people ask in the issue tracker about doing these sorts of things.
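The context measurement Ned mentions is switched on through coverage.py's configuration. A minimal sketch of a `.coveragerc`, assuming a project that uses that default file name and wants one context per test function:

```ini
[run]
# Record which test function was running when each line executed.
dynamic_context = test_function

[html]
# Show the recorded contexts next to each line in the HTML report.
show_contexts = True
```

With these settings the HTML report can list, per line, the tests that executed it.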

Carlton Gibson 35:09
Yeah. I mean, so the flip-side kind of question is: say I've got these nice unit tests, which are targeted specifically just at their one function. And then I realize I need to refactor, and what's actually there now is an integration test around the sort of outside, and I kind of need to napalm all those nice unit tests. I mean, what's your thought when you face a challenge like that? How do you address that?

Ned Batchelder 35:34
I mean, it's a lot of work. Unit tests, by their design, are tied to what the units do. So moving the units around is going to require moving the unit tests in some way: either getting rid of half of them, or writing 50% more of them, or changing what they all expect. And it's a lot of work. And, you know, some people take the attitude, "I'm not going to bother with unit tests; what matters is whether the application works as a whole, and I'm just going to write a bunch of big integration tests. What are the chances that it's a big problem? I'll get through those." Maybe it's a good trade-off of benefit versus effort; it's hard to know. And, by the way, I should say, though, for contexts: coverage can tell you which tests ran which lines of code, but you can also do things like run your integration tests and say, mark all of that coverage as "integration," and run the unit tests separately and mark all of that coverage as "unit." And then every line of code will say either integration or unit or both, right? So you can decide how coarse you want it to be. The problem with marking every test on every line is that if you've got 1,000 tests, you're going to have lines that are covered by 200 tests, and there's no point getting a list of the names of 200 tests for a line; that's too much information. I went and looked at coverage's own coverage HTML reports, and a single file can have like a two-megabyte HTML result, because every line's got dozens of test names annotated onto it, and it's just not worth it.

Carlton Gibson 37:12
So, I mean, a way of aggregating that information?

Ned Batchelder 37:17
Right, to clump it up at a bigger level. That's not a level that coverage can intuit by itself, but it gives you the controls where you can indicate those things when you run the tests, so that you get the information you want.
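The idea of clumping per-test information up to a coarser level can be sketched in a few lines. This is an illustration of the aggregation only, not coverage.py's API: the data, the `coarse_contexts` helper, and the directory naming convention are all hypothetical.

```python
def coarse_contexts(tests_by_line, label_for):
    """Collapse per-line test names into a small set of coarse labels."""
    return {
        line: sorted({label_for(test) for test in tests})
        for line, tests in tests_by_line.items()
    }


def label_for(test_name):
    # Hypothetical convention: integration tests live under tests/integration/.
    if test_name.startswith("tests/integration/"):
        return "integration"
    return "unit"


# Hypothetical per-line data: line number -> names of the tests that ran it.
tests_by_line = {
    10: ["tests/unit/test_cart.py::test_add", "tests/unit/test_cart.py::test_remove"],
    11: ["tests/integration/test_checkout.py::test_purchase"],
    12: [
        "tests/unit/test_cart.py::test_add",
        "tests/integration/test_checkout.py::test_purchase",
    ],
}

summary = coarse_contexts(tests_by_line, label_for)
print(summary)
```

Each line ends up labeled unit, integration, or both, instead of carrying a long list of individual test names.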

Carlton Gibson 37:29
So just a thought that came up while you were talking about people arguing for integration tests: there's a line I saw, "a few tests, mostly integration," something like that. I can't remember it exactly, but it was that kind of idea: don't write too many tests, keep them mostly integration.

Ned Batchelder 37:43
I think that's riffing on the food one, the Pollan one. Yeah, right: eat less, mostly plants.

Carlton Gibson 37:49
We'll swing over to edX in a minute. But maintaining a big codebase like that, thinking about how you maintain your tests, and manage testing, and manage the evolution of the code: I think that's a really interesting thing, and I wanted to just pull out some thoughts on that. I mean, unless you want to say more, though; please.

Ned Batchelder 38:07
Well, we just switched over to edX. So edX, Open edX, the codebase is a very large project. I mean, the edX organization on GitHub has 300 repos, probably at least a million and a half lines of code. The main repo is called edx-platform, and it's a giant Django project. I think it's probably got about 400,000 lines of code right now, and a lot of tests. The tricky thing about me talking about the technology is that I've been mostly working on community issues at edX for a long time, so my hands-on bits of Django are pretty few and far between. But, you know, edX does struggle with the bulk of tests. We have pretty good coverage, so we've got really a pretty extensive test suite, and we're constantly rejiggering the sharding of tests across GitHub Actions suites so that we can keep the total wall time of running the tests to a reasonable limit. I think right now it's about half an hour to run them all. And there have been cycles of, "We have too many tests; these aren't telling us enough; let's just get rid of these." To be perfectly honest, I felt a visceral shock at the idea of just deleting tests. But the fact is that they weren't telling you anything, and they did take a long time, and these in particular were the sort of front-end tests that tend to be kind of flaky, where not only are they not giving you any value, but they're taking up your time with false alarms. So that was the right thing to do, to delete those tests. And it's hard to strike the right balance, because developers, maybe like many people, but especially developers, tend to be very susceptible to gamification. You know, "Oh, there's a metric there, there's a needle that moves that way; I want to move it all the way that way." And, you know, I'm one of the main purveyors of one of those needles, right? Coverage measurement: what's your percentage? You're not at 100% yet; you know, you've got to get to 100%. That's a lot of work that may not provide much benefit. So trying to strike a balance, rather than just being the best or the last or the first or whatever, is really hard.

Will Vincent 40:35
I don't know, hopefully your team is familiar with this, but Adam Johnson, who's on the Django security team, among other things, has a whole book called Speed Up Your Django Tests. They should look at it if they haven't; it's very well suited to a large organization. I mean, Carlton and I have both read it. There are just so many things where, if you're on a big codebase, it's like, oh yeah, that helps, that helps, that helps.

Carlton Gibson 41:01
He's got a chapter on profiling; it's worth it just for the chapter on profiling. It's just amazing. Take a look at that.

Ned Batchelder 41:07
Sure.

Will Vincent 41:08
Yeah, we'll put a link in the show notes.

Ned Batchelder 41:11
One of our challenges at edX is that we use something like 150 different third-party packages to build upon, and we try to stay on supported Django releases. We're in the middle right now of moving to Django 3.2 from 2.2. But it's hard to move. I mean, aside from the question of what we have to do to our code to make it work on Django 3.2, we have to go and look at 150 third-party packages and see: are they running on 3.2? And many of them are not, because these packages, you know, they get stale. And then we have to decide: can we get rid of that package? Can we live with it the way it is? Do we have to fork that package? And actually one of our engineering leads here at edX, his name is Jeremy Bowman, he's doing a whole talk at the upcoming DjangoCon about how we manage that, and how we're trying to actually be proactive in the community to push out what we've learned about other people's packages and get them onto Django 3.2, so that we can get onto Django 3.2. So he's doing some interesting work, pushing the envelope forward on how not just a large, million-line project can move to Django 3.2, but how an entire community of third-party packages can move forward onto Django 3.2.

Carlton Gibson 42:38
Yeah, I mean, it's difficult, because you're maintaining, you know... I've just recently bumped appconf, which I know edX depends on, and the only bump was to add the Trove classifiers for 3.2 and fix it. It's still, you know, clean up the last few issues; it's still a session to do, and it needs doing. OK, for appconf, once every couple of years, do a release, it's fine. But if your package needs actual updates, it's all the more.

Ned Batchelder 43:16
Right. And for us, it's hard to look at a third-party package and decide: oh, is this just missing metadata, or is it actually not going to work on 3.2?

Will Vincent 43:25
The question I did want to ask you about with testing: I mean, I think we're all engineers, we all value it. Managers don't always. You've worked in a lot of organizations. What is your advice for an engineer who is trying to sell taking the time for testing to a manager who may or may not be technical, and may not actually see the value?

Ned Batchelder 43:46
Yeah. That's a good question. So, to go back to that job where I started on Django, the ideal job where the boss left for a week: that was the same boss who said that the way we're going to test our code is we're going to push it to production and people are going to tweet at us. "We're doing it live." Yeah, "we're doing it live," and actually that Bill O'Reilly clip of "doing it live" was a popular meme around the office. Absolutely. So one thing that happened there was that they were talking about, you know, "We're going to ship in a month," and I was looking at the software and I thought, this is not going to be ready in a month. But how do I convince them of that? What I did is I said we should do some usability testing, and I got some friends to come in and be the subjects in the usability tests. I ran the usability tests, and I didn't know much about running usability tests, but, you know, I knew enough to sort of pull the wool over those guys' eyes. And I said, "Look, this guy's sitting down to use the app. You said we're going to ship in a month. Are we ready?" And they're like, "OK, I can see it, we're not ready." So, if the manager is not convinced that testing is worthwhile, keep an eye on the project's outages and try to convince them to do root cause analysis, and have the analysis come up with what is probably the answer, which is: if we'd known X and Y and Z earlier, we could have prevented this. The edX engineering culture is great that way: we do RCAs all the time, and it's a blameless culture. And I don't know if I've ever heard someone say, you know, "Let's just build the feature and get it out there, and then we can write the tests later; we just have to get the feature done in time," those sorts of ill-advised trade-offs that you can sometimes hear pointy-haired bosses making, right.

Will Vincent 45:42
And that just rots the morale of the engineers too. I mean, I came into technology through the business end, and I sometimes talk to MBAs who go and become product managers, and I try to tell them the story of when you go and manage a team of engineers and you're not an engineer. It's very easy: there's a new product, your boss says, "Get it done in two weeks," your engineers say it'll take four weeks, and you crack the whip and find a way to motivate them to get it done in two weeks. And you think, wow, I've just got to crack the whip on these lazy engineers. And the engineers think, well, this person doesn't know anything, doesn't value testing; the code smells, the tech debt accumulates. And so both sides lose, even though the manager thinks they win. Right? And so my advice is generally, because when you're in charge of managing the team, the onus is on you, you want to have an outlet for that. So I would often recommend saying you need to have, like, a bug day every month or every couple of weeks, so that the engineers have to prioritize. Because every bug is important to an engineer, but when you're not technical, how do you figure out which bug really matters? You say, "OK, I'm moving heaven and earth to give you the time, and we'll celebrate it and get a gong or something, but you need to prioritize the bugs." And then you're not just saying no all the time; you're saying, "OK, yes, on this date." That doesn't solve the problem, but it helps morale a lot. And it can also sometimes be effective, versus, you know, asking of every bug that comes up, "Is it important?" Time often will help you with that.

Ned Batchelder 47:15
Right. Yeah, and giving the engineers some measure of autonomy over some part of their effort is a really good thing to do. And, by the way, at this startup that I'm talking about, he was not a pointy-haired boss. I like to think of it as sort of a healthy debate over the costs and benefits of alternative approaches. And we shipped a good product, I think. Like I said, we got acquired by Hewlett-Packard; it's not around anymore.

Will Vincent 47:45
Yeah, well, it's like 100% test coverage, right? I mean, it's a goal. The truth is it's somewhere south of that, but the question is where.

Ned Batchelder 47:52
Exactly, exactly. Yeah, by the way, when we got acquired by Hewlett-Packard, they looked at us and they said, "Oh, this Django Python thing, how quickly can you port it to Java?" Those people went away, and we never heard from them again. But there was way more craziness after the acquisition by Hewlett-Packard than before.

Will Vincent 48:15
Yes, I think that's something where, from the outside, perhaps it seems that large corporations are more stable on the individual level, and most of my experience is the opposite. So we're coming a little bit up on time. Are there any topics that we haven't asked you about, or things you want to mention while you're talking to the Django community?

Ned Batchelder 48:34
Well, so, the Django coverage plugin, yeah, is a thing. And it's interesting. So the cool thing about Django is it's got these templates, and templates aren't meant to have logic in them, but they can have a little bit of logic in them: you've got if statements and loops and things. And there's a plugin called the Django coverage plugin that will tell you which parts, which lines, I guess, of your templates have been used in your tests and which have not. And it's an interesting project in and of itself, because when I first wrote it, I took the strategy of: I don't need to stick to public interfaces; I can use whatever I want that I find inside Django. And that's fine, because it's on me, and I'm taking responsibility for that. And one of the problems with the 6.0 release of coverage is that I heard from at least two different people who said, "6.0 broke my thing." And I looked at it, and like, yeah, your thing was using private stuff that I didn't tell you to use. It's not my fault. I don't want to do anything about that.

Will Vincent 49:38
You're welcome. But I'll just say, you're welcome. It was free.

Ned Batchelder 49:43
On the flip side, the Django coverage plugin does use internal stuff from Django, but like I said, I knew going in that's what I was doing. And when it doesn't work anymore because Django has shifted, I update the Django coverage plugin to do a different thing. And I have a testing strategy where every week, on Sunday, it runs with the latest tip of Django, and if it breaks, I will hear about it and I can update it quickly before it becomes a problem. Ironically, that test run this week only told me about how coverage 6 broke it, and not how Django had broken it. But, you know, that's what tests are for: to tell you this stuff broke, the universe changed, and your stuff doesn't work anymore.

Carlton Gibson 50:21
Well, the template engine doesn't change very often. But there's a bit of work going on at the moment to optimize, you know, various bits, make it a bit more performant. And so maybe you can...

Ned Batchelder 50:30
I understand that. You might be surprised to hear that inside coverage.py is its own template engine, partly because it was just fun to write, and partly because I didn't want to have any third-party dependencies in coverage.py, but I wanted to make nice HTML pages.

Will Vincent 50:47
Yeah, great. Well, we'll definitely link to that. Anything, anything else?

Carlton Gibson 50:52
Well, I don't know if we're going to talk about it much, but I wanted to talk to you about cog, which is one of your other little projects. You claimed at the beginning that your other projects aren't as interesting, but cog is an amazing little code generator.

Ned Batchelder 51:03
So yeah, quickly: cog started when I was working at a startup that was doing mostly C++ code, and we had the need to have a SQL schema and Python and C++ code that matched. I wanted a way to generate the two from something, and the something was going to be Python. I tried using Cheetah, if you remember that templating engine, but that was for text and not for code, so it was difficult. So instead I came up with this thing called cog, which is basically a way to have a text file in which you can embed bits of Python; it would run through the text file and execute the Python, and whatever the Python generated would replace the Python in the output. And it worked great for that, for making C++ and SQL. And I actually use it to make my Python talks. I author my Python talks in a giant HTML file, using a JavaScript-based slide package, and there are bits of Python in the presentation that generate parts of the slides. So when I want, you know, a diagram or a table that's easier to generate with code than by hand, I put in the Python code and it generates that stuff. So it lives on in kind of a much different environment. And it's one of those packages that I hardly ever do anything with, but about once a year I hear from someone who's using it for something or wants to make a change. So it's got kind of this tiny but dedicated following. So yeah, I guess I'd forgotten about cog; that may be my second most successful side project.
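The mechanism Ned describes, a text file with embedded Python whose output becomes part of the file, can be sketched as a toy. This is not cog's implementation and omits nearly all of its features; it borrows cog-style `[[[cog`, `]]]` and `[[[end]]]` markers but uses plain `print` for output, and the `regenerate` helper is a hypothetical name. As a choice made for this sketch, the snippet stays in the file and its output is inserted after it, so the file can be regenerated.

```python
import contextlib
import io
import re

# Toy illustration of the idea only; the real cog tool is far more capable.
PATTERN = re.compile(
    r"(\[\[\[cog\n)(.*?)(\]\]\]\n)(.*?)(\[\[\[end\]\]\]\n)",
    re.DOTALL,
)


def regenerate(text):
    """Run each embedded Python snippet and place its printed output after it."""

    def run(match):
        opener, code, closer, _old_output, end = match.groups()
        buffer = io.StringIO()
        with contextlib.redirect_stdout(buffer):
            exec(code, {})
        return opener + code + closer + buffer.getvalue() + end

    return PATTERN.sub(run, text)


source = (
    "[[[cog\n"
    "for name in ['id', 'email']:\n"
    "    print(f'{name} TEXT,')\n"
    "]]]\n"
    "stale line\n"
    "[[[end]]]\n"
)
result = regenerate(source)
print(result)
```

Running `regenerate` again on its own output leaves the text unchanged, which is what makes it practical to keep generator and generated text side by side in one file.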

Carlton Gibson 52:41
It's just super. I don't know, I've used it for generating, like, JavaScript clients from, you know...

Ned Batchelder 52:48
OK, there you go. That's similar to its original purpose in life. Yeah.

Carlton Gibson 52:52
And it's like, you know, whenever I find myself writing really boilerplate code, where I'm just repeating myself here, yeah, exactly, you know, reach for cog. It's just an awesome little tool, so I wanted to thank you. Thank you for that.

Will Vincent 53:06
Sure. I think that's like levels. Like they say, you know, developers want to do a blog, so you end up building your own static site generator. And this is like you doing that for talks. So I respect that.

Ned Batchelder 53:17
Exactly. I'm constantly on side-of-side-of-side projects to make nicer-looking diagrams in my blog posts on my self-written, Django-hosted side project site, et cetera.

Will Vincent 53:30
Yeah, no, that's great. You know, because from that comes coverage.py and who knows what else, right? If you lose that sense of play...

Ned Batchelder 53:39
That's right. Yes, it's very helpful to have a side project where either you can do it exactly right, because you can't in your day job, or, because your day job makes you do it exactly right, in your side project you can do it wrong. Just having that outlet.

Will Vincent 53:58
No testing, no branches, everything is on master, like, let's go. Exactly, it's great. It's like my personal site.

Carlton Gibson 54:08
Because you let slip about your personal site, but you're rewriting that. What was the old... you're rewriting that as a straight Django site now, right?

Ned Batchelder 54:17
Well, so it's more complicated than that. My personal site started in 2002 as a bunch of Python code that generated HTML that I would FTP up to...

Will Vincent 54:32
So you didn't have a generator? Was it Flask? A Flask freeze, I think, right? There's a Flask freeze way you could do it.

Ned Batchelder 54:37
It wasn't... well, originally, yes. It was way before that. And so yeah, it was all XSLT. Yeah, OK, but don't laugh, because it's still all XSLT.

Carlton Gibson 54:50
It's still the same? It's still XSLT?

Ned Batchelder 54:51
So it started as just, like, let's use XSLT and generate a pile of HTML and FTP it up there. And then at some point I switched it to being a Django site that I would use a static site generator with locally to generate a pile of HTML that I would FTP up to the site. And to do comments, it had PHP code in there too, and there was some interesting Django middleware that would execute PHP along the way for local testing, or something like that. Yeah, it got really wacky. And then this summer, my hosting provider said, "We're kicking you off." They were getting bought, and they said they could transfer my site, and then they said they couldn't transfer my site, so I had to find a new host. So I thought, fine, I'll just do a real Django site. I made one little stop on "maybe I can just move the site as it is to a new host," but the old site was PHP 5 and the new host was PHP 7, and I definitely didn't want to invest any time in understanding how to upgrade PHP. So I bit the bullet, and I made it a real Django site that's really hosted on DreamHost. I had to reimplement the whole comment system, which was cool, and now I can do better things with the comments, et cetera. But it's still very wacky. I import a lot of XML files into a SQLite database. The SQLite database is synced up to the server, where the Django site will use XSLT to convert the XML into HTML and serve it. So, all sorts of wrong decisions. But it works.

Carlton Gibson 56:37
But yeah, it's like 10 years of, you know, just bolting on a new bit. Yeah, yeah.

Ned Batchelder 56:41
20 years. 20 years in spring. But now I can do cool things: like if I have images, I can auto-convert them to WebP and serve them as WebP from then on. So the first hit does the convert, and then everyone else gets a better image, and I can just do JPEGs locally. So, you know, cool stuff.
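The "first hit does the convert" pattern can be sketched as a convert-once cache. This is an assumption-laden illustration, not the code behind Ned's site: `converted_path` is a hypothetical helper, and `fake_convert` is a stand-in for a real image encoder so that the example needs only the standard library.

```python
import tempfile
from pathlib import Path


def converted_path(source, cache_dir, convert, suffix=".webp"):
    """Return the cached converted file, creating it on the first request."""
    target = Path(cache_dir) / (Path(source).stem + suffix)
    if not target.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(convert(Path(source).read_bytes()))
    return target


calls = []


def fake_convert(data):
    # Stand-in for a real image encoder; it only tags the bytes.
    calls.append(len(data))
    return b"WEBP:" + data


with tempfile.TemporaryDirectory() as tmp:
    photo = Path(tmp) / "photo.jpg"
    photo.write_bytes(b"jpeg-bytes")
    first = converted_path(photo, Path(tmp) / "cache", fake_convert)
    second = converted_path(photo, Path(tmp) / "cache", fake_convert)
    first_hit_bytes = first.read_bytes()

print(len(calls))
```

Only the first request pays for the conversion; the second finds the file already in the cache directory.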

Will Vincent 57:01
Love it. Maybe the last question to take us out. So edX, I think this is public, is a nonprofit that's just been acquired by a for-profit company. I'm just curious what you can say about that, and how do you envision that changing your role, if at all?

Ned Batchelder 57:17
Yeah, so this is a huge topic. But I mean, honestly, I pretty much mostly know what's public, not much more than that. So my role at edX has been sort of the face of Open edX within edX, and then outside of edX, my role has been sort of the face of Open edX from edX. Sometimes some people think that I'm, like, the big brains behind Open edX or something. I have called myself the sidelines mascot for Open edX: I'm the guy in the suit, dancing around as hard as I can to get everyone excited. So when the announcement was made that 2U, an ed-tech company, was going to acquire edX, I think that was at the end of June, people were getting in touch with me like, you know, "Did you organize this?" And I was like, I found out about it when you found out about it. You know, when the news went public, that's when I knew.

Will Vincent 58:18
Yeah, sometimes you're not allowed to know when you're on the inside, because, yeah.

Unknown Speaker 58:21
Exactly. So yeah, 2U is acquiring edX. But edX is a nonprofit, and a for-profit company cannot acquire a nonprofit company. Or at least, I don't know the legality, but what's actually happening is that there is going to be a new nonprofit formed, which will get all of the proceeds from the sale, because you can't create a nonprofit to further education, then sell it to a for-profit company, and then keep the money. Like, I don't get any of the proceeds of the acquisition, because it was a nonprofit, right? The money that went into the nonprofit has to continue the goal of the nonprofit. So there's gonna be a new nonprofit that continues that goal of furthering education. And most of edX is going to go to 2U. There will be people who work for the new nonprofit; we don't know who those people are. We don't know who's going to run the nonprofit. We don't even know exactly when this is all going to happen. It's, you know, under legal review by, I guess, the Attorney General of Massachusetts. But it's going to be an interesting time.
So one of the challenges of edX and Open edX, and this is a classic problem with any Django project, is that you set out to build a platform, but mostly what you're building is an application. Right? So, like, with a Django app: I'm gonna have a blog app in my project. And your blog app is never really just a blog app, right? I mean, it's not general enough to be used by other people. It's always got connections to the rest of the project, and you hard-coded the name of it in there because it was just too much trouble to make it a setting the right way. And you don't know what the hell you're talking about. Yeah, yeah. Okay. So, you know, Open edX is no different, right? Way back in 2012, edX was building a site to serve education, and it was a Django site. And the goal was always to open source it so other people could use it. But, you know, mostly it was just edX that we offered to other people to use.
And over time, we've gotten better about making it more of a platform and more generalizable. But all along the way, we were deploying from master, and we deploy multiple times a day from master today, to edx.org. And that's, you know, the business that pays my salary, so it's very important that we keep it running. Meanwhile, we also want to get contributions from people into that open source repo. But how do we manage that risk, right? They're not running a site that has 30 million, 40 million, I don't know the exact count now, 50 million learners. So they're not dealing with the kinds of scale we're dealing with, and they don't know what our roadmap is. So up until now, the contribution process has been very tightly controlled by edX. Every change has to be reviewed by edX, roughly. We're opening it up a bit now.
But now that there's going to be a separate nonprofit that's going to own Open edX, separate from the company that's running the edx.org website, what's the new contribution model going to be? What's the flow going to be? How do we keep code moving at high velocity and keep the business stable under two separate legal entities, one of which is going to want, for the most part, to increase contribution, and one of which is going to want, for the most part, to keep the business stable? So that's a whole new open source dynamic that we're going to have to navigate. I mean, it's a very interesting time.
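
The hard-coded-name problem described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Open edX code: `BLOG_TITLE` is a hypothetical setting name, and a `SimpleNamespace` stands in for `django.conf.settings` so the snippet runs without Django installed.

```python
from types import SimpleNamespace

# Stand-in for django.conf.settings (assumption: a real project would
# import the settings object from django.conf instead).
project_settings = SimpleNamespace(BLOG_TITLE="Course Team Blog")

# Application-style code: the name is baked into the app, so nobody
# else can reuse the app without editing its source.
def blog_title_hard_coded():
    return "Course Team Blog"

# Platform-style code: the app reads a (hypothetical) BLOG_TITLE setting
# and falls back to a neutral default when a project does not define it.
def blog_title_from_settings(settings, default="Blog"):
    return getattr(settings, "BLOG_TITLE", default)

if __name__ == "__main__":
    print(blog_title_from_settings(project_settings))
    print(blog_title_from_settings(SimpleNamespace()))
```

The second function is what makes the app usable by another project: the reusable code carries no project-specific name, only a documented setting and a default.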

Carlton Gibson 1:01:54
It was brand new colleagues came on, you know, a year or so ago and talk to the overnights. I went, checked it out and downloaded it and tried to get up and running. But it wasn't easy to get up and running new. It's now contributed as a new light. You know, I know my way around Django, but it was weighing on, this is tricky.

Ned Batchelder 1:02:12
Yeah, it's big. It's big. And, you know, like I said, for the most part, edX engineers are focused on how can we make edx.org a little bit better today, not how can we make it easier for someone else to run another site that also does education. Like, no one's opposed to that goal; it's just probably the fourth or fifth goal on their lists, and they're not being measured on that goal. So it's very easy for it to lag.

Will Vincent 1:02:35
I mean, I've heard that something of a similar dynamic is often even the case with, say, React, which is used, you know, within Facebook, but it's the engineers there who want to do that. And Facebook itself sort of does it to humor them, but doesn't really care. And so, from the outside, it's like, well, Google or Facebook or Microsoft supports this open source package. And really, it's probably a handful of people on the inside fighting. You know, they're getting paid by those companies. But, you know, their boss isn't saying, nice job on the React release. You know, it's like, okay, when you're done with everything else, you can maybe do that if it helps us hire people.

Ned Batchelder 1:03:12
Yeah. And that's one of the things we're constantly... I mean, for eight years now, we've been trying to find good projects that we can compare ourselves to. In other words, you know, the problems we have: are there other projects that have those problems? How are they solving them? Can we use those solutions too? And there are always differences between the projects that make it not quite a direct map. And so it's hard to find analogs. But actually, the example of React coming out of Facebook was a recent one that we've discussed. I don't know how we'd actually get to the bottom of that. Like, how do we get into the cubicles at Facebook to see what their actual, you know, goals and measurements are, and how they support it? Maybe we should just go ask. Oh, no. No, no.
But it is fascinating. I mean, one of the tricks of open source is that there's no one way to do it. And there's sort of the classic way: you know, just do it like Red Hat does it. Well, Red Hat doesn't deploy to master multiple times a day. Well, just do it like WordPress does it. Well, that's a little different. So, you know, it's interesting to work in open source and have to sort of rethink it from first principles to make it work for everyone. Right?
edX loves open source as a way of expanding our engineering capacity. Right? Instead of having, I don't know what it is, 100, 150 engineers at edX, we could have 800 engineers among all of the people using Open edX. But we only get those 800 engineers if we can coordinate their contributions and open up the channels so that the contributions can flow. And that means we have to tell those 800 engineers how to get started, like Carlton said. We have to tell them: well, what are we interested in? Where are we headed? What's the roadmap? And, you know, edX works like most companies: there are the edX employees, and we all talk to each other.
But how do we talk beyond ourselves so that we can get those 800 people? That's the big challenge. And we're entering a whole new phase of that with this split of the acquisition and the new nonprofit.

Carlton Gibson 1:05:18
I mean, it's a really good example. You know, someone was asking you on Twitter just today about, you know, what are examples of big Django sites? Well, the Open edX platform is a really good repo; there's a lot going on there. As you know,

Ned Batchelder 1:05:30
The problem is, people want those examples because they want to know how to do it well. And it's easy to find large projects, and it's easy to find good projects. Finding good large projects is really hard, because the large projects have been around for a long time. First, they've got the archaeological layers of how best to do it, right? So way down at the bottom, you've got the functional views, and then maybe we've got some class-based views, and the settings files, like we said, have been through their evolution. But they've also just accumulated the tech debt and the cruft from having, you know, 100 people work on it for eight years. And it's just hard to keep everything with one voice and best practices spread across a million lines of code. And

Will Vincent 1:06:12
But isn't that what being an expert in coding feels like? You still feel the same; it's just that people ask you, and you realize there's no solution, so you just have to pick one as an expert. That's how I feel when people ask me something. Like, I don't know. But then I ask around and, you know, Carlton doesn't know, Jeff doesn't know, Adam doesn't know. It's like, it's unknowable, or there is no best practice, because I just checked.

Ned Batchelder 1:06:35
Yeah, exactly. Well, and we've got a joke among architects at edX that it's always the same answer. The answer is, it depends.

Will Vincent 1:06:43
That's the tagline for our podcast. We haven't said it yet. We try really hard not to slip it in.

Ned Batchelder 1:06:50
Right. But even, like, beginners come in saying, well, what's the best way for me to install Python and get it ready for my project? Well, it depends. Are you doing scientific work? Conda. Are you comfortable with the command line? Well, you might like virtualenvwrapper, but otherwise, why not just go into PyCharm and pick New Project? You know, it depends. Things are complicated.

Will Vincent 1:07:11
Right. And that's the kind of thing where a newcomer, they just want it to work. Right, and

Ned Batchelder 1:07:15
And that's not just because they're a newcomer; it's because it's the part of the thing they're not interested in, right? I'm not a newcomer, and I would love to not have to think about virtualenvwrapper and just have it be solved for me. It seems like about once every six months, I have to revisit how I get Python installed on my computer and how I get virtualenvs in those versions of Python, and

Will Vincent 1:07:36
Yeah, as we wrap up, I think Python has gotten better in that, you know: A, just use Docker; B, if you just need one version, you can just go grab the official installer, and that works fine. And then if you need multiple versions, like a regular person, you can use pyenv. And, you know, I guess Poetry if you get fancy. But in that third case, hopefully it's not your first time installing Python if you need to have multiple versions to work on stuff.

Ned Batchelder 1:08:01
Yeah, but you just described the better way. And I think you had three conditionals along that path.

Will Vincent 1:08:07
Well, it depends. Yeah, it depends on what they need. I mean, so I've been doing this for my Django books, because I have a section on, you know, how to get to Django. It's like, well, I've got to slog through Python. And I have switched away from Homebrew on Mac, and just the python.org installer, actually, I think works quite well. And, you know, yeah, there are all these things it doesn't do. Like, how do you switch Python versions? It's like, well, there's a whole universe of thought on that. But yeah, we're doing clean greenfield stuff, and so it's, you know, Django 3.2 and Python 3, 3.10 soon, and we don't have to worry about... well, yep. Anyways, thank you so much for taking the time to come on. I know we've gone a little over, but I appreciate it. We had a lot of questions around testing and edX. And yeah, you were one of the very first people we wanted to have on this podcast, so I appreciate you taking the time to do

Ned Batchelder 1:08:57
So now the whole podcast is done. You can just... this is the series finale.

Will Vincent 1:09:02
Yeah, it's like covering six you know, we'll take 18 months off and

Ned Batchelder 1:09:07
This is a lot of fun. I always enjoy doing these, and it's great to have a chance to have a longer discussion. I know we tweet at each other and maybe even see each other in IRC or Discord sometimes, but having an actual discussion with paragraphs and replies and thoughtfulness is great.

Will Vincent 1:09:24
And I mean, I sort of would comment that the goal is to lose people. Like, people are like, oh, coding, you have a podcast, I'll go listen. And they get, you know, PhDs in something else, they get five minutes in, and they're like, you lost me. I was like, good, because I'm not trying to appease you.

Ned Batchelder 1:09:38
You're not the audience.

Will Vincent 1:09:39
Yeah, that's right. If you are interested in the podcast: djangochat.com, and @ChatDjango on Twitter. And we'll see you all next time. Bye bye.

Carlton Gibson 1:09:47
Join us next time. Bye bye.