Django Chat

Ethical Ads - David Fischer

Episode Summary

David is an engineer at Read the Docs, where he focuses on Ethical Ads, a developer- and privacy-focused ad network. We talk about internet advertising and why much of it is not ethical, the open source code powering the ad server, and what it takes to serve tens of millions of monthly API requests.

Episode Notes

Support the Show

Our podcast does not have a sponsor and is a labor of love. To support the show, please consider purchasing one of the books on or suggest one to a friend.

Episode Transcription

Will Vincent 0:06
Hello, and welcome to another episode of Django Chat, a weekly podcast on the Django web framework. I'm Will Vincent, joined by Carlton Gibson. Hi, Carlton.

Carlton Gibson 0:13
Hello, Will. Long time no see.

Will Vincent 0:15
Yeah. And this week we are joined by David Fischer from Read the Docs. Hello, David.

Carlton Gibson 0:20
Hello, how are you? Thanks for coming on, David.

Will Vincent 0:23
Yeah, thank you for coming on. There's a whole bunch of things we want to ask you about. But maybe we'll just start with your origin story. How'd you get into programming, Python, Django? And then we'll go from there.

David Fischer 0:34
Oh, all right. Um, so I have a pretty standard origin story. I studied math in college and loved programming before that, and wanted to get into it. I had a programming-heavy undergraduate degree and went right into a job at a pretty big tech company right out of college. That was a while ago. But then I decided I was done working for these big companies. I'm not super negative about them or anything like that, but I wanted to get my hands a little dirtier and be a little bit closer to both the business side and the code side. And so I started working for more startup-sized companies, and I joined Read the Docs a couple of years back, not quite three years ago. It was a pretty natural fit for me, as they were building out their advertising platform.

Will Vincent 1:24
Got it. And when did Python fit into your programming journey?

David Fischer 1:29
Oh, man, I took, like, one class that touched on Python in college. And it actually helped me get my first job, which is sort of random. I did Python in one class, and then this job application was like, "people who know Python," and I was like, "That's me." You know, as much as an undergraduate who doesn't know Python knows Python. But it helped me get my first job. I didn't do Python at that job for years, and then maybe four or five years into that job, they were like, "Do some Python." So that was how it ended up working. I was already working on some web stuff, and I ended up picking up Django. I looked at a couple of the alternatives at the time, and Django, this was in the 0.96 days, ended up being a great fit. So that was how I got into Django. It was a long time ago.

Will Vincent 2:16
Carlton, what version was it for you?

Carlton Gibson 2:20
I don't know, about 1.0, or something around the 1.0 times, because I remember going to a conference and it was like Django, Django, Django, so it must have been around 1.0. Around that time.

Will Vincent 2:32
So maybe for listeners, what's the quick story on Read the Docs, if they're not familiar? I'm sure most of you listeners have seen Read the Docs, but maybe you don't know the story of Read the Docs.

David Fischer 2:45
Yeah, it's sort of an interesting story. And I'm not the main protagonist in this story at all. Eric Holscher and Anthony Johnson are the main folks in this story. I came much later, although I was an early user of Read the Docs. I think I started using Read the Docs in, like, 2011. I looked back in my password manager at the created date. That's how you figure that out. Um, but I think it started at Django Dash in 2010, which was a one-weekend sprint project. And it was basically the automation around Sphinx. That's exactly what it is. So how do you do continuous documentation building off of every commit to master, something like that. And so Eric Holscher, I think, at the time was probably one of the main people behind it. There were a couple of other people at that time who ended up not transitioning, not launching it as a company. And it just took off in the Django community and in the wider Python community. People started using it. And Eric had a day job, you know, working a regular job like a regular person, and he would get these pages or calls being like, "Hey, Read the Docs is down." And he's like, "Well, you know, I've got my regular job, guys." So eventually it was so big, and so many people relied on it, that I think Eric and Anthony decided to launch a company around it. And basically it's funded through a combination of things. There's advertising on open source documentation. They sell ad-free commercial documentation hosting that has a few additional features, so companies can buy it.
And then there's also what's called Read the Docs Gold, which is basically where regular people can pay a little bit of money and get some extra perks on Read the Docs, like going ad-free on Read the Docs, and you can sometimes designate a project ad-free. There are a few different things you can do. So that's where it all comes from. And, you know, I looked at this, and Eric had produced this guide on how their advertising was different from other people's: advertising without tracking people. So pretty much the opposite of what most advertising does. And I just emailed Eric out of the blue. I'd met him before at a conference, probably seven years before. I'm sure he had no idea who I was, and I barely remembered who he was at that time. But I emailed him to ask a few questions about that. And he was basically like, "You should work on our advertising." I was like, "Yes, I should." So that was how I got the job at Read the Docs. I did a few projects as a consultant for them, and then transitioned over to full time.

Carlton Gibson 5:39
Before we go off into the ads story, where there's much more to talk about, can I just wind back a little bit to the early days? Was it, like, Eric just hosting this on his own server? Because I think it grew quite big quite quickly, right? So what was the cost, if you've just got your blog host?

David Fischer 5:58
Great question. This is where a number of the problems came from, you know? When I stepped in here, I was like, "Man, this is hosted insanely cheaply." You know, they're doing,

Will Vincent 6:11
you know,

David Fischer 6:13
50x, or even more, maybe 100x the traffic of some of the other properties I've worked on before. And they have a monthly infrastructure budget that's about the same as other places I've worked that did, you know, 2% of the traffic that Read the Docs does. So it was built to run insanely cheaply. Now, it has some advantages. Read the Docs is almost entirely hosting static content that is already built, just straight HTML, CSS, and JavaScript that's built through a builder. So yes, you have to run builders, and that's kind of expensive. But the hosting and the serving is relatively inexpensive. So there are some advantages there. But yeah, Read the Docs, you know, our stats are pretty public, we're upfront about them. It did something like 45 million pageviews in August.

David Fischer 7:05
In the last month? Sheesh, that's a lot.

David Fischer 7:08
It does a lot of traffic. And that's just the open source side of the hosting. That's not the commercial hosting. And we have some privacy protection in there too. We don't send anything to analytics if somebody has Do Not Track enabled in their browser, things like that. So it doesn't count any of that. If you run an ad blocker, you don't get counted. This is 45 million pageviews, discounting all of that. So it does an absolutely massive amount of traffic, and it was quite likely to go down all the time, because it was hosted on a shoestring budget. Eventually, Eric moved it over to AWS and had load balancers, you know, things that you would do if you wanted something to stay up.
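
As a rough illustration of the Do Not Track check David describes, here is a minimal Python sketch. It is not Read the Docs' actual code: the function name `should_send_analytics` and the plain-dictionary view of request headers are assumptions made for this example. The one detail taken from browser behavior is that Do Not Track arrives as a `DNT: 1` request header.

```python
def should_send_analytics(headers: dict) -> bool:
    """Return False when the request carries a Do Not Track header set to "1".

    `headers` is a hypothetical mapping of HTTP header names to values.
    """
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    return normalized.get("dnt") != "1"


# A visitor who enabled Do Not Track is skipped; everyone else is counted.
print(should_send_analytics({"DNT": "1"}))                 # False
print(should_send_analytics({"User-Agent": "Mozilla/5.0"}))  # True
```

A check like this only covers analytics the site sends itself. Visitors running an ad blocker never execute the analytics script in the first place, which matches David's point that they are not counted either.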

Will Vincent 7:47
Yeah, crazy stuff. Yeah.

David Fischer 7:49
And this worked pretty well on what was still, at the time, a relatively small budget. But they launched some crowdfunding campaigns, and some of these brought in real amounts of money. They did a big crowdfunding push, and they got, like, $30,000, which sounds like a lot until you realize that's a few months of infrastructure budget. But it was all one-time donations. So the month after they got $30,000, it was like, "Oh, we brought in $1,000," which is going to be underwater on the infrastructure budget. So they realized, what do we do here? And the answer was mostly advertising.

Will Vincent 8:31
I remember that post. Maybe if I find it I'll put it in the links. It was about you talking, basically, about what you said, you know, not joyously jumping into ads, but basically being like, we need to cover the costs. And I remember thinking it was really well written and really laid out the dynamic for a lot of people, which is, yeah, it's hard to charge, and you can't lose money on something that's,

David Fischer 8:51
Yeah, that's a side project, essentially. Or it was at one time. Yeah, you're absolutely right. And there is the reality of the situation that the budget of Read the Docs is a rounding error to somebody like Google. It might even be a rounding error to somebody like GitHub, even pre-Microsoft acquisition. So they could just launch it, and if it loses money, whatever, no big deal. But for us, it's real money. You know, commercial hosting does not bring in as much money as advertising for us right now.

Will Vincent 9:21
Wow. That's interesting.

David Fischer 9:24
Yeah. And, you know, we don't have venture capital backing us. There's no Daddy Moneybags behind us or anything. It's like, "Oh, we bring in a little bit more money. That means we can hire one more person." So it's all bootstrapped. There's no venture capital at all.

Will Vincent 9:45
Right. Well, I think in some ways that forced discipline is the best thing you can have. I mean, it's unpleasant in some ways. Back 10 years ago, I was working at a company called Quizlet, which was a top hundred website by traffic with, what was it, two and a half engineers and not a big budget. And I think in many ways, in my tenure there, which was about three years, the main thing we did is we didn't harm it, because I spent so much time trying to recruit people. We kept it free. We kept it up. And those constraints were a good thing. They were frustrating, right? You always have your long list. But it really does focus priorities. And I think having more can also lead you astray. Sometimes if you have 20 engineers, it's like, well, they've got to be doing something, even though maybe that's not what your business needs.

David Fischer 10:37
Yeah, you're absolutely right. You know, constraints are sometimes how invention happens.

Will Vincent 10:43
For sure. Well, I love that your code is open source. The ad server, you know, the client and the server are both up there. And so, in prepping for this interview, I was actually going back through to the first commit, because I love seeing how people build things. And I'll put the link in for people. But I love how simple the project actually is, even today. It's essentially a single ad server app within Django, because it's quite easy to, I would say, bloat out a Django project, and you've been very constrained. But I wonder if you recall starting out, right, like, what did you start with? And to the extent you can recall, the changes over time, because there's a big difference between prototype, first-stage production, and the scale that you are at now.

David Fischer 11:40
Well, you're probably looking at a commit that's not that old. I think it was 2018, maybe? Yeah. So basically, when I started at Read the Docs, the advertising was a Django app that was closed source, but got built into Read the Docs at compile time. There are these private extensions to Read the Docs that are in a private repo. The main Read the Docs repo is all public, but we have a couple of private extension repos. One is for the things that commercial hosting gives you. But we have some other things that are closed source for a variety of reasons, and advertising was just one of those. It was just in a different repo, it was just one Django app, and it was closed source. And so if you looked at the first commit of the ad server, it's probably mostly taking a bunch of stuff from this private Read the Docs repo and bringing it into here. But yeah, a bunch of things were renamed. It was pretty iterative.

Will Vincent 12:42
Yeah, I saw that. I think the first commit was just, you know, "first commit," nothing. And then the second commit was "import ad server." But then I could see you started with basic auth, and at some point you added django-allauth, you know, all the, to me, standard steps that not every Django developer gets to do, because often you just parachute into existing projects and you don't have that flexibility. You don't do it a lot from scratch. So I always love seeing how a production site is done iteratively, because that's not an experience many people have.

David Fischer 13:23
Yeah, it's actually sort of a crazy experience, because I got dropped into this project, and we were basically like, we're going to break our ad server out from Read the Docs, because previously it was just a Django app in Read the Docs. And it does a very large amount of traffic, something like 30 million ads a month. Most of those are not paid ads. But still, we're talking about 30 million API requests a month. It's kind of expensive. And then there might be additional requests on top of that. So when we broke out the ad server, we had to build something that on day one is going to handle 30 requests a second, sustained, 24 hours a day.
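
For a sense of how the monthly and per-second figures relate, here is a back-of-the-envelope conversion in Python. The 30-day month is an assumption, and the helper names are made up for this sketch. The two figures quoted are both David's; the gap between them is presumably the "additional requests on top of that" plus headroom for peaks, though the episode does not spell that out.

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumes a 30-day month


def average_requests_per_second(requests_per_month: int) -> float:
    """Spread a monthly request count evenly over every second of the month."""
    return requests_per_month / SECONDS_PER_MONTH


def monthly_requests_at(rate_per_second: int) -> int:
    """Total requests served in a month at a constant per-second rate."""
    return rate_per_second * SECONDS_PER_MONTH


print(round(average_requests_per_second(30_000_000), 1))  # 11.6
print(monthly_requests_at(30))                            # 77760000
```

So 30 million requests a month averages out to roughly 12 per second, while a sustained 30 per second corresponds to roughly 78 million a month, which is still within the "tens of millions of monthly API requests" range mentioned in the episode summary.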

Will Vincent 14:05
No pressure, no pressure,

David Fischer 14:07
You know? And yes, the first time we tried to stand it up, it absolutely...

Carlton Gibson 14:11
So okay, what does 30 requests per second sustained look like in terms of infrastructure? What are you running? How many workers are you running? Is it using a threaded model or pre-fork? Are you using Gunicorn, are you using uWSGI? How do you solve that?

David Fischer 14:27
So it is pre-fork. We are using Gunicorn. I think it's either four or six workers. I know I've tweaked this setting, so I don't remember what it is. I could check for you. But it's either four, six, or eight workers per instance, and I think we're currently running six instances. So that's about where it is, and that handles it fairly well. Right now we're hosted on Azure. We started out prototyping on Heroku, but now for production we're on Azure. That's where Read the Docs is, so we decided we'd just be on the same infrastructure. It is set up slightly differently than how Read the Docs is set up. Read the Docs is also on Azure, but it's basically just using base VMs and what are called scale sets, where you can scale the number of identical VMs that you have. So that's how Read the Docs is set up. So Read the Docs actually autoscales.
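
For readers who have not configured Gunicorn before, the setup David describes might look roughly like the configuration file below. This is a sketch with illustrative values, not the ad server's real configuration: the file name, bind address, and timeout are assumptions, and the worker count is simply the low end of the "four, six, or eight" David recalls. `bind`, `workers`, `worker_class`, and `timeout` are standard Gunicorn settings.

```python
# gunicorn.conf.py (hypothetical file; values are illustrative assumptions)

# Address and port the instance listens on; assumed for this sketch.
bind = "0.0.0.0:8000"

# Number of pre-forked worker processes per instance.
# David recalls four, six, or eight; four is used here as a placeholder.
workers = 4

# "sync" is Gunicorn's default worker type: the master process forks
# the workers up front and each one handles one request at a time.
worker_class = "sync"

# Seconds a worker may stay silent before the master restarts it.
timeout = 30
```

With six instances and four workers each, that would be 24 worker processes in total, so a sustained 30 requests per second only works if each request finishes quickly, well under a second on average.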

Carlton Gibson 15:28
That's super interesting. Because a lot of people have no idea what it takes to run a site at bigger scale, right? You build your little thing locally, okay, fine. You put up a worker file, you run it. But how do I plan in advance if I want to grow to that kind of traffic? Well, you need to think, okay, you're going to need half a dozen servers, you're going to need this kind of infrastructure. So it's really nice that you can come on and share that kind of information.

David Fischer 15:58
And it's much easier to build something that will handle that kind of traffic than it was 10 years ago. 10 years ago it would have been much harder.

Carlton Gibson 16:07
Yeah, no, I mean, this is the cloud thing, right? If I need six servers, I just get six servers. It's not

David Fischer 16:16
Yeah. Click a few buttons in the AWS dashboard, or, you know, the Azure dashboard. Just drag the slider to the right. And, you know, my bill also scales linearly.

Carlton Gibson 16:28
Okay, so you're running the ad servers as a kind of a micro, a massive service on the side?

David Fischer 16:34
Yeah. I mean, the big reason why we wanted to push it out of Read the Docs is that we had this vision, which just started to happen a couple of months ago, of taking the ads that we've served for Read the Docs and making it so that we could help other projects, you know, other tech projects, other similar places like Read the Docs. They need to earn some money, so how do we help them run ads? And we didn't want them hitting the Read the Docs API endpoints. So we said, hey, we'll break this ad server out. It'll be its own thing. Yes, it's part of Read the Docs. Yes, Read the Docs is the primary user of the service. But we wanted to break it out so it was separate infrastructure, all that kind of stuff.

Carlton Gibson 17:18
Right. And so this is where it gets really exciting. Because it's ethical ads, right?

David Fischer 17:22
Yeah.

Carlton Gibson 17:24
So I'm putting up a site, and I'm thinking to myself, "Oh, I need to make some money." But I can't bring myself to put the Facebook tracking pixels in and the Google tracking pixels in, because I just can't bear to be part of the massive surveillance capitalism world that we live in now. And here's an alternative. So tell us about the ethical side of it, and why I might choose it.

David Fischer 17:47
Right. So, you know, this is exactly what I emailed Eric about. It was probably about three years ago now, almost exactly. I had some questions about this: how does it work, and how does it work relative to other things? Because I had some familiarity from my last job. I wasn't in the marketing department at my last job, I was the head of development. But the marketing team would come to me all the time when they needed help. I was the liaison with the marketing department, and I helped them anytime they needed any tech stuff. And I can remember helping them set up advertising, and I was basically horrified at everything that they were doing. I have a bit of a background in security, and some of that extends into privacy. Standard procedure in this world is: take your customer list, upload it to Facebook, and tell Facebook, "I would like to advertise to people similar to these people." That is standard procedure in advertising, for anything really, not just SaaS companies. So I was horrified at that. And so I talked to Eric about this: what do we do differently? And there are a few different things that we do. One, as much as possible, we don't set cookies. There are some cookies that are borderline unavoidable, like the Cloudflare cookie. If you want your site to be protected by Cloudflare, they set a cookie, so some things are unavoidable. But as much as possible, we try to do basically none of that. And there are a few other aspects. We try to align ourselves with the site owners, not exactly against the advertisers, but advertisers are constantly pushing you to put in more tracking.
And with Read the Docs, we were the publisher, and we heard very much from our users, Read the Docs' regular visitors, that they don't want tracking, they don't want cookies. Even when I started at Read the Docs, I would probably get an email a week about something privacy or security related: "You're running Google Analytics, we hate that," or whatever, something like this. So we were basically like: no cookies. Try to align with the site owners. Don't load any resources, nothing at all, from the advertisers. So not just scripts. You have to take the images from the advertiser and host them yourselves. Otherwise, they'll cookie your users. So there are all these sorts of things. I mean, there's nothing wrong with cookies by themselves, but you can do a lot of bad things with cookies if you really want to. It's a shady industry. There are good players in this industry and bad players in this industry, but we try to be one of the good guys.
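
The "nothing at all from the advertisers" rule can be pictured as a check that every URL in an ad points at a host the ad network controls. The Python sketch below is only an illustration of that idea, not the ad server's actual code: the host name `ads.example.com`, the dictionary shape of an ad, and both function names are hypothetical.

```python
from urllib.parse import urlparse

# Hosts the ad network serves assets from itself (hypothetical value).
FIRST_PARTY_HOSTS = {"ads.example.com"}


def is_first_party(url: str) -> bool:
    """True when the URL's host is one the ad network controls."""
    host = urlparse(url).hostname  # already lowercased by urlparse
    return host is not None and host in FIRST_PARTY_HOSTS


def third_party_assets(ad: dict) -> list:
    """List the fields of an ad whose URLs would load from someone else's server."""
    return [field for field, url in ad.items() if not is_first_party(url)]


ad = {
    "image": "https://ads.example.com/creatives/banner.png",
    "pixel": "https://tracker.example.net/p.gif",
}
print(third_party_assets(ad))  # ['pixel']
```

Any asset that fails a check like this would have to be downloaded and re-hosted, or dropped, before the ad runs. That is the point of hosting the advertiser's images yourself: a request to the advertiser's server is what gives them the chance to set a cookie.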

Will Vincent 20:36
I mean, with Read the Docs, you're dealing with people who understand some of the implications of it. You know, not all players care about their privacy, but enough do that it's a meaningful differentiator for you.

David Fischer 20:50
We heard from them that they do. I would get literally an email a week on something related to this: turn off Google Analytics, or respect Do Not Track, or something. So we took a lot of steps there around Do Not Track. I don't know if you're familiar with it. It's sort of a pseudo-standard. It's not a real standard, or rather, there are real standards around it, but there's no teeth around it. You can say, "Yeah, I support Do Not Track, I tell you that you're being tracked." Boom, supported.

Will Vincent 21:24
It sounds like something corporations would create.

David Fischer 21:29
Absolutely. But, um, you know, the EFF has their own idea of what Do Not Track stands for. If you subscribe to that, there are real things that you're supposed to do: keep server logs no more than 10 days if they contain an IP address, things like this. Read the Docs did not originally support this, but we've moved, and now we're in compliance with this pseudo-standard that the EFF has put out for Do Not Track, both on the Read the Docs side and on the advertising side.
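
The one concrete rule David quotes, server logs containing an IP address kept no more than 10 days, is easy to express in code. The Python sketch below shows only that rule; the function name and the idea of tagging each log entry with a `contains_ip` flag are assumptions for illustration, and the EFF policy has further requirements not shown here.

```python
from datetime import datetime, timedelta, timezone

# Retention limit quoted in the episode for logs that contain an IP address.
MAX_IP_LOG_AGE = timedelta(days=10)


def must_purge(entry_time: datetime, contains_ip: bool, now: datetime) -> bool:
    """True when a log entry holds an IP address and is older than the limit."""
    return contains_ip and (now - entry_time) > MAX_IP_LOG_AGE


now = datetime(2020, 9, 30, tzinfo=timezone.utc)
print(must_purge(now - timedelta(days=11), contains_ip=True, now=now))   # True
print(must_purge(now - timedelta(days=5), contains_ip=True, now=now))    # False
print(must_purge(now - timedelta(days=11), contains_ip=False, now=now))  # False
```

In practice a rule like this is often enforced by log rotation settings rather than application code, and logs that never store the IP address avoid the problem altogether.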

Will Vincent 22:01
Yeah, well, I mean, we can I go, sorry, go ahead, Carlton. Good.

Carlton Gibson 22:05
Well, what I'd like to ask is, if I'm a site developer, can I ask you for your expertise here? I don't want to put Google Analytics on my site. I'm being good, because I'm concerned about these privacy issues and I just don't want to be part of that. But I would like some analytics. I can grep my logs, I guess, and get some idea of how busy my website is. But what would you recommend if I'm a small website developer? What can I put on that enables me to do some analytics, but ethically?

David Fischer 22:37
It's hard. I'm going to be 100% honest with you, it's very hard. There's this newer startup called Plausible Analytics, and that's exactly what they bill themselves as. Some people will say, well, that's all marketing, but it partially isn't. They're doing some good stuff there. So I don't want to speak negatively about them. They're doing really good things. And having more alternatives to Google Analytics is a good thing. Not cookieing users is a good thing. People who say, "Oh, but just grep the server logs," those people, in my opinion, have not run a real business. They don't really understand what they're talking about. You get so little from that compared with what you can get from JavaScript-based analytics. Even if a number of users are blocking it, you still get so much. You get things like the time on the page. You get whether they scrolled or not. You can attach actions to specific things, like, did somebody click this button, where maybe that button doesn't trigger another page view. You can get all this additional data. It's also much easier to filter out bots. Read the Docs traffic is, like, half bots. Some of those are malicious, but most of them are not. They're just search indexers indexing Read the Docs. And so we would have so little data. We actually do use Google Analytics on Read the Docs, still. We're always evaluating a few alternatives here. We've thought about maybe running it server side, where you actually hit a Read the Docs endpoint, and we basically strip out a bunch of stuff and then send it off to Google Analytics from the server side. And you could do things like drop IP addresses and anonymize user agents, and you'd not have any cookies for users.
So you wouldn't have any Google Analytics cookies for users, and Google wouldn't be able to tell, hey, this is exactly this person. So these are advantages. But on a site like Read the Docs, you're talking about 45 million additional server-side API requests that then have to hit Google while you wait for a response. Right.
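
The server-side idea David floats (strip identifying details, then forward the hit) could be sketched as below. This is a hypothetical illustration of the scrubbing step only, written in Python: the field names `ip`, `cookie`, and `user_agent`, the function names, and the coarse browser buckets are all assumptions, and the forwarding request to an analytics service is deliberately left out.

```python
# Fields that identify an individual visitor and are dropped outright.
SENSITIVE_FIELDS = {"ip", "cookie"}


def coarse_user_agent(user_agent: str) -> str:
    """Reduce a full user agent string to a browser family."""
    # Order matters: Edge UAs also contain "Chrome/", Chrome UAs also contain "Safari/".
    families = (("Edg/", "Edge"), ("Firefox/", "Firefox"),
                ("Chrome/", "Chrome"), ("Safari/", "Safari"))
    for token, family in families:
        if token in user_agent:
            return family
    return "Other"


def scrub_hit(hit: dict) -> dict:
    """Return a copy of a pageview record that is safe to forward."""
    cleaned = {key: value for key, value in hit.items() if key not in SENSITIVE_FIELDS}
    if "user_agent" in cleaned:
        cleaned["user_agent"] = coarse_user_agent(cleaned["user_agent"])
    return cleaned


hit = {
    "path": "/en/latest/",
    "ip": "203.0.113.7",
    "cookie": "_ga=GA1.2.123",
    "user_agent": "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0",
}
print(scrub_hit(hit))  # {'path': '/en/latest/', 'user_agent': 'Firefox'}
```

The scrubbing itself is cheap. The cost David points to is the second half: every one of those pageviews becomes an extra outbound request from the server to the analytics provider.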

Carlton Gibson 24:46
So there's a real cost, even if you really care about privacy.

Unknown Speaker 24:50
Yeah. And you could host your own solution. But, you know, we're talking about a service that's 45 million requests a month. Standing up something that's going to run analytics on that is hard. It will be expensive. It will cost many hundreds of dollars a month just in infrastructure. And we tried it a little bit, and many of these solutions fell over at that scale. So there are drawbacks. It's hard. It feels to me like, just broadly with advertising, even more than usual, there's this divide: if you have a little bit of money, you can not see any ads, and if you don't, you're just going to get bombarded. I mean, like YouTube, right? In the last two years, YouTube went from almost no ads to now two ads at the start and ads every three minutes. And,

Will Vincent 25:38
you know, probably like a lot of people, I don't watch regular TV really ever. And when I do, it's for a live sporting event, and it's just awful because of the ads, right? Everything is a streaming service. Which ties back to YouTube: they have this premium thing, and they have a free month trial, and I've been using it, and I'm almost tempted to spend, you know, 12 bucks a month just to not have the ads. So I guess if there was some global Chrome ad thing where I just never saw an ad on a Chrome site, that would be appealing. But that's because I have some capital and I value that time. But I don't like that about the world. I feel that shouldn't be the case. I mean, that's, like, capitalism, whatever.

David Fischer 26:28
Put yourself, though, into the advertiser's mindset. The people who are willing to pay $10 a month or $12 a month, or whatever it is, to not see ads, unfortunately, are also the most valuable people to advertise to. So to charge a market rate for this, you actually have to charge way over a market rate, because the people who will pay $12 a month to not see ads are worth more than $12 a month in advertising.

Carlton Gibson 26:58
That's a good point. Yeah, it's a bad thing. I don't know exactly what the solution here is. I've had on my list, as a solution, to try out this Pi-hole thing, where you set up a Raspberry Pi on your local network, and you set it up as the DNS resolver for your router, so that all traffic goes via it. It's got a blacklist, and it just won't load any ads or anything. And I hear glowing reviews. I haven't had that, you know,
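
The blocking idea Carlton describes can be pictured as a lookup that a DNS resolver performs before answering: if the requested name, or a parent domain of it, is on the list, the resolver refuses to resolve it. The Python sketch below illustrates that lookup only. It is not Pi-hole's implementation, the listed domains are made up, and matching parent domains is a simplification chosen for this example.

```python
# Domains the resolver refuses to answer for (hypothetical entries).
BLOCKLIST = {"ads.example.com", "tracker.example.net"}


def is_blocked(hostname: str) -> bool:
    """True when the hostname or any of its parent domains is on the blocklist."""
    labels = hostname.lower().rstrip(".").split(".")
    # Check "cdn.ads.example.com", then "ads.example.com", then "example.com", ...
    return any(".".join(labels[i:]) in BLOCKLIST for i in range(len(labels)))


print(is_blocked("cdn.ads.example.com"))  # True
print(is_blocked("docs.example.org"))     # False
```

Because the block happens at name resolution, strictly speaking it is the DNS lookups from every device on the network that pass through the resolver, which is why it can cover phones and TVs as well as browsers.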

Will Vincent 27:23
this is your manner that people overnight Carlton,

Carlton Gibson 27:29
a local solution, it may not be you know, it doesn't solve the

David Fischer 27:33
I know a lot of people who run one. So, you know, they work really well. You will run into some issues where some sites are just not going to work. Somebody's going to send you a link, you're going to see, "Oh, this doesn't resolve at all," and you're going to have to log into your Pi-hole, fix something, and check it out again. But you will not see ads, essentially, at all, anywhere. And again, the advertising companies are probably unhappy about this, because those are the people who are the most valuable to advertise to. At Read the Docs, we've decided that with ad blockers, there's nothing you can really do about it. People are going to ad block. You can try to urge them not to. You could try to diversify your revenue stream, and that's essentially the tactic that we've taken. You can think about this the other way, too: so many people are blocking ads, and yes, as a result of that, the ad industry has had to get more intrusive, with more ads, bigger ads, et cetera. So that's one response to this. But the other side of this is that people are essentially boycotting advertising. It's a very wide boycott. A lot of people hate it, for a few different reasons. I think tracking is part of that problem. I think more intrusive ads are part of that problem. Although, let me tell you, I remember what ads were like on the internet 15 years ago. They were terrible. You know, pop-under ads, pop-up ads, all those things.

Will Vincent 29:04
I mean, those are still there.

David Fischer 29:05
Those are still there. But, you know, pop-under ads are literally the bane of everything. I still get probably an email a week at the Read the Docs advertising email address that's like, you're losing money by not running the worst ads the internet possibly has. And they're certainly right. Could we make more money by selling out our user base? Yes. True.

Carlton Gibson 29:27
Yeah, we could. You always have to remember that at the end, your heart will be weighed against a feather, you know? Yeah.

David Fischer 29:35
Yeah. So we don't do that. But we, you know, I get an email a week probably about it.

Will Vincent 29:39
It's also that short-term, long-term thing, where if you do any sort of analytics, it's always going to show that you should do it, because you can't measure the long term. You can't measure this subjective brand quality. I mean, even for someone who has ads on their site: I was at Quizlet, a top-hundred website, and we had a new Google rep every six months. You know, another 23-year-old. And every time it was like, is there actually something useful you can tell us? And, you know, I don't blame the rep, but all they could say is: a bigger ad, more prominently up there. And the problem is, once you put it on there, you get addicted to the revenue from it, and you can't take it off. And then your site looks like Facebook looks like now. Yeah. So I admire sites that don't have it. I mean, yeah, I understand how it happens. Yeah,

David Fischer 30:28
I mean, you know, if we put a second ad on Read the Docs, for example, that would probably double

Will Vincent 30:34
the revenue. Right,

David Fischer 30:36
Right, right. And the revenue is good. But we could probably double it by just putting a second ad on there. We don't want to do that, you know, there are drawbacks to this. And you hit it on the head: you've got to think long term versus short term.

Will Vincent 30:49
But I think users can't articulate that, right? Like, sometimes people will tell me with my personal site, they love the design. And really what they're saying is they love no ads. Yeah, like, I don't fool myself. I'm like, is it the design? I think it's that there's no ads on it, or there's very small ads on it. But Carlton, you had a point?

Carlton Gibson 31:08
Well, I was just going to segue back to the Django setup of the... Yeah, yeah. So because, you know, you must have some interesting stories. So perhaps we can go through the third-party apps?

David Fischer 31:23
Sure. Yeah, I think I went through them to re-familiarize myself with them for this. Because, you know, some of them you don't work with every single day. So you don't even remember, you're like, what was that for? But

Carlton Gibson 31:37
Yeah, so what I see on the list we've got in the show notes here is you've got django-ratelimit. Yeah, amazing. So that's brilliant. And we should talk about that, because Django doesn't come with rate limiting built in. I know DRF has got rate limiting on the API views. But, you know, tell us about doing rate limiting.

David Fischer 31:55
Yeah, we actually use it for probably something different than what a lot of people use it for. A lot of people probably use it for things like rate limiting logins, or something like that. We actually use it for rate limiting advertising. You know, there's sort of a maximum number of ads that anybody can click on in any amount of time, and there's a maximum number of ads that a real person could view in a reasonable amount of time. And we use it for those kinds of features. So for us, it's kind of a security feature, or, you know, an ad fraud feature. Ad fraud is real; I spend more than my fair share of time dealing with it. And it's one of the easy things when you're Google or Facebook: you just handle it at huge scale. But for somebody like us, it's hard. When all the advertising was run on Read the Docs, we had sort of an incentive to report things correctly. But now we have these third-party publishers who are running ads on their sites, which only started, I think I mentioned, a couple of months ago, and they are getting paid out on this, so they have different incentives. And their incentives don't necessarily align with ours. And we have to make sure that things are legitimate. You know, we basically have advertising for developers, and developers know very well how to automate clicking on ads, or viewing ads, or whatever, you know, and,

Carlton Gibson 33:26
Well, yeah, you know, I've just learned asyncio. So now I can click on the ads, you know, concurrently?

David Fischer 33:33
Absolutely. Hundreds of times a second.

Will Vincent 33:37
Wow, that's how you... Yeah, I've used some version of that, too. Like, we've had to log in to, like, go to the local beach here. And I almost wanted to just set up a thing, you know, to slam the site. I didn't; I think I'm too busy with kids. But it's right there for you. You know, you can script kiddie it. It just works.

David Fischer 33:56
I think, what was it, django-ratelimit? I think it made the newsletter, the Django newsletter.

Will Vincent 34:04
Oh, that thing recently?

David Fischer 34:05
Yeah, I think so. I think it was on there pretty recently. But yeah, so we use it actually for something a little different than that. Allauth sort of has its own rate limiting built in, so we use that for authentication. But that's not as big of a... We use that both on our ad server and on Read the Docs, allauth, that is. But yeah, ratelimit is specifically for manually rate limiting advertising, as an ad fraud feature.
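
The idea David describes, capping how many ad clicks from one client are counted in a given period, can be sketched in plain Python. This is only an illustration under stated assumptions: it is not the Ethical Ads code and it does not use the django-ratelimit API. The class name, the `count_billable_clicks` helper, and the limit of 3 clicks per 60 seconds are all hypothetical.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` events per `window` seconds for each key."""

    def __init__(self, limit, window, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._events = defaultdict(deque)  # key -> timestamps of recent events

    def allow(self, key):
        """Record an event for `key`; return False if it exceeds the limit."""
        now = self.clock()
        events = self._events[key]
        # Drop timestamps that have fallen out of the window.
        while events and now - events[0] >= self.window:
            events.popleft()
        if len(events) >= self.limit:
            return False  # over the limit: do not count this click
        events.append(now)
        return True


def count_billable_clicks(limiter, client_key, attempts):
    """Return how many of `attempts` rapid clicks would be counted (hypothetical helper)."""
    return sum(1 for _ in range(attempts) if limiter.allow(client_key))
```

With a limit of 3 per minute, a script that fires 100 clicks from the same client in the same instant has only 3 of them counted, while a different client is unaffected. A real deployment would presumably keep the counters in a shared cache rather than in process memory.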

Will Vincent 34:31
Can I ask, so you're an international platform, and there's a number of packages around country-specific things. I guess broadly, can you talk about the challenges of moving beyond just being US-based to a globally supported Django project? Because that's a whole nother thing. Got it?

David Fischer 34:50
Yeah, it's hard, and I actually almost don't want to talk about it too much, because I don't think that Read the Docs spends an appropriate amount of effort there. You'd be surprised, but something like 92% of all documentation on Read the Docs is just English.

Will Vincent 35:06
Is it? So what percentage is the US? Oh, you mean traffic-wise? Okay. It's totally there. Yeah, there's internationalization of language and there's traffic.

David Fischer 35:16
Yeah. So traffic-wise, it's about a quarter North America, US and Canada. So like 21, 22% is sort of the US percentage. But in terms of language, like written, spoken language, it's 92% English, and virtually all of the remainder is Chinese. Hmm. So everything else is a rounding error. Sub 1%.

Carlton Gibson 35:38
There's, like, the support for multi-language sites, right? If my docs are translated, I can serve them in English and in German and in French, but the effort to do that is just monumental. We do it on Django, but we have literally a whole team of people, you know, and volunteers who do the translation for each language. And the amount of effort for a solo, you know, for a smaller project, like, you know, my django-filter: no way I could translate the docs for django-filter, just correct, you can nap.

David Fischer 36:11
Yeah, and this is actually an area of interest for Anthony, one of my co-workers at Read the Docs. This is definitely one of his interests, and he has a few ideas here, like integrating with some of these third-party translation services, like Transifex or something like that. You know, Sphinx supports something similar to what Django supports, where you have these .po files that you can upload to a service and hopefully get a translation for them and then serve them. It is a little tricky to set up a multi-language project on Read the Docs, but it is possible; we have a few projects doing it. Probably the biggest one is Godot, which is a game engine, a C++ game engine. They're a huge project on Read the Docs, totally not in the Python community. Until you start looking at how Read the Docs makes money and what our biggest-traffic projects are, I would not have been familiar with them at all, because, you know, it's not in the Python community, it's not something that I work with on a daily basis. But they have translations for probably a dozen languages for their documentation. They are also very expensive to host, because their documentation builds take like 20 minutes each. And so it's like, oh, a commit to the main repository means we have to build documentation for 20 languages.

Carlton Gibson 37:33
That's one thing I wanted to ask: how clever is the caching on the docs rebuild? Like, do you kind of checksum up front on the docs folder, that kind of thing, to say, hey, do you know what, although there's been a commit, nothing's changed, we don't have to rebuild?

David Fischer 37:50
We used to, and we removed most of them. And the reason why is because it's actually extremely hard to do it correctly. A lot of times Sphinx projects will use some autoapi or something like this that's referencing code. So it's actually building docs from code directly. And so there might be a change that isn't in the docs directory, but because it's a change in the code directory, it affects the output of the docs. So we just decided at some point, this is too hard. We'll just waste resources and have our builds be correct.

Carlton Gibson 38:25
So yeah, actually, you're in danger of using more energy trying to guess whether you should rebuild than you would, you know, yeah.

David Fischer 38:33
Yeah, it ended up being a problem that we decided was too hard to solve. And interestingly enough, I think a lot of the CI services went the same route. So we sort of model a lot of what we do off of some of these CI services like Travis or CircleCI, and that's what they're doing. You know, they're not trying to get clever and say, oh, well, we think the tests are going to pass because you didn't change anything over here. They just rerun it. Okay.
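
The failure mode discussed above can be shown with a small sketch. It assumes a hypothetical repository layout with a `docs/` folder whose page is generated from a docstring in `src/` (the `.. automodule::` line is a Sphinx autodoc directive; the package name `mypkg` is made up). It is not how Read the Docs ever implemented its checks; it only illustrates why a checksum limited to the docs folder can wrongly conclude that nothing changed.

```python
import hashlib
import tempfile
from pathlib import Path


def tree_checksum(root):
    """Return a SHA-256 digest over every file path and file body under `root`."""
    digest = hashlib.sha256()
    root = Path(root)
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        digest.update(path.relative_to(root).as_posix().encode())
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


def demo():
    """Show a code change that leaves the docs-only checksum untouched."""
    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp)
        (repo / "docs").mkdir()
        (repo / "src").mkdir()
        # Hypothetical layout: the docs page pulls its text from src/ at build time.
        (repo / "docs" / "api.rst").write_text(".. automodule:: mypkg\n")
        (repo / "src" / "mypkg.py").write_text('"""Old docstring."""\n')

        docs_before = tree_checksum(repo / "docs")
        repo_before = tree_checksum(repo)

        # A commit that only touches code, yet changes the rendered docs.
        (repo / "src" / "mypkg.py").write_text('"""New docstring."""\n')

        docs_unchanged = tree_checksum(repo / "docs") == docs_before
        repo_unchanged = tree_checksum(repo) == repo_before
        return docs_unchanged, repo_unchanged
```

Here `demo()` reports that the docs-only checksum is unchanged while the whole-repository checksum differs. Hashing the whole repository avoids the false "nothing changed" result, but then nearly every commit triggers a rebuild anyway, which is close to the "just rebuild" approach described in the conversation.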

Will Vincent 39:01
Okay, interesting. I really want to ask about Stripe, but maybe that just works brilliantly for you, and the new API has been no big deal switching over.

David Fischer 39:10
Um, we're actually not using most of the new APIs. We're actually using it to pay out publishers. It's sort of a little bit different.

Will Vincent 39:16
The Connect side or not? So not... Yeah, Connect. Connect. Yeah, marketplace.

David Fischer 39:21
Yeah. So basically, publishers can sign up for a Stripe account. We don't get their bank account information, but we can basically transfer money to them. So that's what we're using it for. Probably most of our payouts are still PayPal, though. And we don't have a fully automated solution there yet. This is one of the things when you stand up your own advertising network over a couple of months: lots of things are not automated yet.

Will Vincent 39:43
Well, even Carbon Ads is on PayPal. I thought it'd be a little more advanced. But I think a lot of the new Stripe stuff, too, is around subscriptions and, I'm blanking on the European law, but there's all sorts of things around that. So it's less so for sort of one-offs. Though Stripe is finally adding, I think they just added, like, tax support, because for someone my size, if I added Stripe, to collect tax per state, per country is impossible to do. Even if I tried to, which I have, it's just like, I guess it's a no-go. But Stripe is slowly rolling up all these things around analytics, you know, around TaxJar and all these other third-party things you can sub in to do taxes appropriately. But

David Fischer 40:32
we're actually not using it for subscriptions, which I think is a lot of what

Will Vincent 40:35
yeah, I guess new stuff is,

David Fischer 40:37
Yeah, yeah, we use it mostly for, you know, sending invoices to advertisers and paying out publishers; those are our main things. And the Stripe Connect stuff has worked fairly well. Although there's sort of a beta for, like, if you have your balance in Stripe in US dollars, and you want to pay somebody whose balance is not in US dollars, or actually even just not in the US, it's a problem. Stripe has beta support for it, so we've applied to join the beta. So right now, we can only pay out publishers via Stripe in the US. But maybe that'll change in a month. Everybody else has to use PayPal, or something else, or they have to give us their bank information. Right, which is, yeah, as much as possible, we want to not do that.

Will Vincent 41:20
Well, that's like my mom, not to pick on my mother, but as an example, you know, she wanted to write a check as opposed to entering her credit card. And I was like, you know, your check has your routing and your account number on it, right? Like, you know that, and your address.

David Fischer 41:35
Yeah, checks are way less secure.

Will Vincent 41:37
Oh my god, they're amazingly insecure. It's just, oh my god, handing one out to a stranger, when the things I could do if someone gave me a check... But anyways, so DjangoCon US has been in San Diego the last two years, and will be again next year. Can you just briefly talk about it? You wrote the bid for that, right?

David Fischer 41:57
So, you know, I'll combine this with the San Diego Python stuff. So yeah, there's a couple of people who have a bigger, maybe, name in the Python community than I do. I really focus my efforts locally. I'm not big on social media; I have very little social media presence. But, um, Trey Hunner, who's one of the other San Diego Python organizers, he was basically like, David, you should write the bid for DjangoCon US to come to San Diego. I was like, all right, fine, Trey, I'll do this. And so I wrote it. It was pretty convenient, actually, because at the time I was working in this office in downtown San Diego, and the San Diego Tourism Authority, which is like this quasi-government, actually they are governmental, they're part of the local government, they were in the same building. So I just sort of stopped by their office and was like, I have this conference, you know, it's this big, it's going to probably bring in this many dollars to San Diego, I need help writing a bid. I wrote the whole bid, but they pointed me in the right direction so much. You know, DjangoCon was basically like, here are our target budgets for hotels, here are our target budgets for some other things. And I brought that to the San Diego Tourism Authority, and they're like, this is perfect. These hotels, this whole area, like, you can't even talk to them. You know, they helped me so much, because I would have spent so much time talking to the wrong hotels that are just, you know, double the budget and stuff like that. So, yeah, it helped a lot. I ended up writing the bid. I think there were a couple of other bids, but we got accepted. And I'm sure it's probably more expensive than some of the other places they've hosted. But, you know, it's certainly not Bay Area prices or anything like that.

Will Vincent 43:34
No, for sure. I mean, San Diego's a lovely place to visit. I've only recently come to appreciate all the work that definitely you, Jeff Triplett, and others there do to organize these conferences, because, I mean, DjangoCon Europe just happened, and it's so much work to do a conference, and it's, you know, very much not publicly seen. So it's interesting just to hear about what it takes to put that on.

David Fischer 44:02
Yeah, the bid is relatively minor compared to the operations and all the other things. So really, Jeff Triplett, and I don't want to just call out one person, you know, that whole team. They're what make DjangoCon a success, DjangoCon US anyway. So my contribution is so minor by comparison. But yeah, so San Diego Python, getting back to that. Yeah, you know, I'm one of the co-organizers. I took over as probably the main organizer in maybe 2012. The person who was the organizer before moved to the Bay Area, which is a common problem in San Diego for organizers. You know, they make a name for themselves, or take the next step in their career, and the logical next step is to move to the Bay Area and make way more money. So I was the person who was like, I'm never leaving San Diego. I love it here. It's great. So I was sort of the person who ran it by default, because I was not moving away. But now we have this great team. That's one of my big successes, in my opinion: now there's a bunch of other organizers, and when I need a month off or something else like that, San Diego Python continues without me. My wife and I had a daughter four years ago, and I sort of didn't do any San Diego Python stuff for a year. And it worked. People still went, and there were still talks. And, you know, the other organizers really made that happen. So

David Fischer 45:36
I'd say that's my biggest success, that the group will continue when I'm gone.

Will Vincent 45:38
Well, I mean, in any organization... I mean, here in Boston, there's a Django Boston group, which I think is the third largest by members, after

David Fischer 45:49
we modeled ours after that one.

Will Vincent 45:52
Okay. Yeah. So it's, um, John Baldwin, and, I'm sorry, I'm blanking on the other. But it's, you know, one or two people who do all the work, and it's quite a lot of work. But it's such a great attribute for the community. And I guess even as big as Boston is, which is a pretty big developer community, you know, we used to have 50 to 80 people. The Python meetings, of course, are like three, four hundred. But it's great. That's way bigger than us. But, yeah, okay, well, I would say, you know, I'm mindful of time, but in Boston, web is not that big a thing. It's so much more data science, or even hardware stuff. There's not as much web stuff. Whereas when I was in the Bay Area, it was very much, you know, consumer web kind of things. There's much more, I don't know, PhD-level programming, and the web is sort of this thing you sprinkle on top to just deploy it to customers. Sure. Carlton, what's the scene on the coast of Spain?

Carlton Gibson 47:01
Well, where I am, there's a few little local web dev shops. There's not much here. Down in Barcelona, there's a good amount; there's a Python Barcelona meetup there, which is good. I'm really bad, because I've got so many children, I just don't go. Like, yeah, I could take out this massive chunk of my life to go down to Barcelona and then hang out till quite late at night, and then drive home, but I can't do that. So I don't go down very often. But they're a great crowd. And, you know, it's active and they still keep going. I think they've got an online roundtable tomorrow discussing, you know, the changes in remote working and all the rest, because they brought in new laws in Spain, like yesterday, about remote working and how that's going to affect everything. So it's a really active community. I mean, Barcelona, yes, there's quite a good tech scene there.

Will Vincent 47:54
I would like to ask you more questions, but I think we're, we're close to time, we're gonna have links to the code, I definitely recommend people take a look at the ethical ad server, especially if you look at the file. That alone is one of the rare readable production level code snippets I've seen. Well commented and everything else. So I definitely recommend that ethical Right, that's the site if people want to have ethical ads on their site. Yeah. Anything else as we head out, you want to promote a shout out?

David Fischer 48:26
You know, no, just, I think San Diego Python is one of my biggest successes, I feel, so I'm happy that we talked about that, I'm happy we were able to fit that in. Read the Docs is fantastic. I love them too. And, you know, I mostly, at Read the Docs, work on the advertising side. I work sometimes on the security and privacy stuff on Read the Docs as well. Probably my biggest success there, and maybe I'll shout out to Cloudflare, thank you for that, is all this stuff where we now have HTTPS on the thousands of custom domains for docs on Read the Docs. That's all courtesy of Cloudflare. That would cost us thousands of dollars a month; that alone would probably double our infrastructure budget if we were paying, like, retail price for that.

David Fischer 49:13
So I'll thank them for that.

Will Vincent 49:15
I mean, I use Cloudflare. I feel like there should be a whole course on Django plus Cloudflare, since it's so powerful, but

David Fischer 49:23
Yeah, Read the Docs is actually doing some really cool stuff there, but maybe there's not time for that. We have this whole thing where, when there's a new docs build, we're purging those docs from the cache, and all those sorts of things. You know, when we rolled that out, docs, especially if you were browsing them in, like, Asia or Europe, got so much faster, especially Asia, because it's far from our main center.

Will Vincent 49:45
Well, maybe we'll have to have you on again. I mean, yeah, just before this, I was manually purging some pages on LearnDjango that I updated, and I'm like, I know I need to automate this, but I just can't be bothered. Yeah, so I can't even imagine at the scale that you all are at.

David Fischer 50:02
Oh, yeah. And yeah, I won't go into it because I know we're at time. But yeah, it's super cool. I didn't work on that, so I don't want to take credit for that. That was mostly our culture.

Carlton Gibson 50:11
But the long and short of it is you don't go through the, through the UI though. Clicking purge per

David Fischer 50:20
Oh, absolutely not. No.

Will Vincent 50:22
Well, thank you so much for coming on, David. We'll have links to everything in the show notes. Really appreciate it. I'm glad Jeff connected us. Yeah, Jeff Triplett, because I was complaining about ads, and he was like, you should use Ethical Ads. And I was like, what's that? And then he's like, and there's someone there you can email. It's like, oh, okay. So,

David Fischer 50:43
Yeah, advertising, it's a crazy business. Lots of bad stuff. But, you know, we're trying to do good there, as much as one can.

Will Vincent 50:51
So as ever, we're at djangochat.com, @ChatDjango on Twitter. And we'll see you all next episode. Bye. Join us next time. Thank you.