Simon Willison is a co-creator of Django and currently works at Eventbrite. We discuss the early days of Django, the founding of Lanyrd, and his ongoing work with Datasette to make it easier to publish and explore data online.
Carlton Gibson 0:06
Hi, and welcome to another episode of Django Chat. I'm Carlton Gibson. I'm joined as ever by Will Vincent. Hi, Carlton. And today we've got a very special guest, Simon Willison, with us. Hi, Simon.
Simon Willison 0:15
Hi. great
Carlton Gibson 0:16
talking to you. Thanks for coming on. So, Simon, you're one of the three hands that created Django, right? That's your sort of opening in the Django community. So for us, we're just like, wow, amazing, you created Django. Let's start with that: can you tell us something about the birth of Django? Sure, yeah,
Simon Willison 0:35
I can do that. So this is, what, nearly 15 years ago now. This was back when I was at university in England, and my university course had an option of a year in industry. So you can go out and spend a year working for someone and then go back and finish your degree. And it means you can get student visas for things and so forth. And I had a blog where I was writing about web development. There weren't that many of us doing this back in, what, 2002. Adrian Holovaty also had a blog, and he was writing about web development for this little newspaper in Kansas. And he posted a job ad saying, hey, I need somebody to come and help me build newspaper websites. And it coincided with this opportunity for a year in industry. So I thought, well, that could be kind of fun. You know, I could go to Kansas and spend a year there and work with this guy who I'd been following online for a while. And so I got in touch and said, hey, would this work as essentially a sort of paid internship? He said, yeah, absolutely, we can do that. And then I headed out there and I spent most of the year out there in the tiny little town of Lawrence, Kansas, working on what turned out to be Django, though at the time we just called it the CMS, right? Because that was the code we were writing for the newspaper websites,
Will Vincent 1:44
And we interviewed Frank Wiles. I believe he was in charge of the web part at the time, and then you guys took over. Is that right? What was that transition?
Simon Willison 1:55
Um, yes. So Frank had built much of the website in Perl. I mean, it was 2002.
Carlton Gibson 2:02
What else would you do?
Simon Willison 2:04
And so Adrian had been building things in PHP. I'd been using PHP for a few years. We'd both hit the limits of what we felt was productive to do with PHP. This was back when PHP had only just got classes and things. Yeah,
Will Vincent 2:17
very early days.
Simon Willison 2:20
It was very early days. And Adrian and I were both really into Python. We'd been following the work of Mark Pilgrim, who'd been writing all sorts of great things about Python. We wanted to use Python for websites. But back then the answer for how to use Python on the web was quite flaky. You know, there were a few very early web frameworks. Zope had the sort of majority mindshare for Python on the web, and it didn't really fit the way we thought about how we wanted to build websites. You know, we were into things like really cleanly designed URLs and using CSS to separate your presentation from your content and all of these kinds of things. And so we started poking around and figuring out, okay, can we use Python to build these websites? And actually the real inspiration for Django was that we were looking at using mod_python, the Apache module, for building things. And we realized that there weren't that many people using it. And we were a little bit nervous that mod_python might turn out to be the wrong direction. So we thought, okay, well, what we can do is build a very thin abstraction layer between our code and mod_python. Then if it turns out mod_python was the wrong choice, we can switch to something else. And so that's where Django came from. We started out with the request object, the response object, the way URL routing works, those kinds of things. Because essentially it was an insurance policy in case mod_python turned out not to work out. mod_python worked out great for years and years and years. But that was really why our CMS had its own thin abstraction layer. Wow.
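The thin abstraction layer described here can be illustrated with a minimal sketch. This is not Django's actual code: the names `Request`, `Response`, `Router`, and `handle` are hypothetical, and the only point is that view code talks to framework-neutral objects, so a server-specific adapter (mod_python or anything else) could be swapped without touching the views.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Request:
    """Framework-neutral view of an incoming request (hypothetical)."""
    method: str
    path: str
    query: dict = field(default_factory=dict)


@dataclass
class Response:
    """Framework-neutral description of what to send back (hypothetical)."""
    body: str
    status: int = 200
    content_type: str = "text/html"


class Router:
    """Maps URL patterns (regular expressions) to view functions."""

    def __init__(self):
        self._routes = []

    def add(self, pattern, view):
        self._routes.append((re.compile(pattern), view))

    def dispatch(self, request):
        for regex, view in self._routes:
            match = regex.match(request.path)
            if match:
                # Named groups in the pattern become keyword arguments.
                return view(request, **match.groupdict())
        return Response("Not found", status=404)


def article_detail(request, year, slug):
    # A view only ever sees Request and returns Response.
    return Response(f"<h1>{slug} ({year})</h1>")


router = Router()
router.add(r"^/articles/(?P<year>\d{4})/(?P<slug>[\w-]+)/$", article_detail)


def handle(method, path):
    """The single entry point a server-specific adapter would call."""
    return router.dispatch(Request(method=method, path=path))
```

Under these assumptions, replacing the web server means rewriting only the adapter that builds a `Request` and writes out a `Response`; the router and views stay as they are.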
Carlton Gibson 3:54
Right. So Django is an indirection layer. Exactly.
Will Vincent 3:57
It's an indirection layer on mod_python originally. Did you know you were gonna be working on a CMS, or did you just know you were gonna work with Adrian before you headed off to Kansas?
Simon Willison 4:04
I just knew I was gonna be working with Adrian, doing some really novel and interesting things involving news websites. And I should say a little bit more about the Lawrence Journal-World back then. So this was a tiny little newspaper in a tiny little town, but it was very well resourced, because the family who owned the newspaper had become independently wealthy off, I believe, laying fiber optic cable around the Midwest back in the 70s, when everyone thought that was a crazy thing to do. And then they sold their network to Comcast or somebody. So they had a lot of money. They were very heavily invested in the local community. And they were running essentially a sort of media empire that spanned about 100 miles. So they had the local newspaper, I think they had radio stations, they had the local TV station, and they were running a broadband provider for this little town. So Lawrence, Kansas, back in 2002, had fantastic broadband across the whole place, which meant that the newspaper website could experiment with things like online video. No other local newspaper was doing online video, because who's going to be able to watch it? And so that was really exciting. And then basically, because they had all of this investment, they had this really well-staffed newsroom, and they were really trying to push the edges on what you could do with web stuff as well. And the chap running the online department was a guy called Rob Curley, and he was very much of the opinion that you should go all out on everything that you were doing. So a lot of the fun stuff we did with Django was sort of inspired by Rob coming up with ridiculous ideas for things to do with the website. My favorite example has always been this: it was kids' Little League season. So it's when kids play softball, and, you know, in England this is not a big deal. In small-town America, this is absolutely the most important thing.
Will Vincent 5:54
Yeah, yeah.
Simon Willison 5:54
Oh my goodness, it was phenomenal. So Rob said to us he wanted us to take these little kids' softball teams and treat them like they're the New York Yankees, right? So we'd have a website with player profiles and team alerts and schedules for the league. And then he sent two of the interns out to take 360-degree videos, using QuickTime VR back then, of every baseball pitch in the whole city, so that you'd have those on the website. You could get, like, a virtual view of these kids' softball pitches. Oh man, that's so cool
Carlton Gibson 6:27
with these, like, kind of draggable views.
Simon Willison 6:31
Exactly. It was 2002, 2003.
And we did SMS alerts so that parents could sign up and get an SMS about what their kids were up to and all of this kind of stuff. And then a lot of this went into the newspaper, you know, because it turns out if you print a supplement in the paper with a bunch of photos of kids playing softball, everyone in town who knows any of those children buys a copy. Yeah, and all of the local businesses want to be seen sponsoring the kids' softball and stuff. That's a big commercial venture. And we basically had two weeks to build this website. So this was sort of the proving ground for Django. This is what it had all been working up to. We had, well, it wasn't the ORM then, we had code generation for our SQL bits. And it allowed us to get this thing out. And that's when we realized that what we were building was really working, you know, that we could take on these ridiculously ambitious projects with a very small team in a very small amount of time. Super, super.
Carlton Gibson 7:30
And yeah, I mean, thinking back to similar things people were trying to do with PHP, it was just crazy. It was, you know, the same sort of timescale, the same sort of epoch. It was difficult. You were building a lot of things from scratch, and there weren't the frameworks to support you. And
Simon Willison 7:46
Right, it was tricky. I mean, the other thing that's worth mentioning is, you know how Ruby on Rails was famously extracted from Basecamp, right? They built Basecamp, and then they pulled Ruby on Rails out and made that open source. Django was slightly different, in that we already had this website called Lawrence.com, which was the local entertainment website for the town of Lawrence, Kansas. So it had all of the bands that were playing, it had articles, it had restaurant reviews, all of that kind of stuff. And this was the mammoth website that Adrian had been building in PHP. And it was an extraordinary thing. Absolutely amazing website. And the goal with Django was always to grow the framework to the point where it could run Lawrence.com. So Rails was extracted from Basecamp; Django was sort of evolved to fit a site that already existed but that we wanted to replace. And we did manage to replace it: I think we launched Lawrence.com on what became Django just shortly before I ended my paid internship there.
Carlton Gibson 8:46
Okay, and was the sites framework a sort of early part of the planning as well?
Simon Willison 8:51
Very much so, yeah. I can't remember the exact details, but I believe the sites framework was in there right from the very start, because we knew we were going to have one single CMS that was supporting multiple different websites on the front. Right. Okay, so yeah, there's a lot of stuff in Django where knowing its history, how it came out of this tiny little newspaper group that was punching well above its weight, sort of helps explain some of those different features. Fantastic. Wow.
Will Vincent 9:17
So then, you know, it's a pretty productive work placement, I would say, in the history of work placements. But then you went back and you had, what, another year or another
Simon Willison 9:28
two years at university. And I think it was about six months after I left the newspaper that they managed to get the go-ahead to open source it. And actually, one of, I think, the most impactful things I did at the newspaper was that I helped to hire Jacob Kaplan-Moss as essentially my replacement, so we only overlapped for about a month at the end of my time there. Okay. But, you know, Jacob and Adrian, once I'd left, absolutely pushed forward and got the thing open sourced. And for the open sourcing, they had to make the argument to this very sort of traditional newspaper business that they should release this code into the wild. And they went in there with a bunch of arguments. I believe one of the biggest arguments was: look at what happened with Ruby on Rails, right? Ruby on Rails and Django were developed privately at about the same time. Ruby on Rails was open sourced just after I'd finished at the newspaper, and it took off like a rocket. And so they went to the newspaper management and said, look at this thing that has happened with Rails. We've got something that could also be really successful, that would make it easier to hire people, and we'd have other people doing development. But, impressively, the argument that really convinced the newspaper was: we have been using open source software for years to build this business, right? Like, mod_python is open source, we use Linux servers, all of that kind of stuff. This is our opportunity to give back. And that was the argument that swayed them. From what I've heard, the response was, no, this absolutely makes sense.
Will Vincent 10:59
Yeah, that's the answer you hope for. That's fantastic. I mean, I was just at PyCon for the first time, I saw you were at PyCon, and flying back I had a number of people on the plane with me, and a bunch of them were from kind of big industrial companies. And they were saying they don't really know Python, but they're saying, yeah, we're all switching to Python, because it's the only way to hire people. They're switching over from Java solely because they can't hire anyone under 40 to work in Java, so they're kind of being dragged kicking and screaming a little bit. So it's interesting that the recruiting thing wasn't the one that hit; it was the more altruistic, like,
Simon Willison 11:34
Again, back in 2004, nobody else was using Python to build websites, basically. But the newspaper actually spun off a separate company for several years that was commercializing a CMS built on top of Django. And for quite a while this was doing very well amongst lots of other newspapers, because at that point, you know, buying this commercial CMS, you can hire developers because it's built on top of Django. And it's open source, and so there's a marketplace for that. So a lot of those commercial factors did pay off really well. But from what I've heard, and again, this was after I'd left the newspaper, it was the altruistic argument that was the convincer for getting it open sourced in the first place.
Carlton Gibson 12:19
That's super, that's lovely.
Will Vincent 12:21
So what was it like going from that real-world experience of building something, doing all these creative things, and then you have another two years of, I don't know, data structures? I mean, what was that like, stepping back into a formal academic setting after having such a prolific
Simon Willison 12:36
It was pretty good. I mean, one of the things I learned working on Django is, you know, people who have computer science degrees often say, I don't use that stuff very often. The Django template system comes from me having done compilers when I was at university. I had not yet done Compilers 102, so it was very much a sort of loose lexer-and-interpreter kind of thing that was going on. And it was interesting as well being able to step back. I actually wrote a project in Ruby on Rails the moment it came out, because it was like, this is the thing I had at the newspaper, right? This is brilliant, somebody has open sourced a productive framework that I can get on with, because, you know, I didn't know if what became Django would be open sourced after that. So I did dabble around with Ruby on Rails. And it was also just interesting seeing that explosion of interest in Django and all of the contributions and the creativity that came out of that. So, yeah, I've never really thought about that transition back to academia. Again, it was so long ago, and I was so green in my career. You know, I'd essentially had a bit of a programming job before university, which was working in the first dot-com boom in, like, 1999 to 2000. I worked for an online gaming company down in London, building their file download websites. That was super fun. But then everything completely collapsed. The company, I believe, lost like 30 million pounds or something. So it properly disappeared in a puff of smoke. Yeah. Oh, absolutely. It was very exciting. Yeah. So I was quite happy to be safe in academia for a couple of years, waiting for the dot-com explosion to boil over again.
Will Vincent 14:29
Yeah. And so you mentioned your site where you were writing. Is this the same simonwillison.net that you have today? Yeah,
Simon Willison 14:36
basically, it's the same content. It's been at a couple of different web addresses, but yeah, that's
Will Vincent 14:41
my blog. Not many UI changes? A couple, a couple. I love the layout of your site. I love it. People should take a look at it. I love it. Thank
Simon Willison 14:50
you. Yeah, I did a big upgrade to it a couple of years ago. I moved it from being 800 pixels wide by default to 1024 pixels wide.
Carlton Gibson 15:00
On a modern screen.
Simon Willison 15:01
But yeah, basically that blog's been going since 2002.
Will Vincent 15:05
And that led to, yeah, connecting with Adrian. And we had Tom Dyson of Torchbox on, and he said he was reading your blog back in the day and didn't realize you were a teenager. Because you later worked with him, I think, right, in some capacity?
Simon Willison 15:17
Um, that's right. Yeah. I mean, no, I started the blog when I was 21. Yeah, I contracted with Torchbox for about six months at one point. And that's actually, I think, where I met Andrew Godwin. He was at Torchbox at the time.
Will Vincent 15:34
Yeah. Andrew was working on South for a project for Torchbox, Tom said. No?
Simon Willison 15:39
Well, yeah. I mean, when I was at Torchbox, we were working on Torchbox's internal migrations mechanism, which was the inspiration.
Will Vincent 15:46
He said it was related in some way. Yeah,
Simon Willison 15:48
Yeah, I think South, definitely. Andrew was at Torchbox for a lot longer after I left. Back at the first ever DjangoCon, Andrew and I were working on rival migration systems. I had this thing called dmigrations, and he had, I think he was calling it South back then. And we managed to set it up so that at the first DjangoCon we had a panel about database migrations. So it was myself, Andrew and I think
Will Vincent 16:16
Russell, yes. Russell Keith-Magee. He was working on something really scary. Andrew
Simon Willison 16:20
and Russell were on this panel back at the first ever DjangoCon, talking about our various approaches to migrations. Okay.
Will Vincent 16:27
Yeah, yeah, I saw a talk Russell gave where he was saying just that, and then what became South came in, and he was like, oh yeah, you can take it. Yeah, I
Simon Willison 16:36
kind of gave up on dmigrations after I saw how good South was getting.
Carlton Gibson 16:40
Okay. So good. It's like celebrity deathmatch.
Will Vincent 16:45
Okay, so there's so much we want to talk to you about. I mean, what are some of the highlights? We definitely want to talk about Lanyrd, Eventbrite. You just gave some fantastic talks on SQLite, and then you have this journalism fellowship coming up. So where do you want to start with all that? What do you most
Simon Willison 17:00
I'm all about SQLite at the moment.
Carlton Gibson 17:03
Yeah, good, good.
Will Vincent 17:05
You gave a talk at PyCon, which we'll link to. Carlton's very much team SQLite. But you've been using that in a lot of capacities around Datasette, right? Maybe you can talk about Datasette as well.
Simon Willison 17:16
So basically, this is an idea I had about a year and a half ago, and it stems from the time I spent at The Guardian newspaper. So after the Lawrence Journal-World, I went back to university. After university, I bounced around doing sort of contracting and freelancing for a bunch of different places. But then I ended up at The Guardian newspaper, doing data journalism projects. I realized recently that not everyone knows what data journalism is, and it's actually a little bit difficult for me to define it. But essentially, it's when, as a programmer, you get to work with journalists and build tools and do analysis that helps them find the stories in amongst the data. And the most obvious example is anytime you see an infographic in the newspaper, like a graph or a chart: somebody had to gather the data for that, and somebody often had to write the code to help pull those things together.
Carlton Gibson 18:05
A good example might be the Panama Papers leak, where without decent interpretation and decent visualization on top of all that stuff, you wouldn't know that, you know, so-and-so is holding money in Panama.
Simon Willison 18:18
Yeah, I think it's the most exciting job you can have as a programmer, because if you like novelty, and deadlines, and building things quickly, and having an impact, you get all of those things. You know, you get to work directly with journalists, helping to break news stories. And so I worked at The Guardian for a couple of years. And one of the things that happened there is I was working with this chap called Simon Rogers, who was the nerdy journalist who gathered the data for the infographics. And the other journalists didn't really understand him. They're like, yeah, he's in the newsroom, but he's always got Excel open, and he gets really excited about getting data from places. When I first met him, I was like, well, what do you do with all of this? He goes, oh, it's all on my desktop, and he points at the desktop computer on his desk, which it turned out had hundreds of beautifully crafted Excel spreadsheets about every fact about the world you could possibly imagine. And I'm like, okay, we need to introduce you to the web team up on the floor above. And so we started brainstorming: okay, how can we release some of this data? What's the best way for us to publish this? In the end, we went with the simplest thing that could possibly work, which is a blog, because The Guardian was very good at running blogs. And so we called it the Datablog. And every time we had data behind a story, we'd put something up on the Datablog about it. And then the data itself we published mainly using Google spreadsheets, because, this was back in 2010, it worked. You know, you can dump files into a Google spreadsheet, you can post a link to it, and other people can then pull the data out and start doing things with it. And this worked really well. We had a Flickr community of people building their own visualizations against the data.
It was pretty sort of revolutionary at the time, to have this mentality of: no, you publish the story, but you also publish as much of the data as you can as well.
Carlton Gibson 20:12
If I remember rightly, they used to expose, or they still expose, a JSON API around a Google spreadsheet,
Simon Willison 20:19
Yeah, Google Sheets has an API. There are API layers built in there that you can tap into. Most of the time, you click File, Export As, and you can get the data out that way as well, as CSV or something. Yeah. But I was always a little bit frustrated about this, because while it does have APIs of sorts, they're not the most convenient things to use. You know, I don't know many engineers who are thrilled at the chance to pull data out of a Google spreadsheet. So I always felt there should be a better way of doing this. I actually mucked around at The Guardian with CouchDB. I thought maybe CouchDB could be a way to publish these in a more reasonable format. And so anyway, a couple of years ago, I was thinking about Docker containers and some of these new hosting providers, like Zeit Now was one that was doing this at the time, which can host your Docker container for you. You know, essentially the sort of serverless Docker model. But the problem with all of these hosting providers is they don't let you do any writes, right? You can just serve static files, and you're expected to pay for a hosted database somewhere else. And I thought, hang on a second: if you're dealing with read-only data, the fact that you can't have writable data and you don't have a database doesn't matter anymore, because you've got SQLite. So you can bundle a bunch of data into a SQLite file, literally bake it into your Docker container, and ship it somewhere. And you've built an extremely fast, extremely powerful dynamic web application that just doesn't accept any writes at all. But the data is part of the deployment. And I started playing around with this idea, and it quickly turned into this open source project I've been running for a year and a half now, called Datasette. It's named after the Commodore 64 cassette player, which, as far as I can tell, their copyright on that term expired about 15 years ago.
So I believe the name is up for grabs, like, I hope that's true.
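The read-only idea described above can be sketched with Python's standard `sqlite3` module. This is an illustration rather than Datasette's own code: the table name, the sample rows, and the helper names `build_database` and `open_read_only` are made up, and a temporary directory stands in for the container image the file would be baked into.

```python
import os
import sqlite3
import tempfile


def build_database(path, rows):
    """Build step: write the data into a SQLite file that ships with the app."""
    conn = sqlite3.connect(path)
    with conn:
        conn.execute("CREATE TABLE facts (topic TEXT, value INTEGER)")
        conn.executemany("INSERT INTO facts VALUES (?, ?)", rows)
    conn.close()


def open_read_only(path):
    """Serve step: open the same file so that no statement can modify it."""
    return sqlite3.connect(f"file:{path}?mode=ro", uri=True)


# Hypothetical data; in the scenario above this file would be part of the deployment.
db_path = os.path.join(tempfile.mkdtemp(), "facts.db")
build_database(db_path, [("teams", 12), ("pitches", 7)])

conn = open_read_only(db_path)
total = conn.execute("SELECT sum(value) FROM facts").fetchone()[0]

try:
    conn.execute("DELETE FROM facts")
    write_blocked = False
except sqlite3.OperationalError:
    # SQLite refuses writes on a connection opened with mode=ro.
    write_blocked = True
```

Because the serving process never needs to write, the database file can travel with the application like any other static asset.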
Will Vincent 22:09
Oh, you're here for me? No.
Simon Willison 22:10
Yeah. So Datasette is a couple of things. It's a web application that sits on top of a SQLite database and exposes the whole thing. So you get a table view where you can look at the tables, and you can filter them and all of that sort of stuff. But then, more importantly, you can execute SQL queries by typing them into a form and clicking a button. And because I've opened the database file in immutable mode, or in read-only mode, you can't harm the database doing this. You know, allowing arbitrary SQL queries, that's the definition of a SQL injection attack, so most web applications will avoid that like the plague. It turns out that with SQLite, if you have a few safety precautions around it, it's actually okay. So SQL becomes your API language, which is interesting, because Datasette also offers a JSON API. So anything that you can see, you can get out as JSON, and you can export as CSV and stuff as well. And it means that you can take any database you like and dump a JSON API on top of it, which accepts SQL over a query string as the query language. And I've been cracking jokes about GraphQL. People are like, that sounds terrible. Yeah, well, everyone's excited about GraphQL; this is just SQL, as GraphQL from the 1970s. And it turns out it works really well. So that was the sort of initial idea: publish data in a way that is super, super quick and cheap to get it published, and other people can make any API-shaped query they like. Then the other part of the ecosystem I've been building is the ways of publishing that data. So out of the box, Datasette has a command line tool for publishing. You can say datasette publish heroku mydatabase.db, and you hit Enter, and it creates an app on Heroku and uploads your database to it, and you're done. It spits out a URL. It's got two other providers.
There's Zeit Now, on their v1 platform, which they don't let people sign up for anymore, which is a bit of a shame. And then Google Cloud Run I now have support for, which only got launched back in April. And somebody sent me a pull request for Datasette which implements Google Cloud Run. So, the joys of open source: I didn't even have to
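The "SQL over a query string, JSON back" idea can be sketched in a few lines of standard-library Python. This is a toy under stated assumptions, not how Datasette is implemented: the `/demo.json` path, the `teams` table, and the function names are hypothetical, and `PRAGMA query_only` stands in for whatever safety precautions a real deployment would use (which would likely include more, such as query time limits).

```python
import json
import sqlite3
from urllib.parse import parse_qs, urlencode, urlparse


def make_demo_connection():
    """Create a small in-memory database, then lock it against writes."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE teams (name TEXT, wins INTEGER)")
    conn.executemany("INSERT INTO teams VALUES (?, ?)", [("Hawks", 5), ("Jays", 3)])
    conn.commit()
    # query_only makes this connection reject any statement that writes.
    conn.execute("PRAGMA query_only = ON")
    return conn


def answer(conn, url):
    """Run the SQL found in the ?sql= parameter and return the rows as JSON."""
    sql = parse_qs(urlparse(url).query)["sql"][0]
    cursor = conn.execute(sql)
    columns = [col[0] for col in cursor.description]
    rows = [dict(zip(columns, row)) for row in cursor.fetchall()]
    return json.dumps({"columns": columns, "rows": rows})


conn = make_demo_connection()
url = "/demo.json?" + urlencode({"sql": "select name from teams where wins > 4"})
result = answer(conn, url)
```

A client can then shape its own query without the server defining an endpoint for it, while a write attempt such as `delete from teams` raises `sqlite3.OperationalError` instead of changing the data.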
Will Vincent 24:20
do anything. Yeah, so is Zeit version two gonna support it, or
Simon Willison 24:26
It's tricky. So Zeit version two no longer uses Docker. It's all about lambda functions. And that's tricky for Datasette, because Datasette was sort of built around the idea that you do things with Docker. I think I can get it running on Zeit v2. The other problem is that AWS Lambda's version of Python didn't used to package SQLite, because they assumed nobody would want to use SQLite in a serverless environment. And now their Python 3.7 does have SQLite support. So it is feasible, but there's a whole bunch of things I need to unspool and get working to have Datasette fit into those lambda environments. And so I want to get it done. I think actually the biggest project I've got on right now is porting Datasette to ASGI. We pronounce it ASGI, right? So,
Carlton Gibson 25:13
Yeah, I mean, pronouncing ASGI always sounds stupid. But WSGI is "whiskey", so it's ASGI.
Simon Willison 25:19
It's ASGI, right. So once I get Datasette ported to ASGI, there are various mechanisms for getting ASGI running on Zeit Now version two that I want to have a play with.
Will Vincent 25:31
Yeah, I've been hearing you talk about Datasette, but I hadn't actually played with it until preparing for this interview. And when I teach SQL to beginners, you know, using Datasette to learn how to do basic SQL would be the perfect playground. It's not just made-up stuff, and it's read only. And so I'm going to start using that, because I've seen that there's a ton of amazing open source data sets, and it's kind of built for playing and exploring just SQL itself.
Simon Willison 26:00
That's one of those accidental use cases. You know, I was not intentionally building a SQL learning tool, but actually, no, it totally works for that. And I'm really excited to see that kind of thing being explored more, because, yeah, SQLite's variant of SQL is almost entirely based on the SQL standard. So it is a very good tool for learning SQL. And these days it's got, like, common table expressions, and window functions just made it into SQLite about six months ago. So there's a whole bunch of more advanced SQL stuff that you can start playing with.
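As a small illustration of the more advanced SQL mentioned here, the sketch below uses a common table expression and a window function from Python's built-in `sqlite3` module. The `games` table and its rows are invented for the example, and the window function assumes the SQLite library bundled with Python is version 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE games (team TEXT, week INTEGER, runs INTEGER)")
conn.executemany(
    "INSERT INTO games VALUES (?, ?, ?)",
    [("Hawks", 1, 4), ("Hawks", 2, 6), ("Jays", 1, 3), ("Jays", 2, 8)],
)

query = """
WITH scored AS (              -- common table expression: a named subquery
    SELECT team, week, runs FROM games
)
SELECT team,
       week,
       -- window function: running total of runs within each team
       sum(runs) OVER (PARTITION BY team ORDER BY week) AS running_total
FROM scored
ORDER BY team, week
"""
rows = conn.execute(query).fetchall()
```

Each row keeps its own week while also carrying the cumulative total for its team, which is the kind of result that is awkward to express without window functions.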
Will Vincent 26:33
Yeah, well, actually, it's probably relevant, because I have a site, sqljs.org, which is just a wrapper around an open source JavaScript implementation of SQL that I use when I teach, and that more and more teachers are starting to use, because you don't actually have to install SQL to run SQL. But yeah, I should look into it, because there are ways to load in database files. That would be really nifty, to not actually have to run SQLite itself, which
Simon Willison 26:59
sql.js, that's the one that's SQLite compiled to JavaScript, right? Yes. Yes. So that thing's really interesting.
Will Vincent 27:07
Yeah. So this site I have, I basically just did the UI on top of the original code. But yeah, well, in the same way that, you know, what is it, for Python too, there's Python, Python, I never say it right. You know, there are these JavaScript implementations as well that are most of the way there, and certainly from a learning perspective that solves a lot of problems. Web
Simon Willison 27:30
Assembly, this stuff gets super interesting. So there's a project which does the entire Python data stack, so Jupyter notebooks and NumPy and everything, all compiled down to WebAssembly and running in a browser, which is an astonishing achievement. You know, I didn't realize it was that advanced yet. I've been following WebAssembly pretty closely recently. You know, originally it was like, okay, well, I guess that's kind of interesting, but I don't really see the practical applications for somebody like myself. And now I'm seeing all kinds of really interesting practical applications of it. And my favorite example is something the Google Chrome developer evangelism team put out called Squoosh, squoosh.app, which is basically an image compression website. So you drag a JPEG on, and it gives you a better, like more compressed, version of that JPEG. And it does PNGs and GIFs and JPEGs. But the real magic is the way it works: they took the best-in-class C libraries for JPEG compression and PNG, so OptiPNG and all of these different things, compiled them into WebAssembly, and they run them in your browser. So now your browser has the best-in-class implementations of compression for, like, three or four different formats. You literally drag an image onto this page, and it shows you a preview of before and after, with a little slider you can slide back and forth. The interface is brilliant, but the fact that it's doing the best-in-class compression algorithms for all of these formats by running them in WebAssembly, I thought was absolutely fascinating.
Will Vincent 29:02
Yeah, I mean, as an outsider, it has an option to run offline, so presumably it just loads it once and then it's there. I saw that when I went to the site.
Carlton Gibson 29:12
Yeah, it's a brilliant piece of engineering. For me as well, this idea that, as you say, you can compile best-in-class programs and deliver them over the browser. It's kind of like the great hope of the internet and the web browser, where you're delivering software over the internet to be run. And the trouble with that has been that web applications have historically been not very good, or not as good as, you know, desktop applications. But if that changes, and WebAssembly can allow that to change, then it's the future, isn't it?
Simon Willison 29:41
The two exciting angles I have on this are: firstly, there's now a Python library that can run WebAssembly from Python. So if somebody compiles something for a browser, you can download it onto your computer, you can import it into a Python process and make calls to it over the interface. That's just really exciting. So now, I haven't done it yet, but I could potentially run SQLite, the WebAssembly version, inside of Python without having to compile extensions and all of that kind of thing. So I think that's really cool. And then Fastly, the CDN, has been looking at getting WebAssembly running sandboxed on their CDN edges. So you can write a program in anything that compiles to WebAssembly, compile it, deploy it to, what, 50 points of presence around the world. And now you've got sandboxed, incredibly fast, stateless code running on the CDN edge, which again, that's revolutionary. There are some very exciting things we can build with that.
Carlton Gibson 30:40
That's really cool. That's really cool. And how does this tie into... you've got a journalism fellowship, right, at Stanford? What's...
Will Vincent 30:47
What's the story? That's coming up, right? The JS... Yeah,
Simon Willison 30:49
this is the JSK fellowship, which is associated with Stanford, the Stanford journalism school. Basically it's a fellowship program where they get 19 people a year, and they pay you to spend a year at Stanford working on projects, essentially to advance the cause of journalism. And I applied for this essentially with the Datasette project, right? Because the idea behind Datasette is helping newspapers publish the data behind their stories. But it also provides much more powerful local analysis and visualization tools that journalists can use to analyze the data that's coming back from all of these different places. And really, what it comes down to is, if you look at newspapers like the LA Times, the New York Times, the Washington Post, they do incredible database reporting, because they can afford programmers, right? They have the funds to hire teams of experts who can work on these things. If you're a smaller local newspaper, you're having enough trouble staying afloat as it is, you're not going to be hiring programmers. So my pitch for the fellowship was, how can we build an open source ecosystem of tools that help local newspapers deliver the same kind of reporting, take on the same class of projects as these much larger publications who can afford the programmers? And that doesn't just mean Datasette. Datasette is my first foray into this world. But I plan to spend the first couple of months in the fellowship talking to as many journalists and as many local news organizations as possible, figuring out, okay, what are the tools that, if we were to provide them, would allow you to punch above your weight and give you a big boost in terms of covering some of these stories?
Carlton Gibson 32:26
And you talk about Datasette, and you say "just", but there's a whole load of little tools around it, right, that you've pulled out. So,
Simon Willison 32:34
Um, so Datasette is the thing that publishes your database, and it lets people explore your database. And of course, if you're going to do that, it's SQLite files, so you need a way of getting data into SQLite in the first place. And so I've been experimenting with a bunch of different ways of doing that. The first tool I built was called csvs-to-sqlite, and it takes CSV files and turns them into SQLite tables. Because, you know, everyone publishes their data as CSV. CSV has its flaws as a format, but it is universally understood, and there's a huge amount of data that's being published in that way. So csvs-to-sqlite can do a bunch of interesting things. One of my favorite features of it is if you point it at a directory, you just say csvs-to-sqlite, the name of the directory, and then filename.db, it will recursively loop through the directory structure, finding every CSV file, and convert those into tables. So you can run it against a nested folder of 400 CSV files, and you'll get a database with 400 tables in it. And that works really well.
Carlton Gibson 33:34
You can then immediately serve that as a web API.
Simon Willison 33:37
Exactly, yes. So the big demo I've got is FiveThirtyEight, the blog who do a lot of data reporting about sports and journalism and all sorts of things. They publish all of the data behind their stories in a big GitHub repository with like 400 CSV files in it. I've got a Travis CI job which, once every 24 hours, grabs the entire lot, just pulls the full repo, converts it all into a SQLite database and publishes that with Datasette. So that's always my go-to demo for what Datasette is: take a look at fivethirtyeight.datasettes.com, that's datasettes with an S on the end. And you can start exploring all sorts of things. My favorite dataset that they have is that TV show where that chap paints mountains and trees and lakes. Bob Ross, Bob Ross's Joy of Painting. They've got every episode of his show with "did he paint a cloud", "did he paint a mountain",
Unknown Speaker 34:30
clouds. Yeah.
Simon Willison 34:31
And so you can run a query that says, give me back every episode where he painted at least one cloud and at least one mountain or at least one hats. And that's just nice, you know? And now that's available as a JSON API, should you want to build the ultimate JavaScript exploration of the Bob Ross series.
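Editor's note: the CSV-to-SQLite conversion and the "cloud and mountain" query described above can be sketched with nothing but Python's standard library. This is a minimal illustration, not the code of csvs-to-sqlite or Datasette. The table name, column names, and sample rows below are hypothetical stand-ins for the real FiveThirtyEight data, and every value is stored as text for simplicity, so the query casts the flag columns to integers.

```python
import csv
import io
import sqlite3

# Hypothetical sample data: the real dataset's columns and values may differ.
SAMPLE_CSV = """episode,title,clouds,mountain
E01,Sample One,0,0
E02,Sample Two,1,1
E03,Sample Three,0,1
E04,Sample Four,1,0
"""


def load_csv(conn, table, text):
    """Create a table named `table` from CSV text, one column per header field."""
    rows = list(csv.reader(io.StringIO(text)))
    header, data = rows[0], rows[1:]
    columns = ", ".join(f'"{name}" TEXT' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', data)


def episodes_with_cloud_and_mountain(conn):
    """Return (episode, title) pairs with at least one cloud and one mountain."""
    query = """
        SELECT episode, title
        FROM episodes
        WHERE CAST(clouds AS INTEGER) >= 1
          AND CAST(mountain AS INTEGER) >= 1
        ORDER BY episode
    """
    return conn.execute(query).fetchall()
```

To try it, open an in-memory database with `sqlite3.connect(":memory:")`, call `load_csv(conn, "episodes", SAMPLE_CSV)`, and then call `episodes_with_cloud_and_mountain(conn)`. A tool that walks a directory would repeat `load_csv` once per file it finds.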
Carlton Gibson 34:49
I've got an idea for now.
Will Vincent 34:52
I had a question about that. So in terms of these tools for journalists, on the front end, I've been watching Observable, which is, speaking of talented people out of the New York Times and stuff. Is that one that you're thinking of, or are you aware of others that are, you know, sort of built-in front ends that can be applied on top of that?
Simon Willison 35:08
I'm absolutely obsessed with Observable. So for anyone who hasn't played with Observable yet, it's like what Jupyter notebooks do for Python, Observable does for JavaScript, with a couple of differences. It's only available as a hosted platform, so you go to observablehq.com and you can start playing. But it's fully reactive. The one confusing thing about Jupyter is it runs the cells in the order that you executed them, so it's easy to end up with a notebook where everything's sort of jumbled together. In Observable, any cell that depends on another cell will automatically re-execute when the first cell has changed, kind of like an Excel spreadsheet. And it means you can build really interesting interactive tools where you muck around with a slider at the top of the page and it's updating a map at the bottom of the page, those kinds of things. It's by Mike Bostock, who invented D3 and is one of the three developers. So it's
Will Vincent 36:01
like Tom and I forget who the third is.
Simon Willison 36:03
Jeremy, who did CoffeeScript. So it's called
Will Vincent 36:07
Tom, that guy
Simon Willison 36:08
who works on a lot of the Mapbox stuff. I think he worked on Mapbox GL and things. So yeah, the team behind it is astonishingly well suited to building this project. So I've done a few bits with Observable where I use Datasette as the back-end API, pull the data into an Observable notebook and do visualizations there. That works really well, and I think there's a lot of potential for that kind of stuff. But then the other thing I'm trying to do with Datasette is it has a plugins mechanism. And so the idea is I want an ecosystem of plugins that can do any kind of visualization you can imagine. At the moment, the two best visualization plugins are, I mean, almost all the plugins are by me at the moment. You know, when somebody else writes a plugin, that's a huge win when that happens. But I've got a plugin called datasette-cluster-map. What that does is it looks at your data, and if it finds a latitude and longitude column, it draws a map and loads the points on. And it does, you know, the clustering thing where you'll get a five that you can click to zoom in and see all five points, that kind of thing. And that's using the Leaflet JavaScript library for the map clustering, but it works amazingly well. It turns out in 2019, a browser will happily display 200,000 points on a map with the right library. So I've got a 200,000 point dataset of every tree in San Francisco, which,
Will Vincent 37:29
oh, yeah, I've seen you
Simon Willison 37:30
The city of San Francisco released this CSV file of all of the trees in San Francisco that are managed by the Department of Public Works. And so this plugin will draw 200,000 trees on a map, and then you can zoom in and see all of the individual trees and their species and when they were planted, and so forth. And then I've got another plugin called datasette-vega, which uses the Vega visualization library to essentially do bar charts and line charts and scatter plots. And I'm trying to get it so that these things kick in automatically. So the idea is they'll analyze your data and go, oh, it looks like this could be graphed against this, here's a quick preview of a graph, click here to expand it. And that's really fun. It's also a way for me to muck around with some of the more advanced JavaScript visualization things that are going on. And again, I'm hoping I can convince other people to start building plugins for all of the different types of visualizations people might want to do. I'm sure you can, because you're picking up lots of interest now. You know, it comes up. I see you tweeting about it, I see other people tweeting, right. Well, you've been putting it on Glitch recently, which is
Carlton Gibson 38:34
Yeah, Glitch, the Trello company, that...
Simon Willison 38:40
Yeah, Fog Creek Software. So they're responsible for, I mean, Stack Overflow was a partnership with people from Fog Creek. And then there was Trello, which they sold to Atlassian. And their new focus is Glitch, which is this phenomenal learning environment for programming. You know how the absolute worst thing about learning to program is setting up your development environment? I've been doing this for 20 years, and I still have trouble setting up a development environment for anything that's even slightly different from what I normally work on. So Glitch's thing is it's entirely browser based. You literally click a button on glitch.com and it gives you an environment with a running web server, with an editor built into your browser. You can edit the code. It's got Git running, but you don't have to know about it, so it just constantly snapshots where you've got to.
Carlton Gibson 39:27
But that would mean that you could clone it if you wanted to.
Simon Willison 39:30
And they don't call it cloning. They call it remixing. So I can go to any project on Glitch, and I can click the remix button, and I've got my own copy of that project and I can start mucking around with it. So as a learning environment, as a community of people, it's phenomenal. And the thing I got working with Datasette a few weeks ago is I've got an example project on Glitch which you can remix. It's called datasette-csvs. You can remix it, you drag a CSV file onto your browser, and it will convert that CSV file into SQLite and serve it instantly through Datasette. So it literally is drag and drop to create an API for your data, and this interface to explore your data. And I'm so excited about this thing. As a demo, I've been gleefully doing this demo where I get people to run the demo themselves, and I don't even touch their laptop. I'm like, go to this URL, download this file of Seattle public art, drag it onto your browser. Look, you made a website, you made a map of this thing. That's like code demo Blackwelder. Oh, man, it's so much fun. It's such a great demo to hang on
Carlton Gibson 40:36
to you.
Because when I looked at it originally, it was JavaScript only.
Simon Willison 40:43
Well, it turns out Glitch has had Python support from day one, but they never really documented it. You know, they made the reasonable decision to focus on one language for all of their material around it. But it's actually running a Docker container with Ubuntu in it, I think. I'm pretty sure it's definitely containers, I think it's Docker containers. And so out of the box they give you Python 2 and Python 3. It's running Python 3.5, they don't have anything more recent than that just yet. And they actually have PHP on there as well. There are examples out there for how to run PHP on Glitch. And so you can install anything by running pip3 install something --user. You have to use the --user option or it doesn't work. And they have this configuration file that lets you say, forget about the Node.js thing, I want you to run this process and bind it to port 3000, and then route traffic to it. And so once you know this, and it took quite a bit of digging around in their forums, where people had reverse engineered it and figured it out, you can run any Python thing on there that you like, which is really exciting. So getting Datasette on there was way easier than I thought it was going to be, once I found the relevant forum posts.
Carlton Gibson 41:55
But now you've found those relevant forum posts, presumably you've created Python starter examples which can be remixed,
Simon Willison 42:03
right? Yeah. And there's a Flask one around already. I've got a couple: I've got a Datasette basic one, and then I've got this magic Datasette one that does the CSV conversion and stuff. And yeah, it's just a case of clicking remix. You don't even have to log into Glitch to play with it. You can remix any project as an anonymous user. They'll delete it five days later, but that's fine. You know, it's enough for you to click remix on something and start mucking around and get a feel for how it works. So yeah, it's my go-to demo for Datasette now. And the Datasette docs now recommend that people use Glitch as the first place to start playing with it.
Carlton Gibson 42:38
That's really cool. It seems like a perfect environment for that. But would you deploy an actual application on glitch?
Simon Willison 42:44
So the only downside of Glitch at the moment is the apps go to sleep, because they've got millions of apps running right now. And so on the first hit, you have to sit through a loading screen while the app wakes up. They also limit you to 4,000 requests an hour, which is about one request per second, which is fine for just you. But if you put it on a high traffic website, it would fall apart. I'm pretty sure they're going to introduce a commercial, pay some money and now your app won't...
Carlton Gibson 43:12
This was my next question, which is good. So what's the monetization?
Simon Willison 43:15
They haven't announced it yet. My best guess is it's going to be... well, I don't know. I think, because they've done Stack Overflow and Trello in the past, they have a very good idea of how to make these things sustainable. So they've got a teams product, which is currently free, but presumably they'll start charging for that. And again, I'm assuming that they'll start charging you for permanently keeping your projects online. So at the moment, I wouldn't recommend it for more than sort of small side projects and things where you're playing with it. But I'm very much looking forward to them having an "enter your credit card details and keep your thing running" option, because that's the point when I can tell journalists that they should use it. At the moment, you can't tell a newspaper to host on Glitch. Their thing is going to go offline if they get a spike of traffic.
Carlton Gibson 44:02
But um, yeah, and the deployment story there is just so sweet compared to anything else. So, you know, we can talk about serverless, but getting something running on Lambda is non-trivial, provisioning a VM is non-trivial, deploying Docker containers is non-trivial. Yep. If you've got something like that, it would be amazing. Yeah.
Simon Willison 44:20
I mean, it's such a clever product, the way they've built it. The way it works is absolutely fascinating. And they've done such a great job of having this community around it of people who are teaching each other to program. So yeah, I feel like Datasette itself has had a bit of a tipping point within the last month, partly because I got it running on Glitch, so I had a much more compelling demo, and also the talk I gave at PyCon. The feedback I got after that talk was from a bunch of people who I really respect, who'd been aware of Datasette, but, you know, it was Simon's side project for a year and a half, and now it feels like, okay, this thing's sticking around, you know, this is something that we can trust to keep on developing. And I've got the fellowship as well, so that'll give me 10 months of working on it. But yeah, it feels like I'm now getting serious attention from people who previously were aware of it, and it was on their radar, but it wasn't something they were going to really commit effort to exploring.
Will Vincent 45:20
Yeah. So if we can, let's speak for a little bit about your day job, because there's an interesting story there. You're at Eventbrite, which is one of the largest Django sites in the world. That's probably true. Yeah. Right. In terms of traffic. So I'm curious. And I know you had founded Lanyrd, which was acquired. I'd love for you to maybe speak briefly about that. And then, you know, what is it like working on a Django site at scale? Because there's still sometimes a perception that Django doesn't scale, despite Eventbrite, Instagram. So what does it look like, you know, actually working on a large Django site? Sure.
Simon Willison 45:56
Okay, so I'll give you the very quick startup story. I was at the Guardian, 2009, 2010, and I married my wife, Natalie, and she married me, which is the point. But we decided that we wanted to go off on honeymoon for as long as we possibly could. And so, you know, quit our jobs, give up our apartment, just travel the world, take laptops with us and work on freelancing projects and stuff to try and keep us going for as long as possible. And so we did this, and we managed six months of traveling, which is great, you know, that's a pretty fantastic honeymoon. We were traveling mostly through Europe, and then Morocco and Egypt. But the freelancing work was difficult, because when you're traveling and trying to freelance at the same time, you'll find that, you know, the client hasn't sent you the thing that you needed, and you've got an hour now, but they haven't sent you the thing. So okay, we'll get to it tomorrow, that kind of thing. And meanwhile, we were batting around ideas for side projects together. We had this one idea that would be a website to make sure that we didn't miss out on the great conferences and events that our friends were going to. And then we got to Casablanca in Morocco, and we got food poisoning. We were really ill, and it was during Ramadan, and Casablanca was not on the main tourist trail. So during Ramadan, everything's shut down, you know, none of the restaurants were open, and so forth. So we said, okay, well, we'll rent ourselves an apartment for two weeks, and we'll try and cook ourselves better and look after ourselves. And I guess we'll work on one of these side projects that we were thinking about. So we picked the conference website and built the first version. It was very, very scrappy, you know, we knocked it together in two weeks, and we put it live.
And the key feature when it launched was, you sign in with Twitter, and it shows you the events that the people you follow on Twitter are going to or speaking at. And that's it, that was all that it did. But it turns out that was a really compelling thing to offer, partly because it's a bit of a cheat, right? The thing about Twitter is people who speak at conferences really love using Twitter, and they've got lots of followers. Back when Lanyrd launched, Natalie and I were the only people who had added any data to it, but it had over 100 speaker profiles on it, because there were over 100 people who we knew who were speaking at conferences. And so if you signed in, and you were following any of those 100 very prominent people, we'd recommend an event to you. And it was like magic, you know, you click the blue button, and it goes, oh, you're following Jeffrey Zeldman, and he's speaking at an event that you might like. It's like it looked deep into my soul and figured out everything that I need to know about the world. So because of that, and because Twitter is naturally viral, it started taking off way, way faster than we'd ever expected it to. So we're trying to travel around Morocco and have fun on our honeymoon, and this thing is breaking servers, and it just won't stop growing. Yeah, exactly. So we made it for about two months, so Morocco and then Egypt. And then we were like, you know what? People keep on getting in touch with us and saying, hey, I need to talk to your support department. And we're like, it's just us and our laptops. This is a lot bigger than we thought it was going to get. So we ended up applying for the Y Combinator startup accelerator from Egypt. We were in Luxor, was it? Yeah, we were in Luxor on the Nile.
When we filed our application, part of the YC application is that you have to have a one minute long video. And so we carefully positioned an Egyptian temple in the background of our video, but didn't mention it. We thought we would play it cool. And actually, I think that video is available if you want to link to it from the show notes. So we got accepted into Y Combinator and sort of cut our honeymoon short, moved to Mountain View for the three month Y Combinator program. And then after that, we moved back to London, and we did the whole startup thing. We raised money, we hired people, we got an office. We spent a solid two and a half years in London, growing the company and trying to get the business model working and all those kinds of things. And then we got to the point where we either needed to raise a Series A, or we needed to get acquired. And so we started looking at what the acquisition options would look like. Were there companies that would make a good fit for this? And we'd already had conversations with Eventbrite in the past. And it turned out that was the point of their growth where they needed to really bulk up their Django engineering team. They had a fantastic team, but it was pretty small, and they needed to really be accelerating the growth of the engineering team. And we thought there were good alignments in the product as well, with some of the things that Eventbrite were looking at doing. And so yeah, we negotiated an acquisition and moved the entire team to California, so that was six team members plus families, 11 people total, none of whom have gone back to England. So it did work out well for the employees, which is good. People were happy with the California lifestyle. And yeah, I've been working at Eventbrite ever since.
Will Vincent 51:08
Wow. Well, Carlton can agree with leaving England for sunny weather, right? Mm hmm.
Carlton Gibson 51:13
Yeah. done for any number of reasons at the moment.
Will Vincent 51:17
So what does it look like day to day? I mean, because at Eventbrite, you're adding new features, but you're not spending all your time on features. I imagine a lot of it is around scaling. And so...
Simon Willison 51:28
I'll be honest, once you're at the scale of Eventbrite, I don't think the web framework really matters that much, right? You can scale PHP, you can scale Rails, you can scale Django. It all fundamentally ends up as the same sort of shared-nothing architecture, where you've got a bunch of different application servers, and they're talking to some replicated databases. If you need to handle more web traffic, you fire up more application servers running a copy of your stack.
Carlton Gibson 51:53
And you put caching layers in
Simon Willison 51:55
exactly
Carlton Gibson 51:57
like queries in
Simon Willison 51:58
Excel. And so the techniques are pretty universal across different stacks. Django at Eventbrite, these days, we mainly use it for the... so we never used the Django templating language. They made a decision before I joined the company to use Mako, I think because it compiles down to Python, and they thought there'd be a performance increase. I'm not a huge fan of Mako. My problem with it is that it's so easy to accidentally embed business logic in Python inside of your Mako templates. You know, it's very sort of PHP-like at that point. Although these days most of Eventbrite's new features are React components that are rendered client and server side. So today, most Eventbrite features will be a JSON API, and then a React component, which is rendered server side for that sort of initial hit, and then client side after that.
Will Vincent 52:45
And are you using REST framework, or doing something custom, for the APIs?
Simon Willison 52:49
So Django REST framework is the API layer at the front. But internally, we've been migrating Eventbrite to a microservices model, and for the microservices within Eventbrite we have our own protocol we call PySOA. It's a sort of service-oriented architecture in Python using Redis as a message bus. So you basically build MessagePack messages, stick them in a Redis queue, a service at the other end reads off the queue, does the work, and sends you the message back again. And we've also got an older SOA mechanism from a few years ago. And the classic problem with a site at the scale of Eventbrite is, anytime you transition technology, finishing is really hard. You know, you can end up with 95% of your stuff using the new thing, but there's still this one old bit that's on the old thing, which is terrible. And we're very much trying to develop an engineering culture where we don't let that happen.
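Editor's note: the request and reply flow described above (serialize a message, push it onto a queue, let a service pop it, do the work, and push a reply back) can be sketched as follows. This is not PySOA's actual API. As stated assumptions, `queue.Queue` stands in for the Redis lists and `json` stands in for MessagePack so the sketch runs with the standard library alone; every function name here is hypothetical.

```python
import json
import queue
import threading

# Stand-ins for the two Redis lists: one for requests, one for replies.
request_queue = queue.Queue()
response_queue = queue.Queue()


def encode(message):
    """Serialize a message to bytes (JSON here, MessagePack in the real system)."""
    return json.dumps(message).encode("utf-8")


def decode(raw):
    """Deserialize bytes back into a message."""
    return json.loads(raw.decode("utf-8"))


def service_worker():
    """Read jobs off the request queue, do the work, and send replies back."""
    while True:
        raw = request_queue.get()
        if raw is None:  # sentinel value: shut the worker down
            break
        job = decode(raw)
        total = job["body"]["a"] + job["body"]["b"]
        response_queue.put(encode({"id": job["id"], "result": total}))


def call_service(job_id, body):
    """Client side: enqueue a request and block until the reply arrives."""
    request_queue.put(encode({"id": job_id, "body": body}))
    return decode(response_queue.get(timeout=5))


def run_demo():
    """Start one worker, make one call, then stop the worker."""
    worker = threading.Thread(target=service_worker, daemon=True)
    worker.start()
    reply = call_service(1, {"a": 2, "b": 3})
    request_queue.put(None)
    worker.join(timeout=5)
    return reply
```

In a Redis-backed version the client and the service would be separate processes, and the reply would normally go to a queue that belongs to the caller so that concurrent clients do not read each other's responses.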
Carlton Gibson 53:45
That means you never get to retire the old thing.
Simon Willison 53:47
Exactly. They've got to properly use, right, absolutely. And so we did. We retired API v1 last year. We're on API v3 now. API v1 retired several years after we said it was going to, but we did retire it. And this was a huge milestone. This is, hey, look, it is possible for us to turn off these things. And the Eventbrite engineering team is so much larger now than it was when we joined the company. We've got engineering spread across five cities on three continents, which is very exciting. So we've been having to figure out how to work as an international-first distributed engineering team. We've got engineering in San Francisco and Nashville in the US. Mendoza in Argentina is one of our largest engineering offices now. We've got Vancouver, and then we also have engineering in Spain. So we've got offices in vn and Madrid. And yeah, that's a really interesting challenge, figuring out how to productively work with an engineering team spread across that many areas and time zones. That sort
Carlton Gibson 54:51
of Western Hemisphere that Americas thing at least on relatively similar time zones, you throw in Spain, you're like, well, that's like 10 hours out.
Simon Willison 54:58
Yes, that's insane. And also, given Spanish working hours, there's basically no overlap between the San Francisco office and the Spanish office. But what we do have is Mendoza in Argentina, which is closer to Spain in terms of time zones, so they do have some overlap. And they also speak Spanish, so we've got a similar working culture. Yeah. So Mendoza has almost become the center of gravity for how engineering works, because they tie the two hemispheres together, which is a really interesting development.
Carlton Gibson 55:30
You mentioned you're using Redis. Can I ask, are you using the new streams?
Simon Willison 55:35
We're not yet
Carlton Gibson 55:36
we were on the roadmap, because it looks really exciting, but I haven't had the chance to dive into it. And yeah,
Simon Willison 55:41
I don't think so. I think we've explored it a little bit. The stuff we're doing is built on top of blocking list operations, you know, at the moment. The thing I'm personally excited about is the Redis Streams stuff, because I love Kafka, except for the fact that you have to set it up, which is a nightmare. And so we're running Kafka because we've got an entire ops team who can support Kafka. But for my side projects, there's no way I can get Kafka running. And if you look at Heroku Kafka, it starts at like $50 a month. So it's not easy to pace. But Redis Streams gives you the same primitives, and it's Redis, which I can run anywhere and is already available on all of my projects. So yeah, I'm really looking forward to exploring that for some of these smaller projects.
Will Vincent 56:27
And then as you look ahead, I mean, because I know your your colleague, Andrew Godwin is doing a lot of the async stuff in Django. Do you see any of that impacting the architecture of Eventbrite, assuming it happens, or is it separate?
Simon Willison 56:40
I think it's very possible. So Eventbrite runs a sort of API gateway at the front: API requests come in, we turn them into our internal service calls, pass them out through different services, they come back, and we send them back out the front. Right now that API framework is based on uWSGI, so it's lots and lots of uWSGI workers that handle traffic. And it's easy enough to scale. You know, you fire up more boxes running uWSGI instances. I think for API framework stuff, that's where Python 3 async really shines, because most of it is just I/O, right? You get a request, you send it off to a service, you wait for like half a second for that to return, and then you feed it back out again. Blocking a whole worker for that feels really wasteful to me. So I wouldn't be surprised if the first application of Python 3 async at Eventbrite ended up being one of these gateways, these sort of proxies that sit at the front.
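Editor's note: the point about I/O-bound gateways can be illustrated with a small asyncio sketch. It is not Eventbrite's gateway. `asyncio.sleep` stands in for the wait on a backend service, and the function names are hypothetical. The idea it shows is that many in-flight requests can share one event loop while they wait, instead of each one occupying a blocked worker.

```python
import asyncio
import time


async def call_service(name, delay):
    """Pretend to call a backend service; the sleep stands in for network wait."""
    await asyncio.sleep(delay)
    return f"{name}: ok"


async def handle_requests(count, delay):
    """Handle `count` requests concurrently on a single event loop."""
    calls = (call_service(f"request-{i}", delay) for i in range(count))
    return await asyncio.gather(*calls)


def timed_run(count=20, delay=0.1):
    """Run the gateway sketch and report how long it took in seconds."""
    start = time.perf_counter()
    results = asyncio.run(handle_requests(count, delay))
    return results, time.perf_counter() - start
```

With 20 requests that each wait 0.1 seconds, handling them one after another would take about two seconds, whereas the concurrent version should finish in roughly the time of a single wait.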
Carlton Gibson 57:36
And, well, it strikes me that underneath the ASGI design, there's kind of a message bus waiting to happen. Like, oh yeah, the way that you can nest the applications and everything's ASGI all the way down.
Simon Willison 57:49
Andrew designed our internal SOA mechanism. So he built out Eventbrite's internal SOA mechanism, which works on Redis, while he was working on both Channels and ASGI. So the design of the two systems definitely informed each other. I think there's a whole bunch of ideas from ASGI that we're now running inside of Eventbrite, and vice versa. So yeah, you know, in 2019, the way you build software is lots and lots of little services and API requests and so on. And async is such a natural fit for that style of development.
Will Vincent 58:24
I think we're almost at the end of our time here. Are there any last projects or things you want to mention that we haven't already covered? You do so many things. It's really inspiring hearing you talk, and you're so excited about them, too. I mean, you sound like the anti-burnout engineer.
Simon Willison 58:40
I'm really excited about ASGI in general. I released my first piece of ASGI middleware a couple of weeks ago, basically to sort of test the waters of what ASGI looks like. It's asgi-cors, so it's a middleware for adding CORS (cross-origin resource sharing) headers to an ASGI project. But my real ambition with ASGI is that I'm getting a version of Datasette working: the next version of Datasette will be on top of ASGI. At the moment it's Sanic, which is a custom web framework. And then I want the plugin system in Datasette to allow you to add plugins which are basically ASGI middlewares. So Datasette authentication is going to be an ASGI middleware, and Datasette CORS, and all of these different features that I want to build. And that's made me realize that because ASGI is turtles all the way down, the interaction between ASGI and plugin systems is really interesting. I'm hoping that we'll get lots and lots of ASGI middleware out there. Datasette uses pluggy, which is the plugin library that the pytest team built, and you can use that to basically make composable web applications by throwing together a bunch of weird little plugins. It's kind of like the Django reusable apps idea, but at a slightly different level in the stack, right? Yeah, that I'm finding really interesting.
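A minimal sketch of what a CORS-adding ASGI middleware can look like, written against the plain ASGI callable interface with no third-party packages. The function name `add_cors` and the tiny `hello_app` are hypothetical and are not the asgi-cors API; they only illustrate how one ASGI app can wrap another, which is the "turtles all the way down" property.

```python
import asyncio


def add_cors(app, allow_origin="*"):
    """Wrap an ASGI app so its HTTP responses carry an Access-Control-Allow-Origin header."""

    async def wrapped(scope, receive, send):
        if scope["type"] != "http":
            # Pass non-HTTP traffic (websocket, lifespan) straight through.
            await app(scope, receive, send)
            return

        async def send_with_cors(message):
            if message["type"] == "http.response.start":
                headers = list(message.get("headers", []))
                headers.append(
                    (b"access-control-allow-origin", allow_origin.encode("ascii"))
                )
                message = {**message, "headers": headers}
            await send(message)

        await app(scope, receive, send_with_cors)

    # The wrapper is itself an ASGI app, so it can be wrapped again or mounted elsewhere.
    return wrapped


async def hello_app(scope, receive, send):
    # Hypothetical inner application used only for the demonstration.
    await send(
        {
            "type": "http.response.start",
            "status": 200,
            "headers": [(b"content-type", b"text/plain")],
        }
    )
    await send({"type": "http.response.body", "body": b"hello"})


async def call_once(app):
    # Drive the app with a fake HTTP request and collect what it sends.
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    scope = {"type": "http", "method": "GET", "path": "/", "headers": []}
    await app(scope, receive, send)
    return sent


if __name__ == "__main__":
    for message in asyncio.run(call_once(add_cors(hello_app))):
        print(message)
```

Because the wrapped result has the same `(scope, receive, send)` signature as the app it wraps, an authentication layer or another plugin could be stacked on in the same way.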
Carlton Gibson 1:00:01
It kind of struck me that Datasette was kind of like the Django admin 20 years later, you know, the way it introspects.
Simon Willison 1:00:07
Did you ever see databrowse in Django?
Carlton Gibson 1:00:10
That was before my time.
Simon Willison 1:00:12
So Django used to ship with a contrib module called databrowse, which was exactly Datasette. Adrian Holovaty built this. And so yeah, given any database, it gives you basically what Datasette gives you today. Eventually it was decided to pull databrowse out of Django, and so now it's available as a third-party module somewhere. But yeah, they're very similar to each other in that respect. But I think the biggest innovation from Datasette is this arbitrary SQL thing. Using SQL as an API language is something I'm almost gleeful about. So I've been building this project at work, which is a search engine for our internal documentation, because we've got internal docs spread across like 10 different systems. And the way it works is it's a crawler that every half hour pulls everything from these internal systems and sticks it in a two gigabyte SQLite file with the SQLite full-text search stuff turned on, and then it sticks that in Datasette. And the user interface, the search engine, is literally 500 lines of HTML, CSS and JavaScript in one file that runs SQL queries constructed in JavaScript against Datasette. And I've been gleefully cackling as I show people: look, look, it's generating SQL queries in the JavaScript and it's sending them to the back end. And this sounds like an awful idea, but it's fine.
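A minimal sketch of one reason client-supplied SQL can be tolerable, assuming the database file is opened read-only. That assumption is for illustration only; the transcript does not say how the internal search engine is configured. The table, rows, and function names are hypothetical, and a plain `LIKE` query stands in for SQLite full-text search.

```python
import os
import sqlite3
import tempfile
from pathlib import Path


def build_index(path):
    # Stand-in for the crawler: write a few documents into a SQLite file.
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE docs (title TEXT, body TEXT)")
    conn.executemany(
        "INSERT INTO docs VALUES (?, ?)",
        [
            ("Deploy guide", "How to deploy the gateway"),
            ("Oncall runbook", "Paging and escalation"),
            ("Search notes", "How we deploy the crawler"),
        ],
    )
    conn.commit()
    conn.close()


def run_client_sql(path, sql, params=()):
    # Open the file read-only via a SQLite URI, so any SQL sent by a client
    # can read rows but cannot modify the database.
    uri = Path(path).as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        return conn.execute(sql, params).fetchall()
    finally:
        conn.close()


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "docs.db")
        build_index(path)
        rows = run_client_sql(
            path,
            "SELECT title FROM docs WHERE body LIKE ? ORDER BY title",
            ("%deploy%",),
        )
        try:
            run_client_sql(path, "DELETE FROM docs")
            write_blocked = False
        except sqlite3.OperationalError:
            # SQLite refuses writes on a connection opened with mode=ro.
            write_blocked = True
        return rows, write_blocked


if __name__ == "__main__":
    print(demo())
```

A read-only connection does not limit how expensive a query can be, so a real deployment would also want some form of time or resource limit on queries.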
Will Vincent 1:01:33
Well, that actually makes sense. I saw that in December last year you wrote a really great in-depth post on search with Datasette, right? That's related to that work.
Simon Willison 1:01:43
Yeah, this internal search engine is that exact pattern, but applied on a much larger scale. And my absolute favorite feature of it is that, because I love faceted search engines, I added facet by emoji to our internal docs. Yeah, I thought I saw that. The way it works is it literally constructs a SQL query in JavaScript with emoji embedded in the SQL query, and then it sends it to the back end. So if you fire up the Firefox dev tools, you can see SQL queries with emoji in them going over an HTTP GET. I think this is the best joke, right? This is very much my sense of humor. I'm hoping I can open source this internal search engine because it's just really funny. So I've partly been doing this because it's a way of trolling other developers, going: look, I'm constructing SQL in JavaScript right here.
Carlton Gibson 1:02:34
I'm just imagining a pen tester with a traffic analyzer watching it go past.
Simon Willison 1:02:39
Exactly, yeah. The crazy thing is I've been working on the system for a few months now, and it turns out embedding SQL in your client-side JavaScript is fine. Like, none of the obvious flaws in the system have actually caused any loss in productivity or anything. It's super easy to maintain because it's only 500 lines of code for the whole thing. It's understandable. But yeah, so I'll probably be writing a bit more about the SQL-in-JavaScript pattern at some point, because it starts as a joke, but it's a joke that's actually turning out to be pretty useful.
Will Vincent 1:03:12
Yeah, well, I mean, famously, Flask came out on April Fool's Day, right? So it's not the first time a good thing has come out of a joke. Yeah, absolutely. Well, thank you so much for coming on and spending this time with us. We've wanted to have you on for such a long time. And it's so great to hear about the early days of Django. I'm sure there are some stories that people hadn't heard, and then all the work you're doing now, combining Django and all these things. And you still get to use Django. I mean, there's a fair number of prominent Django people we've interviewed and they don't get to use Django day to day.
Simon Willison 1:03:43
Yeah, we're using Django day to day at Eventbrite. With Datasette, it doesn't use anything from Django yet, but ASGI, I think, is the way I'm going to link it back to the wider Django ecosystem. You know, ASGI is such a perfect fit for what I want to do. And it also means I'll be able to do things like have a Django app with a Datasette-powered view that's just embedded into the URLs, which is gonna be really interesting.
Carlton Gibson 1:04:08
Yeah. I mean, there's absolutely no reason why we can't just embed an ASGI app inside Django.
Simon Willison 1:04:15
Right. And vice versa.
Will Vincent 1:04:17
Vice versa. All right, great. Well, again, thank you so much for sparing the time to be on.
Simon Willison 1:04:22
Yeah, this has been really fun. Thanks a lot for having me.
Carlton Gibson 1:04:24
Super, Simon. Thank you very much.