Django Chat

Datasette, LLMs, and Django - Simon Willison

Episode Summary

Simon Willison is a co-creator of Django who is currently working on Datasette and writes actively on AI/LLMs. We discuss the current state of web technology, his role as a director of the Python Software Foundation, and the NYTimes lawsuit against OpenAI, amongst many other topics.

Episode Notes

Support the Show

Episode Transcription

Will Vincent 0:06
Hi, and welcome to another episode of Django Chat, a podcast on the Django web framework. I'm Will Vincent, joined by Carlton Gibson. Hello, Carlton.

Carlton Gibson 0:12
Hello, Will.

Will Vincent 0:13
And we're very pleased to welcome back Simon Willison. Welcome, Simon.

Simon Willison 0:17
Hey, Will, hey, Carlton.

Carlton Gibson 0:18
Hey, Simon, thank you for coming. I'm really excited to have you again.

Will Vincent 0:21
So for those who don't know, Simon is one of the original co-creators of Django. He's currently working on Datasette, and he writes a lot about AI and LLMs, and much, much more, so we'll get into all that. But I wanted to start off with: you were at the most recent DjangoCon US, I guess last year, but day to day you don't, I don't think, do a lot of Django. So I'm curious, how do you see Django 20 years in, as someone who is familiar with it but isn't maybe as in the weeds as some other folks? How do you assess its strengths and weaknesses in the web framework landscape as it is now?

Simon Willison 0:53
So the thing I love about Django today is that Django qualifies as boring technology. There's this incredible essay that I look on fondly: Dan McKinley put out this wonderful essay a few years ago about how you should pick boring technology. What he means is that any time you're building something, there are things you want to innovate on, where you want to build something new and exciting and solve problems that have never been solved before, and then there's everything else. And for everything else, you should pick the most obvious boring technology you can, so that you're not constantly trying to figure out, oh, how do I do CSRF protection in this framework, or whatever. Just make sure your defaults are boring. And I love that Django absolutely qualifies now, right? I never in my wildest imagination dreamed that Django would be the boring default choice for building things, but it is. And actually, I'm building Datasette Cloud right now, which is SaaS hosting for my Datasette project, and the core of that is a Django app. I've got a Postgres and Django app which manages user accounts and signups, all of that kind of thing, and then it launches Docker containers on Fly.io which run Datasette and all of that. So all of the exciting stuff I get to innovate on is off in one corner, but the bog-standard bits that make the whole thing work, that's Django, and that's great. So yeah, I love that Django is now the safe default choice for building a web application. Lovely.

Will Vincent 2:17
So you mentioned user accounts; I have to ask about Carlton's thoughts on, you know, maybe 20 years on, changing some of the defaults. Carlton, do you want to give a quick pitch? And we'll see what Simon says.

Carlton Gibson 2:28
So my kind of take is this: we've got a leaky battery with the user model, because we ask people to create this custom user model, and it's a whole world of complexity for that central auth model, which is, like, for every single request, what is the identity of this user? That's not the profile data, which obviously you want custom per app. But we have this custom user model which we forget to set up, and, you know, there are all these warnings in the docs about how you should use it, but people don't migrate to it because that's too hard. And I think we made a mistake. What we should have done is trimmed off all the non-identity stuff from that user model and then locked up django.contrib.auth really tight.

Simon Willison 3:12
Couldn't agree more. There are a few flaws in the default user model. Firstly, it expects everyone to have an email address, which doesn't work for everyone. It makes people pick a username, which is very archaic. And it expects your name to split into first name and last name, which for many cultures doesn't work. So yeah, I'm very much with you that the user model has not dated well, unfortunately.
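
For readers who want a concrete picture of the trimmed-down, identity-only model being discussed, here is a minimal sketch, purely illustrative rather than anything from Django itself or from this conversation, of an email-only custom user model. You would still point AUTH_USER_MODEL at it in settings, and a real version would also need a create_superuser method and admin wiring.

```python
# Illustrative sketch only: an identity-focused custom user with no username
# and no first_name/last_name split, along the lines discussed above.
from django.contrib.auth.base_user import AbstractBaseUser, BaseUserManager
from django.contrib.auth.models import PermissionsMixin
from django.db import models


class UserManager(BaseUserManager):
    def create_user(self, email, password=None, **extra_fields):
        user = self.model(email=self.normalize_email(email), **extra_fields)
        user.set_password(password)
        user.save(using=self._db)
        return user


class User(AbstractBaseUser, PermissionsMixin):
    # Identity only: email is the login key; profile data lives elsewhere.
    email = models.EmailField(unique=True)
    is_active = models.BooleanField(default=True)
    is_staff = models.BooleanField(default=False)  # needed for admin access

    USERNAME_FIELD = "email"
    REQUIRED_FIELDS = []

    objects = UserManager()
```

In settings this would be wired up with something like AUTH_USER_MODEL = "accounts.User", where "accounts" is whichever app holds the model.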

Carlton Gibson 3:36
So what I'd kind of like to do is cut it down and trim it, you know, find a way, slowly, obviously over time, because Django is very stable and we have the deprecation policy, but just reopen that debate about whether we can trim off those bits and somehow do it. I think we gave up a little too early on that. So, you know, I've been experimenting in that domain.

Simon Willison 4:00
Yeah, I love that. I mean, I use the user model as a key that other things key onto, and then I have a list of Google accounts that have been associated with them. And I don't make people pick a first name and last name, all of that kind of stuff.

Carlton Gibson 4:16
Okay, good. So

Will Vincent 4:19
Do you write your own, then, it sounds like, to manage social authentication?

Simon Willison 4:24
Yeah. So how am I doing social authentication? I think I rolled my own Google OAuth thing, which keys against Django; I've probably got bits of that lying around somewhere, and I've done that before. In the past, I tend to use the default user model, partly just for the admin; it's the most convenient way to get the admin up and running. And the admin is such a key feature for me, to quickly iterate on what I'm doing and build out internal tooling and so forth.

Carlton Gibson 4:52
And yeah, I get it. When you're starting a new project as well, the last thing you want is to stop and go, oh, I need half a day of planning for a user model. You just want to start. And then, yeah, I'm impressed you implemented your own OAuth.

Simon Willison 5:07
I've done it so many times at this point. So, yeah, my blog runs on Django, simonwillison.net, and that's open source. It's not a very complicated application, but it's all sat there on GitHub. I find myself tweaking it about once every three or four months; I'll go in and I'll tweak something about it, and it's always fun. It's also managed by Dependabot, so it magically upgraded itself to Django 5.0 a few weeks ago. It just did it, which was great; I didn't have to think about it. I've got just enough automated tests that I trust the thing is going to work after I apply updates. And yeah, that's a nice way of staying connected with what's going on, in a very low-risk environment as well.

Carlton Gibson 5:51
I saw you put out a post a little while ago, just on the blog topic, about how to build a blog in Django. It was really good, like a kind of checklist for how to build a blog. And I see everybody struggling with Hugo sites and this or that static site generator, and I sometimes think, no, just run your own Django app, because it's great to have that playground. You've been doing it for like 20 years with the same Django application just evolving. It's lovely.

Simon Willison 6:16
Yeah, pretty much. My very first version of my blog was PHP running on my university's shared hosting with flat files; was it even... the PHP equivalent of pickle? I think it might have been PHP's pickle equivalent, just a big array of posts and stuff. And yeah, then I flipped over to Django; it's on my blog somewhere when I first ported it to Django. And then I did a major upgrade in 2017, when I came back after not blogging for like seven years, and did the Python 3 upgrade and stuff. And I've just been iterating on it ever since. It's great.

Carlton Gibson 6:59
But also, Jacob Kaplan-Moss's blog is built on Django, and if you go to his GitHub repo, it says forked from Simon Willison.

Simon Willison 7:09
That's brilliant. I actually stole a feature off of him a few years ago: he has this idea of a series of posts around certain topics, and so I added series into my blog, inspired by what he'd been doing.

Carlton Gibson 7:23
Didn't you open a pull request to get it merged into the upstream branch?

That wouldn't surprise me if you had.

Will Vincent 7:29
That's what Carlton would have done, yeah.

Carlton Gibson 7:32
I don't know. Anyway, Will, carry on.

Will Vincent 7:35
I guess just one more, you know, putting your kind of old man hat on with Django. I've heard you mention the fact that Flask can be a single file. I don't know if you've kept up with this, but Carlton did a talk in 2019 on single-file Django, and then at the most recent DjangoCon US, Paulo Melchiorre had a whole repo, I think it's six lines, kind of proving you can do Django in a couple of lines. And I wonder, I think about this as teaching, because my brother-in-law is going through a coding bootcamp, and I'm like, hey, let me help you. Oh, we're doing Flask. I'm like, why? I almost feel like it's worth showing a single-file blog in Django or something, just to make the point that, hey, it's possible, because even in Flask, no one does it that way. You could, but no one would do it that way.

Simon Willison 8:29
I do. I love the single-file thing. I built my own Django single-file thing like ten years ago, something called djng. That was basically just trying to do a little thin shim that lets you do a Flask imitation on top of Django, because I love that for just hacking out quick things, not having to bother about the directory structure and so forth. So yeah, I'm thrilled that people are still pushing out on that. It's a great idea. If we were to design Django today, I'm certain it would be capable of doing single-file out of the box. That just makes sense to me.
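
To make the single-file idea concrete, here is a rough, illustrative sketch, not djng and not the repo mentioned above, of a complete Django application living in one file:

```python
# single_file_django.py -- illustrative single-file Django app.
# Run with: python single_file_django.py runserver
import sys

from django.conf import settings
from django.core.management import execute_from_command_line
from django.http import HttpResponse
from django.urls import path

settings.configure(
    DEBUG=True,
    SECRET_KEY="not-for-production",  # placeholder value
    ROOT_URLCONF=__name__,            # this module is also the URLconf
    ALLOWED_HOSTS=["*"],
)


def index(request):
    return HttpResponse("Hello from a single-file Django app")


urlpatterns = [path("", index)]

if __name__ == "__main__":
    execute_from_command_line(sys.argv)
```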

Will Vincent 9:05
One more, and then I'll let you go, Carlton. This question comes from Eric Matthes, who wrote Python Crash Course, and he was asking, you sort of answered it, but what is your preferred way of building web apps today? I mean specifically on the front end, having seen it go from, you know, server-side rendering to jQuery to SPAs, and now, I guess, htmx. Where do you fall on that pendulum?

Simon Willison 9:25
So I spent a few years trying to do the React thing, because it was clearly the way everything was going, and I hated it so much. The thing I hated is the build scripts. I hate it when you have a front-end project which you work on every six months, and you come back in six months and nothing works; you have to re-spin up your webpack configuration, all of that kind of stuff. And so a few years ago I said, you know what, I'm going to give myself permission to write JavaScript like it's 2008 again: no libraries, no build scripts, no TypeScript, nothing like that, just little bits of JavaScript. Because the thing is, we used to use jQuery because of the browser differences, but the browser differences are gone. Today document.querySelectorAll and all of that stuff works exactly the same across everything, so you can build code like you're using jQuery, but without using jQuery; you just write event handlers and so forth. And it was so liberating. Suddenly I enjoyed front-end development again, because I didn't find myself fighting webpack and Vite or whatever the new cool stuff is, and I can go back to projects I wrote like this two years ago, drop in, maintain them, and add new features to them. And then, on top of that, the language model stuff: ChatGPT is really, really good at all forms of JavaScript, so it's not like I ever find myself stuck trying to remember how a certain API works. If there's something which is going to be a bit tedious, because the JavaScript is going to be 20 lines of boilerplate, it'll spit out 20 lines of boilerplate, and I can just let it go and get on with it. So yeah, I've got really into that. I have played with htmx on a couple of projects, and I really like it. I've always been into unobtrusive JavaScript, the idea of progressive enhancement, and htmx is so good for that kind of thing, so I love that it's getting popular. And I love the performance you get from it, because you don't have to serve a megabyte JavaScript bundle just to show a contact form or whatever. And then Datasette itself is very strictly just HTML, and when you click a link it loads a new page. But I've been playing with the Chrome view transitions stuff recently, which is super, super interesting, like cutting-edge Chrome; I think you might still have to turn on one of the experimental flags. You can actually serve up CSS that says, when the user navigates from this page to this page, keep this area of the page stable and sort of blur-update this other bit. And it's like a couple of lines of CSS, and suddenly it feels like an SPA. You click a link and only part of the page updates and so forth, but it's a real navigation; there's no JavaScript involved. That's thrilling. I can't wait to see that roll out to other browsers as well.

Will Vincent 12:10
I was, sorry, Carlton, going to ask, I promise, one last one, just on bundling. Because I just did a redesign of my main site, which is using Tailwind, and I like Tailwind, but it's a little disappointing that I now have to have Node and stuff running too. It's almost like it's switched from JavaScript to CSS; now you have a build script for everything.

Simon Willison 12:29
Yeah, this is one of the reasons I've not adopted the modern CSS stuff as well; it's the build scripts. They're fantastic for larger, more complex applications. The stuff I do, I always try and keep small and simple enough that you don't necessarily need that, and then they just become friction, something that prevents me from getting going, because I have so many projects on the go at once. I've got, what, nearly two hundred? Some ridiculous number of actively maintained projects. And the only way to do that is to make it as easy as possible to drop into something you've almost forgotten all of the details of and get it up and running again. I feel like with the front-end build stack, if you work on the same project every day, it's completely fine; it gives you a huge productivity boost and there's none of that friction, because you've constantly got that stuff warm in your head. If you drop into a project every six months, it's completely different. And I like to optimize for being able to hop across hundreds of different projects and make small changes to them without getting stuck on the build.

Carlton Gibson 13:32
That's the exact same point as the boring technology talk, right? If you focus on one or two or three technologies, then you're able to really get the most out of them, rather than spreading yourself thin over, you know, say half a dozen, and that slows you down. It's sort of saying that

Simon Willison 13:47
the secret to running lots of projects is they've all got to be as boring and similar as possible. Like, I've got 100 repos; they're all Python Pluggy plugins, Jinja templates, like Datasette plugins, they're all the exact same shape. They've all got GitHub Actions running workflows and so forth. And it just works.

Carlton Gibson 14:07
Good, interesting. So you mentioned LLMs and ChatGPT and things like that, but before we talk about those in more depth, I wanted to ask: as well as doing all this amazing work, you're now on the board of the PSF. Can you tell us a little bit about what you're doing there and how you're finding it, because you're new. This is your first year on the board?

Simon Willison 14:29
It's my second year now; I just hit the twelve-month point. And it's interesting. The reason I'm on the board of the PSF is that I'd been hassling the PSF on a sort of low-grade basis. Every now and then I'd go, I'm really annoyed that the PSF isn't doing more to help make Python easier for people to get into, like solving the horrors of the Python learning and development environment, all of that kind of stuff, and also the fact that it's very difficult to distribute applications written in Python, because you don't want people to have to install Python to use your stuff. And I almost had a snap judgment one day. I was like, you know what, it's not reasonable for me to complain at the PSF and not offer to help and not try to do something. So I put myself up for election on the basis of: these are the problems that I think we have to set about addressing. And I got elected, which was a little bit of a surprise. I mean, I think it's name recognition, because you show up on the list of names and people go, oh, I recognize that person, or whatever. And of course, now that I'm in the PSF, I realize that the PSF is not particularly well equipped to solve the problems that I was most interested in solving. It's always difficult to understand quite what these organizations are able to do. The PSF is basically about money that's raised and distributed around the Python community, and the PSF focuses on the community and the health of the community. There's a huge amount of sponsorship of events, of initiatives like that, which is fantastic. The stuff I care about is not completely aligned with what the PSF is for, but it's not unaligned either. So what I'm having to learn is, okay, how do I align what the PSF can do with the things that I want to get done, in a way that supports the mission of the organization? So it's been a huge learning curve. You know, this is my first time on the board of a nonprofit; it's understanding what levers are available to pull and what priorities make sense, and so forth. And yeah, the first year I was mainly just trying to understand how this thing is shaped and what it can do. Now that I'm through that, I'm looking forward to maybe trying to tweak those levers a little bit myself.

Carlton Gibson 16:45
Put all your weight on this one.

Will Vincent 16:48
This is why the DSF just switched to two-year terms, exactly for this reason, because it basically takes a year to get up to speed. And during COVID we had less turnover, and I feel like we got a lot done because we had largely the same crew for two or three years. Because it does take a year just to understand how it works. Right, sorry, I interrupted, you were going to...

Carlton Gibson 17:11
I was going to joke, because when Simon was talking, I was like, this is exactly Will's experience of being on the DSF board: people think, oh, the DSF can do this, the DSF can do that. And what I hear from Will, from his experience on the board, is that actually the DSF can't do very much.

Will Vincent 17:27
Well, I think it's interesting. I mean, we're going to have Jacob Kaplan-Moss on in a couple of weeks, who just rejoined the DSF after obviously working on Django and being, I think, the first president. When I joined the DSF board, I had a similar list of things I wanted to do, which in hindsight, I guess I was lucky, aligned with things that could actually be done, around sponsorship and so on; I forget, I have a blog post on it. But I hadn't thought about the fact that maybe the things I wanted done didn't align directly with the mission. But you're right, fundamentally these organizations are about money and community and helping others. I mean, one thing the DSF is now doing is having working groups, which the PSF has had mixed success with, but at least some success. Whereas historically it's just been that everything goes to the DSF board, and when you're on the board, that's kind of its own thing; it's unreasonable to be on the board and spearhead an initiative. From what I've seen, I imagine it's similar on the PSF. Or, I don't know, are you thinking of you actively doing things, or more a working group?

Simon Willison

That's what I'm still trying to figure out. Because the other thing is that the PSF has staff; this is unlike the DSF. And the staff do incredible work. I mean, PyPI, I think, is one of the most impactful things the PSF does, outside of PyCon and event sponsorship and so forth. So the directors are not there to do the work; the directors are there essentially to help make those high-level decisions, help set strategy, and make decisions about where the money goes, to a certain extent. Again, it's understanding, okay, what's ethical and responsible to do. If I throw all of my weight into trying to push the PSF in one direction, am I actually starving other important initiatives that the PSF is doing, which just don't happen to align with my own personal interests?

Will Vincent 19:19
Right. Well, and in a sense, you're, pure is not the word, but you know, quite a few people on these boards work for big tech companies, and so there's even more of a potential for, I don't know, not conflict, but it gets a little bit, kind of, watch what you're doing there.

Simon Willison 19:34
I mean, I think there are rules about how many people on the board of directors can work for the same company, as there should be, because that's always a risk with these kinds of things. And me not being employed by a large company gives me an aspect of independence, but most of our board members are independent; I'm not unique.

Will Vincent 19:55
I think maybe when Jeff Triplett was on the board, the allocation was a little bit different. But yeah.

Simon Willison 20:02
I mean, we just had new board members join a couple of months ago, so I think we've had quite a reshaping just recently.

Carlton Gibson 20:09
Okay. Just before we move on...

Will Vincent 20:12
I've got one last one, I promise. So again, as this Django person but a little bit of an outsider, I can ask you all the questions that, you know, I want someone to weigh in on. So, an executive director. We're going to have Deb Nicholson on the podcast in a couple of weeks, and there has been talk of Django potentially having one. What has your experience been seeing an executive director at work in one of these organizations? Can you imagine these organizations without one? The DSF is considering having a paid full-time person as an executive director; it's something the President and others have discussed. Because, I'll give my two cents: I think a lot of this stuff won't happen absent someone full-time to do it.

Simon Willison 20:59
That completely makes sense to me. This is one of the problems with boards of directors: if everyone's just a volunteer who's investing a few hours of their time a week, or maybe a month, it's very difficult to make progress on things. You'll have a meeting, and not much will have happened since the last meeting. Once you've got an executive director and staff, that completely changes; there's constant forward motion. I keep on telling people, I think the best thing about the Django Software Foundation has always been the Fellows, and I don't understand why other community-driven open source projects aren't trying to imitate that exactly, because it works so well. The PSF now has, I think, at least two fellows inspired by the Django Software Foundation's, and those are incredibly impactful; the work that Seth has been doing around security, a relatively new addition, it's absolutely extraordinary how much impact you can have with that. So yeah, I'm very, very keen on the idea of these nonprofit open-source-supporting foundations that actually have staff who can just keep on making progress on things.

Carlton Gibson 22:07
But just my experience, having been a Fellow, is that these other tasks, these non-Fellow tasks, would arrive, and it'd be like, okay, well, I'll do that, you know, I've got a bit of time in the week, I can do that. But it wasn't really the Fellow role, and there wasn't enough capacity to make any sort of significant progress on, for instance, reworking the djangoproject.com website. Okay, I could do a little bit of work on it, but it's literally an hour or two here or there, and not the months-long project that's going on now to actually do a proper assessment of what it needs and how we refresh it to a professional, 2024 kind of standard, rather than just, oh yeah, can you make a tweak?

Simon Willison 22:51
turns out, there's a lot to be said, for having somebody whose job it is to get specific things done. You know that yes, exactly. Yeah. So I think that that sounds very, very sensible to me.

Will Vincent 23:02
Yeah, I'm biased. And I realized we we I think I think it was Anna, the past DSF president and I had a had a call with private call with with Deb Nicholson, the new the new one. And she sort of went through what one does in that position. And we were just like, Oh, my God is so need that. So yeah, I put that out there. But I'm not in the board now. So. But Carlton, yes.

Carlton Gibson 23:26
Can I no job. So I want to get so I've been using copilot and whatnot. And I think it's awesome. And it's it's, you know, you mentioned JavaScript earlier, like my Java scripts come on so much, because they're a bit like how do I, how do I filter this array to get the the one value that I need? And previously, that would take me 10 minutes or looking at because it's not something I do, you know, do it once every six months, but now I can just ask the LM it's got it. And it's not. It's not rocket breaking code. It's not it doesn't have any value other than it saved me. 10 minutes. So I guess my question is, how can I leverage that? And how can I leverage continue to leverage that and your tooling? How can I install that? And can I get something equivalent to the closed source that I can use? That's open source?

Simon Willison 24:13
Wow, that's a whole bunch of things to talk about. So let's get into it. The thing that excites me about LLMs is, I love them as sort of teaching assistants, right? It's something I can ask questions of; I can ask the dumbest question in the world at three in the morning, and I'll get an answer, and it doesn't judge me, like how do I do a for loop in Bash, or

Will Vincent 24:42
whatever it is. You don't want to post it on the Django forum, like, how do I do this?

Simon Willison 24:47
Exactly, exactly. I love that. And I love that it lets me be so much more ambitious with the projects that I take on. A great example: I shipped code in Go. I needed a little high-performance network proxy router thing, and I ended up writing it in Go, even though I don't know Go, because ChatGPT, GPT-4, knows Go, and I know Go just well enough to read the code and be able to tell if it's doing the right thing. And I can get it to write tests. So I ended up building this 100-line little custom Go server thing with comprehensive unit tests and GitHub Actions running continuous integration and continuous deployment, all of the things that I consider to be important for a robust project, and I shipped it, and it's great. Last month I had to make a change to it, and I fired up GPT-4 and worked with it, and we figured out what to do. That was extraordinary, because normally I would never write something in Go; I'd be fine tinkering with it, but I'm not going to write production code in a language that I'm not completely fluent in. In this case, I feel like me plus GPT-4 is fluent enough that I'm willing to deploy code written in a language I'm unfamiliar with. I've written code in AppleScript; AppleScript is notoriously a read-only language.

Carlton Gibson 26:08
There's like a continuum, with AppleScript at one end and Perl at the other, like, read-only.

Simon Willison 26:13
Absolutely. But yeah, I'm doing AppleScript things; I'm using all of these weird little domain-specific languages. I use jq all the time now, because jq is really powerful but I can never remember the syntax. So I love it as a sort of accelerator for doing lots of things. I'm taking on more projects, which is terrifying, because I already had too many projects, and I'm like, oh, me plus ChatGPT, I can probably get something working in 20 minutes. And of course it takes two hours, but still, at the end of that two hours I've got something that works and is interesting, that I wouldn't have built otherwise.

Carlton Gibson 26:48
But it's that first 20 minutes that you wouldn't have put in that gets you to the two hours.

Simon Willison 26:54
I do so much coding on walks with my dog now, because I can be walking the dog and, on my phone, I can just prompt it to write me some code that does this. I can use the Code Interpreter mode, where it actually runs the Python code that it generates, so I can get back from an hour-long walk with the dog and I've got 50 lines of Python that I know works, because it actually ran the code, found the bugs, fixed them, all of that kind of thing. It's incredible. You can even turn on voice mode; I can literally talk to it while I'm on a walk with the dog and it writes code for me. It's utterly surreal that that's even possible. I love that aspect of it. But, as you mentioned, the problem with GPT-4 is that, for a company called OpenAI, it could not be more closed, right? It's this proprietary hosted model; they change it all the time without telling you what they've changed. So people keep on complaining that it's got weaker, it's worse at X and so forth, and I never know if that's actually true, because it's basically a random number generator, so it's very easy to assume that it's changed when it hasn't. But that's really frustrating. The great news is that in the past 12 months we've had so many new options for running these things ourselves, these openly licensed models that you can run on your own hardware, and they're beginning to get pretty good. I don't use any of them on a daily basis, because GPT-4 is so good that it's sort of my default, but I'm constantly experimenting with them. My two favorites at the moment are these Mistral models. There's Mistral 7B, which literally runs on my telephone; there's an app that runs on my phone, and it's not awful. I was on a plane and I was using it to do the kinds of things I might have looked up on Wikipedia. Okay, it'll probably hallucinate stuff, so don't depend on it telling you the truth, but it's still useful for just starting to explore different ideas. And then the other one is this new one called Mixtral, which is a Mistral mixture-of-experts model. They just released that a month ago, and it runs on my laptop, and the quality begins to feel like ChatGPT 3.5. It's very, very good. So if you've been resisting using these things because you don't want to use some weird hosted model from some closed "open" company, Mixtral or something is a thing you can run on your laptop right now. The whole thing is Apache licensed, although whether it's truly open source is up for debate, because they won't release the training data it was trained on. And I think that's the source code, right? I think with these models, the raw training data is the source code that was used to compile the model, and you can't open source that training data because you ripped it all off; it's full of copyrighted data, and you can't just slap an Apache 2 license on someone else's copyrighted works. But yeah, this stuff is really exciting. It's really interesting.

Carlton Gibson 29:49
I want to come back to what you've just mentioned, the copyrighted training data thing. There are lots of cases where the LLM will reproduce its training data almost exactly, and so the New York Times brought this lawsuit. Can you explain that? Because I kind of see it; I'm like, oh, well, yeah, it is actually reproducing it. You know, you type in "underwater sponge" and you get SpongeBob SquarePants coming out of the image generator.

Simon Willison 30:21
This is so fascinating. The ethics of this entire space could not be more murky; every aspect of it, you're like, wow, is that okay? And the answer is, maybe not. So a lot of people have ethical qualms about this, and I agree with a lot of what they're saying. The New York Times thing is the most recent: the New York Times filed a very big lawsuit against OpenAI a few weeks ago, against OpenAI and Microsoft, and it was complaining about three different things. Firstly, you took all of our work without permission, it's copyrighted work, and you used it to train your model. I don't think anyone is disputing that that is what OpenAI did; they used New York Times data as part of a vast amount of training data that went into these models. Effectively, you could look at it as a crawl of a sizable chunk of the internet that was used to train these things, and that included New York Times data. The New York Times say that OpenAI put more weight on the New York Times data than on other data they trained on, because of the high quality of that training data. I don't know if that's conclusively proved or not; I think the GPT-2 paper a few years ago did explicitly say the New York Times data was being used like that, so they might be assuming that's still true, and it probably is still true. But this is one of the things I'm excited about with this lawsuit: I want discovery, because I want to know how GPT-4 was trained, because they haven't told us, and if that comes out of this, that will be useful. Okay, so complaint number one: they trained without permission. Complaint number two is that the models can spit out exact copies of New York Times articles. And this was news to me; I thought that the act of training muddled the stuff up to the point that it wouldn't spit out exact copies. It turns out, if you set the temperature to zero and then feed it the first two paragraphs of a New York Times article, it can often spit out the next four paragraphs. Sometimes there are very slight differences, like one word will be changed, but effectively it has memorized it; it's regurgitating the same thing.
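
As an illustration of the kind of test being described, here is a rough sketch using the OpenAI Python client. The model name, the placeholder text, and the token limit are illustrative assumptions, and this is not the Times' actual methodology:

```python
# Rough sketch of the regurgitation test described above: feed a model the
# opening paragraphs of an article at temperature 0 and see whether its
# continuation matches the published text.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

opening_paragraphs = "(first two paragraphs of the article go here)"

response = client.chat.completions.create(
    model="gpt-4",        # illustrative model choice
    messages=[{"role": "user", "content": opening_paragraphs}],
    temperature=0,        # deterministic-ish sampling, as described in the complaint
    max_tokens=500,
)

continuation = response.choices[0].message.content
# Compare `continuation` against the next paragraphs of the published article
# to check for near-verbatim reproduction.
print(continuation)
```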

Carlton Gibson 32:28
If you tried to publish that, it would be a clear violation of copyright.

Simon Willison 32:32
It would be, exactly. And so the question is, well, are OpenAI publishing that just by having an interface where people can see it? I mean, so many of these things, I don't think there's an obvious legal answer. I'm not a lawyer at all, but there's a reason this is going to go to court, because these are legal questions that are very blurry and unanswered. So number two is that it can regurgitate their content, and they've said this means people could bypass our paywall by getting the model to spit out articles, which is a bit of a loose claim, because you've got to have the first few paragraphs of the article anyway. But they did have a really interesting thing where they talked about the Wirecutter, where the Wirecutter is a New York Times company that does product recommendations. If you ask ChatGPT for product recommendations, it will often spit out the Wirecutter's picks, but it won't give you the referral link, and that's the Wirecutter's business model. And the definition of fair use in American law specifically talks about whether the thing is competitive with the thing that it ripped off. So in the New York Times case, the main thing they're trying to demonstrate is: this competes with us, this is harming us financially, because you can bypass our paywall, you can go rip off Wirecutter recommendations, all of that kind of stuff. So that's argument number two. Complaint number three is actually about retrieval-augmented generation. It's about the thing that Microsoft Bing does and ChatGPT Browse does, where you ask it a question and it goes and does a search on the internet, and it'll find the New York Times article about something, read bits of it, and then summarize that and give you the summary back again. And the New York Times are saying, well look, you're clearly subverting our paywall; you're profiting from content that's derived from us. Now, that one almost worries me the most, in that I think they've got a completely fair point in complaining about it, but summarizing stuff is my favorite use of LLMs. If we end up with legal precedent that you can't even copy and paste data into an LLM to get a summary back out again, that would be very harmful for the ways these tools are most useful. But that's the problem: I read the 69-page lawsuit, and it's very clean, it's very well argued, and like I said, I'm not a lawyer, but all of these points feel to me like points that are worth putting in front of a judge and jury and trying to get answers about.

Carlton Gibson 34:56
Yeah, I think two things come to mind from what you've just said. One is, I know Google has been told it has to pay news publishers in various countries at various times, because it does exactly that. If you googled the news in, say, Australia, I'll pick Australia, I don't know if it applied in Australia, but it will go and, you know, get the Sydney Morning Herald and summarize that without you ever having to leave google.com. And they were, you know...

Simon Willison 35:22
When it came up, those lawsuits were just about the headlines, the first few words of the story even, and what generative AI is doing is so much more than that. Google are clearly going to be on the chopping block next, after OpenAI and Microsoft, because they've got a prototype, like an alpha version of their search page, that does exactly that: it just adds generative AI and spits out a generated answer to your question at the top. They've been doing this with their little content snippet boxes and so forth over the past few years. And it's super worrying, right? If you've got a web where nobody ever clicks a link from a search result, because they just get their answer right there in the search results, what point is there in trying to build a profitable web business anymore? So all of these ethical complaints are very, very legitimate.

Here's another question for you. We know now that LLMs are being used to generate a lot of the content on the internet. How do you see this going forward, if LLMs are going to be trained on themselves? Do you think, like, is 2021 the high point, or is there a way out of that? Because it seems vicious; it seems like an ouroboros situation, doesn't it?

It does, and people have been talking about this for a couple of years now. At one point I heard that the reason OpenAI hadn't updated their training data, there was a training cutoff of, what, September 2021, I think, was that after that point there was enough usage of these tools that the internet was beginning to fill up with LLM-generated text, and they didn't want to train LLMs on LLM-generated text because of that ouroboros effect. At the same time, in the openly licensed language model community, almost all of the really good ones actually trained on GPT-4 output. The way you build a really useful chat-tuned language model is you need to give it 20,000 examples of good conversations, and the easiest way to get those is to get GPT-4 to spit them out, and then you train your model on GPT-4's output. And if that were such a bad thing, we wouldn't be seeing models trained almost exclusively like that show up at the top of the leaderboards. So I think this is all part of the larger problem that we really have very little insight into how these things work. They are giant, like 16-gigabyte, blobs of floating point numbers, and we're still trying to figure out just the basics of how you poke around inside that weird matrix brain and figure out how it's working and what it's doing. So maybe the fears about LLMs training on LLM output don't actually work out; maybe it's okay, maybe it's a complete catastrophe, we have no idea. And it's funny, we had no idea six months ago and we still have no idea now. So despite the rate at which this technology is improving, the rate at which our understanding of it improves is very dubious in terms of how much we can figure out.
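
To make the "20,000 examples of good conversations" idea concrete, chat-tuning data is usually just a file of conversations in a role/content format, something roughly like the sketch below. This is a hypothetical illustration, not any particular project's training set or format requirement:

```python
# Hypothetical sketch of a chat-tuning dataset: a JSONL file where each line
# is one conversation made of role/content messages.
import json

conversations = [
    {
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "How do I reverse a list in Python?"},
            {"role": "assistant", "content": "Use my_list[::-1] or my_list.reverse()."},
        ]
    },
    # In practice you would want tens of thousands of these, which is why
    # people often generate them with a stronger model and fine-tune on that output.
]

with open("chat_tuning_data.jsonl", "w") as f:
    for conversation in conversations:
        f.write(json.dumps(conversation) + "\n")
```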

Carlton Gibson 38:17
So it really is a new world.

Simon Willison 38:21
It is, and as a computer scientist it's infuriating, right? Because I like computers that do exactly what you tell them to do, where you can write tests and you can fire up a debugger, and everything is repeatable and understandable. And these are not that at all. It's a completely weird, blurry alternative world in which everything's based on vibes. You pick a model and you poke around and see if the vibes feel right, and then you tweak your prompts, and does that seem better? It kind of does, but it's awful. It's really difficult to do responsible development on top of it.

It does seem like, for the closed LLMs, you know, if I'm a hospital or I have billing records or very niche things, LLMs are fantastic. And especially, I'm in Boston, there are a lot of research places that are like, we can't use an open LLM thing, but these closed things are definitely being sold and used by whatever industry or company has huge amounts of its own data. I almost feel like that's got more promise in the long run than this entire-web-being-stolen approach. Well, I don't know where that goes.

So Bloomberg built their own LLM; they trained their own language model on their internal financial documents. It was supposed to be the best possible LLM for finance. And then it turned out that GPT-4 came out and, as a general-purpose model, it was beating the Bloomberg one on financial tasks. This is one of the things that's so challenging right now: the rate of improvement of these things is such that if you've got a project that will take six months, maybe you shouldn't do that project, because you might spend six months on it and then GPT-4.5 comes out and solves the problem you've just spent six months trying to solve. So there's this interesting strategic problem of, at what point do you actually settle down and start building on this stuff, as opposed to thinking, you know, what would be quicker is if I waited two months and then started building, because I'd get a better result than if I started building today. That's absurd, but that's genuinely the position we find ourselves in.

Carlton Gibson 40:24
That's Zeno's paradox for the 21st century, completely.

Will Vincent 40:29
Well, I don't know if you've ever read this book, AI Superpowers, it's a couple of years old now, by Kai-Fu Lee, who has worked in China and the US as a researcher. I think I read it five years ago; this is before OpenAI's stuff came out. But he basically said you need three things: you need the algorithms, which by that point we finally had, you need training data, and you need processing power. And he argued that, with the cloud, it basically all came down to data, because the algorithms are basically open source and we have cloud computing, so it's really all about training data. I think he went on to say he thought China would surpass the US for that reason, because it has no privacy controls. But all of that is to ask you: do you see it as tweaks, and is there more juice to squeeze out of these LLM models? Or is it really more of a data science thing, where it's all about what you put in and trying to optimize that? If you had to pick between the two.

Simon Willison 41:29
I'm very confident it's both, mainly because if you look at the open model community over the past year, since February, people just keep on coming up with new little tricks that make the models run faster and smaller, like the fact that I can run a GPT-3.5-class model on my laptop now. I certainly couldn't do that a year ago, because the models that were coming out, like the first LLaMA models and stuff, were much larger, required much more hardware, and were much less optimized. So there are so many techniques that can be used to make these things smaller and faster, and I like smaller and faster, right? I want a model that works on my phone and can do the things that I need it to do; I want it to be able to summarize and extract facts and call functions and all of that kind of stuff. But at the same time, people keep on finding that the higher quality the data, the better. There really is so much to be said, especially when you're fine-tuning these models, for having super, super high quality data that you feed into them. If the New York Times thing plays out one way, we may find that it's no longer possible to just steal the entire internet and train your models on it, at which point that raises some really interesting questions. The thing that worries me most about that is: does that mean that LLMs then become incredibly expensive to build, because of the licensing costs, to the point that you can't give them away for free? And does that mean that only people who are very wealthy can afford to use these tools, whereas today anyone who can afford an internet connection has access to some of the best in class of these models? That really scares me. Despite the fact that the ethics around copyright, I mean, there are very, very real concerns here, at the same time a world in which only the most wealthy have access to these tools feels unfair to me as well.

Carlton Gibson 43:17
Yes. And we can't lock these tools up; they are super useful. To take them away would be foolish.

Simon Willison 43:24
Also, if you banned them, I've got a USB stick with half a dozen models on it. You'd create a black market. It's very cyberpunk, right, people swapping USB sticks with Mistral releases on them.

Carlton Gibson 43:38
It's super. So there was a paper a little while ago, just to pick up on what you said, about OpenAI saying we haven't got a moat, or something like that.

Simon Willison 43:46
There was a leaked memo from Google; somebody within Google put this memo together saying there is no moat for this technology. It's interesting to revisit that; I think it came out in maybe March or April of last year, and it's interesting to look back at it and say, okay, how much of this played out? Because one of the real challenges with this stuff is, if it's all just driven by human-language prompts, the cost of switching to another language model might be as simple as saying, okay, we'll run this against Claude instead of GPT-4, and maybe that will give you the exact same effect. Or maybe it won't, because so much of the prompting comes down to these very small tweaks you make, where you go, oh, okay, if I capitalize the instructions to output in Markdown, maybe it'll actually listen to me this time. But that effect itself is kind of hurt by the fact that OpenAI upgrade their models, so just because that works now, will it still work in a few months' time? It's kind of uncertain. So that's part of it. Then there's also the fact that the closed model providers are up against tens of thousands of researchers around the world collaborating together. That's something I really like about the open model community: there's all of this sharing and this acceleration that comes from having tens of thousands of people worldwide all trying to solve these problems. OpenAI are an incredibly talented, experienced set of people, but I still don't like their chances against tens of thousands of people around the world. Although, of course, when those people around the world figure something out, OpenAI can just take that research and use it themselves, so they can sort of keep pace that way. And there's also the compute, right? We still don't know why GPT-4 is so much better than everything else. The most likely thing is that they trained it for longer, and on more data, than anyone else has been able to do yet. But people are catching up. If you have $100 million, maybe it's worth trying to funnel that into data and training, and it's not like there's a shortage of investor money floating around this space at this point.

Will Vincent 45:57
Yeah, and I guess the economics are so crazy, because yeah, it's $100 million, but then it's just a file that, you know, you can sell to anyone for virtually nothing, right?

Simon Willison 46:07
People often complain about the environmental impact of language models, where they say, well look, training a language model takes this enormous amount of carbon dioxide, which is true. At the same time, it's about the same amount of carbon dioxide as flying a Boeing 747 across the Atlantic twice, you know, which is a vast sum, but I would argue it benefits more people, because your airline flight benefits the people on that plane, whereas the language model, if it's then used by a few million people over the course of six months, it feels like you're getting more value for your carbon dioxide at that point.

Carlton Gibson 46:44
Yeah, I have a question about the carbon, the CO2 usage. My understanding of ML, machine learning, which is quite limited, was that the training was the hard bit, but then what you get out of the training is a kind of vector operation which you can run, you know, quite cheaply. And then I saw people complaining, and I didn't have time to follow up, that every time you generate an image with DALL-E or whatever, it uses so much water or so much whatever, because it's so computationally expensive to run the model, not just train the model.

Simon Willison 47:21
This is an interesting question. So, like I said, I run models on my iPhone, I run models on my laptop; I'm not worried about their resource constraints. But again, I don't know what GPT-4 is running on; I'm pretty sure it's running on a full server rack of GPUs. So my hunch is that for the very large models, yeah, there's a lot of cost in the inference, but I still think it's a fraction of what it costs to train them. That's the intuition I've got from this. And, like, Stable Diffusion also runs on my phone, so there are versions of these models where the environmental impact of running them is no worse than turning your laptop on. But I don't really have good insight into what the big hosted models are doing.

Carlton Gibson 48:04
So we've talked all about LLMs, and I wanted to ask about your tool, because if I want to run this, you've got the perfect tool for me to download and do so. Please tell us about that, because we've talked about all the exciting things, but okay, if I actually want to do it, what do I have to do?

Simon Willison 48:19
So I built this tool in Python called LLM. I got lucky; the name llm was still available on the package index, so you can pipx install llm, and you get a command-line tool for interacting with models. What's really fun about it is that it's inspired by Datasette: it's all based around plugins. Out of the box, you can give it an OpenAI API key and it will run against OpenAI, and then there are about a dozen plugins you can install that will add additional models, including models that run on your own machine. So you can essentially pip install my tool and then pip install a plugin that adds a language model to it, and now you've got a four-gigabyte file on your computer that you can start interacting with. But crucially, the interface is the same no matter what model you're using. So it's llm, space, double quotes, your prompt. Or you can pipe things into it as well; you can do cat myfile.txt and pipe that into llm. By default it'll use your default model, and if you stick -m claude on the end and you've got the Claude plugin, it will run that against Claude, and so forth. And everything it does is logged to SQLite, because I do everything with SQLite. The thing about using this tool is that it's a way of building a database of all of your experiments across all of the different models. I just use it on a daily basis for all sorts of different bits and pieces, and I've accumulated a few thousand prompts and responses in my SQLite database of things that I've tried out; maybe at some point I'll do some analysis on that and try and start comparing models that way. But really, the fun thing about it is I'm trying to make it so that whenever there's an interesting new model, you can install a plugin and start playing with that model, and that works for hosted models and it works for local models as well. And it's really, really fun to hack on. One of the things I've realized in playing with it, one of the original ideas, is the Unix philosophy, the Unix command line of piping things into other things. It's an amazingly good fit for language models, because a language model is a function: you pipe it a prompt and it gives you a response. And one of the things I use my tool for ties into this concept of system prompts, which is something OpenAI did originally and other models have started picking up, where you've got a second prompt that gives instructions about what to do with your other data. A great example is I can take a file and say cat myfile.py, pipe that into llm --system "write me some unit tests", and then the model gets the instruction to write unit tests and it gets a bunch of Python code, and it'll spit out a bunch of unit tests. Of course they won't be exactly what you need, but it's a skeleton you can start hacking on. It's really good at explaining code; I pipe it code and say, explain what this thing does. I use it for release notes, not to publish; I kind of feel like it's rude to just straight-up publish something that an LLM wrote for you, because, I mean, what are you doing, right? It's fine to publish something that you're willing to sign your name to, because you at the very least reviewed it extensively, and hopefully revised it and tightened it up. But there are lots of projects out there that don't bother writing good release notes.
And what you can do is you can check out their git repository, and you can do git diff between this version and this version, pipe llm --system write release notes, and GPT-4 can understand the diff format. It'll read it, and it'll spit out release notes, which in my experience are about 90% correct and 10% slightly wrong, or maybe there's a hallucination in there. And that's fine, right? That's good enough for my purposes, where I'm just saying, okay, what have they done in this release that they didn't otherwise write release notes for? So yeah, I recommend trying this thing out, partly because it's fun to play with models. And something I'll say about the models you can run on your own laptop is, they are kind of crap. They're very, very weak compared to GPT-4. But that's a feature, because it's easier to build a mental model of how they work when you work with the weak ones. With GPT-4, because it's so good, you can use it for a few days without really seeing the weaknesses and the flaws in it, because it gets most things right. But it's still just, you know, guessing what word should come next. It's doing the same kind of thing, but the little ones will hallucinate wildly, which is so useful for getting a feeling for: okay, these things are not intelligences, these things are dumb autocomplete that's just been scaled up to be able to cope with lots of things. I love to use myself as a test, because I've been around on the internet for long enough that these things can answer questions about me, like I can ask for a bio. And some of the models will get most of the details right, they might say I went to a different university or whatever. And some of them will just hallucinate wildly, and so I've had models telling me that I co-founded GitHub and things like that. And it's amusing, but it's also quite good as a sort of initial sniff test to see, okay, how good is this model when it comes to hallucination and that kind of thing? Okay, super.
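
For anyone who wants to follow along, here is a rough sketch of the commands Simon describes. The plugin and model names below are illustrative examples rather than a definitive list; check the llm documentation for what is currently published.

# install the CLI and add an OpenAI API key
pipx install llm
llm keys set openai

# run a prompt directly, or pipe a file in
llm "Ten fun facts about SQLite"
cat myfile.txt | llm

# install a plugin to add more models, then pick one with -m
llm install llm-gpt4all              # example plugin that adds local models
llm models                           # list the model IDs now available
llm -m orca-mini-3b "Say hello"      # placeholder model ID; use one from llm models

# a system prompt tells the model what to do with the piped input
cat myfile.py | llm --system "Write me some unit tests"

# draft release notes from a diff (tag names are placeholders; review before publishing)
git diff 1.0..2.0 | llm -m gpt-4 --system "Write release notes for these changes"

# prompts and responses are logged to a SQLite database
llm logs path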

Will Vincent 53:21
We're coming up on time a little bit, but I want to add one positive note, which I've heard about. We mentioned that these tools could further increase the economic divide, but they are democratizing a lot of things in unexpected ways, at least to me. For example, someone I know is an admissions director at UC Berkeley, and a friend asked that person, hey, what is it like with these college essays now that ChatGPT exists? And he said, it's actually great, because it's an equalizer, because rich kids have had private essay tutors forever, and now everyone has, you know, 80%, 90% of that. I mean, it probably makes them all sound kind of the same anyways. But it's a tool that people who don't have these external resources can use, if they know how to use it, just like Grammarly and all these tools that help improve writing. And so I was pleased with that, because I think it's very easy to get a little doom and gloomy about it. But it is, for almost no money, bringing these resources to so many people who didn't have them before.

Simon Willison 54:23
I couldn't agree more. I feel like we always get very hung up on the many ethical flaws of this technology and the harmful ways it can be used. The positive ways it can be used are just enormous. The reason I'm spending so much time with this tech is that I do believe it's genuinely useful, and it does genuinely provide enormous amounts of value to enormous numbers of people. If you have English as a second language, this tool is phenomenal, right? You're no longer cut out of the parts of your life, the parts of society, where you need to be able to write like somebody who's a native speaker at a certain level of education. That barrier has been completely flattened. People sometimes say, oh, it's not worth learning to program anymore because ChatGPT can just do it, and I think that's complete rubbish. I think now is the best time it's ever been to learn to program. Because anyone who's coached somebody learning to program has seen that the first six months are just utterly horrific. It's so frustrating, because you try something and you get this obscure error message that doesn't make sense to you, and you can bang your head against it for two hours. And maybe you give up, lots of people do give up, they assume that they're not smart enough to learn to program. And it wasn't that they weren't smart enough, it's that they weren't patient enough, and nobody warned them how tedious and stupid this stuff is. And now we can give them a tool, we can say, look, if you get an error message, paste it into ChatGPT and nine times out of ten it will tell you what to do next and how to get out of that condition. That's phenomenal, right? The flattening of that learning curve, getting more people in. My ideal endpoint of all of this is, I think every human being deserves the right to have computers automate stuff for them. I can do this, right, I've got 20 years of programming experience: if there's anything tedious in my life that the computer can automate, I can get it to automate that thing. But it's ridiculous that you need 20 years of experience to do that. That should be a universal human ability, and I think this technology might get us there. I feel like if we get to a point where people are able to get those tedious things automated for them, without having to learn to program first, that feels enormously valuable to me. Yeah,

Carlton Gibson 56:32
absolutely. I'm just lighting up as you spell out that scenario. One of the parallels to that is, why hasn't technology pervaded more deeply throughout the clerical world, for instance? People are still doing things with paper, or manually repeating a task on a computer. Yes, it's in a spreadsheet, but it's not automated. Why not? Because they never picked up those programming skills. But all of a sudden, if these assistants are built into Excel, or built into Word, or built into whatever software they're using, it can be automated. Easily.

Simon Willison 57:07
I heard a horrifying story the other day about a local fire chief, the guy running a fire department, who due to some mess-up had to manually unsubscribe 2,000 people from a mailing list, and he spent a full day clicking the unsubscribe button over and over again in some horrible piece of software. This is somebody who has a very real, very important job to do. And I think this pattern plays out a lot: a lot of people with a lot of important things in life end up stuck for a day doing something tedious and manual, because we haven't given them the tooling that lets them not have to do that. So yeah, I'm really excited about that. I think as an educational assistant it's amazing. I think one thing that isn't necessarily talked about enough is that these things are actually very difficult to use effectively. They feel like they should be easy, because it's just a chatbot, but actually, to really get the best out of them, you have to understand the prompting techniques, you have to know what it can do, what it can't do, what are the things that it's going to break on. I love that we've created computers that are bad at maths and can't look up facts for you, which are the two things computers have always been best at. So people sit down with ChatGPT and, well, it got maths wrong, and I asked it for a fact and it couldn't tell me the answer, because that's not what it's for. But that's really not obvious, you know, but

Will Vincent 58:26
they can do that now, right? Isn't that the whole thing with Sam Altman? One of the things is it can do math now, allegedly, in the latest version?

Simon Willison 58:36
If you give it tools. So ChatGPT, like, the paid version now has access to Bing search, so it can look up facts. And it has access to Code Interpreter, so it can run mathematics using Python. Which, on the one hand, does fill those giant gaps. On the other hand, it makes it even harder to use, because now you have to know what Bing search is, you have to understand bits of Python, you have to know when it's going to reach for those tools. It's got Code Interpreter, it's got vision support, so it can read documents. But the interaction of all of these features is incredibly complicated. A great example is, sometimes I will give it a photograph of a receipt and ask it to add up the numbers in the receipt. And it will then write Python code that imports Tesseract, uses Tesseract OCR to pull out the numbers, and then tries to add them up. But of course, Tesseract isn't as good as GPT vision, right? If it had taken that image and used its own built-in vision to pull the numbers out and then parse the prices, I'd have got a more reliable result. How the heck am I supposed to explain that to anyone? And as they add more features, the matrix of complexity of how the features interact gets even more complicated. Yeah, so gaining expert skills to use this stuff gets harder.
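
As a very rough illustration of the Tesseract route Simon describes, here is a shell equivalent. It is not what Code Interpreter actually writes; it assumes the tesseract CLI is installed, a local photo called receipt.jpg, and that prices look like 12.34, which is a simplifying assumption.

# OCR the receipt, pull out anything that looks like a price, and sum the results
tesseract receipt.jpg stdout | grep -Eo '[0-9]+\.[0-9]{2}' | paste -sd+ - | bc

On a blurry photo the OCR step is exactly where this falls over, which is Simon's point: the model's own vision would often read the numbers more reliably than a bolted-on OCR pipeline.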

Will Vincent 59:52
But I think you said this in a recent interview: using a chatbot is one of the worst user interfaces. It's like the terminal, right? Most people aren't using the terminal. So I guess the final question for me: do you have any predictions on, you know, the mouse equivalent of where this stuff goes? Because we're not all going to be using chatbots forever, are we?

Simon Willison 1:00:11
I certainly hope not. Yeah, like you said, the problem with chatbots is there's no discoverability: you've just got a blank box to start, and they might give you a couple of suggestions, but it's a terrible user interface. But that's the thing that's really exciting about the space right now: there is so much low-hanging fruit. You could sit down and just come up with an alternative UI for interacting with language models, and right now, maybe you'll invent the thing that everyone will be using for the next six years, because we're so early in this process there is so much scope for innovation around how we use these, how we interact with them. And that I find really exciting. I love that people are now beginning to understand this tech and what it can do. But we need designers on this stuff, we need user experience people. It turns out machine learning nerds are the worst possible people to actually make use of this technology. They're thinking in terms of, okay, well, I've got to optimize my gradient descent or whatever. You don't need to know what gradient descent is to innovate on top of language models; that's almost a distraction from what we can achieve with them. Brilliant.

Carlton Gibson 1:01:18
Brilliant. Okay, so we are going over. I did want to ask you about Datasette, but we've kind of run short. So can you give us the 30-second version: what's new in Datasette? We've talked to you about it before, but what's hot?

Simon Willison 1:01:30
The most exciting new feature in Datasette: I've been building a feature called enrichments, where the idea is that you've got, say, a CSV file with 10,000 addresses in it, and you load that into Datasette, and you want to see them on a map. So you need to geocode those addresses. With enrichments, you can have a plugin that lets you select the address column and say geocode this, and it'll go and churn away against the geocoder of your choice, and it'll populate latitude and longitude columns next to it. But crucially, these things are all built as plugins. So you can have a plugin that does geocoding, I've got a plugin that does regular expression extraction of things, and I've got a GPT plugin, so you can say, take this database table, run this prompt against every single row, and then put the output of that prompt in this other column. And as one example of that, I can do the GPT vision thing. So I actually fed it a database table with 100 URLs to images and told it to write me descriptions of those images, and I got back three or four paragraphs per image describing what was in the image, right there in my table. And of course, now I can search against that, and all of that sort of stuff. So I'm really excited about that. It means that Datasette is evolving into more of a data cleanup and manipulation tool, which is a departure: originally it was about publishing and exploring data. But I realized that the problem I most want to solve, especially around journalism, is if somebody gives you 100,000 rows of data, what the heck are you supposed to do with that? Especially if it's slightly too big to put in Microsoft Excel, but you can't afford to hire programmers to build you a custom Django and Postgres app for this thing. What do you do? And if I can build plugin-based tools, especially with Datasette Cloud now, so I can host them for people, where you can upload your CSV file, click geocode, wait a couple of minutes as a progress bar fills in, and now it's all geocoded, now you can visualize it on a map, that's really exciting. And with enrichments, I've tried to make it as easy as possible to write additional enrichments as plugins, so I'm hoping to see people building their own enrichments for all sorts of other data transformations they might want to pull off,
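
For anyone who wants to try the workflow Simon sketches on their own machine, the rough shape is: load a CSV into SQLite, serve it with Datasette, then install an enrichment plugin and run it from the table page. The plugin name below is one example and may differ from what's currently published.

# load a CSV of addresses into SQLite and open it in Datasette
pip install datasette sqlite-utils
sqlite-utils insert addresses.db addresses addresses.csv --csv
datasette addresses.db

# enrichments ship as plugins; this one runs a GPT prompt against each row
datasette install datasette-enrichments-gpt   # example plugin name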

Carlton Gibson 1:03:31
I have to ask just one more question. If you're doing it on Datasette Cloud, sorry, and I've got an enrichment, and I'm going to give you some software: did you find a solution to how you can run my software in a sort of trusted way? Well,

Simon Willison 1:03:45
at the moment, I can review your software and make sure it's okay, and then Datasette Cloud has a feature now where I can basically say, for this customer, install this additional package. So I do have that now. And also, Datasette Cloud, I built it on top of Fly.io precisely because they offer secure containers. So with Datasette Cloud, every customer gets a separate container. So if you somehow managed to, like, screw up the security in your container, it's isolated. That's a problem for you; it's not a problem for other customers in the system, which felt really important to me.

Carlton Gibson 1:04:17
Yeah, okay, good. Good, because I know you've been noodling on that problem for quite a long time.

Simon Willison 1:04:22
Yeah, I still want to be able to run WebAssembly server-side reliably for untrusted code. That's my ultimate goal. Because, yeah, I want users to be able to say, here's some Python code, run this against all of my data to transform it, without risk of them breaking things or whatever. And it feels like we're almost there with WebAssembly, and that would be amazing. If I can take untrusted Python code and run it in a WebAssembly sandbox that's locked down, so it can't do network access and can't reach the file system, that would be amazing. Okay, super.

Will Vincent 1:04:55
We could go for another hour, but thank you for taking the time to talk about all these things. We'll have links to everything: Datasette, Datasette Cloud, enrichments. Those are, I think, the three big things that fans of yours should go take a closer look at if they're not already familiar. Cool.

Simon Willison 1:05:10
Yeah. Thanks. This has been really fun. Yeah, I'll um, I'll put together some links to this as well.

Carlton Gibson 1:05:15
Okay. Thanks, Simon. Thanks so much for coming. That was really awesome, illuminating, and, you know, filled in so many questions around a really hot topic. So, super.

Will Vincent 1:05:22
Thank you, everyone, for listening. We're at djangochat.com, and we'll see you next time. Bye bye.

Carlton Gibson 1:05:26
Bye bye.