Transcript 108: Sam Tobin-Hochstadt

THE COGNICAST TRANSCRIPTS

EPISODE 108

In this episode we talk to Sam Tobin-Hochstadt about Racket and Clojure. 

The complete transcript of this episode is below.

The audio of this episode of The Cognicast is available here.

Transcript

Hello, and welcome to Episode 108 of The Cognicast, a podcast by Cognitect, Inc. about software and the people who create it.  I'm your host, Craig Andera.  

Okay.  Well, all these things I'm about to tell you about are events happening in 2016, just in case you happen to be listening to this episode down the road.  The first thing I want to tell you about is Stu Halloway is going to be speaking about clojure.spec at two different conferences.   The first is Strange Loop, which is held in St. Louis, Missouri, September 16th and 17th, and the second is JavaOne, just after that, September 18th through the 22nd.  

Again, he'll be talking about clojure.spec at both of those.  The title of the talk is Agility and Robustness: clojure.spec.  Stu is really on fire for spec.  It's really cool to see.  He's got a lot of energy, always an entertaining and thoughtful speaker, so check those out if you're able.  

Speaking of conferences, The Conj is coming up.  The Conj's location and dates have been announced.  That'll be held December 1st through 3rd in Austin, Texas.  It's a new location for The Conj.  It should be pretty fun.  I'm going to be there, so I'll see you there.  

The call for proposals closes Friday, September 16th.  So if you are thinking about submitting a talk, and you should, make sure you get that in by Friday, September 16th.  

There's a ClojureBridge coming up in Pittsburgh September 9th and 10th.  You should check that out, ClojureBridge.org.  

And there's an interesting workshop being held in North London.  It's about 3D printing.  The title of the workshop is Clojure and thi.ng workshop.  This is a workshop that will talk a little bit about how to do procedural generation of object models, which you can then print.  I've actually done this and used Clojure to do it, and it's a really good fit.  If you happen to be in the North London area, that workshop is going to be September 10th and 11th.  If you look for Clojure and thi.ng workshop, I'm sure you'll find out more information about that.

Back on the conference trail, there's a ClojuTRE, which as we well know, I have difficulty knowing how to pronounce, but that's okay.  ClojuTRE 2016, that's going to be held September 10th in Tampere, Finland.  This is a free conference that's put on by Metosin, so it's got a single track, a late start, short talks--they're only 20 minutes long--and kind of a cool party--I'm told--afterwards, so that sounds like a really interesting event.  Certainly they announce it as being open to both newbies and seasoned Clojurists, so I highly encourage you to check that out if you happen to be anywhere near or can get to Tampere, Finland.  

That's all I've got for you in terms of announcements.  We'll go on to Episode 108 of The Cognicast.

[Music: "Thumbs Up (for Rock N' Roll)" by Kill the Noise and Feed Me]

CRAIG:    We're all good.  Like I said, very casual.  All right.  

SAM:    Okay.

CRAIG:    Cool.  All right, so welcome, everybody.  Today is Wednesday, August 3rd, 2016, and this is The Cognicast.  Today we're very pleased to have with us a member and contributor, a member of the Racket community, a contributor to Racket, an assistant professor at Indiana University, Professor Sam Tobin-Hochstadt.  Welcome to the show, sir.

SAM:    Thank you very much, Craig.  I'm really happy to be here.

CRAIG:    We are thrilled to have you.  We've had Matthew Flatt on a couple times.   He's always been great.  We kind of circled around again to say, "Oh, we should talk to somebody in the Racket community again.  They're always so awesome," and your name came up.  We're like, "Oh, yeah, yeah.  He's done some interesting work," so we are thrilled that you were able to take the time to come and talk to us today.  

But before we get into matters technical, most likely anyway, there's a question that we always ask at the beginning of the show.  I warned you about this.  This is the question about art.  We ask our guests to share some experience of art, whatever that means to them.  I wonder what experience you'd like to share with us today.

SAM:    I want to share an experience of architecture.  In the city of Barcelona, which I got to visit in 1998, there's this incredible profusion of architecture by the modernist architect Antoni Gaudi.  He designed a huge number of buildings, as well as everything from the Cathedral that they've only just finished in the last few years, after something like 80 years of construction, to the sidewalk tiles in the city.  

His style is very flowing, natural, curvy, and really distinctive and unlike the sort of architecture that you see just about anywhere else.  One of the incredible things about being in Barcelona is walking around and just coming around a corner and you'll see this amazing building with these remarkable colors and shapes that just seems like it's been poured into place just a few moments ago. 

CRAIG:    Hmm.  Now I've been to Spain briefly, and I visited at least one cathedral there.  It's been a while.  It was the one where Christopher Columbus is entombed.  I forget the name of it.  It's huge.  I think it's in Seville, but I think of a cathedral, which I believe is one of the buildings you said you were talking about.

SAM:    Yeah.

CRAIG:    And I always think of the sort of classic giant, Notre Dame style, huge, impressive, dominating stone architecture with huge ceilings and very medieval in feel or seem, or at least renaissance.  Is the one you're talking about at all like that, or is it radically different?

SAM:    It brings in a lot of the sort of ideas that you can see in classic, medieval cathedrals, but also a lot of really different, distinctive ideas to Gaudi.  The cathedral is called the Sagrada Familia, and the thing that's most distinctive about it is that it looks sort of like a sandcastle, the way that if you go to the beach and you make a sandcastle by, like, taking wet sand and pouring it to make towers that are sort of futuristic and drippy and that sort of thing.  That's what this truly enormous cathedral looks like is it has many, many towers and spires that look like they're poured from wet sand.  

CRAIG:    Very cool.  I'll have to take a look on the Web for images of that.  That sounds very, very cool.  

SAM:    Yeah.

CRAIG:    Thanks for sharing that.  Well, awesome.  I mentioned that you're a member of the Racket community or contributor.  We've had another member on a couple times, Matthew Flatt.  He was great.  We were thrilled to have him on.  We're thrilled to have you on as well.  I wonder if you could, since this is kind of how we know you, talk a little bit about your involvement in Racket and just kind of what you do over there, if you will.  

SAM:    Sure.  The primary thing that I do that I've worked on in Racket is a system called Typed Racket, which is a system that I originally created when I was a graduate student and that I've continued to develop over the past ten or so years and that's now got a lot of other contributors besides me, which is very gratifying.  The basic idea of Typed Racket is that it's a way to take your existing Racket program and add some static types like you would see in a statically typed functional language like ML, Haskell, Rust, Swift, or F#, and you can type check your program.  Not only can you do that for your whole program, but you can also just do it for part of your program.  That's an idea called gradual typing that says that you can add these types to just part of your program and the whole thing still works even though you've only changed one or two components.  That's the biggest thing that I work on in the Racket community, but I also work on tons of other things ranging from continuous integration to bug databases to pattern matching libraries to who knows what else.  Those are some of the things that I work on, but primarily the thing that I've spent the most time on is working on Typed Racket.

CRAIG:    Mm-hmm.  We've had Ambrose Bonnaire-Sergeant on the show and, of course, he is the author of Typed Clojure, and he's very clear that he totally took many, many ideas and I can't remember what word he used, but he might even have said stole or certainly took major inspiration from Typed Racket.  I know that the two of you are well familiar with each other.

SAM:    Yeah.  Ambrose is currently my graduate student at Indiana University.  He's been here for a few years now, and we've been working together, actually, on improving Typed Clojure and figuring out what additional ideas can go back and forth between Typed Clojure and Typed Racket, and how the different systems can inform each other and what new experiments we can do in each system that can help the other one grow.  That's been a really great experience for me to learn how these ideas play out in a different language and a different community where people have different priorities, where people are looking for different things, where the language is very different in some important ways, and that changes how all of the ways you put together the type system fit together and what you want to do, how you want to design things, so that it can fit best into that language that you're working with.

CRAIG:    I'd love to hear more about that.  I think there are any number of reasons for the two communities to learn from each other.  I always have a great time every year when I go to RacketCon, even though I don't really do any Racket programming at all.  I have a Lisp that I use for work and it's what I'm familiar with, but I still really enjoy hearing from Racketeers, I guess is the word.

SAM:    Yeah.

CRAIG:    And hearing about Racket.  It's just fascinating to me.  I'd love to hear some of those insights that you've been gaining, both about the languages and the ways that they differ and the ways that they could learn from each other, and specifically about Typed Racket and Typed Clojure.  What have you discovered?  What have you found?  What's been interesting about those interactions, comparisons, and that type of thing?

SAM:    Sure.  There's a bunch of examples.  One that comes up really at the beginning when you start thinking about comparing the languages is just how people write data structures in the two different languages.  In Clojure, people usually write data structures using maps with symbol keys.  That's a really common way of using the lightweight syntax that Clojure gives you and making things that are flexible and can be extended later, and making use of the high quality data structures that Clojure provides out of the box.  

That's a really common pattern for writing data structures in a lightweight way whereas, in Racket, the structure definition form is used almost all the time when you want to create a new data structure, and so that's, I think, analogous to a defrecord in Clojure.  That's used really heavily.  People almost never use hashes when they want to build their own data structure unless they really need it to be extensible in some particular way or easily able to be transmitted over a network or something like that, or serialized to JSON or something like that.  

The type system for hashes in Typed Racket is pretty boring.  It's basically like what you would expect if you knew about hash tables in any other language.  You have a type for the keys, and you have a type for the values.  That's about it.  

Whereas, in Typed Clojure, that's totally insufficient to support the way people actually program in Clojure.  So one of the things that Ambrose developed when he started working on Typed Clojure was a pretty sophisticated type system that can track interesting uses of maps in Clojure saying, well, this key is going to have this type.  This key is going to have that type, and so on and so forth.  Also, additional features that are saying these keys are definitively here in the map.  These other keys are optional.  And if they're there, this is the type they have, but they might not be present.  These keys are definitely not in the map, and that turns out to be important for other uses where people check, for example, that they've done an appropriate data structure conversion by looking at the fact that some key is no longer there in some map.
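
[Editor's note: the three kinds of key information Sam describes (keys that must be present, keys that may be present, and keys that must be absent) can be sketched roughly as below.  This is a Python analogy, not Typed Clojure.  It checks at runtime, whereas Typed Clojure tracks this information statically, and REQUIRED, OPTIONAL, ABSENT, check_shape, and the key names are all hypothetical names chosen for illustration.]

```python
from typing import Any, Dict, Tuple

# Hypothetical description of a map "shape": which keys must be present,
# which may be present, and which must not be present at all.
REQUIRED = {"first-name": str, "last-name": str}
OPTIONAL = {"address": str}
ABSENT = {"raw-input"}


def check_shape(m: Dict[str, Any]) -> Tuple[bool, str]:
    """Return (ok, reason) for a map checked against the shape above."""
    for key, expected in REQUIRED.items():
        if key not in m:
            return False, f"missing required key {key!r}"
        if not isinstance(m[key], expected):
            return False, f"key {key!r} has the wrong type"
    for key, expected in OPTIONAL.items():
        # An optional key may be missing, but if present it must be well typed.
        if key in m and not isinstance(m[key], expected):
            return False, f"optional key {key!r} has the wrong type"
    for key in ABSENT:
        # Knowing a key is gone is how a completed conversion can be confirmed.
        if key in m:
            return False, f"key {key!r} must not be present"
    return True, "ok"
```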

CRAIG:    You mentioned the Racket structure definition.  You said it's somewhat analogous to a defrecord in Clojure.  The thing that makes me wonder is when you define a structure in Racket, is the resulting value open?  In other words, can I associate additional values in there that weren't part of the original structure?  

SAM:    No, you can't.  

CRAIG:    Okay.  That's an interesting property of records in Clojure is that you can.  

SAM:    Ah, interesting.

CRAIG:    Yeah, so you can say a person consists of a first name, a last name, and an address.  Then I can hand you that value.  You can take that value, and you could associate on favorite food if you like, and it remains.  It retains its type identity, but it still has all the keys, so it is truly associative.  In most ways a record is actually like a map in Clojure with the addition of having a distinct type identity.  

That's interesting because it does point out.  It goes right back to what you're saying about the differences in the way that you program it, and I find it interesting that you ran up against that and mentioned it as one of the first things that you encountered when you were thinking about the differences.  
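
[Editor's note: the open-record behavior Craig describes can be sketched roughly as follows.  This is a Python analogy, not Clojure.  Person, assoc, get, and the extras dictionary are hypothetical names chosen for illustration; extras stands in for keys that were not part of the original definition.]

```python
from dataclasses import dataclass, field, replace
from typing import Any, Dict


@dataclass(frozen=True)
class Person:
    """Hypothetical record: two declared fields plus room for extra keys."""
    first_name: str
    last_name: str
    extras: Dict[str, Any] = field(default_factory=dict)

    def assoc(self, key: str, value: Any) -> "Person":
        """Return a new Person with key set; the original is left unchanged."""
        if key in ("first_name", "last_name"):
            return replace(self, **{key: value})
        return replace(self, extras={**self.extras, key: value})

    def get(self, key: str, default: Any = None) -> Any:
        """Look up a declared field or an extra key in the same way."""
        if key in ("first_name", "last_name"):
            return getattr(self, key)
        return self.extras.get(key, default)


p1 = Person("Ada", "Lovelace")
p2 = p1.assoc("favorite_food", "pie")  # still a Person, now with an extra key
```

[In this sketch p2 keeps its type identity as a Person while carrying the additional key.  A closed structure, as Sam describes for Racket, would simply have no equivalent of the extras slot.]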

SAM:    Yeah.  That's one of the big differences.  There are lots of other differences that come up based on how people program in Clojure and how the languages are different.  

Another big difference that manifests itself sort of less obviously, but is really fundamental to how the two systems, Typed Racket and Typed Clojure, work differently is that Clojure is really oriented around a sort of traditional Lisp style top level where you're evaluating a series of forms.  You might switch namespaces at various points, declare various things, but really loading in a new file is evaluating a series of forms.  Whereas in Racket, almost everything is done via the module system, which really isolates things a lot more.  There's no redefinition, and modules are really pretty isolated from each other.  You can't go into someone else's module.  You can't switch to a different -- switch in the middle of something to a different module or do any of those things.  

That also means that, for example, Typed Racket is implemented as part of a module system.  It's just a normal use of the module system whereas Typed Clojure has to sort of hook in at a lower level where it's intercepting exactly the data that the read-eval-print loop gets so that it can transform things.  It has to be sort of outside the language like that.  That ends up being sort of different ways that the systems have to hook into the language depending on where those extension points are and what the sort of interaction style is.

CRAIG:    Yeah, so this is something that I need to understand better.  Matthew also mentioned the module system.  It is clearly different from the way that Clojure handles these things. 

I'm ignorant, so I'm going to ask stupid questions, but hearing you describe it, you mention things like you can't go into someone's module and modify it and whatnot.  I wonder.  I can't imagine that it's the case that Racket would not consider itself to be a dynamic environment.  It is a dynamic programming language where you can go in, and you can kind of mold the program as you go.  But I'm trying to figure out how to reconcile.

First of all, make sure I'm right about that.  Second of all, reconcile that with the statement about modules and their self-containedness, if you will.  Does that make any sense that I'm seeing a conflict?

SAM:    Yeah.  

CRAIG:    Yeah.

SAM:    I think the definition of what a dynamic programming language is is always a potentially contentious one.

CRAIG:    Sure.

SAM:    What I'll say is that, in Racket, you can, using reflective techniques, go in, switch into a context that looks like you're … so that you appear to be inside some pre-existing module and you can work in that context, see the things that they haven't exported from that module, and that sort of thing.  

CRAIG:    Mm-hmm.

SAM:    But there's a couple things that you don't get to do, so one is you don't get to change the exports of or the definitions from someone else's module.  If they haven't given you a way to change something, you don't get to change it.  The second thing is that you can, in Racket, provide people with access to code, but with sort of a lower, more restricted set of reflective capabilities.  For example, the standard Racket IDE called DrRacket is implemented in Racket itself.  And one of the things that I think is distinctive about it is that everything is running in the same Racket virtual machine: the IDE and the program you're writing.  

CRAIG:    Hmm.

SAM:    Racket, with sort of the only other exception being the Web and JavaScript, has really tried to stick with a single process.  Everything is shared between the sort of IDE and the program model, whereas other IDEs have mostly gone to a shell out and exec the program separately the way you would do in Emacs, but also in Eclipse to actually run your program.  That means that we want to have these protection mechanisms so that you can't just reach into the IDE and change the behavior of the IDE from within the program running in the IDE.  Of course, you can write plugins for the IDE with which you can do anything, but there's a separation just like in an operating system between the user program and the system itself.  And so there really are those strong barriers that are real abstraction barriers and that you can't get around.  

What this means for Racket's status as a dynamic language seems like a question mostly of definitions, but you can certainly write your own read-eval-print loop.  You can load code dynamically.  You can do all that sort of thing.  But there are ways in which you can't change the internals of other people's components, libraries, systems, et cetera, and ways in which you can run code with a restricted set of permissions so that they can't escape out of a sandbox.  

CRAIG:    I'm going to take a little bit of a left turn here.  Hopefully you'll see how it's related.  There's a problem when we program in Clojure that I'm wondering if there's a solution for in Racket.  If there is, then that would be awesome to steal it, as we have with other good ideas.  

The issue is it's tied up with the fact that the way that we tend to organize programs in Clojure is with this kind of one-to-one-to-one relationship between projects, artifacts, and Git repositories.  In other words, when I'm writing a program, I might be building a system.  It consists of two or three applications.  Maybe there are services that I'm writing, and there's some code in common.  

I've got my application A, application B, and then I've got code in common.  I'm going to put in library C.  I tend to, as a Clojure programmer, stick each of those in its own Git repository and have each one of those contain a single project that is capable of producing exactly one artifact: application A, application B, or library C.  

SAM:    Sure.

CRAIG:    And so then working with those things becomes fairly awkward for a variety of reasons, one of which is simply from an administration standpoint around version control.  If you want to make a change that encompasses some change in the library, but also some change in the two applications to make use of it, then there's really no atomic record that you can create when you split them across three different Git repositories like that.  

SAM:    Definitely, yeah.

CRAIG:    Right?  Of course, arbitrarily complex variations on that scenario are possible where you have libraries that make use of libraries that make use of libraries.  I feel like that's something that there's got to be a better answer.  There's got to be a way.  

For starters, you could stick all the things in a single Git repository and manage things with branches where there are version conflicts, et cetera.  But that has its own administration overhead and isn't necessarily easier to deal with.  But I also feel like it touches on that correspondence between projects and libraries as well that we have this idea that, oh, this is the application project and it only talks about how you would produce that one artifact, the application artifact.  It's not that there's no way to do it.  It's just that people aren't doing it commonly.  It seems like there's no easy way to do it.  

I realize this is a very vague question, but I'm wondering if Racket has anything to say on this topic.

SAM:    I definitely run into this and, just as background, about maybe a year or a year and a half ago we actually split up the main Racket development repository into a large number of smaller repositories to give ourselves more of this problem.  The reason we did this was that the Racket project, as a whole, had grown a huge amount over 20 years.  It had originally started as really one sort of thing where everything fit together nicely, and you wanted to develop everything in concert.  But after 20 years, it had everything from Typed Racket to the runtime system and the compiler to the IDE to the Web server to the code that built the website all in the same repository.  That has some advantages, but it also has some disadvantages, especially around encouraging open source contribution from outside.  Not everyone needs to deal with the huge repository and coordinating their changes with everyone else. 

We decided to split everything up, and I think that's worked pretty well for us in what we wanted to get out of it.  Although, recently we realized that there were some pieces where splitting them away from the core sort of virtual machine and minimal standard library was a mistake because we were ending up doing exactly those synchronized changes that you described and that was problematic, and so we moved a few things back.  But I'll say both that we don't really have a solution to this in Racket, and I think there is an interesting problem here, which is how do sort of in-language features like libraries coordinate with IDE and development time features like projects, builds, and I'm building this application, and then coordinate with things like version control.  I think there's a lot of room to integrate those pieces and do a better job sort of across that whole stack of making our lives better as developers.

If you look at, for example, what people have done in the Smalltalk community with images and people have even built version control systems that integrate with the language and with the image.  You can get some real power out of integrating across all those concerns.  That's one direction to go, but we haven't done that.  We've stuck with text files, Git, and the usual sorts of things.

CRAIG:    Mm-hmm.

SAM:    Which have all their own strengths.  The one thing I will say is that we have done something that's nice that I haven't seen in a lot of other package systems, which is a pretty tight integration between the Git workflow for development and the package manager.  The package manager for Racket knows how to update packages that are installed by doing a Git clone, so it can run Git pull for you.  It can install multiple packages from the same Git repository, so you have one Git repository with several subdirectories, and you can say, "I want to install these packages.  They're all part of the same Git repository."  It'll share that.  It will even warn you when you're about to do something where you install two different packages that really seem like they belong to the same upstream, in the same Git repository, and you're going to have them in two separate repositories or one in a Git repository and one just installed from a zip file or something like that.  It'll offer to coalesce those so that that works better.

There are some nice tooling steps that you can take to make working with the kinds of projects that you describe somewhat better.  But I don't think we or anyone that I know of has really a solution to those problems.  This is ultimately why people like Facebook and Google are really excited about their single monolithic repository.  There's definitely some value there, but there's also some big drawbacks there, especially if any part of what you're working with is open source and things that external people can contribute to.

CRAIG:    Yeah.  The tradeoff there for if I'm not doing an open source thing, even inside a company, if I'm working with a team sufficiently large, the number of branches you have to maintain can go up.   If you make any incompatible changes, well, that's one of the nice things about having separate artifacts and projects and repositories.  Maybe not repositories, but having separate artifacts and a way to refer to versions of them is that you can keep them independent and application A can move forward on version 2 of the library while application B sticks with version 1 until such time as it can be upgraded as well.

SAM:    Yeah.  

CRAIG:    Yeah.

SAM:    Definitely.

CRAIG:    Totally.  All right.  I guess take it as a compliment that I was hoping that the clever, clever folks in the Racket world would have solved this very hard problem.

SAM:    Yeah, it's definitely something we run into, but it's not something that we've solved yet.

CRAIG:    Okay.  All right, well, cool.  We will certainly be keeping our eyes on you.  Like I said, I always keep an ear to the ground because I have a lot of respect for the Racket folks, and I know that you've got a lot of big brains over there working on interesting problems to hopefully include this one at some point.  

Speaking of the collection of smart people, I wonder if we could take a minute to talk about RacketCon a bit.  I actually don't know if you're involved in organizing.  I know I've run into you there a couple times, but what's going down at RacketCon this year, if you're aware?

SAM:    Yeah, so for anyone among your listeners who doesn't know, RacketCon is an annual, one-day conference for Racket, and it's held the day after Strange Loop in St. Louis in the same hotel that Strange Loop is at.  That's a great opportunity if you're already going to Strange Loop to spend an extra day and see some really cool stuff that's going on there.  

What in particular is going on in RacketCon, I'm a little bit involved in the organization, which is primarily done by Vincent St-Amour, who is a researcher at Northwestern University, who is the primary organizer of RacketCon, but there's some really cool talks coming up.  The keynote is being given by Emina Torlak.  Emina is a professor at the University of Washington.  She's built this really amazing tool called Rosette.

What Rosette does is it lets you build a little domain specific language using Racket, but then it automatically integrates whatever language you build with tools, automated solvers from the formal methods community that let you do things like automatically verify that your programs in the domain specific language you've written satisfy a certain specification or even you just write down a specification or some examples and it'll automatically synthesize a program in your domain specific language that meets the specification that you want.  They've used this for a bunch of really cool applications ranging from things like synthesizing web page scrapers, so you give it some example web pages and what output you want to find on those web pages, and it'll write a program for you that does that.  Or, in a totally different space, figuring out how to write programs for these incredibly low power chips made by a company called GreenArrays, which is the company started by Chuck Moore, the inventor of Forth.  These are chips that have 144 cores, I think, extremely weird instruction set, and are very hard to program.  

And so what they did, what Emina and her collaborators did was come up with a way to generate these programs by using the tools that she's built in Racket.  That's the keynote.  She's done some amazing stuff, and she'll be giving a really cool talk at RacketCon.

CRAIG:    Yeah.  I'm really looking forward to it.  I go.  I've gone the last two years, and I'm signed up for this year.  I should point out to our listeners that tickets are incredibly affordable.  I want to say they're in the neighborhood of $50.  I don't have the web page in front of me.

SAM:    Yes.  I have the web page right here, and tickets are $45.

CRAIG:    There you go.  No reason not to go if you're anywhere in the neighborhood, and I would venture to say that if you can get yourself to St. Louis that it's completely worth the trip, even if you don't go to Strange Loop, which of course is also awesome.  

SAM:    Yeah.  Yeah, so some other cool talks that people are going to be giving are about building simulators for populations to do social science research or how to use the macro system in Racket to express little domain specific type systems, or one of the talks that I'm really looking forward to is by a guy named Matthew Butterick, who is a typographer and has done a lot of work in making lots of parts of Racket look nice, but is in the process of writing a book about writing domain specific languages in Racket.  And he's built his own domain specific language for writing books and web pages, which is actually used to build the RacketCon web page.  He's going to be talking about all the things that he's learned and this process he had to write this really cool book that's coming up.  

Then one other cool talk is about writing generative art with Racket building rules and using that to generate repeated patterns that turn into some really cool art.  

CRAIG:    Very cool stuff.  Yeah.  It's just so fun.  It's a really, really great conference, good energy.  The people are super nice, and I just always feel like the dumbest person in the room in a very good way.  Like I said, I encourage people to go, and I will definitely be there.  Awesome.  

I wonder if we could turn for a moment.  Actually, I want to ask you a question about something that we're excited about in the Clojure world right now.  Maybe you haven't had a chance to look at it.  If that's the case, that's fine.  There are other things we can talk about.  

I'm talking about clojure.spec.  There's a fairly recent release that Rich Hickey came up with.  As he always credits, it's, of course, built on a number of other ideas.  It's this library that we are in the process of shipping.  It'll be out some time before too long.  Have you had a chance to look at this at all?

SAM:    Yeah, so I've looked some at clojure.spec, and it seems really cool.  It builds on a lot of things that we've done and been working on, for example in the Racket community, about what we call contracts.  That's something that I've done a lot with and that is a big part of how Typed Racket works.  It's just a big piece of how we work in Racket all the time.

If you look at Racket documentation, every function is documented by giving the contracts for its inputs and its outputs.  That's a really big piece of how we program and it's really cool to see other people giving their own spin on that idea and using it in new languages.

I should say that contracts are definitely not something that was invented in Racket.  It goes way back and was first sort of popularized in a language called Eiffel by Bertrand Meyer.  But Racket has really taken it further, in particular focusing on how to handle higher order functions and things like that.  And how to really build it comprehensively and make sure it performs well, which is something else that I've worked a bunch on.
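
[Editor's note: a rough sketch of why higher order functions need special handling.  A function passed as an argument cannot be fully checked at the moment it is handed over; it can only be wrapped so that each later call is checked.  This is a Python analogy, not Racket's contract system, and ContractViolation, int_to_int, apply_twice, and the blame strings are hypothetical names chosen for illustration.]

```python
from typing import Callable


class ContractViolation(Exception):
    """Raised when one side of an agreed interface misbehaves."""


def int_to_int(f: Callable, blame: str) -> Callable:
    """Wrap f so that every call is checked against an int -> int contract."""
    def wrapped(x):
        if not isinstance(x, int):
            raise ContractViolation(f"caller of {blame} passed a non-int argument")
        result = f(x)
        if not isinstance(result, int):
            raise ContractViolation(f"{blame} returned a non-int result")
        return result
    return wrapped


def apply_twice(f: Callable, x: int) -> int:
    """Library function that expects f to map an int to an int."""
    # f cannot be inspected up front, so it is wrapped and checked on each call.
    checked = int_to_int(f, blame="the function passed to apply_twice")
    return checked(checked(x))
```

[Calling apply_twice with a well-behaved function simply returns the result, while a function that returns a string is reported only when it is actually called, with the message naming the party at fault.]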

CRAIG:    Mm-hmm.  Yeah, it's great to hear you say that that's a big part of how you work in Racket is to make use of contracts because spec is pretty much brand new.  There was an amusing tweet, so I think Rich announced spec on, like, a Tuesday.  On Wednesday I saw a tweet that said, "Looking for Clojure engineer.  Must have three years experience with spec," right?

SAM:    Yeah.

CRAIG:    The point is that as Clojurists, we haven't had spec around for a long time.  I myself have built zero production systems using spec, and I suspect that's the median, the mode, and the mean for everybody, right?

SAM:    Yeah.

CRAIG:    And so I'd love to get your take on any advice you think we might be able to leverage from your experience since you've been able to work with this approach for so long.  You know what I mean?  How do I approach writing a system with contracts?  How should I not overuse them?  What are the mistakes that I can make?  What are the things I definitely should do?  Anything you can offer us and help us get up that curve would be great.

SAM:    Yes.  I'll offer a couple pieces of advice about this.  One is that the way we think about contracts in the Racket community, and I think this has been really helpful, is really focusing on boundaries.  Here are two components that are talking to each other.  What's their interface, and how can I document what that interface is?  That's really an incredibly valuable thing to have written down in your program.

The other piece is that if you're using these specs to check that your values actually conform to things and you're doing that at runtime, if you've put those things at the boundaries between libraries instead of in the middle of your inner loop, then often you can run with many of those checks on, even in production, which means that you'll figure out what's going wrong when somebody violates one of those abstractions.  I would say thinking about boundaries and thinking about that as the number one place you'd want to put these specifications, that's been a really big thing for us in Racket.  

The second thing that I'll say is that it's really possible to get a huge amount of value out of contracts or out of spec without writing a really detailed specification for things.  If you just write a specification that says this takes two inputs and they're both maps and it produces a map, then that's already a big step forward even though you haven't specified what's in these maps, what are the keys, what are the values, or really any level of detail, but you can still get a big win.  Your clients will have an easier time understanding how to consume your library.  You'll have an easier time documenting things.  You'll be able to test things more easily and that sort of thing.  That's the second piece of advice.
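[Editor's illustration, not part of the conversation: a minimal Python sketch of the two pieces of advice above, a deliberately coarse contract ("two maps in, one map out") checked only at a library boundary.  The `contract` decorator and the `merge_settings` function are hypothetical names invented for this sketch; this is neither clojure.spec nor Racket's contract system, only the general idea.]

```python
from functools import wraps


def contract(*arg_types, returns):
    """Hypothetical helper: check argument and return types at call time."""
    def decorate(fn):
        @wraps(fn)
        def checked(*args):
            if len(args) != len(arg_types):
                raise TypeError(
                    f"{fn.__name__}: expected {len(arg_types)} arguments, "
                    f"got {len(args)}"
                )
            for position, (value, expected) in enumerate(zip(args, arg_types)):
                if not isinstance(value, expected):
                    raise TypeError(
                        f"{fn.__name__}: argument {position} should be "
                        f"{expected.__name__}, got {type(value).__name__}"
                    )
            result = fn(*args)
            if not isinstance(result, returns):
                raise TypeError(
                    f"{fn.__name__}: result should be {returns.__name__}, "
                    f"got {type(result).__name__}"
                )
            return result
        return checked
    return decorate


# The public entry point of an imagined library: the boundary where the
# check lives. Nothing is said about keys or values, only "maps in, map out".
@contract(dict, dict, returns=dict)
def merge_settings(defaults, overrides):
    return {**defaults, **overrides}
```

[A caller that passes a list of pairs instead of a map gets a `TypeError` naming `merge_settings` and the offending argument, rather than a failure somewhere deeper in the library.  Internal helper functions would carry no such check, which is what keeps the cost low enough to leave on.]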

The third is that there keep being additional places where you can use this information.  Some of the ones that we've discovered over time are already built right into spec, which is really cool to see.  For example, random generation of inputs based on the specification, so you can do random testing based solely on writing down a specification.  Now we know how to generate some input for this and we can test it.  That's really cool to see.  
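[Editor's illustration, not part of the conversation: a rough Python sketch of generating random test inputs from a specification.  The spec format (a field name mapped to a generator and a predicate), `POINT_SPEC`, `mirror`, and `check_function` are all assumptions made up for this sketch; clojure.spec and Racket each have their own, richer mechanisms.]

```python
import random

# Hypothetical spec format: field name -> (generator, predicate).
# The generator makes a random conforming value; the predicate checks one.
POINT_SPEC = {
    "x": (lambda rng: rng.randint(-100, 100), lambda v: isinstance(v, int)),
    "y": (lambda rng: rng.randint(-100, 100), lambda v: isinstance(v, int)),
}


def conforms(spec, value):
    """True if value is a dict with exactly the spec's keys, each valid."""
    return (
        isinstance(value, dict)
        and value.keys() == spec.keys()
        and all(check(value[key]) for key, (_, check) in spec.items())
    )


def generate(spec, rng):
    """Build one random value that conforms to the spec."""
    return {key: gen(rng) for key, (gen, _) in spec.items()}


def mirror(point):
    """Function under test: reflect a point across the y axis."""
    return {"x": -point["x"], "y": point["y"]}


def check_function(fn, arg_spec, ret_spec, trials=100, seed=0):
    """Call fn on generated inputs; return a failing input, or None."""
    rng = random.Random(seed)
    for _ in range(trials):
        arg = generate(arg_spec, rng)
        if not conforms(ret_spec, fn(arg)):
            return arg  # counterexample: the result violated ret_spec
    return None
```

[Here `check_function(mirror, POINT_SPEC, POINT_SPEC)` finds no counterexample, while a function that returned a malformed point would be reported along with the input that exposed it.  The only thing written by hand is the specification; the test inputs come from it.]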

But things like turning contracts automatically into documentation or providing people the ability to print out the contracts for values at the REPL, or other things like that, or listing that information when you're doing auto complete in your IDE.  All those sorts of additional places where you can use that specification turn out to be really valuable.  I'm sure we're going to discover more, and I'm sure the Clojure community will discover more as well, as you folks get more experience with spec. 

CRAIG:    Mm-hmm.  One of the things that struck me about your description, and this matches with my initial explorations of spec, is that I've been surprised by the extent to which the balance between human oriented leverage of spec and machine oriented leverage of spec--and I'll explain that in a second--has tipped towards the human.  In other words, I write these specs and I can do a couple things with them.  I can use them to either generate documentation or simply read them myself.  Or I can enable checking, and so I kind of think of those as being human oriented, an aim towards understanding, or machine oriented where I'm trying to verify that program behavior is correct in some sense.

For me, I've found so far that the balance has been heavier towards the human usage.  In other words, I felt like I've won more on that side than I've won on the other side.  Now that might just be a matter of the particular problems I'm solving or the way I'm using it or my inexperience with these technologies.  Does that, though, match up or not match up with your experience?

SAM:    I would say that I think the two sides really fit together.  One of the advantages of something like spec is that you can have both of those sides.  That you have the human side where you look at something, and you have a nice, quick summary of what exactly this function does.  But you also have that machine oriented side so that you have somewhat more trust when you look at that specification that that's really what it's doing because we've all had the experience that you come across a piece of software and it has some comments saying what it does, and those comments bear no resemblance to what the function actually does.

Often you'll see, like, "Fix me.  Update this comment."  That's very frustrating, especially if you spend some time thinking about the problem based on that incorrect idea about what's going on.  One of the great things about things like spec is that you can make sure that that human use stays in sync with what's actually going on in the system, in a way that your comments aren't going to.  But you can also turn the dial as far as you like towards making it easy to write whatever you want, without needing to be really strict about that machine use of the specifications the way you would if you were going all the way to something like a type system.  

CRAIG:    Hmm.  It's a good reminder that it's not really a dichotomy.  You don't have to pick one or the other or pick where on the spectrum you are.  

SAM:    Yeah, and that's, I think, one of the great things about systems like spec.

CRAIG:    There's a couple other things I want to talk to you about.  There was so much of interest in there that I'm sure we could dive further, but one of the things that caught my eye that I don't want to let you go today without digging around a little bit on is on your web page.  You mention that you're interested in the evolution of software.  I think that that is such an evocative and interesting phrase that I want to hear from you in your own words what you think that means.

Obviously we've talked about things like Typed Racket and stuff, and I think that's part of that story.  But I think the phrase to me evokes a much broader vision or set of activities or technologies or whatever.  What does that mean?

SAM:    Yeah, so what that means for me is thinking about how to support programmers as they work on programs over a long period of time.  I originally started thinking about this in the context of Typed Racket where the particular use case I was focused on was something where you have some system that you thought would be simple but that, as often happens, turned out not to be so simple.  You want to make it easier to maintain, and you've decided that having some type checking, at least in some parts of the system, will help you accomplish those maintenance goals.  That's really been the driving force behind the design of Typed Racket.

As you say, this is a much broader idea.  That goes for everything from how do you go from the kinds of programs that we write still these days as bash scripts to evolve them to be real programs written in a real programming language.  I'm sure you've had this experience where you have something where you copy and paste some commands into a bash script or a makefile, and you run them.  Eventually that gets too complicated, and you have to throw it away and write a program in a real programming language like Clojure. 

CRAIG:    Mm-hmm.

SAM:    I think there's a really interesting question about how can we make that process easier for people and what's the right way to do that.  Is it to provide ways of writing those shell scripts in a language that's more like Clojure?  People have tried to do that.  People have come up with shell-like languages for a variety of systems, so there's a system called Scheme Shell, for example, that was aimed at doing shell scripting like tasks, but nothing has ever really approached the convenience and flexibility of shells for that.  And I think there's an interesting question: What is it that allows us to do that so easily, and can we bring those features into a programming language?

Another aspect that I think about is what happens when your language grows some new features and you want to take advantage of them.  How do you do that in a way that doesn't require starting from scratch?  That's a question that I've thought a lot about in the context of my work on JavaScript.  I'm on the standards committee for JavaScript.  It's called TC39, if any of your listeners have heard of that.  We've made, over the past six years that I've been on the committee, a bunch of big, important, and valuable changes to JavaScript.  

For example, we've added a module system to the language.  That's something that I was one of the designers of.  For people who have JavaScript that they wrote before this and they want to move to using JavaScript modules, or they've written code using one of the previous module systems, how can that integrate with new code written using new modules?  How can you evolve your program piece-by-piece to use these new technologies?  And how can you make those pieces work together over time?  

That's something that we thought a lot about when we were designing the system in JavaScript, as well as lots of other features in JavaScript.  How can we make this play nicely with the programs and the idioms and the way people work that they already have and that they don't want to throw away?

CRAIG:    Are there any--?  I'm always looking for guiding principles, rules of thumb, or insights that you've gotten.

SAM:    Yeah, so I think the central guiding principle that I take is that you've really got to look at the way people actually use the system and then adapt your solution to work for the way people actually work.  That applies in all of these examples.  You can't design your type system for Racket or for Clojure without thinking about how people really write programs in those languages, which is going to be different for every language.  There are some common principles that you can learn, but you've really got to get familiar with the programs that people write, the ways that people want to work, and really tackle those things in particular.  

This is an idea that I learned a lot about from early work on a type system for Smalltalk in the early '90s by Gilad Bracha.  He designed this system called Strongtalk that was a system a lot like Typed Racket or core.typed that worked for a Smalltalk system that they were building.  That also had this idea about here's the way people actually use the language and we're going to focus on that.  We're not going to focus on everything that anyone could possibly have done.  We're going to focus on what people actually do, and we're going to support that.  That's been sort of the biggest guiding principle when I think about software evolution: look at the idioms that people actually use.  

CRAIG:    Does that become more challenging in a language like Racket?  One of the taglines anyway is it's a programming language programming language.  This idea in Racket, very strong idea in Racket that you first come up with a language or, as part of the process of solving a problem, you come up with a language that helps you solve that problem.  Then do you wind up with more diversity in the way that people are programming that you have to account for in your solution?

SAM:    I think, yes, this definitely comes up.  This gets into a sort of broader point about different Lisp communities, I think, that I'll get to in a second, but it's definitely the case that, in Racket, people use macros, they use the syntactic extension facilities, and they use the whole of the programmable programming language to shape the language to their liking.  That can certainly make life harder for Typed Racket, and some of the biggest pieces of work we've had to do have been extending the type system to accommodate some of the big abstractions that people have built, like the OO system in Racket, which is built entirely using macros, entirely using these language extension features.  So Typed Racket has to work hard to figure out how that works and how to apply a type system to that.

That certainly makes life harder.  But one thing I will say is that I think Racket, as a community, has done a good job of staying somewhat unified about what new language features we build using the power that the language gives us.  It's not to say that people don't create simple, one off macros a lot because they absolutely do.  But you don't see things like you see in the Scheme community or, I think, in the Common Lisp community where different groups of programmers really program in effectively totally different dialects of the language that they have built themselves.  

There's a lot of pressure to standardize on abstractions like the OO system, etc., that even though they're built with macros and somebody could conceivably build their own, we unify on a particular one as a community and that reduces some of the potential confusion that you can have when switching from one code base to another.  That incidentally also makes life easier for people trying to do things like Typed Racket. 

CRAIG:    Yeah.  Maybe you do.  Maybe you have the same phrase.  At least you have the same idea.  We talk a lot in the Clojure world about idiomatic Clojure.  There's this idea that there is an idiom.  It's obviously not strictly true, but certainly when I go to clients, one of the things that they're often interested in hearing is, "Does my code look like other people's code?  Yes or no?"  

SAM:    Yeah.

CRAIG:    It does in some ways, it doesn't in others.  

SAM:    Yeah.  I think having a community sense for what code is idiomatic is really valuable for keeping the community together and having everybody going in sort of a similar direction so that when you go to a client, everything doesn't suddenly look totally different and alien compared to the last body of code that you were looking at.

CRAIG:    Mm-hmm.  Well, cool.  That's very interesting.  I do see, looking at the clock, that I should probably not keep you terribly much longer.  That said, we have a conversation and I tend to ask a lot of the questions, which means that, to some degree, I'm influencing the direction of the conversation.  I always like to leave room at the end of the show, and we certainly have however much time we need to take, in case there's anything that the guest has that they would like to talk about that hasn't come up yet.  

I don't know if there's anything that you would like to share with our audience or discuss with me today that we haven't hit.  Certainly I would love to have you back on.  This has been an absolutely fascinating conversation, and I think we've got lots more to talk about.  We could do that, but I also want to give you room to talk today, to me and our audience, about anything else that you think we should cover before we go.

SAM:    Hmm.

CRAIG:    No pressure.

SAM:    I don't think I have a particular topic that I want to get to.  I do want to say that I really enjoy the opportunity to talk with people in the Clojure community because I think, in a lot of ways, the Racket and Clojure communities have a lot to learn from each other, that the two communities have a lot in common, but also some really different focuses and ways that that's played out in social aspects, in project approach aspects, in technical aspects, and that that cross-pollination, I think, is really cool and one of the reasons why I really enjoy having RacketCon collocated with Strange Loop where there's a lot of people from the Clojure community who come and who've gotten to talk.  We've had Fogus give a keynote at RacketCon that was really cool about a lot of different things.  That's been a neat sort of cross-pollination that I've really enjoyed.

CRAIG:    Yeah, me too.  I mentioned already that I really enjoy RacketCon, and that's definitely a part of it is the sort of sense of being cousins, right?

SAM:    Yeah. 

CRAIG:    You go to Strange Loop, and Strange Loop is quite a variety of people.  Primarily, I suppose, functional languages, but you get people there working on a ton of different stuff.  It's quite different to something like one of our conferences like the ClojureConj where there's a single language that people are mostly oriented around.  But when I go there, I definitely get the same sense you do, which is that among all that very, very friendly cross-consideration, cross-community interaction, when I go to the Racket people it's like, "Oh, well, yeah.  There are differences here, but at the same time we're more alike than other communities -- than we are like other communities," so I'm just agreeing with what you said.

SAM:    Yeah.  

CRAIG:    Well, cool.  Well, awesome.  That sounds like this makes a great place to wrap up for today.  Certainly love to have you back on.  Always love talking to people in the Racket community, and you have been no exception, sir, by any stretch, so it's been really fun to have you on.  Thanks for taking the time.  I know the life of a professor is a very busy one, so I certainly appreciate you coming on and talking to us today.

SAM:    Well, I really appreciate the opportunity to talk to you and to your listeners, of course.  This has been a really fun conversation and I'd love to do it again some time.

CRAIG:    Excellent.  Well, we'll have to make that happen then.  We'll go ahead and wrap up there, though, and thank you one more time as we go out.  This has been The Cognicast.  

[Music: "Thumbs Up (for Rock N' Roll)" by Kill the Noise and Feed Me]

CRAIG:    You have been listening to The Cognicast.  The Cognicast is a production of Cognitect, Inc.  Cognitect are the makers of Datomic, and we provide consulting services around it, Clojure, and a host of other technologies to businesses ranging from the smallest startups to the Fortune 50.  You can find us on the web at cognitect.com and on Twitter, @Cognitect.  You can subscribe to The Cognicast, listen to past episodes, and view cover art, show notes, and episode transcripts at our home on the web, cognitect.com/podcast.  You can contact the show by tweeting @Cognicast or by emailing us at podcast@cognitect.com.  

Our guest today was Sam Tobin-Hochstadt, on Twitter @SamTH.  Episode cover art is by Michael Parenteau, audio production by Russ Olsen and Daemian Mack.  The Cognicast is produced by Kim Foster.  Our theme music is Thumbs Up (for Rock N' Roll) by Kill the Noise with Feed Me.  I'm your host, Craig Andera.  Thanks for listening.