So Chris is a math geek turned software engineer and data scientist. About a decade ago, Clojure graced Chris with the miracle of simplicity, that's nice, and he has been a passionate devotee ever since. He is currently the research director of the Computational Democracy Project, a non-profit which seeks to make governance a better reflection of the public will, using data science and machine learning and Clojure to map out and synthesize the opinion landscapes around complex issues, helping people better understand the other side.
Yeah, thanks so much for having me. It's great to be here. I really appreciate everyone participating in this conference. It has been a real honor for me to be sharing the stage here, the virtual stage, with Stephen Wolfram and with Gerald Sussman. Obviously, they're both legends in the programming and computer science world. And in particular with Stephen, I was really fortunate that, as part of preparing for the conference, I got a chance to actually give him a demonstration of Clojure, which was pretty surreal, having had Mathematica as my first experience with a programming or computational language. So it's a really cool event this year to be able to be a part of. So thanks everyone for being here. Again, my talk is Scaling Deliberation with Data Science and Clojure.
So this story starts back in 2010 with the Arab Spring. Around this time, you saw people across the Arab and Muslim world rising up to demand change from their governments. Shortly thereafter, we saw the Occupy movements, which started here in the States and sort of trickled around elsewhere. And these were really about looking at the fallout from the financial collapse from just a few years earlier, and how the big banks had gotten bailouts but Main Street America had sort of gotten sold down the river. Or a lot of people felt that way at least. And so there were protests organized to demand that something change.
So a group of friends and I were looking at all of this turmoil and seeing these large, leaderless movements, which were able to get people out into the street but in many ways failed to achieve their ultimate ends. And we started asking the question: how can technology actually help people understand each other at scale? Part of what we realized here is that the technology at the time was good enough to get everyone out in the streets and protesting, but it was unable to bring coherence to the crowd, right? So Twitter and Facebook were able to get people there, but they weren't able to figure out what it is that they're all here for. What is it that we all want, right? Because in many cases, different people want different things, and finding that middle ground that everyone can get around is extremely challenging. And these sorts of power vacuums provide opportunities for those who are organized, and who do clearly know what they want, to come in and take advantage of the situation. This is something that we saw in particular with the Arab Spring.
At the time, I was working at Fred Hutch Cancer Research Center in the Computational Biology department, studying viral epidemiology and the cross-species transmission and emergence of infectious disease, stuff which has of course become sort of relevant lately. In the course of that job, I'd become familiar with a number of machine learning methods that we were applying to some of these unrelated problems, and I started wondering if we could use the same sort of techniques to solve this problem of helping people understand each other. So the idea came around to use these techniques specifically to create something that we now call an opinion space. Or maybe I should say, to create a model of what you might call the true or latent opinion space within a population of individuals.
So the start of this is what we call dimensionality reduction. If you think about a large number of people voting agree or disagree, yes or no, one or zero if you will, on a large set of comments submitted by other participants, you can think about every person's position in this landscape as being some point in an n-dimensional space, where n is the number of comments in the discussion. Dimensionality reduction is a technique for taking data in a very high-dimensional space and compressing it down into a lower-dimensional space, ideally in such a way that you preserve some amount of the structure of the original space. Where things get interesting with different methods is how much of that structure to preserve, what structure to preserve, what degrees of freedom you have in creating this new space, and what sort of patterns you show. A very simple method for dimensionality reduction is called principal components analysis, which you can think about as taking a set of points in a space and rotating it around until you're able to create the biggest shadow in the projection that you're trying to find. So on the right here, we see all these little circles. These are profile icons, or profile images, excuse me, avatars, from individuals in a conversation, and their positions here relative to each other are actually based on this dimensionality reduction. So really quickly we get a sense that if two people are close together they probably tend to agree, but if they're further apart they tend to disagree. So already, just by creating this space that we can actually visualize and think about a little more intuitively, we're able to start to make some progress. I mean, our idea here was that by reflecting back to people their position in the landscape, and that of others, we can start to build some understanding and solve some of these challenging problems.
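To make the "biggest shadow" idea concrete, here is a minimal sketch of principal components analysis on a tiny vote matrix, written in plain Python. It is an illustration, not the implementation used in Polis: the encoding of votes as +1 (agree), -1 (disagree) and 0 (pass or unseen), the `votes` data, and the function names are all assumptions made for the example, and the eigenvectors are found with simple power iteration rather than a production linear algebra library.

```python
import math
import random


def center(rows):
    """Subtract each column's mean so the point cloud is centered at the origin."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]


def covariance(centered):
    """Sample covariance matrix (comments x comments) of centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def top_component(cov, iters=500, seed=0):
    """Power iteration: approximate the dominant eigenvector and eigenvalue."""
    rng = random.Random(seed)
    v = [rng.uniform(-1, 1) for _ in cov]
    for _ in range(iters):
        w = [sum(c * x for c, x in zip(row, v)) for row in cov]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            return v, 0.0
        v = [x / norm for x in w]
    cv = [sum(c * x for c, x in zip(row, v)) for row in cov]
    return v, sum(a * b for a, b in zip(v, cv))  # Rayleigh quotient


def pca(rows, k=2):
    """Project each row onto the k directions of greatest variance."""
    centered = center(rows)
    cov = covariance(centered)
    d = len(cov)
    components = []
    for _ in range(k):
        v, lam = top_component(cov)
        components.append(v)
        # Deflation: remove the direction just found, then look for the next one.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(d)] for i in range(d)]
    return [[sum(x * c for x, c in zip(row, comp)) for comp in components]
            for row in centered]


# Hypothetical data: 6 participants (rows) voting on 4 comments (columns).
votes = [
    [1, 1, -1, -1],
    [1, 1, -1, 0],
    [1, 0, -1, -1],
    [-1, -1, 1, 1],
    [-1, -1, 1, 0],
    [-1, 0, 1, 1],
]
points = pca(votes, k=2)
for i, (x, y) in enumerate(points):
    print(f"participant {i}: ({x:.2f}, {y:.2f})")
```

In this toy data, participants 0 to 2 land on one side of the first axis and participants 3 to 5 on the other, which is the "close together means they tend to agree" picture described above.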
The next phase of what we had in mind for analyzing what we thought of as the opinion space was to apply clustering to find opinion groups within the conversation. Specifically, as a start anyway, and we still use it, there's the k-means algorithm, which is a very simple clustering algorithm that basically just tries to find these centroids, the centers of the clusters, and cluster all the points around them. Lloyd's algorithm is an iterative procedure for moving those centers closer and closer towards an optimal solution. So it's very simple to understand and easy to interpret. Now, by taking the results of these analyses we can start to visualize the landscape as I described, but we can also use these opinion groups, these clusters, to aggregate vote data, and use those aggregations to start to figure out which comments were actually most reflective of each individual group, as well as where there is actually unexpected, excuse me, unexpected consensus between groups. As an example of this, here on the right you can see what happens if you click on a group, this whole shape on the right, this larger opinion group with 119 individuals in it. Once we click on that, we can see the statements that were important to that group, again as identified by these opinion-group-based aggregations. So this group in particular identified with a statement saying that during the campaign they found out that 75% of minimum wage workers are 20 or older, and that it's the right thing to bring higher incomes to this group of people. So this is something that really strongly identified this group as being different from the other. And if you look at these bars overlaying the opinion groups, you can see how much more this group agreed with that comment than the other group. So just by clicking around and exploring this interface, you're able to start to build up intuition for what people think and feel.
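The clustering and aggregation steps described above can be sketched as follows, again in plain Python and again as an illustration rather than the Polis implementation. The vote encoding (+1 agree, -1 disagree, 0 pass), the `votes` data, the farthest-point initialization, and the simple "agreement gap" used to pick a group's most distinguishing comment are all assumptions made for this example; the talk does not specify how representativeness is actually computed.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def farthest_point_init(points, k):
    """Deterministic start (an assumption): the first point, then repeatedly
    the point farthest from every centroid chosen so far."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        nxt = max(points,
                  key=lambda p: min(squared_distance(p, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids


def kmeans(points, k, iters=100):
    """Lloyd's algorithm: alternate assignment and centroid update steps."""
    centroids = farthest_point_init(points, k)
    labels = None
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [min(range(k),
                          key=lambda c: squared_distance(p, centroids[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = mean_point(members)
    return labels, centroids


def agree_rates(votes, labels, k):
    """For each group, the fraction of members who agreed with each comment."""
    n_comments = len(votes[0])
    rates = []
    for c in range(k):
        rows = [v for v, l in zip(votes, labels) if l == c]
        if not rows:
            rates.append([0.0] * n_comments)
            continue
        rates.append([sum(1 for x in col if x == 1) / len(rows)
                      for col in zip(*rows)])
    return rates


def most_distinguishing(rates, group):
    """Index of the comment this group agrees with most, relative to the
    other groups (requires at least two groups)."""
    others = [r for i, r in enumerate(rates) if i != group]
    gaps = [rates[group][j] - max(o[j] for o in others)
            for j in range(len(rates[group]))]
    return max(range(len(gaps)), key=lambda j: gaps[j])


# Hypothetical data: 6 participants (rows) voting on 4 comments (columns).
votes = [
    [1, 1, -1, -1],
    [1, 1, -1, 0],
    [1, 0, -1, -1],
    [-1, -1, 1, 1],
    [-1, -1, 1, 0],
    [-1, 0, 1, 1],
]
labels, centroids = kmeans(votes, k=2)
rates = agree_rates(votes, labels, k=2)
print(labels)
print(most_distinguishing(rates, 0), most_distinguishing(rates, 1))
```

Clustering here runs on the raw votes to keep the sketch short; it could equally run on the low-dimensional projection. K-means can settle into a poor local optimum depending on where the centroids start, which is why the sketch uses a deterministic, spread-out initialization instead of a random one.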
So once we'd come up with this idea and started working on an implementation, we had to decide on the best way to move the project forward. It was starting to seem promising: we'd built a prototype, and it was working and doing the thing. And we decided to start a for-profit startup to fund the mission and vision that we were trying to accomplish. We thought that if we could find a market that this technology could be applied to, we could use the funding that we got from selling to that market to further this mission that we had of affecting the way that people make decisions together at a broader scale.
So this is where Clojure enters the picture. Again, we'd built a prototype of the system with the math implemented in R, which was a language that I'd used a lot at Fred Hutch Cancer Research Center and was great for this kind of prototyping phase, where you just want to get something working and see if it does the thing. But it didn't really feel like the kind of language that we wanted to implement a big system in. Even simple things like reading and writing JSON just felt much more challenging than with a lot of other languages. I won't go into all the problems with R, as much as I love what it's really good at. But I had recently started using Haskell a little bit, as well as OCaml, and was really taken by the functional approach to programming. So I started looking to see if there were functional programming languages that would be good in the sort of space that we were in. This is, of course, how I came across Clojure, and some of the things that appealed to me about it pretty quickly were the fact that it was on the JVM and that it felt like it had so much potential in the big data space. And I put big data in scare quotes here because, if you think back, this is about 10 years ago now, and data science wasn't really a common term then.
This was still sort of a nascent thing, and really back then big data was the buzzword and all the rage. Clojure had a number of different libraries, from Storm to Summingbird, and lots of different tech organizations were starting to use Clojure to do big data processing. So it felt like a good space to position ourselves. We'd also considered languages like Scala, but ultimately decided on Clojure, in large part because it just felt more cohesively and composably designed. I haven't spent extensive time with Scala, and I know that a lot of people really love it, but we just got this sense that it was trying to be too multi-paradigm. It was trying to be a little bit of everything for everyone, and we really liked that Clojure felt like it had made a bunch of decisions that all worked well together, and felt like a really well-designed system to build software with. The other thing that really attracted us to it was the superb concurrency primitives. Obviously, Scala has plenty of strength in this area as well, but this was something that really was a foundational aspect of Clojure's design. And so it felt like a strong point in its favor.
So moving forward a little bit, we started this organization and were beginning to explore different markets for the technology that we'd built when, in 2014, the Sunflower Movement happened. This started with a rotten trade deal between Taiwan and mainland China, which almost the entire population was against. This was something that had been pushed through by upper-level bureaucrats, it had been fast-tracked through the standard political processes, and it was made clear to the people that it didn't matter what they thought, this was going to happen anyway. So this ended up being sort of the straw that broke the camel's back.
And a largely student-led protest, which also included organizers from older generations, resulted in a peaceful occupation of the Taiwanese parliament. And when I say peaceful, to be clear, there was no destruction of property. They actually cleaned up when they left, just to contrast with some recent events. But part of what made it peaceful, and part of what set the tone that this was not something destructive or malignant, was the fact that this civic tech organization called gov-zero, typically written g0v, had come in and set up the kind of technological infrastructure that they needed to make sure that this movement wouldn't go south. Specifically, they rolled fiber optic cables out into the streets. They were setting up makeshift Wi-Fi antennas with Pringles cans, if I understand correctly. I mean, it was just this really remarkable thing that all these people came out and made sure that there would be internet for people to collaborate and get help if needed and, very importantly, that everything was able to be recorded and live-streamed. So there was no opportunity for the media or other powers that be to come in and say, oh, look at these violent protests that are being destructive. Everything was above board, and g0v played a huge role in that. So much so that in the aftermath of the people actually getting what they wanted and revoking this trade deal, the government was in a moment of crisis, and then digital minister Jaclyn Tsai, if I understand correctly, actually came to a g0v hackathon and asked them if they could build a platform for rational discussion and deliberation of policy issues that the entire nation could participate in. This was obviously a pretty bold request, but it gave birth to what is now known as vTaiwan, which is an adaptable process using many different tools, and which uses Polis for the core understanding-at-scale component.
So this process has now been used to deliberate and legislate issues on a national level. Initially vTaiwan was just about the virtual space, V for virtual, so issues around the digital and online world. In particular, some of the issues which it was used to deliberate resulted in successful regulation of Uber and Airbnb, both of which have of course been challenges for polities around the world to figure out how to deal with: huge organizations that are very powerful and have found ways of getting what they want. But with this process, they were able to come up with a solution that really ended up being good for all sides. And you were able to see that even though there was a lot of contention around these issues, there were also points of common ground, and that those points of common ground can serve as anchors, or foundations, on which to build solutions for everyone.
This has recently gotten a lot of press. It took a little while, as you might imagine, right? When something happens on the other side of the world in Taiwan, it takes a little while for it to filter down to the English-speaking world. But eventually it did. Some of the articles which have been published, which you can take a look at if you're interested in learning more, include the MIT Tech Review article titled "The simple but ingenious system Taiwan uses to crowdsource its laws." Wired put out an article called "Taiwan is making democracy work again. It's time we paid attention." And very recently The Atlantic did a wonderful piece called "How to Put Out Democracy's Dumpster Fire." All of this has been absolutely amazing for us. I mean, I cannot tell you how insane and surreal it feels to go from starting work a decade ago on this crazy project, which just seemed pie in the sky, to having a nation actually make decisions using this technology, and to start to get the kind of attention that the project has been getting now.
Where things got a little challenging for us, though, was realizing that if we wanted to continue growing in the civic tech space, we needed to open source the software. The civic tech community was just not willing to take it other places without that. And this led to some immediate challenges. How do we monetize an open source project? The natural answer to that was, well, we can do consulting, but that leads to further challenges. How do we scale a consulting practice without taking VC? If you're looking at a consulting business, you're looking at much higher dollar amounts than software as a service. So now your cost of acquisition of clients is higher, and you get into this cycle where, without some kind of funding to push the project forward, you're just kind of dead in the water. At this point, we'd been working on this for many, many years, and realized that we didn't want to take VC, because we'd seen what happened to other organizations in a similar space who had gone that direction. And so a few years ago, in 2018, we started the Computational Democracy Project with the goal of carrying out the mission and vision that Polis Technologies had started. And just this last year, we finally received official 501(c)(3) charitable status from the IRS. So this has been, again, the culmination of about a decade of work on this project, and it's fantastic to be able to share with you today both that we have this organization and that it has achieved 501(c)(3) status.
But back to Clojure. I don't have a lot of time here, so I'm going to have to race through a little bit, but this is supposed to be an experience report on Clojure usage in data science. So, some of the good: Clojure was obviously a joy to learn and work with. I wouldn't be here if that weren't the case. As a data-oriented language, its "it's just data" philosophy really makes sense for data science, and that has just continued to be the case.
It also has a really sharp and highly leverageable set of tools and libraries, which have been a great experience to work with throughout. And again, the amazing concurrency primitives that are built into Clojure have been really valuable for us. Then there's the awesome potential that I see for Clojure, which we're starting to apply now, not necessarily to the core tool itself, but to the kind of data analysis that we do with the data that we collect, and which others in the Clojure data science community can potentially take advantage of. We have ClojureScript, which means that Clojure is really, as far as I'm concerned, the only data-science-capable language that can do anything from big data to running a server to running machine learning algorithms, and that also has a viable and well-trodden front-end target for building interactive clients, in ClojureScript. We've also now, as a community, largely converged on Vega and Vega-Lite for data visualizations. These are data visualization languages which are very philosophically compatible with Clojure's "it's just data" approach to things. And there have been amazing advances in terms of the kinds of numerical computing that we're able to do with Clojure. Dragan's work on Neanderthal and Deep Diamond has been just absolutely amazing in terms of what it has been able to accomplish, making Clojure a really fast language for implementing numerical methods and for doing deep neural networks. And, I don't want to say more recently, but similarly recently, there's interop. I mean, Clojure has always been built on the idea of interop, with the JVM and in ClojureScript. And this has also been, I think, something that really speaks to Clojure's potential as a language for data science. With libpython-clj we can interface with the Python world and actually create two-way bindings, where Clojure programs are able to call Python functions and Python programs are able to call Clojure functions.
Similarly, with ClojisR we can interface with the R world. If we need to implement things in that language, then we can still take advantage of them from Clojure, using it as a glue for stitching things together. And as we saw Stephen Wolfram demonstrate yesterday, we now have, well, we've had them for a while, but we now have some dusted-off bindings to the Wolfram Language, which just opens up an entire world of really fantastic possibilities.
But I'd be remiss if I did not also share some of the ugly, some of the challenges that we've faced with this. First of all, 10 years is a long time for any code base, even in Clojure. I mean, it has definitely been the case that I've found that a Clojure project from years ago is much more likely to fire up down the road, without mucking around with things, than in basically any other language I've worked with. This includes JavaScript and Python and a number of others, Ruby certainly. But it's still the case that after 10 years there's going to be some stuff that's changed, even if it still works, just in terms of what the community is using for different tasks and different approaches to things. I mean, we've evolved a ton in 10 years. Clojure is only a little bit older than 10 years, right? So there's a lot that has changed in that time. We've also had increased scaling demands. We were able to get around some of the scaling issues of implementing these algorithms in Clojure, prior to Neanderthal and Deep Diamond and such, by being careful about the algorithms we chose. But for various reasons, we're now starting to hit up against some of the limits of the system that we've built. And some of this stuff is incidental and not at all Clojure's fault; it's more my fault for having implemented this stuff as my first experience using Clojure.
But also, and this third bullet point here is kind of in reference to the first, moving away from core.matrix to the tech.ml stack is something that we'd really like to do, to take advantage of all of the interoperability that it provides with Python and the rest of the emerging Clojure ecosystem, which is really exciting. But more to the point, there's a really difficult question, which arises because, as a nonprofit, we hope to be able to take advantage of volunteer efforts. A lot of work went into taking this initially for-profit, software-as-a-service startup package and open sourcing it. You know, it's easy to open source something. It's not easy to take something that was built to run on a single instance and make it so that anyone can go and fire it up, to build the DevOps infrastructure to make it deployable and have a repeatable process for getting things running. A lot of that has been driven by volunteer efforts. But a problem emerges with respect to Clojure in that Clojure is really a small community. So we as a nonprofit have to ask ourselves, when we build a new piece of infrastructure or some new part of the system, if it's not already implemented in Clojure, is that how we want to implement it? I mean, I personally would always like to implement something in Clojure over JavaScript or Python when I can, when it makes sense. But we also have to think about who's going to maintain that. And again, as a nonprofit organization, how can we make sure that we can actually take advantage of the volunteer base we have available? So part of my goal in putting this talk together was to ask you as a community to help with this process of modernizing the Clojure code base and showing that we as a community can punch above our weight. And, you know, this is largely for selfish reasons.
I want to use Clojure as much as possible in my work. But I also think that there's a real opportunity for us here as a community, in that this is a really exciting project that's turning a lot of heads. And I think that it's a wonderful showcase of the kind of cool stuff that Clojure can be applied to. So I'm going to end there, but I just want to say again, thank you so much for having me. And sorry, I went a little bit over, but I'm excited to chat further here and take questions and that sort of thing.