He's well known for his foundation, public.resource.org. He's got a long history in sort of the deep innards of the internet. He did a lot of early work with multicast, including the internet's first radio station. He's the guy in no small part responsible for the fact that the EDGAR database of SEC filings is available online. He came to the attention of many with his 2009 election campaign for a non-elected office: he campaigned for the role of Public Printer of the United States. This is a role that Ben Franklin, my hero, helped bring about. And Carl's campaign was very specifically looking at questions of making existing government documents available online in downloadable, machine-readable form, leading to the campaign slogan, Yes We Scan. He was not, in fact, appointed to the job. I think this was a terrible disservice. But he did win the 2009 EFF Pioneer Award for his dedication to protecting the public domain. He's going to talk today about another wonderfully named campaign, Yo Your Honor, which I'm looking forward to hearing more about. I know that it focuses at least in part on PACER, a system that activists, notably Carl, have been calling attention to since 2008. This is a system that provides access to critical government documents, court filings, etc., at the low, low price of 10 cents per page, thus guaranteeing that full access to the judicial system is available for those who have deep pockets to pay for it. And Carl has worked with other civic heroes like Aaron Swartz, who have been involved with downloading large sets of documents from PACER to study and understand the problem that they are dealing with. Carl is an amazingly effective and creative advocate.
He's someone who is deeply, deeply dedicated to the notion of civic information and knowledge, has an incredible wealth of understanding around copyright law and access to information, and is always one of the most creative people thinking about how to use civic activism to make deep change within the U.S. government. So we're thrilled to have him here. I can't wait to hear what you have to say. Welcome.

Thank you. I really appreciate being here. I was at Harvard Law School yesterday. It's a very different environment. I actually worked at the Media Lab in '97 after I finished running the Internet World's Fair, and I needed a place to land for a while, and Nick Negroponte was kind enough to let me camp out here and write a book. This was when the Media Lab was one tiny little building, and they had dreams of building something much bigger, and this is, like, so amazing. It really is a big difference. So I started my new nonprofit in 2007, public.resource.org, and the goal was to make all the laws of America available. The idea was that the primary legal materials, what I call the raw materials of our democracy, should be available not on a pay-per-view basis, not just to West and Lexis and a couple of big firms, but available for citizens to use who might not have a credit card and a subscription and a law firm affiliation, but also for researchers to be able to bring in all the materials for state statutes, for example, reconfigure them into XML data, begin comparing the laws in different countries, in different states, analyzing how those laws work, how they differ, making the laws available to the people, because in the United States, the people own the law. That's very different than in other countries. In England, the Queen owns the law. It's Crown copyright, but in the United States, the people own the law, and because of that, the law has no copyright in the United States. So I began by doing what I thought was the right thing.
I did a whole bunch of workshops around the concept of law.gov, and I went to Stanford and Harvard and the U.S. Congress. I went to the Center for American Progress, John Podesta and Larry Tribe, and we came up with these principles of how the law should be available. And it was nice and it was fun, but we didn't actually get anything done, right? It didn't change the way the laws are available in the United States. As part of that, though, I began kind of working on some real stuff. So we did two things that were fairly significant in 2007, 2008. I began posting all the building codes for the country, because these are the law. These are not advisory codes, right? These are incorporated into law in all our states, things like the National Electrical Code. And I began posting those and nothing happened. Nobody sent me takedown notices. These are copyrighted documents. These are created by 501(c)(3) nonprofits like the National Fire Protection Association, the American National Standards Institute, the American Society for Testing and Materials. They keep copyright on these documents, but they want them to be the law. They lobby aggressively for them to be the law. And so I did that for a few years and nothing much happened. I began posting all the safety standards that are required by law at the federal level in the Code of Federal Regulations. And then the shit kind of hit the fan on that one. We got sued in two district court cases by six plaintiffs and we're currently in court. It's an intense legal battle. The National Fire Protection Association says they should be the only place that is allowed to post the National Electrical Code, because if they don't keep the revenue from the sale of the National Electrical Code, they will be unable to continue making high-quality codes and babies will die. I have actually heard that argument. And my argument is, well, you know, I know you need the money. It's a good code, but it's the law.
And in the United States, the law has no copyright. And so we've been posting them and we're fighting very hard. I'm defended by EFF and two other law firms. Like I said, there are six plaintiffs in two district court cases. So this is not an experience you want to go through. Among other things, I took every email message from 2007 that I sent to anybody, my wife, you name them, and handed them to my lawyers. And the lawyers went through and did searches and disclosed anything that was potentially relevant to the plaintiffs. And so when you're going through a process like that, you learn a lot about federal litigation. The other thing we did is we began looking at the PACER system. So PACER is the workings of our federal courts. It's not just the opinions. It's the dockets. It's the motions. It's the briefs. It's the orders. It's the arguments that the lawyers make before the judge actually rules. And the way the PACER system works is you pay ten cents per page in order to access these. You have to have a valid credit card. And if you're a researcher, if you want to download everything, you can pay ten cents a page. It'll cost you a lot of money. There are close to a billion documents in the PACER system all over the country. And PACER is expanding. It's not just the district courts. It's the courts of appeals at this point, too. Now, in 2008, I began an enterprise to say, gee, let's make PACER more broadly available. Hey, everybody, download PACER documents, upload them to my site, recycle the public domain. And I thought that was just a vehicle for frequently asked questions, learn more about PACER, raise some awareness. And I got a call from a guy named Steve Schultze. He said, you know, Aaron and I are working together and we want to do something. We want to do this download-PACER thing, the Thumb Drive Corps.
The idea was that there was a free access mechanism across the country in 20 libraries, in order to determine whether or not maybe the public might have some interest in PACER, because right now you go to the courts and they say, oh, come on, nobody needs PACER except for a few lawyers. And so I thought a few people might download a couple of documents. It was a symbolic gesture. Well, Aaron Swartz came in and he took the crawler that Steve Schultze wrote and he went, gee. And he said, you know, I've got some data. Can I upload them to your system? I said, well, OK, it's Aaron. We'll give him an account. And I got a call from my sysadmin about a month later. He goes, you know, we're up to like 700 gigabytes of data. I said, well, Aaron's a smart boy. And I got a call saying, well, it turns out all public access to PACER has been terminated and they say they've called the FBI because they've been hacked. Now, they weren't hacked, right? There was no appropriate use policy saying only download one document from the library. It didn't say don't download 20 million documents, and I'll admit that was a bit of a surprise to them, but it wasn't illegal. They called the FBI; the FBI went away and said there's no case here. There's nothing that happened. I knew that there were going to be privacy issues in PACER because I had previously put the Court of Appeals online and I found a whole bunch of Social Security numbers in Court of Appeals opinions, and I knew that PACER would have a similar issue. So I did a comprehensive audit of those 20 million pages, sent the audit to the Judicial Conference of the United States, sent it to 32 chief judges of district courts. They ignored me. I sent them again, and then ended up sending these notices to chief judges that said third and final notice in big red letters on the top, and you really got to stop and think carefully before you do that. Some judges wrote back to me and said, you know, you're right.
There are Socials. They removed the documents, and I thought that was really good. I thought that was a good thing. And so I stopped working on PACER for a while. That was a pretty strenuous thing. It involved New York Times articles and, you know, the Senate sent a letter to the courts and said, what's with all these privacy problems, and the Judicial Conference wrote back to the Senate and said, oh, we're aware of this issue now. Thank you, you know, for this audit. I got a nice letter from the chair of the Committee on Rules of the Judicial Conference, and so I kind of left the problem, and then this summer I decided it was time to go back in and look at it again. And the reason was the Administrative Office of the Courts announced that they were deleting from PACER all historical documents for five courts. And all the law librarians in the country were like, you are what? You're deleting the data? I mean, we have a shortage of disk space here? They were upgrading PACER to the next-generation version of PACER. But, you know, I thought that was really dumb, and so I organized a bunch of letters. I went and called members of Congress. There ended up being six congressmen writing to the courts. Three senators wrote to the courts: what are you doing? Senator Leahy went storming into the Judicial Conference meeting. That was the topic, because he's allowed, as chair of the Senate Judiciary Committee, to address the Judicial Conference. And at the end of the day we all got fooled. It turns out they didn't have any documents that were historical. They had a docket sheet. They had a list of the cases that had been tried, but they didn't have the briefs and motions and the orders. They weren't deleting anything, and they could have easily taken this one-page file and migrated it to the new system. But we were all fooled. I should have figured it out, because I knew that they hadn't been on PACER in the early days, but I didn't.
Senator Leahy's staff didn't. Members of the House Judiciary Committee called the Administrative Office up and said, what the hell are you doing? And the AO was like, well, it's technically too hard to do this. A single page. And so I started working on a memo on PACER, asking myself, what can we do to raise awareness of this brain-dead system? Has anybody actually used PACER? Have you looked at it? It's, oh my god. I mean, they set this system up in 1994 and it looks like it. Worst UI you've ever seen. No search capability whatsoever. All sorts of other problems. But every time you go talk to someone they're like, what is your problem? It's just, you know, it's not a big deal. So I was working on this memo. The Chief Justice every year does a year-end report. It's typically an 8-to-10-page, beautifully written essay. It used to be they'd do it on a typewriter. And this year his year-end report was about PACER. I think it's because he didn't like Senator Leahy yelling at him over that whole docket controversy. We'll never know why he actually decided to pick this topic. But his year-end report says, look, in the judiciary we have to be a little more conservative. And PACER has a billion documents. And for a modest fee any American can access this data. And then he went on to say, and we're going to put all the Supreme Court stuff online. Which is great. That's wonderful. All the briefs, everything. But we're not going to use PACER. We're going to write our own system. And so there's this rousing defense of PACER. So I said, okay, fine, the memo I was working on, I turned it into a response to the Chief Justice. Which is not something you're supposed to do. They don't like that happening. And in fact my friends at the Supreme Court don't return my calls right now. You just don't do something like this. But my theory is that if people knew more about the PACER system they might do something about it. So I have come up with seven different ways we can maybe do something.
So number one: let the judges know you care about PACER. Because when I go see a chief judge, and I've met with several of these, I actually addressed the Ninth Circuit in their business meeting, Judge Kozinski asked me to come in, their initial reaction is, oh really? You care about this? Why? And so letting judges know that this is an issue matters. You know, judges don't get a lot of fan mail. They don't get a lot of lobbying. And so my theory is that if a judge gets 50 postcards saying, Your Honor, I'm a law student, I'm an engineering student, I followed, you know, your wonderful opinion in the great muffin case, whatever, I live in Cambridge, you're the Boston judge. If they get 50 cards maybe they'll say, hey, let's look into PACER, maybe something's going on. So that's kind of an unusual technique, and what I've done is I've printed out a whole bunch of postcards that have pictures of judges: Louis Brandeis. I have here mailing labels for the Honorable Patti B. Saris, who is the Chief Judge of the Massachusetts district. And we have custom stamps that have PACER on them. We have pictures of telegraph operators and the word PACER, pictures of clerks with big books and it says PACER. And so these are not souvenirs, but if you want to write a postcard, make sure you only write on the left-hand side, because I need to put a stamp and an address on the right. I'm going around the country collecting these, and on May Day, which is also Law Day, we're going to open up a PACER polling place at the Internet Archive: come in and vote for justice, vote early, vote often. We're going to have a PACER polling place in Chicago, there's probably going to be one in Washington, and my hope is if we collect enough of these postcards and they go to various judges, maybe the next time the Judicial Conference gets together they'll start talking to each other and saying, yeah, I got a bunch of these things. Oh my god, I had no idea people cared.
So you guys are welcome to pass these around. Again, these are not souvenirs. If you don't want to write a postcard you don't have to, but if you do want to write one, bring them up at the end and I'll put a stamp on them, I'll put an address label on them, I'll bring them home, I'll scan them all and put them up on Flickr so that we all see what's going on, and then we'll put them in the mail and send them to the judges. So we can pass these things around; I have more. So that's way number one, but it may not work. I tried that once with the Smithsonian when they were issuing copyright notices on everything they were doing. We got 300 people to fill out postcards and they went to members of Congress and the Smithsonian, and they totally ignored us. It didn't do anything. So that's the cards and letters campaign. Let me talk about six slightly more substantive things one can do about PACER. So the obvious thing is sue the courts. Say, you're denying me access to the rights of our judiciary, and a fundamental premise of our judiciary all the way back to Magna Carta is that we conduct the hearings of our courts in public, and today that means on the internet. It has to be available to the public. It's the fundamental check on our system of justice. It's the way we know our courts are fair. Now here's the problem: it's really hard to sue the courts, for a couple of reasons. Number one, Congress said they should charge for PACER. Now, they didn't say they should charge everybody, right? They didn't say that they couldn't make it available for non-commercial use for free, but they did say charge, and if you go to the courts they say, well, Congress told us to charge. The other problem is there is no place in the Constitution that guarantees you a free lunch. It is hard to go in and say my constitutional rights are being violated. I think they are. I think when you charge for access to PACER the principle of equal protection has been violated. This is a poll tax on access to justice. I think it's a due process problem. I
think there are all sorts of constitutional issues, but it's really hard to bring an information technology issue up to a constitutional level, and we've tried to look for ways to sue the courts. There have been clinics at different universities looking into this. I've talked about this with many, many lawyers. It's possible, but it's hard. It's a stretch. So: cards and letters, sue the courts. There's a much easier way. Congress said to charge; Congress can also say don't charge. It's really that simple, and I've got a simple magic bullet. So right now there are two ways that the courts bring in external money. They get money from Congress too, and they get filing fees as well as PACER fees. So if you do federal litigation, you're some big corporate entity suing over trademark violations, you pay $400 to file your initial complaint in the federal courts. That is a much bigger revenue stream than PACER. So they could up the filing fees just a little bit and get rid of the PACER fees, and that's such an easy thing to do. A member of Congress has actually put in a bill this year that says the Administrative Office should drop the PACER fees, increase the filing fees, or give us a report as to what their different business models are and how they're examining the issues. So we don't think the courts are actually going to do anything, but maybe they'll have to issue a report. Maybe there'll be hearings. It turns out that Darrell Issa is now chairman of the subcommittee that has oversight of the operation of the federal courts. This is not something he likes to see, 10 cents per page. He's not a lawyer. He likes to break heads. And so my hope is that maybe there are going to be hearings. There are other people on that committee, like Congresswoman Lofgren. This is not all just Republican. There's a real strong liberal contingent that views this as a problem. So maybe there'll be hearings, maybe not, but you hope. Four more things and then I'll stop talking. Just audit PACER. PACER has billing errors. Their system doesn't work.
There are two kinds of things you can do on PACER. You can download a PDF file, which is 10 cents per page or $3 maximum, or you can get what's called unpaginated dockets or search results, and those are based on the number of bytes, right, because it's an HTML file. It's supposed to be 10 cents per 4,000 bytes. I'm the kind of guy, when I download my docket, I actually count the number of bytes in there, and I looked at it and I said, wait a minute, they're overcharging me. So they charged me 60 cents for like a 7,000-byte document, and so I sent in a request to the PACER Service Center and I asked for my 40 cents back. I got a letter back from them saying, we have received your request, we cannot determine any refunds until after the billing cycle has completed, at which point we will resolve your request. And I'm reading this going, I don't know if that means I'm going to get my 40 cents or not get my 40 cents. And then they gave me advice as to how I could minimize my bills by clicking all sorts of weird things before doing my search, and so I thought, this is nuts. And so I did a formal audit. I actually looked at how far back this billing error problem goes, and I sent a letter to the division director of the Administrative Office of the United States Courts who has oversight over PACER and said, look, you're overcharging millions of dollars here, because every single docket you're overcharging. It's the most common operation. And then I quoted a court case that is used in Contracts 101 if you're in law school. I'm not a lawyer, I dropped out after one year, but I did take contracts, and there's a fundamental principle that if you advertise a price in a supermarket you have to charge that price. You can't charge someone more, and if you do charge someone more they can sue you and get the money back. And that's what I told the courts: they have to disgorge this million dollars or two million dollars or whatever it is that they've overcharged systematically for the last several years.
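The arithmetic behind that 40-cent refund request can be sketched in a few lines of Python. This is only an illustration of the figures quoted in the talk (10 cents per page, a $3 cap on PDFs, 10 cents per 4,000 bytes for dockets); the function names are made up for this sketch, and the rule that a partial 4,000-byte unit rounds up to a whole billable page is an assumption, not something stated in the talk or taken from an official fee schedule.

```python
# Figures as quoted in the talk; not taken from an official fee schedule.
CENTS_PER_PAGE = 10
BYTES_PER_BILLABLE_PAGE = 4_000
PDF_CAP_CENTS = 300  # "$3 maximum" per PDF document


def pdf_fee_cents(pages: int) -> int:
    """Fee for a PDF: 10 cents per page, capped at $3."""
    return min(pages * CENTS_PER_PAGE, PDF_CAP_CENTS)


def docket_fee_cents(n_bytes: int) -> int:
    """Fee for an unpaginated docket or search result, billed by size.

    Assumption: a partial 4,000-byte unit is rounded up to a full page.
    """
    billable_pages = -(-n_bytes // BYTES_PER_BILLABLE_PAGE)  # ceiling division
    return billable_pages * CENTS_PER_PAGE


def overcharge_cents(charged_cents: int, n_bytes: int) -> int:
    """How much a charge exceeds the advertised by-the-byte price."""
    return max(charged_cents - docket_fee_cents(n_bytes), 0)


if __name__ == "__main__":
    # The example from the talk: a roughly 7,000-byte docket billed at 60 cents.
    print(docket_fee_cents(7_000))      # expected fee in cents
    print(overcharge_cents(60, 7_000))  # amount to ask back, in cents
```

Under these assumptions a 7,000-byte docket counts as two billable pages, so the advertised price is 20 cents and a 60-cent charge is 40 cents too high, which matches the refund asked for in the talk.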
And I'm expecting to get mush back for an answer on that one, but I don't know. I've got a meeting next week with this division director, if he doesn't cancel it, and I'm hoping maybe the Congress will ask them to shed a little more light on their finances and maybe address the billing error problem. Remember I said in 2008 I did this comprehensive audit of privacy problems? So as I was preparing my billing errors memo on this, I was looking for public.resource.org on the US courts website, and I discovered I had sent a 24-page memorandum to Judge Rosenthal in 2008 that included a list of every docket number where I had found privacy issues (this was my preliminary, initial audit), the document that had the privacy issues, and the Social Security numbers that I had found. And sure enough, an unredacted version of that audit was on the uscourts.gov website. Actually, they posted it twice, once in the civil area and once in the appellate area. Unredacted. One-stop shopping for identity theft. And I thought to myself, hmm. And so I knew I'd gotten letters back from several chief judges, but several hadn't answered me. So I pulled up the Northern District of Illinois, I took the first document on my list, went to PACER, and sure enough that document was still live. They hadn't removed it. It was a five-page listing of name, home address, Social Security number, financial information. They just simply had ignored it. Now, what pissed me off about that is the United States Senate had sent a letter to the Judicial Conference saying, what about this privacy issue, in 2009, and the courts had written back saying, oh, we take this really seriously, we're on top of it, we're changing our rules. They did all this analysis of my audit trying to show that I was wrong. They had a bunch of court functionaries write a big fancy memorandum, but they didn't remove the documents. So I sent a letter to Judge Rosenthal. I didn't say that she had misled the United States Senate, but it sure looks like perhaps the
bureaucracy had failed her. And I actually talked to Judge Rosenthal and I apologized. I said, look, I'm really sorry, because she was really nice to me. I mean, she met me in chambers in 2010, sent me a nice letter back thanking me for my work, and I know she didn't post this document on uscourts.gov, but it was a letter to her and it had her name on it. And so that's a big issue, and I'm hoping that maybe some members of Congress, maybe some members of the Senate, might want to get a further explanation from the Administrative Office as to how the current situation relates to the assurances that they gave the Senate in 2009. Two more things. My initial idea in this memo was this: in PACER, if you use $15 or less per quarter, they don't bill you. It's their one kind of free access thing. Now, you have to have a credit card and you have to register, but the idea was that maybe everybody around the country would use their 15 bucks every quarter and they would use the RECAP plugin, which takes the document and puts it up on the Internet Archive. And we've got over a million cases on the Internet Archive. It's a small percentage of PACER, but it's a reasonable subset. And my idea was that maybe law students all over the country would band together and there'd be a little competition to see which law school downloaded the most documents, right? And so there's going to be an extension of the RECAP plugin where you can fill in a group name field: Harvard Law School or Boston or whatever. The idea was that the winning law school would get the Swartz Cup, and we have a beautiful marble column: this is Swartz, Law Day. And the idea is we would do this on May 1. Now, May 1, you may know it as May Day, but during the 1950s McCarthy and folks like that decided they didn't like all this communist stuff, and so they decided May 1 would be Law Day, and the American Bar Association embraced Law Day with a vengeance. And this May 1 they are doing the 800th anniversary of Magna Carta. There will be speeches, they're going to go to
Runnymede, they're going to cut a ribbon, and I figured this was an appropriate alternative to Law Day. Now, I haven't had a whole bunch of law students calling me up saying, we're going to do this, this will be great, and so I'm not sure we're going to have a huge number of downloads on Law Day, but, you know, I'm going to keep trying and maybe we'll get there, and we're going to keep doing this for several years. By the way, it takes 10 years usually to get one of these big databases online. It took 10 years to get the US patent database online. It's probably going to take 10 years to do PACER. I started in 2008; we may not even succeed in 2018. But we're going to keep doing the download-your-15-bucks-per-quarter thing and see if maybe that will turn into something. And then finally, and this is, I think, our best shot. So last week I filed an application for a fee exemption, which you're allowed to do. If you're a prisoner, you can petition a judge and say, I want free PACER, and the way that works is you only get free PACER on a special account. You're not allowed to redistribute the data. You have to limit, you know, it's a carefully constrained thing. So occasionally nonprofits get fee exemptions. So our fee exemption request went straight to the Ninth Circuit and said we want an entire district, an entire district court. It's like four district courts and the Ninth Circuit, so we applied for a fee exemption for five courts. And what we said is, we will download every document for that district and we will audit it for privacy violations (footnote: you've got a problem; we found a problem last week, still; this is a pressing issue), and we will report back to the court as to whether your privacy rules are being observed or not. We will notify the litigants when we find a Social Security number and notify them of the problem. So we're performing a service for the courts. We'll do it for one district, we'll do four districts, we'll do the circuit, you choose. And our argument is that the courts kind of need to have this done, but
we did something else in this one. We said, look, not only are we going to redact these documents, we want to post them all on the Internet Archive. And we included affidavits from leading legal researchers saying, oh my God, if you had an entire court available we could do some amazing things; affidavits from Brewster Kahle saying, we'll host the data; from Brian Carver at the Free Law Project; we have a public library librarian; we have the Lewis and Clark. They've all submitted affidavits, and our argument is the courts have the authority to grant our request and they owe it to themselves to conduct at least one experiment to see whether there's a better way. Every single district court: 10 cents a page, same mechanism. The Ninth Circuit is a huge place. Our argument to the court is they should at least try one district and see if better public access and analysis of privacy and all this empirical research that people could do, see if maybe that's a better way to do things. And I have no idea if Chief Judge Thomas is going to be able to grant our request. At the very least, we think the Administrative Office of the United States Courts will be asked to submit a report to the Ninth Circuit explaining why I'm stupid, and maybe we'll get an in-depth explanation of why they think this is a bad idea. So if nothing else, this is going to force them to the table. So those are the seven mechanisms I've come up with on a PACER strategy. Some are going to work, some are not going to. Congress is ultimately going to do it. Auditing for billing errors and privacy violations is something substantial and real that they really have to fix. Maybe that's going to force them to look at it. Maybe just letting the judges know, because when I see a judge and I say, we got a PACER problem, they're like, oh really, I didn't know that, because they don't use PACER. They have clerks, or when they use PACER it's just occasional, but usually they just have a pile of paper on their desk they read. And so letting them and members of
Congress and others know that we actually care about this thing, I think, may do something, and maybe our system of justice will work and our fee exemption request will be granted. You never know.

Excellent, thank you, Carl. I assume you're happy to entertain questions? Absolutely. About PACER or about any other subject. I'm sure that it's coming online, but I think it's always nice to honor those. So who wants to start the conversation? Better call Saul. You know how much it hurts to have a TV show like that? I know.

Could you take a step back and explain sort of the organization, the construction, like, what PACER is? I mean, is there a bunch of developers working for the court system and a server farm buried under the Supreme Court? What exactly is it, and, you know, in the sense of, who exactly are you fighting against besides the judges?

So the question is, what is PACER, what technically is PACER. So PACER is a weird distributed system. It is built by some contractors at the PACER Service Center in Texas. Each court has their own implementation of PACER, so what that means is you cannot search across a whole bunch of courts. There is something called a federal case locator, which sort of gets you a little bit of that. It used to be a whole bunch of Perl code, and you can tell it's Perl code, it's very easy to figure that out. They have been upgrading to NextGen PACER, which as best as I can tell is Java-based and has features like longer user names. When I sent a letter in to the director of the Administrative Office and said, here's my pamphlet and I want to talk to you about PACER, I got a letter back from Mr. Launey, and Mr.
Launey said, I read your pamphlet that you sent to the director, and every single suggestion that you had in there is something we're working on for a future release. So it's really a single system per district or per appellate court, because now the courts of appeals all have to be on this system. And whether it's a single server farm or not, I don't know, because they won't tell us. They also won't tell us, like, how many dockets they do every year or how many users they have. They claim that 80% of PACER users are not charged under this $15-per-quarter thing. I don't believe that number. I think what they've done is taken every person who's ever registered for PACER, and even if they're not using it today, they count it as, oh, you got free access. I think that's what's going on. Attorneys get free PACER for their own cases, so I think a lot of their usage is attorneys downloading their own filings. I know West, Lexis, Bloomberg do a lot of PACER access, but again, we don't know the answer to that. And so I wish I could give you more details, but that's actually one of my frustrations. You know, I've been doing this since 2008, and I have spoken once, briefly, with a member of the Administrative Office. I was asked to speak at a panel on PACER, and the Administrative Office let it be known that if I was there they wouldn't be there, and so I was disinvited.

So Prida, you know, the code, and then you have to pay to get the code, which should be free because it's publicly enacted, or has been adopted by these, publicly adopted, I should say. So the question I have is, in the PACER context, what are the beneficial uses to the public? Like, so much of it is kind of in the weeds for lawyers in a particular case. How do you see using this in a way that's beneficial?

Okay, so there are two important answers. The question is, what are the beneficial uses of PACER, as opposed to the law. The law must be available, that's pretty clear. The opinions of the court must be available, everybody agrees on that, although not
all the opinions are available. So there's two answers to this. One is, right now the courts have not decided what should be public and what should not. So, for example, bankruptcy proceedings have an incredible amount of personal information. I am not convinced that data necessarily needs to be on Google. There are some compelling reasons that some bankruptcy proceedings should be, but my argument there is that if it's available on PACER for ten cents a page, and on West and on Lexis, it should also be available to others. So the question is, who are the others that don't have access to PACER today? If you are a journalist and you're doing significant work on the federal courts, you've got to go see your editor, because you're going to rack up some real bills, right? You can't go look at a hundred different court cases in a district to understand whether people of different ethnic heritage are being stopped more often for crimes or violations, right? You can't do that empirical research today on PACER. If you are an academic researcher who wants to look at, again, civil rights litigation across different districts: does the South enforce civil rights differently than Missouri? What about patent litigation, is that being done on a uniform basis? What are the important cases? There's a whole raft of empirical research that we're beginning to see. There's young law professors at Michigan State University, there's a guy named Matthew Sag at Loyola in Chicago, Dazza Greenwood here at the MIT Media Lab is beginning to do that kind of stuff. And that research is precluded because there's just no way you can download everything from a district and do that empirical research. You can't do my Social Security thing, right? That's my one trick: I'll download everything and run regex on it. I mean, it's not very hard to do. You run OCR, and then you take the text files, and, you know, you do grep. But nobody else does that, and it's one of the things I know how to do. I've done it with the IRS. I can do it with the courts. I can't
do that at ten cents a page. Well, yes, I am... the professors are the ones that do the big data analysis, and there's some very compelling big data analysis that has come out of analysis of Supreme Court opinions. They do citation analysis. They're able to determine which cases really are important because they're cited. So, for example, there was an analysis of Marbury versus Madison, right? Seminal case. We know it as one of the landmark cases. Well, it turns out Marbury versus Madison was not an important case until the Interstate Commerce Commission began to do its work and industry started to happen, and so you can actually see by citation analysis that this case became much more important in the late 1800s. Now, that's sort of intuitive; we could kind of figure that out. But at a lower level, a court of appeals, it would be interesting to analyze what are in fact the important cases that this particular judge has authored. And so there is a lot of big data analysis that I think is significant, there's a lot of muckraking and journalism work that I think is very significant, and there's a lot of citizens that I think want to follow cases. Some of them they can do on RECAP, but a lot of times they can't. Often, and I've gotten this with every database I've done, SEC, patent, you know, you go into the Securities and Exchange Commission and say Edgar ought to be online, and the answer is, you know, people don't really care about this, it's just a few financial fat cats, why should we subsidize their use? Right? If you make PACER free, it's only the lawyers who are going to want it, and, you know, why shouldn't they just pay their way? They've got lots of money. And the answer to that is, you're precluding a much broader audience that actually really does have an interest in the proceedings of the courts. Maybe not everything, but there's enough cases out there that, you know... I'll give you a simple example. I know a lady who had a crooked landlord. Turns out he was involved in all sorts of court
cases. She went on to PACER to research, because she was renting from this person, and because of, I believe, a billing error, she racked up a thirty thousand dollar bill. I think it's because she was searching a lot, and this whole unpaginated ten-cents-per-page thing. The courts called out a collection agency after her and put a black mark on her credit record. She's a young single mom. She happens to work for a judge. He couldn't step in because, you know, he can't really intervene. And as far as I could tell it was a totally bogus claim, but she wanted to research her landlord and see what was going on, and that was hard to do. Carl, I got a tactical question in from Charlie, who's a former doctoral student here at the Media Lab, and he points out that it's really hard to get people involved with campaigns focused on the US government, and trying to figure out how you get people into those campaigns is sort of an art form in and of itself. So we have folks like John Oliver, you know, making these very blunt statements about how the NSA has your dick pics, and Charlie observes that, you know, yourhonor.org is pretty dense and pretty technical. You're writing it, you know, in the form of essentially a legal filing. What's the choice behind that? Why go after it in this way? Are you trying to build a broad movement with this? Are you trying to make a few visible provocations? What's the theory of change on this, and how has how you've launched this played into it? I think it's all of those. So the first thing I wanted to do was write a memo, somewhat for myself, as to what are the different ways we can solve the PACER problem, and I wanted to have enough facts that someone who took the time to read this could turn around and, you know, maybe explain it in simpler language. And as you can see, I'm going around the country trying to explain it in simpler language. This may be too much, but it seemed to me that at some point you've really got to dig into the facts and begin to address the issues of
is there a constitutional right to PACER? What about fee exemptions, how do they work? What about the billing? And it just seemed necessary to put it together, and that's got to be dense. Building a broad movement around this stuff, I have failed miserably every time I've tried to do this, okay? I did law.gov. We had 15 workshops, hundreds of people, principles, but it didn't change anything. I've been going after the Smithsonian for years because they assert copyright over their materials. They say you've got to have a license to use public domain materials, and I think that's nuts. I got 300 people to do postcards, but that's because I basically, you know, grabbed 300 people by the elbow and forced them to write them. So as for getting a broad movement, I don't know, you know, I don't know what it takes to do that. Aaron seemed to have figured that out on SOPA. You know, he was able to mobilize, along with a bunch of other people, a huge outpouring over a very important and somewhat obscure issue, one where normally the MPAA would have just been able to get SOPA done, and people were able to get a lot of people standing up. But I've never figured out that magic ticket. And so at the end of the day, this is a series of tactical efforts, and I'm hoping one or more of them will actually do something. So there's a fee exemption, there's billing errors, there's privacy problems, there's cards and letters. Are any of them going to work? I don't know. I don't know. But I think you've just got to keep trying, and I think it takes sustained effort. Like I said, you've got to just keep on doing this, and eventually, if you think you're right, you know, maybe... What I find is, when you meet a policy maker and you get those five minutes with them, usually at the end of those five minutes they're beginning to shake their head. They have some doubts, but they understand that this isn't totally nuts, maybe it's something we need to look into. And so that's really the hope, that you get enough people saying, okay, I'll get
some staff looking into this, maybe there is somewhat of a PACER problem. And like I said, you know, the Administrative Office thinks we're nuts. They think we're making a lot of noise over nothing. But to me it's a fundamental principle, because, you know, if PACER is behind a paywall, what is to prevent your state regulations from being behind a paywall, or your building code behind a paywall? And to me, our federal judiciary is one of the most important sources of what is truly public domain raw materials that we need to be able to access. And I think if you permit that data to be buried and behind a paywall, then all of a sudden you get the state of Georgia and the state of Mississippi and the state of Idaho sending me takedown letters, which they have, for posting their official state codes online. And they haven't sued, because I don't think they have any legal ground to stand on. But, you know, the state of Delaware: the corporation code of Delaware has a provision in it that says if you copy the corporate code of Delaware without permission from the secretary of state, you're in jail for three months. Clearly unconstitutional, but it's on the books. And I think that's really the core principle, raising awareness that, again, edicts of government, the primary legal materials of the United States, need to be available. I'll grant you that PACER is a lot of secondary materials as well as primary, but you can't have the primary materials without the underlying briefs. It's like the Supreme Court: you've got Marbury versus Madison online, and then there's what the lawyers argued, and I think that's a vital part. It's like the legislature, right? We have our laws online, but we also need the congressional hearings that led to the laws, because when you go to court you're going to argue not just about the law but also about legislative intent. When you appeal a court decision, you're going to be looking not just at the court's opinion but at the briefs and the arguments that the lawyers made. And so I think any entity that emits edicts
of government, things that we have to live by, also has to emit the supporting materials that are underneath those, and that's kind of the big picture. You know, I sent out notes to all sorts of places saying, hey, I'm doing PACER this month, and all sorts of law schools, and not one of them said, oh gee, come on in and talk, because they've got Bloomberg or something else. I am finding all the legal hacker communities and journalists are intensely interested in this issue, so that's who I'm trying to reach out to, folks that are not already, you know, having access to the system. I was a little disappointed, though, because I thought more law students would view this as a fundamental issue. The law librarians get it, right? They're totally behind this campaign. The professors are kind of like, eh, you know, I do justice, I do the Constitution, this is a legal research thing. And so getting them to pay attention is kind of hard. So again, I don't know what the answer is to getting people to stand up and say something, but I'm trying different ways and seeing if any of them work. Well, can I ask you about the nature of these PACER documents? When you download them, I guess they're like PDFs that you have to go and run Tesseract on. I guess my question is, you know, with, say, the SEC and a lot of this data, you can download it, I think, now in kind of CSV or a kind of structured format. So do you see this first campaign to get all these PDFs as a stepping stone to eventually, like, all lawyers file their briefs electronically, or all this data is available and structured? What is the path that you take? So the question is, what is the technical path here? So SEC was actual real data when I did it, and now it's XBRL, which is nice, you know, XML-formatted information, really well done. With PACER, everybody pretty much e-files, with the one exception of my Ninth Circuit brief that I sent in. It turns out if you want to e-file something to the Ninth Circuit, you can only do that if you're
appealing an existing decision, and since we're not appealing a decision, we had to print it out and bring it to the clerk's office, which then scanned it and put it online. Now, for the folks that do e-file, many of those actually are born-digital documents. Some lawyers print them out, scan them, and then e-file them. And so the long-term path is, if we begin getting broader access and doing research, maybe we can get better standards on docket formats. Maybe. So today, if you file in some courts, you cannot have an active hyperlink. So even if you've got a URL there, it can't have a link under it. And so the last thing that a lot of law firms do is, you know, like my briefs all have a ton of URLs in them, "this may be viewed at the following URL," and we remove the underlying hyperlink. Maybe we could convince the Administrative Office that hyperlinks are okay. I understand why they don't want them in there, because they're afraid that they'll resolve to something weird, a virus will be introduced, you know, whatever. But yeah, that's the hope, that if all of a sudden there is broader access to the data, more people are using the data. You know, every state has a different docket format, and it makes it really hard to analyze state court decisions, so maybe a standard there, on hopefully some simple XML format, although who knows, when they all get together you end up with some, you know, million-option DTD and schema. But you know, you can always hope they do it right. But yeah, that's the hope, that more people use the stuff. Right now, West, Lexis, and Bloomberg are the big users of the data, and they don't complain, because one of the things they do is they add value to this data and they try to compete, right? And so a researcher being able to knock on the door and say, you know, we could do this much better... I actually think if we had a broader base of users, West and Lexis and the others would also stand up and begin asking for better formatting of the underlying information, because it's a pain in the neck for
them; every one of them has to do the same thing that they do. So, Carl, help me sort of zoom out and get the big picture. You've been involved with an enormous number of fights about the public domain, about access to government information. You know, PACER is one that I think many of us heard of because Aaron was very good at sort of attracting attention to it, and obviously, as people have gone back and learned about Aaron's legacy, it's something that people have learned about as well. You just referred to it in some ways as saying, this month I'm doing PACER. Help me understand the larger arc of your work at this point. What are the interrelated issues that you're working on? How do they work together towards the change that you want to see within the US? This is all about access to knowledge, right? This is about letting anybody who has a computer on the internet, and today that means anybody anywhere in the world, bootstrap themselves up and educate themselves about something. I learned about the TCP/IP protocols by downloading all the RFCs and reading the things, and that's how I learned about the internet. I could have gone to college and learned, but you know what, I didn't do that. I did something else in college. But to me it's very important that we have access to knowledge in a broad framework, right? Not just the stuff I do; the entire Library of Congress should be digitized and available. As Aaron Swartz was looking at with JSTOR, right, important scientific journals should be available on a broader basis. To me that's a fundamental human right that is possible because of the internet. And the area that I focus on is mostly the federal government and the state government, and I do that because there's a hook, because with the federal government, works of government have no copyright, right? We have a right to that information, and so it's something where I can go in and say, you know what, just give us the data. It's a technical issue. The law,
similarly: works of government is a federal issue, right? In the United States, works of the federal government have no copyright. But in the United States the law has no copyright either: state, municipal, anything that is an edict of government has no copyright. And that is the area that I specialize in, because it's a hook, right? It's something I can look at and say that this is important information. To me, technical standards, things like building codes, are not just the law, they're the codification of engineering knowledge, right? The National Electrical Code is the way to safely do electricity in the United States. Maybe it's not the optimal way to do it, but it's the way that everyone has gotten together and agreed on. It's an incredible educational tool for a young person wanting to be an electrical contractor, or wanting to be a plumber, or a factory worker trying to understand what makes a safe ladder. It's a codification of technical knowledge, and those are some of the most important laws in our modern world, because if you look at things like civil procedure and criminal procedure, most people don't interact with that, but every homeowner has to deal with the fact that their electrical outlets have to be, you know, no more than three feet apart, and the reason for that is you don't want big long cords stretching across the floor. Now, here at the Media Lab you probably have those cords all over the floor. But again, it's the kind of thing that is education about how to practice our modern world safely: how to safely transport oil in rail cars, how to make factories safe, how to make our environments safe. And again, it may not be optimal knowledge. Maybe you don't like the EPA regulations, but if you can't read them and discuss them, then we're never going to get better laws, we're never going to get better knowledge of that sort. So to me it's about education, but also about justice and democracy and, you know, those kinds of little things, because I think that's an important thing. In the United States we are overly
lawyered, and one of the reasons is you have to be part of the guild in order to access the material. And as I've been doing this issue for a while, there are so many people that are non-lawyers that are intensely interested in the operation of our system of justice, and I think those people should have the same access as those that are actually practicing inside. So the ideal outcome for you is one in which anything that ends up being law, or sort of background material to the law, is freely, publicly accessible, and the transformation you hope to see from that is law being more open to the non-professional, either in terms of activism or in terms of participation within the system? More open and better. And we're doing this... so this whole technical safety codes thing I'm doing not just in the US. We're doing it in Europe and we're doing it in India, and in India there's an incredibly strong reason that these public safety codes have to be available. That's because the Indian Constitution has fundamental rights. They're like our Bill of Rights, and, you know, fundamental rights, there's freedom of speech, but there's another one in there, which is the right to practice your profession. It's a caste thing. In India you can't say you can't be a plumber because you were born in this caste, and so it's a fundamental right under the Indian Constitution. The Indian standards are all about how to do plumbing safely, how to be a safe textile worker, how to do irrigation, how to build dams, and so we have a very strong constitutional argument in India that these technical standards should be available, just like in the United States: if it's a law, it's got to be public. So yeah, the hope is that we get this basic principle. In the United States, I've been advocating an edicts of government amendment to the Copyright Act, which follows on Copyright Office procedure that says edicts of government have no copyright in the United States. A long run of Supreme Court opinions has said that the law has no
copyright, but getting Congress to say that would solve some of these problems we have in the state of Georgia, and with the National Electrical Code, and with municipalities having copyright over their city ordinances and suing people for making copies of them. It's a fundamental problem. It's also got an effect that I think is really important, if you look at legal tools. Anybody use West or Lexis or Bloomberg? Have you used any of those systems? They're awful. Oh my god. The innovation that we've seen on the internet has just not hit the legal profession, and I think one of the reasons is we've got these walls around our primary legal materials. So when I go speak to a judge, I say one of the things you're going to get is justice and democracy, but you're also going to get better tools, because right now, if you're an internet startup and you want to gather all the statutes and court opinions and stuff, it's a 10 to 20 million dollar endeavor to do that, and it ought to be just simply downloading the silly things, but it's hard to get those materials. And so you're not getting any clueful internet startups going after West and Lexis. The only ones that are doing that are companies like Bloomberg that have very, very deep pockets, and they've reinvented West. It's better than West, but, you know, it's not at the level that we kind of expect from Silicon Valley. Anybody else? Could you talk about these tools, things like The State Decoded, with that project sort of fighting the battle from the other direction, inventing all sorts of neat ideas like the proclamation of digitization, just cutting the books down and scanning them, and then, hey, I digitized it? So The State Decoded is an XML format that Waldo Jaquith put together for the state of Virginia. My friend Seamus Kraft does a lot of municipal codes, and I've actually been very active in that effort. I got about 1,500 municipal codes and put them online, and based on that, a lot of cities decided that maybe they wanted to work with Seamus to, like,
get their code up, and we've had remarkable success at the city level. The cities of Chicago, San Francisco, and Baltimore all have open data available, XML format. It's really clueful, public domain. Washington, DC did an amazing job. They actually now have the official version of the DC Code available with a CC0 stamp on it, no rights asserted, not just an unofficial version but an official version. And so yeah, change kind of bubbles up from the bottom. We've had much better luck with the cities than we've had with states and the federal government. Much more responsive, because people read their municipal codes, right? You do a better municipal code and you're the city clerk, and we found this in Chicago, that city clerk's going to get all sorts of fan mail from realtors and insurance people and contractors. They're all going to go, whoa, this is way better. So there's a huge kind of upside for those folks to make those laws available. I don't know, I don't know how you get that magic moment where all of a sudden change occurs. Some of it's serendipity. Some of it is that states are bigger, they're older, they're less in touch with the people. The people that do the law are low down in the state government, whereas a city clerk is usually an elected office and has a certain degree of autonomy. That's why I wanted to be Public Printer of the United States. I wouldn't have had authority to make decisions, but, you know, when you're a constitutional officer or you're a Senate-confirmed person, you can at least go knock on the door of someone else and they'll probably take the meeting, right? The Administrative Office has avoided meeting me, but I know if I was Public Printer of the United States they probably would have at least met me once or twice. Sorry about that. This, as it turns out, is sort of one of the great open questions in civics at the moment. There is a reasonable amount of enthusiasm for the idea that there could be change possible at city levels, and you hear people saying things like, mayors are the new presidents, those are the
races we care about, that's where we actually feel like we're going to have influence. But of course part of what's going on is that people then find themselves sort of abandoning battles at a state or a national or international level, due in part to a lack of confidence in those institutions. So that question of how you take momentum that's going on at a city level and figure out how to get it to scale up the stack is actually a pretty critical question for the field as a whole. I'm giving a speech tomorrow night at Civic Hall for Code for America in New York, and one of the things I'm going to talk about there is the Civil Service Commission in the late 1800s, because our federal bureaucracy was totally, totally broken. For one thing, every time a new president came in, the entire staff got fired and they hired a bunch of new people, not just the post office, the patent office, the whole bit. Teddy Roosevelt joined the Civil Service Commission, and they had statutory authority to go in to the agencies and totally reboot their personnel policies, and that is what made the Progressive Era possible, right? There wouldn't have been a Food and Drug Administration, there wouldn't have been a Securities and Exchange Commission, effectively done by the Department of Justice, if they hadn't reformed civil service. And to me the fundamental problem at the state and the federal level is broken information technology. $80 billion a year is spent on federal IT. I happen to have done a lot of work on the IRS: $2 billion a year they spend on information technology, and a lot of that is on Windows XP platforms. They had to get a special waiver from Microsoft to continue supporting Windows XP. The exempt organizations database is processed on a Windows XP box with an Oracle application on top. So to me that's the fundamental problem at the federal level. That's why we can't get change, because you go see the Archivist of the United States, wonderful guy, David Ferriero, and you say, you know,
I don't like the way you're doing electronic archiving. He'll sigh. You know, he's got this thousand-page contract with Lockheed Martin, and it would be really hard to change that. It's just really hard to go in and say we want to do things better, even if you want to do things better. And to me that's a fundamental issue. It's a kind of bootstrap issue: if we don't solve that in our federal government, we're going to continue to have a gridlocked federal bureaucracy, because the tools we give our civil servants, many of whom are amazingly talented and dedicated to public service, those tools suck. They're really, really bad. Megan Smith, right, Chief Technology Officer of the United States, used to run Google X, is sitting there on a Dell box running, you know, Windows 7 or Windows 8 or whatever, but she can't go, oh gee, you know, I'm going to run Unix or Mac or whatever. It's just fundamentally hard to do changes in IT, and that means you can't change the way government works, and to me that's something that you're not going to do. And that's my big argument with our chief information officers and OMB. The folks are very dedicated people that are trying to solve our procurement problems, but I don't think it's enough, right? I think it's something that the president, members of Congress, and others need to say: this is a real problem, we need to solve this. Because, you know, you look at a lot of these billion-dollar systems that they build. The state of California: three billion dollars to automate that many courts. Do a little math on that one and try to figure out if you could spend three billion dollars to automate that many courts. It's a lot of money, and the system never worked. They had to throw it away. The FAA, the NextGen system for the future of air navigation, hasn't worked. It's not going to work. It probably needs to be shot. The patent office: really good people went into the patent office, but they're still running with missing source code on key applications. I went to
see the commissioner and said, you know, you have all the patents online, all you've got to do is put in an FTP server. And his first reaction, he looks at me and goes, well, you've obviously never worked on government stuff before. I said, well, you know, I actually put the patent database online three times. He goes, well, my electrical grid is at 96%. If I plug in one more computer, the whole building goes down. And I'm looking at him and I'm going to say, well, you know, we could shoot some of those hundreds and get you a few watts of electricity, and maybe we could throw in a few blade servers, but, you know, that was not the conversation you have with an Under Secretary of Commerce. So luckily Google went in and was able to do all the bulk patent data and make it available for everybody, and it was a big win. But, you know, I started that database in '94. Jon Orwant, who is right up the road, is the guy that did the Google patent thing, and, you know, that was 2009-2010, and even now they're retrenching. They had it online, and they took the thing away from Google, and now they have another contractor, and it's still not available the way it should be. So we probably have time for one last question. Does anybody want to throw a great one out there? Saul, close this up. Actually, I'll add two suggestions for PACER. One: you know, Michael Moore did this great movie called Roger & Me, about his attempt to meet with the president. I mean, I think you've got a great public shaming case here in the fact that the judicial office, the judicial Administrative Office, won't even meet with you. They figured that out, and that's why I have a meeting next week with Mr.
Launey. So they understand that's a PR issue. The other, from years ago, which is the only way the PACER issue has still stuck with me, is how much the federal government has to pay. About 30 million dollars a year the federal judiciary is paid to get access to its own documents. The Department of Justice has to pay to access PACER, and I have never seen this memo, but I heard from an Assistant US Attorney that occasionally they get a memo saying, we're over budget, please stop defending the government so aggressively, because we're spending too much on PACER. That's insane. Something's wrong. On the PR level, you know, those sorts of things, PACER hearings: I think a big PACER hearing in Congress, if they invited the right people to talk, would be a good way to begin turning this into something that perhaps the Washington Post or others would cover. Right now it's really hard to get mainstream media to write this up unless there's some huge scandal or privacy problem. You know, we hit the New York Times with our privacy audit, but it's really done, and there hasn't been a lot of PACER stuff since then. So yeah, I mean, it's partly a PR issue, and again, I just never know when something hits that big PR button and people get interested. With the IRS, you know, you would have thought 600,000 Social Security numbers in the exempt organizations database was a huge scandal, particularly given all the Lois Lerner stuff that was going on. We got very little press on that, very little congressional interest. So this is a room full of people who spend a lot of time thinking about how you build popular movements around things. If anyone wants to come and advise Carl and try to figure out how you build a big movement on this, I'm sure he would welcome the conversation. For everyone who's taken a PACER postcard, I would ask you to please write your thoughts on PACER and bring them back to Carl so that they can be sent to the state judiciary. And let's have a round of applause. Thank you. Thank you. And if you don't have anything to say about PACER,
just pass up the blank postcards. Now, you don't have to do the postcards. You can just send a letter to a judge; I mean, that works as well. But what I'm trying to do is collect as many of these as I can, so if I have enough, I'll put them up on Flickr, and not only do the judges get them, but other people can see them. Thanks for this.