I am Stuart. I know I'm the last person on, so I'm going to try and keep the energy levels up. I work at the Open Data Institute as a software engineer, and I'm going to talk to you about some of the work we've been doing on Comma Chameleon, a desktop CSV editor built with reuse in mind. I work in the ODI Labs team, which is the R&D side of the ODI. We build stuff and work on research projects to help develop standards and techniques for using and publishing data, in particular open data. I started at the ODI three years ago, and as we were a new team, we had the pretty unique opportunity to try and get things right from the start. One of the things we talked about was whether we wanted to do 20% time, a concept I'm sure everyone knows about from Google. Although it no longer exists in any real form at Google, from what I hear, we still thought it was a good idea to try out. We started doing 20% time every Friday, but it quickly became unwieldy because other stuff would take over. We'd say, well, we've got to get this finished, so let's not do it this week, but we'll definitely do it next week, and we just found it wasn't happening. Plus, with one day a week it was really difficult to get anything significant done. So we decided that every fifth week we would have an innovation week (although most of the time we're supposed to be innovating anyway). It meant we had a decent chunk of time to develop something significant. Here are some of the things we've worked on. This one is from me: Git Data Publisher. It gives non-technical people the ability to publish datasets on GitHub: fill out a quick form, fill out some metadata, add the datasets, and as well as publishing the data to GitHub, it uses GitHub Pages under the hood to automatically create human-readable web pages with machine-readable metadata underneath, along with data previews.
If anyone's been to the ODI office, you might have seen this: a piece of artwork by an artist called Ellie Harrison. It monitors the BBC News RSS feed, and every time there's a piece of news about the economy, it gives you some crisps. It ran on a pretty old laptop; the laptop broke, and we couldn't get it to work again. So we rebuilt the code underneath it on a Raspberry Pi, and also added a RESTful interface, because why not? We made it tweet and send notifications to Slack and things like that. That was good fun. It's recently come back to the office, having been on loan to Somerset House in London for the data exhibition, which then went on tour. I like to think that I did some art, but I didn't really. My colleague James has also built a Chrome plug-in that lets you see diffs for CSV files on GitHub. We've also done a lot of smaller projects, like internal dashboards, research on things like blockchain, and small libraries that sit on top of some of our bigger projects. We've focused quite a lot on CSV in the last few years, and I've been plugging away at this for quite some time. In the early days of open data there was a lot of talk about linked data and the semantic web, and there's loads of potential there, but a lot of data publishers are non-technical people. A lot of the open data you're going to see comes as CSV, which is actually great: it suits publishers because it's dead easy to publish (you don't have to be particularly technical to write CSV), and it suits users because it's plain text and super easy to parse. However, flexibility is not always a good thing, and we've seen a lot of examples of badly published CSVs. This is an extract from a PDF showing the 2013 New Year's Honours List published on GOV.UK. Alongside this PDF, when I was looking some time ago, I saw a CSV. I thought, great, we can probably do something with this. So I downloaded it, opened it in LibreOffice, and saw this.
This is quite an extreme example, but from looking at it I'm pretty sure someone just went Ctrl+A, Ctrl+C, Ctrl+V into Excel, then File, Save As, CSV, job done, right? That made it difficult, if not impossible, to work with. This inspired us to build CSVlint, which I think a couple of people have talked about; if anyone was at csv,conf the year before last, my colleague James gave a talk on it. We had some funding from central government to do it. It allows you to upload a CSV, or specify a URL, and see how reuse-ready your CSV is. Here's a selection of the types of warnings and errors that we check for. I won't go into them all, but there are things like the content type (if the file is hosted on the web), invalid characters, inconsistent values, blank rows, that sort of thing. So you can go and check your CSV and get your results: so far so good. You can also specify a schema. We've based our schemas on JSON Table Schema, and they include things like whether a column is required, its data type, its title, and value constraints, which are generally expressed as regular expressions. If any of you saw Jeni's talk about CSV on the Web, we're just starting to support that as well. It's currently supported in the Ruby gem that sits underneath the CSVlint website, but we haven't yet got full CSV on the Web support on the website itself; that's coming. One of the biggest users of CSVlint schemas has been the Local Government Association in the UK. They offered funding to local councils to help publish their data, providing it met a given schema, which they would provide. People would get the CSVs out of whatever system they were using, or generate them by hand, go to CSVlint and check whether they met the schema, and if they did, upload them. But the main issue here is that a lot of people are still using Excel to generate CSV files, generally via a Save As.
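To give a flavour of the kind of schema checks described above, here's a minimal sketch in Python (not the actual CSVlint Ruby implementation; the schema layout and helper names are my own illustration) that validates CSV rows against a JSON-Table-Schema-style column definition with required fields, a type, and a regex constraint:

```python
import csv
import io
import re

# A JSON-Table-Schema-style description: each field has a name, a type,
# whether it is required, and an optional regex constraint.
# (Illustrative only; the real CSVlint schema format has more options.)
schema = {
    "fields": [
        {"name": "id", "type": "integer", "required": True},
        {"name": "postcode", "type": "string", "required": True,
         "constraints": {"pattern": r"^[A-Z]{1,2}\d[A-Z\d]? ?\d[A-Z]{2}$"}},
    ]
}

def validate(csv_text, schema):
    """Return a list of (row_number, field, message) errors."""
    errors = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_num, row in enumerate(reader, start=2):  # row 1 is the header
        for field in schema["fields"]:
            value = (row.get(field["name"]) or "").strip()
            if field.get("required") and not value:
                errors.append((row_num, field["name"], "missing required value"))
                continue
            if field["type"] == "integer" and value and not value.lstrip("-").isdigit():
                errors.append((row_num, field["name"], "not an integer"))
            pattern = field.get("constraints", {}).get("pattern")
            if pattern and value and not re.match(pattern, value):
                errors.append((row_num, field["name"], "fails pattern constraint"))
    return errors

data = "id,postcode\n1,EC2A 4JE\ntwo,not-a-postcode\n"
for err in validate(data, schema):
    print(err)
```

Row 2 passes both checks; row 3 fails the integer type check and the postcode pattern, which is exactly the style of feedback the CSVlint website reports.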
The problem is that Excel is designed for building spreadsheets, and spreadsheets are not CSVs, as most of you know. For some time, especially with the LGA project, I'd been thinking: wouldn't it be great if someone built a CSV-native editor? Rather than going "I've got my CSV, File, Save As", going to something like CSVlint, checking it against the schema, and then going back to fix the problems, wouldn't it be great if you could just do this on the fly, in a CSV-native editor that stopped common problems before they cropped up? This is where this wonderful bit of tech came in. Jeremy was talking about Electron earlier. For those of you who don't know it (I'd never used it before), it's what GitHub's Atom editor is built with. It allows people who primarily develop for the web to build desktop applications using JavaScript: Node on the back end and your front-end JavaScript on the front end, which is great. I'm not really a JS guy. I'm getting there, but it's still a bit of a challenge for one person to learn all this in a week and build something. Luckily we had help, in the form of three interns over the summer: Ben, Stephen and Daniel. I managed to convince them this was a good thing to do (they were 100% convinced), I had them for a week, and we got cracking. They put together user stories (we did it properly), tackled them one by one, and we had something. It's a simple, stripped-down CSV editor that removes much of the cruft you see in Excel and OpenOffice. It's just dead, dead simple. You can add rows and columns, and you can also validate the CSV: it calls out to CSVlint's (undocumented) API and returns the results straight away. If you click on an error down there, for example "empty row", it will show you which row is empty. Obviously this is a very simple example, but it will show you where the errors occurred, and you can fix any errors in place before you go ahead and upload.
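As a rough illustration of the kind of on-the-fly checks such an editor can surface (a sketch of my own, not Comma Chameleon's actual code, which delegates to the CSVlint API), here's a Python function that flags empty rows and inconsistent column counts, two of the common problems mentioned above:

```python
import csv
import io

def quick_checks(csv_text):
    """Flag a couple of common CSV problems: empty rows and
    rows whose column count differs from the header's."""
    problems = []
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ["file is empty"]
    expected = len(rows[0])
    for i, row in enumerate(rows[1:], start=2):  # row 1 is the header
        if not any(cell.strip() for cell in row):
            problems.append(f"row {i}: empty row")
        elif len(row) != expected:
            problems.append(f"row {i}: expected {expected} columns, got {len(row)}")
    return problems

sample = "name,age\nAlice,30\n,\nBob,42,extra\n"
print(quick_checks(sample))
```

An editor running checks like these on every keystroke can highlight the offending row immediately, instead of waiting for a round trip to a separate validation site.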
Fix your errors, validate again. We've heard some talk about data packages today as well; we support those, so you can go Export as Data Package, add your metadata, add your licence and keywords, and it will also generate a basic schema for you. If you click "generate headers", it will try and work out the schema, and it exports the whole thing as a zipped data package. The only problem was, we had a nice desktop editor but still had to cross the hardest hurdle in software development: we couldn't think of a name. Naturally, we reached out to the rest of the ODI team, and the best name was Comma Chameleon. The logo was designed by a wonderful intern, and who said software developers aren't artistic? We didn't stop there, though. I've done a bit more work since: big refactors of the code, writing tests. We've got continuous deployment of binaries now, so every time we push to master it pushes a new binary up to GitHub (new versions, bug fixes), and there's a new website, for which I shamelessly lifted the same template as the csv,conf website, because I have absolutely no imagination. I've also added further support for schemas: you can now open a schema in CSVlint format and it will automatically add the correct column names for you. What I want to do further down the line is live validation for each column, so if, for example, a column has to be an integer and you try to put in alphabetical characters, it will go red, or you simply won't be able to do that. Also matching against regular expressions, that sort of thing. That's something I want to add further down the line. So that's there for you to download, install and tinker with. I'm really, really happy to get any bug reports, and there are bugs. So have a look at the GitHub repo, and if there are any bugs that haven't already been reported, please let me know and I'll try to get round to fixing them.
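To show roughly what the data package export involves (a minimal sketch under my own assumptions, not Comma Chameleon's actual export code, and simplified relative to the full Data Package spec), here's Python that writes the CSV plus a `datapackage.json` with metadata and a naively inferred schema into a zip:

```python
import csv
import io
import json
import zipfile

def guess_type(values):
    """Very naive type inference for a column's values."""
    if all(v.lstrip("-").isdigit() for v in values if v):
        return "integer"
    return "string"

def export_data_package(csv_text, name, licence, path):
    """Zip up the CSV alongside a datapackage.json describing it."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, body = rows[0], rows[1:]
    fields = [
        {"name": h, "type": guess_type([r[i] for r in body if i < len(r)])}
        for i, h in enumerate(header)
    ]
    descriptor = {
        "name": name,
        "license": licence,
        "resources": [{"path": "data.csv", "schema": {"fields": fields}}],
    }
    with zipfile.ZipFile(path, "w") as zf:
        zf.writestr("data.csv", csv_text)
        zf.writestr("datapackage.json", json.dumps(descriptor, indent=2))

export_data_package("id,name\n1,Alice\n2,Bob\n", "people", "CC-BY-4.0", "people.zip")
```

The "generate headers" behaviour described above corresponds to the `guess_type` step: the exporter looks at the values in each column and makes a best guess at the field type for the schema.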
But there's always room for improvement. I want to add an offline mode, because at the moment it calls the CSVlint API, which isn't ideal: obviously, if you're offline, you can't validate, which sucks. Better support for Windows and Linux: at the moment I'm limited by the OSes on my laptops, so I don't get too much testing in, particularly on Windows. More support for schemas, as I mentioned before. CSV on the Web support, which I'm really keen to do. I'd really like to add automatic publication to GitHub, which should just be a case of File, Export. Dat support was an idea I had as well. I keep getting ideas, and it's probably dangerous. I just wanted to sign off by talking a bit more generally about why innovation time works so well for us. It allows us to try new techniques and tools and build things without the pressure of having to find funding, and we can solve problems for colleagues. We've got a whiteboard for suggesting ideas, and increasingly we want to involve other members of the team, non-technical people, and we're starting to teach some of them how to code as well. On a personal note, I've got two kids and a punishing schedule at home, and I don't really have time for side projects outside of work, so I'm really super lucky to have the opportunity to grow as a developer. So, yeah, that's quite short, but I'd love to hear any questions. You can see the website at comma-chameleon.io, and I'm @pezholio on Twitter. So if there are any questions or comments, please let me know. Yes, there are. [Audience question] We're trying to support both, basically; that's the thinking behind it. Not at the moment, but that's something we could quite easily do. We don't want to prescribe anything, and as there are these two divergent technologies, we want to really be supporting both, ideally. But at the moment we are very much focused on JSON Table Schema rather than CSV on the Web; that's something we want to do more on. Anyone else?
It might. I think there's actually a hard limit in Electron of about 20 meg; anything over 20 meg will probably make it cry. But again, I haven't really tested it on anything too heavyweight. The idea really is that it's more for creating than editing, though it's perfectly capable of that, and you can also import Excel files. That's something I didn't mention, but it is a thing it can do. [Audience question] Interesting project. What would be some examples of reasons for pursuing this as opposed to, let's say, writing plugins for Google Sheets, the online Google Docs spreadsheet? Something like a validation plugin for Google Docs. [Answer] That is something that occurred to me. I think partly it's out of wanting to experiment with new tools and techniques, because I'd never used Electron before. And partly, yeah, I think that's mainly it. It's certainly something we'd consider, because it's about the right tool for the right job, right? If people already use Google Docs or whatever, then it's a good idea to meet them where they are. But going back to Google Sheets, Excel and OpenOffice: they're not really designed with reuse in mind. They give you a little bit too much freedom. So the idea was, as well as the validation, to remove some of that flexibility that lets you do a bit too much. So, yeah. [Audience question] Hiya. Sorry? [Answer] It's pretty much do what you like. We've got this whiteboard and people do suggest things. Occasionally something will come out of that: we did some research on blockchain, which was suggested to us, and other little projects like that. It's a mishmash of things, really. In terms of how it's organised, we used to do two-week sprints; we now do one-week sprints. We do four sprints, and then the fifth sprint is an innovation week. That's basically how it works. But, like I say, I'd like to work more with internal teams and try to solve some of the problems they've got.
Because I think developers are very good at solving problems, but they're not always particularly brilliant at identifying them. That's why I'd like to work internally a bit more with people and actually find out what problems there are that need solving. Anyone else? No? Cool. Well, we're finished, aren't we?