Well, first of all, thank you so much for the invitation and for having me here today for my presentation. My name is Gilles Villas. I am the City of Syracuse's Chief Innovation and Data Officer. Today I want to offer a perspective on how to think about data, and about a data mindset, in an organization such as a city government.

I lead the Office of Analytics, Performance and Innovation at the city. That's a department of 20. We have multiple divisions, one of which is our data division, which includes data engineers, data analysts and data project managers, among them a team member who is here, Armina. We work with departments throughout the city, trying to think about how we bring that data out.

So here's the journey I want to take you on today. I'm going to be talking about our different foundations for data in city government. First, understanding the challenges that my department, and the city, are trying to solve for. Then, what needs to be in the foundation for data, as well as the role that open data plays in city government. And then I'm going to deep dive into some real-life use cases: what are we doing in city government to use data?

To start off, I actually want to talk about something completely out of left field, which is the NHS. Does anyone know what the NHS is, or what that stands for?

It's a national health care system. Anybody can get the free NHS option.

That's right. It's the National Health Service of the United Kingdom. Now, what sort of data do we imagine an organization like the NHS would have, would work with?

[inaudible] One would be patient health records.

Yeah, that's a big one.
[inaudible] data in many different forms, like what kinds of entities they have and [inaudible].

Also, I would add administrative data: timekeeping for doctors and nurses, how much time people are spending in the exam room, for example. So there are all sorts of data.

The reason I wanted to bring this up is that I was reading an article in The Economist about some of the NHS's data pain points, and I thought it was such a great encapsulation of any large organization's data issues. I just wanted to frame the talk with it, so I'm going to review it a little bit. For those of you who may not know, this is an organization whose budget is 181 billion pounds, so that gives you a sense of the scale.

Anyway, the article I'm going to review is, I think, great: "How to make Britain's health service AI-ready. The NHS should clean up and open up its data; patients will benefit." So the National Health Service holds vast amounts of data on Britons' health, organized using NHS numbers for every person in its care, blah, blah, blah. You might suppose it's a treasure trove for artificial intelligence.

Why would it be a treasure trove? What could they do with this data?

[inaudible]

Exactly, you could improve health outcomes. But what are some of the issues that may stand in the way? Does anyone want to take a guess before I reveal the punchline?

[inaudible] Data that's very hard to parse.

Yeah, great guess. Let's see what the article says. Much of the NHS's data is a mess, organized in a way which serves doctors treating patients but not AI developers [inaudible]. Without good systems, [inaudible], despite endless amounts of data.
When IT systems cannot talk to each other, the sick end up repeating their medical histories over and over. So the lack of a proper data ecosystem has real-life implications here for people's lives. [inaudible]

Again, we're talking about the same thing, but here is what they're trying to do about it. Such ambitions are meant to be realized by stitching together disparate datasets across the NHS in an ambitious project, the Federated Data Platform. Poor communication is a big part of the problem: many of the NHS's own data chiefs were recently unsure what the FDP was meant to do. As pressure has mounted on front-line services, digital transformation has once again been deprioritized, with hospitals spending 234 million pounds on storing paper records. The procurement process has also raised eyebrows [inaudible]. The frontrunner for the 480 million pound contract is Palantir, which has worked with data for the CIA.

So, what can we take away from all of this? What are the common themes that affect me in my job? Metadata, as mentioned. IT systems that do not talk to each other. Paper-based records all over, where I can't just go and say, "Hey, here's the data, let me analyze it." But also, from a people perspective, the lack of communication, and the lack of a common understanding of what a data system means. As you've all studied with metadata, we need to understand what each column means and what possible values it could contain. And the lack of an ethical framework around handling it. One example is privacy, but we can also think of things like algorithmic bias. Anyone know what that is? Has anyone heard of it before? Algorithmic bias? [inaudible]
Yeah, so algorithms just act on the data we train them on, so if the data has some sort of bias, it shows up in the predictions they make.

Right. So if we're thinking about treatment data, and the data we trained on wasn't representative, that same skew gets applied by the algorithm. That's algorithmic bias. All of this is stuff that's common in large organizations; I just thought the NHS case was an excellent encapsulation of some of these themes.

Now, to bring it home: as I said, I work with various departments, and I've surveyed them. What are you doing with data? More than half said they're not able to access internal city records, let alone data from elsewhere. 50% want to use their data better but don't necessarily have the resources or the know-how to do so. [inaudible] and 30% [inaudible].

So, what are we doing about it? How do I think about it? One of my mentors loves metaphors, and she always says you need to have a good metaphor. Her metaphor for data is that you need roads, you need rules of the road, and then you need [inaudible].

But then I came up with my own metaphor. [inaudible] If you want to make [inaudible], we need [inaudible]. We need a recipe, so thank you to my grandparents. And we need a kitchen. So what does that mean in practice? It means we need governance: we need rules for how we use our data. We need a culture that builds capacity for data practice across the organization. And when we talk about a kitchen, that refers to infrastructure. Infrastructure, governance and culture: that's how I like to summarize it. So, those are the three main components.
We need to develop a data strategy which includes milestones along those three goals. The point I'm trying to make here is that you can't think of data just as building infrastructure, right? You have to think about the data work, you have to think about the ethics, and you need to think about the people who are going to use it. So, through the strategy, we try to look at all of those things and develop milestones to build a data foundation. These include things like establishing a framework for internal data governance. We need to protect residents' privacy, by passing a privacy ordinance and by safeguarding the data that is at a sensitive level. Open data, etc.

When we talk about infrastructure, I'm sure you already know what it means, but if I'm to simplify it, it means we have data from multiple systems within the city. In this case I'm going to show our example from permitting. We have a database, it's called IBS; we have another database; and then we have a whole bunch of flat files [inaudible]. We do a whole bunch of ingestion and data transformation, and we write scripts so that this happens every night. We could have it happening every second if we wanted to, but that would be more expensive. Then all of that gets funneled into our cloud data warehouse. And then we use that to serve the data as reports, data tables or visualizations, depending on what department it is.
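The nightly flow just described (pull from several source systems, clean up the rows, load them into one warehouse table, then report from it) can be sketched in a few lines. This is only an illustrative sketch, not the city's actual pipeline: the table layout, column names, permit numbers and dates are all invented, and in-memory SQLite databases stand in for both the source system and the cloud warehouse.

```python
import csv
import io
import sqlite3

# Stand-in for one source system: a hypothetical permitting database.
source_db = sqlite3.connect(":memory:")
source_db.execute("CREATE TABLE permits (permit_no TEXT, issued TEXT, kind TEXT)")
source_db.executemany(
    "INSERT INTO permits VALUES (?, ?, ?)",
    [("P-001", "2024-01-05", "Remodel"), ("P-002", "2024-01-07", "New Build")],
)

# Stand-in for a flat file kept by another team (invented columns and values).
FLAT_FILE = "Permit No,Issued,Kind\nP-003,01/09/2024,remodel\n"


def extract():
    """Pull raw rows from both sources, tagging each with where it came from."""
    query = "SELECT permit_no, issued, kind FROM permits"
    for permit_no, issued, kind in source_db.execute(query):
        yield {"permit_no": permit_no, "issued": issued, "kind": kind, "source": "database"}
    for row in csv.DictReader(io.StringIO(FLAT_FILE)):
        yield {"permit_no": row["Permit No"], "issued": row["Issued"],
               "kind": row["Kind"], "source": "flat_file"}


def transform(row):
    """Normalize dates to ISO format and permit kinds to lower case."""
    issued = row["issued"]
    if "/" in issued:  # the flat file uses MM/DD/YYYY
        month, day, year = issued.split("/")
        issued = f"{year}-{month}-{day}"
    return (row["permit_no"], issued, row["kind"].strip().lower(), row["source"])


def nightly_load(warehouse):
    """Rebuild the warehouse table from scratch; a scheduler would call this each night."""
    warehouse.execute("DROP TABLE IF EXISTS permits_clean")
    warehouse.execute(
        "CREATE TABLE permits_clean "
        "(permit_no TEXT PRIMARY KEY, issued TEXT, kind TEXT, source TEXT)"
    )
    warehouse.executemany(
        "INSERT INTO permits_clean VALUES (?, ?, ?, ?)",
        (transform(row) for row in extract()),
    )
    warehouse.commit()


warehouse = sqlite3.connect(":memory:")
nightly_load(warehouse)

# A report served from the warehouse: permits per kind.
report = warehouse.execute(
    "SELECT kind, COUNT(*) FROM permits_clean GROUP BY kind ORDER BY kind"
).fetchall()
print(report)  # [('new build', 1), ('remodel', 2)]
```

Rebuilding the table on every run keeps the sketch simple and idempotent; running it more often than nightly would only mean calling `nightly_load` on a tighter schedule, which is where the extra cost mentioned above comes from.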
Another way of thinking about this: at the bottom you have integrated, organized data, with a data platform and solid data governance. That's the infrastructure and the governance part. In the middle you see what that ultimately transforms into, the uses of the data: capturing insights through analytics, predicting things with machine learning, automating operations to save time, or creating new products or applications. We have examples of all of these in the City of Syracuse, although, to tell the truth, we mostly stick to insights at this point.

So I would like to talk a little bit about open data. Remember that this is all one whole: the infrastructure, the governance program, the rules we set for how we govern our datasets are all part of it. But anyway, does anyone want to venture a definition of open data?

All right. So it's just data that people can access, to use for their own projects or research?

Yeah, that's exactly right. Anyone want to venture why open data is important?

[inaudible] You have a record of everything that's going on around a particular issue, [inaudible].

Exactly. All of those are brilliant.

Here you can see the implementation of our open data program, which you can visit online right now. You can search for particular topics. You will see which datasets have been added most recently, and you will notice here that it says "updated": I took this screenshot yesterday, and the dates are all recent. So this is a list of our [inaudible] properties; this is our Cityline requests, which is what residents use to submit service requests to the city; and we have code violations. The idea with all of these is that you can go look, you can analyze, and there are even APIs which you can use to pull the data into your own tools. So, some numbers to throw out there.
We have [inaudible] open datasets, [inaudible] of which are fed from our data platform with daily updates, because we feel that it's not enough to have datasets in the open; what's most useful is to have accurate, up-to-date datasets. So we have been doing the work of connecting our data infrastructure with open data, so that if someone, for example, wants to do something with it on their own, they have access to the latest data.

It's also important that these are high-quality datasets that have been reviewed and so on. That's not to say they're perfect. In fact, part of the value I see in our open data platform is that the city gets an extra set of eyes on its data. Sometimes researchers who have been doing something with one of our datasets tell us, "Hey, did you know about this? You stopped reporting this," and we hadn't seen it. That allows us to go back and fix the problem at the source. We also receive around 3,900 page views.

But open data is not all about the website; it's also about the people who use it. That's why, as part of our open data milestones, we make sure we are in the community, talking with users and capturing their feedback. We try to take a user-centered design approach, with user testing and all of that, to make sure we're improving the accessibility of the portal, but also going to events.
This one is from last year's Open Data Day, and we're doing it again this year. We also feature projects that have been done using our open data. We encourage people to do more things, and we feature them. I love this first one: it's Spotted Risks, created by Figo. I think she's a master's student, I don't remember at which school [inaudible]. Someone created a chatbot using our open data. And this is a beautiful analysis that lets you see on a map [inaudible].

This is part of a new thing we launched this year, which is a data challenge. We're inviting people from the community to do things with a particular dataset, as a way to continue that virtuous cycle: people getting value out of open data, which creates an incentive for the city to release more data, which gets people more invested by having more data, and so on and so on. Hopefully. As I mentioned, we have a lot of messy data in city government, so putting it out there and being forced to be open sort of puts pressure on us: hey, we really need to get our stuff together when it comes to datasets.

So anyway, I'm now going to turn to the final part of the presentation, the fun part: what are some cool things we can do with data in city government? There are many helpful ways you can think about it. This is just one possible way to categorize, and there are many different frameworks. This one, by the way, is from a framework that I put together when I was in college, for cities, which I helped write with my colleagues. It's just a way to think a little bit about what we can do: prioritizing, analysis, matching, estimating, targeting. Another way to look at the same questions comes from Chattanooga, whose data science team tries to explain how we can use data. Finding the needle in a haystack, for example using restaurant inspection data [inaudible]. Prioritizing: if you have too many
things to do, you can start with the biggest return, for example by looking at the main complaints. Optimizing resources, for example finding where they would be most effective. Doing evaluation, experimenting to find what works, finding evidence with A/B testing, for example on email conversion rates.

This is from [inaudible], an article on the case for government investment, basically saying: this is an upfront investment in money, resources and staff, but there are very real returns on investment that you can point to, by saving people time, or finding fraud, or improving collections. I think that's also very important. When you're doing analytics, you're not doing it for the sake of doing it; you're doing it to deliver value. Maybe it's not money, maybe it's not that the city has a bigger budget in the end, but you should at the very least be aiming to deliver value for someone, whether that's constituents or a department.

So, some of the things we've worked on in Syracuse specifically. This one is about transparency. It's a dashboard based on our spending of the American Rescue Plan funds. After the pandemic, the Biden administration released a lot of money for municipalities to power their recovery, so we said, OK, let's use this opportunity to also be more data-driven and more transparent. We decided to work on a dashboard that would explain to constituents where our money is going and in what categories. You can see here economy, families, government, infrastructure; how many projects are completed, how many are underway, how many are being planned. We also showcase how much money we've spent in each of these categories. Then you can also see, more granularly, what all the projects are, what the target output is and what the actual output is. For example, for the last one (this screenshot is from a while ago, so it may be different now) we don't have an output yet. But for some of these
programs, like this [inaudible] program, you can see in the second tab how much money we've spent, and you can see what the output goal is: number of stipends, and the target is 500 or so. So, a very simple visualization; I'm starting simple here. But honestly, the challenge here wasn't so much building a dashboard as building a culture of collecting data and setting goals, because we really didn't have a unified grant management approach. So the message to each program was: you're going to get money, but if you want to use this money, you're going to have to think about what you're trying to get at. What is your output? Ideally you would also be thinking about what your outcome is.

This is another of our products: our winter weather operations tool. We built this in collaboration with Esri. Whenever there's a snowstorm, [inaudible]. All of our snow plows are equipped with GPS sensors and other sorts of sensors. We take that data in real time, and every time there's a snowstorm we can track where the snow plows are. The beginning was very simple: you could just see the snow plows moving across the map. This helps alleviate some residents' concerns. And this is already public, right?
Some residents are like, "Hey, the plow hasn't come to my street," so they can go and monitor it here. Every time we plow, we have a layer where you can see how long ago that was. And now we are working towards a winter weather operations tool which tells you how our response was, in terms of time, compared to the previous snowstorm. We can track our performance over time, and we can identify why we're performing the way we are. [inaudible] But yeah, this is a big one for us.

This is our permitting dashboard, where we track different KPIs, in this case for permits. This is all leveraging what I was mentioning before, our Azure data platform. The way we use this is that, because there are multiple systems, we bring it all together, as I was saying earlier. And this helps facilitate what we call performance management. I don't know if you use this term here, but basically we sit down in a room with the mayor and say, "Hey, let's look at our permits. Let's look at residential renovation and remodeling. We know we [inaudible] those 50% of the time; that's [inaudible] 60% in the last week." I think it's all about how we use the data, and in this case we use the data to ask questions like: hey, why was our performance like that then? Why did it change?
Are we having more permits submitted this year? Or are we short on staffing? And we use the data to build hypotheses, like, "Hey, maybe this is, for example, staffing, so we need to work on that; we need to make sure that our hiring policies are..." We're already too much into the weeds here. I do think it's important to note that this shouldn't be used as a punitive tool. It's not like the mayor comes in and hits people over the head with it. But it's important to have everyone on the same page about how everyone is performing.

Final use case. This is not from Syracuse; this is from when I was working in support of the CODA. This is about fire prevention in the city of Sioux Falls. The problem here was that there are a lot of properties at risk of fire, and we do a number of fire inspections each year, but we can't possibly inspect all the properties. So we built a simple predictive model. The basic model is on the side here. We collect all the fire incident data, and we then combine that with other datasets at the property level. Has that parcel had a fire? That's the variable we want to predict, yes or no. Then we look at other things: square footage, or whether it has received a code violation, and so on.

There's a lot of exploratory data analysis. In this corner over here is a correlation matrix: which variables have a high correlation with other variables. And over there you can see the relative importance of each variable in what ended up being the model: which ones help predict the outcome the most. In the end this ended up looking like a tool such as this, which basically takes all the properties in the city and categorizes them by level of risk. You can see all the different variables that are part of the exercise, and you can also filter for the ones in red; those are probably the ones you want to prioritize for inspection.

So that's the end of my presentation, with one final plug for next week.
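Returning to the fire-risk use case, the steps described there (flag each parcel by whether it had a fire, attach property-level features, fit a yes/no model, then sort properties into risk levels) can be sketched as follows. The talk does not say which kind of model was used, so plain logistic regression is an assumption here, as are the parcel records, the two features, the learning settings and the risk cut-offs of 0.33 and 0.66. All of the data is invented.

```python
import math

# Invented parcel-level records; field names and values are illustrative only.
parcels = [
    {"parcel_id": "A", "sqft": 1200, "code_violations": 0},
    {"parcel_id": "B", "sqft": 2000, "code_violations": 0},
    {"parcel_id": "C", "sqft": 1500, "code_violations": 1},
    {"parcel_id": "D", "sqft": 1800, "code_violations": 3},
    {"parcel_id": "E", "sqft": 1100, "code_violations": 4},
    {"parcel_id": "F", "sqft": 2200, "code_violations": 5},
]
# Invented fire incident log; a parcel may appear more than once.
fire_incidents = [{"parcel_id": "D"}, {"parcel_id": "E"},
                  {"parcel_id": "F"}, {"parcel_id": "F"}]


def build_rows(parcels, incidents):
    """Join incidents to parcels: the label is 1 if the parcel ever had a fire."""
    burned = {incident["parcel_id"] for incident in incidents}
    return [
        (p["parcel_id"],
         [p["sqft"] / 1000.0, float(p["code_violations"])],  # scaled features
         int(p["parcel_id"] in burned))
        for p in parcels
    ]


def predict(weights, bias, features):
    """Estimated probability of fire from the logistic function."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def fit(rows, lr=0.1, epochs=2000):
    """Batch gradient descent on the logistic loss (assumed model, see lead-in)."""
    weights, bias = [0.0] * len(rows[0][1]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(weights), 0.0
        for _, x, y in rows:
            err = predict(weights, bias, x) - y
            grad_w = [g + err * xi for g, xi in zip(grad_w, x)]
            grad_b += err
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias


def tier(probability):
    """Bucket a probability into a risk level; the cut-offs are arbitrary."""
    if probability >= 0.66:
        return "high"
    return "medium" if probability >= 0.33 else "low"


rows = build_rows(parcels, fire_incidents)
weights, bias = fit(rows)
scores = {pid: predict(weights, bias, x) for pid, x, _ in rows}

# Highest-risk parcels first, as an inspection priority list.
for pid in sorted(scores, key=scores.get, reverse=True):
    print(pid, round(scores[pid], 2), tier(scores[pid]))
```

With features on comparable scales, the size of each fitted weight gives a rough sense of which variable drives the prediction, which is the role the relative-importance chart plays in the talk. A real project would hold out data for validation rather than scoring the same parcels the model was trained on.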
We have our Open Data Day, which I think is a really nice continuation of this session. We're going to have a session such as this one but more technical, using specific tools and whatnot. We also have a hackathon, which I encourage you to participate in. We have cash prizes thanks to our partnership with CenterState CEO; I think the top prize is a thousand dollars. So please, everyone who's free next Saturday, take a look. [inaudible] I'll tell you a few other things about it, but anyway, if you go to the CenterState website, you'll find it.

And also, I'm hiring. If anyone wants to come work for a city, we have an internship program available, with multiple positions in the fields of data, technology or project management. Feel free to pass that along to anyone who may be interested. Thank you very much.

[inaudible] If departments don't have the expertise, can you provide them with that? And also, how do you decide which data to open?

Yes. Let me start with the second question: how do we determine what should be open data? The first thing we did in the city was to build a framework for determining what should be open data. We ask all departments to inventory their data, and we review it with them and say, "Hey, maybe this is good for open data." But there needs to be a review where we take a look at the data, because obviously you don't want to open all of your datasets. You have PII, you have HIPAA, maybe critical infrastructure for the city. To take a very extreme example, maybe you don't want to put all of that out on the internet. So we go through all of that first, and then we'll go over to the
department to identify what data should be open. But it's not always easy to do, so we also open it up to the public to tell us, "Hey, what data would you like to use?" And that's a great excuse for us to go back to the department: "Hey, people are asking for this."

And what was the first question? What we're planning on doing about the shortage of data expertise?

Well, in the case of the city, my team sort of provides that expertise for the departments, the more advanced stuff. The departments know better what their operational goals are, and we work with them to meet them. However, obviously we can't serve every single department. So one of the things we've been thinking about, and I don't know that we've fully cracked this yet, is how we build basic data expertise throughout the entire city. One of the things we have done is designate a data expert in departments. This isn't going to be someone who knows advanced statistics or anything; it's someone who already knows what data their department has and works with it. What we hope for in the future is creating a basic curriculum, because it's very hard for analysts and data scientists to do the work if they don't speak the same language as the business, right? So there's some merit in basic data literacy. It can be Excel, right? It doesn't need to be anything more advanced than that.

What sort of skills do you look for in someone who wants to work for the city?

In this team, or in the city in general?

[inaudible]

Well, I always think that technical skills can be learned, but it's the soft skills and the people skills that are really, really relevant to doing your work well. I know there are some stereotypes, like, oh, if you're a software developer or a data geek, you just want to be behind a computer and not talk to anyone. When I hire someone, I look for the opposite of that, because I expect my data
professionals to be incredibly good at working with others, including others of various levels of expertise. I want someone who can serve as a translator: someone who can talk with a data developer, but also someone who can speak with a truck driver in sanitation and understand what they're saying, which may be completely different, right? Another quality, and the previous presentation covered this very well, is that sort of agile mindset: OK, we're going to start with one idea of what we want to get at, but it's going to change over time, so you need to be adaptable to do that effectively.

The open data portal is a very good website. I wonder, is this project only being done by Syracuse, or by most of the cities in the state?

The website that I showed, that's ours. We try to work with organizations that may have relevant data at a city level and include their datasets as well, for example [inaudible]. There's a bigger movement of open data portals. I don't know if the state has an open data portal; I think they do. But not every city has an open data portal.

I have a question. As a data science major, what skills do we need to master?

That's a tricky question for me, because I personally feel that we're at such a deficit when it comes to having quality datasets in the city that, if I hired a data scientist, I really wouldn't be hiring for the tools themselves, like Python and R. What I'm saying is, I don't care if it's Python or R. If you come to work with data at the local level, you need to understand that you don't have a lot of good input data. So even if you're a data scientist, perhaps you're going to need to think a little bit like, I don't know, a professional in data
quality first. Data quality. I would say that answer is the same in a lot of contexts; I don't think it's specific to the City of Syracuse or to cities. If you don't have data, you can't do data science. Some large companies have separate teams for data engineering and data science; in many companies they're together.

What are the most prominent issues that you and your office are facing right now that could be solved by data or basic technology?

The biggest issue we're trying to solve for in my department right now is related to sanitation. Trash collection has been a big priority for the mayor and has gotten a lot of attention. I would always recommend to a department that's starting out in this world to tie what you're doing to something leadership cares about, because this is not data science for its own sake. Sanitation is something the mayor cares about, and it's something we have been really working on, to capture data and display it in ways that are meaningful. That has to do with the sanitation trash carts.

Sorry, sanitation meaning, like, trash carts?

Yes. And we are also putting GPS devices and fleet management software in the trucks. But we also have illegal trash set-outs happening in the city, which [inaudible]. I think this is an interesting point about the intersection of technology and data, because if you don't have the appropriate system to collect data, then it's really hard to use it. Right now we're going to a new system for citing trash violations. Before, the code inspectors were writing on paper, and that paper would get back to the office at the end of the day, and then be sent to finance so that they could bill for the violation. It was all paper-based and really hard to track. So now
we're moving to a system where each inspector has a cell phone for taking pictures, and it also comes with a printer. There's an app where they can go out and say, "Hey, I'm going to cite this property because they put out trash bags without the carts," or whatever. That makes it more efficient for the code inspector, who no longer has to deal with the paper, but it also generates a lot more reliable data, which [inaudible] our website. So that's it. Thank you.

Last question: can you talk a little about the structure of your department? Do most of the people who work there have a data science background, or are they folks who were in different city roles and were trained in it? What's the breakdown of the folks in there?

Sure. It has changed over time, but right now we have a director of analytics and data management, whom we just hired this year to handle our entire data portfolio. We have data engineers. We have data analysts: not so much data scientists, but people who are really good at gathering requirements and building [inaudible], which is what we use in our case. Then we also have data project managers, to take those big, complex problems and make sure the development project is all going smoothly. We don't have a data scientist on staff, because I think that's [inaudible]. Perfect.