Good afternoon. I am Hemant Bakshi, a mechanical engineer working in quality control at the Indiana Transmission Plant, a unit of Stellantis N.V. Previously, we were known as Fiat Chrysler. This is my first time presenting at Open Source Summit, so I want to thank the Linux Foundation for giving me the opportunity to present. Today I am going to present the work I did in creating a quality control dashboard using Python for recording and reporting manufacturing product quality data.

For the disclaimer: this work represents the views of the author, that is, me. It does not necessarily represent the views of Stellantis. Stellantis is a registered trademark of Stellantis N.V. Any third-party companies, products, or services mentioned are their own.

As Tom Peters said, almost all quality improvement comes via simplification of design, manufacturing, layout, processes, and procedures. Simplifying the process helps in reducing errors, which in turn helps us create better product and service quality. So simplification is the overall goal of the project. Going forward in this talk, I am going to answer how I reached this goal, the questions I had in the process, and the methods I used to get there. First I am going to answer why I used open source. Then I am going to give you a high-level overview of the project and the open source tools I used. Then I will dive into the actual project I worked on and the lower-level objectives achieved. And finally, the outcomes attained and the challenges I faced.

So, why open source? The first reason is flexibility: more control and flexibility in my design. I can create whatever I want to create; it gives me the freedom to be more creative. The second reason I used open source was easy integration with different backends and databases. I needed to use data from various sources: SQL databases, Excel, Access, CSV files, Google Sheets forms, and APIs for live data.
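As a sketch of what pulling from a couple of those sources looks like in pandas (the file name, table, and values here are made up; a tiny CSV file and an in-memory SQLite table stand in for the real sources):

```python
import sqlite3
import pandas as pd

# Build a tiny CSV file and an SQLite table as stand-ins for the real sources.
pd.DataFrame({"part_id": [101, 102], "status": ["pass", "fail"]}).to_csv(
    "checks.csv", index=False
)
conn = sqlite3.connect(":memory:")
pd.DataFrame({"part_id": [103], "status": ["pass"]}).to_sql(
    "checks", conn, index=False
)

# Read each source with its own pandas reader, then combine into one table.
from_csv = pd.read_csv("checks.csv")
from_sql = pd.read_sql("SELECT part_id, status FROM checks", conn)
combined = pd.concat([from_csv, from_sql], ignore_index=True)
```

The same pattern extends to `pd.read_excel` for spreadsheets and plain `requests` calls for live APIs.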
I found tools in Python that I could use to get the data from all these sources and process it in one place. The third reason I used open source was good community support: a lot of help is available online from the community, and you can get almost all your answers there. And the final and most important reason: it was fun. Having your own project gives you a platform to experiment, be creative, and be innovative. Working on the project was a lot of fun and a great source of learning.

Here I give you an overview of the basic design. The user puts in the input data. The data is then processed and stored in the database of your choice. We then retrieve the data and analyze it to generate useful information. Using visualization and presentation tools, we present the information to the stakeholders in an easily digestible way, so they are able to make informed decisions.

We can break the basic design I showed you in the last slide into four steps. With each step I am going to highlight the Python open source tool I used. The first step is collecting the data. A key part in this step is determining which data is needed. It might be quantitative data like sales figures, or qualitative data like customer reviews. I used Tkinter to create forms for data input.

The second step is data processing, or cleaning the data. This step gets the data ready for analysis. There are always errors in the data: duplicates, missing data, outliers, all of which are inevitable problems when you are aggregating data from numerous sources. NumPy and Pandas provide powerful functionality for data cleaning and manipulation.

The third step is data analysis. That is where the fun starts. Data exploration is done using various data visualization techniques to find trends in the data. The type of data analysis we do depends on the objectives or goals we are trying to reach.
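Going back to the cleaning step for a moment, a minimal sketch of what that looks like with pandas (the measurements and spec limits here are made up):

```python
import pandas as pd

# Hypothetical raw quality-check data aggregated from several sources.
raw = pd.DataFrame({
    "part_id": [101, 101, 102, 103, 104],
    "diameter_mm": [25.01, 25.01, None, 25.03, 99.9],  # None = missing, 99.9 = outlier
})

clean = (
    raw.drop_duplicates()                # remove duplicate entries
       .dropna(subset=["diameter_mm"])   # drop rows with missing measurements
)
# Keep only plausible values (a simple outlier filter; limits are illustrative).
clean = clean[clean["diameter_mm"].between(24.0, 26.0)]
```

After these three passes, only the two trustworthy rows (parts 101 and 103) remain for analysis.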
There are various techniques available, such as univariate and multivariate analysis, linear regression, and time series analysis, just to name a few. The scikit-learn library has features that allow you to build regression, classification, and clustering models.

Once we are done with our analysis, the final step is to present our insights. We can answer questions like: did our data analytics process derive meaningful results? Were the results in line with our expectations? This is also the step where we share the insights with the wider world, or at least the organization's stakeholders. This is sometimes complex, because sharing the raw results of your work is not enough; you have to present them in a manner that is digestible to all types of audiences. For this reason, data analysts commonly use reports, dashboards, and other interactive tools to support their findings. Matplotlib is a commonly used library for plotting data for visualization. Plotly and Bokeh are other highly specialized libraries for creating interactive and effective visualizations.

Most of the Python libraries are robust and have incredible features, so I did not have much trouble deciding which package to use. However, I did have to do a bit of research to figure out the best visualization library for the dashboard. PyViz.org is an open source platform which tracks current Python visualization tools and developments. The libraries there are split into information visualization and scientific visualization. The scientific visualization libraries are those based on OpenGL that provide high-dimensional graphs; all other libraries fall into information visualization. The information visualization category can be further distinguished into libraries based on Matplotlib and those supporting JavaScript. Matplotlib is the basic library available in Python.
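As a quick illustration of the kind of static chart Matplotlib produces (the counts are made up; the `Agg` backend renders straight to a file with no display needed):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend: render directly to a file
import matplotlib.pyplot as plt

# Made-up daily quality-check counts for one week.
days = ["Mon", "Tue", "Wed", "Thu", "Fri"]
counts = [12, 15, 9, 14, 11]

fig, ax = plt.subplots()
ax.bar(days, counts)
ax.set_xlabel("Day")
ax.set_ylabel("Quality checks")
ax.set_title("Checks per day")
fig.savefig("checks_per_day.png")
```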
JavaScript-based libraries provide interactive features that are more advanced than those based on Matplotlib. For most of my purposes I used Matplotlib to create static graphs; I mainly needed interactivity to create the dashboard, and that is where I explored the JavaScript-based libraries. Next I am going to go through a couple of JavaScript libraries and why I decided on the one I used.

The ones I was exploring were Bokeh and Plotly. I started experimenting with some finance data and plotted a multi-line graph and a candlestick chart in both libraries. I tried to make them interactive by integrating widgets: Bokeh comes with its own widgets, while Plotly uses ipywidgets. I found it easier to work with Plotly. Plotly handles a lot of the graphical work, so I did not have to do the lower-level work myself. With Bokeh, there are lower-level details that need to be provided. That gives you more flexibility in designing your dashboard, but there is a bigger learning curve in using Bokeh. Plotly also has better aesthetics and interactive capabilities. So I went with Plotly because of its ease of use and its interactive capabilities.

All nice and good: now I had all the tools and visualizations I needed. But one thing I still needed was to be able to render those graphs as a web page for dashboarding; I needed a web server. Some of the web servers I looked at were Voilà, Bokeh server, and Flask. I decided to use Voilà because it worked with Plotly and was integrated with Jupyter notebooks.

Until now I have been describing the tools I used. Now I am getting into how I used those tools to create my dashboard and simplify the process. The process I was trying to simplify was the quality check flow. So the first question we are going to ask is: what is the quality check flow?
In any manufacturing facility, when a machining operation is done on a part, a check has to be done on the dimension that was modified, to make sure it conforms to the drawing specification. In the diagram you can see that once a part is processed through the machine, the machine operator picks up a part from the machine and takes it to the quality check lab to check the dimension, and he does this at a defined frequency. Once the part is dropped off at the quality lab, the lab technician checks it on the gauging machine and verifies whether the measured dimension is within specification. If it is within specification, the quality tech tells the operator that the part is good and he can keep running the machine. But if the quality lab tech finds that the part is out of specification, he lets the operator know to stop the machine and get it fixed. Once the operator gets the machine fixed, he runs another part and brings it back to the lab to be checked again, and the cycle continues.

By simplifying this process we were trying to achieve a few lower-level objectives. The first was to figure out how many checks we were doing every day, because we needed to answer the question: does the number of checks line up with the number of checks we are supposed to do per the control plan? The second objective was to find out whether, in case of a bad quality check, we are reacting quickly: are we shutting down the machine quickly enough not to make more bad parts? The third objective was: whenever we have a bad quality check, are we recording the fixes we did on the machine for future reference? And the fourth objective was to get those three objectives done as painlessly as possible by automating part of the process.
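The pass/fail decision at the heart of that loop is simple to express in code (a sketch; the nominal dimension and tolerance here are hypothetical, not an actual drawing spec):

```python
def conforms(measured_mm, nominal_mm, tolerance_mm):
    """Return True if the measured dimension is within drawing specification."""
    return abs(measured_mm - nominal_mm) <= tolerance_mm

# Example: a 25.00 mm nominal bore with a +/-0.05 mm tolerance.
conforms(25.03, 25.00, 0.05)   # in spec  -> keep running the machine
conforms(25.08, 25.00, 0.05)   # out of spec -> stop and fix the machine
```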
The plan was to collect data by creating forms, store the data, and run a script to generate reports. So how did we do that? When the machine operator drops off a part at the quality check lab, he fills out a form that puts in an entry for the drop-off. The lab technician then checks the part, and we created another form for the lab tech where he signs off on the check. When he does that, an email is sent to the operator and everyone in the department regarding the status of the check. Once all the data is inputted, it is stored, and I wrote a script to generate reports to track progress. I can email those reports automatically using SMTP servers.

On the left is the form I created for the operator to drop off the part, and on the right, the two forms are for the lab tech to sign off on the check. And these are snippets of the generated daily reports. I used Matplotlib to generate static graphs, and XlsxWriter to write data and draw and manipulate charts in Excel. And I used an SMTP server to send these files as email attachments automatically, so I have to do less repetitive work.

Finally, I also created a dashboard to use during our weekly meetings. Using Plotly and Voilà I was able to create this interactive dashboard. I will present you with a brief demo of how this dashboard works. I had some trouble getting the interactive dashboard to show up on screen at first; sorry for the delay. So, this is the dashboard I was able to create. Plotly provides interactive hierarchical charts called sunburst charts, which I am using here. If I click on one and select a level, I am able to use it as a filter to open up the data in the bar chart and the table. I use different sunburst charts as filters, by issue and by department.
I am able to use these as filters for the bar chart and the table. Also, at the bottom I have a pivot table; it is a bit hard to drag. With the pivot table I wanted to keep the analysis flexible: we can use any feature we want to do a deeper dive and draw insights from. When I started, Python had a few pivot table packages, but I was not able to convert them into a widget and render them for use in my dashboard. So I wrote my own pivot table widget on top of those packages. I have it on my GitHub, and you can go download it from there. I was also able to publish it as a package on PyPI, so you can import it as a package. This way I was able to contribute back to the community which has taught me so much.

Lastly, my experiences. The pros: Python has a ton of packages and good documentation, which makes it easier for you to learn and implement. Tools and packages: you can find tons of tools for almost all of your requirements. And finally, tutorials: a lot of tutorials and help are available online. The only cons I could find: first, you have to be careful about the packages and licenses you are using as open source; and second, the packages are updated frequently, which might create compatibility issues with future Python versions.

So that is how I created my dashboard. Any questions? Thank you. Thank you for coming in.