Thank you very much. Hello everyone, my name is Atef. I'll be presenting the first half of the talk and then Urjuman will present the second half. The topic is Python for Arts, Humanities and Social Sciences. Both of us were trained in computer science, but we now apply Python, and computer science as a whole, in Arts, Humanities and Social Sciences.

The talk is organized as follows. We are going to touch base on what AHSS is, then a bit on our experience and exposure coming from data science and artificial intelligence, because it's very dominant. There are some data science roles where there is a difference between STEM and AHSS, so I'll throw a bit of light on that. Then some bits on Python and technology, followed by case studies, which will be covered by Urjuman.

So, Arts, Humanities and Social Sciences. The academic counterpart to this is STEM: science, technology, engineering and mathematics. Arts, Humanities and Social Sciences is where the user, the human, exists, and a lot of the time when we are trying to solve a problem, it's mostly about humans. Even when we are solving a problem from within computer science, the problem mostly exists outside of computer science, which is why we see a lot of interdisciplinary roles and applications within computer science as well.

One of the challenges with Arts, Humanities and Social Sciences is that less attention has been given to it, because there has been more of a drive towards STEM. However, problems are now coming from AHSS more often than before. Within universities, Arts, Humanities and Social Sciences get less attention than STEM, but recently there has been a change: there is a push towards Arts, Humanities and Social Sciences in Europe and also abroad.
However, searching for a common good in the domain of Humanities is a big thing for causes such as social good, social welfare, uplifting society, futuristic smart cities and all those kinds of things. The whole point here is the human in the loop: we are trying to solve a problem for the human. That is where the Humanities are quite important. In the UK, the area has been rebranded, because it was not getting attention. As soon as people hear Arts, Humanities and Social Sciences, for some reason there is a stigma around it in the minds of students as well as in society. So they tried to change that and call it SHAPE, which simply means Social Sciences, Humanities and the Arts for People and the Economy. And of course where there are people, there is society and there is an economy.

Now, when we look at the big picture of data science, or perhaps artificial intelligence, we see a lot of intersections where a lot of things happen. For instance, when we merge computer science with business or business problems, we get traditional software such as Excel and PowerPoint, because those are the tools that try to solve some of the business problems. When we mix computer science with maths and statistics, this is where we usually see machine learning. And when we mix maths and stats with business, or any domain, which could be Arts, Humanities and Social Sciences as well, we get data analysis, where most of the traditional research takes place. But when we combine all three together, we get what we call data science: computer science is the tool, maths and stats provide the theory, and the business or the domain provides the context of the problem you're trying to solve. And in this intersection, there is great support coming from Python, for instance.
When we talk about statistics and analytics, these are two of the common terms that are being used. When you talk about statistics with a computer science person, they might mostly be talking about descriptive statistics, or perhaps EDA, exploratory data analysis, if you want to call it that, where the idea is to describe what has happened in the past rather than predicting what's going to happen in the future. This is what is traditionally called descriptive statistics, which is also my background in a sense, because I'm coming from computer science. On the other side, a person coming from a computer science background might be interested in what they call predictive analytics, where they use the data to try to predict something in the future.

There are two other areas, in which arts, humanities and social sciences people are mostly interested. One, which comes from statistics, is inferential statistics, where they build a hypothesis and test it, which is a little different from what we do in predictive analytics, where we're trying to predict the future. Likewise, there is another discipline in which business and arts, humanities and social sciences people are interested, called prescriptive analytics. This is the domain of what should be done: simulation, rule-based approaches to solving a problem, or perhaps recommending a different path.

The interesting bit here is that when we discuss Python, or the application of computer science or data science, you can see all four of them playing a huge role. If a person is coming from computer science, they are mostly in the two areas pointed out with the arrows, whereas somebody coming from a statistical background would be more inclined towards the other two.
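To make the distinction above concrete, here is a minimal sketch using only the Python standard library. It is not taken from the talk: the two groups of scores are made-up numbers, the function names are hypothetical, and the permutation test is just one simple way to test a hypothesis. It contrasts describing data (descriptive), testing whether two groups differ (inferential), and fitting a line to extrapolate (predictive).

```python
import random
import statistics

# Made-up scores for two hypothetical groups, for illustration only.
group_a = [4, 5, 3, 4, 5, 4, 3, 5]
group_b = [2, 3, 3, 2, 4, 3, 2, 3]


def describe(values):
    """Descriptive statistics: summarise what has already happened."""
    return {"mean": statistics.mean(values), "median": statistics.median(values)}


def permutation_p_value(a, b, n_rounds=5000, seed=0):
    """Inferential statistics: test the hypothesis that the groups do not differ.

    Shuffles the pooled values and counts how often a random split gives a
    difference in means at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(statistics.mean(a) - statistics.mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_rounds):
        rng.shuffle(pooled)
        diff = abs(statistics.mean(pooled[:len(a)]) - statistics.mean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    return hits / n_rounds


def fit_line(xs, ys):
    """Predictive analytics: least-squares line that can be used to extrapolate."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x


print(describe(group_a))
print(permutation_p_value(group_a, group_b))
print(fit_line([1, 2, 3, 4], [2, 4, 6, 8]))
```

Prescriptive analytics is left out of the sketch, since it depends on domain rules or simulations that the talk does not specify.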
Then there are a lot of applications of AI now. As I mentioned earlier, computer science is not solving its own problems as much as it's trying to solve other problems. So you can see we're trying to make the traffic experience better through automobile applications, and trying to solve problems in business, education, finance, manufacturing, gaming, government, healthcare. You name it, the applications are limitless, but most of them rely on domain expertise that sits outside of computer science.

Now, when it comes to data science, there are two distinct kinds of roles. The first kind of role is, I would say, more suited to STEM people, but there are other kinds of roles that are more suited to AHSS. Take a data scientist. Of course, when we talk about data science, we automatically talk about Python, because Python is the dominant language here as well. A data scientist is a person who will do almost everything: collection of data, analysis, visualization, presentation and so on. If a person can do all that, we call that person a data scientist. However, we know that a complete all-rounder is very difficult to find. So you might come across people with stronger abilities in one of these areas. For example, they might be very clever when it comes to analysis, but only do a bit of visualization. The role is in demand, and lots of companies want to have a data scientist.

Then there are machine learning experts, who are mostly about creating new methods and new models, doing research breakthroughs. They're always trying to change something to improve the accuracy and so on. They might be coming from an academic background and might just be trying to publish their new research. So these two are mostly STEM roles.
Then we have the data engineer and the data architect. These two are about how we build pipelines on the data: how we manage the data, the SQL, the storage, the big data and all these kinds of things. There is a subtle difference between a data engineer and a data architect. Data engineers are the ones responsible for designing, developing and maintaining the entire data pipeline, testing the ecosystem that the business requires, and preparing the data for everybody else who is in the team, but mostly for data scientists. Data architects, in contrast, try to provide well-formatted data, produce a schema, and think about how to bring in structure so that everybody can access the information in a structured way. But again, these two roles are STEM as well.

Then there are database administrators, who are commonly known to businesses as well. These are the people who can pull information out of the databases, and they are, of course, STEM as well. Then there are specialized technological roles such as NLP experts or deep learning experts, whatever is required by the industry. Again, they mostly come from STEM areas.

Now there are other roles, such as data analyst and machine learning engineer. They are more suited to AHSS, although they might be coming from STEM too. There is a blurry boundary here. But think about analyzing something in the context of a business: if they don't understand the business, if they don't understand the importance of the human behind it, then performing such an analysis or a visualization and working out what is important becomes very difficult. So you have to become more of an expert in the domain. And you should know just the right tools: instead of showcasing that you know a lot of tools, you need to know a few tools, but really tell the story.
The machine learning engineer is a different role from the machine learning scientist. Scientists are chasing, say, a one percent accuracy boost, or coming up with a new modeling strategy, whereas engineers are the ones who plug in what is already available in machine learning and try to solve a problem. They understand the problem more than they develop machine learning, or advanced machine learning, in the theoretical space. Then there are data storytellers, those who can tell the story from the data and inspire people, businesses and society, wherever the application is. They need to connect with the real society, the real people who are going to benefit from it, and that's why this role belongs to AHSS. Likewise, there are people in business intelligence development, who are called BI developers. They understand the business and they try to tell the story or solve the problem for the business. They might not know data science in much detail, but they understand just enough to use the available tools to answer queries for businesses.

Now, when it comes to Python, we have great support for machine learning algorithms. Just to show you, this is what it looks like. This is only a small subset of the real support that exists within Python. These are all linear algorithms, which are all supported in Python. With scikit-learn, you can see there is a path from where you can start: if you have such a data set and you want to grow from there, what is the path that you should take to try to solve the problem? So there is a strategy, there is a cheat sheet available, and of course it's growing as well. This is just to highlight what already exists.
Then we have different types of neural networks. You can think of it this way: a rat's brain is different from an elephant's brain, from a monkey's brain, from a human's brain. Whatever is suited, you just need to borrow that and try to solve the problem. If a rat's brain is enough for the problem, you need a small neural network to solve it, whereas if the problem is different, you have to go for a different architecture. So there is a variety of choices out there.

Now, what has happened over the last couple of years is that we have lots of technology available, and there are some quotes that I would like to point out here. It's now mostly about the right selection of technology for businesses to benefit from. We don't need to say, yeah, there is another technology out there, so what? The question is: what is the right set of technology? Another point is that computer science has become more like a calculator for the other disciplines. But this calculator is quite different from the calculator that we knew a few decades ago. It requires a person who understands it and then prescribes what you need from it. Another point is that computer science is analogous to the spice in a cuisine, and that cuisine actually belongs to different disciplines, which could be outside of computer science. If you put in too much spice, it will spoil the dish: too much technology will not be good. And if you don't give it enough technology, it will be tasteless. This is where Python is quite good, because it has a lot of support and just requires people to tailor it and then prescribe it as a solution for other disciplines as well. So Python is not just for computer science but, of course, goes beyond that. Here I'll stop and hand over to Urjuman for the rest of the talk.
So my colleague covered what data science is and how the boundaries with social science are blurring now. But when it comes to actually teaching programming and Python to arts, humanities and social science students, there are quite a few challenges, and the topmost is that there is this fear within them, as the minion shows. So the real issue that we face, coming from a computer science background to an AHSS school or AHSS students, is how to make students overcome that fear.

Over my one year of exploring this area, or maybe two years, what I have learned is: don't make them think so much about the programming task at hand. Give them what they want in terms of their social science problem, curiosity-driven exploration, because the social science and humanities disciplines are about solving the problems of humanity, the problems of society. So tell them: okay, for this problem of society that you want to solve, you do surveys, you do interviews. Computer science has the right amount of tools, and not just the right amount but also a mix of different tools, that can help you get the data, do some analysis, do EDA over it, to make your job easy. That's when they stop thinking about the fear aspect and get curious about how the technology can give them more insights.

For example, this is a postdoc position advertised by an interdisciplinary team in Germany. You can see that the topic is the politics of inequality, which is a social science problem. But they want somebody who also has a background in Python, in some of the technology. So this is the kind of work that's happening in research departments. These are the kinds of research problems that computer scientists, or social scientists with a little bit of technology, can solve. So you motivate them: okay, let's take a problem of society. I'll explain as I go on how I did this with a business school and also in another project on immigration.
But first, another thing that makes their fear go away is to break the problem down into smaller problems, and for each sub-problem, give them a notebook or a small demo. That fascinates them and makes them less worried about the programming and the syntax, and that has worked a lot. Now, this is available on my GitHub.

During the last year, I taught a FinTech module at TU Dublin, Technological University Dublin. Being from a computer science background, it was pretty much my first experience of this, and I was thinking: how am I going to teach business school students? Their backgrounds were quite varied. There was a rugby player in my class. There were two students from Direct Provision. For those of you who don't know, Direct Provision is Ireland's asylum system, where refugees are housed. So there was a huge range, and one or two or three were already quite good at programming. Taking them all along in one class was quite a challenge. Some were at times nervous that the students who already knew programming were not letting them learn: they're going too fast, they're answering all your questions, that kind of thing.

So I had to come up with something that would be challenging for everybody, not just the students who were already familiar with programming but also those who wanted to study the business side of things, the finance side of things. Anybody who's in finance and business knows that Tesla is the hot thing. So I created a project about Tesla news articles. I scraped some news articles, and the students had to do sentiment analysis over them and then use those sentiment features as predictors of the company's financial performance. Another thing that helped was providing them with starter code, so I gave them skeleton code. At first they were struggling: how do we even start? How do we read those files? How do we read the CSVs and get the sentiments in?
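As an illustration of what such starter code might look like, here is a minimal sketch using only the Python standard library. It is not the code from the speaker's GitHub: the column names `date` and `headline`, the sample headlines, and the tiny word lists are all assumptions made for the example. A real course would more likely use a proper sentiment tool from a library such as NLTK. The sketch reads headlines from a CSV, scores each one, and averages the scores per day so they could later serve as features.

```python
import csv
import io
from collections import defaultdict

# Tiny illustrative word lists (an assumption, not a real sentiment lexicon).
POSITIVE = {"record", "surge", "beats", "strong", "growth"}
NEGATIVE = {"recall", "lawsuit", "drop", "miss", "crash"}


def sentiment_score(text):
    """Count positive words minus negative words in a piece of text."""
    words = [word.strip(".,!?").lower() for word in text.split()]
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def daily_sentiment(csv_file):
    """Read rows with 'date' and 'headline' columns; return mean score per date."""
    per_day = defaultdict(list)
    for row in csv.DictReader(csv_file):
        per_day[row["date"]].append(sentiment_score(row["headline"]))
    return {day: sum(scores) / len(scores) for day, scores in per_day.items()}


# Made-up sample data standing in for a scraped CSV file.
SAMPLE = """date,headline
2021-01-04,Tesla beats delivery estimates with record quarter
2021-01-04,Tesla faces lawsuit over recall
2021-01-05,Strong growth lifts Tesla shares
"""

features = daily_sentiment(io.StringIO(SAMPLE))
print(features)
```

With a real file, `io.StringIO(SAMPLE)` would be replaced by `open(path, newline="")`. The per-day averages are the kind of feature that could then be lined up against financial data.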
So that part is covered. If you go to my GitHub and you want to teach finance students this project, the whole set of instructions is up there. I'll also share it on Twitter later. This worked really well, and towards the end the students were very, very happy that they had solved a very nice programming project, and it showed when I saw their reports. There was also a part where they had to do their own feature engineering. Not everybody, but some of the groups put a lot of effort into that, and it helped bring out their creative side. So it was a very good experiment.

This next project is funded by the Irish Research Council, and these are the other main funders. It's called InclusiveIRE; IRE is short for Ireland. It involves a school of business in TU Dublin and a school of computer science in UCD. The whole idea is that we are doing surveys and focus groups, gathering quantitative data from that part, and applying those insights to social media: gathering data from social media to develop a tool for detecting sentiment around migrants in Ireland. With time, especially during COVID, it was noted that Ireland is getting less friendly towards migrants. This wasn't the case a few years ago, but it has increased. Or maybe it was the case and we never knew. But some aspects that social scientists always knew, and we as computer scientists didn't, came out of this study, especially some interesting insights in relation to the various nationalities that live in Ireland.

I'll present a graph. This box plot is showing something very, very interesting, and it came from a theory suggested by a postdoc in international business. Now, my background is not business, so I didn't know about this: there is something called cultural distance, such as Ireland's cultural distance with respect to Anglo-Saxons.
Now, Anglo-Saxons would be people from the US and the UK. Confucian Asians would be people from China, Hong Kong, Vietnam, those Asian countries. Eastern Europeans would be Poland, Ukraine, Kloakia, countries like that. Latin America is Brazil, Chile, that part of the world. And then there is the Middle Eastern region. So we made these clusters of countries based on the cultural distance value derived from international business.

Now, we see an interesting pattern here. If we had put all the migrants together, maybe this wouldn't have been so obvious. Some communities find it very hard to make friends who are native Irish. By native here, we mean those who are born Irish. What this is representing is the survey question: do you find it easy to make Irish friends? Strongly agree was five, meaning they find it very easy, and strongly disagree was one. Sorry, it is the other way round: strongly disagree was the higher number. So the Latin Americans and the Africans have the same mean, four, so they find it harder. But within the Latin Americans, you see that even the lower quartile has a lower number, so they are struggling more. There's wide variation among the Africans. Some Africans would find it easy, but overall, the Latin Americans struggle more. And that makes sense. If you read the news articles in Ireland, you will see that a lot of the delivery drivers who are facing a lot of racism are from Latin America. So it adds up with the data from social media, from the web. That's a very interesting outcome of this project.

And this one is an ongoing part. What we did here was gather tweets of migrants and natives. They were manually tagged for who is a native and who is a migrant. The data set will hopefully be made publicly available, because it's Twitter data and Twitter's terms of service allow it, but it's still a work in progress.
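For readers who want to see what a box plot per cluster actually computes, here is a minimal sketch using only the Python standard library. The responses below are made-up numbers on a 1 to 5 scale, not the project's survey data, and `box_summary` is a hypothetical helper name. Only the cluster names are taken from the talk.

```python
import statistics

# Made-up Likert responses (1-5), for illustration only; not the survey data.
responses = {
    "Anglo-Saxon": [1, 2, 2, 3, 3],
    "Latin American": [3, 4, 4, 5, 5],
}


def box_summary(values):
    """Return the five numbers a box plot draws: min, Q1, median, Q3, max."""
    ordered = sorted(values)
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return {"min": ordered[0], "q1": q1, "median": median, "q3": q3, "max": ordered[-1]}


summaries = {cluster: box_summary(vals) for cluster, vals in responses.items()}
for cluster, summary in summaries.items():
    print(cluster, summary)
```

Note that the line inside a box plot is the median rather than the mean, so comparing groups by box plot compares medians and quartiles. Splitting by cluster, as the speaker describes, is what lets differences between groups show up at all.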
And when the findings are published, hopefully the data will be released along with them. What this graph is showing is that I used word embeddings on the tweets of migrants and natives, then applied K-Means clustering to those words, and then t-SNE reduction was applied to visualize them. Now, look at two of these labeled clusters: the one with label zero, the purplish one, and the one with label two, the greenish one. You would see that in the green one, there are a lot of migrants, and that one, label two, is about NMH and abortion. There's this whole controversy around Ireland's National Maternity Hospital. It's being given to a church called St. Vincent's, and a lot of natives seem concerned about it, but the migrants, not so much. They didn't tweet a lot about it. So this is also showing that the topics migrants talk about on social media, compared with the topics natives talk about, can tell us the level of integration into society. That's what we're trying to do in this project.

As for the code, hopefully I will be able to share it soon, because the data will be made available. But my whole point is this: if you take problems from the humanities and social sciences and try to come up with data science methods, and Python has very, very effective tools like Gensim, word embeddings and NLTK, you'll get very, very interesting insights and ideas. That's what is motivating the arts, humanities and social science disciplines towards it. In fact, there's a whole new field called computational social science. So thank you.
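The clustering step described above can be sketched in a few lines. This is not the project's code: the two-dimensional "embeddings" are made-up toy vectors, the words are chosen for illustration, and `kmeans` is a hand-written stand-in using only the standard library. In practice the vectors would come from a word embedding model (for example via Gensim), and clustering and the t-SNE projection would typically be done with a library such as scikit-learn.

```python
import math

# Made-up 2-D vectors standing in for real word embeddings (illustration only).
words = ["hospital", "visa", "maternity", "permit"]
vectors = [(0.9, 0.1), (0.1, 0.9), (0.8, 0.2), (0.2, 0.8)]


def kmeans(points, k, n_iter=20):
    """Plain K-Means: assign each point to its nearest centroid, then update."""
    # Deterministic initialisation for the sketch: use the first k points.
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids


labels, centroids = kmeans(vectors, k=2)
for word, label in zip(words, labels):
    print(word, "-> cluster", label)
```

Once each word has a cluster label, one can compare how often migrants and natives use the words in each cluster, which is the kind of comparison the graph in the talk supports.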