Thank you very much, Nathan, for the introduction. My name is Jack. I am a PhD student at SOAS, and my work focuses on historical fiction in India [unclear]. Today I'm talking about the MULOSIGE project that I'm a part of at SOAS. I want to start off by briefly outlining some of the details of the project, but the main issue I want to discuss today is more of a conceptual one than an explicitly practical one. [inaudible passage] Multilingual Locals and Significant Geographies is a European Commission Horizon 2020 project, and its name is often abbreviated to MULOSIGE, with the acronym there. We haven't standardised the way to say that word, so say it however you like; we haven't come down on one side yet.
The project brings into question the status of world literature as a discipline, and the dominance within it of hegemonic languages such as French, English, Spanish and German, by highlighting the multilingualism of, and the many factors that contribute to, regional and transnational literary fields. I'm quoting here from the project description that we have on our website, and I've popped the URL just there as well so you can see it for yourself. In practice the project revolves around three sites, North India, North Africa and the Horn of Africa, that we're analysing both individually and under comparative focus. This emphasis on comparison is important because, to our knowledge, there has been no direct link between North India, Morocco and Ethiopia in modern times. So this is a project not about contact zones or connections but rather about patterns and comparison. Unlike certain other methodologies within comparative literature, which privilege comparison between a centre in the global north and a periphery in the global south, this is a project that really focuses on south-south comparison as its main methodology, between regions that we feel have previously been under-compared or not compared at all. The aim of the project is not to create a universalising understanding of literature across the globe, as the discipline of world literature often does, but to make a significant intervention in, or reshape, the field of world literature, and to propose methodologies, training models and case studies, built around multilingual locals and significant geographies, that are appropriate for the study of world literature. There are nine of us on the project, including a project coordinator, and it's led by Francesca Orsini at SOAS, who also heads the North India case study. This centrality of comparison to our work throws up a number of problems when it comes to defining and working with our research data.
This is the issue that I really want to home in on today and talk a bit more about. A number of researchers have identified this difficulty in delineating what exactly constitutes humanities research data, and I've popped up a few quotations from presentations made at an RDMF event in 2013 that illustrate some of these difficulties as those researchers saw them. The act of comparison that undergirds our project complicates this a bit further, I think. In the past year or so we have come across a number of problems in ascertaining what constitutes our research data and what we're meant to do with it, either as a team or individually. As you might gather from the description of the project, the data we're using are mainly literary works, primarily from our three geographical and linguistic case studies, but deciding which works of literature should be included in these comparisons is quite a challenge for a number of reasons. Primary amongst these is the inevitable asymmetry of knowledge that comes with a group project like this. We work on three distinct regions with at least six principal research languages, and this means we have a wonderful array of resources, as I hope this slide illustrates. We have translations of John Bunyan's Pilgrim's Progress into Urdu from the 1870s, we have very modern Amharic literature from Ethiopia, we have Hindi historical fiction from the 1940s and Arabic children's stories from the 1920s. These are just some of the things we've talked about in the last few months, to give you a picture of the languages and genres we're touching upon and the breadth of what we're doing. But it's these asymmetries in linguistic knowledge that limit how we can engage with the other parts of the project.
My own research focuses on Hindi and Urdu within North India, for example, and I've studied Arabic and French before, so I have some ability to engage with the North Africa case study. But when it comes to the Horn of Africa case study, with Oromo, Tigrinya and Amharic, I am woefully unprepared, or unable to engage with these things at this point in time. So there is a real issue: however hard I might try at the moment, linguistically I can't take these things on board. I've also popped up here the main research languages we work in: both the European languages, which have different histories in these regions from the colonial period onwards, and the local and national languages of the regions we're looking at. As you can see, it's quite a range of languages for a group of eight researchers to work with. The same limits also apply to contextual knowledge. While all members of the project have a general understanding of the histories and structures of cultures in the regions we're talking about, there are obviously huge differences in our ability to engage with the contextual knowledge that underpins them. A prime example would be the colonial history of India, a roughly 200-year process that ends in 1947, compared with that of Ethiopia, which was only occupied by a foreign power for five or six years in the 20th century but had other contacts and interactions with colonial powers, whether Italian, German or British, throughout the 19th and 20th centuries. These things don't correlate, and in the project we're trying to bring together these languages and also these time periods, with the different historical processes that are at play there.
We're looking not only at the pre-modern or pre-colonial period, which is something that Francesca has worked on at some length beforehand, but at the colonial period and the post-colonial, decolonial period as well. So it's not only a linguistic matrix but also a temporal and contextual matrix that we have to work within. I'm not saying this to belittle the comparative focus of the research or the research itself. I think we have to acknowledge the practical limits of our knowledge and try to work through them together, in collaboration. That's one of the main things that I really enjoy about working on this project. But on the very basic level of drawing up a suitable list of comparables that can constitute our research data, these limits have proved quite a big challenge to our project work, and this bind is further tightened by the theoretical basis that undergirds the project. We're looking at ideas of world literature or comparative literature that have been theorised by different people in different ways, using different languages and contexts. I'm thinking particularly of people like David Damrosch, who has worked on world literature as a concept for a long time but uses the Epic of Gilgamesh, for example, as a main text for doing that. We have no primary ability to engage with that data at all, so there is a disparity or asymmetry there that we have to deal with as well. To head off these challenges, we on the project have devised our own protocols, so to speak, to try to create productive and relevant comparisons between the case study regions. This has primarily been a case of spending time working together to elaborate comprehensive lists of what we're calling comparables, that is, literary works that can form the basis of the comparisons between our case studies. These works are our primary research data for the purposes of the project.
They are the things that we're going to be working from to develop our methodologies and our ideas about what we're talking about. We've made an effort to delineate them not only by language but also by other characteristics: genre, audience, time period, script, materials, provenance, that kind of thing. Working within this schema allows us to navigate the linguistic barriers that I've just talked about while at the same time retaining the particularities of each work and not subsuming these particularities into a generalised whole, as often happens when comparisons are made. More recently we've started an intense period of working within our smaller case study groups, having spent much of the last year working holistically as a whole project. We're using the time to debate areas of interest in our more particular geographic and linguistic regions, which we can then hopefully bring to the table when we meet as a project to discuss where we can compare these things. This flexibility of moving between the big and the small pictures helps us to delineate data that functions on the various micro levels of the case studies but also on the macro level of the themes and ideas of the project more broadly. Moving forward, we hope to use these small lists of comparables to divide our research data thematically and focus our comparative approach within parameters that don't reify language as a defining feature of the literature we're working with. This is something that Karima Laachir, who's the head of the North Africa case study on the project, and Francesca Orsini have made a central feature of one of the methodologies they're developing, called reading together: bringing literatures together to read them across and between languages, much more than within their linguistic silos, if you like.
We've also used other, more public means to try to make our research data more understandable, both to those on the project and to those in the wider academic and reading publics. We've made an effort in our event planning to make our content as accessible to one another as possible. This is in part an issue of balance: making sure that each of the regions receives equal billing in single events and across our events calendar more generally, so as to keep each of the case studies in our minds irrespective of our own focuses and the knowledge that each of us brings to the project. We had a workshop in the summer, organised with the CNRS and THALIM in Paris, and this is the very jazzy flyer for that. We brought together scholars working on each of our three key regions and further afield, and we placed them in panels that were defined not just by geography but by other thematic conceptualisations. This might sound quite basic and banal, but actually it's something that is not often done at events, small or large: trying to think about comparative thematics across different literatures and different concepts. What this produced for us was a more well-rounded experience in which the patterns and comparisons that we're looking for in the project were easier to identify. It also contrasted with an event we held in the first year of the project, a workshop at SOAS, where we were dealing with pre-modern literatures and trying to get people to talk with each other and with the project. It didn't really work, I think, because the way we'd set up the call for papers wasn't oriented towards that, and it's something that we're trying to address in our submissions going forward, to make sure these things are done at the events that we hold.
We've taken a very similar approach in building the website for our project, which we're using both as a means to publicise the project and as a constructive forum for debating issues pertinent to it. The website is structured around three key areas, readings, integrations and itineraries, which you can see on the board there, and which we hope to populate with content that pertains both to the project regions and to other geographic and linguistic locales. I suppose the website extends the basis for our comparisons and the breadth of our research data: by commissioning scholars from around the world to contribute content to the site, we're able to include sources from a number of languages and cultures that are outside our immediate remit. We're also using the website as a repository for educational resources, such as collaboratively compiled courses that can be downloaded and used by scholars and teachers around the world. These are accessible on the website, which is here. It's open access: you register on the site and then you can download and use them as you wish. Our first one went online last week, and it brings together Fatima Burney, who's a postdoc on the project, Francesca Orsini, and Nick Harrison at KCL, talking about comparative colonial pedagogies between North Africa and North India. It tries to bring together different contexts in which colonial teaching was done, looking at how that affected literature, and sets this out in a seven- or eight-week syllabus that can be used either in its entirety or just for individual readings every now and then. The work that we produce on the website also ticks that ever-important outreach and impact box, which is aided by our Facebook and our Twitter, which I've popped on here. So if you're on those platforms please do follow or like us, and you'll get to see me curating our Facebook content every day, which is a fun job for me.
So while these efforts have helped us, or are helping us, to delineate suitable comparables as our research data, MULOSIGE should probably be doing a bit more to both define and develop our research data. One element of research data management that we are currently lacking is a fully fledged strategy for ensuring that our data are ordered and, crucially, reproducible in the future. This is something that has been discussed a lot over the last two days, and it's really important for our project, partly because we want to use these case studies and this data as a way to influence methodologies and tools that people can use in the future to do similar things, even if they're not working in the language groupings that we're talking about. I think this goes back to what Getham was saying about linking together data and tools and making sure the whole package is there, so that they all work together holistically. For this we need something along the lines of a research bibliography, which is basically just a better method of collating and recording the data that we're using day to day, whether that's reading something once and putting it aside as not worth it for the project, or using it for the main analysis in my PhD, for example. I talked to Nathan a few weeks ago about this and brought it to the attention of the project, and I hope that it's going to be something we can build on going forward. But as I say, this is something that we haven't really thought about until now, for whatever reason. I joined the project last year, so this is something that we're playing catch-up on a little bit. We're two and a half years in now, and we need to really make sure this data is accessible and reproducible going forward, for obvious legal reasons as well.
There are challenges to this, of course, primarily because of the nature and typology of the sources that we're going to be using. One is the variance in the material structure of the sources and data that we're relying on. A number of us on the project rely on the modern format of the handy and durable book, but there are others who work on manuscripts or other less hardy materials that are less accessible from somewhere like the UK or the US. Ensuring that these are all properly indexed to standard criteria is a priority for us, particularly because a number of us are going on fieldwork this year or next year, so we need to really crack the whip and make sure we have a way of standardising these things. Connected to this is a question of dissemination. As work by Christine L. Borgman has shown, there's something of a conundrum in the sharing of research data beyond an individual or project group, not least because of the constraints posed by copyright law in the case of more recent publications. This varies on our project: some of the work is going to be 19th century or before, and some of it is going to be, as we saw, from 2016. There are disparities there, and there are different realms of copyright that we need to work with, in the UK, within Europe and in the other geographies that we're looking at. So we need to find a way to provide adequate bibliographic and reference information for resources that are subject to copyright constraints, while also making as much of an effort as we can to make these as accessible as possible to people who want to reuse them in the future. There's another, more technical issue there, of course, in terms of making sure this data is accessible in a physical sense, in terms of reading and transliterating these things. In the transliteration systems for even Arabic and Hindi, we use similar characters to mean very different things, so we have to
find a way in which we as a project, first of all, and then those who might want to use the data after us, are able to access this data irrespective of their ability to read it, whether in its primary form or in a transliterated form. This is also the case with Amharic: we have a very complicated system of transliteration which is not accessible to somebody who doesn't know Amharic, for example. So this focus on making sure that the metadata is there and accessible is something that I'm trying to really push with the project more broadly going forward.
So, to conclude: considerations about research data for MULOSIGE are, as I hope to have shown today, couched in the project's search for patterns and comparisons between our three geographical and linguistic case studies. It's this comparative focus that makes the process of defining and working with research data a significant challenge. Asymmetries of knowledge shape how we as individuals and as a team are able to approach sources, and necessitate extra efforts to ensure data are comprehensible to all on the project. Constructing lists of comparables and making use of public events and digital spaces allow us to mitigate some of these effects and, I think, extend the epistemic frameworks within which the project can operate. Ultimately, moving forward, I hope we can build something that resembles a research bibliography and make sure that it's useful for people to come who want to emulate the methods that we're using, at least in the future. Thank you for your attention. I look forward to your comments.
In the call for papers, if you like, for this event, people were asked not to talk about the methodology of their research but to focus on data management. But I found myself feeling, as a listener to your talk, that knowing more about methodology would help me somehow, so I'll be more explicit about that. Clearly the novel that someone is reading is in some sense research data, so there is a question of
making that available, but, I mean, it might be available in all the bookshops in London. But then, conceptually at least, are there discrete moments where you create data, intermediate between the novel as it exists in a bookshop and a published article or book in literary theory, for instance?
I mean, of course there are, but what I'm trying to say with this is that it's not the same thing as doing philological or archaeological work, where a bit of it is tangible and you can codify and look at it in a certain way. I appreciate why you don't necessarily want to have methodological concerns here, but what I've tried to do with this presentation is say that, at the moment, the conceptual underpinnings of this project mean that determining how we ... I mean, we've talked a lot about data as existing and data as generated over the past few days, right? So there's the question of going out and finding it, of course, and there is a difference there. We're at a point in the project where we know what we think we want to achieve with this data, but it's not a case of saying, oh, of course all these novels in Amharic, Hindi and Arabic are going to do the same thing. Hence we have to take a step back from that and actually assess how we come up with these primary sources or primary data in the first place. Because, as you say, I could go around any bookshop in London and find a number of books, but there's also an issue of access there, and just because a book is here doesn't mean that it actually represents anything about the culture that we're talking about. So we're not at a stage to say that yet, and I don't actually know. This is why I included those quotes from other researchers, because I think there is a difference: even within this humanities and social sciences umbrella there are very different methods that we use, and I think a lot of the presentations over the past few days have focused on, as
I say, the philological and archaeological things, which are not more tangible per se but have a particular methodology that's bound to them. I think that there is some of that in literary criticism, or literary studies more broadly, but I think there is a difference in how we can engage with that data and then generate our own data from it. That's what I've tried to bring to the fore here. I don't know what form that generated data would take for us, frankly. When I work on a text, there are certain characteristics of it that I'll look for, certain facets or functions of the text that I'll use to draw an argument together. Those aren't always inherent to the text; they are normally influenced by secondary material. So finding a way to do that is really important for us, because we have a hell of a lot of material in scope here.
So, to try and paraphrase: if you're doing a text-critical edition or you're doing archaeology, there are certain moments where, quite naturally, there's data that your method creates that's easy to file somewhere. In your case it's not as obvious that there are specific moments in the methodology that naturally produce data sets exactly. But one thing that you have observed is that what makes doing literary criticism different from reading is the engagement with the secondary literature.
Absolutely.
And that's what you're thinking about how to model or keep track of more.
Absolutely. And that's not to say that doesn't happen in the other fields we've talked about, but I think there's a particular type of interaction between secondary and primary in literary studies as opposed to those fields. These also do interact, but, as you say, I think there is a particular kind of relation that you have to take into account when you're talking about managing or generating your own data.
Just a quick thought. The fields that we've often been hearing from find themselves in the position of
having to create the digital object that will act as the point that links together the image, the meaning of the inscription, etc. But it occurs to me that with modern literature, of course, because they're generated in the modern world, the works come with that data already attached: ISBNs and other things, review platforms, because review is part of the [unclear]. To what extent do you feed into those? To what extent do you tag your data using ISBNs and things like that to link it to the object, because the object has a kind of metadata, a digital identifier, already?
I think in terms of reproducing the data it's exceptionally useful to make use of those, absolutely. There's a reason why ISBNs are useful for us in finding books: they can help you find something you wouldn't find otherwise. But I don't think we can rely solely on those, because they're geared towards a particular way of consuming that material, a commercial or a personal one. We have to add to that metadata that perhaps deals with thematics or questions of script or language, which can supplement these more practical access data, I suppose.
I know that in some of the African literature, literature is very multimodal, because it's also performed, and that's an aspect that you haven't touched on at all. So presumably you would also have audiovisual data that needs to [unclear] your image and scan data. Could you talk a little bit about that?
I think the emphasis of the project is primarily on written literature, but to your point, these don't exist in a vacuum, obviously, so we do have to take into consideration that orature is an exceptionally important thing, and performance, as you said, is an incredibly important thing. In terms of the data that we're going to be collecting or generating ourselves, those things won't necessarily be part of that. In fact they will, I think: book history is one thing, and the presentation of these books is very important. So I think you're very right to bring that up,
but in terms of trying to navigate this orature/literature binary, which is not always helpful, I don't quite know how we would do that, particularly with the more modern material where there are performances available. I think this is something that I can take back to the team, because if we do have a Hindi play that we can find being performed, then all the better, do you know what I mean? And there are other layers that we can add to that. So thank you for that.
I think that's where we have to end, so thank you.