While her current research and teaching interests include the history of the golden age of Egyptology at the end of the 19th and beginning of the 20th centuries, she will be talking today about the Emma Andrews project that she's been engaged with. So please welcome Dr. Sarah Kennedy.
Thank you, Tom. Thank you very much for the introduction and for inviting me to speak today. I'm looking forward to telling you more about the DH research and project work that I've been doing at the University of Washington for the past nine or ten years now. I trained as an Egyptologist in the UK, and my research interests currently focus on Nile travel and excavation in the late 19th and early 20th centuries, the so-called golden age of Egyptian archaeology. I work mostly with primary source diaries, letters, and other printed and handwritten material from this period, and I'll be showing you some of this material today. I'll start by giving you a bit of historical background about Emma Andrews, of our diary project fame, because I'd say there's a fair chance you'd never heard of her before today. Her travel journal record is significant for a number of reasons, not least because it provides detailed cultural and social histories of contemporary Egypt. For Egyptologists, it's an important record of her partner Theodore Davis's excavations in the Valley of the Kings, of art and antiquities purchases, and of other contemporary archaeology in Egypt. We've added new material related to Emma's diary to our digital archive since our work began, and I'll be talking a bit about the range of material that we work with. Our ongoing transcription and encoding work led us to establish a publishing house called Newbook Digital Texts and to offer undergraduate student internships in digital humanities. I'll give you an overview of this program and our digital research output, which includes maps, data visualizations, and digital tools for markup in XML-TEI.
And then finally, I'll finish up by showing you how I used the internship program to develop an introductory course in digital humanities at the University of Washington, which I've taught five times now over the past four years or so. Emma Buttles Andrews was something of a unique woman for the period that she lived in. She was independently wealthy, she was very opinionated, and she was something of a force of nature, which really comes through in reading her diaries. She was the youngest daughter of one of the wealthiest men in Columbus, Ohio, Joel Buttles, who died when she was 13, leaving her to be raised by her mother and other relatives. She married a lawyer called Abner Andrews in 1859, and he was the son of the president of the State Bank of Ohio. At some point in their marriage, Abner became an invalid, and Emma cared for him for a decade or more. Her father-in-law made a very generous provision for her in his will because of this, and so she was certainly very financially secure from an early age. The turning point of her life came when she met Theodore Davis in 1860, when he stopped to visit the Buttles family with his new bride, Annie, who was a cousin of Emma's. Theodore went on to stop in Columbus every year after that until Emma finally moved in with him when she was in her fifties. Davis was a lawyer to the so-called robber barons of New York, and he made a large fortune in a series of very shady banking deals before retiring in his late forties to a mansion that he had built in Newport, Rhode Island, called The Reef. Emma and Annie also moved in with him at the same time, and their living arrangements, as you can imagine, were somewhat scandalous to Victorian society. It seems that Annie's meek personality was very much overshadowed by her husband's and by Emma's. Following a bout of pneumonia, doctors recommended that Davis spend winters in a warmer climate, and he was only too happy to oblige.
And so the annual trips along the Nile began, and they lasted for two decades from 1889 onwards, and he and Emma threw their energies into collecting art and antiquities. Emma records some of Davis's first purchases of these antiquities, such as these mummy masks, which he bought in January 1890 from a Luxor antiquities dealer called Mohamed Mohassib. Both of the masks are now in the Metropolitan Museum of Art. After a few years of tourism, the couple's interests turned to archaeology, and they both provided generous financial support to a number of Egyptologists working around Luxor at the time. Theodore was finally granted the coveted concession to excavate in the Valley of the Kings, and his first solo excavation happened in 1902 with Howard Carter as his frontman. You've probably heard of Carter as the discoverer of the tomb of Tutankhamun. His first discovery was this tomb of the nobleman Userhat, and this photograph was found between the pages of the diary of Emma's niece, Nettie Buttles. The diary is currently at the Oriental Institute in Chicago. You can see Nettie sitting down with Theodore standing behind her. Shortly after this photo was taken, unfortunately, this wooden mask disintegrated. Prior to his death in 1915, Davis discovered 18 of the 42 royal tombs now known in the Valley of the Kings, including the tomb of Pharaoh Amenhotep III's in-laws, Yuya and Tuyu, which until the discovery of Tutankhamun was the best preserved tomb found in Egypt, as well as the somewhat contentious tomb labelled KV55, which is thought to be the final resting place of the so-called heretic Pharaoh Akhenaten. Emma's diary is important because it provides an eyewitness account of many of these excavations, with details that are often lacking in published tomb reports, and her observations about the conditions of excavation of the day are particularly interesting and somewhat graphic.
This slide describes Howard Carter's unfortunate state during the excavations of Queen Hatshepsut's cliff tomb. Over the course of their long period in Egypt, Emma and Theodore's boat, the dahabiya Bedouin, became the central hub for discussion and planning each season for most of the significant Egyptologists and archaeologists working in Egypt. A few of them are pictured here. Most of these encounters are documented by Emma in her diary. Davis's finds include thousands of artefacts now displayed at the Grand Egyptian Museum in Cairo, at the Boston Museum of Fine Arts, the Harvard Semitic Museum, and especially the Metropolitan Museum of Art in New York. This is a copy of two pages of Davis's will, listing the items he intended to bequeath to the Met, and I've highlighted what Davis considered to be the gem of his collection, here named the Canopic Jar of Queen T. He kept this piece prominently displayed on his desk at The Reef. The process of digging into the contents of the diary and researching all the individuals, events, and places that Emma mentions inevitably prompted additional visits to archives in the US and the UK to photograph related historical material. And so our digitised archive, which I essentially keep on a hard drive and on my computer, has grown exponentially. The collection has evolved to comprise the unpublished writings of some of the hidden figures of Egyptology, who are often, but not always, women. They're the wives, the secretaries, the administrators, or in Emma's case the mistress, of archaeologists and Egyptologists, and their writings provide unique perspectives on digs in Egypt at the time. An example of this is the unpublished handwritten archive of Helen Winlock, who was the wife of the director of the Metropolitan Museum's Egyptian expedition. Her letters home to her family in Massachusetts describe day-to-day life in Egypt in considerable detail, as well as her husband's digs on the West Bank in Luxor.
She was also an eyewitness to the process of constructing and furnishing the Metropolitan Museum's dig house in Luxor, which is still used by archaeological teams today and in fact still has some of the furniture that she chose at the turn of the century. We also have the diaries and letters of the artist Joe Lindon Smith, who worked for Theodore Davis in the 1904 to 1906 period. He was the artist who recorded many of Davis's finds, and he was present during the opening of KV55 in particular. His work is a delight to dig through because, first, his handwriting is very legible and, second, he peppers it with these wonderful doodles, which really bring it to life. I recently photographed the diaries of an Oregon draftsman named Lindsley Foote Hall, who again was working in the Valley of the Kings at the turn of the century. He was a member of the team when Tutankhamun's tomb was opened, and yet his contribution has been largely forgotten. And then finally, I've been gathering up some other unpublished archival material in the form of archaeologists' notes written by some of the Egyptologists who excavated for Theodore Davis. Here we have Howard Carter again, and you can see that this material might present some unique encoding challenges, not least the hieroglyphs. So I've given you a sense of the scope and nature of the type of material that we're working with. Now I'll talk a bit about the process we follow to make these sources machine-readable in such a way that we can programmatically extract the data that I'm interested in researching further. Emma's diaries consist of 19 volumes of text. They would originally have been handwritten, but the original sources are unfortunately now lost, and I have thoughts about why this is the case. What remains are three out of four typewritten copies, which were made in 1916 at the behest of the then director of the Metropolitan Museum of Art.
There are two bound copies still at the Met and one at the American Philosophical Society in Philadelphia, and they're otherwise inaccessible unless you physically visit either institution. So our work will be the first time that the diary is made more widely available. As you can see, this is the APS version, and interestingly it was actually owned by Herbert Winlock; that's his signature up on it. So there are lots of connections between the various people that we've been researching. Here is one of our transcribed primary source diaries. We begin by transcribing into a plain text document, and until now we've been hand-coding the documents in TEI XML. TEI, as you probably all know, is the Text Encoding Initiative. The focus is on capturing basic structural elements in the diaries as well as particular aspects of the content which we want to investigate further. In this case we've marked up people's names, place names, archaeological sites, boats, hotels, etc. I wrote an XSLT script which generates an alphabetized list of all the entities we've marked up in the diaries, and these results form the basis of our ongoing biographical and historical research. The list of entities is a really good starting point, since it enables us to use targeted keyword searches in other resources, which gives us relevant information and also a range of spelling variants to work from. As we built up our research material, we had to decide on a content management system for the project. Some of my considerations were choosing an open source option with an active community of users and developers, good support documentation, and active discussion forums as well. We also wanted an out-of-the-box solution so that we didn't have to do too much development, while still offering the possibility of customizing and tweaking as we went along and as our needs became more apparent.
I also wanted a simple yet functional back-end interface which would allow me to add multiple users and assign them differing levels of access based on their role in the project. And so we decided to use Omeka, which checked all of these boxes. It also integrates Dublin Core metadata for our uploaded items and offers a range of plugins to facilitate data display and analysis. This is a snapshot of some of our named entities in Omeka. To date we've created a database of around 800 individuals mentioned in our primary source material. We generally use the CSV upload plugin for bulk upload of metadata entries, and these include brief biographies for as many of the individuals as we can find, which are then displayed on the project website. We've called our database the Emmapedia. The goal is to use our machine-readable text to generate output which can be read online or on an e-reader, or even printed as a physical book. We've experimented with the development of various viewer displays and ended up with five different views, which are based on different platforms like the Internet Archive reader. We've used Twitter's Bootstrap framework and also TEI Boilerplate as well. This is the output that we're calling the textual view, which we're going to experiment a bit more with this year. As you can see, people's names that we tagged in the encoding process are dynamically linked with our biographical and historical research database. The aim is to give readers contextual information as they look through the documents on our website, with the option to read a more detailed biography by clicking through to the Emmapedia. I'll show you another version of this dynamic linking shortly in the form of a D3 visualization. I've been using the pronoun "we" when I talk about some of the work that I've completed.
So I'd like to go behind the scenes and talk a bit about our undergraduate digital humanities internship program, which has grown out of my initial, very slow, one-woman attempts to transcribe and encode the diary back in 2011. As luck would have it, I was fortunate enough to meet a faculty member in NELC at the University of Washington, Professor Walter Andrews, who was also working to transcribe and encode a series of Iraqi travel journals from the 19th century. He had a couple of dedicated undergrads working with him on his project, and he suggested that I advertise on the Undergraduate Research Opportunity Board, which I did. I also ended up with a couple of interns, and from there our group began to grow exponentially. This led us to establish Newbook Digital Texts at the end of 2011, which is registered with the Library of Congress as a publishing house. The starting point of our work is the preservation of lesser known or understudied texts from the Near East, and we choose material that might otherwise be lost or remain inaccessible. I've mentioned that you can only see Emma's diaries if you physically go to one of the two locations that have the typewritten copies. Walter's diaries were kept in a garage in Baghdad, so he ended up having to ship them over here in order to transcribe and preserve them. Our method for preservation is digitization using non-proprietary, open source technologies. This process of digital publication has generated a wealth of machine-readable texts, and as academics it's provided us with the opportunity to dig into our historical research using computational methodologies. We've been able to involve our undergraduate interns in all aspects of this research work, and we feel that this is one of the strengths and unique features of the work that we do.
I collaborate with between five and ten student interns each quarter, who work on various aspects of the project at any given time, and this might include transcription, encoding, historical research, project management, marketing, or web development. Students receive independent research credits for participation in our project work. Over eight years we've worked with over 170 University of Washington undergrads and several graduate assistants. Students come from departments across campus. We don't have any prerequisites for joining our group, and we find that the interns we take on have in common a shared love of history and of literature and a curiosity about how computer technology fits into all of this. We currently have four active projects working under the Newbook umbrella, sharing the common publication and education goals that I've talked about. Besides my research, I've mentioned the Svoboda Diaries project with the Iraqi travel journals, and these are significant because of the quality of historical detail they record about topics as diverse as weather patterns, medicine, trade, and river travel along the Euphrates. We have the Georgian Digital Text Collective, which seeks to present Georgian literature and culture to a worldwide audience in dual and trilingual formats. And then the Baki project is our largest project to date. It started about three years ago, and we have over 30 collaborators in the US, the UK, and Turkey, all working to identify, catalogue, transcribe, and publish the collected works of the Ottoman poet Baki, who is arguably the most famous Ottoman poet of all time. My role in that project has been as DH project manager. As our program has grown, it's moved beyond the initial transcription and encoding, and our research is driven equally by my interests, by students' curiosity, and by the skills that they either bring to the internship or want to develop while working with us. Here are a few examples of this.
One of my students wrote a C# script to scrape the Metropolitan Museum's website for items bequeathed by Theodore Davis in the will that I showed you earlier. We generated a CSV file with around 2,000 individual items, and we were able to pull 600 images into my Google Drive account. I should mention that the Met's images are freely available online, so there was no pirating going on. I hadn't really had a good sense of the sheer scope and range of the items that Davis and Emma collected until I looked through these images, and it begged the question: what to do with all this information? We've decided to use the Met data to see if we can identify the provenance and history of some of this material based on entries in Emma's diary and the other primary source material that we've collected. We've completed work on all the will entries now, and we're beginning to move that onto our project website. We've also been developing digital maps to visualize the base texts. This example is built using Neatline, which as you know is an Omeka plugin, and I developed this during a month-long DH fellowship at the American Philosophical Society in Philadelphia a few years ago. It shows excavations in the Valley of the Kings between 1902 and 1913 on a contemporary 1909 base map. Clicking on each map point leads to a repository of information about each tomb, including a database of archaeological material, contemporary correspondence, diary entries, and the archaeologist's notes, and all of that is in a linked Omeka exhibit. We've also built a couple of story maps that focus on teasing out some of Emma's historical narrative in a visual and interactive way. You saw a screenshot of one of those a couple of slides ago, the Donna Laura Minghetti Leonardo, which is another interesting story.
We've found that work of this nature, that is, non-traditional publication and visualization, helps bring history alive for our audience, and our website visitors are able to engage actively with our research material. One thing I haven't mentioned so far about Emma Andrews is that we haven't managed to find a photograph of her yet. I'm confident that we will at some point. Students are invariably intrigued by how she might have looked, and so they came up with the idea of using a TEI trait tag to try to capture physical descriptions of her in contemporary textual material and in family photographs that I've collected. The goal was to create a 2D and perhaps a 3D visualization of how she may have looked, and this is an artist's rendering in 2D based on the students' work, which they presented at the undergraduate research symposium at the UW a couple of years ago. I'm still hoping that some students will be inspired to work on a 3D version of the same project. Over the years I've found that I learn as much from my students as I hope they learn from me. I've also learned about the process of digital project management as we've built our project from the ground up, and usually the best learning comes from making mistakes and having to go back and fix things. Some of the main challenges that we face include project sustainability, and the solution to this, aside from the thorny topic of consistent funding, boils down to comprehensive project documentation, including processes, work logs, and instruction manuals. The biggest issue we face is that students inevitably, and somewhat inconveniently, graduate. Unless they fully document what they've done and what they've built, when things break and we have to go back and try to fix them, we essentially have to recreate everything again.
In terms of project management, we use Basecamp to track our transcription work. I add the original images and instructions, and then students upload their plain text documents to the platform when they've finished. We use the free educational version, and it's been great for creating quarterly schedules, tracking progress, and communicating with the team at large. We use GitHub to track our encoding work. We have the master copy of the transcription, which lives in our organization's repository, and interns fork that to begin their encoding work. Once they've finished encoding, they submit a pull request, and I check all their work before merging it back into the master copy. It means that we always have one clean copy of our text and that nobody but me touches the masters. Again, we learnt this the hard way: we started off by using Google Docs, and of course we found that people overwrote other people's work, deleted things, etc. So GitHub has worked very well for version control. Another constant issue that we've been addressing is quality control, or how to ensure consistent transcription and encoding standards. It's very easy to miss off an angle bracket, for example, as students go through encoding, and we have varying groups of students each year with different skill levels. One of the biggest transcription challenges with the handwritten material is that most of my students were not taught how to write cursive, let alone how to read it, so they really do struggle. It's very slow work transcribing these handwritten documents, and late 19th century vocabulary is very different as well, so that's been another challenge.
Fortunately, I was invited to teach a seminar class in computational linguistics about 18 months ago, which gave me the opportunity to bring my questions about managing and automating the workflow and markup processes for a text-based DH project to a very talented group of graduate students. You can see some of the questions that I brought to the table listed out here. I essentially took my historical data sets to them, along with our XML-TEI schema, and I invited the students to put what they'd learned in their classes into practice. This is one example from earlier, one of Lindsley Foote Hall's diary pages, which has been very challenging for students to read and very slow to transcribe. Students in the seminar experimented with this challenging handwriting. They explored binarization of images using computer vision from Microsoft Azure. There were a number of challenges in generating decent output that wasn't very error-ridden, but one thing we did learn along the way was that the iPhone images I was taking as I visited archives were as good as, if not better than, some of the images that historical societies sent to me, so that was a good thing to learn. Our work with the Hall diary ultimately proved somewhat frustrating, probably because one of his handwritten lines of text tends to run into the line below it, and the algorithm struggled to make sense of it all. It was one of the frustrations of the quarter system: 10 weeks is never enough time to really dig into a problem, and you can see one of the students' comments about it on this slide. Another group of students built this OCR and cleanup interface. You can see that potential errors in the OCR text are highlighted in red here on the left for the user to go in, manually check, and edit as necessary.
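The talk does not describe how the students' interface decides what to highlight. Purely as an illustration of the general idea of flagging potential OCR errors for manual review, a minimal Python sketch might look like the following. The word list, the function name `flag_suspect_tokens`, and both heuristics (unknown words, and letters mixed with digits) are assumptions made for this example, not the students' method.

```python
import re

# Hypothetical, tiny vocabulary for illustration only; a real tool would
# load a much larger word list plus project-specific names and places.
KNOWN_WORDS = {"the", "tomb", "was", "opened", "in", "valley", "of", "kings"}


def flag_suspect_tokens(text, vocabulary=KNOWN_WORDS):
    """Return a list of (token, is_suspect) pairs for a line of OCR text."""
    flagged = []
    for token in text.split():
        # Ignore surrounding punctuation and case when checking the token.
        core = token.strip(".,;:!?\"'()").lower()
        # Heuristic 1: letters mixed with digits (e.g. "t0mb") are suspicious.
        mixed = bool(re.search(r"[a-z]", core)) and bool(re.search(r"\d", core))
        # Heuristic 2: a purely alphabetic word missing from the vocabulary.
        unknown = core.isalpha() and core not in vocabulary
        flagged.append((token, mixed or unknown))
    return flagged


if __name__ == "__main__":
    for token, suspect in flag_suspect_tokens("The t0mb was opcned in the Valley."):
        print(("[CHECK] " if suspect else "        ") + token)
```

A review interface could render the flagged tokens in red and leave the final decision to the human editor, which matters for historical material where unusual spellings are often correct.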
And then another group built an end-to-end system designed to recognize named entities in the text, including the person, place name, and organization entities which, as I mentioned, I'm very interested in researching further. Spelling variants are often a challenge in historical documents, and in this case the same place name, Aswan, has been spelled in multiple ways in the same document. In other cases the 19th century name is no longer used and bears no relation to its modern variant. This student team used Wikidata to pull in place name spelling variants, and they also added a couple of fields for users to add variants as necessary. And then finally, this group worked on a system for automating spelling correction of OCR output, using both statistical and rule-based models for their corrections. The student work and the tools they built are all available on GitHub. To reinforce the importance of complete documentation, the students were graded on developing and including both technical documentation for potential future development and user-friendly documentation for people like me who weren't quite as tech-savvy as they were. One of the students in the class wanted to continue her work on historical documents and asked me to act as one of the supervisors for her master's thesis in computational linguistics, which I did last year. Over the course of the year she created this historical markup tool, which takes our plain text documents and outputs valid TEI, with a TEI header and the body text with named entities recognized and automatically marked up. The tool is live on our website now, and we plan to use it extensively this year for our encoding work. Again, we're hoping it will ease some of the difficulties of having many hands working on encoding tasks. Pulling all of this together, the goal of this work is to develop a "who was where when" database and visualization for this period in early Egyptology, which I hope other Egyptologists will contribute to.
We've begun a proof-of-concept D3 visualization which pulls in Dublin Core biographical entries from our Omeka platform along with the encoded material from GitHub. This was actually built by a group of four capstone students in computer science doing a data visualization workshop, and they built it in just four weeks, which I find remarkable. So one of my research goals for this coming year is to continue to work with groups of capstone students from both informatics and human centered design and engineering to further develop this prototype's functionality and the range of data that it draws into the visualization. I'll finish up by backtracking a little bit to speak about teaching digital humanities. A natural progression from the Newbook internship work was the development of an introductory course in DH for undergraduate and graduate students. We were funded for me to develop and teach three iterations of the class in the 2015 to 2016 school year, and I've taught it a total of five times now. The data sets I chose for my initial classes were Egyptological in nature, but as my confidence grew I expanded the data sets to offer students a range of material to choose from. They've worked with New York Public Library menu data and also a range of historical newspapers. I'll talk briefly about the fall 2018 class and then about some of the lessons that I have learned from my early teaching efforts. The goals and outcomes of the class were threefold and have their origins in what I've learned and observed in the Newbook student internship program. I wanted to foster core computing competencies for humanities and social science students, as well as promote a team working environment that's based on interdisciplinarity and collaboration. And finally, I wanted to highlight the value of using digital tools to investigate humanities data, with consideration for best practices for building and curating a DH project.
The initial classes were based in my home department of Near Eastern Languages and Civilizations, and it proved challenging to promote and fill the class, since students generally don't go looking for a DH course in a NELC department. I had to make a lot of posters, go into various lectures, pitch the class, and really sell it. But for the latest fall class last year, I was able to have the course listed through the informatics department, which is a very popular major and more of a natural home for a DH class, and it filled within a day of being on MyPlan. We work on the quarter system at the University of Washington, as I've mentioned, and the DH class met twice a week for three and three quarter hours of in-person time. In order to maximize the amount of time available for hands-on work with both myself and my TA present in class, I decided to flip the classroom, with the expectation that students carried out preparation before the class period by watching introductory videos that I recorded and completing various readings. We had an online discussion board as well, and then they came to class ready to put some of the theories that they'd learned about into practice. The student groups were very diverse in terms of backgrounds, skills, and experience, so I had them complete a pre-class survey so that I could place them in well-balanced teams, which would enable them to interact with and learn from each other as well as build invaluable social skills.
The traditional university classroom, which often presupposes a lecture-based, somewhat passive learning experience, wouldn't have been suitable for the type of hands-on group work characteristic of digital humanities. Fortunately, the University of Washington has a number of active learning classrooms with practical features like easily movable furniture to create seating hubs for the teams, a larger monitor screen at each table, writable wall surfaces for planning and brainstorming activities, and, essentially, multiple appropriately situated electrical outlets for student laptops. This setup enabled me to get more one-to-one interaction with students, to give frequent and immediate feedback during team activities, and also to troubleshoot effectively in class. I found that I was able to give students with different learning styles a more personalized experience. Moving between teams each class enabled me to really get to know the dynamics of each group, as well as students' interests, concerns, etc. As I have mentioned, I consider myself something of a subject matter expert first and a digital humanist second. It's often the case that humanities faculty are resistant to the notion of introducing a practical or technical element into their classroom, not least because teaching tech to a group of students, versus more traditional humanities topics, can present challenges to the subject matter expert. Recognizing this, I saw the intro class as an opportunity to collaborate with my colleagues both in the library and in other departments to offer students a comprehensive and well-rounded syllabus. So, for instance, we had a lecturer in library and information science come in to talk to the group about copyright and about sourcing open source material, we had the metadata librarian come in to talk about Dublin Core metadata and the importance of establishing consistent and complete standards and records as a matter of project sustainability,
and then finally we had a faculty member from human centered design and engineering come in to talk about effective project planning and design. I also had the opportunity to collaborate with a commercial vendor to leverage their content and digital tools in a classroom setting. This is the syllabus, which was quite broad in scope. Over the course of the quarter, students learned best practices for planning their projects and for creating and curating a data set. They used digital tools to analyze the material that they'd collected before finally building a digital exhibit to display the results of their work. We again used Omeka for display purposes and a range of open source tools for the process of collaborative data set creation, curation, and analysis. This rubric summarizes what I was looking for in each project. Students needed to demonstrate that they had planned their projects effectively, including developing a data management plan for the hypothetical long-term preservation of their work, as well as project documentation, which included a project charter and a project one-pager that set out how the team agreed to work with each other as well as how they intended to present their work to the outside world. Their main assignment each week was to keep a detailed work log of everything that they'd done, documenting both their successes and their failures, because of course the failures are the greatest learning opportunity, and they always got full credit for things that didn't work, provided they had documented the process completely. And then finally, they were expected to upload digital objects, including images, visualizations, maps, etc., which were to be described with standardized and consistent Dublin Core metadata. I'll finish by showing you this slide, which is a demographic breakdown of the students in last fall's intro class. It was my largest class to date; it had 35 students on the roster. As you can see, the demographics were very varied, with representatives from
humanities, social sciences, and informatics backgrounds. Students had varying levels of technical competency, and 90% of the group had no experience of DH work. They told me they were drawn to the class because they wanted to apply the tech skills that they'd already developed to a humanities data set, or simply because they were curious about what DH actually was. And this curiosity and spirit of engaged learning is something that I have observed over the last decade in my interaction with all students who have worked with me, and of course it's one of the goals of a university education, so it's been wonderful to play some small part in that. Thank you very much. So we've got plenty of time for questions and discussions. This is good that everyone is... I have a question. In your opening survey to the students, did you do any kind of examination of their learning styles? Yes, I did. I asked them essentially how they liked to learn: whether it was having somebody delivering the information, whether they preferred visual things like lectures, if they liked to read, etc. Often it was a combination, and I did try to provide that in the class. I found that very few students in the class were actually passive learners in terms of sitting there taking notes. And I know yesterday we touched on perhaps some students feeling anxious about making verbal contributions in class, and I found that breaking the class into smaller teams of, say, five students broke down those barriers and enabled a more dynamic interaction between the smaller groups, and then we came together at the end for team discussions, sort of inter-team discussions. So yes, I asked about their learning styles, their experience, their interests, that sort of thing, and I tried to map those to the various roles that I assigned within the group projects. So we had a project manager, a content specialist, a visualization specialist, a mapping specialist, and then a metadata specialist, and each of those was responsible
for that particular aspect of the project. Yes? You had mentioned that not all of the versions of the diaries were available, and that you had some thoughts as to why not. If you don't mind, could you share some of these thoughts? Yeah, I think it's probably the same reason that we've not been able to find any photographs of Emma. I think after Theodore and Emma's deaths in 1915 and 1922 respectively, things became very contentious in the family. Theodore's will was contested; it actually took 10 years and ended up at the Supreme Court. And I suspect that Emma's family destroyed a lot of her material. I have found some handwritten letters by her in the Griffith Institute in Oxford, and I visited Columbus, Ohio, six or seven years ago and got in touch with a very distant relative. He was the great-great-grandson of one of Emma's cousins, and it happened that he had a box of letters in his attic; his family had had the house for many generations. He met me in the parking lot of a Starbucks in Columbus and handed the box over to me for the day, and I took it to the library and dug through it and actually found a letter written by Emma to her cousin in there, which I was thrilled about. I gave the box back to him, and then about a month later he emailed me and said, you know, I think that letter belongs with you. And so he FedExed it to me, and I am now the proud owner of one of Emma's letters. But I think that's the reason that we've struggled to find anything, her diaries and what have you, which is a real loss. But having said that, her handwriting is appalling, so the thought of having to work through 19 volumes, that would have been very challenging. Yes? I wonder if you could talk a little bit more about who you see as the audiences or audience for this project. Like, how would you imagine it being used in the future? Yeah, so I went to a conference a couple of years ago, the ASTENE conference, which is the association for travelers in the Middle East, essentially, and many of the
people presenting had collected similar material to the type of stuff I've been showing you, and so there were discussions at the conference about generating a big database, the who-was-where-when database. So that, I think, will be the primary academic audience: people researching material from a similar time to me. But I would say that Egyptology remains endlessly fascinating to the world at large, and looking at our web statistics, visitors come from all over the world. And then we also have distant relatives of Emma's as well. A few people have contacted me via our Facebook page, and one of them had one of Emma's teapots, which is engraved with EBA. And I mentioned the Donna Laura Minghetti Leonardo. So on the way back from Egypt every year, the couple would stop in Italy and purchase Renaissance artwork. In fact, Theodore Davis was in competition with Isabella Stewart Gardner for the best material, and they both worked with Bernard Berenson to identify these artworks for purchase. Theodore found what he thought was a lost Leonardo da Vinci, which was owned by a woman called Donna Laura Minghetti, and he purchased it and took it out of the country under a cloak of secrecy. As soon as he got it back to The Reef, he proclaimed it to the world; there was great crowing and exultation. But then its authenticity began to be questioned, and it gradually faded from the limelight. The last mention of it is in a New York apartment in the 1950s, and then it vanishes from the record. And so the woman who had the teapot was married to the son of one of the nieces, and I happened to mention the artwork, etc., and I mentioned the Leonardo Minghetti, and she said, oh yeah, it's hanging in our hallway. So she sent me a photograph of this remarkable piece of artwork. A Swiss scholar has published kind of extensively on this piece, and I was able to say, you know, would you like to see a photograph of it? And so I was able to email him. So there's a little kind of unexpected audience. You never know. How
successful, or does it have to be variable? To be honest, initially I had hoped that I could use ABBYY FineReader to process Emma's typewritten diaries. We did that, and the output was terrible. If you look at pages of the diaries, you can see the page behind kind of bleeding through, and when I went to the American Philosophical Society to look at the diary, I found that the pages were really thin, and whoever had scanned it should have put a blank piece of paper between the pages, which would have easily solved that problem. The work that the computational linguistics students were doing was very promising. Most of the handwriting engines were trained on material from, like, the 40s, 50s, 60s, and often that doesn't correlate well with material from the late 19th and early 20th century, so that was one issue there. But there's a fantastic project called Transkribus, whereby you can upload your corpora of handwritten documents and train the model within Transkribus if you have sufficient documents, and that is very promising for future work. So I'd say, you know, it's variable, but we do have some strategies now, as I showed you, for cleaning up the output, which have been very helpful. Yes? This wasn't in your presentation, but I'm curious if you have an opinion about it. The Met site that you scraped, I use in my classes. I teach a world civilization class, and they're, like, freshmen and sophomores; it's a survey class. But I make them go to the Met and a couple of other sites that have, like, very good photos of objects with context and things like this, and maybe some essays about those objects. And I always feel conflicted about whether I should have them also read things about how, back in the 19th century, those objects were acquired and the ethical, you know, situation surrounding them, and I wondered if you do that with your students. Yeah, it's certainly a conversation they have. They read it in the diaries, so they know how these objects were acquired and how little involved Egyptian nationals often were in
having a voice in where their history ended up. But the fact is, it happened, and it's important, I think, to have those conversations, to make them apparent and visible. That's one of the reasons why we've been interested in working on the provenance issues for some of these items that have no provenance attached to them, because often you see them and there's no context. So you say you started with the first students in 2011 on this. Have you kept up and seen how they've used these sorts of skills after? Yes, yes, I have. A few of them have gone on to graduate study in international relations, history, etc. One of my current students is something of a handwriting rockstar, and he intends to go on to a PhD program in history. And getting to know the students so well, we're able to write very strong recommendation letters for them as they're applying to jobs and programs. Some of the things we highlight are the combination of both the so-called traditional humanities skills, writing, speaking clearly, that sort of thing, but also the tech skills that are layered on to that: project management, collaboration. So we feel that the package that they're able to present to both the academic and working world is a very strong one. And often they have demonstrable projects to show, like the story maps that they've worked on and the data visualizations that they've created, so they can weave a narrative around the work that they've been doing with us, which is always compelling, I think. So I hope that helps. Thank you. Well, thank you very much for your attention. Very appreciative. We have some time before the workshop. It's going to be in the room can I have the Smith feel free to