Okay, welcome everyone to this week's edition of NCompass Live. I am not your usual host; Christa Burns is on vacation this week. My name is Emily Nimsakont, and until recently I was the cataloging librarian at the Nebraska Library Commission. I am now the head of cataloging at the University of Nebraska-Lincoln Schmid Law Library, but I already had this NCompass Live scheduled, so I decided to go ahead and do it.

For those of you not familiar, NCompass Live is the Nebraska Library Commission's weekly online event covering a variety of special library activities and topics. It is free and open to anyone to watch. The sessions occur every Wednesday at 10 am Central time, and they are recorded so they can be watched for free later if you miss a live session. They include a mixture of presentations, interviews, book reviews, anything library related really, and are presented online by Library Commission staff and guest speakers. I used to be Library Commission staff and I guess I am a guest speaker now, but I am your host and your speaker today.

Before we jump in, I just wanted to say that if you have questions as we go along, please feel free to ask them. Type them into the question box, or if you have a microphone, let me know in the question box that you want to be unmuted, and I can unmute your microphone so you can ask that way.

All right, so let's go ahead and get started. Our topic today is "Metadata Manipulations: Using MarcEdit and OpenRefine to Enhance Technical Services Workflow." Once upon a time, metadata manipulations meant something similar to what you're looking at in this picture.
Working with cards in the catalog. That is no longer the case, with electronic records and records coming in from a variety of sources using non-MARC metadata. So today I want to talk about a couple of pieces of software, MarcEdit and OpenRefine. They're both free, and I want to talk about ways in which you can use them to make your life easier when it comes to working with MARC catalog records and other kinds of metadata you may encounter in your library.

The first one I'm going to talk about is MarcEdit. It is a freely available program that can be downloaded to your computer from the website at marcedit.reeset.net. It is created by a wonderful librarian named Terry Reese, and he does this all on his own time. This is not his day job. He works at, I believe, Ohio State University, but he maintains it and he does updates. He is working on a version right now that will work on a Mac. So it's just a labor of love for him, and he's really awesome.

I'm going to tell you as much as I can about MarcEdit in half of this presentation. There's a lot more that can be covered. I've given a three-hour class on it. If you're in Nebraska, the Technical Services Round Table of our Nebraska Library Association is getting Terry himself out to do a pre-conference at our state library association conference, and that is a whole-day thing. So there's obviously a lot more than I can cover in a one-hour presentation when I'm also trying to talk about another piece of software, but I'll give you the highlights.

There are a lot of tutorials available. If you go to this MarcEdit website, this is where you download the software, but there's also a tutorials link on the side, and that leads you to a lot of YouTube videos. So it's a really great resource when it comes to the software itself, and when it comes to the help available, you can email the creator, Terry Reese, and he responds quickly. There's also a MarcEdit listserv that you can get on. So there are lots of resources for help with this software.
These are the features that I'm going to briefly demonstrate using the MarcEdit software. You can batch edit records, so if you have a bunch of MARC records and you need to add the same field to all of them, you don't have to go and touch them all individually. You can just use MarcEdit to add the same thing to, or delete the same thing from, all of them. You can extract fields from records to create a spreadsheet or do something else with your data. There is a function called MarcCompare, where you can compare two different MARC records; MarcJoin, where you can join a bunch of smaller batches or even individual records into a larger batch; and the opposite of that, MarcSplit, where you can split a large batch into smaller groups or even down into individual records. You can merge records with similar data to create a new record. You can actually use MarcEdit as just a cataloging tool to create records from scratch if you don't have access to any other software that will do that for you.

A couple of relatively recently developed features: one is RDA Helper, which takes records created under the old cataloging rules, AACR2, and converts them to something that I guess would technically be called a hybrid record. It adds fields compatible with RDA, Resource Description and Access, the new cataloging rules, and it does it all to a big batch of records automatically. Then there's also MARCNext, which is the newest addition to MarcEdit. It is a suite of tools that help you play around with linked data concepts and with using your MARC data in whatever the brave new world of post-MARC is going to be, with BIBFRAME coming from the Library of Congress and things like that.

So those are the features I'm going to talk about. I'm going to jump out to where I have the software installed. You can download it on your computer, and they have a variety of options. Actually, let me go to the MarcEdit page and show you the download options. Let's see here.
MarcEdit start screen. If you go to Downloads over here on the left-hand side, there are the various options that you can download. Once you've got it on your computer, it will let you install a desktop icon or whatever you want to use to get to it. Let's see, I have it here. What you'll see is that you can customize the start screen with any of the functions that you use most often, so that you can use whatever works best for you.

The first thing I was going to talk about is batch editing of records. If you have a bunch of records that come in and you need to do the same thing to every single record, you can do that. The thing about MarcEdit is that it takes a MARC file and converts it into its own format so you can work with it and manipulate it, and then you have to convert it back, to round-trip it out into MARC again.

So the first thing to do is to go to MARC Tools, and we want to make sure that the MarcBreaker function is selected from that functions menu. Then you have to find whatever file it is you're going to work with. I'm using as an example a batch of records for documents from the federal government. We would get monthly files like this coming in when I worked at the Library Commission, and then we would decide whether we wanted to import them into our catalog. So you just browse through your computer to wherever the file is that you want, and then you have to specify where you want the output to go. The .mrk extension is the extension for the MarcEdit file. Then you can click Execute and it will convert the records. It tells you down at the bottom how many were processed, and then you click Edit Records to get to the screen where you can do just that.
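To make the "converts it into its own format" step a little more concrete, here is a minimal Python sketch of what a breaker does in principle. It is not MarcEdit's code. It assumes a standard binary MARC (ISO 2709) record, with a 24-character leader, 12-character directory entries, and the usual field and subfield delimiter bytes, and it only approximates the text layout of a .mrk file (for example, it does not escape a literal dollar sign in the data). The function names `break_record` and `break_file` are made up for illustration.

```python
FIELD_END = b"\x1e"    # field terminator in binary MARC
RECORD_END = b"\x1d"   # record terminator in binary MARC
SUBFIELD = "\x1f"      # subfield delimiter in binary MARC


def break_record(raw: bytes) -> str:
    """Turn one binary MARC record into mnemonic-style text lines."""
    leader = raw[:24].decode("utf-8")
    base = int(leader[12:17])           # base address of the field data
    directory = raw[24:base - 1]        # directory ends with a field terminator
    lines = ["=LDR  " + leader]
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12].decode("ascii")
        tag = entry[:3]
        length = int(entry[3:7])
        start = int(entry[7:12])
        # Slice the field out of the data area, dropping its terminator byte.
        data = raw[base + start: base + start + length - 1].decode("utf-8")
        if tag < "010":
            # Control fields have no indicators or subfields.
            lines.append(f"={tag}  " + data.replace(" ", "\\"))
        else:
            # Blank indicators are shown as backslashes, delimiters as "$".
            indicators = data[:2].replace(" ", "\\")
            subfields = data[2:].replace(SUBFIELD, "$")
            lines.append(f"={tag}  {indicators}{subfields}")
    return "\n".join(lines)


def break_file(data: bytes) -> list[str]:
    """Split a binary MARC file into records and break each one."""
    return [break_record(chunk) for chunk in data.split(RECORD_END) if chunk.strip()]
```

Reading a file's bytes and passing them to `break_file` would give one block of text per record, and the length of the returned list corresponds to the "records processed" count shown at the bottom of the MarcBreaker window.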
And so there were nine records in this batch. They're for print items, but they also are available online. So let's say that I wanted to add a field to all of these records saying that the item is also available as a PDF online. I don't want to go through and add them all individually, so at the Tools menu you can choose Add/Delete Field. This information goes in a 530 field, so you simply type in the field number and then whatever text you want it to say. I'll just say "also available as a PDF online," and then you click Add Field. You'll get a confirmation pop-up telling you how many fields were added, and then if you go and look, it's probably not the easiest to see, but they are indeed added to the records. There's a 530 field here, and it's the same for all the records.

You can also delete fields. These have a 994 field that was automatically added, but we don't necessarily need to use that for anything. So I'm going to go back to that Tools menu, go to Add/Delete Field, and say we want to delete 994. As we don't care what the data is, you don't have to put anything in the second box. If there were several occurrences of the same field with different types of data, you could specify the data, and then it would only delete the ones with that data. But I'm just going to delete the field, and again it tells me nine fields were deleted, so that means it was done for every record.

You can see nine records isn't that much to work with. You probably could go in and edit all of these individually. But when you think about it, sometimes you get packages from vendors and there are thousands of records, so it's really a time saver to be able to do things like this.
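For anyone who wants to see the same add and delete idea outside the MarcEdit window, here is a small Python sketch that works on the text of a .mrk file. It is only an illustration under a few assumptions: one field per line in the form `=TAG  data`, records separated by a blank line, and plain `\n` line endings. The function names are hypothetical, and unlike MarcEdit this sketch simply appends the new field at the end of each record rather than placing it in tag order.

```python
def add_field(mrk_text: str, tag: str, field_data: str) -> tuple[str, int]:
    """Append '=TAG  field_data' to every record; return new text and count."""
    records = [r for r in mrk_text.strip().split("\n\n") if r.strip()]
    new_line = f"={tag}  {field_data}"
    updated = [r.rstrip("\n") + "\n" + new_line for r in records]
    return "\n\n".join(updated) + "\n", len(updated)


def delete_field(mrk_text: str, tag: str, containing: str = "") -> tuple[str, int]:
    """Delete every '=TAG' line; if 'containing' is given, only matching ones."""
    prefix = f"={tag}  "
    kept, deleted = [], 0
    for line in mrk_text.splitlines():
        # An empty 'containing' matches every occurrence of the tag.
        if line.startswith(prefix) and containing in line[len(prefix):]:
            deleted += 1
        else:
            kept.append(line)
    return "\n".join(kept) + "\n", deleted
```

The field data passed to `add_field` has to carry the indicators and subfield code itself, for example `r"\\$aAlso available as a PDF online."` for a 530 with two blank indicators and a subfield a. The counts returned by both functions play the role of the confirmation pop-up.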
The next thing I'm going to talk about is extracting fields from records. This is a function that I used when we got these same government documents records. I, as the cataloger, would download the records from the government's website, and then I wanted to pass them along to all the gov docs staff so they could evaluate whether we wanted to put these into our catalog or not.

Let's see, some questions have come in, and actually two people asked the same question, so this is good: whether the 530 field should have indicators or subfields. Yes, that was an oversight on my part. If you do want to put indicators and subfields in there, you need to type those into the Field Data box. You put the 530 into the Field box, and then in the Field Data box you need to put in your indicators and subfields. That was an oversight on my part. Excellent question.

I was talking about extracting fields. As part of our workflow, I would extract information from the MARC records into an Excel spreadsheet and pass it along to the gov docs staff so they wouldn't have to stare at the MARC records and make sense of them. For this one you need to go to the Tools menu. You can customize which functions you want on your home screen, but you can also find all of them in the Tools menu. So you want Export, and in this case I wanted to export tab-delimited records that could be used as a spreadsheet, so you choose Export Tab Delimited Records from the sidebar if you haven't gotten there already. Then you need to set your file paths, similar to what we did when we were editing the records. I'm going to choose that same MARC records file, and then you need to specify a place to save the output. You'll notice it's going to be a text file, a tab-delimited file, rather than a MARC file. Then you select the field delimiter. I want it to be tab; that's what's going to tell it to go on to the next field, basically. And then you get to choose the fields
that you want to export. I recommend selecting Normalize Field Data; that takes out all the MARC subfield codes and things like that. You can either select fields from this list or type them in. If you wanted the title, you put in the 245 field, and if you only want a particular subfield, you put the subfield in there, and then click Add Field and it goes up there. Then add whatever else you want. If you want the call number, you can do the 086 and add the field. Or, let's see here, you might want a 110 for the corporate name. For these it's probably a corporate name rather than a personal name, but it might not be in a 110; it might be in a 710, so I'm going to add that too. You might not actually want to add a subfield; you might want the whole field in the case of a 710, because it might be Human Services and then a smaller division, Division of Public Health or something like that, so you might want to see the whole thing.

If this is a process that you do often, you can save these under Settings. Click Save Settings, and you'll be able to save it as a text file so you can load it again the next time you want to do the same thing. If you are coming back to something where you've saved the settings before, you don't have to go through adding these again; you can just do Settings and then Load Settings, and it'll give you a chance to reuse the settings that you saved before. If it's just a one-time thing, you can just click Export. I'm demonstrating this as a one-shot deal, so I will just click Export, and it tells you where the file has been saved.

At this point I go and open up Excel and open the file that I just saved. You'll probably need to switch the file type so that it shows text files. It'll give you some options for how you want it to display. You do want it to be delimited, so that the computer knows the tab character separates the data into a new column. You want tab to be the delimiter, because that's what we saved it as. And when you click Finish, you've got your various fields.
It looks like only some of these had a 110, but we've got the call numbers and we've got the 245. So you can just customize it, and then you can do whatever you want with this spreadsheet. If you need to send it on to somebody else, it's just a way of getting a subset instead of a whole MARC record. I'm not going to bother saving this, because we don't need it later.

Let's see, now we'll look at MarcCompare. This is a case in which you take two different MARC records that could be in your catalog for the same item, and you want to see where the differences are. Maybe you've got a record from two different vendors and you want to compare the pros and cons of each of them. Again, this is something you could put on the home screen; you can also find it under the Tools menu. The examples I'm using are a record for a print version of a book and the record for an ebook version of the same book. Then you have to specify where you're going to save the output file. It's going to be an HTML file. I'll call it MarcCompare and click Execute. Then you need to go and find where your file is, and it just kind of highlights the differences between the two records. The things highlighted in green are the things that are unique to the print record; things highlighted in pink are the ones that are unique to the ebook record. I don't necessarily know if you'd use it so often in your day-to-day work, but I think it could be useful when you're evaluating records, say, from two different vendors, and you want to see which one might be superior for your uses, things like that. So that's the MarcCompare function.

The next thing is MarcJoin, and this is just what it sounds like. You take either small batches of records or individual records, and you show MarcEdit where to find them. You have to start by selecting the destination that you're going to put them into, so I'm just going to call it MarcJoin. Then, for the files to join, I'm just going to pick some random files. It's pretty simple to go through that. You can open up
the file, and instead of having two different files with ten and nine records, you'll have a file with nineteen records altogether.

MarcSplit is the exact opposite of that. You choose your source file and you choose where you want the output to go; it's going to put it all together in a folder. Then you can tell it the number of files to split it into, so if you have a file with five thousand records and you want it split into five files, you can do that. If you don't care about that and you just want to separate things into individual records, you can tell it to do one record per file. So you can either tell it how many files you want at the end or how many records you want in each file. Each of those can be useful in particular situations. And so you process it. I don't know why you would want to use a control field in the file name; I have never done that. This says nine files have been generated. I saved them to My Documents, and all these msplit files with numbers are the various files. So now you have nine individual files rather than the one with nine records all together.

You can also merge records. Let's say you had records from a vendor that had pretty decent cataloging but didn't have OCLC numbers, and then you had a batch where they did have OCLC numbers, but they had been put in as Cataloging in Publication records and weren't very fleshed out. You want to merge them together so they have OCLC numbers but are also good quality records. You can merge them. Again, it's just as simple as figuring out which ones you want to put together. Then you can either merge into the source, so you just merge the second file into the first one, or you can specify an entirely new file if you don't want to overwrite what you had. And so I will give it a source, and I'll just call the output "merge results." If you only want to merge particular fields, you can choose that; if you want
to merge them all together, you can do that too. And so then you do that, and your file is located wherever you specified. So a lot of it is just navigating to the right spot on the computer, telling it what to do, and then letting it work its magic. Again, I could go into much more detail on all these things, but we have a lot to fit in in under an hour's time. Please do stop me if you have any questions.

I said you can also create records from scratch. There is a MarcEditor function, which you can put on your home screen if you customize it, or it is also under the File menu, and you just get a blank slate. You can do File, New, and then it gives you some options for starting: a book, a serial, visual material, et cetera. If you choose book, it populates the standard fields that you might use for a book, and then you can just type them in and create your records. So if you don't have access to a bibliographic utility, or for some reason you're not in a situation where you can use your local system to create records, say you're away from your library and you really need the record, MarcEdit is good, very basic software for creating records from scratch, as well as doing all these wonderful things to edit them.

Then there is also the RDA Helper function. As you may or may not know, cataloging is going through kind of a transitional period between the Anglo-American Cataloguing Rules, AACR2, and Resource Description and Access, the new rules. If you use the RDA Helper function, you can take a batch of AACR2 records and add things to make them RDA compatible. I would say they're not necessarily truly RDA records, because you haven't gone back and looked at the item and made sure that you're transcribing what you see, but they will be hybrid records that will work in a catalog with RDA records. I have known some people who have used this function to fully convert their catalog to RDA-compatible hybrid records, and
then you can choose which fields you want to add. These extra 34X fields may not necessarily be important, depending on what kind of material you're working with, but you can add them in. Most people will probably want to add the 336, 337, and 338, delete the GMD, your general material designation, and evaluate the 264 for publication information. It can also expand abbreviations, so if your record has "p." for pages, you can have it spell out "pages."

I have a question coming in: "I missed how you can convert the files back to MARC after you use the MarcBreaker." Oh yes. Skipping back to MarcBreaker for a second: if you are in the MarcEditor and have already converted a file, and I'll just open the one we were working with before when we were adding and deleting fields, once you've done all the changes you need to do, which can be either batch changes to add a field to everything or going in and touching each record individually, Compile File into MARC is one of the last options in the menu. It will let you name the file whatever you want, and that's how you save it. Then you can import that file back into your local system. That is a good question.

"Can we use the aggregate utility on authority records too?" You know, that is a really good question. I have never asked that one. That is definitely something that I would say to contact Terry Reese about. His contact information is on the website and he's very, very good about responding. Let me see if I can figure it out. To me it looks like it's mostly referring to bibliographic records, so if I was pressed I would say no, but definitely talk to Terry; maybe it's something he's working on, who knows.

Okay, so I think that's all the questions. If not, and if I ignored a question by accident, please go ahead and resend it. I'm going to go ahead and run the RDA Helper function. Let's see, I have a file of AACR2 items, and it says the RDA process has completed. Let me
see if I can find that file. I called it RDA items. I guess I didn't show it to you beforehand, but it now has the 336, 337, and 338, and it's added a subfield e with rda. I know, at least for OCLC's practice, they prefer to put the subfield e earlier in the string, but MarcEdit just put it at the end.

Oh, I see someone is answering the question that came in earlier: MarcEdit is MARC agnostic, so it will work on all MARC formats: authority, community information, et cetera. So that is a good answer to that question, thank you.

So that is the RDA Helper function. MARCNext was the next thing I was going to talk about, and it is a suite of tools. I'm not going to go into great detail about it, because I haven't explored a whole lot of it myself and we only have so much time to get everything in. The thing that I have found most interesting is the link identifiers function. I'm going to use one of my government documents files, and then you specify a save file; I'll call it "linked." If you're familiar with the ideas behind linked data, one of the movements is, instead of just putting strings of information in our authorized headings, to also have a URI, a Uniform Resource Identifier, for a person's name or a subject heading, things like that. What this is doing is linking fields to the Library of Congress's linked data service, where they have concepts and people represented as Uniform Resource Identifiers. So you check that box and then hit Process. It takes a while; it's going, 2, 3, 4. And I'll go find that file. It's a MARC file, so I'm going to break it into the MarcEdit format so you can see it more easily. What it has done is add a subfield 0 in fields like a 100 field for a personal name, and it has a link to the URI for this person's name, Jennifer D.
Schmidt. So this is kind of a baby step towards linked data. I don't know if a lot of people are doing this yet, but you'll notice it also has it for the 650 fields, the subject headings. Down here it has a link to the linked data version of the Library of Congress subject heading. So these subfield 0s are at least a concept that's out there, and they are being added in by some people. I don't think it's very widespread, but a lot of people think this is kind of a baby step towards linked data. In the environments we're in, we still have to use the strings, where we actually type in somebody's name in an authorized form, but we can also use these identifiers that will someday be more computer readable, in order to make linked data work with our library data. So I think that's kind of cool. Like I said, I haven't explored a lot of the other MARCNext tools in detail, but there's information out there in the tutorials that Terry Reese has on his website, so if you're interested in that, that's something that I would recommend exploring.

I will also say at this point that any of the links that I'm sharing will be collected into the Library Commission's Delicious account, and when you get the email saying that the recording of this is available, there will also be links to all of these, so you don't have to worry about writing down URLs as I say them.

So does anybody have any questions about MarcEdit before I go on to OpenRefine? And if you are playing around with these things after the fact and you have questions, definitely feel free to email me. I will be happy to do my best to answer them. I don't see any coming in at the moment, although you are certainly free to still go ahead and type some in if you happen to think of them.

I'm going to go on now and talk about OpenRefine. You may have also heard this called Google Refine. It is a tool that was originally made by Google, and now it's called OpenRefine. It is freely available software at openrefine.org. You download
it onto your computer, and it's running on your computer, but it also uses your web browser to allow you to actually interact with the interface. If you are interested in this, I will say right now, as you can see, there is a book called Using OpenRefine that is a really good introduction to this software. There is also documentation available on the website. I'll also mention a website called freeyourmetadata.org, which uses OpenRefine a lot to show you how to mess around with metadata and do things with it that we are not necessarily currently doing in our libraries. So keep that in mind as well. That is also going to be in the Delicious links after the fact: freeyourmetadata.org.

Here are the features of OpenRefine that I'm going to show you: separating multiple values in the same field, analyzing the distribution of values in a particular field, and cleaning up inconsistencies. I'll also talk a little bit about how this has linked data applications.

So you go to openrefine.org and download it, and then when you work with it, let me close this tab, it is in a spreadsheet like this. I have already gone and uploaded a file to work with, a tab-separated file. It did take a while, so I'm not going to show you that part. I had a test set of metadata that I actually downloaded from the freeyourmetadata.org website, so if you don't necessarily have a project that you are working on, you can get some test metadata from there. This is from, I believe, the Powerhouse Museum, so it is some of their collection data.

The first thing I said that we can do with this is separating multiple values within one field. Depending on where your metadata came from, you might have a string of, say, subject headings separated by a semicolon or some other kind of character, and they are all stuck together in one field. My experience with non-MARC metadata was working with Dublin Core fields in CONTENTdm for the Nebraska Memories project that we did at the Nebraska Library Commission,
and so we would string all the subject headings together in one field, separated by semicolons. But there might be situations in which you want each value to be in its own field, so we will do that.

But first I have to create the project. This is the screen you get after you import your data. I'm just going to call it "NCompass Live test," because we are just working with this for this NCompass Live presentation, and then hit Create Project. I might have to start over. I tried to be proactive and do this ahead of time, but it really is undermining my best efforts. So now you can see the whole step. I had a file of the Powerhouse Museum data and it is uploading it into Google Refine. You'll notice that some of the screens say Google Refine, even though they pretty much brand the software as OpenRefine these days. It says "almost done," and it's working on it. You can see the little bar is almost full. It seems to always get stuck at this last point, but it's working on it.

While we're waiting for this, I'll just talk a little bit more about OpenRefine. I feel like it's particularly powerful for dealing with large amounts of data, where you wouldn't necessarily be able to go through and touch every record and find inconsistencies. You can analyze it to see, did everybody use a geographic subject heading the same way, or did some people just put "Chicago Illinois" or something like that? And you can analyze thousands and thousands of records very, very quickly.

Let me see how we're doing here. It's still going. Okay. You know, I tried to bypass this, and, you know, best laid plans, but it's working on it. While we're waiting for it, I will just talk through some of the things that we would be doing and what exactly I mean by each of these things. Separating multiple values in the same field is just what I was explaining. Let's see, oh, I have a question coming in, good: "What kind of files can be uploaded with OpenRefine?" Let me answer that for you. I know they have a list. I think it's
pretty flexible; there's a lot of things. Let me go to their website at openrefine.org. I'm going to go check out their documentation just to make sure I'm not missing anything. I think they can do XML, and I think CSV files, comma-separated values, and tab-separated values. FAQ, that seems like a good place to start. I'm sorry, I'm not finding a definitive list here of everything you can use, but I do know that it's a pretty extensive listing: an Excel spreadsheet, a CSV file like I said.

And this is, of course, not working. Let me try this again. I'll start over, starting from scratch, opening up Google Refine, or OpenRefine, to make sure that we're not encountering any glitches here. It starts up with this little start screen and then it opens up in your browser.

Let's see, I have a question coming in: "What software do you recommend to those who are new to MARC files in general?" I guess that would depend. I'm going to need some more clarification from you: do you have a local ILS that lets you work with MARC files? While you respond to that, I'm going to try again uploading this file.

Okay: currently working in an internship in a corporate environment, where you may or may not have a MARC-based catalog. If you're just working with actually creating your records outside of your local system, probably MarcEdit would be the most basic software I can think of to work with MARC files. You can create them from scratch in there, you can edit them, and I believe you can also export them in formats other than MARC. If your corporation's software does not take MARC, you can probably export it as something else and then see if it's something that can be used by your system. I'm sorry, I feel like I'm not giving you a great answer, but it's hard to know without knowing your local system. And yes, there are a lot of tutorials on YouTube for MarcEdit, so I think that
will hopefully help you do what you need to do with your system.

Okay, so I'm going to let that run. I'm kind of wishing I had screenshots to show you everything now, but I'm going to go ahead and talk about the things that I was going to tell you about OpenRefine. When it says analyzing the distribution of values in a particular field, like I said, you can tell it to look at a particular field, and they call it faceting and clustering. You can do the facets, and if you have, say, a place of publication field, it will go through and tell you the values that are there. Anything that looks similar you can group together; that's what they call clustering. So it will bring together things like Chicago, Illinois where Illinois is capital I, capital L; where it is capital I, lowercase l, lowercase l, period; where Illinois is in parentheses; where it's not. It will let you evaluate anything that might be similar, and then you can choose one. You can say we want "Chicago, comma, capital I, capital L," and it will fix all of those and make them all look the same. So you can do that for simply analyzing and seeing where the differences are, and you can clean up inconsistencies, whether they are typos or whether policies have just changed over time so not everybody is doing things the same way.

I really hope it's going to work. It says it's almost done. I have someone with their hand raised in the audience. I don't know, is that a request to talk with your microphone? If you would like me to unmute you, please go ahead and type that in the question box and I can certainly let you speak that way.

So anyway, I'm beginning to feel less and less hopeful that Google Refine will actually let us see what it is that I'm talking about. I have a question coming in asking if the webinar will be archived, and yes, it definitely will. You will get an email with the link to the recording. Christa, our usual host, is out this week on vacation, so
usually it's up within 24 hours, but it might be next week. You will definitely get an email with the recording. At least we did get to see the part about MarcEdit. So anyway, I'm not entirely certain I'm going to get to show you this.
Okay, so here is the important screen. You can call the project whatever you want and then click Create Project. This one takes much less time; we have 11 seconds remaining, so creating the project actually works much faster. I'm so glad this actually worked out.
We have a question coming in asking if OpenRefine can find inconsistencies with upper and lower case. Yes, I believe it can; we can look for some of those right here. It's a very, very powerful tool, and it can definitely help you find almost any kind of inconsistency in your metadata.
So I was talking about how you might want to separate values in fields. There is a category field here in this Powerhouse Museum metadata, which is a lot of subject headings and also some more descriptive type of stuff, and the values are separated by vertical bars, which is an interesting character that I've never seen used for this before. If you want to edit this, you go to the category column. Each column has a little arrow next to the column name, and that is where you get all your options to edit things. You choose Edit cells and then Split multi-valued cells. That lets you put in what the separator is, so we have the vertical pipe, and you hit OK. It works on it for a little bit, all much faster than the initial upload, I promise, and now it has split things into multiple columns. So this Spacecraft and Models and Space Technology used to all be in the same column, and now it has split them out separately. So that's something kind of neat that you can do.
Now I'm getting a comment coming in saying that the vertical pipe is an option in MarcEdit for extracting records, so that's cool, and that's possibly where these came from. So once you've got things separated out and
you want to kind of see what you're dealing with, you can facet it, with a text facet when we're dealing with something that's mostly text. Then watch over here on the left-hand side: you'll get a list of everything that is used in this field, and the little gray numbers after each value tell you how many times it is used. We have a lot of numerical things here. This is the type of data that I'm not totally familiar with, so I'm going to skip down to the text values, which will probably be more meaningful. We have four items where AC generators is used and two items where acid jars is used. It's kind of a cool tool just to see, for example, that we have 69 things that were considered to be advertisements. It's a snapshot of your collection data, basically.
If you want to do what I was talking about before and see where some of the inconsistencies might show up, you can hit the Cluster button. This is where it shows you, and so yes, to answer the question that came in before, it does show you differences in uppercase and lowercase letters, things that are close but a little bit different. Let's say our policy is to capitalize the first word and not capitalize the other ones unless they're proper nouns. It shows you these things, you say yes, we want to merge them, and then this New Cell Value box is where you get to choose what they are all going to look like. If you put it in there with a lowercase second word, then it will take all the ones where both words are capitalized and make them look like that. You can do the same thing for this one: tell it you want to merge, and we want to have the lowercase second word. Same thing for this one, we want to merge, and measuring instruments, et cetera, et cetera. Then once you've selected everything that you want to change, you hit Merge Selected & Re-Cluster, and you'll see those are taken out of the equation. You've
still got all these that are similar but not identical, and so you can go through and make more changes. It appears that most of the inconsistencies in this file are exactly what the question was talking about: they are differences in capitalization. But it can also find things like what I was talking about before, where it has Chicago versus Chicago, Illinois, things like that. It's really, really good at finding things that are close but not quite, and allowing you to standardize things across a lot of data. So you want to merge that one, and then if you hit Merge Selected & Close, it takes you out of that option.
Those are the main things that I've been aware of using this for. I've not had chances to play around with this as much as I would like; I would love to get into more projects where I'm able to use OpenRefine. But that covers most of the things I talked about: separating multiple values, analyzing the distribution, and cleaning up inconsistencies.
I will just briefly talk about the linked data applications, because this is definitely something that I have not used it for. Similar to how MarcEdit had the MARCNext functions, OpenRefine also allows you to encode your data in a way that can be used for linked data and link it to other sources. And just for the record, you can get your data out of here by exporting it in a variety of formats. There are also extensions built in. It has Freebase, which is a store of linked data, Wikipedia-type data that's out there, so you can choose to link your data to other systems that are out there and then make yours available, and it will use identifiers and things like that that are already available in the linked data world.
Let's see here, we have a couple of questions coming in. Can you remove specific data, for instance 505 fields that are just a list of the alphabet? Yeah, I can see why you wouldn't want that in your data. Specifically with OpenRefine, I assume that you could.
What I would say you could do, let's see: I think that if you go through and sort your data, you'll get a list here, and if you see the one that you want, like the ABCDEFG, et cetera, you can choose Edit, and I think you can just delete it all and hit Apply, and it will get rid of anything that has that data in it. You could also do that in MarcEdit. Earlier, when I did the delete field option, I didn't specify a particular type of data in a field; I just told it to delete all the 994 fields. But you could tell it to delete a 505 field and then specify in the field text exactly what it is that you want to remove. So I believe either of these pieces of software could accomplish that for you.
Another question coming in: does it handle diacritics correctly? That one I do not know off the top of my head. I don't work with a lot of items that have diacritics in their text, so I'm going to have to unfortunately defer on that. I can certainly research that and email you the answer if you are interested, but you could probably also search the documentation on the OpenRefine site, and I'm sure it can answer your question on that.
We have another question coming in: does MarcEdit or OpenRefine have a spell checker tool or a way to find misspellings? For MarcEdit, yes, I'm pretty sure it does; I believe I have used that. OpenRefine, I'm not so sure. I will say I have never personally used it for that, but with the powers that it has, and the fact that it can identify things that are not capitalized consistently, I am confident that you would be able to find a way to look for misspellings. When it did the cluster thing we only saw capitalization differences, just based on the data that we have here, but certainly I would say that if you were looking at a particular field and you asked it to cluster, I believe it would bring together both correctly spelled and incorrectly spelled versions, and then you could change the incorrectly spelled ones to correctly spelled ones. I
hope that answers your question. Does anybody else have any questions coming in?
Like I said, I feel like I've just totally scratched the surface on both of these, especially OpenRefine. I know it can do way more things than I just told you. Either of these could easily be a full-day presentation on its own, but hopefully this has given you some information so that you can jump in and explore these things on your own. I would definitely be glad to answer any other questions you have, either now if you think of them, or you can definitely email me later. I have links available now under my own personal Delicious account; they will also be on the Library Commission's website when the recording is ready. So I'll wait a second to see if there are any other questions coming in. If not, then I thank you for attending, and please join the Library Commission for more Encompass Live sessions in the future. Thanks, bye.