Thank you all so much for coming this morning to the open source digital preservation and access stream, which was put together by myself but also with a lot of help from other folks, primarily the open source committee, and a lot of other people who gave comments and suggestions. So I'd like to thank the open source committee and everybody that helped put today together, not the least of which, of course, are the people doing all of the great programming. We've got a really stellar lineup of speakers and topics, so I'd like to thank everybody for contributing.

Kicking us off with our inaugural session are Trevor Thornton and Lauren Sorensen. Trevor is an application developer in the Digital Library Initiative department at North Carolina State University. Before moving to NC State in February, he was at the New York Public Library, where he was a senior applications developer with NYPL Labs. Lauren is an audiovisual archivist interested in digital preservation, magnetic media, and open source, currently at the Library of Congress as part of the American Archive of Public Broadcasting project. Lauren previously worked at BAVC, or Bay Area Video Coalition, where she helped manage digitization workflows for preservation and access of video and audio for smaller institutions. She participates in the FADGI born-digital video and NDSA standards and practices working groups and is co-chair of the Independent Media Committee at AMIA, which is... when is your meeting? Friday at 5:45 every day.

And a quick reminder: if you're tweeting, the hashtag for today's stream is #OSDPA. It's on both walls and up here on the podium, in addition to the hashtag #AMIA14, so it would be great if you could tag those with that. With that, I will turn it over to Trevor and Lauren. Thank you very much.
Hi everybody. So as Chris mentioned, I'm an applications developer at NC State, and for developers working in libraries, archives, and museums, there's an increasing focus on building tools to solve common problems, tools that can be useful to the community at large and not just within the institutions that we work for. The mechanism that enables us to share these tools is releasing them as open source projects. Increasingly, the tools that we build are taking the form of web applications, so what I want to do today is give you some background on open source software in general and the technologies behind web applications in particular, which I hope will provide some foundation for the rest of the presentations you're going to hear today.

So to start with, let's define what we mean when we talk about open source. First, open source software is made available with a license that permits users to freely run, study, modify, and redistribute the software. There are two notable parts to this. First is the importance of the license, because in its most basic sense open source is a legal designation, and the license as a legal document is fundamental. Second is the notion of free: when we say that users can freely use this software, we're not talking about cost, we're talking about freedom from legal restrictions. While a lot of open source software is available at no cost, the issue of cost isn't implicit in the designation of open source. Open source software is by definition distributed with its source code in a human-readable format. Traditionally, software can be thought of as taking two forms: the source code, which is what the developer writes, and the machine code it is compiled into, which is what is executed on the computer. In order for developers to study and modify the code, they have to have access to the source code. And the third characteristic of open source is that it's typically, not always but very often, developed collaboratively by developers who are otherwise independent of each
other but come together to solve a common problem, and this is one of the real strengths of the open source model.

So to better understand open source, it helps to look at where it came from and how it's built. In the early days of computing, the focus of the business was on hardware, and the sharing of software was common, particularly in academic settings, and software was customarily distributed with the source code. But with the development of mass-produced computers in the 1960s and personal computers in the 70s, a significant industry developed around producing and selling software. In order to remain competitive, software companies began to implement measures to protect their source code from being copied or modified, which at the time mostly meant just distributing the software as machine-readable binary code and not including the source code. At this time there was no existing legislation prohibiting the sharing of software, but this changed in 1980, when protection under U.S. copyright law was extended to cover computer software, which gave software the same protections as literary works.

The same period of history saw the rise of what's become known as hacker culture, which is concerned with circumventing limitations of computer systems in order to extend their functionality. This community increasingly began to see these efforts by the software industry to restrict access to source code as an impediment to innovation and to their intellectual freedom. This led to the foundation in 1983 of the GNU Project, which was started by Richard Stallman from MIT with the goal of collaboratively developing software that was free of any restrictions on its use or modification. It's important to note that copyright on software doesn't just apply to application software; it also applies to system software, which computers require to do anything at all, and Stallman believed this fundamentally restricted the user's freedom to control their own computers. So the first thing the
GNU Project developed was the GNU operating system. GNU is a recursive acronym that stands for GNU's Not Unix, which refers to the fact that GNU is modeled on the proprietary Unix operating system. But the thing that maybe has had the most lasting impact was the license under which the GNU software was released, which is called the GNU General Public License, or GPL, which provided for the first time a legal basis for giving users the right to copy and distribute the software. In 1985 Stallman founded the Free Software Foundation, a non-profit corporation dedicated to supporting the growing free software movement, and in 1986 they published the first formal definition of free software, which is based on four freedoms afforded to users: the freedom to run the program for any purpose; the freedom to study how the program works and change it to suit specific requirements, which is predicated on access to the source code; the freedom to redistribute copies; and the freedom to distribute copies of modified versions.

In 1991 Linus Torvalds developed the first iteration of the Linux kernel, the core of another operating system modeled on Unix, and the next year he released it under the GNU GPL. Linux soon became the first massively collaborative software project, with thousands of developers contributing code over the next few years. In 1997 a programmer named Eric Raymond published an essay called The Cathedral and the Bazaar, in which he analyzed the development of Linux and similar collaborative projects. The title refers to what he saw as two distinct models for developing free software. The cathedral model refers to a carefully planned and organized process that's closely supervised from start to finish, at the end of which the product is released with its source code. The bazaar model is the situation like Linux, where development happens collaboratively over the internet by individuals with distinct agendas and strengths. His argument is that this model has inherent efficiencies that come
from having lots of people looking at the code and thereby finding and fixing problems more effectively, which he summarizes in the aphorism, given enough eyeballs, all bugs are shallow.

It's around this time that open source comes into existence as an alternative to free software. There's still controversy about the distinction between the two, but a good way to think about it is that free software is more concerned with what users can do with the software, whereas open source adds on to that this collaborative development model and is really focused on the development model. More often than not nowadays you might hear people talk about free and open source software, which covers both, but usually when we say open source we kind of mean all of this stuff together.

So in talking about open source as a development model, there are some key principles that characterize it. First is the idea that users are potential co-developers. It's useful to have users that tell you that your software is broken. It's more useful to have users that tell you why your software is broken, and because they have access to the source code, they can look at it and tell you what you did wrong, and if given the opportunity they can fix the problems for you and contribute the code back into the project. Following from this idea is the motto of release early, release often. When you have this potential for a community of developers to look at your work, it's better to get a minimum viable product out, and then when users find bugs or discover potentially useful features that aren't there, you can address these in a timely manner and re-release. It's often desirable to release multiple versions, so you'll have one version that's basically stable, with the bug fixes and new features tested and vetted, and then you'll have one or more development versions, which include new bug fixes and new features and are made available for scrutiny by the community. Open source projects tend to be modular in their
design. This allows developers to work on different parts of the code according to their individual interests or strengths. It also promotes the reuse of code, because it makes it easier to borrow discrete pieces of functionality to use in other places. And in order to keep these projects organized, there needs to be some kind of structure in place to manage what fixes go in and when the releases happen. This usually falls initially to the original developer, but as more people get involved, usually a core group of maintainers will form around the project to steward it into the future. The thread that runs through all of these things is this idea of community, which is really central to the open source model: a community of users and developers working together for mutual benefit.

So as I mentioned in the beginning, what makes software open source is that it's released with a license that makes it explicit what the users of the software are allowed to do. Just releasing the software on the internet doesn't make it open source, because copyright law in the United States and elsewhere applies automatically, even if you don't do anything. It's necessary to have the license there to make it explicit that users are free to do whatever they want with this software. There's a group called the Open Source Initiative, which was founded in 1998 to advocate for open source software development. One of the big things they do is evaluate licenses to determine if they can truly be considered open licenses. They publish a document called the Open Source Definition, which is sort of a checklist of things that must be present in a license for it to be considered an open license. It includes the four freedoms that I mentioned before and also things about how the software can or cannot be distributed, and it makes explicit that the software needs to be available to anyone for any purpose, regardless of who they are or what their field of endeavor
is.

There are a lot of open source licenses available, and while they all meet these criteria that I was just talking about, there are some subtle differences between them. For example, the GPL requires that derivative works be released under the same license, that is, under the GPL, which some believe to be too restrictive. The MIT License is probably the most permissive and straightforward. The Apache License incorporates elements of patent law into it. So it's good to be aware of what the differences are between the licenses. The ones you'll probably see most often are the GPL, the MIT, and the Apache license. And if you're releasing open source software, you can release it under one of these existing licenses. It's actually better to do that; it's better to use a license that people are already familiar with, so they know automatically, without having to read it, what they're allowed to do.

So open source software comes in a lot of forms. There are command line utilities, desktop applications, operating systems, but as I mentioned, one of the models that is becoming increasingly popular, not just in the open source community but in general, is the web application model. You've used web applications before. If you use the web to write email, to edit documents, to watch TV, this is a web application. And the same technology that makes it possible to do these things also makes it possible to do things like manage our collections, which is something for which we have traditionally depended on a piece of software installed on a specific computer or on a bunch of different computers. The main advantage of the web application is that you install it in one place, on a server, and any user with a web browser and access to that server automatically has access to the application. Web applications minimize system requirements for users, so they don't require a lot of hard drive space or a particularly large amount of RAM. If you can browse the web, you can use the application. They provide
compatibility across platforms and devices. Nowadays most developers will build their applications so they work as well on mobile devices as on laptops and desktops. And web applications provide an increasingly rich user experience as web technologies continue to improve. Excuse me, I'm saying a lot of things.

So web applications can be thought of as having a three-tiered structure. There's the presentation tier, which is the level at which the user interacts with the application. This in turn interacts with the application logic tier, which is the level at which the actual code is running; this is sort of the heart of the application. And then the application logic interacts with the database or storage layer, which is where the data that's accessed or manipulated by the application resides. Looking at this a different way, web applications are usually talked about as having a client side and a server side. The presentation layer lives on the client side, in the web browser, which communicates with the server side over HTTP, the standard protocol for data transmission on the web. On the server side there's an HTTP server, which receives the request from the browser and passes it to the application logic, which then interacts with databases and does whatever it needs to do to prepare a response, which it sends back to the HTTP server, which in turn sends it back to the client.

So I want to talk about some of the basic technologies behind all this, starting on the server side. In talking about server-side technology, this thing called the LAMP stack is as good a place to start as any. The LAMP stack is a very common set of open source software components used to build and deliver web applications. LAMP is an acronym for the four primary components: Linux, Apache, MySQL, and PHP. It's likely that if you have access to a web server, it's running some version of the LAMP stack already; it's fairly ubiquitous. There are alternatives to each component, both
open source and proprietary, but the LAMP stack serves as an archetypal architecture, so in looking at it we can get a handle on how things work on the server side.

So we talked a little bit about Linux already. Linux is an open source operating system built around the Linux kernel. It's a very common operating system used on web servers, and there are a variety of Linux distributions available that bundle the OS with other utility software. It's not uncommon for this piece of the stack to be substituted with Microsoft Windows Server, which, as you might have guessed, is not open source, but it can still be used with open source components, and that's kind of the point of open source: you can use it with other software regardless of what it is. When you substitute Windows you usually call it WAMP instead of LAMP. That's fun to say.

The next piece is the Apache HTTP Server. As I mentioned, the HTTP server accepts HTTP requests, processes them, and sends back an HTTP response. There are other HTTP servers available, but Apache is really the most common by a pretty wide margin. The Apache Software Foundation, which maintains the Apache server, is a community of developers that build and maintain open source software, which is released under the Apache license, which I mentioned. They currently maintain about 150 projects. The HTTP server is really the flagship project, and it's the one that is commonly just called Apache.

MySQL is an open source relational database management system, which I never have to say out loud, and has traditionally served as the data storage component in the stack. A relational database is one that stores data in a set of interrelated tables. MySQL is very widely used and is the default database for a lot of developers, though a number of open source alternatives have become popular. Postgres, or PostgreSQL, we usually just say Postgres because we're lazy, is an object-relational database
which is a hybrid between a relational database and an object-oriented database, which is more than we really need to get into, but suffice to say that it offers a lot of functionality that isn't available in MySQL, so it's preferred by a lot of developers. There's also a class of databases that are commonly called NoSQL databases, which store their data in some format other than tables. This can include storing data as flat documents, which is how CouchDB and MongoDB work. Redis stores data as discrete key-value pairs, lots and lots of little pieces of information. And there are triple stores, which store RDF triples, which are used mostly for linked data applications. It's worth noting that there are some widely used proprietary databases like Oracle and Microsoft SQL Server, which again can be used in conjunction with open source components.

PHP is a programming language commonly used in writing web applications. When we talk about PHP being part of the LAMP stack, what we're really talking about is the PHP runtime. A runtime system is software that interprets and executes code written in a particular language. And there are other languages that can be used instead of PHP. PHP's been around for a long time and is widely used, but in the last decade or so you have languages like Python and Ruby, and Perl, which has also been around for a while, which are viable alternatives and can basically serve the same function. They have different strengths and weaknesses. The decision on which language to use is highly subjective and the source of tedious debate among developers that you should avoid. But yeah, they basically do the same thing. This is what you write the code in; this is what the application kind of is, in a sense.

One last piece of server-side technology worth mentioning is Apache Solr, which, as you might have guessed, is another Apache Software Foundation project. Solr is an open source enterprise
search engine that provides full-text searching, faceting, and a full range of features that you need to search over your data. It's highly scalable, which means that you can put a lot of data in it and still search over it really quickly. It's actually the most widely used search technology on the web, open source or otherwise, and it's a very common component in open source web applications that require some kind of search functionality.

So moving from the server side now to the client side, the most basic piece of technology there is HTML. If you know anything about web development at all, you've probably heard about HTML. It's the language used to mark up web content, and it provides the structural foundation for the content. When the browser reads and interprets an HTML document, it generates what's called the Document Object Model, or the DOM. This is a hierarchical representation of everything on the page and is the basis for pretty much everything that happens in the browser, so the DOM is kind of fundamental to everything that is happening on the client side in the application. HTML, as you may know, is Hypertext Markup Language, and in the beginning that's all it did: it was for marking up text and creating hyperlinks. But over the years developers have always found ways to make HTML do things that it wasn't really designed to do, and the standard has developed in response to these new uses. The current version, HTML5, includes native support for audio and video, local data storage in the browser, 2D drawing, improved interactivity, and a bunch of other things, mostly in support of HTML's role in web applications.

CSS is Cascading Style Sheets, which is the language used to define the visual presentation of web content. Originally this was a function of HTML, but having the presentation and the structure tied together in one document became increasingly problematic as websites became more complicated, so CSS allows us to separate the presentation
of the content and the structure of the content into two separate documents. CSS works for the most part by assigning display attributes to elements within the DOM. CSS lets you specify all aspects of layout and display, and the latest version, CSS3, even lets you do basic animation and transformation of elements on the page.

JavaScript is a programming language that is most often used on the client side, where it's executed in the browser. Over the last decade or so JavaScript has become a fundamental component in most web applications, and probably more than any other technology it is responsible for web applications becoming a viable alternative to desktop applications, because it enables developers to provide a user experience comparable to what users are used to in a traditional desktop app. There are three things that JavaScript can do to make this possible. First, it can respond to all kinds of user input: movement of the mouse, mouse clicks, clicking and dragging, scrolling, typing, selecting things in a form. Pretty much anything that you can do with any input device connected to your computer, JavaScript can respond to that and do something, so this is a very powerful feature. Second, it enables manipulation of the DOM, so the user can interact with elements on the page by moving them around, adding new elements, deleting elements, changing the content of the page, and manipulating objects on the page in a variety of ways. Third, it provides a mechanism for the browser to interact with the server asynchronously, so data can be passed back and forth in the background without interrupting what the user is doing. If you've started typing in a search box and it starts suggesting things that you might be searching for, that's the browser sending a message to the server every time you type a letter and the server sending back possible searches, and all this is happening in the background without you even knowing. In addition to these things, JavaScript
is also a general purpose programming language, so it can pretty much do anything that can be done on the server side with PHP, Ruby, or Python. Because of this, you have a growing number of applications that are written entirely in JavaScript, running in the browser, that only interact with the server occasionally to get new data. This kind of changes the model, the diagram of the server side and the client side I was talking about before, so some of the application logic layer is now moving to the client side. This trend has led to the development of Node.js, which is a runtime system for executing JavaScript on the server side, so the code that still needs to run on the server side can also be written in JavaScript. So it's JavaScript all the way down.

So when a developer sets out to build an application, we don't just sit in front of a blank document and start typing. We usually start with what's called a framework, which is a set of code components that provide a lot of the functionality that's common to most web applications. So rather than spending the time doing the basic stuff that all web apps need to do, we can spend time solving the problems that our applications are intended to address. And by building our apps using a framework that a lot of other developers use, it makes it easier for others to look at our code and understand what's going on, because it's placed within a familiar context. So in building open source web applications that hopefully other developers are going to use and modify and build on top of, it makes a lot of sense to start with a commonly used framework. Some of the things that frameworks provide are routing and URL mapping, letting the application know which piece of code needs to be used to handle an incoming request; page templates for generating documents sent back to the user; interaction with the database and a model for mapping objects in our code to records in the database, which is called
ORM, or object-relational mapping; dealing with security issues, to keep the evildoers and robots out of our systems; and conventions for organizing code, which is more of a design principle than a functionality per se, but which goes back to what I was saying about providing a familiar context so that other developers can understand what's going on.

So some popular frameworks that you'll see: there's Ruby on Rails, which is the one I use pretty much every day, which, as you may have guessed from the name, is written in Ruby. There's Django, which is written in Python. There are numerous frameworks for working with PHP, and there's an increasing number of frameworks for writing applications in JavaScript. These JavaScript frameworks have been getting a lot of attention recently and are largely responsible for the growing number of pure JavaScript applications available on the web.

So those are some of the technologies at work in developing web applications. If you're a developer, you probably know that already. If you're not, it's good to just know what these things are, so you're familiar with what's going on with these applications. But what I want to stress is that you don't have to be a developer to implement open source applications. There are varying levels of involvement, beginning with the most basic, which is just finding a project that solves a problem that you need to solve and trying it out. You may need help getting it set up. Usually developers will provide some kind of tutorial on installing and getting things going, or if you have an IT person that you can work with, work with them. But if you're not as technically adept as others, don't be afraid. Just jump in and try it, and you'll be better for it. Once you get working, you might find things that you need to change to suit your needs, and while this requires a bit more technical expertise, the point is that in
working with open source applications, this is something that you can do. You can't do this with Microsoft Excel. If you don't like something that Microsoft Excel does, that's too bad. But with an open source application, you can change it to do whatever you want. And if you think that the modifications that you make might be useful to others, you can contribute them back to the original project. And if you really want to just build something from scratch, maybe for a problem that hasn't been solved already or that you think could be solved a different way, do that and release it under an open source license so that other people can use it.

And finally, I want to come back to this idea of community. If you have implemented a piece of open source software, you are a part of the user community for that software. Most projects have some mechanism whereby users can communicate with each other and with the developers, either through mailing lists, Google Groups, or on GitHub, which you're going to hear about in a minute. As a part of the user community, you should feel, if not obliged, at least encouraged to participate, to seek help when you need it, and to provide help to other users when you can. I've already talked about the community of developers that forms around the project, the people that write code, contribute code, contribute fixes. But the last community that is really important is the community of supporters. A lot of large open source projects and organizations that support open source development operate on a membership model, and smaller projects depend on donations from users to keep going, so sustainability is really an issue in open source projects. If you or the institution you work for are able to contribute financially to help support these projects, you should really consider doing that.

And that's all I've got. So if you have any questions or you want to get in touch with me,
that's my stuff. I'll eventually put my slides there. Quick plug for where I work, North Carolina State University Libraries, Digital Library Initiative: these are some of the projects that we've built, a lot of which have been made available as open source and are on GitHub, so check them out, try them out, and thanks. I think I have a minute if you have questions. Questions after? Excellent, thanks.

All right, hi everyone. I'm Lauren. Thanks, Chris, for inviting me to do this presentation. So at its most basic, GitHub is a collaborative, social media tool, actually, that works with a version control system called Git. According to a recent article, it has 6.8 million users and 15.2 million code repositories, more than double the respective numbers recorded two years ago. It's mostly used by coders, but there are several cases where GitHub has been used in alternate ways, like for example this travel log, which I cannot link to from the presenter mode. Someone actually made a travel log out of a GitHub repository, where they had people contribute by pushing and pulling, which is basically the idea of giving and checking and contributing code, but in this case it wasn't code, it was travel tips and things like that. And then there's also another example I have here of a GitHub repository that was used for music. It was a notation guide that was created, and it was contributed to by people all over the world. So that's just all to say that GitHub and Git have many different use cases. The archival community has different needs, the developer community has different needs, and we can all come together, also for digital preservation and access. So that's mostly what I'm going to be talking about today.

So what is Git? Git is a version control system
developed in 2005 as a part of the Linux project. The original creator, with others' contributions, is Linus Torvalds, who also invented the Linux kernel, which Trevor mentioned and which is the basis for Ubuntu and other open source operating systems for servers and for desktop applications. Git is unique among version control systems for software in that it works best for distributed or open source models, where community and non-linear, decentralized contributions are welcome. There are multiple checks, called pushing and pulling, so that those who are committers to a particular repository or application can gather from other users in the community that wish to contribute. And then separate from Git is GitHub, which works with Git, and it's a social way to share versions of things, things being historically mostly code and associated documentation, so that making community contributions is easier.

We're going to run through a quick demo of how to use Git with GitHub. I just want to note that I got a new job in January and I haven't been using it for a while, so I had to relearn this over the course of the last two weeks, so please be patient with my demo. But I think it's a good lesson to also just work in the command line, and that's what I'm going to be working with today. In my experience it's a lot more straightforward. GitHub also has a GUI, a graphical user interface, but I find that to be more complex, and it's just a little bit easier to work in the command line. Don't be afraid of the command line.

So I'm going to go into my demo here. Let's see. So, signing up for GitHub: I'm going to log out here because I already have an account, but you just go to github.com and you can sign up. Pick a username, your email, a password, like many social media sites nowadays. I'm going to go ahead and sign in
because I already have an account, and show you how to create a repository on GitHub. This is separate from what we'll be doing next, which is creating a local repository that will live on our own computer, like our desktop computer. These two, networked and local, will have a relationship, but we'll see that in a moment. I'm going to go to my user account; this is the page that every user gets, and hopefully the internet will behave. Let's tour the home page a little bit. You can see that there is a social stream of people that you can follow. I follow Dave Rice; Hannah Frost I follow as well. Oh, there we go, it decided to go to my account. Here are some contributions that I have made. We had the hack day yesterday, so I contributed to that. I have forked dpdp scripts, which is a project that I used to work on at my old job, so I forked that over to my account so that I could make changes to it and work on it. Repositories contributed to: these are different repositories that I have made some changes to or am a member of. And you can see here contributions that I have made in the past, based on these colored dots. What I'm going to do first is create a new repository. You can name it anything; I'm going to name it AMIA14. You can put in a description; I'm not going to do that. You can make it public, and this is one of the key points of GitHub that's really nice: you can keep it public so that others can use it, fork it, make changes, send their changes to you to approve, things like that. I'm not going to initialize it with a README because, well, we'll see soon. Create the repository, and what's really nice about creating it remotely first is that you get these nice instructions for creating a new repository in the command line. So what I'm going to do
first is I'm going to go back to my terminal and type cd, which is change directory, into AMIA14. Oh, I did not make that. So I'm going to make a directory; mkdir is the command to make a directory, which just makes a folder in whatever directory you're currently in. Right now, let's see, I have to cd into Desktop, and I'm going to say mkdir AMIA14. Then, in order to make my Git repository, I'm going to say git init, and that creates an empty Git repository. What that does is track any files that you put into it, as you command it to do so. What I'm going to do first is create a README file. I'm going to use touch, which is a command that makes a file: touch README.md. You can also do this in TextWrangler or any text editor: create a new document and save it wherever the .git folder is located. You won't see any kind of result from that, but it will be there. So I am in AMIA14, and I'm going to say tree. That did not work. All right, so we're going to go back into this and say git add README.md. I think I know what happened: I did not add the file to the version control repository, the .git folder, so it's not tracking it yet. We can say git status to look and see what is happening with the file. So we have a bunch of untracked files. Oh, I'm still in Desktop, that's what happened. All right, again, thanks for your patience with me. So I'm going to say help. I think I'm going to go back to the PowerPoint. Basically, after you've made a directory called AMIA14 and typed git init in it, there's a nested folder beneath it called .git, and that .git folder tracks all of the files that are in that repository. I'm going to go ahead and go back to my demo and configure the repository so that it communicates with my user in
GitHub. So it'll be git config --global user.name with my username on GitHub, and I'm going to configure my email as well. All right, so we're going to take some deliberate actions to track the file that we've put into this folder that we're now tracking. I'm going to say git add . (git add dot), which adds everything that's in that folder, which I actually already did, but yeah, this adds it. We're going to say git status to look and see what we're doing here, and then we're going to commit it to the local directory. So I'm just going to type in git commit -m "my first commit". All right, so it added a file, it created a unique identifier for that file, and it is now tracking that file because we've committed it. Now what I'm going to do is remotely add that file over to where I created that Git repository, which is this instruction right here. You can also see this when you go to AMIA14, the newly created repository on github.com; it'll be on the sidebar here. So I'm just going to copy and paste git remote add origin, and then go to git status again just to see what's happening with this. Nothing to commit, but untracked files are present, so I'm going to go ahead and do git add, and it added the file. You can look at git status. What we want to do is push the file. Pushing means you are pushing the file up to GitHub, so git push -u origin master, and it'll take a minute because it's communicating with the network now, it's not just local. You can see now that it has done what it wanted to do: it pushed it up to AMIA14.git, and you can see here, when we go to the repository, that it has a file in there. So we went ahead and pushed the file from the local instance, which can be altered in any way, committed to, and then pushed up to a public repository, which anyone
can go to and fork and alter and do whatever they want with. So it's a nice collaborative way of working, and a lot of open source projects work with Git because it's very amenable to this. So how do archivists and cultural heritage workers use Git and GitHub? There are different uses: for building software, and for finding software that's useful for cultural heritage organizations. Like I said, I haven't worked with Git in a while, that's why I stumbled a little bit, but it's a really useful tool if you have a certain application that you're looking for. Say you want to implement checksums in your workflow: I would really encourage you to go to github.com and just search for checksums. There are a lot of institutions out there that are working in open source and can actually provide code for you to work with, and have documentation about what to use. In the past I worked with an application called PREMIS, where we made steps to create an XSL document that will create PREMIS records based on the digital repository system that we were working with. QCTools is a project that BAVC is a part of that's also open source and on the web. The AMS, which is the project that I'm working on right now, the American Archive, has a GitHub repository as well. Some of these are not out of the box; you can't just go and download them. But if you work with it, it's a really good opener for working with your IT department and creating those connections with developers and with people working in IT in your institution. There are also different use cases for different institutions. For distribution: my understanding is that the Library does not use GitHub for production purposes or project management purposes, but they use it for distribution only, so they'll make the code in another system and then push it to
GitHub when that's ready for distribution. For project management: PBCore is now in active development, and I'm working on that, and we're actually tracking requested changes from the community in GitHub via the Issues tab. So that can also be used for projects that don't necessarily have to do with web applications or specific programs like that, but for metadata work too. For collaborating on coding projects: like I said, forking, pushing, and pulling, this kind of non-linearity that is really essential to Git and GitHub, is something that you can really work with. It can be helpful if you want to learn to code: you can go on to GitHub and make what you want, and use code that's already been created and improve upon it. To get practical hands-on experience, if you're interested in that, there are many YouTube tutorials, there's lynda.com, and there are schools that you can go to now, because developers are so prominent in our economy right now: Flatiron, Hackbright, General Assembly. There are many others that are online. I just wanted to quickly shout out some tutorials that are out there, and some cultural heritage institutions and individuals that are on GitHub; you can create an organizational account or you can create an individual account. So we have AMIA Open Source, which is the open source committee here in AMIA, which has a pretty active GitHub; WGBH; AVPreserve; Dave Rice; edsu, which is Ed Summers, who now works at MITH; BAVC; and the Library of Congress, which has the BagIt tool, among others. So yeah, I think it's important to note that, kind of like most things in our field, you know, Steenbecks, film rewinds, professional analog video decks, digital preservation and access is often a hybrid of production process use cases and making that
work for archives. So tools that might not pertain to exactly what we're doing with archiving, we can kind of alter for our own uses. I think this is another way of, you know, hacking a film rewind that used to be used for editing for archival purposes, that kind of thing. And yeah, I think that's it. If anyone has any questions, that's my contact info. Thank you.
There's nothing inherent about open source that makes it any more or less secure; it's really about the software itself. If you remember, I talked about development frameworks, and security features are built into a lot of these frameworks. So if you want to somehow analyze how secure or insecure a project is, a good place to start is, if it's built in a framework, you can get some advice on how secure the framework is. Rails, which is the one I use a lot, has had sort of a bad reputation for security in the past, but it's improved a lot. So that's the place to start. But the point is that there isn't an open source versus proprietary distinction; it's kind of a non-issue. It's really what the software actually is. So unfortunately it's not that simple, or maybe fortunately it's not. Nice, that's a better answer.
Is there, excuse my ignorance here, the issue of open source in the sense that software would be free and open to anybody, but is there a type B where there would be limits in so far as adapting it? Let's use a mythical example. Let's say someone were to do a front end for, like, LTFS archiving with LTO tapes, and you wanted to make something that would be an archival-safe front end to that, and you wanted to develop it and then give it away free. But if people can then be modifying it for their own uses, it would potentially make it less archival; someone could do an adaptation of it that would somehow hurt its universality. So something
that can be free and open but not modifiable by users: is there a type B of open source?
Well, that's just free in the generally understood sense of free, as in you don't have to pay for it. The example you're describing would not be open source; it would not be free as in freedom of speech. People talk about free as in freedom of speech versus free as in free beer, and what you're talking about is free beer. That is a model, you can do that, but it wouldn't be considered open source. There's nothing wrong with giving stuff away for free. I'm all for it, the more the better, really.
There's a term that Lauren used several times, forking. The way most Git repositories are set up, people could not alter your work; if they wanted to alter it, they would have to duplicate it, or fork it, to their own repository, and then they could alter it. So you have a main branch and it forks off of that. And actually, when I talked about the open source definition, that's allowable in a license: you can specify that modifications have to be done via patches and that the original code has to be maintained. But that doesn't keep people from changing it to do other things. So it depends on what you actually want to keep people from doing. You can always maintain the integrity of the original code, but if it's open source you can't keep people from doing other stuff with it.
Yes, definitely. There are a lot of tutorials out there for command line applications. With the command line, basically you're manipulating files and folders in just a different way than you would in, say, your Finder on a Mac. So if you look up the Unix command line, a lot of the Mac command line is based in Unix, so you can do a lot of
futzing around with that, and it's actually really fun. So you can do that, and then also with Git, you can go to git -h, I think it's --help, and that gives you a whole list of the different commands that you can use within Git, so you can just play around with it.
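For anyone who wants to play around with the commands from the demo, here is a minimal sketch of the sequence (make a directory, git init, add a README, commit, add a remote, push). It rests on a few assumptions that are not from the talk: a local bare repository stands in for the empty repository created on github.com so the steps run offline, the name and email are placeholders, identity is set for this one repository rather than with --global, and the branch is explicitly named master because newer Git versions may pick a different default.

```shell
#!/bin/sh
set -eu

# Scratch area so the sketch never touches real projects (hypothetical location).
WORK="$(mktemp -d)"

# Assumption: a local bare repository stands in for the empty GitHub repository.
git init --bare --quiet "$WORK/remote.git"

# mkdir, cd, git init: create the local repository.
mkdir "$WORK/AMIA14"
cd "$WORK/AMIA14"
git init --quiet

# Identify the committer. Placeholder values; the talk uses --global,
# local config is used here to avoid changing machine-wide settings.
git config user.name "Example User"
git config user.email "user@example.org"

# Create README.md, look at the status, stage the file, and commit it locally.
touch README.md
git status --short
git add README.md
git commit --quiet -m "my first commit"

# Name the branch explicitly so the push below matches the talk's "origin master".
git branch -M master

# Point "origin" at the remote and push the commit up to it.
git remote add origin "$WORK/remote.git"
git push --quiet -u origin master

# Show the commit that now exists both locally and on the remote.
git log --oneline
```

With a real GitHub repository, the `git remote add origin` line would take the URL shown on the new repository's page instead of the local path, and `git --help` lists the other commands to explore.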