Thank you. Thank you, everyone, for coming to the first-ever Libre Graphics track at SCALE. It's nice to see so many people here. Hopefully this gives us ammunition for coming back next year. It would be awesome if you want to see these kinds of talks again and are tired of the cloud stuff. Which is great, totally, cloud is great. If you want to see more of these talks, please make some noise on social media, use the hashtag #scale16x, and shower us with love.

So, Git for photographers. First, a little bit about myself. My name is Mike Samrick. I've been a photographer for about 20 years; I started on film and moved to digital. My first GitHub repo was December 1st, 2009, I guess the longest one that's still up there, at least. And I would be remiss if I did not plug the pixls.us community. We're trying to build a community of graphic artists who use Libre applications, and we do not want to bind ourselves to any particular application, because I think everyone uses multiples.

So Pat told me I could not have a talk about photography without having pictures, so I have interspersed some of my own work throughout the presentation. I mean, it's managed in Git, so I guess it has something to do with it, but maybe not. I like shooting pictures of junky stuff, so that is number one.

I guess if you wanted to only use Git, you could add all the raw file types to your .gitignore and then only check in your XMP and PP3 and other sidecar files, and that would be better than nothing. But I wanted a full solution. I eventually found my way to a program called git-annex. It's an add-on to Git itself, and you can find it at git-annex.branchable.com. It allows you to manage files with Git without checking them into Git, which sounds a little paradoxical, but we'll get into that.

The git-annex core is GPLv3. If you like GUIs, there is a web GUI that is AGPLv3 that helps you set up stuff like syncing automatically, so when one repo sees another repo, they automatically exchange content.

It's developed by Joey Hess. I think he's pretty well known; you've probably used his software: divin and the Debian installer, ikiwiki, etckeeper, and a whole bunch of other stuff. I believe he is at joeyh.name if you want to know more. Also, it's written in Haskell, so if you're into that kind of thing, I'm sure he would appreciate the help. I believe it's mostly him developing it on his own; I know there's a couple of people making small changes. He is funded by, maybe Liberapay now; he was on Patreon. So if you find it a useful tool, as you should with any open source you find useful, please, money helps.

Features of my Git workflow. Git-annex gives you all the flexibility of Git. In my own workflow, after you do a git annex add, all the files are read-only. I like this a lot because I believe you should preserve the original source. I mean, raw files don't change anyway unless you rewrite the metadata, and really, if you want to do that, you should just write the sidecar.

You can also fight bit rot: git-annex hashes your files and gives you an easy way to do file redundancy. You can easily keep track of your collection over multiple disks. So if you do have terabytes and terabytes and terabytes of data, you can put one repo on each disk and split all of your files between them.

Versioning sidecar files is awesome if you've ever had an application decide that it was going to spew a bunch of junk into your sidecar files. Previously I had no way to go back, but now I do, and that's awesome and has saved me a lot of time, many times.

In git-annex you can have views based on metadata. When you git annex add, a script will run that uses exiftool, and you can tell it what metadata fields you want to put into git-annex. Then you can say: show me this field, show me all of my ISO 100 pictures, show me everything shot at f/2, and pretty much any other IPTC metadata field you can sort and show.

It gives you simple PGP encryption.
So if you're like me, maybe the cloud backup is not for you. I don't trust it, even if it is encrypted. So some of my disks rotate off-site, and they're all PGP encrypted. That's full repo encryption, so the tree and the files are all encrypted with PGP. Easy off-site backup. Also, if you do love the cloud and you want to put your files there, git-annex gives you a simple way to push to the cloud, give you the redundancy, and let you manage what is there.

That's the middle of the desert, and that is a Chinese foo dog, used to demarcate the property lines. There's a lot of nothing out there.

So the other kind of competitor in this space, I suppose, is Git LFS, developed by GitHub. Everyone loves GitHub, yay. So why didn't I choose this? I started using git-annex, and git-annex was available before LFS was ever conceived, so I was already using it when they announced the first version of Git LFS. In my poking around to see if it was a better fit for this kind of workflow, I found it still requires hosting. Git LFS hosting on GitHub is pretty pricey, and they have some weird things, like your LFS files don't follow forks. So, yikes. It's still centralized in that you need hosting. One free platform I know that supports Git LFS is Gogs, and when you set it up you need to define a separate data store for the LFS files. So your files are not actually in your repo; there's just a pointer in your repo to where they are somewhere else. There's probably a bunch of other things I missed, but those things disqualified it for my use.

So, the prerequisites for running git-annex: Git, git-annex itself, and exiftool. A pretty recent version of git-annex is available in Debian stable, which leads me to believe that it is probably available in a newer version almost everywhere else.

This is more junk and stuff in the middle of nowhere. Do those look okay on the projector?
So setting up a repository is pretty standard Git. You need to make a new folder, so at the command line, mkdir new folder and then change to it. Initialize the repo as standard Git, git init, and then you want to tell git-annex what the name of your repository is. This makes it a lot easier when syncing between multiple remotes. So git annex init and the name. The name is actually optional, but I highly recommend it. Once you get to more than three clones of your repo, it's very useful to tell what is where. Otherwise it will show you just the Git remote name, and sometimes that's helpful and sometimes it is not.

So, to add your raw files to git-annex after you've initialized your repo, you want to copy your raw files into your Git folder. I personally use an exiftool script that copies and renames all my stuff, but you can use your file manager, you can use Rapid Photo Downloader, you can use cp, you can use rsync. So many ways, all valid. Add the files to the annex: git annex add, your file name, .nef. I shoot Nikon, so .nef it is; you can substitute or add .cr2 or whatever you want. And commit the changes to Git. Again, that's standard Git: git commit -m, and a commit message is nice.

So you want to know what happened when I used the command git annex add? That is where a lot of the magic is. The first thing git annex add does is hash your file. The default, I believe, is SHA-256. That is configurable to a whole bunch of different backends. I know there's SHA-512; they moved off of SHA-1 and MD5. There's a backend that's useful if you burn stuff and archive it to CD media, called WORM. But it's very configurable. That same file is moved into .git/annex, and then a bunch of UUIDs based on the Git pack format, and then the file is renamed using the hash of that file. Finally, a symlink is created where the file originally was that points back to the .git/annex/ SHA-256 file name. So after that's done, when you go to commit, what you're really committing is the symlink back to the file in the annex space.

So you've added your immutable raw file. You open it in darktable or RawTherapee or some other editor. You will usually get an XMP from darktable or a PP3 from RawTherapee, and you can use the standard Git commands to add that and track it, since it is just a text file. So git add file name .xmp and then git commit, "I added a sidecar file." If you git annex add your raw file and then don't commit it, and you go work on it and you get an XMP, you can add them both and commit them in the same commit.

So that's awesome, but adding XMP files and raw files separately is not awesome. You probably don't want to differentiate between the two when you use git add or git annex add, because that's really inconvenient. So you can tell git-annex to use the regular git add when you're using git annex add, and you do that using the annex.largefiles setting. You can use multiples of this: git config annex.largefiles, and then include a wildcard match for whatever type of binary files you want to add into the annex. You can also tell it to exclude any files, which will add them in the standard Git way. That way, when you go to add, you just git annex add period, and XMPs will be checked into Git and your raw files will be checked into git-annex, and it makes things a lot easier.
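
As a rough illustration of what the talk describes for git annex add (hash the file, move it into a store under a name derived from the hash, leave a symlink behind, and let sidecars go to plain Git), here is a toy Python model. It is only a sketch and not git-annex's implementation: the `.toy-annex` directory, the `SHA256-<hash>` naming, the `LARGE_PATTERNS` list, and all function names are assumptions made up for this example, and the real key format and directory layout are more involved. It assumes a POSIX file system with symlink support.

```python
import fnmatch
import hashlib
import os
import tempfile
from pathlib import Path

# Hypothetical pattern list standing in for an annex.largefiles-style rule.
LARGE_PATTERNS = ["*.nef", "*.cr2"]


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large raw files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_large(path: Path) -> bool:
    """True if the file name matches one of the 'large file' patterns."""
    return any(fnmatch.fnmatch(path.name.lower(), p) for p in LARGE_PATTERNS)


def toy_add(repo: Path, path: Path) -> str:
    """Route one file: large files go to the store, the rest stay put."""
    if not is_large(path):
        return "git"  # e.g. an XMP sidecar, left in place for plain Git
    store = repo / ".toy-annex"  # simplified stand-in for .git/annex
    store.mkdir(exist_ok=True)
    target = store / f"SHA256-{sha256_of(path)}{path.suffix.lower()}"
    os.replace(path, target)  # move the content into the store
    target.chmod(0o444)       # the stored content becomes read-only
    # Leave a relative symlink where the file used to be.
    path.symlink_to(os.path.relpath(target, path.parent))
    return "annex"


def toy_verify(path: Path) -> bool:
    """Re-hash the content behind a symlink and compare with its name."""
    target = path.resolve()
    recorded = target.stem.split("-", 1)[1]
    return sha256_of(target) == recorded


def demo() -> dict:
    """Add one fake raw file and one fake sidecar in a scratch directory."""
    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp)
        raw = repo / "DSC_0001.NEF"
        raw.write_bytes(b"pretend raw sensor data")
        sidecar = repo / "DSC_0001.NEF.xmp"
        sidecar.write_text("<x:xmpmeta/>")
        return {
            "raw": toy_add(repo, raw),
            "sidecar": toy_add(repo, sidecar),
            "raw_is_symlink": raw.is_symlink(),
            "sidecar_is_symlink": sidecar.is_symlink(),
            "verified": toy_verify(raw),
        }
```

Because the stored name carries the hash, the content can be re-checked at any later time by hashing it again, which is the idea behind the bit-rot protection mentioned earlier in the talk. What gets committed in this model is only the small symlink, never the raw data.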
This is actually at the Salton Sea, which is an environmental travesty but a very interesting place. The largest lake in California; it has a higher salt content than the Pacific Ocean.

So adding Git remotes is what will allow you to have redundancy. In my own personal setup, I have six external disks and a copy of my repo that sits on my main editing workstation. They're all able to talk to each other independently. So if I plug in disk one and I sync my desktop to disk one, I can disconnect that, plug in disk two, and sync them. I can take disk one and disk two, plug them into another computer, and sync them, and they're all magically tracked. That's super nice. It really uses the standard Git remote stuff. So: clone down your original repo, change to that directory, tell git-annex that you want it to be an annex Git repo, give it a name so that you know what it is, and then change back to your original repo and add the other one as a remote. This is just like for adding your external disk on the file system. The other nice thing is, if you don't mount all the disks at the same spot on different computers, you can add multiple remotes that point to the same repo, and it's smart enough to figure out what's what and sync your content appropriately.

So some of the special cloud sauce is in git-annex special remotes. You can sync to Glacier or S3; bup or ddar, which are deduplicating archive formats based on the Git pack format; rsync; or directory, which lets you push only the files, without the actual Git repo, to a location. So you can do things like push your files to a file server and then push only the Git, without the annex part, to a centralized server, and you can bring them back together. WebDAV, BitTorrent, and then all of your online types of drives: Box, Dropbox, Google Drive. The list is really super long. The add-ons that you can get for that are all at the git-annex home page, and there's instructions for all of them. It's basically just an extra flag on the remote, and you give it user credentials, and it adds, and then you can stick all your files there.

So once you have multiple remotes, you'll want to sync the content. If you want to sync only the Git tree and not the files, it is git annex sync. That will find all of your available repos, whether they're local or remote, and it will sync just the Git part of the tree. If you add the --content flag, you will also sync all the content down. In my setup I have GPG-encrypted remotes, so it'll prompt me for my GPG password. I enter it and it syncs all the content there. I unmount it and rotate it off-site.

Part of what I like about git-annex is that I don't have to worry about my files going bad, because I have seven copies of all of them. So when I get to the point where I run out of drive space, I will manage them on multiple drives; I'll have a slice of my repo on multiple drives. You can add and remove content from your single repositories using git annex drop and git annex copy. But what I really want, when I have a single repo, is to make sure I don't drop a file that I don't have another copy of. You do that using git annex numcopies.
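
To make the numcopies idea concrete, here is a toy Python model of the rule as the talk states it: a drop is refused unless enough other repositories still hold the file. This is a sketch under stated assumptions, not how git-annex is implemented. `ToyRepo`, `can_drop`, `toy_drop`, and the in-memory sets are all hypothetical, whereas the real tool checks actual remotes before it lets content go.

```python
from dataclasses import dataclass, field


@dataclass
class ToyRepo:
    """A repository reduced to a name and the set of files it holds."""
    name: str
    files: set = field(default_factory=set)


def can_drop(filename, from_repo, all_repos, numcopies):
    """Allow a drop only if at least `numcopies` other repos hold the file."""
    others = [
        repo for repo in all_repos
        if repo is not from_repo and filename in repo.files
    ]
    return len(others) >= numcopies


def toy_drop(filename, from_repo, all_repos, numcopies):
    """Remove the file from one repo if the safety check passes."""
    if filename not in from_repo.files:
        return False  # nothing to drop here
    if not can_drop(filename, from_repo, all_repos, numcopies):
        return False  # refusing: too few other copies would remain
    from_repo.files.discard(filename)
    return True


def demo():
    """Three repos hold one file; require two other copies before a drop."""
    name = "DSC_0001.nef"
    desktop = ToyRepo("desktop", {name})
    disk1 = ToyRepo("disk1", {name})
    disk2 = ToyRepo("disk2", {name})
    repos = [desktop, disk1, disk2]
    first = toy_drop(name, desktop, repos, numcopies=2)   # two others exist
    second = toy_drop(name, disk1, repos, numcopies=2)    # only one other left
    holders = sorted(repo.name for repo in repos if name in repo.files)
    return first, second, holders
```

In this model the first drop succeeds because two other copies exist, and the second is refused because only one other copy would be left, so the file can never silently fall below the required count.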
I think two fewer than the number of drives you have is probably your best bet. I think I have my personal one set to three. So before I can drop content out of a repo, it has to know that there's at least three other copies before I can actually get rid of it. So I can't get rid of something that I don't have another copy of, and it makes it really easy to ensure that all of my data is there.

So you can drop content when you start to run out of disk space on an internal or external disk: git annex drop and then the file name. Since it's really just Git, you can use a wildcard. So if you wanted to drop all of your NEF files, you can do that. If you wanted to drop all of your CR2s because you don't shoot Canon anymore, like me, you can do *.cr2 and they will all go away.

Part of the awesomeness of having git-annex hash your files is that you are free to check them later. So the very standard fsck will iterate over your entire repository and check the hash against the original hash that it was checked in against. If you have any bit rot, it will tell you, and you can try to pull a fresh copy of that file from another remote. Unfortunately that works per repo, so I have a shell script that iterates over all the locations where my repos are and fscks them, so I can do it overnight. If you have a lot of files, I highly recommend the -q flag. That's quiet. If you don't use -q it will tell you that all of your files are okay, which is great, but I'd like to assume that they're okay and just be told if they're bad.

I think that's pretty much it. If anyone has any questions, we do have a microphone, maybe.

Do you recommend one git-annex for your entire library, or one git-annex for each photo set?
I have my entire library in one git-annex, and I have a separate one for XCF files if I take them into GIMP, and I have a third one that is just, this is a finished file that's ready to either go to print or go to social media or what have you.

Not a question, just a comment. There's a talk tomorrow about Attica, which is interesting because it solves a similar problem.

It does, and that one's very new, right? It's like Rust or something. Yeah, Rust is the new hotness, but if you don't like Haskell and you love Rust, then I guess Attica might be for you. I should go to that talk.

What about other binaries? Could this be useful for designers, for PDF, PSD files, et cetera?

Yeah, absolutely. I deal a lot with only raw because I mostly only do photography, but you can use git-annex with any binary file. The default mode of git-annex is to make every file read-only when you check it in, but like all great tools, you can make it not read-only, and you can work with it in a mode where it will sync to remotes but it will be unlocked. Also, you can just unlock if you don't like that: you can unlock the file, modify the binary, and then use git annex add again. It will hash the file again, add it to the annex, and then you check the symlink in. If you need to roll back, the old file is still in the annex; you would go check out the older symlink and you would be pointed back to your old file. And there's a whole subsystem of commands for dealing with files that don't have a symlink pointing to them, so you can have a really offline copy that holds only stuff that you're not working on, or stuff that doesn't have a symlink pointing to it. A whole bunch of stuff. Yes?

So you have a directory on a disk. Does git-annex actually replicate that content, like those images, across three different drives when you set that number to three, or...

Yes.

Okay.

Yeah, so you set the number of copies that you want to keep, and it will propagate all of them across all of the repositories that are available to it when you issue the command.

So those repositories are like terabytes in size, right?

If you need that much space, yes.

Okay. Thank you.

Do you work with or recommend any image differencing applications?

I don't really. I believe ImageMagick has a nice differencing algorithm: you can diff two images and it will paint them magenta where they're different. I believe BackstopJS, if you do any web development, uses that feature of ImageMagick. I find the versioned metadata more useful when I edit. I usually edit something and then I leave it for a day, and I come back to it and I ask myself if I still like it. If I don't like it, then I just roll back a commit or two and start over, or I check the whole thing out, like delete the history stack in darktable and start over. But I've done it both ways, where I've been like, oh, I don't like that edit, delete the history stack in darktable, try again, and then, I don't like that either, so I want to go back. So I just git reset the metadata file and I'm back to where I was.

Oh, that's nice.