Welcome to another edition of RCE. This is Brock Palen. You can find us online at rce-cast.com, where you'll find links to all of our Twitters and blogs, plus the entire back catalog. And this is special: this is a very special episode. This is episode 100.

This is not just another episode. Which episode is it, Brock?

It is 100. It is episode 100. The very first episode, Jeff was almost kicked out of this one, because Jeff was the first guest, and then I roped him into being on every show after that. George was on there too.

That is true. When was it? January 2009. We've been doing this over six years.

2009? Yeah, we've been doing this six years. How did we only hit a hundred episodes since 2009? Doesn't it seem like there should be a lot more episodes?

Yeah. Someone should go scrape the site and mine the data. Oh, no, no, no, don't ask people to do that, because they'll do it. Our audience will do it. Someone in the audience will do it, and they'll find out: oh yes, there were times when it was wonderful and it was every two weeks, and then there were droughts.

Yeah. That's a little embarrassing, but it happens. Sometimes life just gets in the way.

Yeah, it's all over the place. But it's still pretty awesome. Every now and then, on my blog, Failure as a Service, I'll put up our stats, the culture that is RCE, and the fact that ten percent of our clients come from "other Unix system," because everyone's rewriting their user agent. We're all tinfoil-hat people in this industry. We're terrible.

So who do we have for 100 today?

Okay, so our guest today is Eli Dart. This is a follow-up to our previous podcast, episode 99, which was on perfSONAR, how to measure your network. Eli today is going to talk about the fasterdata project. So Eli, why don't you take a moment to introduce yourself?

Hey, how's it going?
So my name is Eli Dart. I'm a network engineer in the science engagement group for the Energy Sciences Network, or ESnet. My job is to help science collaborations get the most out of the mission science infrastructure that we build and deploy.

Okay, so fasterdata. It's not software. It involves software, but you're not actually writing software. So tell us, what exactly is the fasterdata effort?

So fasterdata is a network performance knowledge base, essentially. It's a repository of a whole bunch of information that we have found to be helpful in doing performance tuning, or performance engineering, or helping science collaborations really effectively use the high-performance infrastructure that exists for their use in science.

Okay, wait a minute. Can you explain that a little? Because don't I just need my 10 or 100 gigabit uplink and I'm good to go? How much harder is it than that?

Well, think about this: you go and you buy this beautiful, shiny, 4,000-core computing cluster, you plug it all together, you slap on a default install, and your users are good to go, right?

Right? Not so. If you go and put together a high-performance network and you say: all right, it pings.
It's good, walk away? It is unlikely that you're going to get the performance out of it that you would really like, or that you paid for. So just like computing systems, networks have some configuration, some tuning, and ideally some test and measurement involved in making them into high-performance, scientifically relevant tools.

Okay. So on perfSONAR we talked a bunch about doing a lot of continuous testing to see where the bottlenecks are and whatnot. But you're talking more about, at least as a first phase, when you install your shiny new cluster, your shiny new network switch, or your shiny new uplink. What are typical issues that people run into that perhaps they didn't expect?

So there are a variety of things. On the one hand, you could have bought the wrong gear, and just not have something that's going to perform under the workload you're going to throw at it. So getting equipment that's capable in terms of the workload you're going to run on it is important, and the analogy to HPC holds there as well, right? You can also just not architect it right. There's an element of network architecture involved in making sure that the design is well suited to the tasks you're going to put to it. So there are a variety of things you have to consider, just as you would with any kind of major infrastructure investment.

So it seems like there are a lot of different pieces you would need to completely validate. We had perfSONAR, which was a collection of tools ESnet's involved with, but you have more: you have tuning, you have best practices, you have software to use that implements things differently than the common solutions. What are all the different pieces that fasterdata actually advocates for?
So fasterdata is a knowledge base, and we put things in there that are relevant to that set of tasks. If you want to look at the right framework in which to consider high-performance networking for data-intensive science, you're getting into something that we call the Science DMZ model, and there's a big section of fasterdata devoted to it. That's a set of design patterns for building and operating network infrastructure for data-intensive science, essentially. So a lot of the things you would consider in building and deploying this are covered by the Science DMZ model. And then there are a whole bunch of aspects of fasterdata that include more detailed information: how do I set this particular thing up, or how do I drive this particular tool? There's a whole breadth there.

So let's go more into the Science DMZ; let's get to a specific detail. Say I work at an HPC center at a large public university, and I have users with data, and they're logging in from all different parts of campus. I have a big distributed network. What would a Science DMZ look like at an institution like mine?

So there is no one true Science DMZ. The Science DMZ is a design pattern, not a legal specification. There are a few key components of the Science DMZ, and their specific instantiation depends very much on the environment that you're in, the budget you have, and the workload that you think you need to serve. The Science DMZ is an enclave, typically at or near the site network perimeter, designed specifically for data-intensive science. That's where you put all of the pieces that have the responsibility for getting data in and out of the site. So specific systems, which we call data transfer nodes, would go there.
You definitely want perfSONAR in there. You talked with Jason about perfSONAR; perfSONAR is a key component of the Science DMZ, and that's a place where perfSONAR really shines. So that's the spot where you would integrate all those things.

So what would that look like for a major public university?

There are a lot of different ways to build it, depending on the network culture, the funding environment, who needs what and when, whether you're just starting or this is a mature deployment. There are a lot of different flavors, a lot of different colors, a lot of different ways it can look.

Now, the phrase DMZ is typically associated with just hanging something out there with no protection on the internet. If you ask a typical internet user, they look at their home router and say: oh, I can have one of my home PCs hanging out on the DMZ, which basically means it's sitting in front of the firewall. Is that one of the precepts here?

Yeah, so let's take one step back and ask: what is a DMZ? For the Science DMZ, the DMZ part of that comes from traditional network security design.
So, in a previous life I was a network security engineer. If you look at what a security DMZ is, it's a portion of the network at or near the site perimeter that is designed and built specifically to host external-facing services: authoritative DNS, incoming and outgoing mail, world-facing web servers, things of that nature. And you put it out there right at the site perimeter because the traffic and the applications running in the DMZ have a different security profile, and often a different application profile, than whatever you've got running on your internal LAN. So you build a specific enclave and you put those external-facing services there.

That does not mean that it's undefended. What it means is that it is defended differently than the LAN is defended: in a way that's appropriate for a DMZ, not necessarily in a way that's appropriate for a LAN, because it's different. And so in the DMZ you apply security policy, and policy enforcement mechanisms, that are specifically tailored to the services running in that DMZ.

In a Science DMZ, that basic design methodology is the same. It's an area at or near the site perimeter, designed specifically for high-performance, data-intensive science services that involve collaborating with other institutions. It's designed and built specifically to serve those needs, those applications, and those workflows.

Okay, so when you set one of these things up, would this involve all the software-defined networking to make frictionless paths and things like that? And does that actually accomplish what you want here, or is that kind of hype for a different part of the industry?

SDN definitely has a place in the Science DMZ. We should give a shout-out here to some of the Science DMZs that have been funded by recent programs from the NSF.
So the CC-NIE, CC-IIE, and CC-DNI programs have funded a bunch of Science DMZs at major universities around the country. I should say that a whole bunch of those Science DMZ deployments specifically incorporated SDN, so there are a whole bunch of SDN-ready, or running-SDN-now, Science DMZs. There's a lot of hype around SDN, but there are people doing real work with it at the same time. And I'll say: the people who have their hands on it now, who have a way to get their operational feet wet with it, roll around with it, understand it, and see what it does, are going to be ahead as SDN goes more mainstream. The Science DMZ is a really useful place to deploy SDN in your network, whether the entire Science DMZ is SDN-enabled, or whether you have an SDN enclave attached to the Science DMZ, just as you might have a biology enclave, a physics enclave, or a climate enclave attached to your Science DMZ. It's a really good way to get your hands on it, get some work done with it, and use that as a platform for rolling it out to the rest of the infrastructure.

All right, let's shift track a little here. Let's talk about some of the things that you advocate in terms of tuning. Let me take a step back at the beginning here and ask, and I'm somewhat tongue-in-cheek here because I work for a networking company, obviously: why do we still need to tune TCP and network settings on servers? After so many years of scientific data, the data rates keep going up, and all the Ethernet vendors are pushing 10 and 40 gig NICs these days. Why don't operating systems just do the right thing now?
The primary reason, in my view, is that high-performance, air quotes, big data science is not actually the common use case, and the defaults are designed for the common use case, as they should be. If you live your life outside the common use case, then you're going to have to customize some aspects of your life; there's no way around that. There have been a lot of advances in auto-tuning over the past 10 or 15 years, in particular RFC 1323, which gives us window scaling, and kernel auto-tuning. Those are both huge, huge wins. But you still need to make sure that the configuration on your host gives those mechanisms enough rope to actually do their work for you. Some of the defaults are getting better, but we still need to make sure that when stuff gets deployed, you go through and configure it properly, and you test it to make sure that it's working correctly, just as you would for any other item of major cyberinfrastructure.

Okay, so what are the most common things you found, when you were testing this out, that affect performance when you're talking about TCP settings and such? What are those settings?

It's pretty straightforward: making sure that your auto-tuning buffer limits are high enough, making sure that you have the network device backlog turned up, and also your transmit queue depth. Essentially, just things that allow everything to work cleanly, right?
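The knobs just described live in Linux sysctl settings. As a sketch, here is the kind of sysctl.conf fragment involved, rendered programmatically; the numeric values below are illustrative placeholders, not recommendations, since the actual tuning guidance (with current numbers) lives on fasterdata.es.net:

```python
# Illustrative Linux host-tuning knobs of the kind discussed above.
# The concrete values are placeholders for illustration only;
# consult fasterdata.es.net for current recommendations.

tuning = {
    # Maximum socket buffer sizes, so TCP autotuning has room to grow
    "net.core.rmem_max": 67108864,        # 64 MB
    "net.core.wmem_max": 67108864,
    # min / default / max for the autotuned TCP buffers
    "net.ipv4.tcp_rmem": "4096 87380 33554432",
    "net.ipv4.tcp_wmem": "4096 65536 33554432",
    # Receive backlog, so bursts aren't dropped before the stack sees them
    "net.core.netdev_max_backlog": 250000,
}

def render_sysctl(settings):
    """Render the settings as /etc/sysctl.conf style lines."""
    return "\n".join(f"{key} = {value}" for key, value in settings.items())

print(render_sysctl(tuning))
```

The transmit queue depth mentioned alongside these is set per interface rather than via sysctl (for example with `ip link set dev eth0 txqueuelen 10000`).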
Any time you get into a place where you have to sit and wait for somebody else to do something, that's idle time where you're not using your infrastructure to the full. In the TCP window case, if you're talking to something a long distance away, a big enough window allows you to have enough data in flight to make full use of your pipe. For the transmit queue, it means you can actually put enough stuff on the wire; for the receive side, it means your host doesn't get overrun and cause packet loss. So there are a few basic settings that can really get you a long way. Now, that's just the network side; that doesn't deal with storage. And that's just the host portion; it doesn't deal with the core network at all.

So, two things. You said fasterdata is a knowledge base, so you have notes and best practices for Linux kernel settings and such, all on the fasterdata website, for the things we were just talking about?

Yes. In fact, they're designed for easy cut and paste, including the comments.

Okay. But isn't the rest of the commodity internet kind of moving in this direction? We're doing more video delivery. It seems like a Netflix stream to my house is about 30 megabits per second; that's faster than most SCP transfers to and from our system actually travel. How are they reliably delivering that data when I can't do it using SCP today?

Well, there's reliable and there's fast, and those are two separate things. Reliable means I did it correctly: I got all the bytes from one side to the other, I did it in order without changing any of them, and I do that every time you ask me to do it.
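The "enough data in flight" point Eli makes is the bandwidth-delay product: to keep a pipe full, the TCP window must cover bandwidth times round-trip time, which is also why a fixed window caps throughput at window/RTT regardless of link speed. A quick back-of-the-envelope calculation (the path numbers are just examples):

```python
def bdp_bytes(bandwidth_bps, rtt_seconds):
    """Bandwidth-delay product: bytes that must be in flight
    to keep a pipe of this bandwidth full at this RTT."""
    return int(bandwidth_bps * rtt_seconds / 8)

def max_throughput_bps(window_bytes, rtt_seconds):
    """Throughput ceiling imposed by a fixed window: window / RTT."""
    return window_bytes * 8 / rtt_seconds

# Filling a 10 Gbit/s path at 50 ms RTT (roughly coast to coast)
# needs about 62.5 MB in flight:
print(bdp_bytes(10e9, 0.050))          # 62500000 bytes

# A static 64 KB window at the same RTT caps a flow around
# 10.5 Mbit/s, no matter how fat the pipe is:
print(max_throughput_bps(64 * 1024, 0.050) / 1e6)
```

This is the arithmetic behind both the host-buffer tuning above and the static-buffer SCP problem discussed below.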
That's very different from "I did it really, really fast, and you're happy with how well I performed when I did it." I should say that SCP is a pretty poor data transfer tool, and it has some built-in limitations that cause some fairly significant performance problems. And I realize people use SCP a lot for data transfer. The main reason for that is that SSH is what's used for system access, and SCP is a data transfer mechanism that uses the same credentials, which you get for free. That's why a lot of people use it; that doesn't make it an excellent data transfer tool.

Okay. Yeah, I guess by reliable I was thinking they reliably deliver 35 megabits a second to my Blu-ray player that plays Netflix, so it can keep that stream going. Which is interesting.

So while we're on the SCP thing, you have an entire page on data transfer tools, where you have SCP and some data for these different tools. You have Globus, which I've had on the show before; you have bbcp; a lot of the common alternative data transfer tools are on there. And you have this patched version of SCP. What's so important about that patched version that you get such different performance numbers, and what is the difference in performance?

Well, I can't take credit for the patch; that was done by the Pittsburgh Supercomputing Center. The stock SSH, SCP, and friends application suite has some statically defined internal flow control buffers, and they don't resize depending on what the latency is between the sender and the receiver. That's the same as having static TCP windows, which we fixed a long time ago; see RFC 1323,
and kernel auto-tuning. So what that means is that if you're using SCP to transfer your data, the application has some static buffering that is essentially defeating all the auto-tuning your kernel has turned on for you, and so there you are, kind of stuck. What the patch from Pittsburgh does is allow that window to scale at runtime, and so that allows SCP, or rsync over SSH, or what have you, to scale its buffering along with all the other stuff that needs to scale with a long-distance data transfer.

Now, is this something that you need to be running on both sides, or do you get some benefit if you're only running the modified SCP on one side? How does that work?

My understanding is you get some benefit if it's only present on one side, but you really get more benefit if it's present on both.

Okay. Now, is this something that could be submitted back upstream, or have the OpenSSH people indicated they don't want it because we're outside the mainstream?

So, my understanding, and the person who can speak to this authoritatively is Chris Rapier at the Pittsburgh Supercomputing Center, is that those patches have been submitted back to the OpenSSH folks, and the OpenSSH folks are not interested in incorporating them. Some other distributions have incorporated versions of those fixes into the standard version of SSH that they distribute with their system. I know FreeBSD has a semantically equivalent set of fixes built into the SSH that's installed by default on a FreeBSD host. I'm not as sure whether any of the Linux distributions do that. But that's another way to do it, right?
Which is to say, if you're maintaining your own patch set or your own distribution, you can just incorporate that set of fixes into the stuff that you run and deploy. But I think it's been offered to the OpenSSH people, and I do not believe they are going to incorporate the patches at this time. Again, you should really ask Chris Rapier about it.

Okay. So you made some interesting comments about why you need to tune for internet-scale distances, particularly when you're going through many hops to get to a peer. Does the same philosophy apply when your peer is right next to you in the data center, maybe even one or two hops away?

So, as a quick aside: hop count doesn't really matter with modern ASIC-based routers and switches anymore, but distance does. The further you have to go, the harder TCP has to work in order to maintain performance. If you're just going to the next rack over, provided routing isn't pathological and sending you out to the East Coast and back, from my California perspective, in order to get to the guy in the next rack, and I've seen that behavior, by the way, and it's really irritating; but if all you need to do is go a few tenths of a millisecond, TCP doesn't need to work very hard, and you can be a lot more lenient with your tuning requirements. And by the way, I should say that packet loss is going to affect you a lot less too, because the control loop is so short, because the latency is so low. So inside the data center, things are much, much easier, and you have a lot more leeway in terms of what you can do, what kind of equipment you can use, and how you can run your machines.

Okay, so besides the patched version of SCP, you had a whole list of other tools. Besides vanilla SCP, what would be your recommended tools for people to look at for actually
transferring data? I guess there are two different kinds of data: big data can mean millions and millions of little files, which are bad for file systems and everything else, as well as large individual files. What would be the tools you think people should look at?

So, as you hinted, different applications require different sorts of optimizations. I've been doing some testing recently that shows, and this is not surprising at all, that large numbers of metadata operations, file and directory creation and that sort of thing, are a real torture test, especially for large parallel file systems. Some data transfer tools don't necessarily like those as much either. So if you have bazillions and bazillions of tiny files, and we see this sometimes in gene sequencing applications, where there will be just ridiculous quantities of itty-bitty little files, you might even want to roll those up into larger tarballs or something like that before moving them, depending on how your data transfer application is going to deal with it. If your data transfer application is appropriately pipelined and streamlined, maybe it's fine.

As for specific applications, we see a lot of people deploy Globus, and I think you said earlier that you've had the Globus folks on here before. We certainly see a lot of Globus in science environments. There are other environments that use Aspera a fair amount. Those are two tools that I think are a little more polished and designed to really be used by non-experts.
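The "roll tiny files into larger tarballs before moving them" advice can be as simple as a few lines with Python's standard tarfile module. A minimal sketch (the file names and contents here are made up for the demo):

```python
import tarfile
import tempfile
from pathlib import Path

def bundle(files, archive_path):
    """Roll many small files into one tarball so the transfer tool and
    the remote file system see one large object instead of millions of
    per-file metadata operations."""
    with tarfile.open(archive_path, "w:gz") as tar:
        for f in files:
            tar.add(f, arcname=Path(f).name)
    return archive_path

# Demo with throwaway files in a temp directory.
tmp = Path(tempfile.mkdtemp())
small_files = []
for i in range(100):
    p = tmp / f"read_{i:04d}.txt"
    p.write_text("ACGT" * 25)   # a stand-in for a tiny sequencing file
    small_files.append(p)

archive = bundle(small_files, tmp / "reads.tar.gz")
with tarfile.open(archive) as tar:
    print(len(tar.getnames()))   # 100
```

Whether this is worth doing depends, as Eli says, on how well the transfer tool pipelines small files; the tarball trades one pass of local I/O for far fewer metadata operations end to end.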
There's a whole other suite of tools where, if you have an expert deploying it for some specific application between a small number of hosts, they can use whatever tool they want. But if you're going to put something in the hands of somebody who's not an expert, you really want something that's going to just work when they use it. Globus and Aspera are a couple of tools that do better for that kind of use case. I think you'll find Globus deployed at most of the major supercomputing facilities in the United States.

All right, so you talked about some common user-level tools that are good for this kind of stuff. What about standards? Are there any standards being worked on to make this better, even if it is still kind of a niche thing? I mean, the niche is still a pretty big niche that covers many people; even if it's not seven billion people on the planet, it's got to be at least many thousands or tens of thousands of people that are affected. Is there any technology, in a cross-cutting sense, being developed to make this better?

Let's see. For standards, I seem to remember that there is a standard for the protocols used in GridFTP, which is what Globus uses under the covers as its actual data transfer engine. I think there are multiple implementations of that which can interoperate, so there is some standardization in that space. Aspera's secret sauce is just that: it's secret. It's proprietary, so that's not standardized. The Science DMZ is considered in many circles to be best practice now; that's not a standard either, that's just a sort of agreed-upon best practice. So there are some standards efforts out there. I think NIST may be working on something; they have some big data initiative or something like that, but it's not clear to me that they're doing tools work specifically.
So, unfortunately, I'm afraid I have to say that there are some widely deployed things, and there are some best practices out there, but I wouldn't say we've gotten to the point where there are proper standards for dealing with this, at least in the way that some of us might like to see them. And I should say: in the absence of standards, things like knowledge bases are really important, and that's one of the reasons why we put together and maintain fasterdata.

So, one of the tricks you see on a lot of these systems, or one of the tuning parameters at least, on systems like Globus, is parallel TCP streams, or sending multiple files simultaneously, each in their own TCP stream. Why does that work so effectively, and should more people implement it?

So there are a couple of reasons why that works well. Say you have an environment with two hosts, and you're transferring data from host A to host Z, and there's some network path between them, and your data transfer application is going between those two hosts over that network path. My colleague Eric has this rule of performance, as he puts it: there's always something keeping the system from going faster. So if the thing that's keeping that data transfer from going faster is single-stream TCP performance, then if you add another stream, you can get more performance, because you'll have a second flow running over the same path, and neither one of them is using the full path. Now, that only works up to a certain point, because at some point you're either going to fill the network pipe, or you're going to end up with too much host contention, or you're going to max out your storage or something. But parallelizing by TCP connection does get you a performance increase, if per-flow TCP performance is your bottleneck.
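One way to see why parallel streams help when single-stream TCP is the bottleneck: the well-known Mathis et al. model bounds a single flow's steady-state throughput at roughly MSS / (RTT * sqrt(loss)), and N independent flows each run their own congestion window and loss recovery. A back-of-the-envelope sketch (this is the approximate model, omitting its small constant factor, with example path numbers, not a statement about any particular tool):

```python
import math

def mathis_throughput_bps(mss_bytes, rtt_seconds, loss_rate):
    """Mathis et al. approximation of steady-state TCP throughput:
    rate ~ (MSS / RTT) * (1 / sqrt(p)), constant factor omitted."""
    return (mss_bytes * 8 / rtt_seconds) / math.sqrt(loss_rate)

# One flow: 1460-byte MSS, 50 ms RTT, 1-in-10000 packet loss.
single = mathis_throughput_bps(1460, 0.050, 1e-4)
print(round(single / 1e6, 1))      # ~23.4 Mbit/s

# Eight parallel flows over the same path (ignoring contention):
print(round(8 * single / 1e6, 1))  # ~186.9 Mbit/s, until something
                                   # else becomes the bottleneck
```

The same formula also shows why loss matters so much less at data-center RTTs: shrink the 50 ms to 0.5 ms and the single-flow bound grows a hundredfold.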
Okay. So, all this being said and done, and again, I work for a networking company: when I joined Cisco, I thought I understood TCP, and over the years I have come to understand that, holy cow, TCP is incredibly complicated and subtle, and it's really pretty good at what it does, and the people who have written the network OS stacks for TCP are fantastically smart people, much, much smarter than me. But there are still these limitations that we run into. Is it better to not use TCP, to use something like UDP and do your own retransmission? Do you have experience with applications that try to do that instead of TCP?

Mm-hmm. Yeah, so that's another way of going about this. If for some reason you can't make TCP work, you can use something that's not TCP. There's an open source version of this called UDT that some folks use.

Yeah, I think you can even plug that into GridFTP if you want to.

As a matter of fact, I know that you can plug it in directly.
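"Do your own retransmission over UDP" in its most naive form is a stop-and-wait protocol: number each datagram, wait for an acknowledgment, and retransmit on timeout. Real tools like UDT are vastly more sophisticated (windowing, rate control, selective acknowledgment), but a toy sketch over localhost shows the basic idea; the simulated loss of one datagram is built into the receiver:

```python
import socket
import threading

HOST = "127.0.0.1"

def receiver(sock, n_chunks, out, drop_first=frozenset()):
    """ACK each numbered chunk; ignore the first copy of selected
    sequence numbers to simulate packet loss. Duplicate-safe."""
    seen_once = set()
    expected = 0
    while expected < n_chunks:
        data, addr = sock.recvfrom(2048)
        seq = int(data[:8])
        if seq in drop_first and seq not in seen_once:
            seen_once.add(seq)      # pretend this datagram was lost
            continue
        if seq == expected:         # deliver in order, exactly once
            out.append(data[8:])
            expected += 1
        sock.sendto(b"%08d" % seq, addr)   # ACK (re-ACK duplicates)

def sender(chunks, dest):
    """Stop-and-wait: send each chunk, retransmit until it is ACKed."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(0.2)
    for seq, chunk in enumerate(chunks):
        pkt = b"%08d" % seq + chunk
        while True:
            sock.sendto(pkt, dest)
            try:
                ack, _ = sock.recvfrom(64)
                if int(ack) == seq:
                    break            # acknowledged, move on
            except socket.timeout:
                pass                 # lost somewhere: retransmit
    sock.close()

rsock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
rsock.bind((HOST, 0))
port = rsock.getsockname()[1]

chunks = [b"chunk-%d " % i for i in range(5)]
received = []
t = threading.Thread(target=receiver,
                     args=(rsock, len(chunks), received, frozenset({2})))
t.start()
sender(chunks, (HOST, port))
t.join()
rsock.close()
print(b"".join(received) == b"".join(chunks))   # True
```

Stop-and-wait wastes a full RTT per datagram, which is exactly the idle-waiting problem discussed earlier; that is why serious UDP-based tools keep many datagrams in flight at once.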
There's a GridFTP module; you can just plug that in. I think Aspera is also non-TCP, so that's another reason people might use Aspera: if they can't make a TCP-based data transfer tool work. The thing is, though, TCP is the standard reliable byte stream service in the TCP/IP protocol stack, and so the vast, vast majority of deployed applications that need reliable, in-order byte stream delivery use TCP. There's this gigantic deployed base. If you only have a small number of machines or hosts or sites that you need to deal with, maybe deploying something non-standard is tractable. If you have to have your data service be used by a wide variety of organizations, each with their own missions, funding models, staffing, support models, everything else, that starts to argue very strongly in favor of a standard solution. So, yes, you can do fine not using TCP, but there aren't that many applications that do it, and they tend to be niche applications.

So we're starting to see some things like, I think, Google putting forward a proposal for a new reliable service to be the next generation of TCP. Do we see any of that coming out of our community? You mentioned UDT, but I believe that's still built on UDP. Do you see anything starting to gain traction, something like "try this first, then fall back to TCP," emerging anywhere, or not?

I don't see anything in the new-transport-protocol space coming out of the research community. It seems to me that most of what the research community is focused on is one step up the stack: how do I get the data there, using the substrate that I've got?
And that's where fasterdata and the Science DMZ and all these tools come in. And then: how do I work with that data, using computing, in order to extract the science and the knowledge from that data? That's where most of the focus is at the moment. If Google comes out with a brand new transport protocol, we'd love to look at it; I think that'd be cool. Adoption is going to be tricky, but that's the case with anything new, and if it's better, people will adopt it. We'll have to see.

So, a little bit of an opinion piece from you. You're working on a thing like fasterdata, you're using tools like perfSONAR to verify that your network performs the way it was advertised to, and you're doing all this tuning. In most cases, what is your gut feeling for scientific data: is it these settings and tools we're using that are holding us back more, or is it a lack of investment in the physical infrastructure?

So I would say it's a variety of things, but you hit most of them, and that's why we have the Science DMZ model. It's an integrated way of looking at the current best practices for how to do these things properly, and so it addresses the common sources of performance problems. Which is: networks that are not designed properly, so they cause packet loss, or it's hard to fix the packet loss when you find it, or there's packet loss present and you can't find it, and so you need a tool like perfSONAR. Or you have data transfer servers that are also doing other things; they're not dedicated, they're not specifically configured for data transfer. Or maybe you're not using the right tools.
You're doing everything with stock SCP, or wget, or something like that. So that's the common set of problems, and the Science DMZ model provides an integrated set of best practices for dealing with that set of problems. That's why it's a big part of fasterdata. But then, say you go and deploy your Science DMZ and this one thing doesn't work. Okay, now I need to dig into specific details of particular aspects of it. That's where you need the other half of the knowledge base, which gets into the details of how individual things are set up and configured, and troubleshooting techniques.

So what is the typical path for someone? Let's say I'm a network administrator, and it's been brought to my attention that my researchers are getting crappy network performance. What do they do? Do they come to your website and start digging for question-and-answer kinds of things, or what's the typical route? What's available?

Well, people do different things depending on their expertise and what they think the problem is. As a network engineer myself, I can tell you what I do when someone comes to me with a problem. The first thing I do is say: okay, what are you trying to do, and how are you trying to do it?
Because without having that baseline as context, it's really hard to understand what's possible and what "good" means, right? Because "good" or "fast" or "enough" is very different for different groups of people and under different circumstances. So try to characterize what they're trying to do and how they're trying to do it; see if it's even possible to do what they're trying to do with the method they're using; and then see where to go from there. If what they're doing makes sense based on what they're trying to do, and it's not working for them, then you can get in and troubleshoot that specific set of things. But I run into a lot of cases where somebody wants to do X, and instead of using stuff that's well suited to doing X, they're using Q, and it's like: well, okay, I realize you thought you needed Q, but maybe let's try this other thing over here and see if that would work better for you. In some cases it's just a tool swap, and they're happy, they go away. In other cases, the big huge problem they're having is actually two providers down, and they have no visibility into what those guys are doing.

So for you specifically, sitting there working at ESnet, when someone comes in with a problem: what's been your proudest moment in terms of this fasterdata effort helping out a researcher?

I don't have one single proudest moment, I don't think; different ones are rewarding in different ways. There was one case where, and this goes straight back to SCP, it was a large-scale international data transfer, and just by saying, hey, why don't you guys try using the patched version of SCP?
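To give a sense of the scale involved in transfers like the one being described, here's a hedged back-of-envelope; the dataset size and rates are invented for illustration, not figures from the episode:

```python
def transfer_days(terabytes, sustained_gbps):
    """Wall-clock days to move a dataset at a sustained effective rate."""
    seconds = (terabytes * 1e12 * 8) / (sustained_gbps * 1e9)
    return seconds / 86400

# Hypothetical 500 TB dataset: a few hundred Mbit/s of effective stock-scp
# throughput versus a tuned multi-gigabit transfer.
print(f"{transfer_days(500, 0.2):.0f} days vs {transfer_days(500, 5.0):.0f} days")
# -> 231 days vs 9 days
```

At these data volumes, an order-of-magnitude change in sustained rate really is the difference between months and days, which is why a simple tool change can matter so much.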
We shaved literally months off of their data transfer time, which had a huge impact on the productivity of the project. So that felt pretty good. In other circumstances, it's just being able to have the information available so that people can help themselves; that's a gigantic win. People don't have to stumble around in the dark, right? There's a place where they can go and get the information they need, apply it to their setup, their circumstance, their workflow, and get on with their lives, without having to go beg an expert to deign to look at their particular issue. So we want to do two things: make sure that people can help themselves as much as they can, and then give the experts a tool to use to help folks when they run out of steam, while at the same time preserving the experts' time for the really hard problems.

So what's coming in the future? You have a lot of information there on your website; you even mentioned that it's easily copy-and-pasteable. How do you keep this relevant? Is it just injection of new best practices over time as technology changes, or do you have major new features coming that people can look forward to? What's the future of fasterdata, basically?
So the future of fasterdata is better organization. As the site grows, we have to refactor it every so often, just because we stop being able to find stuff, and if we can't find stuff, you can't either. It's gotten big, and that's good, because there's a lot of stuff there, so every once in a while we're going to need to shuffle things around in terms of the space. The thing that I would really like to see is further adoption of the Science DMZ model, and therefore less need for some of the things that we have on fasterdata. If we can get to the place where, to first order, infrastructure is built correctly and people are using the right tools, there'll be a lot less need for very detailed network performance tuning. So it's kind of ironic, right? Ideally we would like to see people need this less, because everything would be working correctly, and scientists could get their work done without having to stop and ask questions of systems administrators.

So how did the fasterdata effort get started?

Fasterdata came from Brian Tierney's TCP tuning site. Brian Tierney had been maintaining a set of TCP tuning parameters and how-tos and best practices for well over a decade at Lawrence Berkeley National Laboratory, and when Brian came to work for ESnet he brought all that information with him, and we stood that up as a knowledge base within ESnet for use by our constituents. I started working on it at that point as well. When the Science DMZ model came about, we incorporated the Science DMZ content into the fasterdata site too. It's really grown over time to include everything that we find useful in working with scientists to make their infrastructure go faster, but historically it came from Brian Tierney's efforts in maintaining TCP tuning information for the community.

Okay, Eli, thank you very much for your time.
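Since fasterdata grew out of TCP tuning pages, it's worth sketching the calculation that kind of tuning revolves around: the bandwidth-delay product, which is roughly the TCP buffer a single stream needs to keep a long, fast path full. The numbers below are invented for illustration, not from the episode:

```python
def bdp_bytes(bandwidth_bps, rtt_s):
    """Bandwidth-delay product: the amount of data 'in flight' on a path,
    and roughly the TCP buffer needed to keep one stream at full rate."""
    return bandwidth_bps * rtt_s / 8

# A 10 Gbit/s path with a 70 ms round-trip time (e.g. cross-country):
buf = bdp_bytes(10e9, 0.070)
print(f"{buf / 2**20:.1f} MiB")  # ~83.4 MiB -- far above typical OS defaults
```

Default socket buffers are orders of magnitude smaller than this, which is why host tuning guidance of the kind Brian Tierney maintained matters so much for wide-area science transfers.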
Where can people find the fasterdata knowledge base, and if they want to get a hold of you, how do they do that?

So there are a couple of different things there. Fasterdata is just fasterdata.es.net; it's a website. There's also an announcement list that we use for training events and things like that; I think it's just fasterdata events, and there should be a link on the site for that as well. For getting a hold of me, one of the best ways to do that is actually on the Science DMZ list, but you can look me up on ESnet as well; I'm just dart at es.net, so you can get a hold of me that way if you need to. So yeah, I guess that's it, really; it's just there on the net: fasterdata.es.net.

Eli, thank you very much for your time.

Thank you, and I'm glad to be on episode 100. So, a hundred for a hundred gig. We're good. Whoo-hoo!