Welcome to another edition of RCE. Again, this is Brock Palen. You can find us online at RCE-cast.com. We're also on the iTunes podcast store, but they only show the last five episodes or so. If you go to our website, RCE-cast.com, you can find all the old shows there. There's an RSS feed you can add directly to your Google Reader or your feed reader of choice. Also, you can find links to all of our Twitter accounts and Jeff's wonderful MPI blog and everything else. And that also means that once again, I have Jeff Squyres, one of the authors of Open MPI, from Cisco Systems. Jeff, thanks again for your time. Sure, Brock, this is good stuff. And it really does just roll off the tongue, doesn't it? RCE-cast.com just kind of rolls right off there, and you get all the old episodes and things like that. Hey, so there are a couple of dates coming up here that are probably worth mentioning. I'll mention the first one. The EuroMPI conference this year is in Spain, which is going to be pretty fun. I don't think we've had EuroMPI there before, but the due date for papers is at the end of this month, March 29th, so if you're not already working on your paper, you need to be working on it. Also, I will actually be one of the speakers at GlobusWorld this year, which is at Argonne National Lab down there by Chicago, April 16th through 18th. Remember, we've had Globus on the show before, and they're coming out with some new features for the Globus Online product. I hope to get Steve and some of those guys on the show to talk about the changes they're making to their web tool, which we didn't cover much in the original Globus podcast, after they announce the new features they're coming out with. So I will be there; if you're there, get a hold of me. Cool. All right, well, what are we talking about today, Brock?
Today, we have something that I am only anecdotally familiar with, and hopefully, Jeff, you can help out a little bit more with this one. It came in as a request from a couple of different users: we're talking about Chapel, which I believe is a new language for scientific computing, where a lot of the heavy lifting is being done by Cray. We have two guests with us, Brad Chamberlain and Sung-Eun Choi. Guys, how about you take a moment to introduce yourselves? Yeah, so my name is Brad Chamberlain. I'm an employee here at Cray, a principal engineer, and I'm the technical lead for the Chapel project, which means I oversee the design and development of the language. Hi, I'm Sung-Eun Choi. I'm a senior developer with the group; I also work at Cray, working closely with Brad. Okay, so I threw out what I thought Chapel was there. Please correct me. What exactly is Chapel? That was pretty good. So Chapel is a new parallel language that we're developing at Cray. We're doing it in partnership with other members of the community, primarily from academia and government. The one thing I'd add to what you said: Chapel is very much designed to be usable for scientific computing at large scale, but we've also tried to make it a very general parallel language, such that it's also attractive potentially to desktop parallel programmers who want to make use of the multicore processors that they have, for example. So how did Chapel get started? And along with that, you said there are various academia and government organizations involved. Who exactly are the other organizations involved? Oh, there's a long list of other organizations. So let me start with how it got started, maybe, and we'll come back to who's involved. I think the short version of the story is that Chapel came out of the DARPA HPCS program. HPCS is a program started in 2001, 2002, with the goal of improving productivity on high-end systems.
So the observation was that we're very good as a community at making faster and faster machines, but maybe not necessarily good at making them easier and easier to use. Cray was one of the participants in this program, and we looked at our systems, sort of everything from processor design, network, memory, all the way up to software, compilers, tools. And part of that was we thought that languages were an area where the community could really benefit from some new technologies and have improved productivity. So Chapel came out of that program, essentially. It was part of Cray's entry in the program. So actually, can you talk about — I see it was DARPA, but did Cray just get the contract for this? Or who are some of the other contributors to this project? Yeah, so let's see. So initially, the HPCS program had five teams, then it went down to three, and then down to two. At the point that the languages were coming about, there were three teams: IBM, Sun, and Cray. Each team came up with a language that they thought would improve productivity. So we worked on Chapel, IBM developed a language called X10, and Sun developed a language called Fortress. And then there was a down-select, and Chapel — I mean, sorry, Cray — and IBM were chosen to go into phase three of the program and continue their development. So even though Cray was spearheading the effort, Chapel's been designed as an open-source effort. All along, we've been encouraging people throughout the community who are interested in participating or contributing to do so. And you asked before about what other organizations are collaborating. So there've been a number of people at national labs — Oak Ridge, Livermore, Argonne — that we've pursued collaborations with, and others. And then we have academics at the University of Colorado Boulder, and some overseas at the University of Malaga and the Barcelona Supercomputing Center.
And so actually, there are probably a few dozen organizations all together who have participated at one time or another. I have trouble remembering them all off the top of my head. Sung, can you think of others? Oh, UIUC. So there's been a lot of interest in working on it. And I think the open-source nature of it has been very encouraging for people to join in and contribute where they can, which has been encouraging. So that's quite an impressive list of organizations. What is it about Chapel that brought them together? I mean, put a different way: why do we need yet another parallel programming paradigm slash language slash model? What is different about Chapel that makes it better? Yeah, so let me start with the first part of the question: why do we need another model? While there have been a number of attempts at designing parallel languages over the years, and people love to trot out the list of, you know, the dozens or hundreds of languages that have failed since the 90s or 80s or whatever, the fact of the matter is the number that have actually caught on and succeeded is quite small. So, you know, MPI has obviously been successful, if you're willing to count that as a language — and I usually do, because I think it really impacts the way you program. OpenMP has been successful, UPC has had some success, and Co-Array Fortran has been adopted into the Fortran standard. But by and large, I think we feel that there haven't been any really great parallel languages. Most of the parallel systems that are out there are good at certain things — certain styles of parallelism, like maybe SPMD parallelism or task parallelism — but typically not at a broad swath of different kinds of parallelism, say data parallelism and task parallelism and concurrency and nested parallelism and pipeline parallelism. So they sort of have their one thing they do well. The other thing is most parallel languages we have only target a single granularity of parallelism.
So MPI, UPC, and Co-Array Fortran are sort of process-level SPMD parallelism; things like OpenMP and Pthreads are thread- or task-level parallelism. But again, there's not a single language that allows you to talk about different granularities or styles of parallelism all in one unified way. And I think this has also been the failure of some of the academic languages: by the nature of an academic project, you pick the one novel thing you want to work on, you work on that, and you maybe have some successes there, but the language is not broad enough to hit that tipping point where people get excited about it. And with Chapel, when we started out, we decided to go really big — to be very general. Any parallel algorithm you can think of, any kind of parallel hardware you can imagine wanting to target, you ought to be able to do that in Chapel. And while that makes it very audacious — can we actually pull this off? — I think it's also the thing that makes Chapel more unique, more different. It's just incredibly general in the scope of what it's trying to do. And I think that's been a lot of the appeal, as far as why people are interested in using it or in participating in it as well. So, Sung, did you have anything you wanted to add to that? Mostly I just want to extend what Brad just said, which is that since Chapel is a general language, it has a broad appeal, and it brings together people from all swaths of the community — people who are interested in languages, as well as parallel systems and runtime systems — and that gives it a lot of attention in the academic community. Let me pop back in. I think one of the things that makes it really appealing to people, in addition to all the features that we talked about in terms of general support for parallelism and locality — which we haven't talked about as much — is that we very much tried to make it a much more modern-feeling language.
Most languages we're using in HPC are still very Fortran-, C-, C++-based, but if you talk to young programmers, they're using Java, maybe C#, Python, things like that. And we really designed Chapel to have that more modern feel, both in a type-safe manner and in sort of a scripting kind of way — like you can leave off the types of variables and things like that. And so a lot of times we get people who are Python users who are just really excited about Chapel, because it feels the most like programming in Python of anything they've seen for a parallel machine, and that's really enticing to them, as opposed to using something that's C-based or Fortran-based. So to put it a little concretely — the first thing you said there — is it kind of like Python? Is that what you're saying? Or is it just analogous for the people who are used to it? I mean, can you put it in a few grounded terms: what does modern mean to you, for example? So — well, it's not actually based on Python. And in fact, interestingly, when we started this project, I was really ignorant of Python. I'd heard the name, but I'd never really looked at it. And David Callahan, who was one of the other early architects of the language, similarly was not really a Python user and didn't really know much about it. So the thing I think people see there is more that we knew we wanted to give sort of that more productive feel — I don't have to type the type of every single variable, I don't have to type the type of every single argument or the return type of my functions. We wanted that more fast-and-loose, scripty kind of feel, but yet still have it be a completely statically typed language, where the compiler figures out all the types and you end up with the kind of performance you get in a statically typed language like C or Fortran.
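As a rough illustration of the "scripty but statically typed" feel Brad describes, here is an editor's sketch (not from the interview) in Chapel syntax of roughly this era, which may differ from current releases:

```chapel
// No types written out, yet everything is resolved statically at compile time.
var n = 10;               // inferred as int
var pi = 3.14159;         // inferred as real
var msg = "hello";        // inferred as string

// Argument and return types may be omitted too; the compiler
// instantiates a statically typed version for each call site.
proc square(x) {
  return x * x;
}

writeln(square(n));       // integer instantiation
writeln(square(pi));      // floating-point instantiation
```

The point is that, unlike a dynamically typed scripting language, all of this resolves to concrete static types before the program runs, so the generated code can be as fast as hand-written C or Fortran.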
So that sort of feel of being able to sketch out code quickly and not specify every single thing was definitely a theme in the language. And the similarities to Python are more circumstantial than anything. I think we just ended up with some concepts that made sense to us that turned out to be similar to things in Python, and there's been a certain amount of crossover between those communities as a result. The other thing I associate with modern languages is a lot more object orientation, a lot more type safety — the sort of bulletproofness that you expect out of, like, Java or C#. And so we went more that route than, say, C, which is obviously a very permissive language in terms of allowing you to do things. But it probably straddles the line between the two, because at the same time we needed to retain the performance in order to compete with things like C. So let's talk a little bit about the actual parallel features. I mean, I assume this was kind of a core thing from the beginning. You said you wanted to touch on everything, but I think of things like one-sided operations, synchronous operations, partitioned-global-address-space operations. Did you just reimplement that same kind of functionality? Did you come up with anything unique? And which of those do you think Chapel came out implementing best, in the Chapel way? Well, so in that taxonomy you went through of different styles of communication, Chapel is PGAS-like. We normally say that Chapel is a PGAS language, although rather than PGAS, I kind of prefer the term partitioned global namespace, because normally in PGAS languages you're not actually talking about addresses, you're just naming variables. And it's the fact that you can name a variable and access it whether it's local or remote that gives it that PGAS-like feel.
So as a programmer, when you think about communication: you're not doing sends and receives or explicit puts and gets; you're just referring to variables, and the variables may be local, in which case you'll just access them locally, or they may be remote, in which case the compiler and runtime will do the communication for you. And semantically, in the model, you can reason about and control where your data is allocated on the machine and where tasks are running on the machine. So you have full ability to reason about locality, but you don't necessarily have to — it doesn't trip you up at every step, because you're not doing explicit communication. So it is very PGAS-like in that sense. At the same time, it's very different from the traditional PGAS languages like UPC and Co-Array Fortran. And I would say the big difference is that those languages use a static SPMD model, where at the beginning of time you fire off K copies of your program, and that's your parallelism from the beginning of time till the end. In Chapel, we've made parallelism and locality very orthogonal concepts in the language. So we have one concept that talks about locality — what's on this node, what's on that node, what's here, what's there — and then we have other concepts for talking about parallelism: I'd like to run this loop in parallel, or spawn some parallel tasks to do things. Separating those two concerns, which I think are very different concerns, is one of the things that really sets Chapel apart from languages that either use the same thing for both, which is what the SPMD model gives you, or just don't have a notion of locality at all, which is where we are with OpenMP and Pthreads.
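An editor's sketch of that orthogonality: `on` talks only about locality, `forall`/`coforall` talk only about parallelism, and the two compose freely (Chapel syntax of roughly this era):

```chapel
var x = 42;                      // lives on locale 0, where the program starts

on Locales[1] {                  // locality only: move this task to locale 1
  writeln(here.id, ": x = ", x); // reading x here implies a remote get
}

var A: [1..1000] real;
forall i in 1..1000 do           // parallelism only: a parallel loop,
  A[i] = i * 2.0;                // no data movement implied

coforall loc in Locales do       // composed: one task per locale...
  on loc do                      // ...each moved to its own locale
    writeln("hello from locale ", here.id);
```

Note how, in contrast to an SPMD model, nothing here fixes the number of tasks at program launch; parallelism is created where and when the code asks for it.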
So is it hard to get good performance out of one of these things, because something you try to access may be very, very far away? I could see it being very easy to write something and prototype it real quick, and yes, it'll run in parallel, but the performance — it's not going to scale very well, and then you would need to give it extra information about where you want stuff laid out. Where does Chapel fall? Do you need to be explicit, or is it completely free-form, or is it completely free-form but you can give it hints? So your characterization is correct. There is this tension between being able to sketch things out quickly, get them up and running, and then accidentally shooting yourself in the foot in terms of performance. And I think that's a tension where Chapel is at one end of the spectrum and MPI is at the other, and we intentionally, as a result of the productivity angle of this program, said that we were going to go after the more productive solution, where you can write things very quickly but you might do really bad things. So as you're tuning for performance, you are going to go into your code, you're going to be looking for places where you're doing more communication than you meant to, and you're going to maybe rearrange your data in order to have it local in memory, or cache things so that you're not communicating quite so much. So in designing the language, one of the things we really thought was important was respecting the 90-10 rule: if 90% of your time is spent in 10% of your code, it's a shame that in a lot of our programming models you're paying the cost of dealing with locality very, very explicitly in 100% of your code. And so the ability to write a bunch of code, and then focus in on the part that you really care about and tune for locality there, and tune the communication very carefully there — we think that really aids productivity.
Yeah, anecdotally, I've been teaching a class over at the University of Washington this summer, and the potential is very apparent. They wanted to print out a distributed array, and in MPI that would be painful: if you want to print it out as one coherent whole, you have to take turns printing, and depending on your file system, you may even have to be very, very careful that the file system doesn't reorder things on you. In Chapel, you just say: print out the array. That's an example where the productivity helps you, and if you're doing that just at the beginning or end of your program, you don't care about the performance. But then on the other hand, in Chapel, there were times where they were accidentally accessing something that was non-local without realizing it, because it's so easy to do. And until you go in and start counting your communications and seeing what's happening, you can easily miss the fact that you should have localized something, or distributed it and didn't. So there's definitely a tension there. So you mentioned that this is a very high-level language and that it has all these modern feels to it — untyped variables and things like that. How do you interact with the underlying system itself, like the network calls and things like that? Is that all basically completely hidden away, or is it implemented in C under the covers, or how does that work? So basically, in terms of interacting with the lower-level system, we have a runtime system that is implemented in C, for now. And there are calls into the runtime from the compiler and from within Chapel modules — Chapel modules are sort of, you can think of them as libraries, written in Chapel, that support every program. And we have separate, what we sometimes call layers, that support different types of communication, threading, tasking, memory allocators, and things like that. So this is something that a user will not see.
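For the curious, the class example Brad describes might look roughly like this — an editor's sketch using the standard Block distribution; the `BlockDist` module and `dmapped` syntax are from releases of that era and may have changed since:

```chapel
use BlockDist;

config const n = 8;

// A 2D domain block-distributed across all the locales the program runs on
const D = {1..n, 1..n} dmapped Block(boundingBox={1..n, 1..n});
var A: [D] real;

forall (i, j) in D do      // computed in parallel, where each element lives
  A[i, j] = i + j / 10.0;

writeln(A);                // one statement prints the whole array coherently
```

The equivalent MPI program would need explicit turn-taking (or a gather to one rank) to produce the same coherent output.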
We haven't really talked about — so we talked about the high-level constructs, but there are ways to drop down and use lower-level constructs, and you could essentially drop all the way down and make C calls also. Okay, so that's actually an important point. So Chapel can interact with other middleware out there — like there are, you know, zillions of C- and Fortran-based numerical libraries. Is there a way to make those interoperate? So one of the things we think is really important in Chapel is that you not have to throw away all your previous code. And so interoperability is a big part of that story. One of the capabilities we have today is a capability within the language where you can declare external types or variables or functions — for example, in C — and then you can call from your Chapel code out to those functions. We also have a desire, and an initial effort, toward supporting calls in the reverse direction. So if you want to take your C program and rewrite the middle of it in Chapel, you could call into a Chapel program — that's ongoing work. And then for a broader interoperability story, we've been working with the Babel and Braid team at Lawrence Livermore National Lab, and they have a system in which they support interoperability between sort of all the major languages that HPC cares about. It uses a hub-and-spoke model, and they have a spoke for Chapel, which allows it to interoperate with Fortran, Python, Java, C, C++. And that's a much broader and stronger story than what we have built in ourselves. I guess something that hasn't come up is that Chapel actually uses source-to-source compilation technology. Our compiler compiles all the parallelism and the locality kinds of aspects down to scalar C code and these calls to the runtime that Sung mentioned. So that makes interoperability with C very easy.
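The external-declaration capability Brad describes looks roughly like this (an editor's sketch, here declaring C's `sqrt` so Chapel code can call straight into the C library):

```chapel
// Declare an external C function; since the Chapel compiler generates C,
// this simply becomes a direct call in the generated code.
extern proc sqrt(x: real): real;

writeln(sqrt(2.0));   // calls the C library's sqrt
```

The same `extern` keyword can be applied to types and variables, which is what makes wrapping existing C numerical libraries relatively painless.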
And then we rely on the backend C compiler to do the lower-level optimizations that are specific to the processor and the machine that we're targeting. Now, one thing I want to jump back to in the conversation — it strikes me I didn't get a chance to ask about it earlier. You were talking about the concept of locality. Well, what is it? What do you expose? Do you expose a processor core, or a cache, or a memory locality? What is your concept of locality? Yeah, so that's a really good question. Chapel has a concept in the language — it's like a built-in type, you can think of it that way — called a locale. And a locale is intentionally somewhat abstract. It basically is a part of your target architecture that supports the ability to run tasks, which is to say it has processors, and to store variables, which is to say it has memory. So typically these locales, on a cluster say, would be mapped down to a single compute node. So anything within that locale would be running on the cores of that compute node and stored in the memory of that compute node, and anything in another locale would be on another compute node somewhere across the network. So in the traditional PGAS model, you can access variables whether they're on your locale or a different locale; it's just a matter of cost, basically. So obviously accessing things that are local to you is going to be cheaper and preferable, particularly from a scalability standpoint, but you can still name things that are stored in other locales. Now, you asked about mapping these locales down to something finer-grained, like a single core or a single memory or something like that. That's something we haven't done much with yet, but it's an area of ongoing research within the group currently. So as our node architectures get more heterogeneous and have deeper hierarchies, or more sensitivity to locality within the node, one of the things we're looking at is supporting locales within locales.
So you could say this locale represents my compute node, but then within there I have a sublocale that represents the CPU resources and a sublocale that represents the GPU resources. I'm sorry, that is ongoing work. Today our locales are just a single, top-level, non-hierarchical concept, allowing you to talk about horizontal locality, but less about vertical locality. So what about the actual communication parts? How do things move between nodes, and does it only work on a Cray? Okay, so those are two pretty different questions. Taking the first one: how do things move between nodes? One of the things you have at your disposal in Chapel is something we call an on-clause. With the on-clause, you can say something like "on locale number three," or you can use it in a data-driven manner and say "on the locale that's storing variable X." And then any code within that compound statement will be executed on that locale. And so this is the primary way in which you move computation around the machine: you use these on-clauses to say, go do this over there. You can also use these on-clauses to allocate data on different locales. And then there's also a high-level concept called a distribution — similar to distributions in the HPF and ZPL languages of the 90s — that allows you to talk about taking an entire array and distributing it. So that's how you distribute your data across the machine. And then the communication itself happens just by naming variables that happen to be on a different locale. So for example, if I declare a variable X and then I use an on-clause to move my task to a different locale, it can still refer to X, because X is still in its lexical scope, but that reference to X is going to require puts and gets over the network to store and load that variable. I guess what's really interesting is how you're doing that — it's more like data-centric communication rather than just byte-address-centric communication.
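An editor's sketch of the on-clause forms Brad mentions, including the data-driven variant and a remote reference to a lexically visible variable:

```chapel
var x = 1.0;                       // allocated on locale 0

on Locales[numLocales-1] {         // move this task to the last locale
  var y = 2.0;                     // allocated there
  x += y;                          // x is still lexically in scope; this read
}                                  // and write become a get and a put over the network

var A: [1..100] real;
on A[50] do                        // data-driven: run wherever A[50] is stored
  writeln("A[50] lives on locale ", here.id);
```

No send, receive, put, or get appears in the source; the compiler and runtime insert the communication implied by where each variable lives.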
But what I was actually getting at is: you're using the source-to-source compiler so that you're utilizing an already well-optimized C compiler to produce your machine language. Are you doing a similar thing for communication? Like, are you just relying on existing optimized MPI methods, or is this something that relies specifically on, like, a Cray torus network, and thus I could never run it on something else? I see, I see. Okay. So in terms of the communication, as I mentioned, we have this runtime communication layer, and it supports things like puts and gets and active messages. And you can map that down to any underlying technology that supports them. So traditionally, and by default, we tend to use the GASNet communication layer that comes out of Berkeley, because it has very good support for puts and gets and active messages, which are the three main constructs we need. And so they've taken on the burden of doing a lot of the work of tuning this down to specific network hardware capabilities, as well as providing very general implementations that run over UDP or MPI or things like that. So typically that's the approach we take for communication. We do have an implementation that's targeted directly to Cray networks, which bypasses GASNet and goes directly down to the network, since that's something we know about in-house. And we've talked a lot with the MPI-3 team, particularly at Argonne, about doing an implementation over MPI-3 to make use of some of the new one-sided communication constructs that have been introduced in that draft of MPI. Maybe taking your question a little bit further, about portability: a lot of people, when they hear about Chapel, sort of assume that it's only being developed for Cray systems, because Cray's leading the development.
But we recognize within the team that that's sort of a clear dead end for a language — being supported only on a single machine — because people expect their language to be portable across multiple generations of machines and multiple vendors, and to work on their laptop and their cluster. So both in the design of the Chapel language itself and in its implementation, we've really emphasized portability greatly. And you can run it on pretty much any machine that has a C compiler, POSIX threads, and GASNet. So speaking of that, what transports and what environments does Chapel support? You mentioned, obviously, it works well on Cray machines, and you made passing references to MPI and GASNet and other things. What operating systems, what transports do you support? So in terms of operating systems, we pretty much support any kind of Unix-like operating system, and then also the Cygwin environment, which I'm not sure how you would categorize. In terms of transport layers, as Brad said, we primarily target our communication to GASNet. So that supports generic UDP and MPI, and all the other sort of native transports on the big machines. And other than that, we have a custom in-house version of the communication layer for Cray machines. We have in the past supported MPI directly, and a couple of other interfaces, but we don't support them in our releases right now. Yes, we used to have a direct port of our communication API down to MPI — the argument being, kind of, for portability's sake — but because GASNet has a conduit that builds on top of MPI, it felt sort of redundant, in the sense that if you're going to rely on MPI as the underlying communication substrate, you might as well use GASNet's support instead of us maintaining two ports. So that's why we sort of let that retire.
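As a concrete sketch of how those layers get selected in practice, Chapel's build and runtime are configured through environment variables. The variable names below are from releases of that era and are shown as an editor's illustration; check the current documentation for specifics:

```shell
# Single-locale runs: no communication layer at all
export CHPL_COMM=none

# Multi-locale over GASNet, choosing a conduit/substrate
export CHPL_COMM=gasnet
export CHPL_COMM_SUBSTRATE=udp    # portable default; or mpi, ibv, etc.

# The tasking layer and target C compiler are chosen the same way
export CHPL_TASKS=fifo            # POSIX-threads-based tasking layer
export CHPL_TARGET_COMPILER=gnu
```

Swapping communication substrates this way is what lets the same Chapel program run over UDP on a laptop cluster or over a vendor-tuned conduit on a big machine.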
As I may have mentioned in a previous question, one of the things we've been exploring with the MPICH team at Argonne is whether the new MPI extensions would be sufficient to do kind of a native port to MPI that makes use of the puts and gets that have been proposed for MPI-3 — or, I guess, accepted into MPI-3 — and whether with that we'd get performance comparable to GASNet. So with your Chapel-to-C translator: C is a language, but not all C compilers are the same. Are you targeting a specific one? Like, which C compiler should I have around if I wanted to get Chapel going? We expend a lot of effort making sure that the C we generate is as portable as possible. So for example, when we're doing our own development, we use GCC a lot, just because it's convenient, but we turn on all those warning flags so we get all the warnings, and we basically try to toe the line of the C99 standard as closely as possible. The other piece of that is that in our nightly regression tests, we test against every C compiler that we have available to us. So in-house, that's like GNU, Intel, PGI, and Cray's compilers — we have a couple of them. And then we have users who are always trying it out on things like IBM's. So pretty much we do the best job we can to generate the most portable C code possible, and we do a pretty good job of that for the most part. All right, now you mentioned that this is actually a fairly old effort. I think you said you started back in the 2002, 2003 kind of timeframe. What kind of shape is the system in now? Can someone download it and use it, and does it give good performance on, say, clusters of Linux workstations versus a Cray, and things like that? You talked about the portability; you didn't really talk too much about the performance portability. I'm sorry, I threw in a whole bunch of questions there. I hope you can make some sense of that. Yeah, that's fine. So yeah, it has been a long-standing effort, although designing any new language takes a long time.
So the first several years of the project were basically spent trying to figure out exactly what we wanted to build. We knew we wanted to do a new language; we knew we wanted it to be more productive; but it took a long time to really figure out exactly what should be in the language and what should not. That said, as far as where it stands today: you can download it — it's posted on the SourceForge page, and I'm sure we'll put the link to that on the webpage for this podcast. You can basically download it and build it, as Sung said, on pretty much any Unix system that has a fairly standard C compiler, with maybe some minimal requirements on POSIX threads, I guess. As far as performance: performance can be really hit-or-miss with Chapel today. There are a number of optimizations that we have planned, that we know we need to do; we just haven't gotten to them yet. Generally speaking, our implementation approach has been to implement the features that we think are key and get them working right first, so that we can get them out to potential users and get feedback from them as to whether these are the features that they want and need — is this a language that would be useful to them if it became production-grade? And that's been really useful, because we've gotten a lot of great feedback and improved the language based on those initial users who've been kicking the tires. And then once a feature is sort of figured out, we go back and work on improving its performance. And we've done a number of things from a research perspective in the language that have maybe given us challenges from a performance perspective. One of them is that all arrays in the language, whether local to a single locale or distributed across multiple locales, are written in Chapel, using Chapel itself. And that's the kind of thing that we think is necessary to be a very general, flexible, general-purpose language.
We think that's one of the things that killed HPF and ZPL in the '90s: not being able to specify your own array distributions. But at the same time, that takes a lot of knowledge away from the compiler, which means you have to make up some ground in order to get your array performance back to what you'd get if you were doing it by hand in MPI, say.

So maybe just to wrap up this answer: we usually encourage people, if they're interested in Chapel, to download it, kick the tires, try out some key items that matter to them, and see whether they like the features; to not judge it based on the performance today, but to sort of extrapolate: what would this be like if we continued working on it? So the argument we make is that if any language is going to come out that's truly revolutionary and truly generalizes parallel programming and makes it more productive, it's not going to show up fully formed on day one with perfect performance. So, you know, you can either work on it in secret for years or you can work on it out in the open and show people your dirty laundry. A lot of people today try it, they see bad performance, and they give up. But most people we ask, like, okay, are there sort of key things that you think will prevent it from getting to good performance as we put more time into it? Most people can't really point to much that's sort of inherently flawed; it's just a matter of putting in the time to optimize it, and that's something worth continuing to do.

So what does the long-term support for Chapel look like? I mean, is DARPA still putting money into this, and is there like another 10-years-guaranteed plan? Like, if I develop something and keep iterating on it, will I still have a working Chapel install 10 years from now?

That's a really good question. So we're at an interesting juncture right now, where the HPCS program just wrapped up this past fall.
I think we finished our final deliverables within the Chapel team in October, and there were some things at Supercomputing last year to kind of tie a nice bow around the HPCS program. And we know that both Cray and our users are really interested in seeing Chapel go forward, and we're currently trying to establish what that's gonna look like. What's the funding model gonna be? What's the support model gonna be? And we're sort of in the middle of figuring that all out. So I hope we'll have a more concrete answer to what the precise future looks like for Chapel sort of by Supercomputing at the very latest, although hopefully much sooner. But at this point it's a little bit cloudy, other than to say that everyone seems cautiously optimistic that something will keep it going forward.

The other thing, I guess, as far as your ability to rely on it going forward: everything we've done so far is open source. So if, for example, Cray pulled the plug on it, there's nothing to prevent someone else from running with it outside of Cray. And that's, I think, an important part of designing any new language: keeping it open source, so that the end of life of one aspect of the project doesn't mean the end of life for the project itself. That's another reason we've developed these collaborations: to sort of grow a community for Chapel that's not just Cray-centric, to have more people outside of it.

Now, you mentioned that Chapel is open source. What is its specific license?

It's currently licensed under a BSD license, so it's quite permissive as far as how it can be used. We've talked at times about switching to an Apache license, the main reason being that Apache comes not only with a nice permissive license but also with a sort of formal contributor agreement. BSD doesn't really have that. So at times there've been questions about what's the best way to manage contributor agreements for developers on Chapel.
We haven't made that switch yet, but we're always sort of thinking about it, particularly if it would allow someone to contribute who couldn't contribute under the BSD license.

Okay, you mentioned earlier, you mentioned locality for GPUs and stuff. Does the Chapel language itself make use of, can you compile Chapel down to CUDA, or Chapel down to magic Intel preprocessor statements for Phis? Like, can it make 20,000 threads on a GPU? Like, does it have a way of handling these?

Yeah, so that's an excellent question, one we've been getting a lot recently, as you can imagine. The answer today is sort of yes and no. The main thing we've done with GPUs so far has been a collaborative effort with a student at the University of Illinois named Albert Sidelnik. He interned with us one summer and then went back to school and continued working on a port of Chapel to GPUs. But essentially what he did was, I mentioned you can build your own arrays in Chapel; he basically built an array type that targeted GPUs. So if you declared an array of that type and did operations over it, that would result in the computation and the array being stored on the GPU. So that was a really good proof of concept, and that exists in a branch of our source tree. But it's never really been hardened to the point that we've incorporated it fully into trunk. And within the core team, we haven't done much with GPUs, primarily because under the DARPA HPCS program, the Cascade architecture that we were developing, at least as defined out of the program, didn't have GPUs in it. Now, since then, of course, Cray has been putting GPUs in the products we've been looking at, and MICs as well going into the future.
So this is clearly an important part of Cray's roadmap, as well as other vendors', and something that we need to play catch-up on. So one of our goals for the coming year is to basically take some of the ideas that Albert worked on at the University of Illinois and combine them with the hierarchical locale concept I mentioned before: the idea of putting locales within locales to talk about different types of resources within a compute node, and to basically start talking about targeting a GPU locale. So our goal there is to take those two technologies, bring them together, and bring it up to the point that we can put it into trunk and make it part of the main release.

So does that mean, with this locale-within-locale idea, if it's gonna handle GPUs well, does that also mean that Chapel in general, with the locales being different islands of characteristics or performance, would work very well in a truly heterogeneous environment of any type?

So it's certainly designed to. Again, with this theme of talking in terms of locality and parallelism, and not talking about specific capabilities like one-sided or two-sided message passing or specific granularities of parallelism, the idea is that it would be a very general language and be able to support heterogeneity. And we've had some initial experiences along those lines. We had an implementation for a while that was running part on a Linux machine and part of it on another machine. That's an effort that has sort of fallen by the wayside, but certainly the language itself is capable of doing that. There's obviously a certain amount of engineering required to make that really robust.

So with all this work that's going on, and understanding that Chapel is still relatively early, is there any real-world use of this out there? Are there people using Chapel for actual scientific computations, or is it still more in the trial, kick-the-tires kind of stage?
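To make the locale idea above concrete, here is a minimal Chapel sketch of the kind of code that drives work to specific locales; this is the flat (one locale per compute node) model that existed at the time of this conversation, with hierarchical sub-locales for GPUs being the future work Brad describes:

```chapel
// Run one task per locale; 'on' moves execution (and data allocation)
// to that locale, and 'here' names wherever the current task is running.
coforall loc in Locales do
  on loc {
    writeln("hello from locale ", here.id, " of ", numLocales);
  }
```

The hierarchical-locale proposal would extend this so that a locale could itself contain sub-locales (say, a GPU alongside the host cores), addressable with the same on-clause style rather than with CUDA- or MIC-specific constructs.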
I think it's probably accurate to say it's primarily in the trial stage. But the important thing to say is that we have a number of very interested users, both within the government sector, DOE and DOD, and in industry, although the industry people normally don't want us to say who they are. So there are a number of users who are really anxious to use this in a production setting, but of course they need the performance and the robustness of the implementation to catch up a bit before they're able to put their full weight on it. So I'd say the interest is there, and certainly there's lots of tire-kicking going on, with increasingly large and interesting codes as time goes on. But to my knowledge, anyway, I'm not aware of, say, production applications in government or industry that are relying on Chapel today. And frankly, I think that's appropriate; it would worry me a little bit if that were the case, given where we are today.

Okay, now, something I tend to ask a lot of software developers, and it's pure curiosity on my part, simply because I love to hear the answers to this: what source control system do you use, and why?

We use Subversion; it's hosted at SourceForge. And as for the why, that's a really good question. I think Subversion is a really simple model to understand, and it does the job for us. We have a number of contributors, particularly outside of Cray, who really advocated for moving to something like Git, and we have not done that yet, in part just because of the effort required to switch between different systems. But in part also, I just haven't really fully gotten my head around Git yet; maybe I'm too old for it or something. I have this sort of fear of not having a really well-defined trunk that I'm gonna use for my nightly testing, and I understand you can set that up by convention in Git, but the fluidness of Git makes me nervous about it.
We do have some developers who use Git to create branches off of our Subversion tree, do some work there, and then contribute patches back to the Subversion tree, and that seems to work really well for them, maybe to the point that we'll stick with Subversion for the trunk for the foreseeable future.

Okay, so for the last thing here: what's coming in the future? It seems like it's still incubating to some degree. What are some of the short-term things for Chapel, and what are the long-term things for Chapel?

Well, as I sort of alluded to, one of the big things for us is figuring out what the Chapel program looks like going on from here. And part of that is how do we want to continue to move forward with our team here at Cray? But there's also a question of how do we transition Chapel from being very much a Cray-driven technology to much more of a community-owned technology? So ultimately, it's our intention that there be an external Chapel foundation that's independent of Cray, and that invested users and developers are bought into that and have more say over the direction of the language. But figuring out how we get from here to there is definitely an organizational challenge going forward. And then on the technical side, we talked about the desire and need to target GPUs and MICs and other emerging processor technologies, so that's gonna be a big emphasis going forward. Improving the performance is always an emphasis and something we need to spend more time on, so that's gonna be an emphasis going forward. And then going back and backfilling some of the features that were maybe in the dark corners for HPCS, not on the critical path, but are crucial to end users, is sort of a third big theme for the coming years.

Okay, well, thank you very much for your time. Where can people find more information and get involved with the open-source project for Chapel?
The single point to know about for getting involved with Chapel is chapel.cray.com. That's our main website, and it has a lot of information: presentations, tutorials, things like that. It's also got the link to where you can download it, and it's got our contact information. So that's your one-stop shop for learning more about Chapel.

Okay, well, thank you very much for your time. Thank you. Yeah, we appreciate it. Thank you.