Welcome to another edition of RCE. Again, this is Brock Palin. You can find us online, subscribe, and find the entire back catalog of over a hundred episodes on research computing at rce-cast.com. I have again with me Jeff Squyres of Cisco Systems, one of the authors of Open MPI. Jeff, thanks again for your time.

Hey Brock. I guess we're hitting mid to late spring here, and that seems like a perfect time to me to talk about electrodynamics, don't you think?

Of course. It's always a good time to talk about that. So we have with us some of the team that's behind Meep. Guys, why don't you take a moment to introduce yourselves?

I'm Steven Johnson. I'm a professor of applied math and physics at MIT.

I'm Ardavan Oskooi. I was formerly a student here at MIT with Professor Johnson, and I'm currently leading a startup out of San Francisco commercializing Meep, called Simpetus.

Okay, so can you give us a background on what Meep slash Simpetus is? I think it's Simpetus?

Yeah, Simpetus.

So, you know, what is it?

So Meep is a program to simulate Maxwell's equations, the equations of electromagnetism. There are lots and lots of computational methods for electromagnetism, and Meep uses one of the most general and basic methods, called the finite-difference time-domain method. It just discretizes space and marches Maxwell's equations forward in time. Because it's handling the full time-dependent Maxwell's equations, it can include lots of different kinds of physics, and it's a very general code, with the trade-off that it's maybe not as efficient as some more specialized codes that only solve one small class of problems.

So, Dr. Johnson, you were on this show before about FFTW, which is probably what you're best known for. There's a relationship between FFTW and Meep, isn't there?
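Editor's sketch: the finite-difference time-domain idea described above (discretize space, then march Maxwell's equations forward in time) can be illustrated in one dimension with a few lines of plain Python. This is a toy written for this transcript, not Meep's implementation or API; the function name `fdtd_1d`, the grid size, the Courant factor of 0.5, and the Gaussian source are all assumptions chosen for illustration.

```python
import math


def fdtd_1d(n_cells=200, n_steps=150, courant=0.5, source_cell=100):
    """March a 1D vacuum grid forward in time (units with c = 1, dx = 1).

    Hypothetical toy example: ez and hy live on staggered grid points and
    are updated in alternation, which is the basic leapfrog idea of FDTD.
    """
    ez = [0.0] * n_cells          # electric field at integer grid points
    hy = [0.0] * (n_cells - 1)    # magnetic field between neighbouring ez points
    for step in range(n_steps):
        # Update H from the spatial difference of E.
        for i in range(n_cells - 1):
            hy[i] += courant * (ez[i + 1] - ez[i])
        # Update E from the spatial difference of H.
        # The two end points are never updated, so they act as reflecting walls.
        for i in range(1, n_cells - 1):
            ez[i] += courant * (hy[i] - hy[i - 1])
        # Soft source: add a Gaussian pulse in time at a single cell.
        ez[source_cell] += math.exp(-((step - 30) / 10.0) ** 2)
    return ez


ez = fdtd_1d()
```

With these defaults, `ez` holds two pulses travelling away from the source cell in opposite directions; neither has reached the walls yet.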
Very tenuous. So FFTW is actually used for another electromagnetic simulation code that I developed, called MPB, which is less general than Meep; it only solves one kind of problem. It finds the harmonic modes, or resonant modes, of Maxwell's equations, and it uses Fourier transforms and a Fourier-transform basis, and so it uses FFTW. In fact, FFTW was developed in part for MPB. Meep solves a more general class of problems, and so it doesn't use Fourier transforms directly, but it can actually call MPB for part of its features, and then MPB calls FFTW. So there is a connection, but it's a little bit indirect.

Okay. But Meep is actually an acronym, right? What does Meep stand for?

I think it originally stood for MIT Electromagnetic Equation Propagation. We've invented lots of other meanings for the acronym, like Maxwell's Equations for Every Person, and other things like that. But like most acronyms, mostly people forget what it stands for and they just call it Meep.

Okay, so you talked about Maxwell's equations and whatnot, but break it down for those of us who are not deep into the science here. What kind of physical phenomena does this simulate? How would I explain this to, say, my mother?

So, you know, Maxwell's equations cover all of electricity and magnetism: circuits, and also light and optics, which are electromagnetic waves, and all sorts of things like plasmas can fall into this category as well. Mainly, Meep is oriented towards modeling wave propagation problems.
So problems, like I said, in optics: modeling lasers, modeling sensors, modeling all sorts of things like optical fibers and filters and other kinds of phenomena there. Also microwave effects; basically you can model microwaves and microwave devices and other kinds of things. But I think most of the people using Meep are using it for optical and infrared simulations.

So how would those types of simulations map into a real-world product, something people actually use?

I can give a few examples here. One emerging area is display applications. You guys might have heard of organic LEDs. They've recently been introduced by Apple in the latest-generation iPhone X, but they're also increasingly prevalent in solid-state lighting applications. One of the big challenges in organic LEDs is that much of the light that's generated by these organic molecules actually remains confined within the device. So there's a lot of research within companies such as Samsung, LG, and Nichia, a Japanese company, to design structures that can extract the light and increase the energy efficiency of these organic LED devices. That's one application that involves quite a bit of optical simulation based on finite-difference time-domain tools such as Meep and other solvers. Other applications involving photonics involve integrated-circuit applications for data centers. Many of the listeners of this podcast may be familiar with bottlenecks related to data transfer within large data centers, and there's been a lot of talk recently about transitioning from a copper interconnect technology to a light-based optical fiber technology. At the core of that are essentially optical circuits that can convert electrons into photons, and those involve quite a bit of optical design and simulation tools such as Meep and others.

So, getting light onto a chip from an optical fiber, and into a circuit, and back off. Also solar cells. Like, you know, if you're
trying to make a solar cell more efficient, very often what you do nowadays is you take a thin solar cell layer and you texture it, or you put microstructures on it, to scatter the light and make it absorb more efficiently. And to design that, you need lots and lots of simulations.

So who uses this software? You mentioned a bunch of brand names in there, and I work for a very large networking company too, Cisco. Do we use these things to develop our hardware? Forgive me, I actually don't know, because I'm a software guy, so I don't work much lower down at the hardware level.

Well, I can say with high confidence that most large corporations developing display applications, or sensors involving light, and, as Steven mentioned, solar cells, are all using electromagnetic simulation tools really at the core of their design. More recently there are applications involving augmented reality. You may be following developments in Google Glass and its successor, Daydream, and Oculus for virtual reality. All of these have involved really intensive research into next-generation display technologies, and those involve lots and lots of optical simulations as well.

And there are lots of simulation methods and lots of simulation software out there, so sometimes it can be a little hard to track down who is using what. For example, Meep is a time-domain simulation, and there are actually several of those. Meep is completely open-source free software, but there are also closed-source proprietary simulation packages that use similar algorithms, from a whole bunch of companies, and different people are using different software. Initially, I think Meep's initial foothold was more in the research world, where, when you're doing research and you're pushing the edge of what's possible with optics, you often need to be able to get inside the code and really change what's happening. And that's only possible with an open-source
tool. But now Ardavan's company is working much more with a lot of different commercial vendors that are starting to use this kind of thing.

So you've talked a lot about organic LEDs, where, for things like displays, you want to be in the visible range. You also mentioned, I think it was, infrared. What about extreme parts of the spectrum? Is this code completely generalizable? Could I use it for extremely high frequency, almost like radio-type frequencies?

Well, radio is not high. It's actually low.

Yeah, I got that backwards. But okay, is it purely generalizable, or do you kind of assume you want to stay focused on this optical range, from near infrared to each side of the visible? Do you kind of want to stay in that range?

No, it can work for everything from radio frequencies to X-rays. As a theorist, I usually work in dimensionless units; I set my wavelength to one, always. So in some sense the theory is the same whether you're at radio frequencies or at X-rays. The difference is the materials. At really long wavelengths, microwaves and radio frequencies, where you have really, really good conductors and the conductivity is really high, you can still use this code, but sometimes there are other methods that take advantage of that really high conductivity to be more efficient for metallic structures. At really short wavelengths, once you start getting to X-rays, Meep is almost overkill, because at X-rays almost everything is transparent. That's why they use X-rays to look through you. So there isn't as much to do in terms of doing scattering calculations. Meep can certainly handle it, but at that point you're mostly modeling light propagation through empty space.
It's overkill to use a big simulation for that. So the reason I mentioned that most people are using it for infrared and optical frequencies is that that's where most of the interesting photonic design is going on. If you're at microwave frequencies, or at really long wavelengths, hundred-kilometer wavelengths for radio waves, then you're doing circuits, right? And it's overkill to do the full Maxwell's equations for a circuit. And if you want to make a waveguide at microwave frequencies, where you want to send light along a channel, you just make a metal tube. You don't need a complicated geometry. And again, at the other end of the spectrum, if you're at X-rays, there's really not much to do with optical design, because it goes through just about everything. So the interesting part is optical and infrared, and that's why people need lots of simulation tools there.

So you mentioned MPB, MIT Photonic Bands, which was another code you had done, and you mentioned that Meep is able to call MPB, but you said that Meep is more generalizable. So what does MPB do that you'd still want to call it, versus just using Meep? Why would you want to choose one package or the other?

So, you know, usually, and this is in all scientific computation, there's a trade-off between generality, performance, and ease of use. If you have something that just does one kind of calculation, it can be very fast and very easy to use for that thing. MPB is one of those things that just does one thing: it just computes what are called the harmonic modes of a structure.
So if you have a waveguide, an optical channel, it will very efficiently compute what those modes are, what the field patterns are that just propagate down that waveguide at a fixed frequency. Meep is much more general. You can have things that don't really have modes: they have nonlinearities, or things are moving around in time so the frequency is changing. It's extremely powerful in general, but the trade-off is that if you just want to compute the modes of an optical fiber, you can do it with Meep, but it'll be slower and a little bit more annoying to use than something that's specialized for just that problem. And the reason that Meep calls MPB is that very often, if you're doing a calculation where, for example, you're trying to couple an optical fiber into a chip, you want to start off the simulation by sending in one of the modes of that fiber. So your initial condition, in some sense your source, is one of the modes of that fiber. What Meep will do is call MPB to compute that mode and use it as the source. MPB will tell it, oh, this is the fundamental mode of that optical fiber, and then Meep will use that information and launch that mode into the chip. And then you'll see what it does: the mode comes in and it bounces around, or gets converted into something else, and so forth, and Meep will do all those dynamics.

Now, one of the obvious challenges that we have in lots of scientific computing is that on a single server you can only do so much, and typical solutions to this are the MPI types of solutions. Does Meep support MPI? Do you guys scale up that way?

Yes.
Yes. So in fact, one of the things that Ardavan's company does is help people do that, by running Meep on the cloud, like on an Amazon cloud server.

Okay. So what's the biggest scale that you guys have done? Or I guess maybe a better question would be, what is a typical scale that your customers run on?

Well, let me give some context to this, which is that the range in the size of the problems that Meep calculations involve can be huge. You can run calculations on your desktop with a few cores, all the way to hundreds of cores or perhaps even thousands of cores. That range in the computational needs really demands, and kind of inspired us to start this company around, a scalable resource like a public cloud infrastructure, where you can access on demand a resource that's sized just for the application of interest. Particularly for a small startup or a small team that can't afford to build their own local cluster and maintain it, having access to an Amazon cloud or a Google cloud infrastructure really enables them to leverage Meep and other open-source tools in ways that would be very difficult if these tools were not available.

So that sounds exactly in line with my expectations of using Amazon and Google for their infrastructure; it's perfect for exactly this. But there have been some traditional challenges doing HPC in the cloud, such as the data transfer on and off, and the lack of high-speed networking in the cloud. How do you guys address that?

Well, the applications that we've typically been looking at recently haven't been pushing the limits of the interconnect capabilities of these resources yet, so we haven't hit communications bottlenecks yet. Of course they are there; it's just that the applications that we've been running have been fairly well constrained,
I would say. But certainly, having a dedicated HPC infrastructure for these kinds of calculations would obviously be important for performance reasons.

Yeah, of course. So for some of the big calculations we do, you get time on a traditional supercomputer as well. A lot of the time you're running simulations on a few cores that can even fit on a single machine. And then if you need lots and lots of cores, it's often because you're doing parameter sweeps or some kind of optimization, where you have lots of things that are embarrassingly parallel, that don't even need to talk to one another, so you just need to run a thousand instances of them. Usually, in day-to-day design work, people are doing relatively small simulations, and then every once in a while, when they've got everything working, they put everything together and throw it at a huge machine with hundreds or thousands of cores.

Okay, you actually answered my question. I was going to ask if you could optimize for total time to solution, because you do have the ability to sweep over frequency and maybe subdivide in ways where you're not as worried about getting a single large run done quickly, when you're more interested in a range of frequencies, or a range of time, or something like that, where you could actually subdivide. So it sounds like you're already doing that.

Well, first of all, Meep is a time-domain code, so you're not putting in a single frequency. It actually automatically gives you more: if you want multiple frequencies, you put in a pulse in time and then you Fourier transform the result, and it gives you all the frequencies in one simulation. That's actually one of the big advantages of doing a time-domain simulation. In cases like solar cells, for example, where you're really interested in a broad bandwidth, the whole visible
spectrum and more, you can get that entire bandwidth in one simulation. You also can't parallelize over time, because the time dimension is serial. You have to do the earlier times before you can do the later times, so you can't do those in parallel. You can only really parallelize over space, where you chop up space into pieces and they still have to talk to one another, or over other parameters. For example, in engineering design, you're usually not just doing one structure. You're looking at a whole family of structures, and you want to see the effect of this parameter or that parameter, the radius of this waveguide or the height of that other structure, and so you're doing a whole bunch of simulations in parallel. Those things parallelize perfectly, of course.

So you've mentioned a couple of different cloud providers there. Are you finding that the companies or the people you're working with kind of drive that decision, or are you providing almost like a "hey, for the type of simulation you want to do, we find this works better over here"? What's driving that decision about which provider to choose?
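Editor's sketch of the broadband point made in the previous answer: a short pulse contains a whole band of frequencies, so Fourier transforming one time series recovers the response at many frequencies at once. The snippet below is plain Python, not Meep code; the pulse shape, the parameter values, and the helper names `gaussian_pulse` and `spectrum` are assumptions made for illustration.

```python
import cmath
import math


def gaussian_pulse(n_steps, dt, f0, width):
    """Sample a Gaussian-envelope pulse centred on frequency f0 (dimensionless units)."""
    t0 = 5.0 * width  # delay so the pulse starts from an essentially zero amplitude
    samples = []
    for n in range(n_steps):
        t = n * dt - t0
        samples.append(math.exp(-(t / width) ** 2) * math.cos(2 * math.pi * f0 * t))
    return samples


def spectrum(signal, dt, freqs):
    """Accumulate a Fourier transform of the time series at each requested frequency."""
    sums = [0j] * len(freqs)
    for n, value in enumerate(signal):      # a single pass over the time series
        for k, f in enumerate(freqs):
            sums[k] += value * cmath.exp(2j * math.pi * f * n * dt) * dt
    return [abs(s) for s in sums]


dt = 0.05
signal = gaussian_pulse(n_steps=800, dt=dt, f0=1.0, width=1.0)
freqs = [0.6, 0.8, 1.0, 1.2, 1.4]
amps = spectrum(signal, dt, freqs)
```

The amplitudes in `amps` are largest at the centre frequency and fall off smoothly on either side, which is the band that a single pulsed run probes.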
Well, to be honest, we're just using Amazon at this point. We've been asked whether we provide our offerings through Google Cloud or Microsoft, but we haven't gone in those directions yet, only because AWS is really cheap. You guys and your listeners might be familiar with spot instances. This is a situation where you can rent a multi-core virtual machine for literally pennies per hour. And the scale of Amazon's infrastructure is just so much larger than its competitors' at this point, and cost considerations are really foremost among our customers. It's more important for them to access the cheapest machines than the most high-performance, most spec'd-out computers.

Okay. I was thinking more along the lines of whether there was an interest because they had existing data, existing infrastructure, or existing agreements, because I assume some of this work is proprietary when you're talking about working with companies. But that really hasn't come up yet with the clients you're working with? They're happy to trust you with Amazon?

People also run it on their own machines, of course. You can download and compile it. People have their own clusters, or run it on their own supercomputer time if they have it. There are lots of people that do that. Actually, that was everyone until very recently, until Ardavan started up this thing with Simpetus. So actually, when did you start that up?
I can't remember.

Simpetus actually started in July of 2015, and it really started because we felt that we had been working on Meep for almost a decade.

More than a decade.

More than a decade. And in fact, even before Meep, Steven, as he mentioned, was working on MPB. So we had this set of really powerful open-source solvers, and we really felt that we wanted to take them to the next level and to give enterprises and companies who are doing real product design access to the most cutting-edge solvers that didn't have these licensing restrictions. As you know, licensing commercial simulation packages is very expensive, and for some companies the budget for accessing simulation software is a non-trivial fraction of their overall costs. So the idea was, what if they could leverage the very best in open-source software, and we could support it with consulting and technical support, so that they could have assurance in being able to deploy these tools and improve their productivity?

So when a customer connects to the instances, or uses whatever tools you have there on EC2, what exactly do they get? Do they get pre-canned simulations where they just supply the input, or can they write their own applications and call your routines however they wish? How exactly does this work right now?
What we provide is the latest versions of Meep and MPB pre-installed on Ubuntu virtual machines, and in fact that's a free offering. Our customers can deploy this Amazon Machine Image, a virtual machine with these tools pre-installed, sized for their particular applications. Often they customize it: they add on their own tools, or they add on third-party solvers to create some custom toolchain, and sometimes we provide them with technical support for that. But at this point in time the tools are free, they deploy them for their specific needs, and we just provide them with some technical support.

Yes, and just in terms of inputs: basically, once you have the programs, you can run them on any geometry and any kind of device you want. It's scriptable. Actually, there's a funny story. It was scriptable initially using Scheme, and recently we added on a Python interface, and you can also call it from C++ as well. So you can basically write programs in any of those languages that control the simulation, input any geometry you want, allow it to change as a function of time, and extract any information you want from it. It's quite flexible.

So I have to ask the obvious question. Is it Scheme because of MIT?

No. So the story here is, before Meep came this other software, which still exists.
It's just more specialized, called MPB. I was developing that back in the late 90s, actually at the same time as FFTW. At that time, most of the simulation codes that we had were these large Fortran codes. People who are Fortran users will recognize this: every time you ran a new simulation you had to recompile the code, because the parameters of the simulation were put in as code parameters. Or it had some inscrutable text file full of numbers as its input, and, as usual for Fortran, these were usually space-sensitive, so you usually had to write a script to write the input file for the code. So I wanted to have my program be scriptable. At the time, in like 1997, Python was still not completely on people's radar screens for computational science, and the GNU standard for an add-on scripting language was Guile, which is a Scheme implementation. At the time that was one of the only languages that was really designed and documented to be something that you would add on to a C program and use to control that C program. As I said, if I had started a few years later, I probably would have used Python, or Lua, or something like that, but those languages weren't really on the radar screen at that time. I also of course knew Scheme, because I'm at MIT and I took Scheme as an undergraduate, and I actually knew Scheme when I was in high school as well. So I was familiar and comfortable with the language.
So I ended up using Guile and Scheme to script MPB, and then a few years later, in 2003, when we started working on Meep, it was natural to use the same scripting tools I had developed for controlling MPB for controlling Meep as well. So that was the interface for many, many years, the Scheme interface.

Okay, so let me ask you about Python, because Python is all the hotness these days. How well does Meep play with other Python numerical packages, like NumPy and the like?

So the Python interface is fairly recent. Actually, there was another Python interface that was done by another group, at Ghent University, a few years ago, but we wanted something that was a little bit more closely integrated with the core of Meep. So this was recently started, I guess in the last year, and Chris Hogan, who's here as well, was the lead on developing that. So yeah, it uses NumPy. For example, you can be running a Meep simulation, and at any point in time you can say, give me the fields, or a slice of the fields, as a NumPy array, and then I can plot it with Matplotlib. So you can use all of the Python tools. It uses NumPy, it uses h5py, the HDF5 Python interface, and it hooks in with the MPI library, so that you can use Python's MPI tools to talk to Meep's MPI stuff. So it's hooked in pretty well.

So give us a little background here. How did Meep get started? Because if I read your web pages correctly, this is a fairly mature project, right?
Yeah, it's been around since 2003; that was when it first got started. It was actually started by David Roundy. People may have heard of him because he wrote something called the Darcs version control system, which came before Git; it was one of the early distributed version control systems. It still exists, and he's currently a professor at Oregon State. He started this along with a couple of colleagues of mine, Mihai Ibanescu and Peter Bermel, who's now at Purdue, and I got involved very shortly afterwards as well. At the time, basically, in order to do research in electromagnetism you had to have access to the codes, and every group had its own musty old Fortran code for doing this kind of simulation. We were no exception; we had two different Fortran codes. So David started developing this. It was actually initially called Dactyl, for two-footed, because it was only 2D and only cylindrical coordinates. He started developing something that would handle cylindrical coordinates and be really scriptable as a C++ library, and then it ended up being so useful that he and we started adding on full support for other kinds of geometries, and it became our main time-domain simulation code after that.

So you mentioned that the code is open source, and that you were trying to bring this scalable, cutting-edge academic software to the commercial space and make it accessible, and that's part of the business around consulting: making it accessible, especially for small companies, utilizing the power of the cloud, while also supporting people's local systems. But on the licensing, I noticed FFTW says that you need to contact MIT if you want to get a commercial license for it. So can you clarify what license Meep itself is under, and is it under a similar type of arrangement?
So Meep is also GPL. The difference with FFTW is that FFTW is really only usable as a library. So if you want to use it and you're MATLAB or some other commercial code, and you don't want to open-source your entire commercial code, then you have to buy a non-GPL license from MIT. Meep is also, you know, C++ and Python and so forth, but so far people really haven't been integrating it into large commercial packages. You control it with little scripts that you never distribute. So the license doesn't really prevent you from using it in a commercial setting as is, under the GPL, because the GPL basically has no effect on you if you don't redistribute the code. If you just write a one-off script to use Meep for your device, it has no effect on you. So commercial entities don't need a special license just to use Meep or MPB. Ardavan's business model is not selling licenses; it's more selling consulting and other kinds of things. He can speak to that more than I can.

One of the things that we're really focusing on here, as I mentioned, is to make the simulations accessible to a much broader audience, and part of that involves offering access to the tools through Amazon, particularly for companies who don't have their own local clusters. But another aspect of what we're working on right now is to make the tool easier to use. One of the big challenges with engineering simulation tools is that they typically require PhD-level training to have confidence that the simulations are being set up correctly, and when the bar is set so high, it makes it very difficult for small companies, particularly startups, to really gain access to these kinds of cutting-edge technologies. So the focus at Simpetus is really to try to make the tools much more accessible by automating key functionality that accurate simulation requires. For example, choosing the right resolution in a Meep
simulation is non-trivial. It depends very much on the materials that are involved and the structures that are being used, and choosing the right resolution has an enormous impact on the accuracy of the result. So we're working on ways to essentially automate choosing the right resolution for a given application, in order to ensure confidence. Typically this involves a lot of trial and error and manual hand-tuning, which we're developing tools to automate. The other aspect is actually running these simulations in the cloud. For example, choosing a cluster that's sized for the application, in order to ensure very high throughput, is also non-trivial. To really leverage the cloud, you want to choose a cluster configuration that's sized just for that application, and that's also a non-trivial problem to deal with, particularly for new users who don't have experience. So again, we're developing tools that can be used to leverage the cloud to deploy Meep simulations in a way that's not possible today, and quite frankly, it's not possible with the other commercial solvers either.

So what is it about Meep? Earlier in our conversation you mentioned that there are a lot of other software packages out there, and clearly you're working to make it easy to use, make it available in a cloud-based environment, have consulting services, and things like that. But what about the Meep software itself? What makes it unique? Other than licensing issues and whatnot, why would I use Meep instead of some of the other packages?

One of the key challenges with
finite-difference time-domain solvers in general is the representation of materials on the actual grid, the volumetric grid. In very early research that we did when we were developing Meep, we realized that a lot depends on the choice of representation of these material geometries, on things like subpixel smoothing.

Yeah, so basically these algorithms work by dividing space onto a grid, and the question is, what do you do with the material that's at a boundary? You have a pixel; basically you're just chopping the geometry into pixels. What do you do with a pixel that crosses a boundary between two different materials, and how do you deal with that accurately? One of the things that Meep had early on was that we developed a unique algorithm for doing a special kind of averaging that makes it much more accurate in handling boundaries. So it has some unique accuracy features in its ability to handle discontinuous boundaries with high accuracy.

For example, why this is important is that this allows you to reduce the resolution considerably. Typically, in order to ensure very high accuracy, you really need to crank up the resolution, but the subpixel averaging method that we developed allows you to reduce the size of the simulations while still ensuring accuracy. This is particularly important for things like shape optimization, where you're varying the size or shape of your material geometry and you're doing tens, maybe hundreds, of calculations, and you want each simulation to be as small and as compact as possible, because you're exploring a large parameter space, for example. So this subpixel averaging really enables the application of Meep to problems that previously required lots of computational resources.

Yeah, and another unique thing is just the scriptability: the ability to call it as a C++ library, or call it from Python and hook into NumPy. The level of integration
that's possible there is, I think, relatively unusual, where a lot of the focus, especially in commercial codes, is typically on making nice GUI interfaces, which are also nice in their own right for certain audiences. But especially in a research setting or in a design setting, where you need to run lots and lots of different variations on a given design, it's really powerful to be able to program it.
I think we keep hearing one of your developers there in the background as well.
Yeah, well, my dog is in the background. She comes with me to the office a few days a week, and she's a little restless about not being able to play with me.
All right. Well, let me ask you a question that I ask all development projects on the podcast here: what version control system do you use, and why?
So initially we used Darcs, and it was because David Roundy started Meep and he wrote the Darcs version control system, and we used that for many years. I actually like Darcs a lot. I think it has a much better interface in many ways than Git, but at some point the advantage of being able to use GitHub especially was so overwhelming that we switched over. So I transferred all the history from Darcs over to Git. There are nice tools that let you convert repositories one way or the other. Actually, that was a little tricky, because we had been using Darcs from such an early version that the Darcs-to-Git migration tools didn't actually handle the early commits, so I had to patch them a little bit in order for that to work. So nowadays it's all Git, it's on GitHub. We use the GitHub, you know, version tracking, we use Travis CI, all the usual tools that people use these days.
Okay, so since Meep is open source, I assume you have a bit of a community behind it. Do you guys accept GitHub pull requests? How does someone get involved in the Meep project?
Yes. Yeah, so we do accept pull requests.
I mean, the way you get involved is the way you normally get involved with something on GitHub: you submit a pull request. Usually, if you're planning a feature, I would suggest first filing an issue to get some feedback on the design of that feature and how it fits in with other plans, and once there's the green light, then go ahead and submit a pull request with the code implementation. Most of the large patches at this point have been done by people who work directly with me or Ardavan, but there have been smaller contributions. And now that we're on GitHub, which is actually a relatively recent thing, in the last couple of years, there's starting to be a wider community of people who are actually contributing patches and things directly. People could submit patches before, of course, even without GitHub, but it was much rarer, I think, for people to get involved that way. It's much easier now that there are all these nice online tools for tracking contributions.
And where can people find contact information for your new company?
You can check out simpetus.com. Simpetus is actually a reference to simulations being an impetus for new discoveries and technologies. S-i-m-p-e-t-u-s.com.
Okay, well, thank you very much, guys. Thanks for your time.
Thank you. Thank you, guys.
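
A note on the subpixel averaging discussed in the conversation. The sketch below is a one-dimensional illustration only and is not Meep's implementation: the function names and the permittivity values (12 and 1) are assumptions made up for this example. It compares assigning one material per pixel with two simple averages weighted by how much of the pixel each material fills. Meep's published scheme, as far as we understand it, combines the plain mean and the harmonic mean into an average that depends on the orientation of the interface; the sketch shows only those two scalar ingredients.

```python
# Illustrative sketch only. Every name here is hypothetical and none of it
# is part of Meep's API. Material A fills x < interface, material B the rest.

def fill_fraction(left, right, interface):
    """Fraction of the pixel [left, right] occupied by material A."""
    if interface <= left:
        return 0.0
    if interface >= right:
        return 1.0
    return (interface - left) / (right - left)


def staircase_eps(left, right, interface, eps_a, eps_b):
    """No smoothing: use whichever material contains the pixel center."""
    center = 0.5 * (left + right)
    return eps_a if center < interface else eps_b


def mean_eps(left, right, interface, eps_a, eps_b):
    """Fill-weighted arithmetic mean of the two permittivities."""
    f = fill_fraction(left, right, interface)
    return f * eps_a + (1.0 - f) * eps_b


def harmonic_mean_eps(left, right, interface, eps_a, eps_b):
    """Fill-weighted harmonic mean (inverse of the mean of 1/eps)."""
    f = fill_fraction(left, right, interface)
    return 1.0 / (f / eps_a + (1.0 - f) / eps_b)


# Slide the interface across a single pixel spanning [0, 1].
EPS_A, EPS_B = 12.0, 1.0  # assumed example values
for interface in (0.10, 0.49, 0.51, 0.90):
    print(
        interface,
        staircase_eps(0.0, 1.0, interface, EPS_A, EPS_B),
        round(mean_eps(0.0, 1.0, interface, EPS_A, EPS_B), 3),
        round(harmonic_mean_eps(0.0, 1.0, interface, EPS_A, EPS_B), 3),
    )
```

Moving the interface from 0.49 to 0.51 flips the unsmoothed pixel from one material to the other, while both averaged values change only slightly. That smooth response to small geometry changes is the property that matters for the shape optimization use case mentioned in the conversation.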