Okay, so my name is Monica Wahee, and today we're going to talk about migrating somewhat from SAS to R. Why am I saying "somewhat"? Well, if you're a SAS user like I am, you're probably a SAS user because you're in a field or at a workplace where they have either a SAS server or a bunch of SAS data. The reason data in SAS format is important is that when SAS reads in data that's in its own format, whether it's PC SAS reading in a sas7bdat file or you're on the server, SAS is happier, because SAS's data formats include a lot of special information, like indexing information. You know how in Microsoft Word there are secret codes in the file that you can look at if you want? Well, SAS does that with its data sets: they include some secret code that SAS can decode, so it goes really fast. If you stay on the SAS server, anything you do, PROC FREQ, whatever, goes fast. The problem is SAS is not really good at everything. It's really good at analytics; it's not good at graphing. Now, every time I talk to SAS, they say, "We're good at graphing," but let's just face it, they're not as good as Python or R. So the bottom line is, if you're a SAS user, you'll probably always be a SAS user. I don't think SAS is going to go away, even though there are a lot of better things for certain functions. SAS really is that big analytics engine. I challenge you to do some big survival analysis model or an AI model on big data using Python or R. It probably won't even work. In fact, the customer I was just working with has some microbiology data, and she was having trouble in Python. She's like, "Monica, it takes hours to run." And I'm like, yeah, I mean, it looks cool when it's done, but that's the whole point of SAS, right? That's the whole point of SAS servers.
If you've got big data, genomics, microbiology, whatever, you can do it in SAS. But what if you want to make it look beautiful? That's actually what the customer's doing. She's like, "I want to make it look gorgeous," so that's why she's using Python. So if you're a SAS user, SAS doesn't have to start crying; you're not going away. But we can't really keep using SAS only for all of these projects. We can't break up with SAS, but we can't really live with it either. We just can't use it for dashboards. I'm going to make that argument. I just don't think it works. The IO is not that good. So how do you envision the future? The future is going to have SAS in it. You're going to be using SAS, but it's also going to have R and Python and probably other things too, Tableau, who knows. What does that even look like? That's what this discussion is about. I'm going to talk specifically about R, because a lot of SAS users really gravitate towards R, myself included. Python's good. I don't know Python, but it's good stuff. If you're thinking, "I'm a total SAS user, I'm going to try R," that's probably a good place to start. And if you learn R and then want to try Python or whatever, that's fine. So that's what we're really going to talk about today.
But before I continue, I wanted to let you know about a free online workshop I'm holding. If you're one of those people who likes free stuff, this is for you, right? The workshop is in three sessions, Monday, Wednesday, and Friday of next week. Each session will be about two to three hours. They're at noon Eastern time, which is kind of the best time internationally for my customers, and for me too. What will happen is you'll get the course on the course management system. There's an online course, and I'll give you that for free.
I'll basically be teaching you that course, but here's the difference. The course is about application basics. It's really aimed at people who are data analysts like me, but who didn't come through a business or a computer science college. They went through public health or biostatistics, so they never really learned about the construction of computer applications. This is something you didn't need to know 20 years ago if you were doing data analytics, but now you need to know it, right? Because we have these health apps. Epic medical records is an app; these are all apps. So it's really important for you to understand the basics of applications in general. We know SAS, we know what we use, but what are the basics of applications? What you're going to learn if you sign up for this free workshop is application basics for SAS integration. Even though the whole thing is about applications, I'm going to focus it on what happens when you're trying to integrate these things with SAS, because that's actually what we're doing a lot of the time. If you sign up and show up for this workshop, you're going to have a lot of fun, because we're going to have a lot of discussion. If you ever take management classes, a lot of them are case studies, because you have to have a lot of discussion; it's not like you can just follow step one, step two. So I really encourage you to sign up for it. If you go to the original event on LinkedIn, which I kind of screwed up (I said it was LinkedIn Live, and it's not; it's on Zoom), there's a link to sign up for this workshop. There's also a link for these slides. So please, if you want to learn this stuff, sign up, and we'll have a ball. All right. Now back to our regularly scheduled program.
So the question is, can scientists break up with SAS? I already basically told you, though. We may have our disagreements with SAS, but we probably can't break up with SAS. What we can do is find better ways to get along with SAS. We can find ways of offloading some of our work from SAS to R, where R is better at it. So one way is by using open source R for some things that SAS is not good at, but still keeping your SAS shop. The two case studies I'm going to cover today in this little discussion are the ones on the slide. The first one is called a SAS to R success story, and I encourage you to actually go there. There's a recorded lecture. It's a little older, I think maybe from 2017, but it's so good. Dr. Atkinson is actually a professor. Let me see if I put it in here. Yeah. She's a professor at a research hospital, and this is what you'll see if you go and watch the video, so I'm just going to tell you what the video says. She was a head, and maybe she still is, of a biostatistics core. What does that look like? Well, when you have a research hospital with a biostatistics core, you probably have a SAS server if you've had the core for a while. So what's happening? You have people at the research hospital who are going to do some studies. Maybe they'll do a lab study, maybe they'll do some study of their patients. Each time they do a study, they write a research protocol and put it through the IRB. Maybe there are some people from your core involved doing the statistics part. Whatever data they gather in the study is going to end up as a SAS data set, just a data set sitting on your SAS server. That's what she was dealing with. She had all these SAS data sets sitting on a server, some from a long time ago and some recent, and of course the recent ones are probably a little more complicated.
She starts her lecture by explaining that it was the next year, and SAS, the company, had come to them with the bill. So what does it cost? Because if we're used to using R and Python, what does SAS cost? Well, it costs different amounts depending on what you're doing. PC SAS is a standalone application, just like Microsoft Word. You can install it and just run it. I don't know how much PC SAS costs, because it depends; you always make a deal with SAS for licenses. But I will tell you that in 2004, I was at a research institute at the University of South Florida, and my research institute, which was a nonprofit, paid $10,000 for one seat of PC SAS, just the Base component and the STAT component. Okay. This was in 2004: one seat at a research nonprofit, those two components, $10,000. Now, when I worked at the Army, which was 2008 to 2011, every year SAS would send us 10 free PC SAS licenses. The problem is I was running a data lake, so I needed what Dr. Atkinson had. I needed a server, basically. I didn't need PC SAS. But I'm sure we could not have afforded the server. I mean, think about it: one seat of PC SAS Base and STAT was $10,000. So I really don't know how much SAS costs, but I know it's huge if you have a server. That's what she starts her lecture with: SAS came to negotiate what components you want, what you want. And she was like, we cannot keep paying this price. I don't know how much it was, but it was just too much. But she also said, we have to stay with SAS. So is there some stuff we can cut? Maybe there are some components we don't need, or maybe we don't need as many seats. So she started deciding what to do. Her solution, and again I encourage you to watch the video, was to set up R for people who wanted to use R. It's not about pushing SAS users into R if they don't like it; it's about taking the ones who want to try it and setting them up with R.
What she found is that doing that was actually really challenging, because as much as I'm complaining about how expensive SAS is, you're paying for something with SAS. You're paying so that you can call customer service and they will come and solve your problem. If you Google all over the internet, you will find that is true: SAS has very good customer service, and they'll spend hours with you. I know this from the Army, because I would sometimes spend hours with them when I was working there. So what do you have with R, right? She was saying that they needed to set up an R help desk. They needed to figure out who was the smartest about R there and create an R help desk. They had to create an R department, basically, in their biostatistics core. It had never occurred to me, but when I listened to that, I was like, oh, crap, that's probably what I would have to do. So the bottom line is, if you do create that R core within your SAS biostatistics core, you can start finding out what people want to do in R and what they don't want to do in SAS, and start removing some of your SAS functions and moving them to R. Now, before my next case study, these were her findings. I wanted to make sure I covered her lessons. You need to set up your own internal R help desk. You need to set up your own internal R trainings. And you need to organize your R work in such a way that everyone is following a standard. SAS standardizes everything. In R, the question is, what package are we using? You might be familiar with Bioconductor in R, which is really a whole collection of packages with a bunch of commands in it. That's different. So if you're using Bioconductor, what are you using for what? You're basically making standards, right?
And SAS, like I said, is a company. You are paying a lot of money, but you're paying for them to really come out and help you. I used to work with some engineers. When I would talk to them about SAS, they were like, "Do you have a way of converting that to SQL? Can you run it through a conversion program?" And I'd just laugh out loud. I'm like, hello, I'm your SAS to SQL conversion program. So you need actual data scientists and staff who can act as SAS to R conversion programs, because they're totally different languages, and if you're trying to do the same thing in both, you're going to have to start over and figure it out. That's actually the next case study here. I went to an R conference. It's called the EARL Conference, E-A-R-L; I forgot what it stands for. But it's one of my favorite conferences. They usually hold one in the UK and one in Boston, or Cambridge, and it's been in November. I sort of lost track of it, so I don't know when it's happening this year or if it's still happening after COVID. But before COVID, I went to it, and one of the people presenting was Nick Crane of Mango Consulting. She was presenting this R package. For those of you not familiar with R: R is a little like SAS in the sense that there's a base. You're always downloading base R and installing base R. It's nothing like installing PC SAS. It goes really quick, because base R is really, really lean. And there are two ways you can run it. You know how I'm talking about server SAS and PC SAS? Well, in R, you basically have two interfaces. You have RGui, which is really simplistic; it basically kind of looks like PC SAS. And then you have RStudio, where you can do the same things code-wise, but RStudio is more of an integrated development environment, or IDE. So if you're making dashboards and stuff, you want to use RStudio.
But if you're just doing a logistic regression or cleaning data, RGui is good enough. Whether you're using RGui or RStudio, base R is what you download and install. And now you've got base R, right? For each thing you need to do that base R doesn't do, you can download and install a package. That's why you can think of packages as analogous to SAS components, but what's cool is that the packages are free, whereas the components cost something. When you choose a SAS component, like, "Oh, we want this geographic mapping," well, you're making a big decision. When you choose an R package, you're not making a big decision. You're just trying a package to see if it works. Now, you know how sometimes there's a SAS macro that's an official macro and sometimes there's an unofficial one? Well, R packages that are on the CRAN server are like official macros for R. See this repository here that says CRAN? This is a screenshot from the documentation of the R package that Nick Crane worked on, and she's basically the lead author. CRAN is the official R repository for these packages. So I'll get to the point here. Nick Crane, from Mango Consulting, was presenting at this conference about how she and her co-authors developed this package called sasMap. I was so intrigued: R is doing a SAS thing. So I was like, what does it do? She was explaining, and this is really painful, okay? So I apologize in advance. They live in the UK, and the NHS, the National Health Service, is part of the government. It's not like the US, where I am, where healthcare is not part of the government. And most governments usually adopt SAS.
Well, apparently, and I don't know what part of the government they were working with or what part of SAS, there was a SAS operation at the government, and they're consultants. They're actually R consultants; usually that's what they sell themselves as. The problem was they had to offload some of the SAS functions to R, okay? Nick presented about this package, but I actually talked to her afterwards because I was so intrigued by this. I was like, how did you do this? She's not a SAS expert. Well, maybe now she is, but at the time she was like, "Monica, I'm just figuring out what SAS does in the context of what R does." And she's like, "I'm just telling you, data steps, there's nothing like it," right? And if there were any functions in macros, they usually weren't documented in SAS, so she would have to sit down and document that whole macro, figure out what it did, and rebuild it from the ground up in R. And I was like, you're kidding. So that's the reason she made the package sasMap. What sasMap does, and I'll read the description: it's a static code analysis tool for SAS scripts, designed to load, count, extract, remove, and summarize components of SAS code. She's like, "Monica, I don't know SAS very well, but it'll do things like count the PROCs." And I'm thinking, wow, yeah. If you knew how many times PROC FREQ was called in something, or how many times a macro was called, if you just took some long SAS code and had that metadata about it, you could begin to think about how you're going to rebuild it or what you're going to rebuild, right? Because I make very modular code in SAS. If you take any of my LinkedIn Learning courses or whatever, you'll find my code is super modular.
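To make the "count the PROCs" idea concrete, here is a minimal sketch in Python of that kind of static count. It is not sasMap and does not use sasMap's API; the sample SAS script, the regular expressions, and the function name `summarize_sas` are all assumptions made up for illustration. It also ignores SAS comments and quoted strings, which a real tool would have to handle.

```python
import re
from collections import Counter

# Hypothetical SAS script, invented for this illustration only.
SAS_CODE = """
%macro summarize(ds);
  proc freq data=&ds;
    tables sex;
  run;
%mend summarize;

data work.clean;
  set raw.patients;
run;

proc freq data=work.clean;
  tables agegroup;
run;

proc means data=work.clean;
  var age;
run;

%summarize(work.clean);
"""

# Simplified patterns: a PROC or DATA statement at the start of a line,
# and a %macro definition anywhere. Comments and strings are not handled.
PROC_RE = re.compile(r"^[ \t]*proc[ \t]+(\w+)", re.IGNORECASE | re.MULTILINE)
DATA_STEP_RE = re.compile(r"^[ \t]*data[ \t]+[\w.]+", re.IGNORECASE | re.MULTILINE)
MACRO_DEF_RE = re.compile(r"%macro[ \t]+(\w+)", re.IGNORECASE)


def summarize_sas(code):
    """Return rough counts of PROCs, data steps, and macro calls in SAS code."""
    procs = Counter(name.upper() for name in PROC_RE.findall(code))
    macro_names = [name.lower() for name in MACRO_DEF_RE.findall(code)]
    macro_calls = {}
    for name in macro_names:
        # A call looks like %name; the definition line uses %macro name instead.
        call_re = re.compile(r"%" + re.escape(name) + r"\b", re.IGNORECASE)
        macro_calls[name] = len(call_re.findall(code))
    return {
        "procs": dict(procs),
        "data_steps": len(DATA_STEP_RE.findall(code)),
        "macro_defs": macro_names,
        "macro_calls": macro_calls,
    }


summary = summarize_sas(SAS_CODE)
for key, value in summary.items():
    print(key, value)
```

Even a rough tally like this gives you the metadata to decide which pieces of a long program are worth rebuilding first.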
And if you've come to any of the lectures I've been doing in this series, you'll also find that SAS has this interface, depending on what you're using. If you're using SAS Enterprise Guide, and I think Enterprise Miner and Data Integration Studio are like this too, it has this interface where you drag an object over, like a summary or a PROC FREQ or whatever, and then you can configure that object and put code in it. Okay. Before you even get to an interface like that, you need to plan what you're doing. I feel like this is kind of like that: she was planning what she was doing, because SAS may do a lot of functions, but there may be one in the middle that you just don't need SAS to be doing, one you could offload to R. And literally it's this painful: if you're rebuilding some of your SAS in R, you're starting at the beginning and going through all your code. I really hope that if you write SAS code, your code is super readable and you're super neat with it. For those of you who know the poem The Zen of Python, there's a line in it: "Readability counts." Oh my gosh, it totally counts in SAS, because you're going to have to go back and figure out what each piece of your code does. Okay. So this is the main take-home that I want you to know: the flow out of SAS is not stopping. Old SAS shops can't afford to keep SAS. I'm just telling you that, and I'm sad. In fact, I talked to SAS the other day, and it's almost easier to tell you what SAS is good at now and what you should adopt. While I was talking to somebody at SAS, I was realizing that Athena Health is an example. I don't know Athena Health; I just read about them online. I was really interested in them. The reason I was interested is that the CEO posted something that I thought was so interesting.
His post was about how healthcare is all screwed up in the US. But one of the things he believes is a solution, and that he instantiated in Athena Health, is a web-based medical records system for outpatient clinics that helps them follow best practices, by building the best practices into their medical records. If you look up Athena Health, that's what they do. So if you're part of an Athena Health practice, you're an outpatient provider and you're doing that. So let's think about the architecture of that application I just described, which, if you don't understand it, you've got to come to my free workshop. Athena Health is all architected on the web, right? Now let's say you want to do higher-level analytics. For instance, let's say that when people get scheduled in Athena Health for an MRI, or maybe not an MRI, maybe labs, I don't know, we wanted to run an AI algorithm on that, get the result, and do something with that result. How are you going to do that in medical records? Well, if you buy SAS Viya, which is the cloud version of the analytics, and you hook it up to Athena Health, you could probably do a handoff. You could probably just hand it off to Viya and have Viya hand it back. Some of you are probably going, "Oh my God, that's so beautiful," right? You don't have to install a server. You don't have to install PC SAS. You don't have to do anything, right? The thing is, if you're old like me, you probably didn't think about that, because we're so used to not having cloud-based software. We're used to trying to dig data out of, I don't know, laboratory machines or something, right? So the bottom line is about that old SAS shop. I shouldn't say old; legacy SAS shop.
If you set up your server, or you set up Enterprise Guide, which is a separate application, or you're running PC SAS like I was doing at the Army, with all your macros set up or whatever, you probably aren't going to be able to keep doing it that way. And by "afford," I don't necessarily mean that SAS will charge you a lot of money. Sometimes you can afford that. What you may not be able to afford is the time it takes. That was the problem I had at the Army: our PC SAS went slow when you were putting zillions of records through it, right? With a thousand records it was fine. We put four million records in it, and it was no problem. But you put in eight million, right? So I couldn't afford to keep it. I stopped working there in 2011, but I was trying to tell them we have to store our data in SQL. We have to be pulling just small amounts of data, like making a view in SQL and pulling down only some of it into PC SAS to analyze. That was my Band-Aid solution. But in 2011, I could never say, "Let's start a SAS server," because I would not have been able to pay for it, right? So I hope I'm characterizing the problem for you. The problem is not so much SAS. SAS is good. The analytics are awesome. Viya is unbelievable. In fact, that's why I was talking to SAS. He was going to try to give me a training version of Viya so I could try it. So cool, right? But if you're set up in the old-fashioned way, like Dr. Atkinson, you're going to have to do something. You're going to have to come up with a creative solution, and SAS won't tell you how to do that. You're going to have to figure it out. So if you've got macros in SAS and you don't want them in SAS anymore, you're going to have to rebuild them from the ground up. You might want to start documenting them.
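The stopgap described above, keeping the full table in SQL and pulling down only a small view for analysis, can be sketched roughly like this in Python. This is only an illustration under stated assumptions: it uses an in-memory SQLite database as a stand-in for a real SQL server, and the table name, column names, view name, and fake records are all hypothetical.

```python
import sqlite3


def build_demo_database():
    """Create an in-memory database with a made-up 'encounters' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE encounters ("
        "id INTEGER PRIMARY KEY, year INTEGER, clinic TEXT, cost REAL)"
    )
    # Fake records standing in for the millions of rows in a real system.
    rows = [
        (i, 2008 + i % 4, "A" if i % 2 else "B", 100.0 + i)
        for i in range(1, 10001)
    ]
    conn.executemany("INSERT INTO encounters VALUES (?, ?, ?, ?)", rows)
    # The view defines the small slice the analyst actually needs.
    conn.execute(
        "CREATE VIEW encounters_2011_a AS "
        "SELECT id, cost FROM encounters "
        "WHERE year = 2011 AND clinic = 'A'"
    )
    return conn


conn = build_demo_database()
total_rows = conn.execute("SELECT COUNT(*) FROM encounters").fetchone()[0]
subset = conn.execute("SELECT id, cost FROM encounters_2011_a").fetchall()
print("rows stored in SQL:", total_rows)
print("rows pulled down for analysis:", len(subset))
```

The database does the filtering, so the analysis tool on the desktop, whether that is PC SAS, R, or Python, only ever has to hold the slice it needs.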
If you think offloading some of your data into SQL is a good idea, one of my lectures, which unfortunately I forgot to record, was on SAS/ACCESS, which is an interface you can use, even from PC SAS, to tunnel into SQL or Excel. You can tunnel into anything, Access, Snowflake, and get out just the data you need, right? Of course, that doesn't work if you're running a server like Dr. Atkinson. She had all these disparate data sets; I guess we'd call that a data lake. And I don't know if she could have done anything with that. You know what I mean? So you're going to have to think about, well, what are your data needs? Are you really serving data? With her, I would probably have archived a lot of that data. I mean, it's easy to tell people what to do with their shop. But the bottom line is no one wants a new SAS shop, meaning a new physical server setup. People want new Viya. And I'm thinking of the old Epic people. If you had Epic from early 2010 and you set up a SAS server then, you probably wish that you hadn't, because now you could just use Viya on Epic, right? So people want to adopt SAS Viya, and they want to adopt the analytics tools from SAS, SAS's AI. But then how are you going to work it into your pipeline? So I guess this is not a very good lecture, because I'm just blowing open this whole question about what you're going to do, you know? What I would say as the generic approach is this. The first step is to look at what you have in SAS, like data sets and functions, and just make a list of them. Then look at what SAS stuff you have, like what components, or a server. Make a list of all these things. Then the first thing you ask is, do I need all of this? Are there data I can archive forever and just put on the shelf? Do I need all these functions and all these SAS components? Do I really need to do all this? And if there's anything you're just not really doing anymore, just get rid of it.
Just say, "I don't need it next year," when you do the license for SAS, okay? Or clear off your server; maybe you don't need a SAS server anymore. So that's the first thing. The next thing is, of the stuff that's still left on your list of SAS stuff and SAS functions, ask: is there anything on here that would be better to offload to R? Once you identify those things, come up with a migration plan to migrate just those things to R, and then do that. Once that's done, it's time to redo this analysis. It's time to look at the list of what you've got in SAS. If there's anything you're not using, get rid of it. And if there's anything you want to keep, ask yourself, can you rebuild it in R or move it off to R? And when I say R, it could be Python. Remember, R and Python are open source. That means they cost $0. It doesn't mean they really cost $0, because, like I described with Dr. Atkinson, for the amount they're paying for SAS licenses, they're getting support. You pay $0 for R, but you do have to pay a lot of dollars to get people to learn R, to support R, to make a help desk, and whatever. Still, at the end, if you do what I say, if you do this inventory thing and carefully work through it, you're going to get the best of both worlds. You're always going to get the best bang for your buck from SAS and R if you do what I say. But what I say is not really that easy, right? That's why you want to show up to my workshop, because then I can show you what I'm talking about when I say you take this inventory and you decide.
So thank you for coming to this. Here is my contact info. And if you weren't here at the beginning, let me just remind you that I'm holding this free workshop. It's online, it's all on Zoom, and it really is free. You just go to the LinkedIn event, which I kind of screwed up setting up, but there's a link on there where you can sign up for this workshop. So what do you get?
Well, the workshop is three different sessions. Each will be two to three hours, depending on how many people sign up. You can see it's August 7th, 9th, and 11th. I picked a time that tends to be good internationally, so I'm sorry if it's not good for you. What it is: I have an online course called Application Basics. The purpose of the course is to teach people like me, people who learned biostatistics or data analytics not from the business college or the computer science college but from another college, like health, where they don't really go over application architecture or how applications work generically. That's what the online course is about if you were to just pay for it and take it on your own. What we're going to do is I'm going to teach it as a workshop, and I'm going to focus on application basics for SAS integration, which is kind of what we were just talking about. I was saying, okay, R is a different application than SAS, so how do you integrate them? Well, we're going to talk about any type of application integration. If you've got a health app and you want to take data from the health app and put it in SAS and analyze it, or put it in R, that's what this is; it's generically about that. So please, if you're interested, sign up for the free workshop, and we'll have some time to actually have discussions about these things I'm talking about: better ways to do your pipeline, better ways to offload functions from one program to another. So that's what that's about. And this is basically the end of my prepared material.