The thing that will be different this month is that we are having to bring it up to the last week and this is going to be the second week of this month, followed by the next week that is going to be in July; we will talk about that later. Please welcome Delaney from Quantopian. He is a seasoned speaker, and we will try to get him back most seasons with new tips here at PyData. So he is going to talk about Python in finance. So over to Delaney.

Thank you. So long story short, I work for Quantopian, and Quantopian is a crowd-sourced quantitative investment firm run out of Boston. For those of you who don't know, the whole deal with Quantopian is that we are trying to democratize quantitative finance and make it so that anybody can get into it without having this huge barrier to entry. The first thing that we did at Quantopian is we said, well, what tools do professional quantitative finance managers use? What data sets do they use? And then we said, can we build those same tools and data sets and then open source them? So again, just in case anybody doesn't know, the thing that makes us different from a lot of other finance firms is we have a GitHub, and feel free to check that out at any point going forward. So a lot of what I show you today is actually open sourced on this GitHub and you can actually go through and look at the code that's producing the results and, you know, modify the code and make it better or worse, it's up to you. So as a very quick overview to provide some context, the way Quantopian works is we provide you with all the tools you need to research and evaluate your own strategies. And we let you do this in a very end-to-end way. So I'm not going to be talking about algorithms tonight. I'm going to be talking about the research component behind algorithms tonight. But as an example, here is what an algorithm might look like.
So this is Python code that trades on the stock market and, you know, produces buy and sell orders depending on what data is coming in. And then you can see how this performs over time historically. I'm not going to run a backtest right now because it takes a few minutes. And we also provide, like I said, data. And we partner with the best users of the platform and we allocate to their strategies. We say, hey, you have a really good idea. Let us allocate to your strategy and we'll pay you a percentage of the profits. And the idea being that we can open up that field to people who would not have their own capital to commit to their own strategies, who may not come from a finance background. And so a significant percentage of our users are actually just programmers, engineers, people from a science background who would never even have veered toward finance in a normal life. So that's the platform. And that is the end of what I'm going to say about the platform. Feel free to check it out on your own. There's lots of different parts now. What I'm going to talk about tonight is factor modeling, which is, I would argue, probably the current state of the art in finance right now. At least quantitative finance. And what I'll be talking about is content from the lecture series. And the lecture series is a large part of what I work on. So if you go to Learn and Support, Learn, there's a tab here for Lectures. And again, all of this is free and available. You can afterwards go home and look at it and realize how terrible it actually is and then write me angry emails. But you can check it out on your own. And it's all implemented in Jupyter Notebooks. So I'll just go ahead and drop right into the first notebook that we're going to talk about. And so if you did this on your own machine, you would also get a copy of this Jupyter Notebook. And as a show of hands, who is familiar with Jupyter Notebooks?
Has worked in Jupyter Notebooks before? Okay, so for those of you who haven't, Jupyter Notebooks have kind of taken industry and academia by storm over the last couple of years. And what they are, for those of you who have used Mathematica before, is a cell-based environment where you can put in code, you can put in LaTeX to render math formulas, you know, text, and have the code and all the results displayed on the same page. So it's a way to combine the documentation with the analysis and just make it one package. And then if you're a researcher researching some interesting topic, you can put it in one notebook and then just send that notebook over to another researcher. As an indication of how much popularity these things have gained, are people here familiar with LIGO, L-I-G-O? They discovered gravitational waves recently, yeah? Anybody not familiar with the recent discovery of gravitational waves? Because if you're not, you should really, really check it out. It's kind of one of the most significant physics discoveries of, I would say, probably the century, just confirming that this thing that we think might be true actually is observable in the real world. So LIGO, if you look up LIGO IPython notebooks, they use it for all their research. So if some of the smartest physicists in the world are using this stuff, it's probably okay for us as well. And not that I'm implying that we're not the smartest physicists in the world, but I'm a statistician and we're not the smartest physicists in the world. So you can see here in the documentation for the code, they tell you all about how to get at the data. So for those of you who were here last time, I think it was back in March that I came, I think I talked about pairs trading back then. Pairs trading is what a professional quant might refer to as, like, cute, and this is what a professional quant might actually do on a day-to-day basis.
And to give you an idea of how much interest and active research there is going on in this stuff: I believe his name is Andrew Ang, A-N-G, and he was a professor at Columbia and he had written what was kind of known as the current go-to book, the Bible of factor modeling, as it was informally referred to. It described the full process of how to do this stuff and a lot of different things you could run into and how to solve them. And I was actually on a phone call with him many months ago, just talking about this stuff, and then I was at a conference and I was talking to someone from BlackRock. BlackRock, for those of you who don't know, is the world's largest asset manager. It runs, like, between five and six trillion dollars, depending on what the market's doing. And I was talking to someone from BlackRock and somehow Andrew Ang's name came up and I said, how do you know him, what's your connection? And they were like, oh, he's working for us now. And I was like, okay, so, you know, that's normal though. Academics kind of move in and out in finance. And I looked it up to see if there was any news of this, and BlackRock had given him a portfolio to run based on this technology. I think it was about 300 billion dollars, sorry, of money to put behind this type of portfolio. I think that was the number. So there's a lot of capital being allocated to this type of approach right now. Here's the idea. Let's say that everything is a return stream. So everything in finance can really be thought of as a return stream, because everything in finance is an asset. You can buy it, you can sell it, sometimes you can short sell it, which is the opposite of buying it. Everything is an asset, and assets have a value at every time point. That's the price of the asset. That's the price time series. And so if you say, well, what's the change in value? That's the return time series.
And that's what people in finance are really interested in a lot of the times, is what is the return over time for some asset? And so what you can do is you can say, let's say R sub i is the return at time i. And what we're going to do is we're going to use a linear regression model to try to say, how can we break down this return? What is this return made of? Because nothing in finance exists in isolation, right? Some asset is going to have some amount of correlation or dependency on other things. And what we're interested in is saying, rather than looking at an asset as a black box, let's actually try to figure out what's going on inside there and what other things are driving the movement of this asset. So it's a very simple example. You could say, well, the movement of an oil, a mutual fund or oil sector kind of broad index fund, is going to be comprised of the movements of all the oil funds that are in that sector. And that would be kind of the model there. In practice, you don't necessarily know what makes up some asset. You might have some new strategy that someone's proposing to you and it has a bunch of returns it spits out. You're looking at a house, the value of a house over time, and you want to say on average what is contributing to the movement in the house pricing market. And you can kind of, this is a very general technique. So the idea is that you have some factors here, the f sub 1, f sub 2, et cetera. And you make this linear regression model where you say, at every time there is a row in my data set and the outcome variable is the return of the thing that I want to model and the input variables, the independent variables, are all of the returns of my potential influencing factors. And what this does is allows you to try to measure kind of the, it's not going to be exact, but get a general sense of how much things are influencing this asset you're interested in. 
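The linear model described above can be written out as one equation per time step. This is a sketch in assumed notation: the talk only names R sub i and f sub 1, f sub 2, so the intercept a, the coefficients b_1 ... b_K, the factor count K, and the error term are labels introduced here for illustration.

```latex
R_i = a + b_1 F_{1,i} + b_2 F_{2,i} + \dots + b_K F_{K,i} + \epsilon_i
```

Here R_i is the return of the asset being modeled at time i, F_{k,i} is the return of factor k at the same time, each b_k is the estimated exposure of the asset to factor k, and \epsilon_i is whatever the factors leave unexplained.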
And it turns out that once you have set up this type of model, you can go in a whole bunch of different directions. So I'm not going to go into the details of say like arbitrage too much, because I think that there's something you just kind of have to sit down and read and then read three times and then make sure you understand it. And I want to get to some cool examples. But the general idea is that let's say that you have this model and you are assuming that this model works, right? Let's say that like, you know, how many of us are computer scientists? Okay, not that many, never mind. I was going to say, for any of you who've done computer science or math, you know, say abstract out this part, let's assume the first part works and now let's move our thought process to the second part. So let's assume that this model works. And we say, okay, well if this model works, then I should be able to get an expected return based on the performance of all the factors. Because if historically the price of my asset follows these three factors, these three factors pretty completely explain the movements of my asset, then I should be able to compute the expected price by looking at what these three factors are. Okay, I'll go the other direction, you know. And so what you can do then is you can go that direction, you can say I'm going to generate kind of what I think the price should be using one of these models and the movements of other factors. The other thing that you can do is get at risk premium. And the idea here is that everything in finance has a risk. Associated with it. What is the risk of holding that investment? And in equities and a lot of other time series driven fields, what people generally do is they'll say that the risk is equal to the standard deviation of the return time series because the more crazy it is, the higher the risk it is to hold that asset. 
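The risk measure just described, the standard deviation of the return time series, can be sketched in a few lines. This is a minimal illustration using only the Python standard library; the two price series and the function names are made up for the example.

```python
from statistics import stdev

def simple_returns(prices):
    """Period-over-period returns: price[t] / price[t-1] - 1."""
    return [curr / prev - 1.0 for prev, curr in zip(prices, prices[1:])]

def volatility(returns):
    """Risk proxy from the talk: sample standard deviation of returns."""
    return stdev(returns)

# Two hypothetical price series: one calm, one jumpy.
calm_prices = [100.0, 101.0, 100.0, 101.0, 100.0]
wild_prices = [100.0, 110.0, 95.0, 112.0, 90.0]

calm_risk = volatility(simple_returns(calm_prices))
wild_risk = volatility(simple_returns(wild_prices))
print(calm_risk < wild_risk)  # the jumpier series gets the higher risk estimate
```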
And so what you can say is, well, let's look at the returns you can get from this thing compared to the amount of risk that you take on holding that thing. Those are known as the risk-adjusted returns. You're adjusting the returns you can get for risk. And oftentimes those are actually the important thing to look at. The other thing you can look at is you can say, well, this is a big if, but assuming that markets are efficient, then everything should be correctly priced for risk. So you shouldn't be able to get free returns without taking on extra risk to get those returns. And so you can go the other direction. You can say, well, I know the risk of each of my factors, and the expected return of my asset should be equal to the risk of each of these factors combined, depending on how exposed I am to them. Because let's say that the market has a risk of one, whatever that unit is, and I'm two times the market. I'm like 2x the market in this model. If the market goes up 1%, I go up 2%. If the market goes down 1%, I go down 2%. So I'm twice as risky as the market because I have twice as much of the up and down motion. And so therefore if the market is efficient, you should factor that in, and this coefficient will reflect that. It will be like 2x market motion, therefore also 2x the market risk. And the return should also be equivalent. This should be 2x the market. If you're taking on twice the market risk, but you're not getting on expectation twice the market returns, then just invest in the market, right? So that's the idea. And hopefully you guys can see a lot of the different interesting corollaries and directions that come out of thinking of returns like this. The last thing I just want to say is two things, which is one, you can think of this, for those of you who are familiar, as a form of signal decomposition.
So for those of you who have done fast Fourier transform analysis, it's similar. You're taking one signal and you're saying, let's not treat it as one signal, but let's kind of try to break it down and look at all the different component signals that we'll be making it up and then analyze those independently. And then the other thing is that people, what they do is they say, okay, let's assume that the markets are efficient and we know or we can estimate the risk attached to each factor. And so we should be able to price each asset based on its historical exposure to these different risk factors, big if. But if you assume that, then what you can do is you can look for mispricings, where the returns are not lining up with the amount of risk that is being taken on, and those mispricings are known as arbitrage opportunities. They're like things where the market isn't really being efficient if your model holds correctly, and those are the cases in which you can kind of take advantage of inefficiency. Okay, any questions about that before I go through a super simple example? There will be questions later. I guarantee it. So what we're going to do here is we're in the research environment. We are going to get pricing data for this asset HSC. We're going to get pricing data for Microsoft. And then we're going to get pricing data for the S&P 500, the US broad market. And we're also going to get pricing for an ETF that tracks the Treasury return, because we don't currently have Treasury return data in Quantopian. And what we're interested in is computing the factor exposure as the market being one of the factors and the Treasury return being another one of the factors. And we're going to look at how each asset is kind of dependent on the motion of the market and dependent on the motion of Treasury returns. So we just put this into one data frame and then we run a linear regression, which is pretty easy in Python. It's a few different lines of code. 
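The "few lines of code" regression can be sketched as follows. This is not the notebook's code: it uses NumPy least squares on synthetic data, and the return series, the true coefficients (1.7 and 3.0), and the function name `factor_exposures` are all assumptions made for illustration.

```python
import numpy as np

def factor_exposures(asset_returns, factor_returns):
    """OLS of asset returns on factor returns; returns (alpha, betas)."""
    y = np.asarray(asset_returns, dtype=float)
    F = np.asarray(factor_returns, dtype=float).reshape(len(y), -1)
    X = np.column_stack([np.ones(len(y)), F])  # intercept column first
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[0], coef[1:]

# Synthetic stand-ins for real pricing data (all numbers are made up).
rng = np.random.default_rng(0)
n_days = 250
market = rng.normal(0.0005, 0.01, n_days)      # broad-market daily returns
treasury = rng.normal(0.0001, 0.0005, n_days)  # treasury-ETF daily returns
asset = 1.7 * market + 3.0 * treasury + rng.normal(0.0, 0.001, n_days)

alpha, betas = factor_exposures(asset, np.column_stack([market, treasury]))
print("market beta:", betas[0], "treasury beta:", betas[1])
```

Because the treasury series moves so little, its coefficient is estimated far less precisely than the market coefficient, which is one reason a single point estimate deserves suspicion.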
And we can see here that for our first asset, HSC, over the time period we looked at, which was 2014, June to 2015, June, it has a dependency of about 1.7 on the market. So when the market goes up 1%, it goes up about 1.7% on average, assuming the linear model works at all. And then when it goes down, it goes also down about 1.7%. And then the dependency on the risk-free rate is higher, but that's because the risk-free rate is much smaller and so small motions in the risk-free rate have to be magnified a lot to reflect themselves in asset returns. And then slightly different numbers for Microsoft. But for anybody who's done any of my other lectures or attended any of my workshops, one thing that I am constantly desperately trying to convince people of is that this is often a very bad way to look at things just because kind of fundamental statistics would tell us that this is not really useful at all. You can't just look at one slice of time and get one number out of it and say, oh, I think this is how the world works. So we're going to go one step deeper and we're going to say, okay, well, let's compute this on a rolling basis. So that's what we're looking at here. We're looking at each time step. Let's look at the last 100 days and we're going to do a linear regression on the last 100 days and we're going to compute the same numbers and see how they move around. So for those of you looking at this, the green line is the dependency on the treasury returns. Would you have any confidence predicting future dependence on treasury returns? I wouldn't. If anybody in this room would, you're either some kind of genius or haven't read Nate Silver's book yet, which I recommend. What about the blue line? Would you have any confidence predicting this asset's dependency on market movements? Okay, what if we zoom in on the blue line? Anybody going to change their minds? I'll take bets. So this is just the same plot for the second asset, but let's zoom in on the blue line. 
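The rolling version, re-running the regression on the last 100 days at each time step, might look like the sketch below. The data is synthetic, with a true market exposure that drifts from 0.5 to 1.5 so the rolling estimate has something to track; the function names and all numbers are hypothetical.

```python
import numpy as np

def ols_betas(y, F):
    """Slope coefficients from an OLS fit with an intercept."""
    X = np.column_stack([np.ones(len(y)), F])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1:]

def rolling_betas(y, F, window=100):
    """One set of betas per day, each fit on the trailing `window` days."""
    y = np.asarray(y, dtype=float)
    F = np.asarray(F, dtype=float).reshape(len(y), -1)
    rows = [ols_betas(y[end - window:end], F[end - window:end])
            for end in range(window, len(y) + 1)]
    return np.array(rows)

# Synthetic data: the asset's true market exposure drifts over time.
rng = np.random.default_rng(1)
n_days = 500
market = rng.normal(0.0, 0.01, n_days)
true_beta = np.linspace(0.5, 1.5, n_days)
asset = true_beta * market + rng.normal(0.0, 0.002, n_days)

betas = rolling_betas(asset, market, window=100)
print(betas.shape, betas[0, 0], betas[-1, 0])
```

Plotting `betas[:, 0]` against time gives the kind of line discussed here: a single full-sample regression would hide the drift entirely.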
So one thing that's important to note is that everything is inconsistent depending on the scale. You can zoom in enough on anything to make it inconsistent and you can zoom out enough on anything to make it consistent. And this is actually one of the common ways in which people kind of intentionally or unintentionally mislead you with data. It's all about scale, right? So let's say that you were completely okay with the dependency being anywhere between zero and two. Then you might say, you know, this is still okay to me. Historically it just has never left that window. I'm okay with that. But maybe your strategy is more precise and you need it to be between, you know, maybe more of like a 0.1 error, in which case this is deviating a lot more than 0.1 historically. So that's an indication maybe we should not be so confident in our results. Just something to think about. The final example in this notebook is actually just what we do is we do the same model, but then we offset the inputs by a month backwards. So now we're looking at using this month's market and treasury returns to forecast one month forward price and doing that historically and then saying, okay, well now we have, you know, the ability to take today's market and treasury returns and try to guess what this asset's returns will be one month into the future. I'm not going to spend too much time on that. I want to move on to the next notebook, which I think is more interesting. But again, that's an example. You can check it out. If you're interested in doing any of this analysis yourself, again, the code's all there. You can rerun it. The goal of these notebooks is to kind of remove the jump between learning and doing. So as you're learning, you're also like learning the code. You can kind of graduate from one of these notebooks actually able to run this analysis. That's the goal. Okay, any questions about this before I move on? Yes. 
So we do a test of say 100 days and find out how volatile the deviation is. How different is that compared to looking at the standard deviation of the factor?

Standard deviation, and this is something that I talked about in my other lectures, looking at standard deviation assumes that you're looking at a normal distribution. And sometimes, well actually pretty much all the time in finance, these things are not normally distributed. So it's actually a little dangerous just to look at that one single point and say I think this reflects the spread, because finance data sets are famous for not being normally distributed, having big spikes in either direction. So you have to be careful. And I think you still want to look at what's actually going on to look for warning signs. Okay, good question.

All right, so the next thing we're going to talk about is fundamental factor portfolios or fundamental factor mimicking portfolios. And hopefully this will make some semblance of sense. It may not. Okay, so we're generally okay with factor models. We kind of get the idea we're reducing a return stream to a set of dependencies on other return streams. So one of the things that we may be interested in is saying, okay, well, let's say that I want to look at my dependence on market cap. That's a famous one. Do you know what market cap is? For those of you who don't know, market cap is the invisible hand of the market's valuation of a company. And what it is, is just the number of shares issued times the price per share. Because if you've issued a thousand shares and each costs $100, theoretically people would not be willing to buy them at $100 if they didn't think that the company as a whole was worth $100,000. Because you wouldn't buy a thousandth of something for $100 that you didn't think was worth $100,000. So what you do is you multiply the price of a share times the number of shares. You get this thing called market cap, which is kind of the market's valuation of a company.
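The arithmetic above as a one-line function, using the talk's own numbers (a thousand shares at $100 each). The function name is hypothetical.

```python
def market_cap(shares_outstanding, price_per_share):
    """Market capitalization: number of shares issued times price per share."""
    return shares_outstanding * price_per_share

example_cap = market_cap(1_000, 100.0)
print(example_cap)  # 100000.0
```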
And something that is kind of a documented effect in finance is that things that have large market caps behave differently from things that have small market caps, for a variety of kind of intuitive reasons. Small market cap companies have a lot of room to grow. They're often doing crazier stuff to get off the ground. Large market cap companies oftentimes are more stable. They may have different return distributions. So something we might be interested in is how could you invest in market cap? How could you make a bet on market cap? And the way that it is often done, we'll just show an example of this using the Quantopian Pipeline API, which allows you to screen stocks and rank stocks based on various different criteria. So here what we're doing is we are saying we're going to make a custom factor, and this is just going to be market cap. And the inputs we need are the number of shares that have been issued and the price per share. And we just multiply them. That's the output of this factor. And then we make this other factor called book to price, which we're not going to worry about right now. It's magic. And so what we do is when we add these, we tell our algorithm, get me all this data every day. And again, feel free to work through this code on your own if you want to. But the general idea is that what we're doing is every day, we are sorting all companies based on market cap. We are taking the top 1,000 companies, the biggest 1,000 companies, and we're taking the smallest 1,000 companies, and we're just spitting them out. We're saying, here's the biggest, here's the smallest. Okay? All right. So this is just a visualization of the entire pipeline and what data is being fed into what, which might be useful for you if you're doing your own research. And then we actually run this, and here's a sampling of the data that you might get out. And so you can see here, this is for this particular day. This is like all the companies that were traded on that day.
And whether it was a member of the biggest companies, its actual market cap is over there in one of the columns. The rank, if you sort on market cap, where it falls in that ranking. So this is the data. And from this point, we can do a lot of interesting things with this data. Right? We can parse it and squish it and process it in many different directions. The one that we're going to do now is we are going to take the biggest companies, and what we also spit out into this is the daily returns of that company. And so, yes.

Can you please do a Control-plus for the last person?

Oh, yes. Thank you for asking. I always forget to do that. Is this okay? Okay. I'll do one more. Bonus. So you can see here, what we're doing is we're indexing into this data frame. For the biggest companies, let's see what their returns were each day. So now we have like a portfolio. We can say, if we mash them all together, what would a portfolio look like that just held the thousand biggest companies each day? Make sense? Okay.

So when you say returns, you mean just the holding return for the day?

Yes. We're looking at what the... If you held those companies on that day, those thousand companies, ignoring transaction costs or slippage or anything like that, what would your returns be on those thousand investments?

On the next day.

On the... Either that day or the next day. I forget the exact off-by-one indexing in this data frame, but it would be pretty trivial to just shift it down if you wanted to get the next day or the next five days or whatnot. The other thing to remember is that market cap fluctuates, but for a lot of these things, the metrics are only going to be updated on a quarterly basis, for say like book to price ratio or something like that, when the company releases its book. So oftentimes it's not even as big of a deal for certain factors.
That said, if you have a bias and you're updating your portfolio the day before the things are actually released in your backtests, then you have problems. So we take the mean, and then for each day we say what is the return on those thousand companies averaged together? We're kind of equally invested in each of these thousand. And so that's what that is. And then we also do it for the smallest. And then we can say, okay, well, how a quant would generally do this is they would make what's known as a long-short portfolio, in which they have a lot of long investments and a lot of short investments based on a sorting. And that's what we do here. So we've sorted based on market cap and we say small minus big, SMB, is the returns of the small companies minus the returns of the big companies, because we're shorting the big companies. So it's a negative one in front of the returns. It's inverse. Makes sense? So now we have this portfolio with which we're trying to get as close to pure market cap effects as we can, because what's going to happen here is we're not going to be dependent on market motion: when the market goes up we're going to make an equal amount of money on our long basket as we lose on our short basket. It cancels to zero. And so the only effects left over hopefully will be the market cap effect. So this is known as a market neutral portfolio and it's used to study these factors. And then the other one we created is this high minus low portfolio, which sorts companies based on that book to price ratio I said not to worry about earlier. But it's the same principle. So that's what the returns look like each day. You could cumulatively sum them to get what this would look like over time. So that's what this looks like. So it looks like over the year of 2011 making a bet on market cap would not be a great investment.
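The small-minus-big construction described above can be sketched without the Pipeline API. This toy version uses NumPy on a hand-written table of four hypothetical companies over three days, with baskets of two instead of a thousand; the function name and every number are assumptions for illustration.

```python
import numpy as np

def small_minus_big(market_caps, returns, n):
    """Daily returns of an equal-weighted long-small / short-big portfolio.

    Both inputs have shape (days, companies). Each day the companies are
    ranked by market cap; the n smallest are held long, the n biggest short.
    """
    caps = np.asarray(market_caps, dtype=float)
    rets = np.asarray(returns, dtype=float)
    order = np.argsort(caps, axis=1)  # ascending market cap, per day
    small = np.take_along_axis(rets, order[:, :n], axis=1).mean(axis=1)
    big = np.take_along_axis(rets, order[:, -n:], axis=1).mean(axis=1)
    return small - big  # the short side enters with a minus sign

# Toy data: 3 days, 4 hypothetical companies.
caps = [[10, 500, 20, 900],
        [12, 480, 18, 950],
        [11, 510, 25, 940]]
rets = [[0.02, 0.01, 0.00, -0.01],
        [-0.01, 0.00, 0.03, 0.01],
        [0.00, 0.02, -0.02, 0.01]]

smb = small_minus_big(caps, rets, n=2)
cumulative = np.cumsum(smb)  # cumulative sum of daily returns, as in the talk
print(smb, cumulative)
```

Sorting on a book-to-price column instead of market cap would give the high-minus-low portfolio by the same principle.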
It looks like that the large cap companies actually outperformed the small cap companies in 2011. And so this actually loses value in this portfolio. Now, can anybody guess what we're going to do next? Yes. But for the individual returns of the digital investments, we are on a low return basis because maybe their prices are high. So the question is this based on absolute return or multiplicative return? And these are based on multiplicative returns. It's ratio. So you're still going to see different effects, different distributions between the high and the low baskets but it shouldn't have that bias factored into it. Yes. Someone mentioned this market neutral. Is it beta adjusted? Could you give more info on that? Do you think that the higher market cap at low beta? Sure. Yeah, absolutely. So the question is, is this going to truly be market neutral in all cases? And the answer is not necessarily and it depends on the factor. So yes, what you're saying is you could have an effect in which the large companies had more dependence on market movement because they're like a greater percentage of the market than the small companies which are more independent of market movement. And that's definitely true. There are some biases in these portfolios. And so sometimes what people will do is they'll slightly adjust the weightings of the long and short to try to minimize that exposure. But again, that's a little more nitty gritty and we can kind of for now pretend that we've gotten rid of all market dependency by doing this equal weighting long and short. OK. So this is the market returns where we're just looking at the broad market. You can see it looked like there's kind of a large volatility event somewhere in the middle of 2011 that for both of these portfolios is kind of reflected in these, well actually all three of these portfolios is reflected in just a higher volatility of returns over the second half of our data here or the last third. 
And then what we can do is we can start running regressions of other now return streams versus these factor return streams. So we might want to say, well how dependent is your return stream on market cap? And that enables us to ask a really important question which is are you just making a bet on something else that somebody already knows about? Because this is a very, very common thing is to develop something you think is really cool like a super innovative strategy and then you run this linear regression and you find out it's actually just like a sum of these three well-known things. Right? At which point you kind of waste, well you haven't wasted time necessarily but nobody should be interested in that new thing because it's not really new. And why would they want to pay your hedge fund fees or whatever for this strategy when they could just invest in these well-known things? It's very easy for an investor to create this long short portfolio on the market and just run that versus investing in your strategy which is just kind of 2x this or whatever it turns out to be in the dependencies. Does that make sense? You're breaking it down and you don't want a significant portion of your returns to be highly dependent on other things that are well known. So we can do that here. We have this tech portfolio of Microsoft, Apple, blah, blah, blah. And we just take the mean of that portfolio so if we were holding these stocks every day what would that be? And we put them together into a data frame, we run our linear regression, and this is encouraging. This implies that this is working to an extent because what we're seeing here is this portfolio of kind of large well-known companies follows the market about one. So it just kind of moves with the market. But it has a negative dependence on our market cap factor which is good because all of these companies we selected are very high market cap companies, very large expensive companies. 
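The "is your new strategy just a sum of well-known things" check can be sketched as a regression plus an R-squared. Everything here is synthetic and hypothetical: the three factor series stand in for market, size and value returns, and the "new" strategy is deliberately built as 2x the first factor plus a little noise.

```python
import numpy as np

def explain_with_factors(strategy_returns, factor_returns):
    """Regress a strategy on known factors; return (betas, r_squared)."""
    y = np.asarray(strategy_returns, dtype=float)
    F = np.asarray(factor_returns, dtype=float).reshape(len(y), -1)
    X = np.column_stack([np.ones(len(y)), F])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ coef
    r_squared = 1.0 - residuals.var() / y.var()
    return coef[1:], r_squared

# Three made-up factor return streams.
rng = np.random.default_rng(2)
n_days = 250
factors = rng.normal(0.0, 0.01, size=(n_days, 3))

# A "super innovative" strategy that is secretly 2x the first factor.
strategy = 2.0 * factors[:, 0] + rng.normal(0.0, 0.001, n_days)

betas, r_squared = explain_with_factors(strategy, factors)
print(betas, r_squared)
```

An R-squared close to one with a large beta on a well-known factor is the warning sign described here: an investor could replicate most of the strategy by holding the factor directly.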
And so you would expect that you would want to see a positive dependence on a factor where you were a long big company, short small companies, and a negative dependence on a factor where you were short big companies, long small companies. So that's what we see here. We see a negative dependence on our factor. And then I don't even, like, that's harder to interpret. But this is the interesting one is that you actually are seeing this dependence pop out. Again, don't just look at a single number. Let's look at it over time. You can see here that the dependence on the market cap factor is kind of consistently negative. Again, that's a good sign. And the dependence on the other factor is consistently positive, albeit it's a very small effect size. And you, again, have to figure out, like, is this effect size actually meaningful? Will this hold going forward? What's going on? Do all of your normal statistics? And then this is the market beta up top consistently higher than the other two. That make sense? Questions? Concerns? There's one final secret bonus level of this notebook I'm not going to talk about today. And the reason I'm not going to talk about it is because it's another one of those things which is a little conceptually tricky to get. And if you're interested, I recommend going through, running the code, wrapping your head around it. But it basically involves a way of taking this data set, rotating it 90 degrees and doing a similar analysis. That's the simplest way I can explain it right now. It's basically known as kind of daily cross-sectional factor analysis. Instead of what we're doing here, we're just looking across time at these portfolios. Okay. So, how much time am I running on? Ten minutes. Ten minutes. Perfect. I'm going to check. One last thing to talk about, which is the risk component of this. So people are somewhat comfortable with the idea that you can break down return streams. You can measure exposures to well-known factors. 
You can say, is this thing that someone is trying to tell me is super new and innovative and based on artificial intelligence, really just a 3x leverage bet on this other factor that I already know about, which is often the case. Let me tell you. The other direction you can go is you can actually try to measure the risk of a strategy. Again, remember earlier I was saying that each factor has a risk that you can estimate. And then if this other return stream is just kind of well-represented by a sum of these other three or four factors, then supposedly the risk can also be estimated. Or at least you can look for warning signs of high risk if these other factors have high risk. So something that you might not notice is let's say you have a strategy which seems to be very low risk, or an asset that seems to be very low risk. But then you look historically, it has a very high dependency on a factor that you are not comfortable investing in. Maybe it has very high dependency on some of the less stable states of the EU's stock market indexes. This is something you might run. You say let's run this analysis against some risk factors I'm worried about. Let's see if it has high dependency on those things because that could be an indication that if those things blow up in the future, this thing could also blow up. So again, it's a way of seeing where your returns are coming from. Do you actually want those returns? So what we're going to do here is I'll walk you through a part of this analysis. And again, it's all there. You can go forward and try this on your own. But here we're just going to be looking at from the other direction. Same code as before. We're looking at market cap and book to price. And the analysis is going to be very similar. Going to get the returns over time. Again, this is just now. We're looking at the cumulative returns of a portfolio holding market cap, short and long if we're looking at book to price, high and low. 
And we're going to look at the same portfolio. We're going to run the same analysis. But now we're going to add a little bit of a correction, which is that we are correcting for covariance between independent variables in our model. So let's say that we have two factors, but these two factors are actually somewhat correlated. We would want to kind of correct for that in some way when we are saying, well, what's the actual risk exposure to each factor? Because otherwise it might not shake out in the wash, for statistical reasons known as multicollinearity, which are discussed in other lectures. So we're at least trying to correct for that. You can never perfectly correct for it, but we're trying. And what we can measure here again is we're saying, okay, well, what are the relative risk contributions of these two factors to the risk of my overall portfolio? Again, remember, risk is measured as standard deviation. So what we're looking at is we take the returns of our portfolio minus the returns of the market. That's kind of our active returns. That's what we're doing over the market. We take the standard deviation of that. That's kind of what risk we're adding to this portfolio over the market. And of the risk we're adding, how much of it is coming from market cap risk, how much of it is coming from book to price risk? And then we can also look at it rolling over time. So you can see, interestingly, risk changes. Risk exposure changes. So there's periods of time in which you're actually more exposed to risk from certain factors, maybe because of how market conditions change. And there's periods of time in which you're less exposed to risk from certain factors. And part of this is kind of idiosyncratic. It's because of noise in the data. And again, what you're looking for is kind of consistency in these things. You can forecast what your risk might be going forward.
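One way to make the risk decomposition above concrete is sketched below. This is an assumption about how such a calculation can be done, not a copy of the notebook: the data are synthetic, the two factors are deliberately correlated, and each factor's share of active variance is taken as its beta times its covariance with the active returns, divided by the active variance. Using the covariance with active returns is what accounts for the factors being correlated with each other.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 750

# Synthetic data (illustration only); the two factors share a common component.
market = rng.normal(0.0005, 0.01, n)
common = rng.normal(0.0, 0.004, n)
size = common + rng.normal(0.0, 0.003, n)
value = -0.5 * common + rng.normal(0.0, 0.003, n)
portfolio = market - 0.4 * size + 0.2 * value + rng.normal(0.0, 0.002, n)

# Active returns: what the portfolio does over and above the market.
active = portfolio - market

# Regress active returns on the factors (with an intercept).
F = np.column_stack([size, value])
A = np.column_stack([np.ones(n), F])
coefs, *_ = np.linalg.lstsq(A, active, rcond=None)
betas = coefs[1:]
resid = active - A @ coefs

active_var = active.var(ddof=1)
# Covariance of each factor with the active returns.
cov_f_active = np.array(
    [np.cov(F[:, j], active, ddof=1)[0, 1] for j in range(F.shape[1])])

# Fraction of active variance attributed to each factor, plus the unexplained part.
contrib = betas * cov_f_active / active_var
resid_share = resid.var(ddof=1) / active_var

print("active risk (std):", np.sqrt(active_var))
print("size share:", contrib[0], "value share:", contrib[1],
      "unexplained:", resid_share)
```

In-sample, the factor shares and the unexplained share add up to one, so the output reads as "of the risk we're adding over the market, this fraction comes from each factor". Running the same calculation on trailing windows gives the rolling view described in the talk.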
But that's the general idea, is we're trying to measure how much of the risk we're taking on is coming from these well-known risk sources. And again, you can substitute anything you want into here. That's the beauty of these models. That makes sense. And then the last thing we do is we compute. So this was just looking at the raw dependency without the correction factor we were talking about. And we compute it with the correction factor here. And this is what it looks like. So you can see it changes a little bit, just maybe gives you a little bit of a clearer view of what's going on here. Interesting behavior, right? There's a lot of risk in the first half of the year from one of your factors. And then it looks like not a lot of risk in the second half of the year. And then your question you have to ask yourself is, is this meaningful at all? Is this just kind of an overfit pattern? What's going on here? Should I worry about this? These are all questions that a risk professional would ask when they're looking at potential investments. And then after this, I just talk about reasons that, you know, you might be worried about using this data. The one that I'll just mention before I finish is, so I mentioned earlier that nothing in finance is normally distributed. So something that people might naively often try to do is they might try to say, okay, well, my risk over time has had a standard deviation. And so what I'm going to say is my future risk is like what it currently is plus or minus three standard deviations. That's kind of my range of what the future risk exposure could be. Does that make sense? So you're looking at this plot. You're saying where could it go in the future? And sometimes what people do is they'll say, you'll take the value it is now, and then they'll draw like a range based on the standard deviation. And they'll say it could be anywhere in that window. 
That only makes any sense if the underlying series is kind of well-behaved or normally distributed. And in this case, you know, just running a quick test says it's not for both of these risk exposure time series. So naively doing this approach is actually not going to give you a good estimate of where the risk could go, and the risk could actually go way different places because these series are going to be auto-correlated and I'm not going to go into this too much, but the long story short is that auto-correlated series tend to have lots of spiky behavior in both directions. So you could be in for a nasty shock if you said this could be between three and negative three standard deviations of the current value, and then you get in the future and you see a six standard deviation event of where your risk exposure is, and you're like, I had no idea this would happen, but don't do that because I just told you it could happen, and you should actually go in and try to figure out is there a way I can correct for this? Is this normally distributed? Do I need to use another model? Sure, so these are all questions you should be asking yourself. Any questions on that before I finish up? Okay, so that's factor risk exposure. I'm getting a lot of the, you just gave me too much information and now I have to sleep look, which is good. That's where I want to be because again, this is all available. You can go to it and look it up later if parts of it didn't sink in. I just wanted to quickly say that, so I'm going to be at PyCon for off and on on Thursday and Friday and then I'm going to be running kind of a four hour workshop version of some of this stuff on the Saturday. The idea is that it will actually give you a chance to try this on your own machines and do some of the coding and get into it. Not specifically this content, it's going to be more introductory content, but also sourced from the lecture series. 
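The normality check mentioned above, before trusting a "current value plus or minus three standard deviations" band, might look like the following sketch. The exposure series is synthetic (an autocorrelated series driven by heavy-tailed shocks, chosen to mimic the behavior described), and the Jarque-Bera statistic is computed by hand; which test the notebook actually uses is not stated in the talk, so treat the choice as an assumption. The test also presumes independent observations, so on an autocorrelated series it is only a rough warning sign.

```python
import numpy as np


def jarque_bera(x):
    """Jarque-Bera normality statistic and its asymptotic p-value."""
    x = np.asarray(x, dtype=float)
    n = x.size
    z = x - x.mean()
    s2 = np.mean(z ** 2)
    skew = np.mean(z ** 3) / s2 ** 1.5
    kurt = np.mean(z ** 4) / s2 ** 2
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    # Chi-square survival function with 2 degrees of freedom is exp(-x / 2).
    return jb, np.exp(-jb / 2.0)


def lag1_autocorr(x):
    """Sample autocorrelation at lag 1."""
    z = np.asarray(x, dtype=float) - np.mean(x)
    return float(np.dot(z[:-1], z[1:]) / np.dot(z, z))


rng = np.random.default_rng(3)
n = 1000
shocks = rng.standard_t(df=3, size=n)  # heavy-tailed shocks (made-up data)

# Hypothetical risk-exposure series: each value depends on the previous one.
exposure = np.empty(n)
exposure[0] = shocks[0]
for t in range(1, n):
    exposure[t] = 0.7 * exposure[t - 1] + shocks[t]

jb, p_value = jarque_bera(exposure)
rho = lag1_autocorr(exposure)
largest_move = np.max(np.abs(exposure - exposure.mean())) / exposure.std()

print(f"JB statistic={jb:.1f}, p-value={p_value:.3g}, lag-1 autocorr={rho:.2f}")
print(f"largest deviation from the mean, in standard deviations: {largest_move:.1f}")
```

A small p-value together with a high lag-1 autocorrelation suggests the naive three-standard-deviation window understates how far the exposure can move.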
And then we are actually, so one of the things we've been running for a while, and I had some questions about this earlier, was we started running workshops for various different schools and universities and we've actually since expanded that out into a full workshop system. I guess you'd call it thing. I don't know what business people call these things. And we have workshops now being run all over the place. I was just in Sydney. We're doing a bunch there now. We're doing them in Singapore. Anthony, who's here today, is going to be running one on July 2nd. So if you guys are interested in actually going to a full day course and this stuff and actually getting yourself off the ground practicing some of the stuff, actually doing it on your own machines. I cannot say enough good things about Anthony. He's a really great instructor and if you're at all interested, please check that out in July 2nd. You can ask him or me if there's links. I'm not going to actually advertise on this screen. And just as an indication of his quality, I think his students recently won the International Futures Trading Competition. What was it, CME's Futures Trading Competition? Yeah, so his students recently won that competition. So that's kind of a, you know, some statement of the quality of his instruction. That's why we work with him. If anybody has any questions, I'll be around later. And I don't know if we want to, I'll take a few now, but I'll be around later. And I just wanted to quickly throw up what might be some links in case anybody wants to get in touch with me going forward as I just answer one or two more questions. I made this and I felt horrible about myself. I felt like I was becoming a business person and I hated it. But let me see. I can't even remember where I put it, which just tells you an idea of how much I care about this. That's right here. 
Yeah, so I just put this up just so I could throw this up at the end of talks because people are always like, how do I get in touch with you? And I'm like, I don't know. So this is how you get in touch with me. Those are where I live. And that's me in case you didn't realize. Okay, get it. The question here first. I'd like to ask a silly question. I suppose all the lectures are the money-making secret. Why do you like to share the money-making secret with the general audience? Sure, Quantopian actually has a set of incentives which in many ways are the reverse of those at a lot of traditional finance firms. So traditional finance firms, when they build stuff, when they research stuff, they actually have a very strong incentive not to share, to hide, because the moment they share, other people take advantage of it. Quantopian's model is not that we're trying to necessarily, like, protect any of our own IP, but what we want to do is we want to work with others and enable them to generate their IP and then do profit sharing with those people. So you're not afraid of, like, people learning all the things and starting their own Quantopian? Go ahead. I have been at Quantopian long enough that I can tell you how difficult that is. So, yes, the short answer is, like, we want our users to be as educated as possible. Is there any top secret hiding behind those sacred lectures? Pretty much no. The only thing that we keep private, and not even that private, I'm going to be speaking in Hong Kong about this, we've been doing a couple of talks recently, is the strategies for selecting which strategies we want to work with from the community of users, because for privacy reasons, we can't actually look at the strategies. So we have to look at returns exhaust. And it actually involves doing a lot of this factor analysis stuff to look for where people's risk is coming from, how consistent are their returns, looking at out-of-sample performance.
So that stuff is still not 100% public, but we've basically done as much as we can to open source as much of our code and provide all these educational materials, because we have a strong incentive for other people to learn this stuff. Thank you. I was here first, Jeff. In continuation to what you just said, can we access the strategies that you have invested in? No. What is the ticket size of investments that you make in these strategies? Okay, so first part of the question is user strategies are always private to that user unless they choose to share them. If they choose to share them, the strategy will show up on the forums here where they will share it, and oftentimes with some interesting commentary, so you can see here. Someone who's actually an attendee of the Sydney workshop shared their strategy. I'm not going to false advertise. This guy has already been working on stuff for a while, so don't think that you can attend one workshop and then make this kind of stuff. But he shared this strategy as an example to get people talking about it, and you'll oftentimes find threads where people share interesting research, reproductions, examples. They're not going to share their cash-printing ATMs. The second question, the short answer is I don't know if you've heard of the SEC in the US, but they like sending people to jail for non-compliant stuff and I don't want to go to jail. I can't answer any questions about the fund or our allocation process or anything like that. So if you have questions about that, please check out the investor relations tab on the website. And one last time, May. What happens to the idea of a strategy that you've invested in? Does it stay with the owner or does it stay with the fund? The owner keeps all of their intellectual property when we allocate to a strategy. They just enter into a licensing agreement where they still own it but they have licensed Quantopian to use it. Thanks very much. Any other questions? We're working on it.
We like not to over-promise, so we don't have specific dates or deadlines for when we're going to be able to finish it, but we're working on it as fast as we can to enable futures as a tradeable market. So you still have to implement, like, that's getting the data component, but then there's also a huge amount of problems with actually making it tradeable and implementing accurate back tests, because they just don't behave the same way as equities in many ways and there's different constraints on trading and lots of different things that can mess with you. So to have an accurate representation of how a futures trading strategy would work, there need to be a lot of other things that are done. Similar to the IP question, everything is hosted on Quantopian. So what are the protective measures to protect my code? Can Quantopian have a... Sure. So the question is, what are protective measures to protect code given that the IP is hosted on our platform? The short answer is that Quantopian would effectively die as a business if at any point we started leaking people's IP. So we put a lot of effort into protecting that. It's stored under a layer of encryption on our databases and then there's multiple layers of security around that. So to get at someone's IP, you would first have to break into our production infrastructure and then you would have to somehow break the encryption that the algorithms were stored under. So we put in place a lot of mathematical encryption techniques but also just a lot of other more security- and operations-based techniques to make it very difficult for people to get anything. Even let's say you were able to break out of the sandbox we give you in the research environment, you would still have no way to get at all of this data. That's on the back end. Any other questions? Are there any plans for Forex? Not currently.
We are working on expanding the number of tradable markets that we have, and so once we finish futures we will step back and say, what next? We're still quite a small company. We're venture backed. We're 45 people. We're like 30 engineers. So we don't quite have the capacity to do parallel markets right now. We kind of have to implement them in sequence. Yes? So are there any prerequisites to taking the lectures? For example, I'm an undergraduate who is studying computer science and I know a bit of machine learning. But I have no idea about one of the requirements. So is this a good thing for me to pursue? Because it seems very interesting. This happened last time, where I had to swear that I didn't pay someone in the audience to ask that question. The short answer is yes. And this is like honestly pretty much, I would say, like the poster boy use case that we designed the lectures for, which is someone who has a little bit of programming experience. It sounds like you have more than a little bit, but has at least a little bit of programming experience. Has an interest in statistics or mathematics. And the short answer is if you start from the beginning, I think you'll pick it up very quickly. We're also working on expanding the number of lectures that actually teach the Python that's required to use the platform. So hopefully we'll also be able to open up to like a broader segment as well. Will the PyCon lectures be available online? It's going to happen in the next couple of days. The PyCon... So I'm running a tutorial there on Saturday. I don't know what their policies are for releasing videos or whatnot. You'd have to ask the PyCon organizers. Questions? Okay. I'm going to cede my time in the interest of being fair here. So thank you everybody. Good questions. And I'll be around, I think we're going to have another presentation, but I'll be around for a little while afterwards if you want to ask follow-up questions or whatnot. Sorry? Oh, yes. I keep forgetting.
I do too much here. We have some Quantopian swag. I think we sent t-shirts or something. It really could be anything in that box. I have no idea what we put in there. But whatever got put in the box that got shipped across the sea to end up in Singapore is outside, and you can take a bet on what that is if you want to. So hopefully it's good. Thank you.