Hello everyone, I'm Mira Abou, the CTO at Neotic.ai. Neotic is a small company located in Lebanon, a small country in the Middle East. First of all, thank you for inviting me to give this talk. It's my pleasure to be here, to listen to all of your thoughts, and in my turn it will also be my pleasure to share with you a small piece of our technology, which was recommended by Mr. Nassim Taleb.

As we all know, making a good, effective investment is not an easy job, and that's why Neotic, our company, has created an alternative approach based on AI. I'll start with a short introduction to trading so that you have enough background to follow along in the next slides.

The prices of a stock are usually represented using candlestick charts. If we take an interval of time, say three months of a stock such as Google, we can see how the price of Google changes, what the movement of prices is during those three months. Each day we have one candle, and this candle represents the open price, the close price, and also the lowest and highest prices of the day.

So why do we trade? Simply, it's a game to make some profit, and this game has a few rules. Rule one: open a long position on a stock when you expect its price to go up. Having a long position on a security means that you own the security, you have bought it. So you buy it, and when its price goes up, you sell it and make a profit. Rule two: open a short position on a security when you expect its price to decrease. Opening a short position means you sell first, and if the price decreases, you buy it back and make a profit. So we can profit whether the stock's price goes up or down.

Most of the time, when traders open a position on a stock, they attach three parameters to it: a holding period, a target gain, and a stop loss. I'll explain each one quickly. The holding period means the trader says, for example, "I will hold this security for one year, or for three months," and then closes the position. On top of that, traders usually set a target gain when they open the position. For example, say a trader opened a long position around the 14th of July, planned to hold it until around the 1st of September, and set a target gain of 8%. When the stock goes up by 8%, the position is closed at that point, before the maximum holding period is reached. Most of the time it's better to have a target gain rather than just holding the stock for the whole holding period. You can also add a stop loss in order to limit your losses: if you set a stop loss of 4% and the stock goes down by 4% of its value, the position is closed.
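To make these three parameters concrete, here is a minimal sketch of how such exit rules could be checked each day. The function name, thresholds, and daily-check structure are illustrative assumptions for this talk, not our actual implementation:

```python
# A minimal sketch (assumed, illustrative) of the three exit rules above:
# target gain, stop loss, and maximum holding period, for a long position.

def should_close_long(entry_price, current_price, days_held,
                      target_gain=0.08, stop_loss=0.04, holding_period=45):
    """Decide whether to close a long position and report why."""
    ret = (current_price - entry_price) / entry_price
    if ret >= target_gain:           # target gain reached (e.g. +8%)
        return True, "target gain"
    if ret <= -stop_loss:            # loss limit hit (e.g. -4%)
        return True, "stop loss"
    if days_held >= holding_period:  # maximum holding period elapsed
        return True, "holding period"
    return False, "hold"

# Example: bought at $100, now $108.50 after 20 days -> closed on target gain.
print(should_close_long(100.0, 108.5, 20))
```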
So what is our goal? Our goal is providing trading strategies. You can see a trading strategy as a fixed plan designed to achieve high returns by opening long and short positions, or as we say, going long and short. We provide our customers with many trading strategies, each one with its own parameters: stop loss, target gain, holding period, and so on.

Why do we do that? In order to have high diversity in forecasting. If you open a position on a stock with a very high quantity, if we tell all of our hedge funds, investors, and traders, for example, "open long positions on this stock," we fall into the problem of market impact, because we would be trading a very high quantity of the same stock. So our goal is to provide strategies in such a way that each strategy selects different stocks. When we give a strategy to a customer, we plug this strategy into our technology, and the technology automatically selects profitable stocks for long and short positions each day. Sometimes there are no selected stocks and no return: the system says there are no profitable stocks today, no stocks that we are confident are good for a long or a short position.

I'll move now to the core of the presentation. In the first part I'll explain some challenges that we face while working with financial data. Financial data is a time series, so it's a little different from other types of data. After that I'll explain a part of our technology, without giving away all of its secret sauce; I'll stay high level and describe how we have achieved high returns and how we have faced these challenges.

The first problem occurs in nearly all machine learning problems: when working with financial data, we have a low SNR, the signal-to-noise ratio, which is the ratio of the power of the meaningful signal to the power of the noise, the information that is not useful. It's very low, so in the financial world there is a lot of information that is not useful. When you have such a problem, it is of course very difficult to build a machine learning model that does not fall into the problems of overfitting and underfitting. I assume all of you know these words; overfitting means your model has just memorized the data instead of learning from it. You can summarize it in that one sentence.

The second problem is that financial data is not stationary. What does non-stationary mean? It means that most stocks follow random walks or trends, and most of the time both: you have trends and you also have random walks. When you have such data, doing a regression, for example, will lead to a meaningless result, and when you try to use the concept of mean reversion, you will not end up with a good result either. With stationary data you have a roughly constant mean. Rather, when you have non-stationary data, you do not have a constant mean: as new data is added, the mean of your data changes. With a constant mean you can use the concept of mean reversion, and most traders use this concept, but it only applies to stationary data.
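As a concrete illustration of this non-stationarity point, here is a minimal sketch of how one could test a series for stationarity with the augmented Dickey-Fuller test from statsmodels; the synthetic random-walk data and the 0.05 threshold are assumptions for the example, not part of our pipeline:

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(0)

# A random walk, like many price series: non-stationary by construction.
prices = 100 + np.cumsum(rng.normal(0, 1, 500))

# Daily first differences (returns) are usually much closer to stationary.
returns = np.diff(prices)

for name, series in [("prices", prices), ("returns", returns)]:
    adf_stat, p_value = adfuller(series)[:2]
    verdict = "stationary" if p_value < 0.05 else "non-stationary"
    print(f"{name}: ADF={adf_stat:.2f}, p={p_value:.3f} -> {verdict}")
```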
We also have the 3V problems, the classic problems of big data. The first is volume: since we work with the US market, every day we have data about 8,000 stocks, which is very difficult to visualize. We have data about their news and their prices, we calculate all the technical indicators, and we have data from the financial reports, so we have a lot of data. The second problem is velocity, which means the speed of data generation is very high; sometimes every minute we have fresh news, and we have to process it. Finally, we have the problem of variety, which means we have both structured and unstructured data. For example, one of our data providers is Quandl, and we have many problems with their data; sometimes we find the same stock with two different prices, and so on.

Another problem faced when applying machine learning in finance is the change in distributions. This problem is related to the problem of non-stationarity: when the distribution of the data changes, the previous observations are not meaningful anymore, and you have to restart your model, to start again from zero.

And finally, the last problem, which was introduced by Mr. Taleb: the fat-tailed distribution. Do you know what fat-tailed means? Okay, I'll give you an idea. I have written here a small definition: a fat tail is a situation in which a small number of observations creates the largest effect. You have a lot of data, but an event is explained by a small number of observations. I'll give some examples. In sales, if you plot the sales of companies, you will find that most of the sales come from a few companies. In pharmaceuticals, you will see that a few drugs generate most of the sales. In wealth, if you take just 1% of the wealthiest people, you get half of the wealth. So you have a small number of observations, but these observations have a very large effect; they are more significant than any other data. And in finance, almost everything is fat-tailed: we have few such observations, but they create the largest effect. If we look here at the S&P 500 index, this is the distribution of its returns, and in the left tail we can find, for example, the economic crisis of 2008. It is not acceptable in finance to build a model without taking into consideration these few observations that have a really large effect.

Okay, so I'll start by giving an idea about our technology. Around one or two years ago, we were using an LSTM algorithm. I will not go into the details of LSTM, you can find them on the internet; I will just say what its problem was. As we know, what makes time series different is that the temporal aspect is very important. If we look at the predictions, we can see clearly what the model was trying to do: in order to predict time t plus one, the model was using the price at time t. Most of the time it was simply repeating the previous price. This is known as the problem of persistence: you end up with a persistence model. This implies that the future predictions are calculated on the assumption that the conditions remain unchanged between the current time and the future time. More specifically, if you plot the cross-correlation between the real values and the predicted values, you will find that there is a time lag of one day, which means that most of the time, the prediction for the next day is just equal to today's price.
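To illustrate how such a persistence model can be detected, here is a minimal sketch that measures at which lag the correlation between predictions and actual prices peaks; the synthetic random-walk series and the toy "model" are assumptions for the example, not our LSTM:

```python
import numpy as np

rng = np.random.default_rng(1)
actual = 100 + np.cumsum(rng.normal(0, 1, 300))

# A persistence "model": the prediction for day t is just the price at t-1.
predicted = np.roll(actual, 1)[1:]
actual = actual[1:]

def corr_at_lag(pred, real, lag):
    """Correlation between predictions shifted back by `lag` and reality."""
    if lag == 0:
        return np.corrcoef(pred, real)[0, 1]
    return np.corrcoef(pred[lag:], real[:-lag])[0, 1]

# If the correlation peaks at lag 1, the model is just echoing yesterday.
best = max(range(5), key=lambda k: corr_at_lag(predicted, actual, k))
print(f"correlation peaks at lag {best}")  # prints lag 1 -> persistence
```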
That's why we moved to another concept. We are now working on two approaches, but I will explain just a bit about the first one, which is predicting according to price patterns. Okay, I'll describe the process, and then I will give some details on each step.

The first step, of course, is data preparation. We get the data from our provider, Quandl for example. The data we get is about stock prices and financial reports; anyone familiar with trading knows that financial reports are very important in order to assess the success or failure of companies, so it is good data to analyze and to predict from. We also have news, et cetera; I will not go into the details.

The second step is feature extraction. Having a lot of data doesn't mean we can immediately start training our model; of course, we need to extract some information. What we do is calculate technical indicators, and these technical indicators fall into many categories: some of them, for example, are indicators about volume, or about the moving averages of the stocks. These are features that we can calculate for any stock on any day. For example, the moving average of a stock on a given day over the last 14 days tells you how the stock was moving over those 14 days. We have other values as well. These two steps we perform every day: every day we get the data, do the feature extraction, et cetera.
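As a concrete illustration of this feature-extraction step, here is a minimal sketch of the kind of technical indicators described above, computed with pandas; the column names, the 14-day window, the synthetic data, and the choice of indicators are assumptions for the example:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
df = pd.DataFrame({
    "close":  100 + np.cumsum(rng.normal(0, 1, 120)),   # toy price series
    "volume": rng.integers(1_000, 10_000, 120),          # toy volumes
})

# Moving-average indicator: how the stock was moving over the last 14 days.
df["ma_14"] = df["close"].rolling(window=14).mean()

# Price relative to its moving average: a simple trend feature.
df["close_over_ma"] = df["close"] / df["ma_14"] - 1

# Volume indicator: today's volume versus its 14-day average.
df["vol_ratio"] = df["volume"] / df["volume"].rolling(window=14).mean()

print(df.dropna().tail(3)[["close", "ma_14", "close_over_ma", "vol_ratio"]])
```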
Now, when we want to run a strategy, which means we want to forecast for that strategy (remember, a strategy has its parameters, three of which are stop loss, target gain, and holding period), what we do is labeling according to the strategy itself. We don't label the data the way most companies do, for example by percentage of returns; we label it according to a given strategy. Since most of our strategies go both long and short on the market, and there is a financial concept behind going long and short rather than only long or only short, we do the labeling for each stock. You can think of the label as a measure of how good this stock is for a long position according to this strategy, and likewise we calculate, for another stock, how good it is for a short position according to that strategy. There is, of course, a formula that we use here.

After that, we start learning from the data. You will ask: from what data do you learn, on what data do you perform the labeling? If we want to forecast today according to, say, strategy one, we go back, take an interval of several days, and perform the labeling on it. After that, we start learning from this data. And what is the learning here? As we can see, it's very simple: it's a matter of clustering.

In fact our concept, which again was recommended by Taleb, is this: if you want to forecast today for a stock, it's enough to find similar stocks in the historical data that turned out to be winners. If I can find feature values today that are very, very similar to feature values in the historical data, and those features led to a positive return, then there is a very high probability that this stock will give a high return. It's as simple as that. And in fact, many hedge funds and managers already do this: they look for similarities, but according to one or two variables. What we have done is apply their approach using machine learning and some more advanced ideas.

So the historical data is clustered, and then we project all of today's data onto it: we cluster the historical data, we take the stocks in the market today, we place them in the clustering, and we apply some formulas in order to find the top clusters, the top stocks that are very, very similar. Finally, since in the end we follow a strategy and build portfolios, we add the stock to the portfolio and follow up with it.

For the data preparation, we do preparation and also adjustment. If you are familiar with the concepts of dividends and splits, you know that we need to make some adjustments to our data. For example, here we have a stock shown in blue whose price then drops and continues in red. What happened here is called a split: the price has not really decreased, but each share was divided in two, and so its price was also divided in two. So that our model does not treat this as a real decrease in price, we adjust all the historical data in order to keep the stock's series consistent.

Secondly, we have calculated many features related to the technical indicators and to changes in the financial ratios. What we do here is remove features that have very strong collinearity with each other, and keep the features that have a strong correlation with the label. I think all of us do that, so I will not go into the details.

Here is the similarity concept that we follow. Say we have a feature here for one stock, and the same feature for another stock. To forecast today for stock two, we try to find similarities, and not just on one variable: we work in many dimensions, with many variables. As we can see, the two are similar, and after that the price of stock one increased, and that is what really happened for stock two as well.
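As an illustration of this similarity idea, here is a minimal sketch that finds, for today's feature vector, the most similar historical feature vectors and averages the returns they led to. The Euclidean distance, the k of 20, and the toy data are assumptions for the example, not our actual similarity measure:

```python
import numpy as np

rng = np.random.default_rng(4)

# Historical feature vectors (e.g. the technical indicators from earlier)
# and the forward return that each historical observation led to.
hist_features = rng.normal(size=(1000, 5))
hist_returns = rng.normal(0.0, 0.05, size=1000)

def similar_winner_score(today, features, returns, k=20):
    """Average forward return of the k most similar historical observations."""
    dists = np.linalg.norm(features - today, axis=1)
    nearest = np.argsort(dists)[:k]
    return returns[nearest].mean()

today_vector = rng.normal(size=5)
score = similar_winner_score(today_vector, hist_features, hist_returns)
print(f"average return of similar historical stocks: {score:+.4f}")
```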
So why do we simply use unsupervised learning? If we go back to the concept of fat tails, we said that we should take into consideration the few observations with large effects. If you are using other models, those few observations will not be taken into account. With clustering, all the data is there: you place today's stocks on it, and the data that is sometimes considered outliers, but that has large effects, has a probability of 100% of being considered. That's why we follow such a simple clustering approach. Of course we have our own clustering algorithm, but you can test this with K-means, K-medians, mean shift, et cetera. You should, however, take some points into consideration.

For example, when you are using K-means to cluster your data, you have to make sure that the data has a Gaussian distribution. When we say normal, this means that the system has a meaningful mean, so the mean can be a representative of the system. If your system does not have a normal distribution, you cannot use K-means, because K-means clustering is based on the concept of means, which only has meaning for normal distributions. Otherwise you can try other methods; we have tested all of these. For example, you can test mean shift, which does the clustering according to the density of the data: the high-density modes act as points of attraction for the clustering.

There is another issue with techniques like K-means. Here, for example, is the distribution of a variable. It's not very clear on the slide, but most of the data lies between zero and about 1e7, and there are rare values distributed very, very far from zero. If you keep your data like that and you do scaling (because for K-means, of course, you have to scale your data) without any cleaning, you will see that all your data ends up at almost the same value. So when you are using K-means, you have to remove the outliers, the points that are very far from the mean of the data. And when you do this, you are removing points that might hold very significant information. That's why you should be careful: when using K-means you should clean your data, but by doing so you may be removing very significant information.

Back to unsupervised learning, in the interest of time. Here is what I was saying again: we should not remove these points just because they are far from the usual range of a given variable. They might hold very significant information and should be kept, and when we are using clustering, they have a probability of 100% of being considered.

Okay, so as I said, I cannot give many details, but what defines a good clustering algorithm? It's the algorithm you follow and, most importantly, the similarity measure you are using. In fact, we are using something called entropy, which is a concept from information theory. Entropy can be applied to any system, whatever its distribution, so here we are not tied to the distribution of the data. What does entropy measure? We can define the entropy of a system as a property that measures its randomness. For example, if we consider this as a cluster, we can see high randomness, high impurity in it, whereas when we go to the right, we find low randomness and very high purity. So what we do to cluster the data is start with an initial clustering, and then, according to some formulas, we measure how pure the clusters are using this entropy formula, which is based on the probabilities. Then, over many iterations, we try to improve this.
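As an illustration of using entropy as a purity measure, here is a minimal sketch that scores a cluster by the Shannon entropy of the label mix inside it (lower entropy means a purer cluster). The toy labels and clusters are assumptions for the example, not our actual formula:

```python
import math
from collections import Counter

def cluster_entropy(labels):
    """Shannon entropy of the label mix in one cluster: -sum(p * log2(p))."""
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# A mixed (impure) cluster versus an almost pure one.
mixed = ["long", "short", "long", "short", "long", "short"]
pure  = ["long", "long", "long", "long", "long", "short"]

print(f"mixed cluster entropy: {cluster_entropy(mixed):.3f}")  # 1.000 (high)
print(f"pure cluster entropy:  {cluster_entropy(pure):.3f}")   # 0.650 (lower)
```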
Using this method, the outliers will not distort the clustering. With K-means, I think you know the problem of having outliers: the means, the centroids, stop being meaningful. With our method, the outliers are taken into consideration and, at the same time, do not distort the other clusters.

Okay. So finally, this is one of our back tests that takes the outliers into account and, at the same time, has low volatility; there are some financial metrics here as well. As we can see, using this algorithm we did better than the S&P 500 even through the financial crisis of December 2018: the portfolio returns are better than the S&P 500 index. And this is despite the bad environment, because in a crisis you have a bad environment; despite that, we achieved good returns.

So why are we simply using unsupervised learning and not more advanced techniques? First of all, we have decision transparency. One of the most important things is to understand what the model is trying to do. When you are using other techniques, I think it's very difficult to see what your model is doing. Add to this that we are still almost a startup, so when we need to explain our algorithm to investors and many other people, it's very simple to explain. You cannot say to a hedge fund, "give me one million dollars, but I cannot tell you what the algorithm is going to do." So it's simple, and they can trust this type of technology.

The second thing is lower risk. Each day when we run a strategy, the strategy looks back over several days and starts learning from zero; each day it takes a large amount of data and learns afresh. So if the model has a bad experience, the next day it starts from zero and is not affected by its previous decisions.

The third thing is greater diversity: sometimes we run two strategies, and each one chooses its own stocks for long and for short, which is very good for the market. And finally, because we have simply made it through the crisis, and this is what most hedge funds want to see when you back test a strategy.

So thank you for your attention. I will be happy to answer all of your questions, if you have any.