Ladies and gentlemen, Marc-André Lemburg, give him a big hand. So thanks again for the very nice intro. It's going to be hard to live up to that. Right, so I'm going to talk about a project that we did at the end of last year. I'm Marc Lemburg. I've been around in the Python community for lots and lots of years. I'm also one of the EuroPython organizers. The rest you can read here, it's not that important. So this is the agenda for the talk, quite a few slides. I hope I can show all of them.

So what was the outset of this project? We were asked last year by, let's say, a big company, because I can't give any details; I signed an NDA for this. So a big company wanted to acquire a small company doing Python development internally. And they wanted to know how much their Python code was actually worth, what the value was of the IT that they had. And because they didn't have any Python skills internally, because they are mostly a Java company, they asked us to help them. And we had very little time to do this; we just had two or three weeks to run the whole project. So we basically knew nothing about valuation of companies, and we had to come up with a few things. We thought that it might be a good idea to try a few different models, then do some averages and some calculations, and come up with some value to give them. And they liked that, and so we did that.

So first disclaimer: the stuff that I'm talking about here, I'm not an expert in. We basically just did some research, chose some different methods and tools to use, and then went ahead with it and mixed all that with our experience in running these projects. So what do you have to do if you want to assign a value to an IT startup? Well, first of all, you have the business value. That is something that I'm not going to talk about, because the company did that themselves; they had experts for that.
Plus, of course, if it's IT focused, then you have the IT value of the company, and that's where we came in. Both sides, of course, have risks, and you need to address those risks when you value the company. Again, the business side we did not really participate in much. But what we did do is, when we looked at the code base that they had created and the way that they had set up their systems, we found a few issues that also went, for example, into the area of data security, or maybe patent and trademark infringements. Those were things that we told the business side of the company to take into account as risks. And then we came in to judge the IT risks that they had. So this is just a list of IT risks that you usually find in larger projects. And so we had to look at the IT side of this valuation process.

So this is what we set out to do, and we told the company, and they were fine with doing that. First we sat down with the team and tried to analyze it: we tried to figure out whether the developers were any good, whether the system was any good, whether the data that they had was any good, and it turned out to be very good. And then of course we wanted to add some scientific approach to this. So we, of course, used Wikipedia and searched around a bit for possible ways of doing valuation. We found this COCOMO model, which seems to be a standard in the industry for doing these kinds of things. Of course it's based on C/C++ and Java, not on Python, and we found out about that later on in the process. As a second model we used an effort model, which I'm going to talk about later on. Anyway, then you get some basic value for the whole thing, and then you apply some soft values, or you remove some soft values if you find risks, for example, from the value that you get, and we call this added value.
And then you get one final value, which is the estimate based on models. Then you go ahead and try to figure out what it would cost to rebuild this whole thing from scratch, and you give a value for that. In the end the company has to decide whether to buy the company and maybe run with it, or instead use the approach of building everything themselves, which might actually be cheaper.

So let's look at the analysis part. As I said, we have quite a few soft factors, and we have some factors that we can actually measure. You have code metrics, and you always have to take into account that you cannot possibly look at everything in that short time frame. So you have to build in some risk buffer for the inaccuracies that you know you're going to have in your estimate. For the first part, we just sat down with the team, we discussed everything, and we tried to figure out as much as possible from them. We had a list of something like 200 questions for them to answer. We went through that in a meeting for a complete day and got all the information from them. That's also how we found out about things that were risks, for example, that they had not really addressed yet, and that the big company, buying the small company, would have to address in order to fit its own corporate standards. Then we had to check the code; that was fairly easy because there are tools for this. And then again you have to throw in some experience to measure the risk buffer that you have to add to this.

So let's have a look at how you can measure the code metrics. There's this nice Python tool called Radon, which you just throw at your Python code. It runs through the whole repository that you have, takes all the different details from that code, gives you nice summary reports, and outputs all the stuff that you need to know about. I think I just skipped a slide. Yeah.
So the standard terms that you have in code metrics are, of course, lines of code; then source lines of code, which is basically just lines of code without the comments and the docstrings. Logical lines of code is something special: it's actually just code that gets executed. So for example, if you have a for statement, then the for line itself is not necessarily executed; it's the inner loop body that is executed, and so you count those lines. Then you have blank lines, of course, which are especially important in Python, because blank lines are white space and we love white space, right? The more white space, the better you can read the code.

You also have to look at lines of code per module, and you can use that as a basis for how maintainable that code is, just from looking at these numbers. The more lines of code you have per module, and the more classes and methods and everything you have per module, the harder it usually gets to maintain that code. It's usually better to break modules into smaller pieces and do it that way. So Radon helps with this. It also helps with figuring out whether you have enough inline documentation, which I find very important to have in a code base. I very often get to see code written by companies that doesn't have a lot of inline documentation, so basically all the documentation about the code itself is either somewhere else or it just doesn't exist. Having docstrings and inline comments is a good thing, so you get a plus in added value if you can show this. In this particular case it was kind of average, not really that good.

Then you have two measures that take all this data and also add some extra information from the code base: they actually parse the code and try to figure out how many decision nodes you have in your execution tree.
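As a rough illustration of these raw metrics (a sketch, not Radon itself), a minimal counter might look like this. It deliberately ignores docstrings and multi-line strings, which Radon handles properly:

```python
# Simplified raw-metrics counter: total lines (LOC), source lines
# (SLOC), blank lines, and comment lines. Docstrings and multi-line
# strings are NOT treated specially here, unlike in Radon.

def raw_metrics(source: str) -> dict:
    loc = sloc = blank = comments = 0
    for line in source.splitlines():
        loc += 1
        stripped = line.strip()
        if not stripped:
            blank += 1
        elif stripped.startswith("#"):
            comments += 1
        else:
            sloc += 1
    return {"loc": loc, "sloc": sloc, "blank": blank, "comments": comments}

example = """\
# add two numbers
def add(a, b):

    return a + b
"""
print(raw_metrics(example))
# → {'loc': 4, 'sloc': 2, 'blank': 1, 'comments': 1}
```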
So for example an if statement would be a decision node, and the more decision nodes you have per function, the higher the complexity. So higher values are worse, because you get more complexity, and lower values are better. It's similar with the maintainability index, except that it's the other way around: the maintainability index takes the complexity, the density, the lines of code and everything, puts it all together into one nice huge formula, and gives you an index where higher values mean better maintainability. Again, you can use Radon for this. So this is what we used as part of the input for the evaluation.

Then we had a look at the test coverage, so we had them give us all the output of coverage.py. We also checked for end-to-end tests, which are very important; those are things that you usually don't cover with unit tests. You actually have someone sitting there, or maybe you have Selenium sitting there, entering the data into your application, and you check whether the end result, let's say the report that you get out of it later on, actually matches what you expect. Those are very important to have. In this case they did have a few, but not that many, so that was a bummer. Then we also checked for randomized tests, because we found in other projects that if you don't use randomized tests, you often end up with test cases that are biased towards one particular area in your code. So even though you have 100% test coverage, you're not actually testing 100% of what the possible input data could look like. So you think everything is correct, but it's not necessarily so, right?

So that's what you can do in terms of code metrics, just looking at numbers. Next was this COCOMO model that we basically read up on Wikipedia. It's a very old model, I think from the 70s or 80s. It's used to give an estimate for how much time you'd have to spend to write code.
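The decision-node counting behind cyclomatic complexity can be sketched with Python's standard `ast` module. This is only an illustration of the idea, not Radon's actual implementation, which counts more constructs and reports per-function scores:

```python
# Sketch: cyclomatic complexity = 1 + number of decision nodes.
# Only a few common node types are counted here; a real tool like
# Radon also handles boolean operators inside comprehensions,
# assert statements, and other cases.
import ast

DECISION_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)

def cyclomatic_complexity(func_source: str) -> int:
    tree = ast.parse(func_source)
    return 1 + sum(isinstance(node, DECISION_NODES)
                   for node in ast.walk(tree))

code = """
def classify(x):
    if x < 0:
        return "negative"
    for i in range(x):
        if i % 2 == 0:
            pass
    return "done"
"""
print(cyclomatic_complexity(code))
# → 4  (one if, one for, one nested if, plus the base of 1)
```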
The only parameter you enter is basically lines of code, and then you have to choose one of these models. Of course, nowadays most projects are organic projects in the sense of COCOMO: you have small teams and an agile process, so there's not much to decide there. Then you get these very simple formulas here with a few parameters, the a, b, c, d parameters. Those are predefined by the model, so you just look them up on Wikipedia and use the ones for the organic project category. Then you have to use an adjustment factor to accommodate for the efficiency of different languages, and what we found is that Wikipedia recommends 0.9 to 1.4 for Java and C. Well, the numbers that came out of this did not really match reality, so we had to use a different factor: we used 0.5. Which kind of indicates, as sort of a side result from this whole thing, that Python is in fact more efficient than Java and C. As if we didn't know.

Right, so what you then get is the development time, and then you have to look at your development team, the number of senior developers and regular developers that you want to put on that team, and how much money you have to spend on them, and then you get the value, and that's your estimate. You do the same thing for the effort model, except that you don't use a formula for this: you sit down with the project manager of that particular product and ask the company how much time they took to write this thing, how many developers they used, and what problems they had, and then you use that as input, and you get a similar value, right? So next comes the magic part, which is added value.
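The basic COCOMO formulas for the organic class can be sketched as follows, using the published coefficients (a=2.4, b=1.05, c=2.5, d=0.38). Applying the talk's 0.5 language adjustment factor to the effort is our assumption here; the model itself doesn't say where such a factor goes:

```python
# Basic COCOMO, organic project class:
#   effort   = a * KLOC^b        (person-months)
#   dev_time = c * effort^d      (elapsed months)
# The lang_factor scales the effort to account for language
# efficiency (0.5 for Python in the talk) -- an assumption, not
# part of the published model.

def cocomo_organic(kloc: float, lang_factor: float = 1.0):
    a, b, c, d = 2.4, 1.05, 2.5, 0.38
    effort = a * kloc ** b * lang_factor
    dev_time = c * effort ** d
    avg_staff = effort / dev_time
    return effort, dev_time, avg_staff

# e.g. a hypothetical 50 KLOC Python code base with the 0.5 factor
effort, months, staff = cocomo_organic(50, lang_factor=0.5)
print(f"{effort:.1f} person-months over {months:.1f} months "
      f"with ~{staff:.1f} developers")
```

Multiplying the effort by an average developer cost per person-month then yields the monetary estimate the talk describes.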
So you have these numbers, and then you go through this list of soft factors that you have, and you add some percentage or remove some percentage from the value, depending on what you think is good quality work or good quality design. You factor in risks, extensibility, maintainability, and various costs that you see in that questionnaire that you did. You add the risk buffer, and in the end you come up with something that you can actually use in your calculation. So we took those two models; we just took a pragmatic approach because we knew no better. We used the value that came out of the COCOMO model and the value from the effort model, and just used the average. Then we added the added-value factor, and in that particular case it came out to something like 75% more than the value from the models, which is a good sign, by the way. I mean, they really did a good job there. And then you come up with a final estimate, right?

In that particular case they also had valuable data in that company, which is something that you don't necessarily have in startups. But if a startup has worked for a number of years, then they have usually gathered some data, and you need to assign a value to that as well. Now for that, we had not found any good model like COCOMO to use, so we could just use the effort model. We basically sat down with them again, trying to figure out how much time it took them and how much they had to pay for the experts to gather that data. Then we had an estimate for that as well, we added everything together, and we had a final value to give to the big company to use as an estimate.

Now the next question was make or buy. For doing that, you basically have to try to create a new company that does exactly the same thing. I'm just leaving aside all the patent and infringement stuff here; I mean, big company versus small company, so the big company can do this; the small company really can't. So, right.
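The averaging-plus-added-value calculation just described boils down to very little arithmetic. The figures below are made up for illustration; only the roughly +75% factor comes from the talk:

```python
# Sketch of the final calculation: average the two model values,
# then apply the added-value factor (positive for quality and good
# design, negative for risks).

def final_estimate(cocomo_value: float, effort_value: float,
                   added_value_factor: float) -> float:
    base = (cocomo_value + effort_value) / 2
    return base * (1 + added_value_factor)

# hypothetical model outputs in some currency unit, +75% added value
print(final_estimate(1_000_000, 800_000, 0.75))
# → 1575000.0
```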
So what you have to do is recreate all the products and all the data. You have to get the experts in, which is usually one of the most difficult parts. And then of course you have to work to get the same market share, in order to be able to compare those two companies. All those costs, and especially the marketing, cost a lot. But that's the business side, so we did not do that; we just focused on the IT side. So you need to see how much effort you would have to put in as, let's say, a software shop like we are, to basically rebuild everything that they did. You'd also have to look at how to recreate the data, which is not necessarily something that you do as a software shop, but at least you have to provide some advice on how to do this. Then you have an offer for rebuilding everything, you put everything together, give it to them, and they decide. In this case they decided not to buy. So for them the analysis was maybe not exactly what they wanted, but I think in the end the outcome was good for everyone.

So how can you add value to your startup? Well, basically it's just all I've said: you work on all these different factors and improve them. And it's not really that hard. You need to write good code. It needs to have a good structure. Complexity should be low, and the structure should be right, so you'd better use more small modules rather than a few large ones. You need to have everything as flexible as possible when you design the whole product. It needs to be extensible, because usually the big company wants to enter new markets which the small company has not thought about, so you need to be flexible at that end. And then of course you need to invest in things like data structures and algorithms. For the algorithms, for example, there are lots of CS books you can use.
You don't always have to use the naive algorithms for everything, which many companies do. Plus, there's one important thing on the IT side: reducing risk by not depending too much on third-party packages, because you don't have that much control over them, even though they might be very high quality. If your company is not capable of maintaining such a third-party package, in case, for example, the author just goes away, or does something else, or the project stops, then you have a huge risk there. Right, and that's all I wanted to say.

Let's have a round of applause for that. Thank you. Okay, so do we have any questions? I've got loads of questions, so I hope you have the same ones that I have in my head, because that'll help. There you go.

Yeah, I wanted to ask about this Radon tool, or this automatic check on the code quality. Isn't there usually the danger that you can play a game with it, that you optimize for the tool and not for actual, better code? How reliable is this Radon? If somebody tells me, oh, tomorrow there will be due diligence, just do some makeup on the code so that the tool gives a better value, is that possible?

I'm quite sure it is possible, yes, but the company in that case did not know about this tool. We gave it to them and said, tomorrow we need the output, so they did not really have a chance of manipulating anything. What we did in that case is we took the output of Radon, which of course gives you the output per module, and then you look at the modules that, say, have a high complexity or a low maintainability, and you check the code of those and analyze why the Radon tool gives those readings. So of course you have to do some code review as well as part of that, but we simply did not have time to read the whole code base. If you're given a code base that has a few hundred thousand lines of code, then there's no way to do it that fast. More questions?
There you are. Gentleman at the back on the right there.

Hello. So did you look at any other alternatives to the COCOMO model, which seemed to use lines of code as perhaps the starting point for measuring things? What other ways could you measure? What other signals from the code could you use to measure the value?

The signals from the code were basically what we did.

You were measuring lines of code, weren't you? That's how COCOMO works.

Yes, that's how the COCOMO model works. And the value that we got out of COCOMO was so much higher than the value that we estimated, and the value that we got out of the effort model, that we simply had to use a different factor for that.

Right. Okay, it just reminded me of a musical example where the BBC used to pay arrangers by the bar, and that was just an arbitrary measure. The arrangers figured out very quickly that if they wrote everything in 2/4 they got twice as much money, because there were twice as many bars. So there's a way you could game this, because if you understand what the thing is that you're measuring, again, very similar to the question we just had before, you could game these things. But you answered that, I guess, when you said that you didn't tell them that you were coming.

Yeah, I mean, of course you look at the code and then you see these things, right? If you see that someone's been adding lots and lots of dummy lines to the code just to get more lines of code, to make it look more valuable, then of course you will detect that.

Okay, sorry. Hello, you mentioned the risk buffer in your talk; what value is appropriate in your opinion?

This was hard to say. In our case, for example, we were not able to review all the code that they had. We did not even have a chance to look at all the components that they had in their code.
So we just looked at the main, central component that had all the interesting bits in it, but we did not look at all the other components that were stuck on the side doing some UI stuff. We knew that we were only covering maybe 20% of what they actually wrote in their code, and based on those 20% we had to extrapolate the quality of the rest. That's what we used for the risk buffer, so the risk buffer was fairly sizable in that case.

And at what number did you arrive? I can't tell you anything. Okay. Yeah. We need more questions. Yep, over there.

Did you continue tracking the company to see whether your valuation was correct, or did you have any other way of knowing whether your valuation was correct or close to correct?

Well, we know that the big company did not buy the small company, and that they're thinking about actually doing it themselves. So on make or buy, they're actually more on the make side, but they're still discussing that. Big companies take longer in these things.

What was the effort of the review? Did you spend weeks with tens of people, or days?

No, no, no, we didn't have much time. We had something like two or three weeks, like I just said. We had one full-day meeting with them, asking all these questions and also doing part of the review, because of course they did not give us the source code to take away, so we had to sit there and look at it. So we didn't have that much information from them, and we had to base everything on that kind of input. Which is not ideal, just to say it, but we were told that we had to give the big company an answer within those three weeks, and that was all we had. So we needed to come up with something that made sense, and basically that's what we did. As I said in the beginning, this is not necessarily how you should do it, right? This is just how we did it.
And the numbers that came out did make sense.

Hi, so first, thanks for your talk. From what I understand, you basically valued a company based on their repository, which is, I would say, interesting. But I would also argue that perhaps the biggest weight in a company valuation is the developer team, or the company structure and the processes that were created out of various needs and so on. So I would be interested in how you approached measuring these kinds of, let's say, softer things.

Well, we did not have a look at the business side of things, because a different part of the project was doing that. The big company had experts for these things, for analyzing their numbers and analyzing the team. All we could do is tell them whether we had the impression they have good developers or good software designers, because that's our expertise. We were not experts on business processes, so we could not really put a value on that. What we did do is tell the business side what we think about their team skills, and we told them about the risks that we found in the code base and in the structure of their systems, but that was basically it for the business side of things.

So it seems like the very common thread in both the talk and your answers is that you were under a lot of pressure and made certain decisions basically because of that. Now, let's say you remove all the pressure. The big company comes in and tells you: take as long as you want, have as many people as you want, do whatever you want with this. What would you have done differently? What would you maybe not have done, or done more thoroughly?

Well, I think we would have done more research on this whole valuation approach. I mean, we basically just had this COCOMO model, which just came up on Wikipedia, and we used that.
There were also some people in the company who wanted to use that model, so they obviously found it useful, so it kind of made sense to use it. We would have put more energy into that. We would have had more interviews with the team members. We would have done more code reviews. We would have had access to all the systems and all the components, so not just looking at a part of it. And basically, talking to people is very valuable; we did not have time for that. We just talked to the chief developers, not the regular developers that they had. We also did not really have much of a look into the development structure and how they work; that is something that we completely left out. For example, we would have looked at how they do their agile process, how that works out for them, whether it's efficient for them or not, this kind of thing. Because when buying the company, of course, you're not just buying the product, you're also buying the people. And people are usually very valuable, but sometimes you need to restructure things to make them more efficient, because you're not necessarily buying the most efficient process there. The development process, I mean, not the business process.

Okay, I can't resist asking a question myself. You said you split it very clearly: the business decisions were for the business, and yours were just the technical ones. But it occurs to me that some of the value you get from software isn't down to how complex it is or how much effort you put into making it, but it's the intellectual property, or finding the correct solution. It took 20 years for someone to eventually come up with Timsort as the heuristic algorithm that's really good for sorting lists. But actually, it's only eight lines of code, and if you wanted to re-implement it and you knew which one you were doing, it wouldn't take long.
So did you have any way of evaluating or measuring that these guys have come up with a really good solution to this, and that coming up with it again from scratch, going through all the wrong solutions, would take ages or cost lots of money?

Yes, yes. In that interview session that we did, we had them explain to us how they do this, what the solution looks like. And we found that they had a kind of clever way of doing things, though not necessarily the most clever way. So there was some basis for improvement there, and you could see where to improve things. That was something we found positive: you can actually make it work better and scale better. And we've taken that into those added values as a percentage; we found that they had reasonably good algorithms for everything. But actually putting a value on what you're saying is more or less putting a value on something like a patent, or a mechanism where you come up with an idea, and that we did not do.

Any more questions? I know there's one at the front from someone who's already asked a question, so we have time for one more if someone hasn't spoken up yet. They can have a burning question answered. Otherwise, it's a familiar-sounding voice, but we're very glad to have it.

I'm not sure if I maybe missed it, but is eGenix even allowed to make an offer to the big company? Because you gained a lot of inside knowledge. Isn't that cheating, if you redo the project now after extracting all the information from them?

Well, if the big company asks us, of course we can. I mean, like I just said, this is big company versus small company. So yes, of course you do have these issues: you cannot just take away the intellectual property of the small company and redo everything. But that's not really our decision, right?
If they want it like that, and they ask us to do it maybe a little differently so that you don't have these issues, of course the big company has lots of lawyers and legal departments and everything, so they can make it work. Which is not necessarily nice, but big companies are not necessarily nice, right?

Good, well that takes us to quarter to the hour and time for our last break. The next thing on the schedule is lightning talks at five o'clock. Thank you very much, Marc.