Today I'm going to talk about life after the Calc core change. My name is Kohei Yoshida. I'm with Collabora Productivity, and I've been working with this code base — LibreOffice now, and OpenOffice.org before that — since 2004. I sent my first patch in 2004; it's still yet to be integrated. I'm originally from Japan. I spent half my life there, then moved to the United States on August 20th, 1996, and I've been there ever since.

Today I'm going to cover what change was actually made in the Calc core. I talked about this at the previous conference in Milan, so I'll just briefly go over what was done, then move on to which areas of the Calc core were affected by the change, then expectation versus reality — what I was expecting to happen and what actually happened — and finally what we can learn from that and what strategy we need going forward.

First: what change was actually made in the Calc core?
I'm actually reusing this slide from last year's conference, so some of you have seen it already. This was the old Calc core model. It consisted of ScBaseCell, the base cell class, which held the attributes shared between the different cell types: broadcaster, text width, cell type, and script type. Five cell types were derived from this base cell: the value cell stored numeric values, the string cell stored string values, and the note cell — despite the name — was actually used for something else: it stored broadcasters for empty cells. The edit cell stored rich-text strings, and the formula cell stored formulas.

After the change, we moved from cell-object-based storage to array-based storage. The new storage keeps cell values of the same type together in a single array, which lives in contiguous memory. We have four array types: one stores string values (now shared string instances — I'll talk about that later), one stores double values, one stores edit text — the rich-text content previously stored in ScEditCell — and one stores an array of formula cell pointers. ScFormulaCell is the only cell class that was retained across the change.

The logic of managing the different array types is very complex, so we decided to offload the management of the arrays to mdds, an external library, by adding a new container type, multi_type_vector, to handle the logic of multi-typed array storage.

Because the content of the storage changed so drastically, the code that accesses cell ranges had to change as well. This is how things were done the old way.
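To make the storage change concrete, here is a minimal sketch of the block idea behind mdds::multi_type_vector: a column is a sequence of typed blocks, and each block holds all consecutive cells of one type in a contiguous array. The type and function names below are illustrative stand-ins, not the real mdds or Calc API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// A column is a run of typed blocks rather than one object per cell.
enum class BlockType { Empty, Numeric, Formula };

struct Block
{
    BlockType type;
    std::size_t size;                  // number of cells in this block
    std::vector<double> numbers;       // payload when type == Numeric
    std::vector<std::string> formulas; // payload when type == Formula (expression text)
};

struct Column
{
    std::vector<Block> blocks;
};

// Count formula cells by visiting blocks, not individual cells.
std::size_t countFormulaCells(const Column& col)
{
    std::size_t n = 0;
    for (const Block& blk : col.blocks)
        if (blk.type == BlockType::Formula)
            n += blk.size; // whole block at once -- no per-cell type check
    return n;
}
```

With this layout, a column of a thousand numbers followed by three formulas is just two blocks, which is what makes the traversal patterns described next so much cheaper.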
Say you want to do something only with formula cells. The old way was to iterate through the cell array, check each cell's type, and if it equals formula cell, do something with it. The code is pretty simple, but the downside is that if the column consists mostly of numeric cells plus maybe two or three formula cells, you still have to iterate over every single cell just to pick up those few formula cells. So there was a performance cost.

The new way of handling situations like this is to iterate over the blocks to pick up the formula blocks. If the column consists only of numeric values and formula cells, you deal with just two types of blocks, and it's much quicker to iterate over two or three blocks than over a thousand or ten thousand cells. Everything is pretty much templatized, so what you do is create a handler function object and pass it to ProcessFormula, a templatized function that picks up only the formula cells and does something with them. It's pretty simple.

Okay, another change we made: we decided to share the formula token array instances, which are represented by ScTokenArray. In the previous model, each ScFormulaCell object stored its own unique instance of ScTokenArray, and even if all these formula cells contained identical formula tokens, each of them would have its own instance. In the new model, after shared formulas were implemented, the code detects formula cells that have identical formula expressions, groups them into a single cell group, and stores the token array inside the group object.
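The handler-based traversal described above can be sketched as follows. This mimics the shape of Calc's templatized ProcessFormula helper, but all names and types here are simplified stand-ins for illustration, not the actual Calc code.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct FormulaCell { std::string expression; };

enum class BlockType { Numeric, Formula };

struct Block
{
    BlockType type;
    std::vector<double> numbers;       // Numeric block payload
    std::vector<FormulaCell> formulas; // Formula block payload
};

// Templatized processor: the handler is invoked once per formula cell,
// while non-formula blocks are skipped wholesale, without touching
// their individual cells.
template <typename Handler>
void processFormula(std::vector<Block>& blocks, Handler handler)
{
    for (Block& blk : blocks)
    {
        if (blk.type != BlockType::Formula)
            continue; // skip the entire block in one step
        for (FormulaCell& cell : blk.formulas)
            handler(cell);
    }
}
```

A caller just supplies a function object — for example a lambda that collects or rewrites each formula cell — and never sees the numeric blocks at all.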
So if you have, say, ten thousand formula cells all having the identical formula expression, that is now represented by a single ScTokenArray instance.

Another change we made: we decided to pool all string instances. Previously, if you had a Calc document with a whole bunch of string content, each string instance was unique, even if most of the string content was identical. We decided to pool all these string instances into a single shared pool, and have each string instance store only a pointer to the shared string instance in the pool. When a string is pooled, we also generate an upper-case version of the string in the pool and store a pointer to it in the string instance as well.

This change was made not because we wanted to save memory footprint — the underlying strings were already pooled internally — but because we wanted to accelerate the performance of string comparison. The old way of comparing two strings went like this: for a case-sensitive comparison, we would first fetch a case-sensitive collator and use it to do a full string comparison; for a case-insensitive comparison, we would get a case-insensitive transliteration and then do the full string comparison. With shared strings, all you have to do is call getData() on each shared string, which gives you a pointer to the shared string instance in the pool, and simply compare the pointer values. For case-insensitive comparisons, you just call getDataIgnoreCase() instead, which is much quicker.

The summary of the change is that this was the largest refactoring effort ever done in the Calc core, and therefore the most critical part of the code — the cell storage — was changed.
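The pooled-string trick can be illustrated with a small interning pool. Calc's real implementation is svl::SharedString backed by a shared string pool; the class below is a simplified stand-in (with ASCII-only upper-casing) written just to show why equality becomes a single pointer comparison.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <unordered_set>

class StringPool
{
public:
    struct SharedString
    {
        const std::string* data;           // pointer into the pool
        const std::string* dataIgnoreCase; // pointer to the pooled upper-case form
    };

    SharedString intern(const std::string& s)
    {
        std::string upper = s;
        for (char& c : upper)
            c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
        // std::unordered_set keeps element addresses stable across
        // rehashing, so handing out raw pointers into it is safe here.
        const std::string* p = &*m_pool.insert(s).first;
        const std::string* pUpper = &*m_pool.insert(upper).first;
        return SharedString{p, pUpper};
    }

private:
    std::unordered_set<std::string> m_pool;
};

// Case-sensitive equality: one pointer comparison, no character scan.
bool equal(const StringPool::SharedString& a, const StringPool::SharedString& b)
{
    return a.data == b.data;
}

// Case-insensitive equality: compare the pooled upper-case pointers.
bool equalIgnoreCase(const StringPool::SharedString& a, const StringPool::SharedString& b)
{
    return a.dataIgnoreCase == b.dataIgnoreCase;
}
```

Two cells holding "Andy" and "ANDY" compare unequal case-sensitively but equal case-insensitively, and both comparisons cost one pointer check regardless of string length.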
All the neighboring code that accessed the cell storage also had to change if you wanted to keep things working. With shared formulas, we can no longer assume that a formula cell instance containing a token array manages the life cycle of that token array itself. With shared strings, we have to make sure that all string objects are pooled in the shared string pool. And with the previous cell array, random access into the array was a constant-complexity operation; we can no longer assume that, so we had to change the algorithms that access the cell storage in order to maintain reasonable performance.

So what areas were affected? To keep things simple, assume this big box is the entire Calc core. This much was affected, and there's a little area that was not. That's pretty much my understanding of how much of the core was affected by this change. Of course, this is not scientific — I just came up with a number, and I'm probably lying here — but it feels like 90% of the Calc core was affected.

The most affected areas were formula dependency tracking, reference updates, and copy and paste. Undo was also affected. Spell check was one of the first ones affected, I think — that was one of the first bug reports that came in after the core refactoring. The ODF import, for example, was affected mostly in terms of performance: after the refactoring, the import became extremely slow.
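The point above about random access losing its constant-time guarantee can be made concrete. With block storage, locating a row means scanning blocks from the top, so code that loops over rows calling a random-access getter degrades, while a single pass over the blocks stays linear. The sketch below uses simplified stand-in types (not the real Calc or mdds API) and a probe counter purely for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Each block holds a run of consecutive numeric cells.
struct Block { std::size_t size; std::vector<double> values; };

struct Column
{
    std::vector<Block> blocks;
    mutable std::size_t blockProbes = 0; // counts block visits, for illustration

    // Random access: must scan blocks from the top to locate the row.
    double getValue(std::size_t row) const
    {
        std::size_t offset = 0;
        for (const Block& blk : blocks)
        {
            ++blockProbes;
            if (row < offset + blk.size)
                return blk.values[row - offset];
            offset += blk.size;
        }
        return 0.0;
    }
};

// O(rows * blocks): repeated random access -- the pattern that got slow.
double sumByRandomAccess(const Column& col, std::size_t rows)
{
    double total = 0.0;
    for (std::size_t i = 0; i < rows; ++i)
        total += col.getValue(i);
    return total;
}

// O(rows): one pass over the blocks -- the pattern the rework moved to.
double sumByBlockIteration(const Column& col)
{
    double total = 0.0;
    for (const Block& blk : col.blocks)
        for (double v : blk.values)
            total += v;
    return total;
}
```

Both functions compute the same sum; the difference is that the first re-scans the block list for every row, which is exactly the kind of access pattern that had to be rewritten after the storage change.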
We had to change things slightly to obtain reasonable performance. Find & replace was also affected in terms of performance; functionality-wise it still works, but in some cases the performance is genuinely bad, and this is one of the things we have yet to fix.

Other areas that were affected include the Excel import and export, for the same reason the ODF import was affected. Named range storage was also affected; it caused some crashes that we had to fix. Database ranges are very similar to named ranges, so those were affected as well — also some crashes. Cell editing was affected, and so were cell content rendering and cell comments, though these were less severe. Cell comments were affected not necessarily because of the core refactoring itself, but because the core change happened to coincide with a change somebody else made to rework the cell comment storage, and that combination contributed to some of the issues. And a whole bunch of other areas were affected too.

The areas in white are the unaffected ones — or at least we haven't seen a bunch of bug reports for them, so I assume they are really okay for now. And of course the UNO API layer sits on top of everything. Although we fixed much of the UNO code, I think some of it still assumes the old cell storage, so that's something we still need to look into.

Now, expectation versus reality. As I said, this was probably the largest refactoring we ever had to do in the Calc core, and we didn't know what to expect going into this whole situation. Here are some of the things that I realized.

Expectation number one: I assumed that I'm very careful when I make code changes — I do some due diligence to make sure things don't break and Calc doesn't crash.
So I was expecting maybe 25 to 50% of the code changes I made to cause regressions. In reality, almost all of the code changes — basically everything — caused some form of regression, changes that users would consider to be regressions. That was a big life lesson for me.

Expectation number two: the hard part is behind us, and we just need to fix a few bugs. The toughest part was the refactoring itself, and that was over, so I was happy — I think I was pretty cheerful in the previous presentation. In reality, the worst part was yet to come. That was just the beginning, and lots of follow-up refactoring ensued. As I said, when the core storage changes, its behavior changes, which means the code that uses the container also has to change in order to keep things fast. Unfortunately, those areas required medium to large refactoring as well. One of them was sorting: prior to the sorting rework, it worked — it didn't crash — but it was extremely slow.
Unfortunately, in order to fix that, I had to rewrite part of the sorting code — not the whole thing, but a good part of it — to bring the performance back. That was one of them. Another was named range handling; that one was causing crashes, and it also unfortunately required medium refactoring, and the same goes for database ranges. So that was something I wasn't expecting — or wasn't hoping — would happen.

Expectation number three: just focus on regressions during the run-up to the 4.2 release, and go back to normal for 4.3. The refactoring was done in time for the 4.2 release, so my hope was to focus on regression fixing during the 4.2 cycle and just go back to the way things were after that. In reality, although we did fix the worst ones in the 4.2 cycle — the worst ones are behind us — that doesn't mean there aren't any regressions left for us to work on. So, yeah. What can I say?

Expectation number four: prior to the refactoring, we already had quite a number of unit tests written for Calc, so I was pretty confident that even though we would probably get quite a number of regressions, most of them would be caught by the unit tests to keep us safe.
So I was pretty confident that okay Even though we will probably get Quite a number of regressions, you know, most of them will be caught by the unit test to keep us safe In reality I may be lying, but I think I feel like being up a doubly them on unit test for Calc during photo two period alone And we still don't have enough, you know, given that, you know, there are still whole bunch of regressions reported in Bagusela So that was also lesson And check number five as a responsible corner, I will fix all the bugs might change course, you know I didn't want to be the bad guy, you know, I didn't want to be the one to break Things for other things other people to pick up and fix, you know, I feel bad about that, you know I'm a human being You know, people you say, okay, you break, you know, you break, you fix it So I try to do that, but you know It's just too large, you know, I'm a single person I I need to you sleep I have only 24 hours maximum so, you know In reality, we need multiple people who can handle Bugs and Calc or you know, comfortably. So that's the reality. So So what can we learn from this going forward? So the challenges we still face Basically squash remain the regressions cause why the whole changes The numbers to high It's actually it's still not I try to make it sound like it's a disaster It's not actually that bad. It's just, you know, we should have don't want to do Basically a gallon of people and you know, have them become a comfortable handling but fix it in Calc When the quarter factor in here, I think not many people are comfortably comfortable even trying to fix box So, you know, I was left with huge number of So when is the whole thing started No, many people actually Nobody else was comfortable enough. It's a change made to go fix box So I think initially I think I got I got fixed it. 
But the major ones were given to me, and that was a big issue. These days, I guess, people are hopefully feeling more comfortable tackling them and are becoming more familiar with the code, so it's not as bad as I'm trying to make it sound here. But still, we need more people who can try to fix bugs — it's an important thing. And it's not that difficult. People may think that fixing Calc bugs should be left to the experts, but that's not the case. So let me try to encourage you.

The third challenge is to build a culture of writing unit tests for each and every bug fix — not just some bug fixes. If you fix a bug, write a test for it. If you implement a feature, write a whole bunch of tests for it, so that these things don't break later. We need to build this culture — I won't say make it a requirement, but make it common: if you fix a bug, the bug fix isn't finished until you write a test.
Let's promote that kind of thinking inside the project — that's what I'm trying to say here. And of course, to make all that happen, unit tests are key. One unit test is worth, what, 20 future bug fixes: if you write a test, it prevents 20 future re-fixes of the same bug. I understand that writing a test is not an easy thing — it may actually take much more time than fixing the bug itself, especially if the bug is fixed by only one or two lines of change. Despite that, you need to understand that this is the minimum you have to do when you fix a bug. I always think that a bug fix does not finish until you write a test. Let's try to promote this kind of thinking. Tell yourself that every day, and it sticks in your brain.

Writing unit tests is also a courtesy to your fellow developers. A test is not for yourself but for the other developers, so that they don't have to feel bad about breaking your previous bug fixes or features. It's a courtesy — at least that's the way I see it. And we really don't have a choice. If you're in a project where writing tests is not mandatory, you have to screen every single commit, every single change that somebody makes, to make sure things don't break. We really don't have that option here, because we try to encourage new coders to come in and commit fixes, and with a code base that has 20 years of history, it's not possible to exercise every single corner case by hand. I've been with this code base for 10 years and I still don't know everything; I keep finding out that I broke things I wasn't even aware of. That goes to show how difficult this code base is.

I can't emphasize this enough — that's why I'm using two slides just for this topic. Your bug fix will be broken again. It actually happens, especially if you don't have a test; somebody will break it in the next release.
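The "bug fix plus regression test" shape argued for above can be sketched minimally. Calc's real tests use CppUnit, but plain assert keeps this self-contained, and the scenario — a sum that used to stop at the first empty cell — is invented purely for illustration.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Sum a column where empty cells are represented by std::nullopt.
// The (hypothetical) bug: empties used to abort the loop early.
double sumColumn(const std::vector<std::optional<double>>& cells)
{
    double total = 0.0;
    for (const auto& cell : cells)
        if (cell) // the fix: skip empties instead of stopping at them
            total += *cell;
    return total;
}

// The regression test that ships together with the fix: it pins the
// corrected behavior so a later change cannot silently reintroduce
// the bug.
void testSumSkipsEmptyCells()
{
    std::vector<std::optional<double>> cells =
        {1.0, std::nullopt, 2.0, std::nullopt, 3.0};
    assert(sumColumn(cells) == 6.0);
}
```

In a CppUnit-based suite the assert would become a CPPUNIT_ASSERT_EQUAL inside a registered test case, but the principle is the same: the one- or two-line fix travels with a test that outlives it.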
So make sure you write a test, so that when somebody breaks your bug fix, the breakage gets caught. Without tests, we may be fixing the same bugs forever. I keep saying this time and time again, but we need to emphasize it and promote this kind of culture. So here's a question for you — which one will be your choice: unit tests, or bug fix after bug fix, forever? Hopefully you will have the right answer.

[Audience comment.] That's true — yeah, that's an interesting corner case, but thanks. That's all I have. Any questions?

[Audience question about memory consumption.] In theory, memory consumption will be reduced, because the formula tokens are shared, and the overhead of the cell value storage is also much lower than in the previous storage model. Having said that, there are always corner cases — there's always some code that still assumes the old storage model, in which case the memory requirement may rise temporarily until we fix it. So in theory the memory footprint will be much, much lower, but there are some corner cases where that may not be the case.

[Audience: You did have some stats — you saved something like a hundred megabytes out of 300?] Yeah, in some cases the memory footprint for the formula token arrays went from somewhere around 300 megabytes down to 200 megabytes. Any other questions? Thank you very much.