I had said in a previous video that one of the first real tutorial videos around Stringier would be completed within that weekend. It's been over two weeks now and it still isn't done, so I kind of owe an explanation. At the time I was just finishing up some work on a new feature, specifically the target and jump nodes within patterns; I'll explain exactly what those are, and the hiccups, as this goes on. I wound up hitting quite a few barriers.

There are a number of issues up on GitHub, really enhancements, that I'm working on, and I noticed a lot of them are interrelated: they can be implemented in one single sprint, so it just makes sense to do them all at once. A lot of this project has developed very organically, cowboy-coded rather than seriously designed, which makes sense for a novel approach: you really can't plan out something that hasn't been done before, because you don't know what's going to work, or what's going to work well. So I decided to make this pass into a large audit of the codebase: going through and adding null guards where they should be, and adding documentation to everything. Previously the public-facing documentation was about 95% done, but a lot of the internal documentation wasn't, and that's a barrier for people contributing changes, or even, to an extent, for me maintaining it.

One of these enhancements was exploring the possibility of using SIMD instructions for the string comparisons, something I have experience with, though more so in the past than now, when I was focused on Ada. That approach definitely would have worked; part of the delay I'm having had to do with how, even in the C# world, it worked sometimes but not in others.
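For concreteness, this is the kind of null guard I mean by that audit item: validating arguments at the public surface so failures show up as clear `ArgumentNullException`s rather than `NullReferenceException`s deep inside the library. The method here is hypothetical, not actual Stringier API:

```csharp
using System;

public static class PatternExtensions
{
    // Hypothetical extension method used only to illustrate the guard pattern.
    public static bool StartsWithPattern(this string source, string pattern)
    {
        if (source is null) throw new ArgumentNullException(nameof(source));
        if (pattern is null) throw new ArgumentNullException(nameof(pattern));
        return source.AsSpan().StartsWith(pattern.AsSpan(), StringComparison.Ordinal);
    }
}
```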
So what do I mean by that? It's possible to compare multiple characters of a string at the same time using SIMD instructions: a .NET string is a sequence of UTF-16 code units, not some variable-width byte encoding, so you can load those code units into vectors and compare them with SIMD instructions. On my machine you can compare 16 characters at a single time, and obviously that's really fast. I did the initial prototype, and it definitely checked out in the prototype. Then I ran into issues with the full implementation, because no matter how good you are at creating reasonable prototypes, there are always real-world differences, and it'll be very clear what that real-world difference is in just a moment.

I benchmarked extensively. Some of those benchmarks were done on .NET Framework, including with NGen, that is, .NET Framework with native image generation, kind of the predecessor to .NET Core AOT; it's not great, but it worked. I also benchmarked on .NET Core, .NET Core AOT, and Mono, and the results I was seeing were hard to make sense of at first. On .NET Core and .NET Framework I was getting far worse performance than the native String.Equals, while on the NGen and AOT images, as well as on Mono, my approach was considerably faster. That's bizarre, because you don't expect to see drastically different results like that. You might expect to see small gains, or maybe a small decrease on one runtime but nominal gains on another.
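To make the idea concrete, here is a minimal sketch of vectorized string comparison using System.Numerics; this is my own illustration of the technique, not the project's actual implementation. On a 256-bit (AVX2) machine, `Vector<ushort>.Count` is 16, which is where "16 characters at a time" comes from:

```csharp
using System;
using System.Numerics;
using System.Runtime.InteropServices;

public static class SimdCompare
{
    public static bool VectorEquals(string left, string right)
    {
        if (left.Length != right.Length) return false;
        // A .NET string is a sequence of UTF-16 code units, so it can be
        // reinterpreted as ushorts and compared one vector at a time.
        ReadOnlySpan<ushort> a = MemoryMarshal.Cast<char, ushort>(left.AsSpan());
        ReadOnlySpan<ushort> b = MemoryMarshal.Cast<char, ushort>(right.AsSpan());
        int lanes = Vector<ushort>.Count; // 16 on a 256-bit machine
        int i = 0;
        for (; i <= a.Length - lanes; i += lanes)
        {
            if (new Vector<ushort>(a.Slice(i)) != new Vector<ushort>(b.Slice(i)))
                return false;
        }
        for (; i < a.Length; i++)   // scalar tail for the leftover code units
            if (a[i] != b[i]) return false;
        return true;
    }
}
```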
You don't expect to see immensely better and immensely worse with no middle ground; those are not normal results. What I eventually traced it down to is that some of the .NET runtimes take advantage of string interning, which is a difficult optimization to implement well, so you don't see it very often; a number of things have to be in place for it to work. And to work well, it really has to take advantage of garbage collection, and not the deallocation side of garbage collection. I find that people who work primarily with native languages and have little experience with garbage-collected languages often don't understand exactly what's going on: the garbage collector doesn't just do the deallocations for you, it also does memory defragmentation and relocation, to help make sure that things fit well into cache sizes, don't spread across page boundaries, and other things that really help a lot. That being said, I do have a bit of a soft spot for native code, and I prefer dealing with the deallocations myself, but garbage collection actually has a lot of benefits, and the ability to reasonably implement string interning is a big one of them.

That's why, when it comes to text processing, even back when I was working on the Ada tooling, I'd always recognized that for some reason C# was really fast at string comparisons and really good at working with text; that turns out to be the reason why. It uses string interning, and interning is really fast. This is also why my approach won on the NGen and AOT images: those are native code, and while how they deal with memory management is interesting, the garbage collector sort of goes away, so string interning doesn't work on those. As for Mono, it doesn't seem like they ever implemented string interning for this, so that explains those results. Put together,
all the results just make sense. So with that in mind, I reverted back to doing just the normal character and string comparisons you would expect, and went back to finishing up the audit like I'd said: making sure everything is well documented, that there are null guards everywhere there should be, analyzing the codebase overall to remove things that don't need to be there, and making sure that everything is well tested.

Something I found out was worse than I had realized: the core library, both the extensions and the patterns libraries, is completely CLS compliant, so it will work on any .NET-compatible language, end of story. However, a big thing that I wanted to do is make sure there is good native-feeling integration. Even though, through CLS compliance, the F# stuff just works, I want it to feel like native F# code as much as possible. Some concessions have to be made; some things just can't be mapped into that kind of functional-feeling environment, or could be, but at a great performance loss that I don't want to impose. But as much as possible, I want to really get that functional feeling. As I was going through this, I realized it's really not up to my standards, the level that I hold myself accountable to. It's still a major step up over the various Parsec-style options, just because they're so tied to one very specific behavior that they don't feel great for certain types of languages, but it's still not up to the standards I hold myself to.

So one of the things I had been doing was rewriting all the tests to be strictly in F#. The reason for that is ultimately code coverage: because the F# layer is just calling the CLS-compliant code, it's guaranteed to go through that code path, so it's guaranteed to be testing it, and it makes sure that as much functional-style surface is implemented as
absolutely possible, while also minimizing the amount of work I have to do to maximize code coverage, which just makes maintenance and overall testing easier. I've also been benchmarking a lot of things that hadn't been benchmarked before. Nothing from the extensions library had ever been benchmarked, because at the time the attitude was just "implement something real quick; as long as it works, it's fine." As it turns out, not all of those were working as well as I would have liked. As I tested these, I added far more tests, and some bugs got fixed along the way.

Then we get to the target and jump nodes, since I'm changing things as part of this full audit. Let me explain these. A pattern superficially feels like parser combinators, because it's still built upon combinator theory, but it's combining patterns, not parsers; a very subtle difference. As far as the superficialities go, the underlying implementation is drastically different from anything that regex engines or Parsec-style libraries do. The approach, at least the way I took it, is based on a decision tree. This is not necessarily a binary tree; in fact that's largely violated by modifiers, and some of the nodes I implemented later actually have more than two branches, so it's definitely not a binary tree, but it is a decision tree. Each node represents a decision; realistically, it might be better to view each node as an individual part of the pattern. When you call a parser on the head pattern, it goes through and calls that same parser along whatever path it takes through the decision tree. In the original design, the pattern and the node were conceptually merged into a single type: they were the same. What I realized is that if you separate those, so that the
pattern is the only exposed type, the only public type, and the nodes are all just internal to the system, then you can share the tree, with each pattern being an entry point into it. It's a little optimization, reusing things when possible, kind of the same approach as interning, actually, where you share things when you can.

Once that separation happens, what you can then do is take a leaf somewhere in the tree and loop it back up to a higher point. You're now creating a cyclic graph, which is risky for numerous reasons; cyclic graphs are not the easiest data structures to work with, but with a large number of restrictions you can easily work with them, and that's the only way the tree concept is violated here. You always have exactly one entry point into the graph, and the cycles are very well defined, so it's still conceptually just a tree where one specific kind of node can cause a cycle. For the overwhelming majority of patterns people create, you just get a tree. But these cycles, the jump and target nodes, allow recursion to exist: if something references itself directly through a cycle, that's your standard recursion; if one pattern jumps to another that ultimately goes back to the original, then you have mutual recursion.

While these seem a bit esoteric, the advantage is that you can implement left-recursive grammar rules, and I have proven that works. I'm a little tied up in exactly how best to implement it, but I've proven it works using this approach, and that's kind of a big deal, because left recursion is actually rather difficult to implement well in parsers, and I did it. I understand not everybody knows a lot about the linguistic side of things, so: what actually is left recursion? A great example is that lists can be very easily defined using left
recursive rules, where a list is defined as the list followed by an item; because the rule recurses, you just get the repeating item. That alone isn't super useful and can be easily dealt with, but the mutual-recursion side of it is actually incredibly important. The best real-world example I can find is expressions in grammars; the most common you'll see is arithmetic expressions, but there are others. Obviously an expression can contain an expression, and there you have that mutual cycle, so the ability to deal with that is really important for handling complicated, real language grammars.

Implementing that possibly introduces some minor breaking changes, nothing major, but enough to warrant an actual major version bump instead of just a minor one: Stringier version 2.0. And because of that breaking change, it makes sense to slightly delay the tutorial video, just so that I'm not releasing a tutorial and then having minor things in it not work a week or two later; that's kind of a shot to product confidence. So yep, that's the delay.

In the meantime I'm going to be doing some videos on advanced C# and F# topics. Today I'll be recording a video on how to nicely integrate F# and C# together, since that's something I see a lot of projects not really do; they kind of just say "hey, we implemented what we think is CLS-compliant stuff, hopefully you can call it all from F#," but it's nice to offer actual proper F# support. So look forward to that. I'm definitely recording it today, literally as soon as I upload this video; I don't know if I'll be able to finish editing it today, but until then, have a good one, guys.
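One footnote on the left-recursion example from earlier, as a textbook illustration rather than Stringier's actual jump/target mechanism: with a rule like `expr := expr '+' digit | digit`, a naive recursive-descent parser calls itself before consuming any input and recurses forever. The classic workaround is to rewrite the recursion as iteration, `expr := digit ('+' digit)*`, which is exactly the kind of grammar rewriting the jump/target approach aims to make unnecessary:

```csharp
using System;

public static class LeftRecursionDemo
{
    // Grammar:  expr := expr '+' digit | digit   (left-recursive)
    // Rewritten for recursive descent as:  expr := digit ('+' digit)*
    public static int ParseSum(string input)
    {
        int pos = 0;
        int value = Digit(input, ref pos);
        while (pos < input.Length && input[pos] == '+')  // the unrolled left recursion
        {
            pos++;
            value += Digit(input, ref pos);
        }
        return value;
    }

    private static int Digit(string input, ref int pos)
    {
        if (pos >= input.Length || !char.IsDigit(input[pos]))
            throw new FormatException($"Expected digit at position {pos}.");
        return input[pos++] - '0';
    }
}
```

Note the loop accumulates left-to-right, which is the associativity the left-recursive rule expresses; a right-recursive rewrite would change it.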