I'm going to go ahead and get started. Just to let everybody know: first time speaking at RubyConf, first time speaking in general. So thank you. Thank you. All right. So this is Rubyik's Cube. My name is Stafford Brunk. You can find me around the internet under the handle wingrunr21. I work as a full-stack software engineer at Guild Education in Denver, Colorado. We're a startup that's striving to help working adults go back to school in an affordable manner. What we don't do, however, is pretty much anything to do with Rubik's Cubes.
So why? Why did I do this project? Probably about a year ago, my wife and I were having dinner with our neighbors, and their middle-school-age daughter came over, and she was super amped about a Rubik's Cube-solving robot that she'd seen at a museum exhibit. She'd been working on learning how to solve cubes herself for the past few months, and she was just starting to take programming classes in school. So she was very interested in how this thing actually worked. So I started thinking about the problem, and I was like, well, maybe I could figure out how to build a Rubik's Cube robot and sit down with her and explain the concepts and go over the differences between how a human approaches the problem of solving Rubik's Cubes and how a computer would do that.
There was one problem, though. I was not an expert in these areas. I wasn't even an amateur. I had never even attempted to solve a Rubik's Cube before. I had some robotics background and some design background, but I had not done anything along these lines before. So I was a complete junior cuber. Cubers is what people in the Rubik's Cube community who solve cubes call themselves. So at this point, you all must be thinking, can this guy even solve a Rubik's Cube? No, no, I can't. But Ruby can.
I'm still learning the ins and outs of solving a Rubik's Cube by hand. This project has given me a much better idea of the fundamentals of the Rubik's Cube problem set, but it has not given me some sort of magical boost and the ability to do it myself. If you'd like to follow along, all the code and associated links are posted in the Rubyik's Cube repository under my GitHub account. That Google URL just takes you to the same place. Yeah, so let's get going.
So this is the general roadmap. The first thing we're gonna do is go over some of the basics, like terminology and movement notation. If you're a seasoned cuber, my apologies, this is gonna be pretty basic for you. The next step is to look at some of the strategies for solving a cube, and then finally we'll talk about what my plan is for the next steps on this project.
So first up is terminology. There are pretty much three basic terms that you need to be familiar with here: a face, a facelet, and a cubelet. First up is a face. Faces are pretty much the individual sides of the cube. Faces are denoted in reference to whichever face is closest to you, independent of color. In this instance, the yellow side of the cube is the one that's closest to you, so it is the front face. From there you can see the remaining faces on the cube: red is the right face, orange is the left face, blue is the top or upper face, green is the bottom or down face, and white is the back face.
Faces are broken up into nine individual facelets. These are basically the individual colored stickers on the cube. Facelets are numbered from one to nine, starting in the upper left-hand corner and wrapping around as you go down the face. A cubelet represents the individual colored pieces of the Rubik's Cube, so these are the cubes that make up the cube. There are 26 total cubelets in a three-by-three Rubik's Cube. They are broken up into three categories: corners, edges, and centers.
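As a rough check on the cubelet counts above, here is a minimal Ruby sketch. It assumes a coordinate model that is not from the talk: each cubelet sits at a position whose x, y, and z values are -1, 0, or 1, and the hidden core at the origin is not counted. The names `CUBELETS` and `category` are made up for illustration.

```ruby
# Coordinates run from -1 to 1 on each axis; [0, 0, 0] is the hidden core,
# which is not a cubelet.
COORDS = [-1, 0, 1].freeze

CUBELETS = COORDS.product(COORDS, COORDS).reject { |pos| pos == [0, 0, 0] }

# A cubelet's category follows from how many faces it touches, which in this
# model equals the number of non-zero coordinates.
CATEGORY_BY_FACES = { 3 => :corner, 2 => :edge, 1 => :center }.freeze

def category(position)
  CATEGORY_BY_FACES.fetch(position.count { |c| c != 0 })
end

COUNTS = CUBELETS.map { |pos| category(pos) }.tally

puts CUBELETS.size # 26
p COUNTS           # 8 corners, 12 edges, 6 centers
```

The counts fall out of the geometry rather than being hard-coded, which is why the three categories always add up to 26.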
You can see the corner cubelets highlighted here in pink. Corner cubelets are exactly that: the cubes in the eight corners of the Rubik's Cube. As each corner cubelet is the intersection of three faces, it has three possible orientations. Let's take a look at the URF corner cubelet as an example. Take a look at where the yellow, red, and blue sides meet. If you remember from our face terminology, this is the intersection of the U, R, and F faces. Thus, this is the URF corner. There are three different orientations displayed here. You can see an individual orientation is essentially a simple rotation of the cubelet. Each corner cubelet can be oriented along the same pattern to produce all possible orientations of the cube corners. So if you twist the one corner that we have twisted here and then another corner is twisted, that's another permutation of the cube. Note that while the corner cubelets themselves can be placed into these orientations, you can't necessarily reach all of them through legal cube moves, meaning through normal turning of the cube rather than physically twisting a corner in place yourself.
The next type of cubelet is an edge. Edge cubelets represent where two faces meet. Again, they're highlighted here in pink. There are 12 total edges on a three-by-three Rubik's Cube. Each edge has two possible orientations. Here we can see the two different orientations of the FR edge cubelet. Like corners, a different orientation is simply a rotation of the cubelet's possible values. Also like corners, orienting each of the 12 edges in this way produces all possible edge orientations.
Finally, we have center cubelets. In a three-by-three cube, there are six center cubelets. As the center cubelets only ever touch one face, there's no orientation to worry about. They only have one. One of the most important things to remember about a center cubelet is that its position is essentially fixed. All corner and edge cubelets move around the centers.
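The corner and edge orientations described above can be sketched as rotations of a cubelet's list of faces. This is only an illustration: the `orientations` helper and the choice to name an orientation by reading off its faces are assumptions, not the data model used in the talk's code. The `legal_corner_twists?` check encodes a well-known cube property that the talk alludes to but does not spell out.

```ruby
# All orientations of a cubelet, modelled as rotations of its list of faces.
def orientations(faces)
  (0...faces.size).map { |twist| faces.rotate(twist) }
end

URF_CORNER = %w[U R F].freeze
FR_EDGE = %w[F R].freeze

p orientations(URF_CORNER).map(&:join) # ["URF", "RFU", "FUR"]
p orientations(FR_EDGE).map(&:join)    # ["FR", "RF"]

# Well-known cube property (not stated in the talk): on a cube reached by
# legal turns, the eight corner twists (each 0, 1, or 2) always add up to a
# multiple of three. Twisting a single corner by hand breaks this.
def legal_corner_twists?(twists)
  (twists.sum % 3).zero?
end

p legal_corner_twists?([1, 0, 0, 0, 0, 0, 0, 0]) # false
p legal_corner_twists?([1, 2, 0, 0, 0, 0, 0, 0]) # true
```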
So the centers are the axes of rotation for each individual face of the cube.
Now that we're familiar with the parts of the cube, we need to go over how you denote movements on a cube. For that, cubers have adopted a notation of relative movement developed by math professor David Singmaster. We'll only be talking about the basic movements here. If you're interested in the more advanced cube movements, I recommend checking out ruwix.com's article on this topic. Movements are defined in terms of quarter rotations in relation to one of the cube's faces. Each movement has an optional modifier that denotes the type of movement that will be performed. In this example, we'll use the front, or F, face as the one being moved.
A single letter denotes a clockwise quarter turn. So here you can see that the F face has gone from its original solved state to having all colors rotated by 90 degrees clockwise. There's no modifier required for this type of move. Here we can see an example of a modifier at work. F2 designates two 90-degree turns in the clockwise direction. It is possible within this notation to specify higher numbers, but this is rarely done since they overlap other notations. For example, F4 would be equivalent to no movement: you do four quarter rotations and you're back where you started. F5 is equivalent to F. F prime denotes a quarter turn in the counterclockwise direction. You'll notice that this is the same as if you'd specified F3.
Individual face movements are chained together to create movement strings. Take, for example, the movement string here: F, R prime, U2. We start with a solved cube on the left, then execute a clockwise turn of the F face, a counterclockwise turn of the R face, and two clockwise turns of the U face. The end result is shown on the far right.
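The notation rules above (a bare letter is one clockwise quarter turn, a number repeats it, a prime reverses it) can be sketched as a small parser. This is a minimal sketch, not the parser from the talk's repository; `parse_move` and `parse_moves` are hypothetical names, and a prime is written as an apostrophe.

```ruby
# One face letter, an optional repeat count, and an optional prime mark.
MOVE_PATTERN = /\A([UDLRFB])(\d*)('?)\z/

# Parses one token such as "F", "F2", or "R'" into
# [face, clockwise quarter turns], with the turn count reduced modulo 4.
def parse_move(token)
  match = MOVE_PATTERN.match(token)
  raise ArgumentError, "unrecognised move: #{token.inspect}" unless match

  face, count, prime = match.captures
  turns = count.empty? ? 1 : count.to_i
  turns = -turns unless prime.empty? # a prime turns counterclockwise
  [face, turns % 4]
end

# Splits a movement string on whitespace or commas and parses each token.
def parse_moves(string)
  string.split(/[\s,]+/).reject(&:empty?).map { |token| parse_move(token) }
end

p parse_moves("F R' U2") # [["F", 1], ["R", 3], ["U", 2]]
p parse_move("F4")       # ["F", 0], the same as no movement
p parse_move("F5")       # ["F", 1], the same as F
```

Reducing modulo 4 is what makes F prime and F3 come out identical, matching the equivalence described above.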
And I'm pretty sure that that's the correct cube orientation, but I drew those by hand in Illustrator, so forgive me if I've got some squares out of place.
All right, so let's move on to actually solving the cube. We're gonna go over three strategies here: brute force; what's commonly known as CFOP, or layer by layer; and finally, Kociemba's two-phase algorithm. So why don't we just brute force the solution to this? It's a three-by-three cube, right? It really shouldn't take us that long. Well, there are actually 43 quintillion possible permutations of a Rubik's Cube. More precisely, 43 quintillion, 252 quadrillion, 3 trillion, 274 billion, 489 million, 856 thousand permutations. That is a huge number. Computing how to solve that many permutations would take you years, which is exactly what researchers did.
The bound on the number of moves required to solve a Rubik's Cube is known as God's number. The term comes from the idea that were God to be given a scrambled Rubik's Cube, he would always solve it in the most optimal way. The bounds were first thought to be roughly 17 on the lower end and 80 on the high end, meaning researchers thought it would take a minimum of 17 moves and a maximum of 80 moves when they first began to research this in 1980. It wasn't until 2010, thanks to the efforts of Tomas Rokicki, Herbert Kociemba, Morley Davidson, and John Dethridge, that the upper and lower bounds converged into a single number. This number was reached via 35 CPU years of computing time donated by Google. So that's just how long it took them to compute that problem space. And that number is 20. Any scrambled Rubik's Cube can be solved in a maximum of 20 moves using its most optimal solution. Researchers discovered this by computing all possible solutions to the cube and making sure that none of them required more than 20.
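The 43 quintillion figure quoted above can be reproduced with a few lines of Ruby. The counting argument is the standard one and is not spelled out in the talk: 8 corners can be arranged in 8! ways with 3 orientations each, 12 edges in 12! ways with 2 orientations each, and legal turns can only reach a fraction of those combinations (the last corner's twist, the last edge's flip, and the overall swap parity are each forced by the rest).

```ruby
def factorial(n)
  (1..n).reduce(1, :*)
end

# 8 corners: any arrangement, and 7 of the 8 twists are free (3 choices each).
corner_arrangements = factorial(8) * 3**7

# 12 edges: any arrangement, and 11 of the 12 flips are free (2 choices each).
edge_arrangements = factorial(12) * 2**11

# Only half of the corner/edge arrangement pairs have a matching swap parity.
TOTAL_STATES = corner_arrangements * edge_arrangements / 2

puts TOTAL_STATES # 43252003274489856000
```

Ruby integers are arbitrary precision, so no special big-number library is needed here.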
Note that they didn't actually have to compute all 43 quintillion different solutions in order to reach this conclusion. Due to the nature of how Rubik's Cubes move, some permutations are considered symmetrical to each other. Think of it like this: if I took a Rubik's Cube and held it up to you and then I turned it over, the solution to that cube is exactly the same. It's just that a different face is now the F face. Those are considered symmetrical permutations.
All right, so it took 35 CPU years to compute all the moves. That means we could just have a lookup table, right? It'd just be a giant hash and we can just hit it and go. It'd be super fast. Well, it would take a lot of space. Even if you assumed one byte per solution string, which is completely unreasonable, it would take 43,000 petabytes just to store everything. I'm sure you've got much better things to store than Rubik's Cube movement strings. Like your shiny new collection of AirDropped cat pictures from RubyConf. So the brute force method is a no-go. Random movements would take an infeasible amount of computation time. Pre-computing all possible movements would require a ton of storage, not to mention how long it would take you to actually perform a lookup on a data set that size. We'll have to use other strategies to solve the cube.
But first, let's ask: how do humans solve a Rubik's Cube? The answer is that they break it down into smaller parts. So here's an example of a beginner strategy for solving the Rubik's Cube. It's known as the layer-by-layer strategy. This strategy is part of a set of strategies that fall under the CFOP genre. CFOP stands for cross, first two layers, orient, position. So basically you're starting from the top of the Rubik's Cube, solving that layer, solving the middle layer, then solving the bottom layer. This genre is popular both for beginners and for advanced cube-solving techniques.
There are some speedcubing solutions that use this and some basic computer algorithms that use this. Note that I'm going to go through the strategy as you would solve it as a human, but I'm not gonna go into the strategy in depth. If you'd like more information on this, again go to ruwix.com; they have a great article on the beginner strategy for solving the cube.
So first step, make a white cross at the top. You can see that the edges and the centers are also color aligned here. Next, put the corners into position. This marks the complete solving of the first layer. You can see that the orange and the green are solved, and the centers are still in position. Next is to solve the second layer. Here I've actually flipped the cube over. This is in preparation for the next step, but the first layer would be on the bottom and the second layer is in the middle. Notice how the yellow cubelet is in the center on the bottom. Next we're gonna solve for a yellow cross. We'll put the edges into place. You'll notice that this closely mirrors what we did in the first couple of steps. Now we're gonna put the corners into position. This is called orienting the corners. They're roughly where they need to be, but they're not actually solved. And finally, we solve for the yellow corners. So we've got a solved cube.
This seems pretty good, right? We've broken it down into smaller pieces. It's a pretty straightforward way of doing things. But remember that each step is a different permutation of the cube. Were I a computer solving this, I would have to compute what moves are required to go from point A to point B in the most optimal way, and I'd have to do that for all seven steps. To illustrate this, I have a small demo. So this is the Rubik's Cube Web 5000. We're gonna generate a random cube, and it's gonna come back really, really fast. This is a very naive implementation of this algorithm. You can see that it came back with 79 moves.
It's not too bad. Some of these are whole-cube rotations, so it's a little bit inflated. This is just to solve the first layer, though. So we'll go ahead and let that run. You'll notice with the whites that it's actually doing it in a slightly different order than I talked through. It's gonna be solving the white corners first, and then the white edges. It's just a slightly different order for performing the same algorithm. Oh, that did not work. I have the wrong thing going here. Oh, that's what happened. Oops. Try that again. Hopefully that's better. All right, well, sorry. So we'll go back to having it compute the entire cube solution. Part of this is that the library I'm using to visualize the cube is a project called Roofpig. It's a pretty cool visualization, but it's a little finicky about how the solution string comes back. It'll actually change the sticker colors around. So this should actually be the full cube solution. It doesn't spin. All right, I'm gonna punt on this one and come back to it. Let me switch over to Kociemba's algorithm. Okay, all right. Sorry about that.
So Kociemba's algorithm was created by Herbert Kociemba, a German mathematician who, you may remember, was a member of the team that helped discover that God's number was 20. His algorithm is made up of two phases. The first phase solves the cube into a known state. This allows the second phase to have a considerably smaller subset of moves required to do the final solving. Just as an example, we'll use the first-two-layers solution to solve the cube into a given state. So in this instance, the first phase finds the solution to the first two layers. The second phase solves the rest of the cube. So in our problem set, we're looking for the most optimal solution for solving the first two layers. And from there, we're looking for the most optimal solution to solve the remaining layer.
Contrast that with the layer-by-layer solution that we were just looking at, which theoretically worked. You can see that if God's number is 20, then phase one plus phase two should equal a maximum of 20. But even with the few moves we were getting back, we were well above 20.
All right. So in order to solve this, we should probably use a tree. More specifically, a game tree. What's a game tree, you might ask? If there's anybody who's new to programming, you may not be familiar with trees at all. So let's go over a small example. Okay, good. This is a game tree for tic-tac-toe. As you can see, the initial state of the tree is the initial state of the game, an empty board. That's the root of the tree. Each additional depth of the tree, or each level down, has leaves. Those are connected to a node above them, and they represent a new state in the game. Each edge, or line, that connects a leaf corresponds to a move in the game. And each level of the tree is considered an additional depth. The idea is to search the tree until you find a winning state.
There are two general ways of performing this search: breadth-first search and depth-first search. Breadth-first search searches across the tree. So it'll search at depth zero, then it'll search at depth one going from left to right, then depth two going from left to right, et cetera. As the tree is iterated, each leaf is checked for the finished solution. If you find that solution, you're done. You don't have to go any further. The other option is depth-first search. Depth-first search means you go all the way down the tree on the left-hand side as far as you can go. Then you start coming back up, go back down, and come back up. So you can see that you're going up and down and then moving left to right. The problem with the game tree is that usually all possible permutations of a game are impossible or highly difficult to build into a tree.
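The two search orders described above can be sketched on a tiny hand-made tree. This is an illustration only: the tree is a plain hash invented for the example, not a tic-tac-toe or cube state tree, and the method names are hypothetical.

```ruby
# A tiny hand-made tree: each key is a state, each value lists its children.
TREE = {
  root: %i[a b],
  a: %i[c d],
  b: %i[e f],
  c: [], d: [], e: [], f: []
}.freeze

# Breadth-first: visit a whole level, left to right, before going deeper.
def breadth_first_order(tree, start)
  order = []
  queue = [start]
  until queue.empty?
    node = queue.shift
    order << node
    queue.concat(tree.fetch(node))
  end
  order
end

# Depth-first: follow the left-most branch to the bottom, then back up.
def depth_first_order(tree, node, order = [])
  order << node
  tree.fetch(node).each { |child| depth_first_order(tree, child, order) }
  order
end

# How many states are checked before the goal state is reached.
def visits_until(order, goal)
  order.index(goal) + 1
end

p breadth_first_order(TREE, :root) # [:root, :a, :b, :c, :d, :e, :f]
p depth_first_order(TREE, :root)   # [:root, :a, :c, :d, :b, :e, :f]
p visits_until(breadth_first_order(TREE, :root), :b) # 3
p visits_until(depth_first_order(TREE, :root), :b)   # 5
```

For a goal that sits near the root, such as `:b` here, the breadth-first order reaches it sooner, because the depth-first order first explores everything beneath `:a`.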
Tic-tac-toe you can probably do pretty easily, but you can't put 43 quintillion different combinations into a tree in memory. So we need some sort of heuristic to make our search more efficient. One way to do that is to limit the depth of the search. We know that God's number specifies that all cubes can be solved in a maximum of 20 moves. We can then set our search depth to 20 and limit how deep we have to go into the tree.
Another problem with the game tree as we have it set up is that there's a trade-off that depends on how close to solved the Rubik's Cube is. For example, if all you did was perform one turn on the cube, you'd expect that solution to be at depth one in the tree. However, if you do a depth-first search, it's possible you'll be waiting quite a long time in comparison to doing a breadth-first search. It'd be great if we could minimize that trade-off. If we did a breadth-first search, we would hit the solution move on the fifth leaf that's iterated here. But if we're doing a depth-first search, we have to go all the way down and up and down as we go across, and you just have to wait longer. Extrapolate that out into the Rubik's Cube problem set and it's a big difference in time.
We can use a technique called iterative deepening to try and hybridize depth-first and breadth-first search. We'll continue to use the first two layers as an example. So let's say that it requires 15 moves to solve the phase-one portion of the algorithm and then it takes 11 moves to solve the last layer. That is 26 moves in total. Iterative deepening essentially allows us to use less optimal solutions for phase one and check to see if we can generate a shorter phase two. So let's take a look at this. You can see the first row of the table is our optimal 15 moves for phase one and then the 11 moves for phase two. That's 26 total moves.
Then we check: well, maybe 16 moves in phase one generates a shorter phase two, maybe 17, 18, 19, et cetera, until we hit 26 total moves in phase one and zero in phase two. However, hopefully somewhere along the way we find a shorter total. So in this instance, for 17 phase-one moves we get six phase-two moves, for a total of 23. That is an overall more optimal solution. When that happens, we restart our search with the new total of 23. Eventually we're gonna get to the point where we can't find a more optimal solution. Maybe our algorithm times out, maybe we run out of depth to search in the tree, whatever. At that point, phase one will be the full move total, so the bottom row in the table, and phase two will be zero. In practice, when you start getting close to this bound, the last few moves that phase one encompasses are normally the first couple of moves that you would find in phase two.
All right, so how does this all relate to Kociemba's two-phase algorithm? Instead of using F2L, the first two layers, as the state that phase one solves to, Kociemba's algorithm solves the cube into a known state, G1. The properties of G1 are that all corners and edges are oriented and all middle-layer edges are already in the middle layer. All right, so what does that mean in English? Like I said, corners and edges are oriented, so they are roughly in position for where they need to be in the solved state. Middle-layer edges are already in the middle layer, meaning that if they're going to end up in the middle layer, the second layer down from the top, they need to already be there, but not necessarily on the side that they're going to end up on. The idea behind solving to this state is that from here, only a small subset of moves is required to finish solving the cube. So from state G1, you can see that only the moves U, D, R2, F2, L2, and B2 are required to move the cube into the solved state.
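The walk through the phase-one and phase-two table described above can be sketched as a small loop. This is a simplified illustration of the length trade-off only, not Kociemba's algorithm or the talk's implementation: the phase-two lengths are a made-up lookup table (only the 15/11 and 17/6 rows come from the talk), where a real solver would search the tree to find them, and `best_split` is a hypothetical name.

```ruby
# Hypothetical data: the shortest phase-two length found for each phase-one
# length. The 15 => 11 and 17 => 6 rows mirror the example in the talk; the
# other rows are invented for illustration.
PHASE_TWO_LENGTHS = { 15 => 11, 16 => 11, 17 => 6, 18 => 6, 19 => 5, 20 => 4 }.freeze

# Lengthens phase one step by step and keeps the shortest combined total.
def best_split(phase_two_lengths, optimal_phase_one)
  best = nil
  phase_one = optimal_phase_one

  # A longer phase one is only worth trying while it could still beat the
  # best total found so far.
  while best.nil? || phase_one < best[:total]
    phase_two = phase_two_lengths[phase_one]
    break if phase_two.nil? # no result at this length: stop, like a timeout

    total = phase_one + phase_two
    if best.nil? || total < best[:total]
      best = { phase_one: phase_one, phase_two: phase_two, total: total }
    end
    phase_one += 1
  end

  best
end

p best_split(PHASE_TWO_LENGTHS, 15)
# phase one 17, phase two 6, total 23
```

Starting from the 26-move split, the loop finds the 23-move split at 17 phase-one moves and then uses 23 as the new bound, which is the restart the talk describes.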
This cube is not a cube in state G1. This is just a representation of different edges and corners that are oriented.
All right, hopefully this time we're a little more lucky. We'll be using Kociemba's algorithm this time. I'm setting it to search to a depth of 21, one more than God's number. The reason for this is that my implementation of Kociemba's algorithm right now needs some optimization, and one additional level of depth dramatically helps the amount of time it takes to search. So, okay, it came back and we got a 20-move solution, done. You can see that took 14 seconds to compute. Sometimes it's a lot faster, like right there, a 21-move solution. The longest I've seen thus far for my implementation is 45 seconds. This one might take a while, let's see.
All right, so these are the next steps for the project. I based the initial implementation on reference Python implementations by a couple of different individuals. One is by the original author of the algorithm, who ported his Python implementation from his Java implementation. The second one is from someone named Maxim Tsoy, who's kind of cool; he's built a bunch of different cube-solving robots that are online. He's got some novel ideas on different robotic mechanics and on the most efficient ways of grabbing and letting go of the cube. I ported the code fairly directly. I converted to idiomatic Ruby in areas where I thought it would be fairly low risk, but in general it was kind of Python-esque. It kind of looked a bit like that.
So my next task is to try and optimize the code for the Ruby interpreter. One thing that this algorithm makes prolific use of is while loops. I've got a lot of while true loops in there. RuboCop loves to tell me that I need to be using loop instead. However, from my low-level testing, loop is actually slower than using while true. If you remember, loop requires a block, which creates a new variable context every single time, while while is just looping right there.
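The `while true` versus `loop` comparison above can be measured with Ruby's standard `Benchmark` module. This is a minimal sketch, not the benchmark from the talk: the counting workload and the method names are made up, and the timings will vary by Ruby version and machine, so no expected numbers are shown.

```ruby
require "benchmark"

ITERATIONS = 1_000_000

# Counts to the limit with a bare while loop; no block is involved.
def count_with_while(limit)
  i = 0
  while true
    break if i >= limit

    i += 1
  end
  i
end

# Does the same work with Kernel#loop, which calls a block on each pass.
def count_with_loop(limit)
  i = 0
  loop do
    break if i >= limit

    i += 1
  end
  i
end

Benchmark.bm(12) do |bm|
  bm.report("while true") { count_with_while(ITERATIONS) }
  bm.report("loop do") { count_with_loop(ITERATIONS) }
end
```

Both methods do identical work, so any difference in the reported times comes from the looping construct itself.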
So it's very possible that as I optimize this algorithm it will not end up being idiomatic Ruby. I intend to write some kind of benchmark suite to see what kind of penalty idiomatic Ruby actually has in this algorithm. Another thing I'd like to put in place is symmetry detection. If you remember, I talked about that before. Basically, it means taking a given cube state, detecting whether it's a rotated version of a state I've already seen, and short-circuiting some of my tree searching. The algorithm has some of the baseline features in place for that, but they need to be further fleshed out. There's also a newer implementation of Kociemba's algorithm called min2phase. You can see the GitHub link there. It will actually solve the cube into more initial states than just G1. It has a couple of initial states and a couple of different ways it prunes the decision tree. I haven't investigated that very deeply. I actually found out about it only a couple of days ago, but I need to see what kind of optimizations I can pull out of that.
So one last thing. I definitely mentioned a robot at the beginning of this talk, and I've actually done a lot of initial work on the robot. The bot's functionality can be boiled down to three steps: read the cube state, compute the solution, and then solve the cube. So in this instance, use OpenCV to read the cube state, use Kociemba's algorithm or some modified version of it to compute the solution, and then translate that move set into motor movements. You can see a screenshot here of some of the work I've done to read the facelets with OpenCV. The facelets themselves are being highlighted in their detected colors. There's still some work to do around dynamically compensating for different lighting conditions and doing some refactoring, but it's working pretty well.
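One simple way to turn a sampled facelet color into a sticker color is a nearest-reference-color lookup, sketched below in plain Ruby. This is an assumption about how such a step could work, not the detection code from the talk: it does not use OpenCV, the reference RGB values are illustrative guesses, and fixed references like these are exactly what would need adjusting for different lighting conditions.

```ruby
# Illustrative reference colors as [red, green, blue]. These values are
# assumptions, not measurements from a real cube or camera.
REFERENCE_COLORS = {
  white: [255, 255, 255],
  yellow: [255, 213, 0],
  red: [183, 18, 52],
  orange: [255, 88, 0],
  blue: [0, 70, 173],
  green: [0, 155, 72]
}.freeze

# Squared Euclidean distance between two RGB triples.
def squared_distance(first, second)
  first.zip(second).sum { |a, b| (a - b)**2 }
end

# Returns the name of the reference color closest to the sampled color.
def classify(rgb)
  REFERENCE_COLORS.min_by { |_name, reference| squared_distance(rgb, reference) }.first
end

p classify([250, 90, 10])  # :orange
p classify([10, 60, 160])  # :blue
```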
The only thing that I have not gotten working particularly well is live video: the OpenCV bindings for Ruby do not handle live video streams particularly well. I think I need to drop down to something that's C-backed to handle the data structures, because translating them into and out of Ruby objects is just too slow to do the video processing in real time, at least the way I have it implemented. The next step is to use Kociemba's algorithm, and we've already covered that.
And then the last piece is the robot, and translating the movement strings into some kind of motor movements. I've been designing the mechanics of the robot already. In the GitHub repository there are some STL files. Those are the files that you 3D print for things like the cube gripper and how you mount the gears. Eventually I'm hoping to have a full bill of materials up there, along with instructions for people to build their own bot, how to run it, et cetera.
I've also started implementing motor functionality. I checked existing projects like the Artoo gem, and they don't really implement stepper motors. I need fine-grained motor control to make sure I'm turning exactly 90 degrees. So I'm going to end up implementing my own lighter-weight solution. There's already a library up under my GitHub account called LibMPSSE. This is a USB interface to what's called the I2C bus. That's a way for embedded devices to talk to each other. Think of it like a mini network, basically. It's a C library that's wrapped using FFI, and I've built out the constructs to be compatible with the existing I2C gem. So eventually I hope to integrate that into a motor controller, or you can just drop it in place for existing projects. Instead of using something like a Raspberry Pi, you can just hook up something that Adafruit sells to a USB port on your computer and get the exact same functionality. All right, thanks. That's all I've got.