Okay, so now we're recording, welcome everyone. So today's agenda: we've got some hardware updates we want to talk about, we've got our usual bug reports to go through, and does anybody have any other agenda items? Nope? Okay. So, well, let me just give you an update on the hardware for now. We made a couple of extra revisions because of the sketchy availability of one of the power supply parts, but Kevin got that knocked out this weekend. So we're ready to release the first iteration to the community to take a look at. I'm not sure exactly how to announce that, but I guess you're probably the right one to take charge of it. There's a repo, currently marked as private, for the Mark II hardware. We just need to flip that over to being a public repo so people can see it. And I think that's how we want to share things; we can get specific feedback on the KiCad files and the design documents and stuff like that. So I did have a question about the design documentation. Right now it's all in a Google Doc, and there's really just one document so far, because we're only documenting the daughter board that we're building, not the whole system. Would it be useful to turn that into markdown in GitHub, or is it fine to just leave a link to the Google Doc? What do you think will be most useful to people and the community? So I think that not everybody has a GitHub account necessarily, and it's much easier to share a Google Doc with people than to ask them to get a GitHub account if they want to see the repo. Well, you don't need a GitHub account to see the repo, right? Well, because it's public, you don't. Yeah, it'll just show up on the web. I see the benefit of doing the markdown or whatever it is, just because you can see it all right there in the one place. Yeah, and then it's also revision controlled in sync with the files as well, which is nice, I think. Yeah.
But there's some redundancy: do all the work over here and you've got a copy there. Well, I mean, if I move it over to the GitHub repo, I'll just delete the Google Doc. I'm not going to keep two copies; that's ridiculous. Yeah, makes sense to me. So, okay. The other thing on the hardware is part numbers. As I was going through and writing some of this documentation, it became cumbersome to specify whether I was talking about the Mark II or the daughter board, various configurations or whatnot. So I decided to come up with a part numbering scheme. Now, I went ahead and did this without asking any of you whether we already had a scheme for numbering things, but I didn't think we did. I have a little bit of a scheme, just in that I try to separate part numbers by purchased parts, fabricated parts, PCB assembly parts, and the bare PCBs themselves. I have those four main categories. So a fabricated part would be something we build ourselves, like a plastic or cast part or something. A PCBA part is the daughter board; it'll actually have, well, three, we'll have three PCBA parts, because we've got this little USB jumper thingy. And the purchased parts are things like the speakers and screws and stuff. Now, outside of trying to keep those categories separate, I don't really care; that's the only thing I'd like to do, keep those all separate. Okay, well, why don't we have a discussion about that? Because those are different concerns than the ones I had. We don't need to have that discussion now, so let's take it offline. My concerns were primarily about version tracking: tracking revisions that we send outside the company when they're not going to be using our version control software, right? And also making sure that we can correctly identify when things were made.
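For reference, the four-category idea just described could be sketched as follows. The prefixes (PUR, FAB, PCA, PCB), the sequence-number width, and the revision letter are all hypothetical; the meeting didn't settle on actual codes, so treat this purely as an illustration of keeping the categories separate and a revision visible in the number itself.

```python
# Hypothetical part-numbering sketch: four separate categories plus a
# revision letter, so externally shared docs can identify parts and revisions
# without access to internal version control.

CATEGORIES = {
    "PUR": "purchased part (speakers, screws, ...)",
    "FAB": "fabricated part (plastic or cast parts we make ourselves)",
    "PCA": "PCB assembly (daughter board, USB jumper board, ...)",
    "PCB": "bare printed circuit board",
}

def make_part_number(category: str, seq: int, rev: str = "A") -> str:
    """Build a part number like 'PCA-0001-A'."""
    if category not in CATEGORIES:
        raise ValueError(f"unknown category: {category}")
    return f"{category}-{seq:04d}-{rev}"

def category_of(part_number: str) -> str:
    """Recover which of the four categories a part number belongs to."""
    prefix = part_number.split("-", 1)[0]
    if prefix not in CATEGORIES:
        raise ValueError(f"unknown prefix: {prefix}")
    return prefix
```

Bumping only the revision letter for a re-run of the same part would also cover the "same part, different print run" concern raised below, though serial numbers are a separate discussion.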
Even if we make the exact same part that we made two months ago, I'd like to know that it was made in a different print run, if you will, so that we can track that, in addition to having serial numbers and all that kind of thing, which is a whole other discussion. So anyway, that's enough on the hardware side. As far as GitHub repos go: I'm personally a little sensitive to how many repos we have right now. Is there a repo that already exists, like a hardware repo or even the Mark II enclosure repo, where this stuff can go, rather than creating a whole new one? Or do we need to create a new repo for this? We already have a new repo, although it could be moved, I suppose. I can link it here in the chat. I was going to have a look at whether it would make sense to merge it with the previous Mark II hardware repos, because obviously they can both exist, so you can see the progression over time. But maybe we just push the old prototype to a different branch, have this as master, and try to keep things cleaner. I haven't had a detailed look at the repo yet, though. Right. Okay, yeah, I'll take a look at that with Derek and Kevin and see if we can keep it a little cleaner. Yeah, I agree that there's a ton of repos out there. It's not just with this; there seems to be some cleanup work that needs to be done there at some point. So, okay, I'll take an action item there. Right, I guess we can jump right into the bug tracking. Progress reports. All right, everyone see my screen? Okay, we'll start with the Project Rollover prototypes. Charlie, it looks like you're still at home. How's it going? It's going all right. Actually, on Saturday, Derek was able to drop off some of the prototypes for, I guess, the audio chambers. So this morning I was working on putting some of the threaded inserts in there.
I got mostly done with the ones that he gave me, but he's still going to bring some more over tomorrow morning for me to do, and hopefully I can get those done tomorrow as well. So just trying to get those threaded inserts done. And then hopefully I can get my test results back soon, and then I can start doing more stuff in the shop, such as the wiring and the soldering. All right. Can any of the tickets that are in progress right now maybe be moved, or are we still in the middle of the 3D prints? Yeah, I don't think anything can move. So we're on "3D print and prepare"; that's what he's working on. Oh, okay. That's what I'm doing, and I'd say I'm 60% done with that. Yeah, for the seven sets of audio chambers, Derek, would that go under the same thing as "3D print and prepare seven sets of housings"? Is it the same? Oh, I kind of separated out the audio chamber because it was a little bit more complicated. But yeah, I guess the seven sets of housings was everything but the audio chamber. Yeah, I don't think we can count the audio chambers until we actually put all the speakers in. I can bring speakers over tomorrow to start putting those in as well. Okay, yeah, I can do that. So nothing's technically moved. The other thing on the prototyping front, which doesn't really concern any of you guys personally, but just as a note: one of the Wicked guys also had a bit of a scare with COVID and is getting tested and self-quarantined. So that's two out of, what, five or six people in that office that have had scares. To make sure we don't get delayed again, I'm going to try to increase our safety protocols over there. I'll work with the Wicked crew on that, but we're going to have to be more vigilant than in the past in terms of mask wearing, social distancing and all that.
I want to make sure that, if that unfortunate event happens and someone were to get it, well, I think in some cases, the way things have been over there, if someone had gotten it, it could easily have spread to the rest of us. I don't want that to happen. Okay, moving on within the Project Rollover prototypes. Yeah, so I'm continuing to work on it and continue the 3D printing, but it's going to be next week until we're really able to finish all this stuff because of the kind of slowdown. Okay. Derek, we already talked about the Mark II prototype a little bit. Any other updates there? Just a couple of things; scroll down to the other issues. Let's see, I just tagged you, Gez, into reviewing the GitHub repo, just to have another set of eyes on it outside of Michael and Kevin and me, someone who isn't as involved with it. So just take a look if you could. And then I am releasing the blocking file, the blocking assembly, which is just all the components arranged how they will be for production, but not finished plastic. We're releasing that in the GitHub repo as well, so people can get a look at how it's all going to come together. It's mostly about the daughter board, but they'll be able to see how it mates up with the Raspberry Pi, where the speakers are going to go, how the display is going to interface with it, and all that stuff. So that'll be in the repo as well, and I think everything else is in the right place. Is that okay? Yeah. Gez, can you take a look? If you want to help with that, it'd be good to get the Google Doc link, pull that down, format it all up in markdown, and bring it across to the other resource, if that would be helpful. Otherwise, it might be better to wait until that's in a more final state; it sounds like there's some other stuff that Michael wants to do on it.
Yeah, so I've only really edited the readme, which is basic stuff, just to have something up there. It was really simple, like a title, so I've added a little bit of info in the readme. Yeah, we might have a look at it. It'd be good to sort of point to what the different files are that are there. Right now we've just got some basic details about the device and then a whole lot of files. Yeah, I think it'll be helpful if we give people a little bit of a guide as to what's there; we don't have to do too much hand-holding, and that could be in the Google Doc. Yeah, looks like Michael just shared it in the chat, so we could put that in there. Yeah, we'll have to do that. Well, sounds good. All right, I think that's it for this. I don't think we talked about how long, Michael; how long do we want to leave this out there for the community to review? We said a week. A week, okay. Yeah, I think once we push out a notification, we should give them a week and we'll see how much traffic we get. If we're getting a lot of activity and it's being really helpful, then we could potentially extend it, but I think that's where we'll start. Should we do a quick blog post about it? Oh, yeah. Okay. I mean, it doesn't have to be much, but, you know, targeted at the people who might care to look at that and review it. Yeah, I just figured it would be a way to point to it: hey, this is it. Can we go ahead and make sure that blog post gets put up on the Kickstarter and Indiegogo pages so that those communities can have a look a little early? Yeah, sounds good. I only left the Selene integration open so we could give a status on it. It's done. The code is in test right now.
I want to leave it there for a couple of days and, you know, go out to the UI and check and make sure everything is doing all right before I promote it. But, yeah, everything has been reviewed and packaged up and is sitting on test now. So, I don't know if you have a device pointing at test or anything, but if you do and have a minute to poke around a little bit on test, that'd be awesome. I'm going to complete this sprint item, so we don't have to talk about it anymore. It will be released this week after a couple of days in test. I'm going to check. All right, cool. Congratulations. Okay, bug fix sprint. So now that I'm done with the Selene stuff, I'm moving on to some of these bugs. The first thing I was looking at was something that kind of resulted from Josh's email from earlier, that is, the Mark II stopping working. So I think we need to get all the Kivy code merged in with what's in dev right now. As I understand it, you did a little work on that, but I want to go ahead and get all these branches cleaned up and merged into the right repos so that we're not off on this one-off thing. Right now, basically, the devices that we're putting out with the Kivy image have a pretty old version of core, and I think a couple of things have already come up that are like, oh yeah, I know we fixed that since then, but it just isn't showing up as fixed because it's such an old version of core. So, Gez, where did you leave that? Yeah, I think it's at the top of the to-do list: MYC-369. Okay. So, yeah, there are some comments in there. I think the issue, from memory, was around when we updated Tornado and changed the GUI bus, because it used to create new clients for a reconnection or something, and then we shifted to just reconnecting to the same Tornado server. Okay. Yeah. I was going to ask if it's worth cutting a release before we do this, so we can do it against 20.2.5.
But if we're doing it against a dead branch, then that's a problem anyway. Yeah, I mean, I can wait to do the merge until the next release if we want to do that. So do we have anything in dev that's worth releasing recently? Yeah. Okay. I'll take a look at this. It's been six weeks since we did a release, and there isn't as much as there usually is, but there are still a few interesting things. We've had 34 commits since the last release, which includes a few bug fixes and stuff. Okay. You want me to take this ticket over then, get past these issues, and get this branch ready for the next one? Okay. Yeah. And I don't have tickets for this, but does that include getting the skills that have Kivy code in them up into the 20.02 branch or not? Are there different tickets for that? I've just been moving those into the Kivy display branches on each of the skill repos. Okay. For a Kivy skill, doesn't it still require some external code, which is in a different repo somewhere? No, it shouldn't. It should just require the stuff that's in core. I don't think there's anything specific you have to put in there; it's all just communicating with that display bus, I believe. Ah, I'm getting ahead of myself; it's too early in the morning. I thought there was something where the screens for the skill were defined somewhere else. Well, yeah, the screens are defined in the Kivy repo right now, instead of in the skill. That's mostly because, and I know we've talked about this before, what we probably need to come up with eventually is a more generic display interface, so that skills can basically, you know, that kind of code can go in the skill. Yeah, so at the moment, if I wanted to add
a new screen, like if I wanted to add a Kivy display to some random skill, I would need to update that skill and also update the Mycroft display repo. Is that right? Yes, that is right. Yeah. That was mostly due to wanting to just get this up and not worry about creating that interface, because it would have taken some time, especially if we wanted it to play nicely with the other stuff. So yeah. Well, I don't think it's worth spending too much time doing that at the moment, while we're still in control of everything. Okay. Yeah, we can leave all this. I mean, do we want to leave the Kivy code in core in a branch, just for our own use? Or is that worthwhile? I think it comes back to: we need to have that conversation and decide which way we're moving, because once we start doing that, then we're moving in a particular direction. So I feel like we need to make a call one way or the other and then decide where we're going. All right. So this MYC-369 was really just you trying to get that Kivy branch updated with what's in core right now, not necessarily merging that into core? Yeah, my plan was just to merge that back into the Kivy display branch, what's it called, the Kivy display branch. That was my plan there. All right. So, just to point out that the longer we do this, the more, as the skills get updated and core gets updated, we're putting ourselves in a place where we have multiple things to update. So just keep that in mind, everybody; hopefully this isn't what we do very long term. All right. So my current plan for this week is to start tackling these Kivy / Pi 4 image-labeled tasks, which will probably result in another image once those are done.
So maybe multiple images, depending on how things go. Wi-Fi is not going to be a small task, but some of those other things might be easier. I have a question about Wi-Fi that goes back to some really basic assumption-type stuff. Are you sure that there's not a Wi-Fi setup package out there somewhere that somebody is maintaining for Raspberry Pi, so that we don't have to build from scratch the entire Wi-Fi SSID and encryption key management system? I mean, it feels like we're reinventing the wheel. Surely somebody somewhere maintains a package that allows you to set up the Wi-Fi on the Pi 4 intelligently. Yes. So part of the issue is that the Wi-Fi setup repo, as it stands, is no longer just a Wi-Fi setup repo. It basically handles lots of system-level things, like when you do a "Hey Mycroft, shut down" or restart, or any of those "would you like me to turn myself off?" interactions. Okay. So, yes, Joshua, I agree with you; I'd be sort of surprised if there wasn't something out there that would do this. But this goes back to another architecture thing. I know that the GUI library has the touchscreen thing where you can enter Wi-Fi setup stuff. And, from what I understand, it's also using some network-level package, so it doesn't require us to do some of the things we were doing with the Mark 1. Does that sound right, Chris? Yeah, I mean, also, because it's got the on-screen keyboard and everything, you can just enter the credentials straight on the device, rather than having to use the hotspot, which you connect to and go back and forth. Yeah. So I guess one question is: what do we want this thing to do?
I haven't even played with the touchscreen yet, but I can't imagine putting a keyboard up and doing the same thing that the GUI is doing is that difficult. At the same time, we have this repo out there that does this for us; I could just try to get that repo working on this Pi 4. That's another possible solution. So there are different ways we can approach it short term, and maybe even a different long-term solution, depending on priorities and all that good stuff. Michael, do you have any opinions on any of that? Not at present. If you want to chat afterwards, I'd be happy to talk about this. All right. I'm still tempted to push that ticket down a bit until it becomes more urgent. I feel like there's other stuff; like, you know, Project Rollover, they can use Wi-Fi and we can do a lot of types of provisioning. They're not going to run around and manually set up Wi-Fi on all the devices or anything like that. Yeah. I mean, they may use Wi-Fi, but they'll be able to SSH onto the device, and we're going to have a more enterprise-style provisioning process. My concerns are primarily about doing it over and over and over again, right? We created a Wi-Fi setup that used the serial bus for the screen for the Mark 1. We created Wi-Fi setup for the Raspberry Pi version. Then we created it for the Mark 2, and now we're creating it again for the Mark 2. And it seems like, with the amount of time and effort we're spending on it, we might as well be a Wi-Fi setup company and sell Wi-Fi setup software to people instead of what we're doing. It really comes down to, you know, the same thing with the Mark 2 sitting on my counter, like, giving up the ghost.
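As a side note on the "use an existing network-level mechanism" approach discussed above: on a stock Raspberry Pi OS, handing credentials to the system can be as simple as writing a standard wpa_supplicant network entry and letting the OS manage association and keys. The sketch below only renders that entry; the field names follow the wpa_supplicant.conf format, but how the file gets installed and reloaded on the device is left out and would be an implementation choice.

```python
# Minimal sketch: render a wpa_supplicant.conf network block for one access
# point, so Wi-Fi credential handling is delegated to the OS instead of a
# custom setup stack. Writing/reloading the file is intentionally omitted.

def wifi_network_block(ssid: str, psk: str = None) -> str:
    """Return a wpa_supplicant.conf 'network' entry for the given AP."""
    lines = ["network={", f'    ssid="{ssid}"']
    if psk is None:
        lines.append("    key_mgmt=NONE")  # open (unencrypted) network
    else:
        lines.append(f'    psk="{psk}"')
    lines.append("}")
    return "\n".join(lines)
```

With an on-screen keyboard, the credentials collected from the touchscreen could feed straight into something like this, avoiding the hotspot round trip entirely.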
It feels like we're stuck in a forever limbo of going back and redoing, or doing over, or redoing for a new architecture, everything. And we never seem to step forward and improve the performance of the stack for the member, the customer, that's using it. And, you know, we're not going to solve that on this call, and I desperately do not want these dev syncs to go over, so I'm happy to sit on it; but the broader fact remains. Yeah, these are all ongoing issues that we're going to have to figure out before we start shipping out the actual Mark 2s that Derek and Kevin are working on, because we can't ship it in the state it's in now. So, depending on what the timeline is for that, that'll certainly dictate what kind of speed we need to work on some of these issues at. Yeah, well, for this sprint, I just want to focus our attention on the fact that what we're trying to do with this sprint in particular is fix bugs. So if there's a bug in the Wi-Fi setup that we can address that's going to make things work better, then I think that's fair game. If we're talking about re-implementing a system in a better way that's going to be better for us going forward, that's great, but it's not part of this sprint. All right. Yeah, we should probably move that Wi-Fi setup thing down for now. All right. Gez, any progress on any of these other tickets? Yeah, I've been doing a bit of work just trying to fix up the tests that are sporadically failing on Voight Kampff. The last one was some of the alarm skill queries getting caught by the timer skill, which I can't replicate on a local device unless you deactivate the alarm skill. The query shouldn't go to the timer skill anyway, so I've sort of fixed that up, and hopefully that improves things.
The temperature one I shifted; it was actually one of the old integration tests that was failing, from the old system. So I shifted that across to Voight Kampff and marked it as an expected failure, because it's not a priority piece, and there's a bunch of other stuff in there marked as failing that I'd certainly jump on before we get to it. So that's all I did there. And the news and singing skills, this MYC-383, is proving to be a difficult one. Åke has also been looking at it, and we've both just been trying to add lots of debugging in there to try to figure out what's going on. Basically, it's just the stop command timing out and not stopping the playback, but only in the Voight Kampff test. I've never been able to replicate it outside of our CI process, which makes it even harder to pin down. So, yeah, I'm potentially just going to leave it, and if we can get the other tests passing more consistently, then at least there will be far fewer failures. But I'm a little stuck there at the moment. So were those last three in review? Are those PRs you need me to look at right now? No, those PRs are just in our old review queue. Oh, that's probably what you said. Yes. Yeah, they're old PRs. Okay, I can look at those after this meeting. Cool. But yeah, once I finish those, I'll jump on the volume-at-max-on-first-boot one. I'm pretty sure Åke at least told me once that that was intentional, mostly because, I think, he was worried about maybe a hearing-impaired person or something like that. Not on the latest stuff, though. Okay. Yeah. David did the work on that, as far as I remember, so I might not be recalling it right. Yeah, it's too loud. It'll blow the speakers. Well, let's see.
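Going back for a moment to the CI-only stop-command timeout: a common mitigation for a check that only races in CI is to poll with a deadline instead of asserting immediately. This is a generic sketch, not Voight Kampff's actual internals, and the timeout values are arbitrary.

```python
# Generic poll-until-deadline helper for flaky CI-only races, e.g. a stop
# command whose effect lands later in CI than on a real device.

import time

def wait_for(condition, timeout=10.0, interval=0.5):
    """Poll `condition` until it returns truthy or `timeout` seconds pass.

    Returns the final truthiness, so callers can assert on the result
    instead of failing on the first too-early check.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(interval)
    return condition()  # one last check at the deadline
```

A test would then assert `wait_for(lambda: playback_stopped(), timeout=10)` rather than checking playback state once, which tolerates CI latency without hiding a genuine failure.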
Actually, if we have the I2C set for the max volume correctly, it shouldn't blow the speakers, but it is still too loud. If we pull over all that I2C setup stuff that we did before, back in, you know, November, December of last year, I believe it was around 70% volume on first boot. Okay. Yeah. You think that's probably in the Mark II enclosure repo? I can find it there, maybe? I mean, there should be something at boot that's doing some I2C commands. All right, I'll see if I can find it. Yeah. And we want to do that on the enclosure side of things, right? Because it's going to be different. Yeah, for sure. And I did put a note in there: we're still playing around with what voltages we're going to use, but the way we expect it right now is a 12-volt supply, and given the speakers that we expect, the speakers can't take the 13 watts that our amplifier can supply. So the speakers would blow out. We might have to switch down to, like, an eight or nine volt supply, but we're going to play around with that and see what kind of quality we get. Yeah. We'll need to know what the max is; like, I know with the previous versions, the max I2C volume was higher than what we want the actual max volume of the device to be. So there's some playing around there too, as to what we want a volume of 10 to be and how loud that needs to be. I mean, my preference is to make sure that the max volume on the I2C is the max volume that the device can create; that way there's no way of accidentally destroying the hardware. But just to be aware, for now, we do need to find that and be able to control it. Okay. Yeah, we don't want them turning it up to 11. Dangerous.
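The capping idea just described, that a user volume of 10 should map to a safe hardware level rather than the amplifier's full register range, can be sketched as a pure mapping function. The register full scale (63) and the safe ceiling (48) are made-up illustration values; the real numbers depend on the amplifier and the chosen supply voltage.

```python
# Sketch of capping the amplifier's I2C volume register so the user-facing
# max can never exceed a speaker-safe level. SAFE_MAX_REG and REG_RANGE are
# assumptions for illustration, not real amplifier values.

SAFE_MAX_REG = 48   # hypothetical: highest register value the speakers tolerate
REG_RANGE = 63      # hypothetical: amplifier register full scale (never used raw)

def user_volume_to_reg(level: int, max_level: int = 10) -> int:
    """Map a user volume (0..max_level) onto the safe part of the register range."""
    level = max(0, min(level, max_level))          # clamp out-of-range requests
    return round(level / max_level * SAFE_MAX_REG)  # 10 -> SAFE_MAX_REG, never above
```

With something like this on the enclosure side, a first-boot default of 70% is just `user_volume_to_reg(7)`, and there is no code path that writes a register value above the safe ceiling.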
Another piece: I was looking at the installer skill last week, and I think Åke messaged me about the two hidden fields. I did not realize that that was how our Mycrofts were previously installing things from the marketplace. So anyway, Åke and I were just chatting a little bit about what that process might look like in the future, but I think that's something we need to talk about at some point from an architectural perspective. Yeah. Would it be a good idea to remove those fields from the settings, since that's not working anyway, and then add them back when we fix the install problem? That way they're not just these strange fields showing up in settings that nobody knows the purpose of. Yeah, we could do that. But, I mean, one of my questions is whether that's the best way of doing those skill installations at all. I agree; we should probably revisit that when we start talking about marketplace installs again. But that's not the reason to get rid of them; we may implement it differently anyway. Yeah. Great. All right, I'll pull those out. All right. Anything else bug-fix related, Chris, while we've got you on mic? I don't think so. I'll just go ahead and have a look at where I'm heading next. Well, actually, the reporting of "device ready": I still haven't gotten to the core side of that, so that might take a minute. I think that's more a Picroft issue than it is a Mark II issue, right? Because there's enough stuff going on with... I feel like it's across the board. I've had it on all sorts of devices, where you pair and the device isn't actually ready. So I think, regardless of which device you're on, if we handle that at the Mycroft core level, then it shouldn't be possible on any device. Yeah.
I think the way we handled it with the Mark II Kivy, at least, is we have the loading-skills progress bar, which implies you shouldn't actually be interacting with it until the skills are loaded. By the time you get to the actual splash screen, everything is loaded. That's how we solved it for the Mark II Kivy, at least. But Picroft, for example, doesn't start loading skills until you get through setup, and that could take 30 seconds, and there's really no indicator that that's the case. Yeah. So we want to make sure that there is a "device ready" or "mycroft.ready" message on the message bus, and that it's only sent once everything is actually set up and ready to roll. And I think that's the case; I think that message exists. I just think that Picroft, for example, and other devices appear to be in a ready state before that message gets sent out. Yeah. So we need to figure out a way, on those devices, to show something visual or play audio that says: I'm not ready yet; you may have paired me, but you can't use me yet. Well, that's what I mean: that "it's ready" message shouldn't go out if it's not actually ready. I'm not sure that it is. I think that, in the instance of, like, a Picroft, by the time you finish setup and pairing, and the pairing skill says "you can ask me things like this," it's still loading skills at that point, right? And it's sending out the bus message, but there's nothing in Picroft that acts on that bus message and says, hey, we're ready to go now, unless you're in the CLI, where you can see it. Okay. Well, in that case, anyway, this is probably too much detail for right now. Yeah. I'm just saying, I think the issue is that the bus message is coming out at the wrong time. I just think it's not being handled correctly by different enclosures.
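As an aside, the enclosure-side half of this, holding off user interaction until the readiness message arrives, can be captured in a few lines. This is an illustrative sketch only, not Mycroft's actual enclosure code; it assumes only that some message of type "mycroft.ready" eventually appears on the bus, as discussed above.

```python
# Minimal sketch of "don't act before mycroft.ready": an enclosure queues
# interactions until the readiness message arrives on the bus, then flushes.
# Illustrative only; real enclosures would register a bus handler instead.

class ReadinessGate:
    def __init__(self):
        self.ready = False
        self._pending = []

    def on_bus_message(self, msg_type: str) -> None:
        """Feed each bus message type in; flips the gate on 'mycroft.ready'."""
        if msg_type == "mycroft.ready":
            self.ready = True
            self._flush()

    def submit(self, action) -> None:
        """Run the action now if ready, otherwise hold it until we are."""
        if self.ready:
            action()
        else:
            self._pending.append(action)

    def _flush(self) -> None:
        while self._pending:
            self._pending.pop(0)()
```

The same gate could drive a visual or audio "not ready yet" indicator whenever `submit` has to queue instead of run.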
That, I think, is the problem. So, all right, I'll have a look at that. All right. Ken, Precise: these three tickets that somehow encompass everything that you're doing. Actually, yeah. So we last spoke on Thursday. I had a big build going, probably about 50-60,000 files total out of the 114,000 that were classified, and I reported that it was taking upwards of four hours. It ended up taking six hours. Since then I have been on the Lambda 2 server, today, getting the lay of the land and getting everything set up and understanding what things are. And I just completed, right before this meeting, for you, Josh, the same model being built on that Lambda 2 server, and verified that the GPUs are active. And build time: remember, my MacBook Pro is pretty good; it's got 16 gig of RAM, so it's not like a little four-gig laptop, and it's an Intel processor. It took six hours on my laptop. It took just under an hour on the Lambda. So I guess the good news is you're getting your bang for your buck, because if I have to wait six hours to turn around these models, it's going to be torturous; an hour is doable. I haven't had a chance to test the one that just finished right before the meeting, but the one that completed Thursday night, around 11 at night, I ran against three test data sets, against the original. I'll be doing some more work with the data pipeline and the data sets tonight and tomorrow to get a little better handle on it, but just some rough numbers. Against a test data set called "test model," which was for a different model, the original "Hey Mycroft" had an 88% recognition rate; the new model had 96%. On the male contributors test data set: the original "Hey Mycroft," 87%; the new balanced one, 94%. On the female contributors test data set: the original "Hey Mycroft," 84%; the new one that completed Thursday night, 95%.
So, if those numbers can be trusted, and I'm going to be verifying that soon, it looks like balancing the data may actually improve it, but I'm not sure that's all it is. I also ran the epochs at about 6,000, which is one of the reasons it takes so long, and I think Matt was saying he only ran 600, but I'm not sure. So I'm going to be doing some testing with different values for epochs and sensitivity levels and chunk sizes and stuff, but at least I now have a screamer that I can build my models on, so it doesn't take quite so long. So that's good. I'm not putting anything on there, Josh. You only have two ports open, 22 and 1776, and I'm assuming you'd like to keep it locked down, so I'm not going to put anything there that exposes it. The first question I have, though, is: what is on that server? This is Lambda 2. Lambda 1 is Mimic. Lambda 2 has almost a terabyte of storage, of which almost 90% is consumed. It's almost certainly wake word samples. Well, we don't have that many. We've got about a gigabyte. Oh, you're saying we have unclassified raw data somewhere. Ah, okay. I bet it's from people who opted in, and only people who opted in. You're saying it's been continuing to collect all that data. So that's an excellent start. And my next question to you is this: we're searching a multi-dimensional solution space. We have a number of different factors that play into the accuracy of the model, steps and so on and so forth. Can we search that solution space in an automated fashion rather than going through and, you know, hand-jamming it? That's where I was heading. Yeah, you're exactly right.
I don't know what that means yet from a, you know, nuts-and-bolts perspective, but I understand the concept, which is: if you want to test it with epochs at 6,000, 5,000, 4,000, 3,000, 2,000, 1,000, you have to sit there and do it manually, figure out a way to back it off and look at whether it got better or not. So yeah, I'm heading there. What I will do, though, before I get there, is put a little more consistency into some of these test data sets so that things are reproducible, and verify there isn't a lot of cross-contamination, obviously, which was why I was so surprised that the performance numbers showed what they did. The Hey Mycroft model, by definition, is cross-contaminated, because all of the data that I have that's been tagged went into building that model; none of it was held out for a test set. That only happened in the one I balanced. So if anything, Hey Mycroft should be kicking its ass, and it's not. So I don't know what's going on there. I'm going to have to look at it. But yeah, I'm working towards being able to automate that process to make it a little less cumbersome, less manual, and less time-consuming. So what we're talking about is often referred to as hyperparameter optimization. I saw your email. I didn't mean to ignore you. The problem was that that class started today, and I'd love to take a class, I just don't believe I have the bandwidth right now. I'm making some progress and I'd like to get this out of the way, so I'm going to hold off on that until the next one. But yeah, I mean, it looked really cool and I wouldn't mind attending. Oh, the other thing about those: you can just use them as reference. You don't have to attend the class or anything like that. You can just go through at your own pace. I've done it before. Yeah, I hadn't gotten into it too much.
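The automated search being described is, at its simplest, a grid sweep over the tunable knobs. A minimal sketch, where `train_and_score` is a placeholder standing in for a real train-then-evaluate run (the function, the parameter ranges, and the fake score are all assumptions, not the actual Precise training API):

```python
from itertools import product

def train_and_score(epochs, chunk_size):
    """Placeholder for a real training run followed by evaluation
    on a held-out test set. Here we fake a score surface that
    happens to peak at moderate settings."""
    return 1.0 - abs(epochs - 2000) / 10000 - abs(chunk_size - 1024) / 100000

def grid_search():
    epochs_options = [600, 1000, 2000, 4000, 6000]
    chunk_options = [512, 1024, 2048]
    # Try every combination and keep the one with the best score.
    best = max(product(epochs_options, chunk_options),
               key=lambda cfg: train_and_score(*cfg))
    return best  # (epochs, chunk_size)

print(grid_search())  # -> (2000, 1024)
```

With each real run down to minutes on the Lambda, even this brute-force loop becomes practical; fancier approaches (random search, Bayesian optimization) just replace the exhaustive `product` with a smarter sampler.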
So I was going to look at that a little more too, because I think what that's talking about are not the parameters that are exposed externally to us. I think it's talking about the parameters from the pbparams file, which are kind of like internal parameters the model uses to decide, you know, when to wake. So I don't know, I haven't gotten into it yet. Okay. But we'll definitely take a look at that. But anyway, the bottom line is I'm now up on the Lambda server. I'm not going to expose that Lambda server, Josh; I'll keep it locked down. The only exposure will be to get data into it that's classified, because as we have raw data coming in, we want to get it classified into data sets in a consistent manner so that we have versioning on them, and we know we're comparing apples to apples. So I'm still grappling with that. The code from the blog post is in a little bit of a mess. But today was a good day and I anticipate some good progress. That's kind of where I'm at. Just one more note on the parameter optimization: one of the parameters is how many samples you need to make it work well, and that's something worth looking into. Yes, yes, I will. Do you have any insights from what you read? Did it give you any... I didn't, actually; in that particular course I didn't look at that one. I did another one that was a little more general. But yeah, the net result is that getting more data is not always the way to improve your model. I'm glad to hear that, because I'm kind of constrained anyway; I don't really have a lot more data. But what I do have is the ability to filter the data into clean and dirty, if you will, and then further classify it into high and low pitch voices and things like that. And I'm suspecting that a lot of the data I discarded, which is almost half of it, was discarded because the classifier couldn't classify it as either a male or female voice, and that could very well be because it's neither. It's just noise.
And it's not clear to me that using noise during training is a great idea, so I'm not sure yet. That's part of what I'm experimenting on, and I will definitely look into some of these classes to answer some of these questions. And that's great input, but I'm always, like, from Missouri, right? I want to see it. If it says less but cleaner data is better, I want to prove it. So that's kind of where I'm at. I can now turn out some of these larger models in a reasonable amount of time, so hopefully I'll be able to do more building of new models this week, and I'll see, first-hand, whether more or less data is better. And the hyperparameter that is going to save you the most time is the number of epochs. Yeah, the epochs, right now I'm running 6,000. I get the feeling Natalie ran it around 600. We had a brief discussion on that; she was running substantially fewer epochs. So if we can identify how many epochs we need to run and crank that down from 6,000 to 600, it means your models will take six minutes on the Lambda to train. And once you've gotten down to six minutes, you can start searching that hyperparameter space. I mean, you could build nested for loops and just build a big multi-dimensional matrix and test them all, if it's only going to take six minutes. There are probably more elegant approaches. And I would also argue that we should maybe start looking at some of the hyperparameters by inspection instead of saying, you know, here's a range, we have no idea why it's that range, we're just going to search the whole range just because. It'd be great to know what a hyperparameter does and, you know, think it through a little bit more before randomly searching parameters. But I'm sure you can do that. There's a couple of them I'm familiar with, for the mel frames and stuff, but I certainly don't have a handle on all of them.
What I would say, though, is I was surprised that with half as much data, the model I built outperformed the production model. And if it's the case, and I don't know, this is some of the stuff I'll be experimenting with and documenting, if it's the case that the difference is 600 versus 6,000 epochs, then, you know, that's kind of where you're at on that. You should be able to verify that pretty quickly. But I suspect it's because you're running a quote-unquote unfair test, in that in your test data set you've got a balanced number of male and female voices. The originally trained model is not good at female voices. Your new model is. And that's all the difference. Yeah, I mean, again, I don't know. As far as the epochs go, I don't know what it is exactly that gave me that performance. So I guess part of this week will be answering that question, and proving it with numbers that I can publish and everybody can look at and keep me honest on. That's where I'm at. You can just put the number of epochs in a for loop and, you know, inside of an hour you'll have the answer. Yeah, I can certainly vary the epochs and see what the performance numbers will be given the same data, apples to apples, same training data. Again, it's not clear to me what the gating factor is. Is it the cleanliness of the data? Is it the epochs? Just take every parameter one at a time, isolate it, and see what kind of curve you get. Yeah, that's where I'm heading tomorrow morning. So, okay. Can I say that I don't fully trust the test results at the moment? If we're getting an 80% success rate on female activations, that's significantly different from experience on the ground. No, the reason for that is that we are not, well, we probably are collecting, but we are not using, the missed utterances.
So, when Nate went through and said, "I don't want to hear the 10 seconds before the wake word activation, because I don't want to hear Josh and Chris fighting in their kitchen," which is what caused that change, we basically gave up all of the data for missed activations. So, if you try to activate it three times and it fails all three times, and then you activate it successfully, the only one that we're capturing and uploading is the successful activation. So even though in that case the accuracy rate is 25%, the data set that we get would show 100%, because the only sample we got was the one where it actually activated. We need to go back to the way I originally wrote that software, in an afternoon, and grab the failed attempts before the wake word. So right now, the models are very heavily biased towards not activating, towards missing the wake word, as opposed to being biased towards accidentally waking up. We need to go back and get those samples and use them for both classifying and testing. Josh, is there more data available that I'm not aware of than that one gigabyte of data? Sounds like you've got a terabyte of data sitting on that Lambda server. Yeah. You want me to go poke around and figure out where it is? I think I know. Given that the guys probably didn't take the time to do it right, my guess is it's all in one folder. So when you try to list the contents of a folder and it locks your machine up, that's probably the one it's in. So now you're going to have to use find or something similar to go through and reclassify that stuff. I mean, there's a variety of ways we could have handled that problem.
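Capturing the audio just before an activation, as described here, is typically done with a rolling pre-roll buffer that is always recording and only gets saved when a wake word decision lands. A minimal sketch; the frame rate, buffer length, and byte-string frames are assumptions for illustration, not the actual Precise implementation:

```python
from collections import deque

class PreRollBuffer:
    """Keeps only the most recent `seconds` of audio frames, so the
    span preceding a wake-word decision (including failed attempts)
    can be saved for tagging and testing."""
    def __init__(self, seconds=10, frames_per_second=50):
        self.frames = deque(maxlen=seconds * frames_per_second)

    def feed(self, frame):
        self.frames.append(frame)  # oldest frames fall off automatically

    def snapshot(self):
        return b"".join(self.frames)  # audio leading up to "now"

buf = PreRollBuffer(seconds=1, frames_per_second=4)
for i in range(10):        # feed 10 frames; only the last 4 survive
    buf.feed(bytes([i]))
print(len(buf.frames))     # -> 4
```

The point of the `maxlen` deque is that nothing older than the window is ever retained, which is also what bounds the privacy exposure of keeping pre-activation audio at all.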
I'm a little frustrated by this, not so much because it's not working, but because of the amount of time and money we spent with a developer to do this, and we're finding that just the really, truly basic stuff, like, hey, hash the wake word file and use the individual letters of that hash to create a subfolder structure, so that you can find these things without bringing the kernel and the file management software to their knees, stuff like that just didn't get done. It's okay, Josh. I have a handle on that. I see that's, like, a quarter of a million dollars, Ken. Well, I can't speak to that. All I can say is I've got index files, in CSV format now, for all of the data that I'm indexing, and they're classified, so you don't need to be copying stuff around and getting locked up on individual subdirectories. What I would say is there are a lot of data directories. At the file system root there's a .data; I haven't dug into what's in there. There's an ML data directory and a bunch of other stuff, so I'll figure it out. But what you're referring to is when somebody opts in, and I believe there's a bug there, by the way. I think it sends the data whether you're opted in or not. It goes somewhere. Whoa, whoa, whoa. Stop right there. That needs to be fixed right now. Why wasn't that the first thing that came up in this conversation? That is the absolute top priority for the entire company until the bug is solved. There is no other priority. Move it to the top of your sprint. Nothing at the company gets done until that bug is fixed and any data that we've collected that we should not have collected has been deleted. That's the only thing... Can we first verify whether that is actually happening? I was just asking: where is the data that we are collecting and shipping to the cloud ending up? That's a good question. You should answer that question. Okay, I'll figure it out. I'm not... That's not a joke.
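The folder-sharding scheme Josh describes, hashing each sample and using leading characters of the hash as nested subdirectories so no single directory grows unboundedly, might look something like this. The two-level, two-character layout and the `.wav` extension are assumptions for illustration:

```python
import hashlib
from pathlib import Path

def shard_path(root, payload, depth=2, width=2):
    """Return e.g. root/ab/cd/<full-hash>.wav for a sample's SHA-1,
    so files spread evenly across many small directories instead of
    piling up in one folder that chokes ls/cat."""
    digest = hashlib.sha1(payload).hexdigest()
    shards = [digest[i * width:(i + 1) * width] for i in range(depth)]
    return Path(root, *shards, digest + ".wav")

p = shard_path("/data/wakeword", b"fake audio bytes")
print(p)  # /data/wakeword/<2 chars>/<2 chars>/<sha1>.wav
```

With 2 levels of 2 hex characters each, a million files land in roughly 65,536 buckets of about 15 files apiece, which is the property that keeps directory listings fast.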
I'm just asking... What the company is doing, as of that statement right now, is tracking down whether or not it's true, and if it is true, tracking down the data and deleting anything that we were not authorized to collect. Every piece of data that we were unauthorized to collect needs to be gone, permanently, and that's the only thing Mycroft does until that's done. Chris or Gez, do you guys have any idea where that data ends up landing? Okay. That's company-wide. Yeah, no, no. I mean, Josh is right. It's a race. Chris, Gez, Ken, who can find it first? Let's go track that stuff down, and we can reconvene this meeting in a couple of days. Everything else sounds like you're doing fine, but yeah, that's the only thing the company does. What is the basis for your thought that we're collecting for everybody instead of just opt-in? Why do you think that? Because I could have sworn, when I installed my version from dev, that the default setting for that flag is false, yet I saw activity in a log file that seemed to indicate it was sending data to the cloud. Okay. And I'll have to chase it down. I think the endpoint you're talking about is where that one terabyte of data is. I mean, there's no process that I know of to move anything from a cloud server to that Lambda machine. I think it just goes straight there. I don't know how, because only port 1776 is open on it, but I can pore through the code and the configuration. The way I originally built that software, I created an account with permissions limited both from the account perspective and also through SSH, in such a way that the only thing it could do is shove a file up to the folder. It would open an SSH connection, tunnel the data through one of those two ports you just mentioned, dump it onto the machine, and then close the session. And that was the end of it.
So by default, you know, rather than standing up a web server with SSL and putting in write permissions and everything else, I developed it so that it would use SSH for that. Now, I literally developed all that software in an afternoon; that was my original development in Palo Alto two years ago. You know, having had several people work on this full-time since then, I would expect them to have done something a little more elegant, but it could be that that's still the way it's set up. I can tell you with almost 100% certainty that there's nothing about Precise data on our cloud servers. I practically live in those things, and I don't know of anything that would... yeah, I don't think it's in DigitalOcean anywhere. And the Lambda server was the original destination, so that has not changed, and probably still is. Okay, and when you say you tunneled it, Josh, you mean through SCP? Yeah, just SCP, you know, SCP is just SSH. Just SCP up to the folder, dump it in, and that's that. As I recall, that user didn't even have permission to retrieve the listing of the folder. All it is is a destination, and the only thing it can do is write. Is this code within Precise, or is it within core? I don't know; when I wrote it, sorry, it's written in Python, I probably used Python to make a system call to SSH and dump the file that way. Yeah, I doubt it's in Precise. Don't the good samples get uploaded to Selene? That's what I assumed. No. No, Selene reads, and actually there's nothing in Selene about Precise either, because, well, at least what Tartarus used to do was read from this dumping ground of files, classify them, and hopefully put them somewhere else. So it was really just reading and writing from wherever those things are living.
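The write-only upload path Josh describes, Python shelling out to SCP toward a locked-down account on a nonstandard port, could be sketched as below. The user, host, and destination folder are placeholders (not the real server details), and the actual network call is kept behind a dry-run flag:

```python
import subprocess

def build_scp_command(local_file, user="uploader",
                      host="lambda2.example.com",
                      dest="/incoming/", port=1776):
    # scp's -P flag selects the nonstandard SSH port. The remote
    # account is assumed to be write-only: it can dump a file into
    # the folder but cannot list or read it back, as described.
    return ["scp", "-P", str(port), local_file, f"{user}@{host}:{dest}"]

def upload(local_file, dry_run=True):
    cmd = build_scp_command(local_file)
    if dry_run:
        return cmd                   # inspect without touching the network
    subprocess.run(cmd, check=True)  # real transfer; raises on failure
    return cmd

print(upload("sample.wav"))
```

Because SCP rides on SSH, this gets transport encryption and key-based auth for free, which is the "rather than standing up a web server with SSL" trade-off being described.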
So I think the first thing we should do is verify that this is actually an issue, make sure we're not going on a long chase, I guess is my point. Let me just finish up with one question on this data, by the way. The data that Gez gave me, the one gigabyte of tagged data, I'm assuming that's all the tagged data we have, but my real question isn't so much that: is that data available to be offered to our community for download, or not? In other words, if I wanted to make that data set available (I've cleaned it up; I've written code to clean it up and fix some of the file names; there were like 7,000 bad file names), but that's not what I'm getting at. Eventually, is it the case that we could stand that data up somewhere and say, "Here, Mr. Customer, if you want to build your own model, here's a great set of data to get started with"? Are we allowed to do that, or is that data confidential and we can't share it? I just don't know. Which data set, the wake word stuff? Chris, Gez gave me about a gigabyte worth of data that had been tagged as either wake word or not wake word, and I'm wondering if we can share that with the community at some point in the future or not. We can; they just need to sign the data sharing agreement, which requires them to update their copy of that data every 30 days. The thinking there is that each piece of that data is connected back to an individual user. If that user decides to opt out of sharing the data tomorrow, right?
Next time we compile that data set, we'll have nuked all that stuff out. So the thinking is: if you build a model today and you incorporate that user's data, and that user opts out tomorrow, which is fine, you can continue to ship the model that was built on it, because it's really abstracted away; there's no way of using that model to figure out what that user's data was. But you are required, under your data sharing agreement, after 30 days, to update the data you're using to train the model. And that means that if in the intervening 30 days some user has opted out, the next time you download that data it will be gone. So it gives our users the right to... How do you do that? Do you have a master manifest somewhere that says this file came from this user, that I'm not aware of? Usually it's a user ID of some sort in the file name. I don't know if that's the way it is now, but the file name should have it. When I originally did it, I wrote it into the file name. You know, the reason the file names are this long string of random-looking numbers with dashes is that each dash separates a database field. So rather than trying to deal with the complexity of "I've got a database here and a file here and I'm trying to keep them synced up," I just shoved all of the relevant information into the file name, so that for any one of those files you can look at it and say: this is the date and time it was uploaded, here's the user it came from, this is what it contains, and so on, all just from the file name. Do you have some sort of secret decoder chart that would show me how to derive that information from the file names? I'm so frustrated by this. No, I don't have...
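If the dash-separated filenames really do pack the database fields in as described, a decoder is one split away. The field order below (timestamp, user ID, label) is purely a guess for illustration, since nobody in the meeting could confirm the actual layout:

```python
def parse_sample_name(filename):
    """Split a dash-delimited sample name into labeled fields.
    The field order used here is hypothetical."""
    stem = filename.rsplit(".", 1)[0]      # drop the .wav extension
    fields = stem.split("-")
    keys = ["timestamp", "user_id", "label"]
    return dict(zip(keys, fields))

info = parse_sample_name("20191104T2300-a1b2c3-wakeword.wav")
print(info["user_id"])  # -> a1b2c3
```

Once the real layout is confirmed (the morning session Ken and Chris planned), the same split can drive both the master manifest and the opt-out purge: filter files whose `user_id` field matches an opted-out user.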
The guys that were working on this as their full-time job for years... I'm so frustrated by that. Give me a file name, Ken, an example, and I'll see if I can help you with it. Let me get with you in the morning, Chris, and we'll interrogate one of these file names and see if we can figure out whether one of these fields is a user ID or session ID or whatever. Yeah, I'll try to figure out where this data is going and how much we're accumulating each day. I have no idea how long this process has been going on, but 90% of that disk is consumed, so once I understand the system better, we either upgrade that disk or we start paring back some of that data, because 90% is getting dangerously close to where we don't want to be on disk usage. Yeah, rather than paring back, it might be a good idea to just create an archive of the stuff we already have, assuming that's where all the data is still being pumped to. I think it makes more sense to archive what we have and just free up space. I took a measurement today; I know where it was at when I started and where it was at when I was done, and I'll look tomorrow and the day after and see what its growth rate looks like. And the other thing is, moving some of that data off, I don't know, can be a non-trivial issue, because you're talking about almost a terabyte worth of data; I don't know where you're going to stick it and how you're going to get it off of there. All that data is probably sitting on disk attached to the Lambda, so I'm pretty sure you can move it from one place on that... yeah, it's almost certainly sitting on disk on the machine itself, because I'm pretty sure the Lambdas don't use the NAS. But it's on a 10-gig full-duplex connection, so it's not like moving it is going to be a huge deal, and worst case I can send one of the guys, or send Derek over there, with an external drive, plug it in, and just suck it out through a USB 3 port. But yeah, let me get some knowledge so I can speak more intelligently
to the situation. Okay. Yeah, the big rock is: if we are collecting data from anybody who hasn't opted in, we need to identify where that data is and, ideally without ever accessing it, nuke it. Yeah. Then we need to disclose immediately to the community what happened, why, and how we fixed it. We need to be transparent. Once we know, then we can proceed. Okay, that's my update; sorry it took so long. Can I be notified once we have a public statement? Guys, can you make sure to loop me in on that? Yeah, we will definitely be talking. People really like it when we admit our failures and talk about how we fixed them. I don't know why they like it, but we get a lot of praise for that, because it's unusual; people don't usually do that. So, I just did a dumb cursory search through the code base and I don't see any obvious errors related to opt-in. So if there is anything, it's got to be something either obscure or in, like, the interface between Selene and core. All right, I'll take a look. I'll look in Selene while you're looking, Ken, at the files on the server; I can see if there's an endpoint in Selene that does this work we're talking about, too. All right, good, thanks. So, have we finished going through all the sprints? I just did one: this Mycroft sprint 12 I'm leaving open until it goes to production. I'm going to do graphics in Grafana, so that's why it's still open, probably until, I'm guessing, tomorrow night. I need to do this WordPress drop-off thing now that I'm done with all the Selene stuff I was doing; I'm trying to get to some of these things I've been looking at for a while. Did I hear "droplet"? Sorry, what? Is it difficult to get me a small droplet that I can put Apache on? For now we use Nginx for our reverse proxies, if that's what you're using it for. Not a reverse proxy; I just want to be able to serve some CGI out of it, so I can put that web page I was running on my laptop up there so everybody
internal can run their own tests, if they want, against our data and our models, kind of to keep me honest. You can have a droplet if you want, but I'm always reluctant to rent other people's computers given that we spent a bunch of money on our own. Would you like your own virtual machine, with, you know, as much processor time as you want and as much storage as you want? Just send me what you want on your machine and I'll stand it up for you and give you shell access. That would be perfect. Thank you, Josh. Now we're done? Okay, all right, well, thanks everybody. Sounds like we had a fun little scare there at the end; let's make sure we take care of that first thing. We don't have a general meeting tomorrow, but I'd like to see, at least in an email, what your results and findings are on that. Hopefully, Ken, there's a bug open for that. I thought we had a bug open for that. For what? "Don't collect data if somebody hasn't opted in." You thought we had a bug? If anyone has mentioned that bug, it's in one of these sprints on the Jira board I was looking at. All right, well, you know, before I go and cause all sorts of chaos with off-the-cuff statements, let me go back and verify. Let me dig it up, figure out what's happening, see if it is indeed happening, and report back. That's fine; we just need to move it to the top of the priority list, because, you know, we've publicly stated, and I stand by the statement, that, well, number one, we will end up inadvertently collecting data that we don't mean to. We've been very transparent that it's going to happen from time to time, because these systems are complicated. You know, even Twitter got busted keeping all of their users' passwords in clear text in a log file, right? Given the amount of money they spend on IT, it should not be surprising that from time to time we're going to have a hiccup where we're inadvertently keeping something. But we have made the
commitment that when that happens, it will only last as long as we're unaware of it. So now that we know it might be a problem, it becomes our top priority, and once we've resolved it, we'll be transparent about what happened and how we fixed it. Security problems don't bother me, and data collection problems don't bother me, because they're going to happen; it's a hundred percent certainty, and there's nothing you can do to prevent them. The issue is, once we know they've occurred, making sure they become the top priority of the organization and that we're transparent about resolving them. That's what gives people the confidence that, you know, they can trust us with their data and their privacy. I understand. I don't mean to keep dragging this meeting out, but when I learn something that's actionable... So, that I understand you correctly: the automated process which is gathering the data that's being sent to the cloud is determining whether or not a wake word was spoken, and sending it up there based upon its interpretation of whether it thinks the audio was actually a wake word or not? How is that data getting classified? There used to be this great online suite where everybody could go in and classify it, and they would get points for classifying the data. I get that, I get that. I thought I heard you say that it was automagically being transferred up based upon an activation or something. It used to be. I don't know; I'll get into that code and figure it out. It's part of this whole process. All right. And yeah, I have a whole data pipeline that we want to get working, so we can have a consistent flow of data coming into the system, getting classified, and getting into the models to constantly improve them. And then, Josh, you and I, in the future, we'll talk about how we can do that at a more local level, by listening and kind of maybe rebuilding in an incremental manner locally, but
we're not there yet, but we'll get to that conversation soon, because that's really, I think, where we're going to get our big bang for the buck: if we can take the base model and improve it incrementally based on the user that's using it. But I digress. Okay, sorry. When we do find out, and this research we're doing tomorrow morning... I think there are still a lot of questions around, you know, where the data is going and where it's living, and we need to make sure we answer those questions, and make sure we don't have to answer them again. Absolutely. I'm big on that; that's why I'm documenting what I'm doing on the wiki. And yeah, I want to get to the bottom of it. I'll document the process, since that's knowledge that's valuable to this team, and we don't have it right now. So, Ken, just a quick update: your recollection is not wrong, but it's a little bit off. There is a bug report; it's Precise-22, and it just says "verify that wake word utterances are only selected for users that opt in." It's not because we thought we saw a problem; it's because there was a problem in a different system that we fixed, and out of an abundance of caution, Chris thought that we should also check and make sure there wasn't a problem in Precise. Okay. We encountered this issue a few months ago with the STT stuff, so I had to fix it up, and all that's good stuff. So we've been through this rodeo once, somewhat recently. Okay. One other thing I wanted to bring up before we go is an update on Project Rollover. We've talked to their developer. He recycled, or rebooted, or power cycled his devices, and now the audio works. This is after power cycling them several times before and the audio not working. So on the bright side, it does work now, and that's good. On the downside, a "sometimes it works, sometimes it doesn't" bug is a very hard one to nail down. It could well be a timing issue in the bring-up sequence. Now, I don't even know
if it's software or hardware. I talked to OK about this a little bit in chat, and he gave me the very reassuring statement that Linux audio is sometimes magic, especially when ALSA is used directly. That gave me a warm fuzzy right there. So yeah, I don't know what to say about that. This is really the only case of this that I know of; if it's happening elsewhere, then it's something we need to look at, because we don't want to be shipping devices that just pump out audio when they feel like it. Well, yeah, with our new system we're going to be taking a much more critical eye to the boot-up sequence and all that kind of stuff. There are things I'm sure we didn't even pay attention to, like the fact that the power supplies have to ramp up monotonically from zero to X volts in order to guarantee a given chip works, and so on. Using all off-the-shelf parts in a sort of hacked-together system, it's going to be a lot different with the fully integrated board. So, just to say, I'm not spending any more time on it now that it's working for him. There seems to be just some weirdness going on there; I could spend a bunch of time trying to figure it out, but unless it breaks again, I'm not spending time on that right now. Okay, that's fair. All right, did anybody else have anything before we wrap up? Michael, did you still want to stay on and talk about Wi-Fi setup, or do you want to do it another time? Sure, we can let everyone go, hang out, and talk about that. Sure. I just found the endpoint that we're uploading wake word samples to, apparently, so that might be a direction. Where did you find it? Well, I took it from Jarvis, because Jarvis does weird stuff with that. So this is this endpoint: training, mycroft.ai, persona tag, precise tag. Yeah, so the precise tag is the one that Josh was calling out, where people were tagging "Hey Mycroft" versus not "Hey Mycroft." But it says Precise has moved; when you
click on it: "we've moved Precise tagging under the tagging tab." So it used to live as a separate thing, and then when we created the new back end, they just took it down; this is just where the tagging happens now. But there's other stuff going on, because that endpoint isn't in the tagging code, so it's got to be coming from somewhere else. All right, yeah, I'll try to figure it out. I'll go through the code in the morning and figure out where the decision is made, and I'll set a breakpoint and trap it and make sure we're not doing anything we shouldn't be doing. And if we are, I will contact Michael right away. All right, thank you, everybody. If you are riveted by Wi-Fi setup, you can certainly hang on; if not, we will talk to you on Thursday. All right, talk to everybody on Thursday. All right, bye, guys.