Cool. All right, well, today is the 11th of January 2021, this is the dev sync, and Ken has volunteered to go first, so let's go, Ken.

So Friday I was full of encouragement. We thought we had found the problem, we patched it, and we were testing it. When I went to bed Friday night I was playing music so loud that it was aggravating my wife, which I can't do with my laptop; it doesn't get loud enough to aggravate her, but the Mark II actually was, which is a good thing. So I was getting "turn that crap down, turn that crap down." I had thrown a playlist in there and was playing it in the background while core was running. I could still communicate with it, I could tell it to turn the volume up and down, and unless the volume was at almost 100 it could understand me and respond. Everything was great, and I went to bed Friday night encouraged. I said, I'm going to leave this sucker running all weekend, and Monday I'll have good stuff to report. Then Saturday around three o'clock it stopped responding. Now, that was a good run, something like 12 to 18 hours. So I started looking over the weekend at what was wrong with it, and continued through this morning, and I believe that Chris and I are both in the same place in the code. I'll give you what I think is happening. I haven't proven it yet, but I'm working on it, and I think we'll lock this down pretty quick. What I suspect is happening is that, because of the fix we put in place, which resets the mic when an overrun occurs, you're in the middle of a routine that's reading from a buffer that just got emptied and reset. Your mic connection is no longer valid and your buffer is no longer valid. In other words, our code isn't used to having the buffer and mic reset underneath it, and all the data cleared out, while it's in the middle of a routine pulling data out of the buffer. So I suspect if we can catch that condition and reset the mic in accordance with that, we should be okay, but
that's a working theory, and that's my status for today.

Okay. Well, if that's true, shouldn't you be looking at error handling in that case?

Yeah. The last thing I did was go in there. Right now it's set to not throw an exception on an overrun condition in the mic class; I changed that to true and was just getting ready to test it when the meeting started. But I don't know that that'll catch it, because it's not really throwing an exception. It's saying "read from the buffer," and the stream keeps saying there are zero bytes available, but it came into the routine trying to fulfill a certain count. So in this particular case it thinks it has 912 bytes remaining to be read, but the stream keeps telling it there are zero bytes available.

Yeah, so that seems like an error condition that needs to be handled.

Yes, that's what I believe is going on, but it's not as simple as getting an exception. We're going to have to use some logic to detect it, reset, and test.

Which brings me to the question: is the Mycroft QA team me and Chris? Because I guess that's what we're kind of doing, we're beta testing, right?

Yeah, I mean, you're the ones with the devices. We don't have devices.

And Veilleux's been with the team long enough to know that when you give me a device, I absolutely will tell you when it sucks.

You break it every single time. You have the capability of breaking it no matter how good it is.

Yeah. So I should have a device in the mail today. Are there instructions for bringing that device up on the current software stack somewhere that's accessible?

Yes, there's a whole Pantacor introduction document that we've been building, right?

Yeah, there's a document that gives you step by step how to.

Does that live in Google Drive?

No, it lives in Pantacor's. There's just a link to it. It's a HackMD, so that we and Pantacor can both
edit it.

Yeah, but I think for you, George, unless you're going to be going into the guts of the device, if you're just using it, all you have to do is download an image, flash it on, and go.

Okay, even that requires instructions.

So yeah, and I'm telling you to write them.

I'm happy to write the instructions, for both that and the assembly of the device. Just somebody point me at a starting place and I'm happy to write.

Once we have an image that we like, we can probably put it on Google Drive so that we don't have to go into Pantacor and log into that. You won't be able to log into your device or look at log files, but at least you can get your image to run.

Yeah, I probably don't need that. You might want me to have that so I can dump problems on your desk with enough information to debug them, but for now I would settle for, by the end of today, whenever the UPS guy shows up, being able to have a device on the desk here. And I might move it to the kitchen, we'll see.

Well, I think we did some API work that connects our back end into their back end, so maybe that's done already and you can do that. I don't know.

The API work is not done, but that's not needed for this.

So somebody, Gez, if you're the best person, just shoot me a very short two-to-three-line email with the link in it and I will deal with it.

Yeah, will do.

Okay, great. Well, let's go over to Veilleux, since he's unmuted.

Yeah, so I've had an interesting few days. I have my R5 device, which is good, so I have an R4 and an R5 device running right now. I also put a bunch of log messages into the microphone file in an attempt to figure out why that first boot isn't working. What I've found, or what I think I've found, is that there's a reason there are no log messages in that loop right now, because whenever I put log messages in that loop: failure, just like that.
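[Editor's sketch, not from the meeting.] One way to test whether synchronous log I/O is what perturbs the mic loop would be to hand log records to a background thread, so the loop itself only pays for a queue put. The following is a minimal sketch using the standard library's `QueueHandler` and `QueueListener`; the helper name `make_nonblocking_logger` and the logger name are hypothetical and do not come from the Mycroft code base.

```python
import logging
import logging.handlers
import queue


def make_nonblocking_logger(name, target_handler):
    """Return (logger, listener) for logging from a timing-sensitive loop.

    Hypothetical helper for illustration only. Records are placed on an
    in-memory queue by the calling thread; a background thread owned by
    the listener passes them to target_handler, which does the slow I/O.
    """
    log_queue = queue.Queue(-1)  # -1 means the queue is unbounded
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    logger.propagate = False  # keep records away from synchronous root handlers
    logger.addHandler(logging.handlers.QueueHandler(log_queue))
    listener = logging.handlers.QueueListener(log_queue, target_handler)
    listener.start()
    return logger, listener
```

A caller would create the logger once outside the loop, log inside the loop as usual, and call `listener.stop()` on shutdown to flush pending records. If the failure still appears sooner with this in place, blocking log I/O is probably not the whole story.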
Right now I'm running it without any log messages in that loop and it's been running for 15, 20 minutes without a problem. So I'm wondering if there's some sort of timing issue with I/O and that mic loop, or something like that. I don't know.

But if you pull those log messages out and let it run, it won't make it 24 hours.

No, but it'll make it more than 15 seconds, which is where I was. With a lot of messages in, I couldn't even get all the way through boot without the microphone failing.

Are the log messages going into an interrupt routine?

Okay, so we have some really strange logging now.

It's polled, it's polled, it's polled.

Okay, never mind.

So I don't know exactly why that is, but that's what I found today through multiple tests: the more log messages I have, the faster the microphone quits on me. So I do agree with Ken that if this interrupt happens in the middle of that listening loop, we're going to have to figure out a way to handle it. But figuring out why it fails on first boot is going to be very difficult, because I can't put log messages in there. To me, the log messages I'm putting in there are probably making it break in a different place than it would if they weren't there, which is scary in and of itself.

The email I sent out earlier this morning said that when I attempted that, what I saw on my very first boot was that it's running without the benefit of a network connection, and that causes some difficulty. The two specific points I saw in the log file are in that email I sent you. In one it's saying it can't connect to GitHub, so it's trying to update something and it can't. Then there's another where it's trying to download something and it's throwing a URL exception. So I suspect some of the aggravation from the initial boot has to do with the network not being available, and system services that expect it to be available breaking and possibly not
recovering. But I agree with Chris's assessment that we have a timing issue in the code that pulls the audio samples out of the ring buffer, and certainly the log messages could be aggravating that.

Yeah. So what I'm down to is, instead of going into PyAudio and PortAudio and trying to fix this, we just need to figure out a way to handle it in our code when it happens. Maybe there's a small change we have to make to raise an assertion or something. Right now what I'm seeing is that when the microphone fails, you get into what is basically an infinite loop of trying to read the stream. It says it can't read anything, and it just keeps going and going, because it says it can't read anything but there's stuff in the buffer. So I think we just need to get out of that loop somehow and restart it. The only question I have is, if we do that, what's the user experience going to be like? If they're in the middle of giving a request when it fails or restarts, how does that look? We'll figure that out when we make some coding changes.

Yeah, I put a little more detailed logging on that specific condition. It's asking for a 1K block, it's saying it has 912 bytes remaining to read based upon its previous report of how big the buffer was, and the stream is reporting there are zero bytes available. So it's going to stay in that loop until it can read 912 bytes, and the stream is going to continue to tell it there are none available because the stream has been reset. That's the situation. How to resolve that, I don't know yet, but those are the actual numbers I'm seeing on this particular one. The remaining value differs: sometimes it's 900 bytes, sometimes it's 50, whatever.

So why are we seeing this error in the containerized version of core and not when we just run it, you know,
on a normal operating system?

So, two reasons, I suspect. And when you say normal operating system, remember that the Pi and my Mac laptop are two vastly different operating environments. Probably the answer to your question regarding my Mac is that my Mac's got a beefy processor and a bunch of RAM, so it's not having any kind of race condition or timing issues, whereas the Pi is probably struggling to come back in time, which is why we're getting those overrun errors in the first place. It happens after a valid recognition: the callback's not fast enough, and then it gets a buffer overrun because that interrupt service routine is consuming too much time. So you're overrunning the buffer, and the next time it tries to read the buffer it gets an overrun because head is equal to tail or whatever. The fix we put in was to reset PortAudio at that point, but PortAudio is low in the stack and core is up in the stack, so we have to figure out that issue, if that makes sense.

All right. Well, I won't recommend digging into the PortAudio source just yet.

There's no need to. I mean, I'd still like to know why we're getting an overrun error, but this, I'm almost positive, is because of the patch we put in, right?

No, I get that. But we're not doing anything that a gigahertz-plus processor shouldn't be able to handle, you know, in terms of ISRs.

I agree.

Well, it's not as performant as stuff running on the bare metal, either. And there's also the fact that we're reading three times as much data as we used to. There's a lot of potential.

That doesn't make any sense to me.

Why?

Well, because the hardware hasn't changed. The amount of data that you're getting from the hardware is the same as it was before.

No, we're getting 48K; we used to get 16K.

That must have been translated somewhere in software, because the hardware always outputs at 48K.

When you say this hardware, or other hardware?

No, I mean the 201 hardware, because the XMOS chip only outputs at
whatever the same rate as the input is.

Well, I don't know that we've ever had an XMOS device running long enough to get into data formally.

Okay, but be that as it may, it's going to output 48K. You would think, I agree, that a quad core could keep up with it. So I don't know. I think it's because we do too much when we get an inbound recognition.

Well, I'm wondering if it has something to do with the containerization. I don't know how it works, but if there's some sort of higher-level interrupt that's interrupting our ISR, that can cause problems.

Yes, and the same thing with our clock drivers, if they're getting starved of CPU or something. So yeah, I'm trying to take the low-hanging-fruit approach. We may end up in there; I hope not.

Ken makes a good point. Our mic processing logic is quite complex. There's a lot going on in a single loop to read the mic, so it's possible there's a problem there too. But at the same time, core runs fine on Mark I, so, you know, it doesn't have this microphone, but...

Yeah, but if you look in that code we were looking at, there were specific sections where it says, well, we've got to do this weird funky thing because of the Mark I. So the question is, if we're doing that weird funky thing for the Mark I and we're running on a Mark II, is that causing it?

I mean, I'll say we've run the ReSpeaker, and the community has run a whole range of different microphones. You'd think this sort of thing would have presented itself pretty front and center.

I'm wondering if it's [unclear] USB versus I2S.

Yeah, so there are some differences. But sure, we'll do our best to try to fix this without digging too deep into it.

Yeah, that's a good point, Ken. On the I2S side, I'll talk to Kevin and see if we can build a simple hardware monitor to see if there are any glitches on the I2S bus.

Yeah, certainly with the clocks, right? I don't
know any easy way, without molesting them and a scope, to get a decent clock signal, but if Kevin can do that, that would...

Yeah, the clocks were a concern originally with the I2S bus. I thought we had solved that. They are generated in hardware; there's no software in the loop when it comes to those clock signals. But since the hardware is not really open source, we can't tell if there's some condition that might make them glitch. So I think maybe an external monitor is the way to go there.

Yeah, but I really believe, and again it's like Chris said, that we can fix this up high by detecting the condition and resetting. The question is what that does to the user experience, and we'll know as soon as we do it.

Yeah, so my tomorrow will be spent working with Ken, trying to get the next iteration of this fix in. It's getting better; it's just not there yet.

Right. Okay, let's go up to Gez then.

And that is the end of my... no. I spent the day doing lots of little things. We got a fix in for a situation where the idle screen should be showing but instead it shows a black screen, which I think everyone's probably seen at some point. We fixed one way that happens, so we'll see if it continues to happen and whether that was everything. I did some cleaning up of our PyPI permissions and, while I was at it, started a document for how we publish to PyPI, because it's been done by different people over the years and therefore has been done inconsistently, so different projects are treated differently. I want to make sure that becomes more consistent, not just for consistency's sake, but so that the same types of distributions are being uploaded, the permissions are correct, information is available to end users, and all that sort of stuff. And I made a quick skill for Pantacor, too,
so they can test the virtual keyboard, because they're having trouble bringing that up. I need to check in on how that's going. We did some little fixes for the timer skill UI, which I think is all good to go. And yeah, that's about it, I think. But based on the chats that just happened, I think my next focus is getting that mic visualization stuff out of the Mark II skill and into core, where it should be.

That visualization may not be completely worth it, considering how quick and small some of the... you know, it's causing problems. When I'm working with it, there are two things that appear for about half a second each, and it looks really busy. You say something and you get the listening bars for like half a second, then you get that spinning thing for like half a second. It's just a lot going on, considering the interaction is...

Yeah, we're talking about that too.

Yeah, Gez, if you look in the mic.py file, you will see where it's writing RMS to the disk. If you look at the Mark II system skill, you will see where it's writing RMS to the disk and then later reading it. So simply take out of the skill the writing and reading, the latching onto the mic and reading the RMS, and use the file written by the mic, and you'll be halfway there, or there.

Yeah. So there's been a little bit of work on it already, so I'm going to go back and check how they tried to change that, because I don't just want to move that to another place. We want to make sure that we're not attaching to the mic again.

No, I think you're missing what I'm saying. What I'm saying is that in the mic.py file it is writing the RMS to a file. The Mark II system skill is also writing it, to a different file from the one the mic writes.

Yeah, I heard what you said. There's been some work to remove that stuff already, but I think that they've essentially
just tried to move the current process to another place, as opposed to changing it so that it's not rereading it, you know, to do what you said. So that's what I need to go check.

Okay. Ken, does your fix work longer or better when you take that stuff out of the Mark II skill, or does that not make much difference?

So are you saying, if I put the log messages in there, does it blow out faster?

No, I'm saying if you take the Mark II skill's second read of the mic out of it.

No, no, because the core foundational issue is the one you and I found. I know from looking at Precise. The thing that got me onto it was that whenever it would hang, after Saturday afternoon, I would say, all right, I'm going to see, if I shut core down, is it hardware? So I'd run the Precise recognizer, a little test I ran using runner, and it would always work. So I knew it wasn't hardware; I knew it was inside core.

Yeah. All right, was that it, Gez? Derrick?

Hey. Okay, so yesterday I looked a little bit at the Mark II that I put together, just to see if I could get it to do anything with the latest image. I got it up once with the mic working and I haven't been able to get it to work since. I've tried many reboots and I've not got the mic to work again. So based on all this conversation, I think I'm going to wait till you guys have a working, reliable version before I mess around with that. So I got back to the 3D printed design. I got the fans from Kevin on Friday, and I needed the correct dimensions off the fan anyway, so I've got that and I've been adding it to the 3D printed design, adding flow holes on both sides and all that stuff. That's been most of my day. But I do have two SJ201s ready to fire up, so if they're any use for testing, let me know. I've got one put together, and I can put the
other one in that enclosure pretty quickly as well. Until then I'm just going to continue getting this 3D printed design done so we can put one together for project rollover when we have stable software.

Okay, great, thanks. I'll want to talk to you, Derrick, and Josh after this, just real quick, about how the fans should be assembled. There's just a little tweak due to the last-minute change of vendor. Okay, so Josh looks like he stepped out of frame, so I'll go next. I don't have anything to report on the dev side of things, except to let everyone know that, well, I already did this, but I'm going to be out most of this week starting on Wednesday. So I'll see you here for the dev things, but I don't expect to get much done outside of that. Okay, if Josh is around, do you have anything to add, Josh? We'll just go ahead and assume not, since he hasn't got his hardware working yet. All right, well, let's call it there for today and we'll talk again tomorrow. Thanks.
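
[Editor's sketch, not from the meeting.] The stuck read described above, where the caller still wants 912 bytes of a 1K block while the stream keeps reporting zero bytes available, could in principle be detected with a stall timeout instead of waiting for an exception that never arrives. The sketch below assumes a hypothetical stream object with `available()` and `read(n)` methods; it is not the PyAudio API and not the fix the team ultimately adopted. `StreamStalled`, `read_exact`, and the default timeouts are all illustrative names and values.

```python
import time


class StreamStalled(Exception):
    """Raised when the stream yields no data for longer than the stall timeout."""


def read_exact(stream, size, stall_timeout=1.0, poll_interval=0.01):
    """Read exactly `size` bytes from `stream`, or raise StreamStalled.

    `stream` is a hypothetical object exposing available() -> int and
    read(n) -> bytes. The loop gives up once no bytes have arrived for
    `stall_timeout` seconds, rather than spinning forever.
    """
    chunks = []
    remaining = size
    last_progress = time.monotonic()
    while remaining > 0:
        available = stream.available()
        data = stream.read(min(available, remaining)) if available > 0 else b""
        if data:
            chunks.append(data)
            remaining -= len(data)
            last_progress = time.monotonic()
            continue
        if time.monotonic() - last_progress > stall_timeout:
            raise StreamStalled(
                f"{remaining} of {size} bytes still pending, "
                "stream reports nothing available"
            )
        time.sleep(poll_interval)  # avoid a busy loop while waiting for data
    return b"".join(chunks)
```

A caller higher in the stack could catch `StreamStalled`, reopen the microphone, and restart listening. What happens to a request the user was in the middle of speaking is the open user-experience question raised in the meeting, and this sketch simply discards the partial block.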