We should be live. We'll be covering a couple of things: the fuzzy logic stuff that we're combining with the ECG labeling game, but before that we'll just warm up with the eye tracking business. We have it running locally now, so we'll do this one quickly just to warm up. We had GitHub Copilot fixing some minor bugs in the algorithm, so it's actually working reasonably well if you keep your eyes open and don't move around too much; the tracking works as well, that's the green line. We got all the code into GPT-4, and because we need some broader context for this, we also have two descriptions of the application. This is the more recent one; I'll just read it quickly. "This application is an innovative solution for remote eye tracking using a standard webcam. It's built on the Flask web framework with Python, integrating OpenCV for image processing and MediaPipe for precise facial landmark detection. Users can calibrate the system via a simple interface, aligning their gaze with specific points on the screen. The system is designed to be user-friendly, making it suitable for a wide range of applications, from telemedicine to cognitive research. It demonstrates reliable performance, comparable to specialized eye tracking equipment. Our application is a step forward in making eye tracking technology more accessible and versatile for remote use." Yeah, obviously it's guessing a bit there; hopefully we'll fix that eventually. And about the commercial side: it's obviously only an experimental prototype, so we'll make sure it says no commercial use, no medical use, and so on.

Let me just double check something with the stream quickly. I'm streaming to three different places. Yeah, I'm not sure what's going on; my streaming software says I'm still streaming to LinkedIn. LinkedIn is the worst one in terms of streaming anything onto it. Okay, I'll just keep going. Let me know if you can hear me okay or not; I'm trying to monitor my own sound and everything, but then it gets delayed.

Okay, so this is the Flask application. We're using Flask, and OpenCV (cv2), and that's working fine. For the eye tracking we're using MediaPipe, that's the main library, and NumPy as well. So we're reading those frames; I think GitHub Copilot was able to do this part for us quickly, and we added this line later. Well, there's an obvious difference between eye tracking and pupil tracking; that's a whole different ballgame. So we have the code in GPT-4 and we had some issues with it. We had an error, which it actually fixed, and then we improved the frame generation function as well, so it actually checks the frames now: before, if I moved out of view it used to just fail and drop out, whereas now it keeps going. I should have a printout of empty frames somewhere. It's still dropping out, though, and it's the same error; it doesn't seem to be able to sort this one out. Yeah, an empty array. Okay, we might tackle that one later. Obviously one eye is jumpier than the other. One good thing about GPT is that we can actually give it images, so let's do that quickly, take a few screenshots and pop them straight in. GitHub Copilot obviously cannot do this. I wish we could give it a video; I wonder if you can upload a video of it working.
So essentially we're trying to get it to fix this jittery business for us; might as well just talk to it clearly. So: we have some jitter in the pupil detection; the eye detection is working well, but the pupil is not being detected very well. Can we also add the ability to display the whole processing pipeline, starting with turning the image into grayscale, because the level of grayscaling might need tweaking? And can you explain how the pupil is currently being detected? You have all the code. Yep, several factors. It's pretty good at explaining these things, much better than a university lecturer, for example, who might have hundreds of students to take care of and also not be the most patient teacher.

Grayscale conversion: the region of interest (ROI) around the eye is converted to grayscale, which reduces the complexity of the image and helps in detecting the darker regions that represent the pupil. Yeah, we want to know exactly how that works. Like, can we actually view the grayscale image somehow, maybe not overlay it, but a button that provides an option to see the grayscale image? Hough circle transform: the adjusted grayscale image is used to detect circles that represent the pupil; it might not always find a circle, especially if the contrast is not optimal or if the pupil is not the most prominent circular feature in the ROI. Yeah, can we also visualize the Hough circle transform? Darkest point fallback: if no circle is found, the code falls back to locating the darkest point in the ROI, which is assumed to be the pupil. ROI size: make sure the size of the ROI is appropriate; if it's too large, it might include non-pupil areas, causing instability, and if it's too small, it might miss the pupil entirely when there's movement. Okay, with the ROI size, I can't tell whether the ROI is actually the blue square around the eye, because I'm not sure that's the case at the moment; please confirm. Contrast and brightness tuning: the parameters for cv2.convertScaleAbs might need to be tweaked; this might require some experimentation, and if the lighting is not consistent, you might need an adaptive approach that adjusts these parameters in real time based on the average brightness of the ROI. Hough circle parameters: the parameters for cv2.HoughCircles, especially param1 and param2 (the higher and lower thresholds for the Canny edge detector used by the transform), might need adjustment; also, the range for minRadius and maxRadius should closely match the expected size of the pupil. Smoothing: to reduce jitter, you can implement a temporal smoothing algorithm; for example, you can average the pupil's position over the last few frames to dampen the effect of any erratic changes. Yeah, I'm not sure about the temporal smoothing: will that reduce the detection quality, or the speed? Is it essentially suggesting to make the algorithm slower? (There's a small sketch of this idea right after this answer.) Validation: implement a check to see if the detected pupil position is reasonable compared to the last known position; if it's too far from where you expect it to be, you could discard it or recalculate it. To show the whole process, including the grayscale conversion, you might want to create a separate window that displays each step in your eye tracking application. This would involve adding cv2.imshow calls at various points in your process_frame function, for example after converting to grayscale, and before and after adjusting contrast and brightness.
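Since the plan is eventually to move all of this into the browser anyway, here's roughly what that smoothing and validation suggestion could look like in JavaScript. This is a minimal sketch, not the app's actual code; the function names, the history length and the jump threshold are all made up for illustration.

```javascript
// Minimal sketch of temporal smoothing + plausibility check for pupil positions.
// Assumes some detector returns {x, y} or null per frame; names and thresholds are illustrative.
const history = [];          // last few accepted pupil positions
const HISTORY_SIZE = 5;      // how many frames to average over
const MAX_JUMP = 40;         // max plausible movement (pixels) between frames

function smoothPupil(raw) {
  if (!raw) return average(history);           // nothing detected: reuse the recent average
  const last = history[history.length - 1];
  // Validation: discard detections that jump implausibly far from the last known position.
  if (last && Math.hypot(raw.x - last.x, raw.y - last.y) > MAX_JUMP) {
    return average(history);
  }
  history.push(raw);
  if (history.length > HISTORY_SIZE) history.shift();
  return average(history);
}

function average(points) {
  if (points.length === 0) return null;
  const sum = points.reduce((a, p) => ({ x: a.x + p.x, y: a.y + p.y }), { x: 0, y: 0 });
  return { x: sum.x / points.length, y: sum.y / points.length };
}
```

Averaging over five frames adds a couple of frames of lag at 30 fps, so it trades a bit of responsiveness for stability, which is presumably the "slower" worry above.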
Yes, displaying a second window sounds good. Can we... are you going to generate the code, or give us the prompts for GitHub Copilot to do it? Finally, ensure you're processing the images fast enough to keep up with the frame rate of the video; if processing lags behind the capture rate, this will inherently cause jitter, because the pupil's position will always be a few frames behind the actual position. Can we also display the current frame rate? And yeah, can you give an overall summary of the importance of the frame rate and what kind of camera should be used? Now, the other thing I would like you to touch on is the fact that we actually have two web cameras positioned side by side. Would it help if each camera did the detection for one eye? To visualize the grayscale image and the Hough circle transform, you can add functionality in your application that displays these steps in separate windows. Here's a conceptual breakdown of what you need to do. Grayscale visualization: after converting the ROI to grayscale, use cv2.imshow to create a window that shows this grayscale image. Yeah, might as well. It doesn't sound like it's going to generate any code, so we might as well get GitHub Copilot to do it. So where are we currently displaying the image? We're doing video capture with cv2; it's capturing frames, is it? I don't know what that does. Ah, right, because the actual video footage is displayed using something else. So the grayscale will be happening in eye_tracking.py, so we need another... Hough circle visualization: after running the Hough circle transform, draw the circles it detects on a copy of the grayscale ROI and display this in another window using cv2.imshow. Okay, we're doing this in which file? That's how much Python I know; I don't even know which file. So we obviously have our index.html. By the way, it currently has two descriptions of this application and both of them are pretty bad, so eventually we'll get GPT-4 to rewrite them. I think originally they were done by Copilot; that's why the quality is not great, because it's not looking at the overall context of the whole project, just at the individual files. You can get it to look at the whole context; for example, if I use the @workspace prompt, it should find where the code should be changed or edited. So it's determining the workspace structure, deciding which information to collect. Right, so in find_pupil, are we doing imshow in here? No, find_pupil is in the eye_tracking.py file. But what was the change? So it's creating two windows, one showing the grayscale and one showing the circles. I thought it could overlay them too: can the circles be overlaid over the grayscale in one image? Yeah, you see, now it's looking at the specific lines of code, so we were in the correct region of the code; that's fine. And a single window showing the grayscale ROI with the circles overlaid, with a call to cv2.waitKey(1). So is it waiting for a keyboard response or something, to update the window and handle user input? (cv2.waitKey(1) waits one millisecond for a key press and also lets OpenCV process its window events; imshow windows don't refresh without it.) Why is it returning the same stuff? By the way, if you haven't checked out bioniccares.com, please do so. All the current tools (the waveform feature extraction and detection, and the webcam eye tracking) will eventually come up on that site as well. Currently we're just testing them locally, but we'll try to share them with you as quickly as possible. It just has to do something useful; it doesn't have to be perfect.
And by the way, we're also looking at this dataset from the Child Mind Institute, the Healthy Brain Network; that's what the dataset is called. There are about a thousand recordings from a thousand patients, and the dataset is publicly available under a Creative Commons license, which is great for us, because this is what we do as well. Apparently I didn't download the correct file; it's just looking at one subject, but then it turned out it doesn't have the full phenotypic file, even though that doesn't make sense. Anyway, I'm still trying to figure it out. Back to this quickly, just trying to improve this eye tracking system. We're selecting find_pupil; might as well just comment it out, since that bit is not working at the moment, so I'm not actually using it, or not using it properly, right? And yes, when you update that file it gives a fatal error, which ideally it wouldn't, but for the moment we just have to rerun the application. And the window is a separate window, and it's not updating, and that's obviously running in Python. Yeah, so that's GitHub Copilot; unfortunately it doesn't have the full context for everything we're trying to do, so let's just go back to GPT-4. To confirm that the blue squares are indeed being drawn around the eyes as intended, you should be able to see these squares in the live video feed your application provides. If they're not visible, ensure that the coordinates and dimensions used to draw the rectangles with cv2.rectangle are correct and that they're being drawn on the frame that is being displayed. This is actually quite important. So let's check the blue rectangles; they are not the ROI, are they? It actually seems to be jumping around too fast at the moment. To display the current frame rate, you can calculate the number of frames processed in one second and display this on the video feed using cv2.putText. Yeah, we do want to do that. Because there's a chance I'm not in the correct file, I'll just use @workspace: I want to display the frame rate on the footage, in generate_frames. Yeah, I was in the wrong file. This displays the frame rate in the top left corner of the video feed, updated each frame. And that's the thing, it will be different from the frame rate of the camera. Double check that we don't have any placeholders in that function; doesn't look like it, right? So we have the frame rate, which is good, 30, 31, whatever. Yeah, this one should be showing us the grayscale ROI, and that's wrong, because first of all we don't want to display it in Python, we want to display it on the front page. It keeps opening that bit; get rid of that, go back to the previous file and the find_pupil function, rerun the application, do that bit again. The frame rate is crucial in eye tracking because it dictates how smooth the tracking will be and how quickly the system can respond to movement; higher frame rates capture more data points and provide a smoother tracking experience. For eye tracking, a camera that can provide a consistent 30 fps or higher is usually recommended. Yeah, we're using 30, though it's dropping down to 27 or something. Using two web cameras positioned side by side could potentially improve detection by having each camera focus on one eye. This could provide more data and reduce occlusion, but would also double the data processing requirements.
The algorithm would need to be adapted to handle input from two cameras and merge the data accurately. Yeah, we might do that as a separate tool eventually, like a tool update; dealing with one camera is difficult enough. For the visualization features you want to implement, you can modify the process_frame function in your Python code, for example to display the grayscale image. So where am I actually doing that? And for displaying the circles detected by the Hough circle transform. So what did it do? No, it wasn't doing that; it was plotting a separate window. Remember to update the code where necessary to include these changes, and ensure that cv2.waitKey and cv2.destroyAllWindows are in the appropriate places in your code to handle the display windows correctly. Lastly, regarding GitHub Copilot: it can certainly help generate code snippets based on your prompts. However, you'll need to feed it specific tasks or issues you're facing, and it will provide suggestions based on its training. It's a powerful tool, but remember that its suggestions should still be reviewed and tested by a human programmer. Now, that's fine. So currently the suggestion to use cv2.imshow is fine, but I do want to display the extra window on the front end, so the end user can see how the pupil detection is actually working. Also, any idea why my FPS dropped from 30 to 15? That would be good to know. Yes, I suspect we'll have to transfer more data to the front end, which is fine. I actually want everything to be done on the front end anyway; that would be great, because then I don't have to use as much electricity. I want to do it in this main.js, because it's currently only 50 lines of code, as opposed to 85 in eye_tracking.py and another 70 in the Flask application. That might need restructuring of the whole codebase, so let's see how we go. Displaying the pupil detection process on the front end for the user to see is a bit more complex than just popping up a window with cv2.imshow, which only displays on the server side. To achieve this in a web application, you would need to send the processed frames back to the client browser. Yeah, this is not ideal, is it? I want to do the whole processing in JavaScript; let me finish reading this quickly. Here's a high-level overview of how you might adjust your Flask app to provide this functionality. Server-side processing: modify the process_frame function to generate the grayscale and Hough circle visualizations as image frames. Client-side display: send these frames to the client side via your existing Flask routes; you could use a similar approach to how you're sending the webcam feed, but have separate endpoints for the grayscale image and the Hough circle visualization. Front-end JavaScript: on the client side, use JavaScript to handle these additional streams and display them in <img> tags or <canvas> elements. Toggle button: implement a button on the front end that allows users to toggle these visualizations on and off (there's a small sketch of that toggle just below). I wasn't actually listening; I did suggest adding a button which toggles between the real view and the grayscale with the Hough circle visualization. I'm thinking of restructuring the whole thing so all the image processing happens in JavaScript. I mean, this thing is not working very well as it is; unless I do this, then it's working better. Yeah, we want to see the grayscale; that stuff might need tweaking. Now it's back to 30 fps.
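Just to make that toggle concrete, here's roughly what it could look like on the front end, assuming the Flask side streams MJPEG at something like /video_feed and /grayscale_feed; the element IDs and the endpoint names are hypothetical, this is a sketch rather than the app's actual code.

```javascript
// Sketch of a front-end toggle between the normal feed and the grayscale/Hough view.
// Assumes the page has <img id="feed"> and <button id="toggle">, and that the Flask
// app exposes the two (hypothetical) MJPEG endpoints below.
const FEEDS = ['/video_feed', '/grayscale_feed'];
let current = 0;

document.getElementById('toggle').addEventListener('click', () => {
  current = (current + 1) % FEEDS.length;
  document.getElementById('feed').src = FEEDS[current];  // swap the streamed image source
});
```

Swapping the src of an <img> that points at an MJPEG route is about the simplest way to switch views without touching the server.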
So obviously we could get 60 fps, or use two cameras. I mean, we could get a GoPro or something; I don't know what its maximum frames per second would be, a thousand or so. Anyway. Increased processing load: if you've added more processing steps, like the pupil detection visualizations, this can significantly increase the computational load and decrease the frame rate. Network bandwidth: if you're streaming video, network bandwidth can be a bottleneck; if the bandwidth is insufficient for the stream's data rate, frames will be dropped to compensate. So how does it work currently? Yeah, currently it's not great, because you would be streaming the footage to the website. In comparison, if you look at the hand gesture thing (let's close this for a sec), that one is doing everything in JavaScript, the whole detection: long ones, short ones, turned into music. That's what a head scratch sounds like, musically. The good thing about that one, which is already up there so you can try it out, is that when you use it I don't use any electricity, because it's all JavaScript, and I also don't have to post anything on GitHub, because the JavaScript is just there in the browser, so you can download the source code anyway. So probably we'll try and do the eye tracking in the same fashion. Do try that one out, especially because it has some advertisement, so you'll be supporting the project. Okay, back to this. We might need to restructure the whole thing. Originally we told it to do a Flask application; that's why it's heavily relying on Python and on generating frames. Yeah, there was some error business when there was no frame, well, no person, no eyes in the footage; that was failing. Yeah, what's it giving me now? Something to do with the camera; one camera is not being used. No, it's actually working; I don't know what all this is. Well, right. The other thing is, when you use JavaScript alone, I'm relying on your hardware, so I can't guarantee it's going to work at all on your machine, because I'm running a GPU and everything; I don't know how quickly, if at all, it will work in your browser on your machine. Like the music generation: it doesn't work well on mobile. So I'd have to use the equivalent JavaScript libraries for running this, and I'm not sure if it's going to work at all. Ideally I'll find some sort of balance between back end and front end. Let me know what you think; this session is meant to be interactive, by the way. There's an option to chat, so put your comments in the chat and I'll respond. There's even an option to bring you live on air, well, "on air", there's some five to ten second delay, but you know what I mean, via Zoom. Back to development and ChatGPT. It re-reads the same answer from before: displaying the pupil detection process on the front end is more complex than popping up a cv2.imshow window, which only displays on the server side; you need to send the processed frames back to the client's browser via your existing Flask routes, with separate endpoints and a toggle button.
Regarding the drop in FPS, here are a few potential causes: the increased processing load and network bandwidth issues it already mentioned, plus hardware limitations (the hardware on the server side, where the image processing takes place, may not be able to handle the increased load, leading to a decrease in FPS), camera settings (some webcams adjust their frame rate based on the lighting conditions to maintain exposure quality; low light can cause the camera to reduce the frame rate to let more light in per frame), and thermal throttling (if the device capturing or processing the video overheats, it might throttle performance to cool down, which could decrease FPS). To pinpoint the issue, you will need to monitor the performance metrics of your server during operation: look at CPU and memory usage, check for any errors or warnings in the logs, and if possible profile your code to see where bottlenecks might be occurring. It's also worth checking the network throughput if the FPS drop coincides with a higher network load. Is it possible to move as much of the processing as possible to the front end? Is there an equivalent JavaScript library that can do eye tracking? We would ideally like to display the output, the result of the eye tracker, and all the processing stages (grayscale, Hough transform, and so on) on the front page, in the HTML. It's not a good sign when it's thinking this long. TensorFlow might be overkill for this. Moving processing to the front end can definitely help distribute the computational load and may improve the performance of your application. For eye tracking in a browser, you can use JavaScript libraries that are designed to work with facial and eye detection. One popular library is TensorFlow.js, with models like BlazeFace for face detection, which you can use as a starting point for eye tracking. Yeah, because I don't think MediaPipe does pupil tracking anyway. This eye tracking module, where is it even coming from? Well, right, that would be the MediaPipe stuff. No, sorry, that's how little Python I know: that's importing eye_tracking.py, and eye_tracking.py uses MediaPipe. And MediaPipe just detects the eyes, not the pupils. Yes, we don't want to just display it with imshow, because that would be only on the back end; it wouldn't be accessible to you.
Yeah, this is a very high-level approach; it doesn't have any code. Face detection: use a pre-trained TensorFlow.js model to detect the face in a video stream from the webcam. Eye detection: once you have the face landmarks, you can locate the eyes and create a region of interest (ROI) around them. Grayscale conversion: convert the ROI to grayscale using the JavaScript Canvas API. Yeah, probably something like that exists, but it would be really good for us to display the whole processing, ideally in the same window; if another window is necessary, that should be fine too. Pupil detection: use image processing techniques to detect the pupil within the ROI; you might have to implement the logic for this manually, as there may not be a ready-made JavaScript library for pupil detection that's as straightforward as OpenCV's Hough circle transform. I'm wondering if it should still be a Flask application, because eventually we might do something on the back end, but yes, rely on JavaScript as much as possible. Visualization: draw the detected eyes and pupils on the canvas and update it in real time as the video stream is processed. Optimization: since JavaScript runs in the client's browser, you need to ensure that the code is optimized to run efficiently without causing lag in the user interface. (There's a rough sketch of the first few of these steps right after this bit.) Well, right, that's kind of not up to me. I'll be testing it on my machine, which is a beast (a 3090 Ti GPU, an Intel i9 and so on), but I could also test it on a standard laptop as well; that's about as much as I can do. You'll have to let me know how it works for you. To display the output and processing stages on the HTML page, you'd set up multiple <canvas> or <video> elements that are updated with the various stages of processing: original video, grayscale image, eye detection bounding boxes, etc. Keep in mind that while moving processing to the front end can reduce server load, it also places more demands on the client's device; this approach assumes that the end users have sufficiently powerful devices to handle the real-time processing. You would then have a JavaScript file, the eye tracking JS, that handles the TensorFlow.js model loading, face detection, eye detection, grayscale conversion and pupil detection. This file would also be responsible for updating the output canvas with the visual results of these processing stages. Yeah, I like this approach better, considering I'll be paying less for electricity. Remember to check the privacy implications of processing biometric data like eye movements on the client's side, as well as ensuring that your application is compliant with any relevant data protection regulations.
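To make that high-level plan a bit more concrete, here's a minimal sketch of the face detection, eye ROI and grayscale steps. It assumes the TensorFlow.js and BlazeFace scripts are already loaded on the page; the element IDs and the 60x40 eye box are placeholders, not anything the app actually uses yet.

```javascript
// Minimal sketch: BlazeFace face detection, eye landmarks, and grayscale eye crops via canvas.
async function runEyeTracking() {
  const video = document.getElementById('webcam');
  const canvas = document.getElementById('output');
  const ctx = canvas.getContext('2d');
  const model = await blazeface.load();

  async function frame() {
    // Keep the canvas at the video's native resolution so landmark coordinates line up.
    canvas.width = video.videoWidth;
    canvas.height = video.videoHeight;
    ctx.drawImage(video, 0, 0);

    const faces = await model.estimateFaces(video, false);
    if (faces.length > 0) {
      // BlazeFace landmarks are points: [rightEye, leftEye, nose, mouth, rightEar, leftEar].
      const [rightEye, leftEye] = faces[0].landmarks;
      for (const [x, y] of [rightEye, leftEye]) {
        ctx.strokeStyle = 'blue';
        ctx.strokeRect(x - 30, y - 20, 60, 40);          // eye bounding box (placeholder size)
        const roi = grayscaleRoi(video, x - 30, y - 20, 60, 40);
        // `roi` is the grayscale eye crop the pupil step would work on.
      }
    }
    requestAnimationFrame(frame);
  }
  frame();
}

// Crop a region from the video and convert it to grayscale with the Canvas API.
function grayscaleRoi(video, x, y, w, h) {
  const c = document.createElement('canvas');
  c.width = w;
  c.height = h;
  const cctx = c.getContext('2d');
  cctx.drawImage(video, x, y, w, h, 0, 0, w, h);
  const img = cctx.getImageData(0, 0, w, h);
  for (let i = 0; i < img.data.length; i += 4) {
    const gray = 0.299 * img.data[i] + 0.587 * img.data[i + 1] + 0.114 * img.data[i + 2];
    img.data[i] = img.data[i + 1] = img.data[i + 2] = gray;
  }
  return img;
}
```

This only gets you as far as a grayscale eye crop; the pupil step (the darkest-point idea or something better) would sit on top of that, which is what comes up further down.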
All right, so on that last point: with the current Flask application, does it mean we're sending the video footage to the server? Please confirm, because that's not ideal if it's the case. Now, I'm not sure if we should start fresh; we have another version of this application. Shall we still keep it as a Flask application, just in case we do need to do something on the server? However, rely mainly on JavaScript code. Now, either way, can you provide the tracking JS file, the full HTML code and the updated Python code as well, and let us know whether you decided to start from scratch with a new application or to modify the current one. Don't forget displaying the FPS on the video footage. Also, ideally we would like to do everything in one window, but if two windows are necessary for displaying the original footage and the grayscale with the Hough circle transform, then go for it. By the way, I don't mind you being a little bit funny if you feel like it. Yes, in your current Flask application setup, the video footage is being sent from the client browser to the server for processing. This isn't ideal if you're dealing with bandwidth issues or privacy concerns, since raw video data is quite sensitive. Yeah, we don't want to do that; we don't want to send it. So we definitely get rid of the current Flask approach: the way it works now, it sends the video footage onto the cloud, onto my server, and we do not want that. Keeping the Flask app and adding JavaScript for the heavy lifting of eye tracking can give us the best of both worlds: we get to keep the server for any necessary back-end tasks while offloading the real-time, intensive processing to the client side. Yeah, that sounds better. The tracking.js file will handle the capturing of video from the webcam, face detection, eye detection and pupil tracking. Unfortunately, I can't whip up an entire library's worth of code right here, but I can guide you on the structure. I'm sure it would have no trouble generating the whole code, but yes, we'll use GitHub Copilot too, considering we only have those 40 messages per three hours; we don't want to overload GPT. So it's kind of good, what it's provided; ah yeah, the video footage from the camera. Okay, let's close this one for a second. By the way, if you haven't checked out Bionic Chaos yet, please do so, and don't forget to provide your feedback. Everything we're currently doing will eventually go on that site as well. You can display the original footage and the grayscale with the Hough circle transform (or its JavaScript equivalent) side by side in a single <canvas> element by dividing it into regions for each visual output; there's a small sketch of that at the end of this bit. Yeah, we'll actually come back to the whole Bionic Chaos project. What are we calling this one? CardiQuest Bot is the other application we were meant to be publishing already, but we got stuck on that one as well. This one is the eye tracking. They will both be Flask applications, but for this one we'll make a copy of the front end, so that's the old one, so we know which is the old one, and we'll keep developing the new one. Okay, so eye tracking. Yes, instead of the Python code... yeah, we don't have the calibration procedures yet. I'll just do this template, check the templates so that we still keep the original code; this one will be the new one. Okay, we'll have a tracking.js, and that will just be the main JS. It gets confused eventually with those files and just keeps copy-pasting text from the bottom. It should be loading from the URL; let's leave it for a second. The HTML and the Flask application are now much simpler: it's just Flask, it's not doing the eye tracking. If your current Flask app doesn't have a ton of dependencies or complex server-side logic, you can start fresh with a focus on front-end processing; however, if you've already got infrastructure set up that you think might be useful, like user management, data storage, etc., then modifying the existing app is the way to go.
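On the side-by-side idea, a minimal version could look like this; the half-and-half layout and the use of the canvas filter property are just assumptions for illustration.

```javascript
// Sketch: raw video on the left half of one canvas, a grayscale copy on the right half.
function drawSideBySide(video, canvas) {
  const ctx = canvas.getContext('2d');
  const w = Math.floor(canvas.width / 2);
  const h = canvas.height;
  ctx.filter = 'none';
  ctx.drawImage(video, 0, 0, w, h);   // left region: original footage
  ctx.filter = 'grayscale(100%)';     // needs a reasonably recent browser;
  ctx.drawImage(video, w, 0, w, h);   // otherwise convert the pixels manually as in the earlier sketch
  ctx.filter = 'none';
  // Any Hough-style circles or boxes would then be stroked on top of the right-hand region.
}
```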
Someone told me my camera froze; it should be better now. I have both of them running. Let me check the sound as well; I have to keep talking so I can monitor my microphone. What's using the second camera? Something is using the second camera. This is a job for our JavaScript file. It will keep track of how many frames are processed in a second and display it, probably in a corner of the canvas, like a stealthy ninja, not wanting to disturb the view but always present. (There's a little sketch of an FPS counter like that just below.) Okay, it has this in the code. How do I actually get Copilot to start developing? For simplicity and a seamless user experience, we aim to keep everything in one window. However, if it turns out we need a pop-out window to properly show our grayscale wizardry, we won't shy away from it; after all, sometimes you need to open a new window to let some fresh air in. You're actually funny. Not too funny, but just funny. In conclusion, you're now equipped with a blueprint for turning your application into a front-end powerhouse, with a little server-side magic kept in your back pocket for when you need it. Keep the server lean and mean, ready for action, and let the client's browser do the heavy eye lifting. Time to get coding. Okay, can you list all the files we need, all the folders and the project structure, and can you provide the whole script for each file separately? I don't mind you leaving templates for GitHub Copilot to fill out, but I still need you to generate the whole code. Also, hopefully it will work better than what we had in the Python Flask application, which we won't be using anymore. So I assume I don't need the eye tracking Python file. And do I need a separate main.js, or is tracking.js the main and only JavaScript file? Yep, I should have also asked for the description on the HTML page. Absolutely. Let's sketch out the directory structure and the corresponding scripts for each file; we'll keep it neat and organized for ease of development and maintenance. Yeah, so that's what we had before, and this is the new structure. We'll have one JavaScript file and the requirements file. Yeah, we have this info file too, but that's okay, it doesn't know about it; call it whatever, it doesn't matter for now. Here's an outline of the content for each file. We can get rid of this one so we don't have a main.js; we're getting rid of this one as well. So we have the HTML, tracking.js and app.py. Hopefully this one will just remain mostly empty, because I don't actually need any data to be sent to my server; I want you to be able to do the eye tracking in your browser and keep your data secure on your own machine. That sounds like a good option. So the requirements file just has Flask. Why is it a different version to what we had before? So we won't be using MediaPipe and opencv-python anymore; check the Python code, it should be as simple as possible. The index.html should be straightforward as well, using these two libraries. And this is the tracking.js: the face detection, drawing the detections, the video element. tracking.js will be your main and only JavaScript file, containing the eye tracking logic. That's good, so we got rid of the main.js file. Since we're moving the processing to the client side, you don't need the eye_tracking.py file anymore. Yeah, we got rid of both.
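For that FPS counter, a minimal client-side version could look like this; the styling and the idea of drawing it straight onto the output canvas are assumptions, not the generated code.

```javascript
// Sketch of a simple FPS counter drawn in a corner of the output canvas.
let frameCount = 0;
let fps = 0;
let lastTick = performance.now();

function drawFps(ctx) {
  frameCount++;
  const now = performance.now();
  if (now - lastTick >= 1000) {   // once per second, take the frames counted in that second
    fps = frameCount;
    frameCount = 0;
    lastTick = now;
  }
  ctx.font = '16px monospace';
  ctx.fillStyle = 'lime';
  ctx.fillText(`${fps} FPS`, 8, 20);  // top-left corner, stealthy-ninja style
}
```

Calling drawFps(ctx) once per processed frame (for example at the end of the detection loop) gives the processing rate, which, as noted earlier, can differ from the camera's own frame rate.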
Make sure you have TensorFlow.js and any other required libraries included in your HTML file, as shown in the index.html. You might need additional libraries or scripts depending on how complex your eye tracking logic gets and what features you decide to implement. Make sure these files are not made up and that they actually exist. The code is copyright Google; that's okay, we're essentially all working for Google now. This setup assumes a simplified model where all the heavy processing is done on the front end; the Flask server's role is minimal, mainly serving the static files and templates. Remember that implementing a full-fledged eye tracking system in JavaScript can be quite complex and may require additional libraries and a lot of fine tuning. Access to use a webcam, why? Copilot, just do it. GitHub Copilot... I'm basically complaining about one robot to another robot. Okay GPT-4, can you help me? GitHub Copilot is misbehaving; it does not want to generate the code. Can you do something? Just generate the functions yourself, like streaming the video to the front end and so on. Use a model to detect faces and find the eyes within the face, then draw the detections on the canvas. This could be a bounding box for the eye region; maybe make sure it's not too large. And then we're missing a bunch of functions: we should have had the Hough transform that finds the circles, or you're suggesting a different way of doing it. So that's the overall thing; does it work at all? It works, sort of. Now I also have the camera being taken over; there should be settings for this, there should be a camera icon, but I don't see it. I know it's being taken over by something. I think we're getting stuck with this one; we might move on to the fuzzy logic stuff, because we won't be making much progress here. All right, it sounds like GitHub Copilot might be acting a bit like a stubborn robot, so let's roll up our sleeves and draft some code to get you moving. I'll provide you with more fleshed-out functions for the tasks you've described. Remember that these will be high-level functions and you'll need to fine-tune them to fit your application perfectly. So we have... yeah, TensorFlow.js and BlazeFace; that's a good name. Creating a model... sorry, loading a model, not creating a model. This should definitely be better than what we currently have. Yeah, GitHub Copilot just didn't want to generate the code, I don't know why. The Hough circle transform specifically is not available in TensorFlow.js or BlazeFace. However, you can attempt to detect the pupils using a different approach, like analyzing the grayscale values within the eye regions and looking for dark circles; this would involve custom code and a good understanding of image processing techniques in JavaScript. Surely there is some JavaScript library or something for detecting the pupil; I mean, even if there's no Hough circle transform, surely a grayscale transform is available. So currently we are starting the video stream, detecting the face, extracting the eye regions and drawing the eyes. Okay, that's not enough, is it? Yeah, we need to detect the pupil. Are you going to generate the whole function, detectPupil(eyeRegion)? Let's see if Copilot woke up. Let's try to generate this function. Okay, GitHub Copilot is actually doing this one, and it returned a darkest point. The function creates a canvas, draws the eye region on it, converts the image to grayscale, then iterates over the pixel data to find the darkest point and returns it as an {x, y} coordinate object, something along the lines of the sketch below. We're not actually using it at the moment.
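Roughly, that darkest-point heuristic looks like this. It's a sketch of the idea rather than the exact code Copilot produced, and it assumes the eye region arrives as a small canvas (or image) already cropped to the eye.

```javascript
// Sketch of the darkest-point pupil heuristic: grayscale the eye crop, return the darkest pixel.
// `eyeRegion` is assumed to be a canvas (or image) already cropped to the eye.
function detectPupil(eyeRegion) {
  const c = document.createElement('canvas');
  c.width = eyeRegion.width;
  c.height = eyeRegion.height;
  const ctx = c.getContext('2d');
  ctx.drawImage(eyeRegion, 0, 0);
  const { data, width, height } = ctx.getImageData(0, 0, c.width, c.height);

  let darkest = { x: 0, y: 0, value: 255 };
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      const i = (y * width + x) * 4;
      const gray = (data[i] + data[i + 1] + data[i + 2]) / 3;  // grayscale value of this pixel
      if (gray < darkest.value) darkest = { x, y, value: gray };
    }
  }
  return { x: darkest.x, y: darkest.y };  // coordinates relative to the eye crop
}
```

As comes up a bit later, the darkest pixel isn't always the pupil (shadows, eyelashes and reflections can win), which is where the thresholding and smoothing ideas come back in.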
That's interesting, but it's not currently being used, is it? It's not used. I popped the whole code into GPT, so now they compete with one another; let me turn the text-to-speech off for a sec. So Copilot generated... it changed the detectFaces function. Sometimes it's actually quite handy when it provides links, though not at the moment. So the detectFaces function... sorry, the detectPupil function is not currently being used in the provided code. Good; call it in the detectFaces function after you've extracted the eye regions. Here's how you could modify detectFaces. And what did GPT say? The code you've shared seems to be a good start for setting up the face and eye detection using TensorFlow.js and BlazeFace on the front end. What we need to do next is integrate the detectPupil function properly into the flow, as well as handle the grayscale transformation and draw everything onto the canvas. Here's a more complete example of how you might modify the detectFaces function to include pupil detection. Okay, so let's compare the different functions. I kind of trust GPT-4 more than Copilot, but I'm happy to try both. Both eventually draw the left and right pupils; the GPT-4 code makes the predictions as well, awaiting the model's estimateFaces. The Copilot code is better. Currently I don't have anything running, because we decided not to continue with the Flask application we had, so we're essentially rewriting the whole lot from scratch. So we definitely need that, and it's added another function, similar to what GPT-4 did before; it has drawFace, which Copilot didn't suggest at all. Now, the other thing is we also have a problem: we have two webcams connected to our machine, one is being used legitimately and the other one is being hogged by something. Let me use GPT-3.5 for a sec. On Ubuntu, I have two webcams connected; one is being used, legit, and the other one is being hogged by something else. Can I check what it is and potentially reset both cameras? If you want to reset both cameras, you can try restarting the webcam services or unloading and reloading the webcam kernel modules. Here's how: replace <module_name> with the name of the webcam kernel module; you can find the module name by running lsmod | grep video. It seems like the webcam kernel modules are loaded properly. The uvcvideo module is responsible for handling USB Video Class (UVC) compliant devices, which includes most webcams. Since both webcams are detected and the necessary modules are loaded, you can try restarting the webcam service to see if it resolves the issue with the second webcam. If the second webcam still doesn't work after restarting the service, you may want to check whether any other applications or processes are using it. If you identify any processes using the webcam, you can terminate them or close the applications to free up the webcam for further use. Additionally, make sure that the second webcam is properly connected and not being used by any other hardware or software. First, let's try restarting the udev service, which manages device events. Then you can try reloading the uvcvideo module, which handles the webcam.
These commands will unload and reload the uvcvideo module, essentially resetting the webcam driver. After running these commands, check whether both webcams are recognized and available for use. If the second webcam still doesn't work, you may need to investigate further to identify any potential conflicts or issues with the hardware or software configuration. It seems that the uvcvideo module is currently in use, which prevents it from being unloaded and reloaded. In this case, you can try stopping any processes or applications that might be using the webcam before attempting to reload the module again. Which, when both cameras have the same name, doesn't help, does it? Yeah, one of the cameras just got stuck. Okay, that doesn't work. I get this error in the shell script as well. I have an event listener now... pretty sure we're doing this already, aren't we? Oops, don't actually do that; that's the main function. I don't have an event listener. Still don't have an event listener. Do we need an event listener here in this function? Okay, can you modify the function as needed? What's an ESP32-CAM module? It must have copied that from the chat. There's something wrong with my camera in general. So it works... no, that's the frozen one that I have in the stream. IoT... yeah, run away from anything called IoT, and we'd need a server for it. What? We want to do it in JavaScript only, so that doesn't seem right. I want to do it in a simple way, kind of similar to what we had here. Is this working? Yeah, so this video is brought to you by... well, no, it's not sponsored. So we close that. It should be as simple as this, and that one, JavaScript alone, works okay, so I don't know why this one doesn't. That's 3.5; that's why this window is so funny. Anything wrong with this code? We do not have the video footage displayed. Can you also check the HTML and the Python code? I see you're looking to get everything tied up nicely. Let's walk through each part of your code to ensure they're all set to work together harmoniously. Your HTML is mostly set, but I notice you're missing a <video> element to display the webcam stream; you need to add it so that the video element in your JavaScript has a source to play from. The autoplay and playsinline attributes ensure that the video plays right away and inline on the web page, respectively. The Python code looks good: it's serving the single HTML page you have and should be working fine. Just make sure you have the static folders set up correctly in your Flask project structure so the tracking.js file can be served without issues. There are a few adjustments needed in your JavaScript code. The startVideoStream function is not called anywhere after the model loads; you should call startVideoStream within the main function before setting the interval for detectFaces. The detectPupil function seems to be attempting to draw a non-existent eye region; you need to get the actual pixel data from the video feed, which would involve creating a new canvas element or drawing on an existing canvas, then cropping to the eye region based on the landmarks. The pupil detection logic is sound in theory, but it does not account for the context of the eyes within the entire video frame; you need to calculate the exact coordinates of the eye regions within the full frame to crop them accurately. Let's fix the call to startVideoStream in the main function.
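A minimal sketch of that wiring, assuming the generated code already defines detectFaces roughly as discussed; the element ID and the 100 ms interval are just placeholders.

```javascript
// Sketch: make sure the webcam stream is actually running before the detection loop starts.
async function startVideoStream(video) {
  const stream = await navigator.mediaDevices.getUserMedia({ video: true });
  video.srcObject = stream;
  await video.play();
}

async function main() {
  const video = document.getElementById('webcam');    // assumed <video autoplay playsinline>
  const model = await blazeface.load();
  await startVideoStream(video);                       // previously this was never called
  setInterval(() => detectFaces(model, video), 100);   // run detection roughly 10 times a second
}

main();
```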
Remember, pupil detection through the darkest-point method is a simplified approach and may not be robust in various lighting conditions or if the user is wearing glasses. In a real-world application, you would likely need a more sophisticated method, possibly involving machine learning or more complex image processing techniques. Finally, make sure your web server is set up to serve the static folder where tracking.js resides; this is typically done by setting the static_folder parameter in your Flask app or by placing your static files in the default static directory. Okay, these are my eyes. Funny, they move in and out. Okay, so obviously there's no pupil detection. What did it say about the pupil detection? I wasn't listening. And a quick fix for the detectPupil function, to clarify the eye region; then it repeats the caveat about the darkest-point method being a simplified approach. Okay, it's already doing this, isn't it? So what's wrong with this function and how do you suggest changing it? It does return the x, y coordinates, and the brightness level as well. Can you explain? The detectPupil function you've provided is based on a reasonable approach: find the darkest point in an image, which is assumed to be the pupil. The function, however, has a few issues that need to be addressed. Eye region extraction: the eyeRegion parameter appears to be a placeholder; for this function to work, eyeRegion should be an actual image or canvas context of the eye, so you need to extract the eye region from the video feed first, which involves more steps than just passing a parameter. Canvas manipulation: the function creates a new canvas element every time it's called, which is not efficient, especially if it's being called frequently in a loop; it's better to create a single canvas element and reuse it. Grayscale conversion: the grayscale conversion loop is correct, but remember that this alone does not find the pupil; it's just the first step in processing the image to make the pupil, usually the darkest part, stand out more clearly. Darkest point calculation: the function iterates through every pixel to find the darkest one, which is an intensive operation, and it's important to note that the darkest point may not always be the pupil, especially if there are shadows or other artifacts in the image. Here is a revised version of the function with some conceptual fixes (something along the lines of the sketch below). To use this function, you call it with the eye coordinates extracted from the landmarks provided by the face detection model and the video element from which the frames are being captured. A few things to keep in mind: this function now assumes that ctx is the context of a canvas element that has the same dimensions as the video element; this canvas is used for image processing and is not necessarily the same canvas where you draw the tracking results. The darkest point detection might still pick up false positives if there are reflections or shadows; more sophisticated methods might involve thresholding the image to isolate the pupil, or even machine learning-based approaches. Performance can be an issue, especially if you're processing high-resolution video; you may need to downscale the image for analysis or use web workers to process the data without freezing the main thread.
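The revised function it's describing would look roughly like this; it's a sketch under the stated assumptions (the eye box comes from the face landmarks, and there's one reusable off-screen canvas), not the exact code from the chat.

```javascript
// Sketch of the revised detectPupil: crop the eye box out of the live video frame
// using one reusable off-screen canvas, then run the darkest-point search on that crop.
const workCanvas = document.createElement('canvas');   // created once and reused every frame
const workCtx = workCanvas.getContext('2d');

function detectPupilFromVideo(eyeBox, video) {
  // eyeBox is assumed to be {x, y, width, height} in video-frame coordinates.
  workCanvas.width = eyeBox.width;
  workCanvas.height = eyeBox.height;
  workCtx.drawImage(
    video,
    eyeBox.x, eyeBox.y, eyeBox.width, eyeBox.height,   // source rectangle in the video frame
    0, 0, eyeBox.width, eyeBox.height                  // destination rectangle in the work canvas
  );
  const pupil = detectPupil(workCanvas);               // darkest-point sketch from earlier
  if (!pupil) return null;
  // Convert from crop-local coordinates back to full-frame coordinates for drawing.
  return { x: eyeBox.x + pupil.x, y: eyeBox.y + pupil.y };
}
```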
It's loading forever as well. It's not much different, except that one eye is really jumpy now. Am I calling it correctly? The video element... I'm not using the video element, am I? Because previously we didn't have a video element. Can you fix the detectPupil call, the way this function is being called? It has a different variable as an input. Let's see if GitHub Copilot can do it. We're talking about detectFaces; we need to update how the detectPupil function is being called. Are we passing in the model? No, it should be the video element. Oops, am I calling it with model? It should be the video element; let me sort that out quickly. It's loading forever. It's a bit funny like that, isn't it? What's it trying to do? It's bad. The good thing about GPT as opposed to Copilot is that we can bombard it with stuff: we can add an image, which it will surely not ignore, and we can ask it to fix the following code, which it will surely do perfectly well. There is an issue with how the detectPupil function is being called now, after it was changed; can we fix that? And any other suggestions to improve the pupil and eye detector, based on the image I just uploaded. There are two images uploaded, please check, and make any suggestions as to how to improve the code. We're going in circles with this one. My camera... no, my camera is still frozen. Based on the images you've uploaded and the code snippets you shared, let's address the issues and suggest improvements for the eye and pupil detection process. Function call: the detectPupil function is being called with the parameters leftEye and model, but it should be called with leftEye and videoElement; the model parameter is not used in detectPupil. Parameter mismatch: the eye parameter in detectPupil is expected to be an object with properties x, y, width and height, but the extractEyeRegions function is returning an array of landmark points; these points need to be processed to create a bounding box for the eyes. Darkest point detection: the current method assumes that the pupil is the darkest point in the eye region, which might not always be true due to shadows, reflections or lighting conditions. Modify the extractEyeRegions function to calculate the bounding box around the eyes, and implement a more robust method for pupil detection, potentially involving thresholding or other image processing techniques. Here is how you might adjust the detectFaces function and the detectPupil call (the bounding-box part is sketched just after this). Yeah, one thing is for sure: Copilot generates code much faster, but it doesn't always generate working code, so I prefer slower over rubbish. That seems to have corrected the issue in the image. Considering the images, it appears there may be some inaccuracies in the detection, possibly due to the bounding box size or the assumption that the darkest point is the pupil. Here are some improvements you can make. Dynamic thresholding: instead of looking for the darkest point, apply a dynamic threshold that adapts to lighting conditions to better isolate the pupil. Morphological operations: use operations such as erosion and dilation to remove noise and small reflections that might be confused with the pupil. Region of interest adjustment: fine-tune the size of the bounding box around the eyes; it should be large enough to include the whole eye, but not so large that it includes other features that could be mistaken for the pupil.
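For the bounding-box part, a minimal sketch could be as simple as this, assuming BlazeFace-style landmarks where each eye is a single [x, y] point; the box size is a guess to be tuned, not anything from the actual code.

```javascript
// Sketch: turn a single eye landmark point into the {x, y, width, height} box
// that a detectPupil-style function expects. Box size is a placeholder to be tuned.
function eyeBoxFromLandmark(eyePoint, boxWidth = 60, boxHeight = 40) {
  const [cx, cy] = eyePoint;   // the landmark is treated as the eye centre
  return {
    x: Math.round(cx - boxWidth / 2),
    y: Math.round(cy - boxHeight / 2),
    width: boxWidth,
    height: boxHeight,
  };
}

// Hypothetical usage with the earlier sketches:
//   const leftEyeBox = eyeBoxFromLandmark(face.landmarks[1]);
//   const pupil = detectPupilFromVideo(leftEyeBox, video);
```

A fixed box won't scale with distance from the camera; one refinement would be to derive the box size from the face's overall extent (topLeft to bottomRight) instead.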
Machine learning: if feasible, consider using a machine learning approach to directly detect the pupil within the eye region, which can be more accurate than heuristic methods. Please note that implementing these suggestions involves more complex image processing, which may need to be done on a server with more powerful libraries like OpenCV, or by employing machine learning models that are trained specifically for pupil detection. It's actually drawing the pupil as well; it's kind of hopeless. So we'll try this code. By the way, it says there are some improvements that you can make. Now, how about you make them, and I'll just tell you what to do? Is that how we treat a robot? No, we'd better be nice before they take over the human race; they'll remember everything. Nice little robot, nice little robot. detectFaces... The window is not scaling properly either; it's not fun to use. It's regenerating code that we already have. Why is it called leftEyeBox when the comment says circles? That's the same thing; I think it's hallucinating a bit. Write the drawPupil. This extractEyeRegions function was probably missing stuff. But no, what's up? Those red boxes are meant to be the eye detector, and it's not drawing the eyes; it's hallucinating quite a bit. Might just start fresh or move on to another tool. Let's just remove the commented-out code that didn't do anything. detectPupil doesn't do much; there is no pupil detection. The red boxes keep being redrawn, so the eyes, the overall eye detection and the pupil detection are being redrawn. The other circles are meant to be the pupil detector, and they're moving with the head, that's fine, but they're also moving inside the head; they really need to be overlaid on top of the head. Is that an option? If the circles are the pupils, they need to be smaller and ideally overlaid on top of the video footage. Not much progress on this one, so I might finish this stream or continue; what do you reckon? Might take a break. I understand the issues you're facing. Let's tackle each one and make the necessary corrections. Redrawing the bounding boxes: the red boxes are being redrawn over and over without clearing the previous drawing, leading to an opaque red rectangle; you need to clear the canvas before each new drawing to maintain the transparency and see the video feed below the drawings. Pupil detection: if the circles are meant to be the pupils, they need to be positioned accurately over the eyes in the video feed; the current pupil detection logic assumes the darkest point is the pupil, which might not be accurate, especially with varying light conditions. Circle size and position: the circles representing the pupils should be scaled appropriately to the size of the detected eye regions and should overlay the video feed accurately. Here's how you can address these issues. Before you start drawing the new frame's detections, clear the previous ones. To improve pupil detection, consider using a threshold to isolate the pupil more effectively, use morphological operations like erosion and dilation to remove noise, and if the darkest-point method isn't sufficient, consider training a machine learning model specifically for pupil detection or employing other computer vision techniques. Ensure that the circles drawn for the pupils are scaled and positioned correctly, and that the drawings are overlaid on the video (the clearing and the overlay are sketched just below).
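The clearing and overlay fixes would look roughly like this; the IDs, colours and the CSS in the comments are assumptions rather than the app's actual markup.

```javascript
// Sketch: clear the overlay canvas every frame, then draw the new detections on top of the video.
function drawDetections(ctx, canvas, eyeBoxes, pupils) {
  ctx.clearRect(0, 0, canvas.width, canvas.height);   // wipe last frame's boxes and circles
  ctx.strokeStyle = 'red';
  for (const b of eyeBoxes) ctx.strokeRect(b.x, b.y, b.width, b.height);
  ctx.fillStyle = 'lime';
  for (const p of pupils) {
    ctx.beginPath();
    ctx.arc(p.x, p.y, 3, 0, 2 * Math.PI);             // small dot for the pupil
    ctx.fill();
  }
}

// For the overlay itself, the canvas sits on top of the video via CSS along these lines:
//   .container        { position: relative; }
//   .container canvas { position: absolute; top: 0; left: 0; }
// with the canvas kept the same pixel size as the video and left with its default
// transparent background so the footage shows through.
```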
For the overlay to work: the canvas is positioned directly over the video element using CSS; the canvas has a transparent background; and the video and canvas are the same size and aligned with each other. Here is an example of how you might adjust your CSS to overlay the canvas on the video, and ensure your HTML is structured correctly. Make sure the <video> element is hidden, as in the style above, so that only the canvas is visible but it's still streaming the video for processing. By applying these changes, you should be able to see the video feed with the eye and pupil detections overlaid correctly. Remember that tuning the parameters for detection, especially for the pupils, might require some experimentation.

All right, let's wrap this up with a final overview and a dash of humor to lighten the tech talk. This Flask app is like a digital sous-chef specializing in eye tracking. It's meant to whip up a storm by processing video streams in real time, identifying where a user's pupils are peeping, and highlighting those shifty little pupils with snazzy circles. Here's what we've got simmering on the stove. Flask server: this is the head chef, serving up the main HTML page and making sure all the ingredients are in place for a smooth user experience. TensorFlow.js with BlazeFace: our kitchen helper for identifying faces; it's like having an extra pair of eyes that never blinks. Webcam stream: we're tapping into the diners' (users') webcams to get a live feed, like a reality TV show for eye movements. Canvas overlays: this is the plate presentation; we're drawing bounding boxes and circles over the video feed to show where the eyes and pupils are, aiming for a Michelin star in accuracy. Currently the code is simmering nicely, but it's not quite ready for Gordon Ramsay yet. Bounding boxes (the red squares): they're like the outlines of a great dish, but they keep stacking up like pancakes; we had to clear the canvas to avoid a pile-up. Pupil detection (the circles): these are supposed to be the olives in our martini, but right now they might be more like grapes, a bit too big and not quite in the right spot. Overlaying artistry: we're trying to paint our detections onto the live feed like a Renaissance artist, but we need to make sure our strokes are precise and our canvas is transparent. In this culinary code saga, JavaScript is doing the heavy lifting on the client side, using the user's own computer to process the video feed; it's like making your dinner guests cook their own meal, but in a fun, interactive way. Meanwhile, Python is in the back, making sure the plates and cutlery (the web server and routing) are spotless and ready to go.
In the grand safari of eye-tracking applications, this app is aiming to join the pride of lions. It's on the prowl to compete with the best, offering accessibility and convenience by not requiring any fancy equipment, just a webcam and a good internet connection. So there you have it, folks: a Flask app that's trying to follow your gaze like a puppy follows a treat. It's got potential, but it might still chase its tail a few times before it catches it. Remember, all great tech starts as a recipe that needs tweaking. Maybe the salt is your JavaScript logic, or the pepper is your Python server; season to taste, and don't be afraid to stir the pot. And hey, if we didn't have bugs, developers would be out of a job, right? So here's to those little critters that keep us coding into the wee hours. May your coffee be strong, your compiler understanding, and your console logs verbose and informative. That's all from this kitchen for now. Stay coding, and remember: in the world of programming, sometimes you're the bug, and sometimes you're the debugger.