Hey guys, welcome back to my YouTube channel. This is Daniel Rosehill here. A few days ago I uploaded a video to the Singularity subreddit showing my attempt to create a digital avatar of myself using HeyGen. For those who haven't heard about this technology, it's pretty wild. It's AI software that takes a couple of minutes of video recording of you and uses that to duplicate you as an avatar. That means you can type in text for the digital clone of yourself to read, and it will actually generate a video of it reading that text as if it were you. People on Reddit seemed to think it was pretty good, but I personally had very mixed feelings about the success of my first avatar. For one, the accent was totally off. I'm originally from Ireland, so I speak with an Irish accent, and my digital clone had this kind of generic American accent. As well as that, it was very, very robotic. I actually misread HeyGen's avatar-making instructions and thought they said you can't use any facial expressions, so when I was recording my first two-minute training video (which is what the AI uses to make the avatar), I was speaking in this really monotone fashion. So what I thought I would do is show you the whole process from start to finish: cloning myself digitally using HeyGen, showing the results, and seeing if I can get something better. I'm actually going to do the whole process again today in this video, including the training video. I'll leave timestamps, so if you're just curious about one part of it, please feel free to jump forward. These are the instructions that HeyGen has for creating your avatar. Just to say as well, they do offer a fine-tuned premium avatar; that's when someone from their team actually reviews your model and makes tweaks to make it even more realistic. I'll show you guys what that costs later in the video.
It's a bit more. I've subscribed to the basic creator package. It cost me $30 for my first month, which allows me to create three avatars and generate a certain amount of video. However, if you just want to try this out by building your first avatar, you can actually do that on the totally free tier, so you don't need to pay any money to build your first avatar. So let's take a look at the footage requirements. It's got to be two minutes of uninterrupted speech. You can talk in any language about any topic you're interested in. You can use any camera to record the videos; the higher the quality, the better. I'm just using the webcam that I use to shoot a lot of these YouTube vlogs. I do have a professional camera that I could use, but I was actually happy enough with the results of the webcam. For anyone interested, I'm using the Logitech C930e, I think it's called; it's a standard 1080p business webcam. And again, with the first avatar, I thought the replica of myself, the image itself, was actually decent, so I don't think I need a fancier camera. It was really what people call micro expressions that let the avatar down. They recommend filming at 1080p, or full HD, for higher avatar resolution quality. Don't edit the video, so I'm going to basically just record this and not do any editing on the footage. Here's some important stuff: don't walk or move around, so you want to stand still and avoid sudden head motions or unnatural body movements. Keep your eyes fixed on the camera, so just make sure you're looking straight at whatever you're recording with for the whole duration of the speech. Be natural and calm; your avatar will perform exactly as you do. Now, here are a couple of important notes: take pauses with lips closed in between sentences.
So when I was finishing a sentence, I would bring my lips closed, like: this is a sentence. And then you just pause like that for a couple of seconds. It doesn't feel natural, but the point of this is that the AI needs some way of animating your mouth opening and closing, so it needs a few examples, which is why it's important to do that. The second thing is that there's a two-to-three-second pause, lips closed, at the start of the video. The final thing required for your digital avatar creation process is to record this consent statement: "I, John Doe, allow HeyGen to use the footage of me to build a HeyGen avatar for use on the HeyGen platform" (you replace John Doe, of course, with your own name). As for the purpose of this, obviously we're getting into some kind of weird technology here with all these deepfake creation tools, which obviously have black hat, malicious use cases. So the idea of the consent statement is to make sure that the footage is being uploaded by the person who wants their avatar to be created. Now, I actually did something wrong just there: I did this kind of hand motion. When you're recording your avatar, you also can't make any hand motions in front of the camera. So I'm just going to keep my hands down here so that I'm not tempted to do any kind of moving; I'm putting them literally on my knees. I'm looking straight at the camera, and I'm also going to open a Google stopwatch and move it over to a new tab so I can see when I've recorded two minutes. I'm starting the stopwatch, so I'll record for two minutes and 15 seconds, and this is actually going to be my video. Okay. Hi, my name is Daniel Rosehill and this is my second attempt at recording a training video for HeyGen. HeyGen is pretty wild. It's this technology that allows you to create a digital avatar of yourself.
What this means is that you're creating an avatar that you can then animate and give text to read, and it will read that text, supposedly in a voice as close to your own as possible. What are the applications of this? Well, this is actually something I've been thinking about a ton. HeyGen seems to suggest that this is going to be the new way of creating videos. So if you have a podcast or a YouTube channel like me, and you record a lot of these so-called talking head videos where you're just talking to the camera, instead of doing that yourself every time, you're going to be able to simply create your avatar the first time, then give it text to read, and it will literally animate your video. I mentioned that my first avatar was very, very flat and robotic. So what I'm actually doing this time, and I look like a complete crazy person, is smiling for no reason, because I want my avatar to have a bit of smiling. And I'm going to look angry in my next expression, because I'm actually really angry: I had to return a speaker today that didn't work, and that really, really sucked. So as you guys can see, I'm trying to improve on the model by mixing things up a little. I'm glancing over at my stopwatch without trying to divert my eyes from the camera. I'm at about one minute and 40, so 20 more seconds of garbage content. Today is actually a really exciting day, because I'm going to be picking up a brand new desktop computer. And if you want to know about the interesting process of how you can get custom-made desktop computers built in Israel, do consider subscribing, because that video is going to be coming up soon. And I'm doing the crazy person smile again. Alright guys, so I've got about two minutes here, so I can finally stop speaking. I'm going to stop the camera, and this is going to be my training video for V2 of my HeyGen avatar. Okay, so I said if I was going to show the process of doing this, I'd go the full nine yards.
So I'm going to be really interested to see if the avatar picks up my microphone windscreen. You can then edit the video here. This is a proxy clip, so it's a little bit down-resed, but basically all I'm doing here is trimming out the start. You can see by the sound waves where I started speaking, and just remember that you do want two to three seconds of neutral. So I'm just tweaking my recording. I wasn't looking at the camera here, so I'm going to get rid of those first few frames. For those interested, I'm using Kdenlive as a Linux video editor. "This is going to be my training video for V2 of my HeyGen avatar." And I'll just cut here so there's not too much motionless footage. I don't know if any of this stuff is going to actually affect the avatar that I create using HeyGen, but I figure it can't hurt to give it every chance of succeeding. I'm now going to render out this video. They do give you the option of recording in HeyGen itself, if you want to do that. So you can see I've already actually done this process earlier today. It didn't work out as well as I hoped, so I'm giving it another go. I'm going to just overwrite this file, calling it HeyGen training video. And here we go, yes, I want to overwrite. So now I'm going to render out my training video. I'm also going to go and get the text for my consent statement, which is the next thing you need in order to give HeyGen permission to digitally clone you. Okay, so while my video is rendering, I'm here in my HeyGen dashboard, and I'm pretty sure now that this video of the CEO is actually a demo of their technology in use. So I'm going to click on free instant avatar. Okay, I'll just skip through that and get started. You can get the instructions in video or text. I'm just going to have another read of these requirements, because they're a little bit different from the ones in their help center. Submit at least two minutes of footage.
Use a high-res camera. Record in a well-lit, quiet environment. Look directly into the camera. Pause between each sentence with your mouth closed; that's the important thing. And use generic gestures and keep your hands below your chest, as I mentioned. Then there's what you don't want to do: stitches or cuts in your footage, so keep it one take, as I did; talking without pauses, meaning don't just keep talking and talking and talking, because you want to break it up so the model can learn your mouth movements; shifting around position, which I think would confuse it; background music; and pointing gestures or hand gestures above the chest. So if you're Italian or someone very expressive, keep your hands down, as I said at the start. Those are pretty much the same requirements, and you can talk about anything, so I just talked about stuff that's going on in my life today. Now, you can do this directly in the browser if you want; I've done it. But I recommend recording offline as it says, since you're going to get better quality and you can make those small edits to make sure you get the best file. You can also import it via Google Drive if you have it there for some reason, but I have it on my local computer. So I'm going to upload my HeyGen training video here. And now you get a preview of your video. "Hi, my name is Daniel Rosehill and this is my second attempt at recording a training video for HeyGen." Okay, so that looks pretty much okay to me. Let me just play one more second. "It's pretty wild. It's this technology." That's fine. Then it just asks you to tick these boxes verifying: my face is visible, I'm looking into the camera, there are pauses, and the environment is quiet. It's the best lighting I can do. And then there's an optional opt-in: HeyGen holds the highest standard to protect your data, but you can allow HeyGen to use your footage to optimize future AI models. I'm going to do that.
I hope this doesn't lead to me being recreated as a deepfake, or there being a virtual Daniel on YouTube in a few years. But I like the idea of helping to make this project better, so I'm voluntarily opting in to that. Then, if you want to confirm, just say "my footage looks good". And this was my previous version, so I'm actually just going to use this. Oh, and then they're asking me now for a consent statement. So I'm going to click record on the consent recording, and it tells you what to say. I'm not going to read this on camera, because that would allow people to reverse engineer this process and create deepfakes of me, which would be highly, highly disturbing. So I'm going to go record this offline, then upload it and validate it. Alrighty. So I've just recorded my consent video, and I'm just going to browse local files on my desktop. I have my consent video, HeyGenConsent.mp4. And then, as it says, it takes 20 seconds to validate your consent. Basically, it's just making sure that this is you: I presume the AI is checking your face points against the video, and also making sure that your speech is in sync. The purpose of this, as I mentioned, is, I'm guessing, deepfake prevention, because you could easily find two minutes of, I don't know, Obama or President Biden speaking that fits all these criteria, and you could say, hey, let me create Biden as my avatar, and that would basically be all you'd need to do to create deepfakes. This process makes sure it's legitimate use, because you may have that footage of Biden speaking, but you don't have his consent video. So it's validated my consent; it says that your consent is good, so I'm going to click submit. Now it's actually uploading my video into the engine, and then it's going to take about five minutes to actually build the avatar. So I'm going to pause this while that process is going on. Okay.
So roughly five minutes later, and we have the avatar ready. Let's click check it out. We can check out the Daniel Rosehill avatar. Your avatar is ready, and here's a quick moment where it speaks: "Your instant avatar is ready. Feel free to create videos with it. Also click the feedback button to share what you think. Hope you enjoy." So the accent is still totally off, which would really limit the utility of this. I'm not going to share feedback with them, because I feel like I'm kind of doing that by making this video. That's my first impression, but let's give it a better chance and have it record another video. So this is the avatar version of me. I just wanted to show as well the fine-tuned pricing. This is an add-on they offer if you want to make this better. From what I can understand, it means that a human is going to get involved in the process of tweaking your avatar a bit. It's $64 per month, and you get higher video resolution and quality, a private AI model for better lip sync, as well as a professional voice: create a digital replica of your voice and clone your voice in more than 25 different languages. The monthly pricing to retain that is a bit more expensive; it's almost $80, 79 bucks. So this is actually pretty expensive, and it's not included in the cost of your subscription; you need to pay more to get this fine-tuning of your avatar. Okay, so now I'm going to create my first video using my avatar. I'm going to click create video, and I'm going to click on landscape, because I'm going to make a sort of test video for YouTube. I've just written this script here that I'm going to copy and paste in. And I'm going to read it first myself, so that we can compare exactly: real me versus avatar me. I'll just take off my headphones to match the look of my avatar. So let me go and read this myself first. Hi. This is Daniel.
This is V2 of the avatar that I've created using HeyGen. I'd really love to hear what you have to say about this. Hopefully it will be a bit less robotic than my first attempt. It will be interesting to see how this tech evolves over the coming years. Okay, so that was me reading that little paragraph of text. Now I'm going to go into HeyGen and have it generate that with my avatar by clicking on the play icon. So now it is generating, and let's see how this turns out. "Hi. This is Daniel. This is V2 of the avatar that I've created using HeyGen. I'd really love to hear what you have to say about this. Hopefully it will be a bit less robotic than my fist attempt. It will be interesting to see how this tech evolves over coming years." Okay, so there was one typo in my script, sorry about that: it said "fist attempt" instead of "first attempt". You don't get to see the animation until you actually schedule the job, but one thing I'm interested to see is whether it got this detail: this microphone windscreen, which says "Daniel Rosehill video". I was intrigued to see whether you can get branding and backdrop elements like that into your avatar, and it looks like you can. So I'm going to go ahead now, call this "test avatar", and trigger this job for rendering. It's going to cost me half a credit. Just to show you the credits: if I click here, I have 15 credits, which is what you get on the 29-buck-a-month creator plan. You get 15 minutes of video, and every 30 seconds of video is rounded up, so to the nearest half a minute. So this is going to cost me half a credit, and it is submitting. So let's see how this ultimately does. As you can see, the rendering process isn't very slow. Well, you can't quite see, because I've just started recording again, but it just jumped there from 3 to 10 percent. Admittedly, this isn't a very long recording.
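The credit math described above can be sketched as a quick calculation. This is just an illustrative sketch, assuming the billing works exactly as described here: one credit equals one minute of video, billed in half-credit (30-second) steps, with any partial 30-second block rounded up. The function name is my own, not anything from HeyGen.

```python
import math

def credits_needed(video_seconds: float) -> float:
    """Estimate credits for a rendered video, assuming every started
    30-second block is rounded up and costs half a credit."""
    return math.ceil(video_seconds / 30) * 0.5

# A short test clip like the one above, under 30 seconds, costs half a credit:
print(credits_needed(20))   # 0.5
# A 95-second video spans four 30-second blocks, so two full credits:
print(credits_needed(95))   # 2.0
```

On that reading, the 15 credits included in the creator plan correspond to 15 minutes of rendered video per month.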
It's gone up to 18% now, but it just takes a few minutes, and they send you email notifications as well when your video has been rendered. So while it's rendering my video, I just wanted to offer a few thoughts, because this is such a weird technology. I mean, it's definitely very cool, but the question that I have in my mind is: what is this actually going to be useful for? What are the white hat use cases? The thing that I've seen all these AI platforms talking about is content creation. In other words, in the AI voice cloning world (and I tried a couple of those tools last week), if you're a podcaster, instead of sitting down to podcast at your studio, you're going to just write a script and then have your AI voice read the podcast. And for YouTube videos, you're likewise going to just give it a script, and your avatar is going to bring in the facial expressions. And I am just quite skeptical about that. Firstly, I don't think the internet needs more content. We're already way past the point of content saturation; there's more content out there on the internet than people have time to read. So there's definitely more, I would say, to my mind, in the way of malicious or black hat use cases, like deepfakes, which I think we can all agree are not going to be helpful. But content creation is what these AI platforms seem to be banking on people using them for: this is going to be the smart way of creating content. You're not going to actually voice something over anymore; you're not going to do a talking head video to camera anymore. You're going to do that process once, and then AI will clone your voice or clone your face. Anyway, that's what I think; it's very open for debate. So here's what my avatar looks like. I'm going to show it to you guys. Here we go, let's make this full screen and play.
"Hi, this is Daniel. This is V2 of the avatar that I've created using HeyGen. I'd really love to hear what you have to say about this. Hopefully it will be a bit less robotic than my fist attempt. It will be interesting to see how this tech evolves over coming years." Okay, that's it. Let me just play it one more time so you guys can really see all the facial expressions. "Hi, this is Daniel. This is V2 of the avatar that I've created using HeyGen. I'd really love to hear what you have to say about this. Hopefully it will be a bit less robotic than my fist attempt. It will be interesting to see how this tech evolves over coming years." Okay, so that was interesting. One thing I would say about my little experiment with the branded microphone cover: it's picked that up, so you can definitely record your avatar with, say, a YouTube backdrop. I have an on-air sign and other decorative items that I could have arranged behind me, and they would come through. So it's good at isolating just the animation on your face. As I said, I think the accent is way off, and it's a pity that you basically have to pay the fine-tuning fee to really get this usable, because if the idea is to convince someone that this is actually me, and the accent is totally wrong, that's not going to work and it's not going to be useful. Besides that, it's quite interesting. It's definitely early stage tech, but just look at fake me versus real me for a second. The glasses are there. I think I didn't shave this morning, so it's got my stubble right. It's got my hair right. It has basically mirrored the image, and it's actually animating my face. Look, it's pulling my mouth this way; just watch all the little facial expressions it's running through. I'm just going to play: "This is Daniel. This is V2 of the avatar." Right, my body, my shoulders are moving around a little bit.
So there's a little bit of movement; I would say this is less robotic and more interesting than my first go. It's definitely still a technology that is in its early stages, but I made the prediction a few days ago that, I mean, if this is how good it is in January 2024, it's intriguing to think what these models will be like in January 2025. Perhaps by January 2026 we're going to be getting avatars that are actually indistinguishable from the original creators, right? Right now, as I've said many times, and I'm sorry to be so repetitive, the voice is wrong, and the thing is still lacking a little bit of feeling. I can't quite put my finger on it, but it's still quite good. And I was surprised by people on Reddit: I posted this on Reddit and people were quite complimentary. The friends I sent it to were like, wow, is that fake? And I'm like, yeah, it's fake. But then again, you know things about yourself that your friends might not notice, and you're also looking out, when you make this model, for any kind of deviation from the accurate video. So anyway, that's been the process from start to end. I hope this has been interesting. HeyGen is very interesting tech. I'd love to do another video testing out the fine-tuned stuff; I might do that. And if you'd like to get more videos from me, do consider subscribing to this YouTube channel. This has been the real Daniel, not his AI clone. Thanks for watching today's video.