 Hello, everyone. My name is Ajay Swamy. I'm a product leader with Amazon Web Services, where I manage a portfolio of AI/ML products. Welcome to today's session. We'll be talking about advanced AI prompting for product management, and in 30 minutes you should get a pretty good overview of what it means to use GenAI for all of your PM needs. So let's look at the agenda for today. We'll start with a recap of the personas you can use GenAI for, from my previous webinar late last year. Then we'll dive into using advanced prompts. I'll be using two LLMs for this: OpenAI's GPT-4, and then I'll also do a demo using Claude 3's Sonnet model. Both are fantastic. I'll also talk a little bit about how you can use custom AI bots, custom GPTs; these are available if you pay $20 a month to OpenAI. Then I'll share a table on LLM benchmarks and how these benchmarks are actually scored, so you have a good view into what's out there: what's accurate, which is more advanced, et cetera. And then finally, we'll wrap up. OK, in my previous webinar, you might remember seeing this slide on the personas you can use GenAI for. You can use GenAI as an assistant: taking meeting notes, summarizing notes, extracting action items, uploading data for text summarization, asking about topics you don't know about (for example, "summarize the amendments for me"). Or, generally speaking, you can pass it a passage of your writing and say, hey, clean up my writing, or help me improve my writing from a business perspective. But today's focus is going to be on the personas of a thought partner and an ideator. GenAI LLMs are really, really great at being thought partners, where you can use the LLM as a sounding board. For example, one of the examples we'll go through later today is using GenAI as a CPO in your organization. 
You can tailor the LLM to act as the CPO and use it as a sounding board: run ideas by the CPO, see what inputs that CPO might give you, and then alter your deliverables or your thought process to come up with better features or more details. The second persona we're going to talk about is the ideator. This is where you can get really creative, and this is where I'll do a demo around building a GenAI-powered invoice management solution. I'll use Claude Sonnet to do that, and with it you'll be able to get a good PRD (product requirements document) together just using LLMs. This will be a pretty good draft if you're starting something from scratch. You can really leverage these LLMs to accelerate your time to value as a product manager. OK, so my first prompt is with ChatGPT. What I'm doing here is telling ChatGPT that it is a product manager for a GenAI-powered chatbot, and I fed it a bunch of meeting notes from customers, account managers, et cetera. I basically told it: look, your goal is to optimize and help me prioritize features for development, and the prioritization framework is called the RICE framework, which is a pretty popular one. And if you notice, I didn't even tell ChatGPT what RICE is, or what the RICE framework is. Based on my prompt, which was, number one, very clear and concise, and very specific about what I wanted ChatGPT to do, it gave me an answer. It said, OK, based on the interview notes provided, there are several features that I have identified for your GenAI-powered chatbot. And then it also tells me a little bit about the RICE framework. 
The reason it does that is so that you know, and it knows, that you're referring to the same RICE framework, which is a prioritization based on how many users a feature will reach, what its impact is, what the confidence is that it will be successful and adopted, and finally the effort; the RICE score is (reach × impact × confidence) divided by effort. So it's fantastic that ChatGPT already knows this. Based on that, it gave me a bunch of features I should be prioritizing; I've listed three here. For example, it says: for this AI chatbot, I want the user's history to be maintained so I can easily recall and continue conversations. The second one is orchestration with RAG. RAG is retrieval-augmented generation, which gives the LLM the ability to pull in other documents alongside the model when providing very specific answers. The third one: let's assume you're chatting with an AI chatbot about where you want to travel, and you want to switch topics seamlessly to, hey, what should I cook tonight for dinner? The chatbot should allow for that topic reset without having to refresh. And then, because I very clearly told it to output the features in a tabular format with the key factors of the RICE framework, it did exactly that: the feature, the reach, the impact, and the confidence. It made some assumptions about what the effort and the reach could be, and it shows you the calculation behind the RICE score. Then I followed up with another prompt. I said: from the above, give me the top three features, and also provide some engineering estimates on how long each will take to build based on its complexity. Assume that you have three SDEs (software development engineers) with five years of average experience, which means they're mid-level engineers. 
Then: add the engineering estimate in total man-hours, along with the complexity, as two additional columns to the RICE table above. And it does that really, really well. I'm not showing you all of the output ChatGPT gave me, but it does a pretty solid job of assessing complexity. By the way, it also broke down why user history maintenance is high complexity, and it articulated it pretty clearly: one, you need to store that user history, which can get pretty long; two, you need to pick up the right history and send that context back to the LLM; three, it requires a lot of memory, et cetera. That's why the complexity is high. It also gave me the engineering estimate in man-hours, which you can see. You can use all of this as a draft. You'll obviously need to modify it, take it back to your engineering team, maybe share it with a peer, and then decide whether it's something you can work off of instead of creating it from scratch. Super powerful. This is what I mean when I say your time to value as a PM is accelerated: imagine having to do all this from scratch, and the LLM did it for you in seconds. OK, so my next example for a PM thought partner is making the LLM your chief product officer. This is a pretty complex prompt, but before we get into the prompt itself, there are a couple of things I want to tell you about good prompts. Good prompts share six or seven common characteristics. One is clarity: the prompt is unambiguous, clear, and succinct. They're very specific about the ask, what you're asking the LLM to do. They provide adequate context so the LLM knows the background for the ask. They use clear, concise language that is also error-free. 
And finally, they give the LLM the formatting for the output, and they make it very clear. All of these characteristics help the LLM respond with fewer hallucinations, and maybe even with the answer you're looking for. OK, so let's get into it. My prompt here, again using GPT-4, is: you are Clara, a chief product officer at a fintech company called PayNow. PayNow allows companies to pay their vendors using the latest payment channels and modalities like ACH, debit, PayPal, et cetera. I also tell the LLM: you're skilled at qualifying product ideas and you manage a team of technical product managers. Your job is to validate new product features from an ROI perspective. Your technical product managers will provide you ideas, and you must guide them thoughtfully and kindly to ensure you're building the right set of features for your customers. I also say: all of the product ideas are pasted within the ideas tag, and then I paste a whole bunch of ideas for PayNow: features for payment automation, tax payments, payment status tracking, payment security, et cetera. After that, I follow up the prompt with the user's message to Clara: hi Clara, as requested, I'm sending you a high-level list of product ideas for PayNow. Can you help me prioritize them in a way that's aligned with PayNow's mission and North Star? And then finally, I end the prompt with: first, assess whether the user's follow-up question logically proceeds from the previous ideas block, because otherwise it's unrelated, the prompt can break, and you're just not going to get the answer you're looking for. 
Then: converse with the user gently, asking follow-up questions to add more detail to the features. And finally: pick five features to build based on a relevant prioritization framework, and make sure you end each interaction with a question that allows the user, the product manager, to continue the conversation until satisfied. I know there are some errors in that last sentence, but nonetheless ChatGPT got it. So let's look at what happened. Here's the response. ChatGPT is now acting as Clara, the chief product officer for PayNow, and it says: hey, thank you for sharing a list of ideas for PayNow. It's great to see such a comprehensive set of features that can add value to our platform. To prioritize effectively, I'll first need to understand a little more about our strategic objectives and the needs of our users. So Clara comes back to me, as one of the product managers working for her, with questions like: who are our primary users right now? Which channel is contributing the most revenue? Is it small businesses, large enterprises, specific industries? What are their most common challenges? She asks: do we have data on the most frequently used payment modalities among our current users? This can help us identify whether there's strong demand for new payment methods versus others. She also asks: have we received specific feedback or requests from users on any of these features? Et cetera. And then she says: the reason I'm asking all these questions is to gain insights that will help us align the prioritization of new features with PayNow's mission and North Star. So her first question is: who are our primary users right now, and what are their most common payment challenges? I respond, as a junior PM perhaps: hey, look, I'm not really sure. 
We have many users, from small businesses to large enterprises, that seem to struggle to manage invoices and cannot pay their vendors on time. And Clara says: OK, I get it. Our user base spans small businesses and large enterprises, and the common payment challenge is managing invoices. Given this context, features that streamline payment processes, improve payment timelines, and reduce admin overhead are the ones that should be prioritized. And then she gives me a list of five features. And then I say: hang on a second. It's great that you've given me these features to prioritize, but I'm not even sure which personas I should be focusing on. Can you help me narrow it down? I need to get a release out in three months. And Clara says: absolutely, narrowing down specific user personas can significantly impact our feature prioritization. Considering the challenges you've mentioned, let's focus on two primary personas. One is the busy SMB owner, who juggles multiple roles, including managing finances, and would benefit from an automated solution that reduces the cognitive load of managing payments, invoices, et cetera. The second persona is the finance manager at growing enterprises, who deals with a higher volume of invoices and vendor relationships than an SMB owner and needs tools that can scale; with the automated features we're proposing, they'll be able to do that. So you can see this kind of interaction can be really powerful. Clara acts almost as a thought partner, and this back-and-forth, with direct questions, lets you validate your hypotheses and the features you put forth. Before you take them to your engineering leader or your product leader, you have the opportunity to check your work and make sure you're heading down the right path. Really powerful stuff. 
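Before moving on: it can help to see the arithmetic behind the RICE scoring from the first demo. Here is a minimal sketch; the feature names and numbers are illustrative placeholders, not the actual values ChatGPT produced.

```python
from dataclasses import dataclass

@dataclass
class Feature:
    name: str
    reach: int        # users affected per quarter
    impact: float     # e.g. 0.25 (minimal) up to 3 (massive)
    confidence: float # 0.0 to 1.0
    effort: float     # person-months

    @property
    def rice_score(self) -> float:
        # RICE = (reach * impact * confidence) / effort
        return (self.reach * self.impact * self.confidence) / self.effort

# Illustrative inputs for the three chatbot features discussed above
features = [
    Feature("User history maintenance", reach=5000, impact=2.0, confidence=0.8, effort=6),
    Feature("RAG orchestration",        reach=3000, impact=3.0, confidence=0.7, effort=8),
    Feature("Seamless topic reset",     reach=4000, impact=1.0, confidence=0.9, effort=2),
]

# Rank features by RICE score, highest first
for f in sorted(features, key=lambda f: f.rice_score, reverse=True):
    print(f"{f.name}: {f.rice_score:.0f}")
```

The point is not the numbers themselves (which the LLM estimates and you should sanity-check) but that the ranking is a deterministic calculation once the four inputs are agreed on.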
Then I ask Clara another question: hey, can you help me come up with a roadmap? As a junior PM, I have no idea how to organize a roadmap by teams, quarters, or months. Can you provide some short business descriptions and benefits for the features in the roadmap? Once you provide the roadmap, I can use it as a template for future roadmap activities. I also tell Clara: I want this roadmap in a tabular format organized by teams and quarters, and maybe use some coloring so the teams stand out. ChatGPT took this prompt and, maybe 30 seconds later, actually created an Excel file with the teams, the features we should be building, a description and benefit for each, organized by quarter. This is really powerful. Now imagine tying this back to the engineering estimates from the previous prompt: you could say, hey, can you align these quarters with those engineering estimates and see if the team can actually deliver across the year? You can validate that using ChatGPT too. Super powerful, right? You'll have very strong drafts to put in front of your engineering team or your other stakeholders. Obviously, this is not to shortcut your work; it gets you started with a very good draft, and you'll need to bake all of the details into it, using it as a great starting point. Again, the goal here is acceleration and time to value rather than shortcutting the process. As a PM, you still need to be very aware of what your stakeholders, your customers, and your leadership need, and you'll still need to account for all of that, but this gives you an excellent foundation to get there. All right, great. That brings me to the second persona we're focusing on: the product management ideator. 
For this, I'm going to show a demo using Claude 3's latest Sonnet model. I love using Claude 3: the text responses are friendlier and the English is better, at least in my experience comparing ChatGPT and Sonnet. So let's do that right now. All right, I'm now in claude.ai, by Anthropic. I'm going to start with my first prompt, which is: hey, I'm starting an AI-powered invoice-processing software B2B company. Can you identify some user segments for the software? Can you provide the answers as the user segment name followed by why the software will be beneficial for them? Let's see what happens. And there you go: Claude comes up with a bunch of user segments: accounts payable teams, procurement teams, finance and accounting departments, large enterprises. And note that for each of these segments, it actually tells you why the software is useful. For example, for accounts payable teams, it says AI-powered invoice processing can significantly streamline the accounts payable process, reducing manual effort and improving accuracy. This would benefit AP teams by automating data entry, validation, and approval workflows, and then allow them to focus on more strategic work. That strategic work would be things like: what is my cash outflow, or my projected cash outflow? What is my cash position today? Focus on those really relevant aspects. Now I want to take this a little further: within these user segments, what are the personas I should be focusing on? How can they use the software, and what are their benefits? So that's my next prompt. And there you go. By the way, Claude 3 Sonnet is so quick, and notice that I haven't told it how to format the answer into bullet points, but it did so automatically. That's what I love about Claude. 
The UX is super friendly and very intuitive. Anyway, it says: here are some sample personas for the user segments mentioned earlier, along with how they can use AI-powered invoice-processing software. For the accounts payable team, it comes up with a persona: her name is Sarah, and she's an AP manager at a mid-sized manufacturing company. This is how she can use the software: automate data extraction from invoices (vendor details, amounts, due dates), set up rules and workflows, integrate with existing ERP systems. Some of the benefits: reduced manual effort in data extraction and data entry, improved accuracy and fewer errors because there's much less manual input, and definitely faster turnaround times. It goes on and on and gives you a whole bunch of user personas. Now I want to take this a little further and drill down into a specific user segment: the accounts payable segment. So I ask: for the accounts payable user segment, provide the top three use cases and their pain points. Very straightforward, right? That gives me the three use cases, which are invoice data capture and validation, invoice approval workflow management, and exception handling and dispute resolution, along with their pain points. It also tells me, for each of those use cases and pain points, how my AI-powered invoice-processing software can actually streamline the process. Super cool. And then lastly, I say: OK, I get it, the software can do all this stuff, but let's make sure we can actually build it. What are some of the solutions for those pain points, and how can AI solve for them? 
So I ask Sonnet to give me answers in a product-feature format that includes the feature name, details, complexity, and a priority based on the biggest pain points. Let's see what it does. All right, awesome. First, it says: my first feature is intelligent document capture and data extraction. Leverage AI and machine learning to automatically capture and extract data from invoices in various formats: PDFs, images, scanned documents. The feature should support OCR (optical character recognition) and intelligent data extraction to accurately identify and extract relevant information. Super stuff, right? The complexity to build is high, and the priority is high because it addresses the pain point of manual data entry and handling invoices in various formats. Another one I looked at was intelligent approval workflow management. It says: utilize AI and machine learning to automatically route invoices through approval workflows based on pre-defined rules and criteria: invoice amount, department, cost center, et cetera. The feature should provide real-time visibility into the approval process, identify bottlenecks, and so on. It says the complexity to build is medium, and the priority is high because it addresses the pain point of complex approval workflows and lack of visibility. You can go on and on from here. For example, you can say: for the first three features, help me identify why the complexity to build is medium or high; please break down that complexity and give me more details. So you can use very clear, concise prompts and create a conversational playground for yourself, either to build a new product or to improve an existing one. 
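The drill-down pattern above (segments, then personas, then use cases, then features) is just a running conversation in which each prompt builds on the previous answers. If you wanted to script it against an LLM API instead of the claude.ai UI, the message history might be managed like this; `call_llm` is a placeholder for whatever client you use, and the prompts are abbreviated versions of the ones from the demo.

```python
def call_llm(messages):
    """Placeholder for a real API call (e.g. Anthropic's or OpenAI's Python SDK).
    It must accept the full message history and return the assistant's reply text."""
    raise NotImplementedError

def converse(prompts, llm=call_llm):
    """Send prompts one at a time, carrying the full history forward so each
    follow-up ('drill into personas', 'now use cases', ...) keeps its context."""
    history = []
    replies = []
    for prompt in prompts:
        history.append({"role": "user", "content": prompt})
        reply = llm(history)
        history.append({"role": "assistant", "content": reply})
        replies.append(reply)
    return replies

# Abbreviated ideation sequence from the demo
ideation_prompts = [
    "I'm starting an AI-powered invoice-processing B2B company. Identify user segments.",
    "For those segments, which personas should I focus on, and how do they benefit?",
    "For the accounts payable segment, give the top three use cases and pain points.",
    "Propose solutions as product features: name, details, complexity, priority.",
]
```

The key design point is that the whole history goes back to the model on every turn; drop it and the model loses the thread, which is exactly why the in-app chat feels like a conversational playground.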
So I highly suggest making GenAI LLMs an everyday part of your work, because you will thrive, you'll have better ideas, and you can also use them to cross-check some of your work. All right, so here's another great cheat code. If you go to chat.openai.com/gpts, and you're paying for the monthly subscription ($20 a month), you can access custom GPTs that other users or companies have created. So what are custom GPTs? These are custom versions of ChatGPT that combine your own data sources, your own experience in terms of deliverables, and your own input, which the GPT takes in and uses, basically via RAG (retrieval-augmented generation), to cater a very specific response to your use case. So I went there and searched for product manager GPTs to help me with my daily tasks, research, and so on. It comes up with a whole bunch of product manager GPTs. You can see the first one helps product managers with the first steps of new product development. Then there's the AI Product Manager GPT, which helps you be a better PM, from the team behind Craftful.com. Then there's another product manager GPT called Alina, adept at requirements analysis and product design; it can write requirements, stories, et cetera. You can click on any of these, and it tells you the ratings, how many conversations it has had, and what the capabilities are. For example, this particular AI product manager is able to take action outside of ChatGPT, i.e., it can use APIs or web scraping to access data from other websites or data sources. It can do data analysis, and it can also create images using DALL-E. So, pretty great stuff. 
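To make the RAG idea behind these custom GPTs concrete, here is a toy retrieval step. Real systems use vector embeddings for relevance, but the shape is the same: score your documents against the query, pick the best match, and stuff it into the prompt before it goes to the LLM. The documents and the word-overlap scoring are purely illustrative.

```python
def score(query: str, doc: str) -> int:
    # Toy relevance: count shared words. Real RAG uses embedding similarity.
    return len(set(query.lower().split()) & set(doc.lower().split()))

def retrieve(query: str, docs: list[str]) -> str:
    # Pick the single most relevant document for this query.
    return max(docs, key=lambda d: score(query, d))

def augment(query: str, docs: list[str]) -> str:
    # Prepend the retrieved context to the prompt sent to the LLM.
    context = retrieve(query, docs)
    return (
        "Use this context to answer.\n"
        f"<context>{context}</context>\n\n"
        f"Question: {query}"
    )

# Hypothetical PM knowledge base a custom GPT might be grounded on
docs = [
    "Our PRD template covers problem statement, personas, requirements, metrics.",
    "Release process: code freeze Thursday, deploy Friday, rollback plan required.",
]
prompt = augment("How do I structure a PRD for a new feature?", docs)
```

This is why a custom GPT grounded in your own deliverables answers in your house style: the model sees your material in the prompt at answer time rather than having been retrained on it.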
I would suggest taking a look at this and maybe even bookmarking some of the best GPTs for your specific needs. OK, and this brings me to my last slide: how do you rate LLMs? What are the benchmarks? This is from a blog one of my peers wrote at AWS. Look, the benchmarks will change and evolve, and there is no standards organization that I know of that has created benchmarks tracked across all the LLMs. But looking at LLMs across the board, Claude 3 and its models, which were just released a few weeks ago, seem to score at a higher level. For example, when you look at graduate-level knowledge, you can see how Claude 3 outperforms GPT-4. Claude 3 Sonnet, the free version that everybody should now be upgraded to, outperforms GPT-4 and definitely GPT-3.5. Look at undergraduate-level knowledge, for example: 92.3% versus GPT-4, the paid version, compared against the free version of Claude 3. You can also look at code human evaluations, or at logical reasoning, and you can see how Claude 3 is currently outperforming GPT-4 and even Gemini. Now, what does this mean? It means all of these companies are moving quickly to make their LLMs better and better. Is this table going to change in the next few weeks? Absolutely. There will be improvements to GPT-4, a ton of improvements to Gemini 1.0, and, with the impending partnership between Google AI and Apple, who knows what will come after that. But for now, I thought it would be useful for you to know the benchmarks: graduate- and undergraduate-level knowledge, math problem solving, human evaluation of code, reasoning over text, logical reasoning, et cetera. 
Those are the benchmarks you should be thinking about when you try to understand which LLM you should be using, either for yourself or for your enterprise. And obviously, there are cost ramifications underneath all of these LLMs: Opus is more expensive than Sonnet, and Haiku is supposed to be really, really quick, but you can also see that its accuracy is reduced compared to the other two. But yeah, those are LLM benchmarks, and hopefully they become ingrained in the GenAI space so people can start to use them with more authority. And that brings us to the end. So, a couple of takeaways. Number one: GenAI is fantastic. This is the next big change, but as you use GenAI, make sure you check your work. I know it can be cumbersome to check your work across multiple LLMs, but do so if you can. At the end of the day, always put on your PM hat: apply your product sense and ask, am I addressing the right problem for the right customer? Am I being empathetic? What is my communication? Who should I be informing? Who are my stakeholders? All of that great stuff. Also, be careful about the data you expose. With some of these models, you don't know where the data goes and whether it might be used to train their LLMs. For example, I'm admittedly biased toward Amazon Bedrock, but Amazon Bedrock keeps the data within your AWS account. If you go with OpenAI's ChatGPT or Claude and upload proprietary information, you don't know where it will go, and it might be used to train their models. And lastly, like I said before, re-emphasizing product sense: there are no shortcuts. Just be very thoughtful in how you use this. Thank you. I hope this was useful, and please reach out if you have any questions.