Prompt Engineering Best Practices: Hack and Track [AI Today Podcast] AI transcript and summary - episode of podcast AI Today Podcast: Artificial Intelligence Insights, Experts, and Opinion
Episode: Prompt Engineering Best Practices: Hack and Track [AI Today Podcast]
Author: AI & Data Today
Duration: 00:09:18
Episode Shownotes
Experimenting, testing, and refining your prompts are essential. The journey to crafting the perfect prompt often involves trying various strategies to discover what works best for your specific needs. A best practice is to constantly experiment, practice, and try new things using an approach called “hack and track”. This is where you use a spreadsheet or other method to track what prompts work well as you experiment. Continue reading Prompt Engineering Best Practices: Hack and Track [AI Today Podcast] at Cognilytica.
Full Transcript
00:00:01 Speaker_00
The AI Today podcast, produced by Cognilytica, cuts through the hype and noise to identify what is really happening now in the world of artificial intelligence.
00:00:10 Speaker_00
Learn about emerging AI trends, technologies, and use cases from Cognilytica analysts and guest experts.
00:00:22 Speaker_01
Hello, and welcome to the AI Today podcast. I'm your host, Kathleen Walch. And I'm your host, Ron Schmelzer.
00:00:28 Speaker_01
And we've had a lot of great feedback on our prompt engineering series, mainly because prompt engineering and generative AI are the entry point for so many people into the world of AI. And it is proving to be highly valuable.
00:00:44 Speaker_01
People are getting a ton of real results from these generative AI systems, whether they're generating text or images or audio; just in general, all these uses of large language models. And we are making just as much use of them as everybody else is.
00:00:57 Speaker_01
So while, you know, yes, we are using the AI systems, we're not building
00:01:02 Speaker_01
new AI models, we're not creating a lot of the AI systems that we spend so much time talking about on our AI Today podcast, especially in the context of the CPMAI methodology, the Cognitive Project Management for AI methodology. It's still highly valuable to talk about prompt engineering and the use of prompts, primarily because it's a way that so many people get value today from AI systems.
00:01:27 Speaker_03
Exactly, you know, and Ron and I were having a conversation a few days ago talking about, because he says that, you know, prompt engineering is now kind of that gateway into AI, where a few years ago it was RPA, robotic process automation, that everybody was talking about.
00:01:42 Speaker_03
And now everybody's talking about, you know, large language models, prompt engineering, and getting better at that. So we wanted to spend time on today's podcast, continuing in our Prompt Engineering Best Practices series.
00:01:55 Speaker_03
And for today's podcast, we're going to be discussing hack and track. So if you've followed along with our newsletters, then great, because you're getting some insights there. And if you're not, I will link to it in the show notes.
00:02:07 Speaker_03
You can also find it on LinkedIn, and I encourage you to read and subscribe to our newsletter because we provide a lot of great, you know, additional value in that newsletter, as well as links to our upcoming talks that we do.
00:02:22 Speaker_03
Ron and I do a lot of virtual and in-person presentations and keynotes. So you can see all of that there. But for today's podcast, we're going to be talking about hack and track. So what is it? Why is it a best practice?
00:02:34 Speaker_03
And then you can reflect and see if you're doing this at your organization. So if you've been doing prompts, you know that experimenting, testing, and refining your prompts really is essential.
00:02:48 Speaker_03
We've gone through different prompt patterns in previous podcasts on this topic. And you'll know that some prompts are better for certain tasks than others. And also, it's really an art. So you will only get better with time.
00:03:04 Speaker_03
So the journey to crafting that perfect prompt often involves trying different strategies. And you need to figure out what's going to work best for your specific need.
00:03:13 Speaker_03
If you've been doing prompts, you know it's really rare to get just the perfect prompt exactly right the first time. It's going to take iterations. And that's okay, because this really should be an iterative process.
00:03:27 Speaker_03
We always say, think big, start small, and iterate often. Continue to iterate. Try different prompts. Analyze your response. You're going to want to tweak things. Ron and I had a conversation yesterday with someone
00:03:38 Speaker_03
And he said, you know, one word or one letter difference sometimes mattered in prompts. That's how specific you need to get. So this is why it really is an art, you know, and you can type something. I can type something.
00:03:52 Speaker_03
Ron can type something that I think looks pretty similar. And we get very different results because, you know, whatever it was with that one tweak really produced different results. So this is what happens.
00:04:05 Speaker_03
And it's important to know that that does happen and that's okay. These large language models are also evolving constantly. So what works yesterday may not work tomorrow.
00:04:17 Speaker_03
If you've been experimenting with these, sometimes they get upgrades as well, or maybe you've upgraded and paid for a version that gives you access to things you might not otherwise have had.
00:04:30 Speaker_03
As a result, these prompts, you know, and your results are going to change and differ.
00:04:36 Speaker_03
So a best practice right now is this approach called hack and track, which is where you use a spreadsheet or another method to track your prompts as you experiment.
00:04:47 Speaker_03
And so you can kind of, you know, say, okay, this is the prompt I used, this is the results that I got, and then rate it on a number of different factors.
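A minimal sketch of what such a tracking log could look like in code, assuming Python and a plain CSV file (the file name, columns, and 1-to-5 scale here are illustrative, not a prescribed format):

```python
import csv
from datetime import date
from pathlib import Path

LOG = Path("prompt_log.csv")  # hypothetical log file
FIELDS = ["date", "task", "prompt", "model", "result_summary", "rating_1_to_5"]

def track(task: str, prompt: str, model: str, result_summary: str, rating: int) -> None:
    """Append one prompt experiment to the hack-and-track log."""
    is_new = not LOG.exists()
    with LOG.open("a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()  # write the header row only once
        writer.writerow({
            "date": date.today().isoformat(),
            "task": task,
            "prompt": prompt,
            "model": model,
            "result_summary": result_summary,
            "rating_1_to_5": rating,
        })

# Example: track("summarize report", "Summarize this report in 5 bullets: ...",
#                "gpt-4", "good summary, slightly too long", 4)
```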
00:04:55 Speaker_01
Yeah, and I think the idea of variability and things not necessarily always working the exact same way is a little bit kind of the point of AI, right?
00:05:03 Speaker_01
Because as we talk about often, especially in our trainings, you know, that the whole purpose, the reason why we use AI is because other approaches don't work, right?
00:05:14 Speaker_01
So the other approach might be, well, I can program something or I can use a rules-based system. And that works if we always want to get the same output for the same input. There's no reason to use a probabilistic system for that.
00:05:25 Speaker_01
I can write an easy script or I can create a program or a flow chart, whatever I want to do. And I can say, whenever the user types these exact words, provide these exact outputs. Or like, that's how websites work.
00:05:37 Speaker_01
When they click on that page, show them this content, right? If that's sufficient, that's great.
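As a concrete illustration of that contrast, a rules-based responder is just a deterministic lookup; the same input always produces the same output (a hypothetical sketch in Python):

```python
# A deterministic, rules-based responder: no probability involved,
# so identical inputs always yield identical outputs.
RULES = {
    "hours": "We are open 9am to 5pm, Monday through Friday.",
    "refund": "Refunds are processed within 5 business days.",
}

def respond(user_input: str) -> str:
    # Exact match only; anything unrecognized gets a fixed fallback.
    return RULES.get(user_input.strip().lower(), "Sorry, I don't understand.")
```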
00:05:43 Speaker_01
On the other hand, if we have a lot of variability, like everything's changing all the time, the inputs aren't looking the same, the documents are changing, well, if we didn't have AI, that's where we would need a person, someone who needs to look at the email and make a decision or look at the document or write a document or do those sorts of things.
00:05:59 Speaker_01
It's because we want machines to do that, that we have to deal with variability, which means, by definition, these AI models, they will probably never generate the exact same output twice.
00:06:10 Speaker_01
It may look very similar, but if you compare, things might be slightly different, right? No two people are going to necessarily use the exact same prompts. Now, of course, they could cut and paste, and you could
00:06:21 Speaker_01
actually use the same prompts, but the models might be different. Between the time that I tried this prompt, let's say a month or two ago, and you've tried it, the model might have changed, as Kathleen mentioned.
00:06:31 Speaker_01
Maybe they've added some moderation that is preventing you from doing some things.
00:06:35 Speaker_01
Maybe, you know, we talk a lot about the context window too, and I think there's sometimes a little confusion here because the context window is all the things that the LLM is going to consider when it tries to generate the response.
00:06:46 Speaker_01
So it's your prompt and, of course, anything that the LLM has responded to your prompt and all of your follow-ups. And there's usually a limit. Now the limits are getting pretty big, right?
00:06:56 Speaker_01
You could put like entire books in here and you can ask questions about that. However, the confusion is that the output windows are almost always very small.
00:07:07 Speaker_01
So I think someone was saying even GPT-4 has something like, I don't know, a 32 or 64K token input window. The output window is still only like a few thousand tokens.
00:07:16 Speaker_01
So if you're trying to get this system to generate a lot of output, you will find that it'll stop, or it might say network error, or it might summarize and go dot, dot, dot, or something like that. That's because the output windows are small.
00:07:29 Speaker_01
So even with your prompts, you might find that the outputs are getting truncated, they're getting summarized. So you need to keep track of all this.
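One way you might detect that truncation programmatically, sketched here against the OpenAI Python client (v1+); the model name and token cap are illustrative, and other providers expose a similar finish reason on their responses:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

response = client.chat.completions.create(
    model="gpt-4",  # illustrative model name
    messages=[{"role": "user", "content": "Write a detailed report on ..."}],
    max_tokens=1000,  # the output window cap
)

choice = response.choices[0]
if choice.finish_reason == "length":
    # The model hit the output limit: the text stopped, it didn't finish.
    print("Output truncated; ask it to continue, or break the task into chunks.")
print(choice.message.content)
```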
00:07:39 Speaker_01
And I think that's why this is the best practice, because one, you want to document how things are working for yourself. Two, you might want to share it with others, because they might want to do the same thing.
00:07:50 Speaker_01
Three, you might want to see the different LLMs and their sensitivities and maybe how they're changing over time. So obviously, if you're going to be depending on it, if generative AI is going to be important to you,
00:08:02 Speaker_01
Leaving everything just in your chat window and saying, now, which one was it that had this prompt, is probably not the best approach. So I mean, this may seem obvious, but keep track of it.
00:08:13 Speaker_01
And the hack part comes from trying different things, not necessarily hacking into something. It's more like hacking around, trial and error, which is always a good approach.
00:08:22 Speaker_01
So let me get into, we could talk a little about maybe some of the things that you should consider as you're tracking the things you're putting into the prompts and the stuff that you're getting back from the generative AI systems.
00:08:35 Speaker_03
Right, because you want to make sure that it's useful so that anybody can go back and understand what was done. Because the point of this is not just to have it for you, but to have it for maybe your team or group or organization-wide.
00:08:49 Speaker_03
So when we do this, we want to make sure, you know, some of the criteria that you can use when setting up your hack and track sheet should include the name of the task or the query, what prompt pattern or patterns you used,
00:09:05 Speaker_03
What LLM was used and the version as well. Maybe also the date that you ran this because that can matter. And what prompt chaining approach you used, if any. And then when you look at the accuracy of the output, you're going to want to measure that.
00:09:25 Speaker_03
So, of course, this is going to be determined by each person, but you'll want to measure the output accuracy maybe on a scale of 1 to 5 or 1 to 10, whatever you feel is the best number for your organization, with 1 being poor and 5 or 10 being the best.
00:09:42 Speaker_03
You also want to look at the output relevance, because we know that you're going to get an answer no matter what. These systems can hallucinate. They can provide just false and inaccurate information. So what was the relevance and the accuracy?
00:09:56 Speaker_03
You know, were they right? And the output conciseness, was it too short? Was it too long? Was it just right? Like Goldilocks, as we always say. And so you're going to want to put that on a scale of one to 10. And then also the need for follow-up prompts.
00:10:11 Speaker_03
if follow-up was not desired or not requested. So how many times did you have to use follow-up prompts if you were just trying to do maybe a single-shot prompt? And then also prompt refusals and prompt incompletion. So you want to measure all of this.
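Pulled together, those criteria map naturally onto a record like the following, a sketch assuming Python dataclasses; the field names and the 1-to-5 scale are just one possible setup:

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class PromptExperiment:
    """One row in a hypothetical hack-and-track sheet."""
    task: str                        # name of the task or query
    prompt_pattern: str              # pattern(s) used, e.g. "flipped interaction"
    llm: str                         # which LLM, including the version
    run_date: date                   # when it ran; models change over time
    chaining: Optional[str] = None   # prompt chaining approach, if any
    accuracy: int = 0                # 1 (poor) to 5 (best)
    relevance: int = 0               # 1 to 5: on topic and correct?
    conciseness: int = 0             # 1 to 5: too short, too long, just right?
    followups_needed: int = 0        # follow-up prompts beyond the single shot
    refused: bool = False            # did the model refuse the prompt?
    incomplete: bool = False         # did the output cut off unfinished?
    author: Optional[str] = None     # person or group, if you record it
```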
00:10:26 Speaker_03
Now, there may be other factors at your organization that you think are important. And as you continue to use this method, you'll continue to iterate over time.
00:10:36 Speaker_03
Maybe you can also have the name of the person that did these prompts or the group that did these prompts, you know, however you'd like to organize it. Some people are nervous or feel uncomfortable with using prompts.
00:10:51 Speaker_03
You know, they don't want other people to judge them, so maybe they don't want their name in there. There's a lot to learn from this.
00:10:56 Speaker_03
You know, we say that one thing that is really important when it comes to prompt engineering in general is practice, practice, practice, and also learn from others because that's going to help you get creative.
00:11:08 Speaker_03
with your prompts, which really is the key to success. Even Ron and I, we read a lot, we talk to other people. The ways that they are doing some of these prompts are ways that we never would have even thought of.
00:11:20 Speaker_03
We're like, wow, you did that with prompt engineering? Wow, really? We can do it that way? And even Ron and I sometimes iterate back and forth. And he goes, well, I did it this way, so I try it. And I'm like, OK. We talk about data prep, right?
00:11:32 Speaker_03
Data is so important, even with prompts, and so we have to do some data preparation, data cleaning.
00:11:37 Speaker_03
It's like, okay, well, I have to play around and say, how do I clean this data so that I can have it be an input and it understands what I'm trying to do? And I've had to play around with that myself and continue to get more comfortable.
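A hypothetical sketch of that kind of cleanup in Python, stripping markup and normalizing whitespace before the text goes into a prompt (the size budget is illustrative):

```python
import html
import re

def clean_for_prompt(raw: str, max_chars: int = 8000) -> str:
    """Light data prep before pasting text into a prompt."""
    text = html.unescape(raw)             # decode entities like &amp;
    text = re.sub(r"<[^>]+>", " ", text)  # drop any HTML tags
    text = re.sub(r"\s+", " ", text)      # collapse runs of whitespace
    return text.strip()[:max_chars]       # stay within a rough size budget
```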
00:11:51 Speaker_03
So again, you just have to keep practicing and trying.
00:11:55 Speaker_03
That's why, you know, we think this is such a cool technique, because he can write a prompt that works for him, and then I go, oh, okay, let me try that. And because we're iterating and because we're tracking this, I say, oh, interesting, if I tweak it just a little,
00:12:11 Speaker_03
I get better results this way, you know, and then you continue to go back and iterate. So it really does help with that creativity aspect.
00:12:18 Speaker_03
And it helps with that collaboration aspect, the communication aspect, you know, all of these things are so incredibly important.
00:12:24 Speaker_03
And learning from others, because we always say, don't reinvent the wheel, don't struggle in silence, really learn to see what's out there.
00:12:31 Speaker_01
Yeah, it's interesting how much of prompt engineering really is about these non-technical things. That's why we call it the art of artificial intelligence.
00:12:44 Speaker_01
It's so weird to talk about creativity, but it's like as if you're learning how to draw, and all you see are examples. Until you're like, oh, I'm going to draw. And you draw whatever you want to draw, pencils, crayons, markers, whatever it is.
00:12:55 Speaker_01
And then you see somebody else's artwork, and you're like, wow, I had never thought about that.
00:13:01 Speaker_01
So now it kind of, you know, you may not necessarily copy it exactly, but it's like, oh, you think about it and you start changing the way you draw. And then you see somebody else, you're like, oh, that was something else I had never thought of.
00:13:12 Speaker_01
And it feels a lot the same way, in that people are starting to share prompts, they're trying to share ideas.
00:13:17 Speaker_01
And I think this is why, you know, hack and track is more than just for yourself, keeping track of things and seeing what's working. But this is where someone else could share their track sheet and say, oh, look, you know, I've tried this.
00:13:27 Speaker_01
And you may be like, hey, that's really interesting. I have a need that's not too different from that. I might try that general approach.
00:13:35 Speaker_01
This actually happened recently, you know, we were doing some education around prompt and prompt engineering, and there's many, many different prompt patterns.
00:13:43 Speaker_01
But sometimes it feels like the people who come up with prompt patterns, they're not technical people. I mean, they're just like enthusiastic users, I guess.
00:13:51 Speaker_01
Maybe some of them are like content marketing people who just like to do stuff for, you know, all of the content marketing channels. They always want to do something cool and different and unique, but sometimes they're trying stuff and they feel like,
00:14:02 Speaker_01
they found this really weird trick. That's actually an SEO trick, "with this one simple trick." It's never one simple trick. But there's always these interesting approaches. One approach is something called flipped interaction, which I had never tried.
00:14:17 Speaker_01
But then we tried it, and it's like, oh, this is really interesting. And we obviously knew about it. And the idea is, if something does require many steps to do some task, or it's a complicated problem that you
00:14:28 Speaker_01
don't really know all the steps, you can actually ask the LLM to tell you. You could say, hey, I'm looking at buying a house, but I don't know what I should consider.
00:14:39 Speaker_01
And instead of you guessing, right, and then writing that in a prompt, you could say, ask me a bunch of questions until you can provide a response that is significant or has enough of the criteria. And it'll do that.
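In code, flipping the interaction is nothing more than the wording of the prompt itself; a hypothetical example using the chat-message structure most LLM APIs share:

```python
# Flipped interaction: instead of guessing the criteria yourself,
# instruct the model to interview you until it has what it needs.
flipped_prompt = (
    "I am looking at buying a house, but I don't know what I should consider. "
    "Ask me questions, one at a time, until you have enough information "
    "to recommend what I should look for. Then give me your recommendation."
)
messages = [{"role": "user", "content": flipped_prompt}]
# Each model question and each of your answers gets appended to `messages`,
# so the whole back-and-forth stays inside the context window.
```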
00:14:55 Speaker_01
And it's kind of cool how that works. You're like, I never thought about that. Was there anything programmatically different about the LLMs? No, it's the same LLMs, the same chat system, the same generative AI.
00:15:04 Speaker_01
It's just you figured out a different way to interact with it. So I think this is why we like this idea of hack and track, not just for yourself to keep an eye on things, especially as things are changing.
00:15:12 Speaker_01
Different models also have different sensitivities. So you can include some notes where you could say this one model really likes it when you provide positive reinforcement.
00:15:21 Speaker_01
to the model or you're like, you know, every time you get the right answer or you capitalize things correctly, I'll give you a $200 tip. I know it's weird. But sometimes that works.
00:15:30 Speaker_01
Or you can do negative reinforcements like, you know, every time you put things in the wrong order, I will, you know, penalize you a point or something like that. Some weird magical thing. Sometimes different models have different sensitivities.
00:15:44 Speaker_01
You know, that's the thing about these neural nets. They're black boxes. We don't know exactly how one thing relates to another, so they have weird sensitivities. But I think this is basically it.
00:15:55 Speaker_01
Now, if you're looking for an example of a hack and track sheet, we do have an example in our newsletter, which is on LinkedIn. So please do find us. Go to Cognilytica, find us on LinkedIn. We'll provide a link to that in our show notes as well.
00:16:08 Speaker_01
take a look at the Hack and Track newsletter edition and you'll see an example of how that sheet could work. But then again, that's just, you know, one possible way of setting that up.
00:16:18 Speaker_03
Exactly. And if you're doing this at your organization and you can share it, we would love to see what your spreadsheets look like and how you're tracking that.
00:16:27 Speaker_03
As we continue to have conversations with folks, you know, these best practices are emerging and evolving, and, you know, different things continue to become best practices because prompt engineering is still fairly new.
00:16:42 Speaker_03
So, you know, we would love to see what our listeners are doing. I know that we've talked to some of you already and that is wonderful, but definitely you can reach out on LinkedIn, either personally to me and Ron or through Cognilytica.
00:16:54 Speaker_03
You can email us at info@cognilytica.com or you can, you know, go to our website as well at cognilytica.com. But this really is an overview of hack and track, what it is, how to use it, why it's important to have.
00:17:09 Speaker_03
On a follow-up podcast, continuing our Prompt Engineering Best Practices series, we'll be talking about using external data and documents.
00:17:17 Speaker_03
And I'll also link to all of our previous Prompt Engineering Best Practices podcasts so that you can listen to them if you haven't already, or if you'd just like to re-listen to them for a refresher.
00:17:28 Speaker_03
If you haven't done so already, make sure to subscribe to AI Today so you can get notified of all of our upcoming episodes. Like I said, we have another one in this Prompt Engineering Best Practices series.
00:17:37 Speaker_03
We have some great interviews lined up as well. And then we have some additional topics coming up.
00:17:43 Speaker_02
Like this episode and want to hear more? With hundreds of episodes and over 3 million downloads, check out more AI Today podcasts at aitoday.live.
00:17:52 Speaker_02
Make sure to subscribe to AI Today if you haven't already on Apple Podcasts, Spotify, Stitcher, Google, Amazon, or your favorite podcast platform. Want to dive deeper and get resources to drive your AI efforts further?
00:18:05 Speaker_02
We've put together a carefully curated collection of resources and tools, handcrafted for you, our listeners, to expand your knowledge, dive deeper into the world of AI, and provide you with the essential resources you need.
00:18:18 Speaker_02
Check it out at aitoday.live slash list. This sound recording and its contents are copyright by Cognilytica. All rights reserved. Music by Matsu Gravas. As always, thanks for listening to AI Today, and we'll catch you at the next podcast.