Google's Gems are a gentle introduction to AI prompt engineering
Prompt engineering has emerged as one of the important new tech skills in the age of generative artificial intelligence (Gen AI). More an art than a science, engineering a good prompt involves crafting the right requests to make a chatbot, such as ChatGPT or Google's Gemini, do what you want.
A good prompt can sometimes be the difference between halfway-decent and terrible output from a bot.
A new feature of Google's Gemini large language model, Gems, introduced last week, offers a crash course in prompt engineering. The feature is worth checking out if you spend much time working with Gen AI or intend to use the technology extensively.
Gems are focused chat sessions you can save in your Gemini dashboard. They are supposed to help with tasks, such as brainstorming a corporate strategy, refining your study habits, or improving your writing.
Gems are similar to other approaches that let a user of Gen AI craft a prompt and save the prompt for later use. For example, OpenAI offers its marketplace for GPTs developed by third parties.
You can think of Gems as a more basic kind of app to build on for your own purpose.
Gems are also similar to ChatGPT's custom instructions, which are prompt material you save in your settings that ChatGPT is supposed to incorporate when responding. The difference between the two is that custom instructions are meant to apply to every ChatGPT conversation, whereas a Gem's instructions apply only to that individual Gem.
Gems may be available to some, but not all, users of Google's Gemini mobile app on Android. If you don't see Gems there, go to gemini.google.com. Gems don't yet work at all on the iOS app for iPhone and iPad; Apple users will have to use Gemini on the Web.
Gems are available only to subscribers to Google's Gemini Advanced plan, which starts at $19.99 a month as part of a Google One subscription, or to the business version.
If you're making your own Gem from scratch, you'll start by going to the Gem manager screen from the Gemini sidebar.
Click "New Gem" and give your Gem a name and/or description, such as "French tutor". Then, you'll enter instructions. This is the important part. The instructions tell Gemini what the function of this Gem is supposed to be -- "help me to learn the French language", for example -- and how you would like it to proceed, such as the conversation style. There are no hints here, so you're left to develop your own prompting style.
You can, however, get some hints by using one of five pre-built Gems that Google provides in the Gem manager: Brainstormer, Career guide, Coding partner, Learning coach, and Writing editor.
When you make a copy of any of these Gems, using the little "copy" icon, the copy reveals all the instructions that Google has filled out for the Gem. Think of it as a template for prompt engineering from which you can build. You can put your own instructions in the instructions field, adding to, removing, or modifying the boilerplate that Google has provided.
You can add more prompt elements later if you think of them. Just go back to the sidebar and back into the Gem manager screen, and select the pencil icon next to the Gem you want to edit.
When you call up one of the Gems from the sidebar, you start typing to it at the prompt, just like with any chat experience.
To test Gems, I copied the Brainstormer Gem and tried getting help with a sales plan for a subscription tech newsletter. I titled it "Sales coach" and edited Google's boilerplate instructions for the Brainstormer Gem, replacing the prompt text with my modifications.
For example, for the first line of the prompt -- "Purpose" -- I inserted: "Your purpose is to guide me in crafting sales tactics and strategy. You'll help me reflect on what's working and not working with a given prospect." I added several requirements, such as, "Explain the logic behind each proposed sales tactic or strategy."
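For readers who like to draft prompts outside the browser, the purpose-plus-requirements structure described above can be sketched in a few lines of Python. This is only an illustration: the function name `build_instructions` and the "Purpose:" and "Requirements:" labels are my own assumptions, and the result is plain text you would paste into the Gem's instructions field by hand, not a call to any Google API.

```python
def build_instructions(purpose, requirements):
    """Assemble instruction text from a purpose line and a list of requirements.

    The section labels are hypothetical; a Gem accepts free-form text.
    """
    lines = ["Purpose: " + purpose.strip()]
    if requirements:
        lines.append("Requirements:")
        lines.extend(f"- {item.strip()}" for item in requirements)
    return "\n".join(lines)


# The purpose and requirement text are taken from the Sales coach example.
sales_coach = build_instructions(
    "Your purpose is to guide me in crafting sales tactics and strategy. "
    "You'll help me reflect on what's working and not working with a given prospect.",
    ["Explain the logic behind each proposed sales tactic or strategy."],
)
print(sales_coach)
```

Keeping the purpose and the requirements as separate pieces makes it easier to reuse one list of requirements across several Gems.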
After making all the modifications, I pressed the "Save" button.
From there, I engaged in a chat with the bot. I explained an effort to sell a particular prospect a $30 subscription to a technology newsletter that would provide investment advice. I began with the prompt, "I'd like to formulate a plan to sell my subscription product to a prospective customer."
I proceeded through numerous rounds of question and answer with the bot, for about a half hour, which included working back and forth drafting a letter to the prospect, and culminated in a role-playing Zoom call where the Gem played the role of a prospect skeptical of the sales pitch. I was also challenged to enter compelling responses.
The Gem assessed my performance as the salesperson -- "demonstrated a good grasp of sales fundamentals while navigating the challenges presented by a hesitant prospect" -- and even offered several areas for improvement: "Your communication style could have been slightly warmer and more engaging."
Not being a career salesperson, I have no idea if all of this advice amounts to good coaching. It probably doesn't rise to the level of a legendary coach, such as Jordan Belfort, the Wolf of Wall Street, and his Straight Line System.
Nevertheless, it seems there's some value here. Having the transcript of the entire chat, which is saved in the sidebar, is a nice takeaway if you want to go back and review the chat session.
Some limitations are glaringly obvious after going through the exercise. One is that the Gem, while consistent in tone during the half-hour exchange, only moves forward and doesn't return to earlier points. In a real coaching session, the coach should be able to connect later turns of the conversation with earlier turns.
The same need to connect later turns with earlier ones applies, I think, to other collaborative activities, such as brainstorming a birthday party or working on a resume.
That limitation strikes me as a general issue with large language models. The model probably needs to make more effective use of the context window, which holds everything typed earlier in the exchange. I suspect that's an engineering challenge that requires further development of the underlying Gemini model.
Second, it appears the Gem relies on its very general knowledge of selling from within whatever training data was used to develop Gemini. For these focused use cases, I suspect the Gem app could benefit from retrieval-augmented generation (RAG), an increasingly popular Gen AI technique, where the AI model taps into an external database. That approach might allow the Gem to get more resources for domain-specific sales knowledge.
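To make the RAG idea concrete, here is a minimal sketch in Python. It is not how Gemini or any production system works: real RAG setups typically search a vector database using embeddings, whereas this toy version simply counts shared words. The sample notes, the function names, and the "Reference material" prompt layout are all hypothetical.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, documents, k=2):
    """Return up to k documents that share the most words with the query."""
    query_words = tokenize(query)
    scored = [(len(query_words & tokenize(doc)), doc) for doc in documents]
    scored = [pair for pair in scored if pair[0] > 0]  # drop unrelated documents
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:k]]


def augment_prompt(query, documents, k=2):
    """Prepend the retrieved passages to the user's question."""
    passages = retrieve(query, documents, k)
    if not passages:
        return query
    context = "\n".join(f"- {p}" for p in passages)
    return f"Reference material:\n{context}\n\nQuestion: {query}"


# Hypothetical stand-in for an external store of sales knowledge.
SALES_NOTES = [
    "Handle price objections by restating the value the subscriber receives.",
    "Open a cold email with a specific observation about the prospect.",
    "Water the office plants every Monday.",
]

query = "How do I handle a price objection from a prospect?"
print(augment_prompt(query, SALES_NOTES))
```

The point of the sketch is the shape of the technique: look up relevant material first, then hand it to the model alongside the question, so the answer is not limited to whatever was in the training data.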
Third, the underlying process might benefit from storing simple background knowledge in the form of sentences, which is something OpenAI offers in its "memory" function. Storing background knowledge in that way means someone could use a Gem without restating the same background in each chat.
For example, if you're a salesperson, you should be able to store background information such as, "I sell a subscription tech newsletter for $30", and have the Gem automatically incorporate that fact each time you have a chat.
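The stored-facts idea can be sketched as follows. This is an illustration of the concept only, not a description of how OpenAI's memory function is implemented; the class name `FactMemory` and the "Background about the user" wording are assumptions of mine.

```python
class FactMemory:
    """Keep short background sentences and prepend them to each new message."""

    def __init__(self):
        self._facts = []

    def remember(self, fact):
        """Store a fact once, ignoring blanks and exact duplicates."""
        fact = fact.strip()
        if fact and fact not in self._facts:
            self._facts.append(fact)

    def wrap(self, user_message):
        """Return the message with all stored facts placed ahead of it."""
        if not self._facts:
            return user_message
        background = "\n".join(f"- {fact}" for fact in self._facts)
        return f"Background about the user:\n{background}\n\n{user_message}"


memory = FactMemory()
memory.remember("I sell a subscription tech newsletter for $30")
memory.remember("I sell a subscription tech newsletter for $30")  # duplicate, ignored
print(memory.wrap("Help me plan a follow-up call."))
```

With something like this in place, the salesperson types only the new request, and the background travels along automatically.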
This brings me to the fourth and most glaring omission -- Gems have no record of past conversations. Even though a transcript of each chat with the Gem is stored, the Gem itself starts blank each time you use it. You can't ask the Gem to explore something from a prior session because that session is no longer part of the Gem's context window.
That's a big deficit if you want to return to use the Gem over and over. For example, if you want another coaching session, you should be able to explore the things that came up in a prior coaching session and improve upon that exchange, as an additive process, rather than starting from scratch.
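One way to picture the additive process is to save each session's turns and feed the most recent ones back in at the start of the next session. The Python sketch below does that with a local JSON file. It is a workaround a user could build by hand, not a Gems feature, and the file layout, function names, and `max_turns` limit are all hypothetical.

```python
import json


def save_session(path, turns):
    """Write a list of {"role": ..., "text": ...} turns to a JSON file."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(turns, handle)


def load_session(path):
    """Read the saved turns, or return an empty list if nothing was saved."""
    try:
        with open(path, encoding="utf-8") as handle:
            return json.load(handle)
    except FileNotFoundError:
        return []


def start_session(path, first_message, max_turns=6):
    """Build an opening prompt that carries the tail of the prior session."""
    prior = load_session(path)[-max_turns:]
    if not prior:
        return first_message
    recap = "\n".join(f"{turn['role']}: {turn['text']}" for turn in prior)
    return (
        f"Transcript of our previous session:\n{recap}\n\n"
        f"New session: {first_message}"
    )
```

Calling `start_session("coach.json", "Let's pick up where we left off.")` would return either the bare message, if no earlier session was saved, or the message preceded by the last few saved turns. Capping the recap at `max_turns` keeps the pasted history from crowding out the new conversation.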
Imagine having a real-world coach -- of any kind, sales, fitness, ice hockey, whatever -- who never remembered where you last left off in your long journey to get better. You'd probably seek a coach who paid more attention and had a memory.
Despite those shortcomings, Gems have the value of bringing a user up to speed on the basics of prompt engineering. That capability is useful for a generalist audience unaware that prompt engineering exists.