Advanced Text Generation and Prompt Engineering with Vercel AI SDK - Part 4
In Part 3, we explored basic text completions. Now, let's take it a step further. "Text Generation" can encompass a wide range of tasks, from writing creative stories and poems to generating code, marketing copy, or detailed explanations. The key to unlocking these capabilities often lies in prompt engineering – the art and science of crafting effective prompts to guide the LLM.

Technologies Used
This post will build upon our completion example in Part 3, focusing on:
- Strategies for more sophisticated prompts.
- Using system messages to set context or persona.
- Few-shot prompting techniques.
- Controlling output parameters like temperature and max tokens.
Prerequisites
- Familiarity with the setup from Part 3 (using `streamText` and `useCompletion`).
The Power of Prompt Engineering
The same `streamText` function and `useCompletion` hook can produce vastly different results based on the prompt. Here are some techniques:
Step 1: Clear Instructions and Context
Be explicit about what you want; a small prompt-builder sketch follows the examples below.
- Bad Prompt: "Write about dogs."
- Good Prompt: "Write a 100-word blog post introduction for an article titled 'The Joys of Adopting a Senior Dog,' focusing on the emotional benefits for both the dog and the owner. The tone should be heartwarming and encouraging."
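One way to keep prompts this explicit and consistent is to template the constraints. Here's a minimal sketch; `buildBlogIntroPrompt` and its options are hypothetical helpers for illustration, not part of the SDK:

```ts
// A hypothetical helper that bakes explicit constraints into the prompt.
// Nothing here is SDK-specific; it just produces a well-specified string.
interface IntroPromptOptions {
  title: string;
  focus: string;
  tone: string;
  wordCount: number;
}

function buildBlogIntroPrompt({ title, focus, tone, wordCount }: IntroPromptOptions): string {
  return `Write a ${wordCount}-word blog post introduction for an article titled '${title},' focusing on ${focus}. The tone should be ${tone}.`;
}

// Usage: reproduces the "good prompt" above.
const prompt = buildBlogIntroPrompt({
  title: 'The Joys of Adopting a Senior Dog',
  focus: 'the emotional benefits for both the dog and the owner',
  tone: 'heartwarming and encouraging',
  wordCount: 100,
});
```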
Step 2: System Messages (for Chat Models)
When using chat models (like `gpt-3.5-turbo` or `gpt-4o`) via `streamText` (by passing a `messages` array instead of a single `prompt` string), you can use a "system" message to set the overall behavior or persona of the AI.
API Route Modification (`app/api/generate-text/route.ts` - a new route, or an adaptation of the completion route from Part 3):
```ts
// app/api/generate-text/route.ts
import { createOpenAI } from '@ai-sdk/openai';
import { streamText, type CoreMessage } from 'ai';

export const runtime = 'edge';
export const maxDuration = 60; // Longer tasks might need more time

const openai = createOpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

export async function POST(req: Request) {
  // `useCompletion` sends the prompt under the `prompt` key;
  // the remaining fields come from the `body` option on the client.
  const { prompt, systemMessage, temperature, maxTokens } = await req.json();

  if (!prompt) {
    return new Response('Prompt is required', { status: 400 });
  }

  const messages: CoreMessage[] = [];
  if (systemMessage) {
    messages.push({ role: 'system', content: systemMessage });
  }
  messages.push({ role: 'user', content: prompt });

  const result = await streamText({
    model: openai.chat('gpt-4o'), // Or your preferred chat model
    messages,
    temperature: temperature ?? 0.7, // Default 0.7
    maxTokens: maxTokens ?? 500, // Default reasonable limit
    // topP, frequencyPenalty, presencePenalty can also be set
  });

  return result.toAIStreamResponse();
}
```
Client-Side (`app/advanced-textgen/page.tsx`):
```tsx
// app/advanced-textgen/page.tsx
'use client';

import { FormEvent, useState } from 'react';
import { useCompletion } from 'ai/react';

export default function AdvancedTextGenPage() {
  const [userPrompt, setUserPrompt] = useState('');
  const [systemMessage, setSystemMessage] = useState('');
  const [temperature, setTemperature] = useState('0.7');
  const [maxTokens, setMaxTokens] = useState('500');

  const { completion, complete, isLoading, stop } = useCompletion({
    api: '/api/generate-text', // Our new/updated API endpoint
  });

  const handleSubmit = async (e: FormEvent<HTMLFormElement>) => {
    e.preventDefault();
    // The `useCompletion` hook's `complete` function can take the prompt
    // and an optional `body` for additional parameters.
    complete(userPrompt, {
      body: {
        systemMessage,
        temperature: parseFloat(temperature),
        maxTokens: parseInt(maxTokens),
      },
    });
  };

  return (
    <div className="flex flex-col w-full max-w-xl py-12 mx-auto">
      <h1 className="text-2xl font-bold mb-6">Advanced Text Generation</h1>
      <form onSubmit={handleSubmit} className="space-y-4">
        <div>
          <label htmlFor="system-message" className="block text-sm font-medium text-gray-700">
            System Message (Persona/Context):
          </label>
          <textarea
            id="system-message"
            className="w-full p-2 border border-gray-300 rounded shadow-sm text-black"
            rows={2}
            value={systemMessage}
            placeholder="e.g., You are a sarcastic pirate captain."
            onChange={(e) => setSystemMessage(e.target.value)}
          />
        </div>
        <div>
          <label htmlFor="user-prompt" className="block text-sm font-medium text-gray-700">
            User Prompt:
          </label>
          <textarea
            id="user-prompt"
            className="w-full p-2 border border-gray-300 rounded shadow-sm text-black"
            rows={4}
            value={userPrompt}
            placeholder="e.g., Tell me a short tale about finding treasure."
            onChange={(e) => setUserPrompt(e.target.value)}
            required
          />
        </div>
        <div className="grid grid-cols-2 gap-4">
          <div>
            <label htmlFor="temperature" className="block text-sm font-medium text-gray-700">
              Temperature (0-1): {temperature}
            </label>
            <input
              type="range"
              id="temperature"
              min="0" max="1" step="0.1"
              value={temperature}
              onChange={(e) => setTemperature(e.target.value)}
              className="w-full"
            />
          </div>
          <div>
            <label htmlFor="max-tokens" className="block text-sm font-medium text-gray-700">
              Max Tokens: {maxTokens}
            </label>
            <input
              type="number"
              id="max-tokens"
              min="50" max="4000" step="50"
              value={maxTokens}
              onChange={(e) => setMaxTokens(e.target.value)}
              className="w-full p-2 border border-gray-300 rounded text-black"
            />
          </div>
        </div>
        <div className="flex items-center space-x-2">
          <button
            type="submit"
            disabled={isLoading}
            className="px-4 py-2 bg-green-500 text-white rounded hover:bg-green-600 disabled:bg-gray-300"
          >
            {isLoading ? 'Generating...' : 'Generate Text'}
          </button>
          {isLoading && (
            <button
              type="button"
              onClick={stop}
              className="px-4 py-2 bg-red-500 text-white rounded hover:bg-red-600"
            >
              Stop
            </button>
          )}
        </div>
      </form>
      {completion && (
        <div className="mt-6 p-4 border border-gray-200 rounded bg-gray-50">
          <h3 className="text-lg font-semibold mb-2">Generated Text:</h3>
          <p className="whitespace-pre-wrap text-gray-800">{completion}</p>
        </div>
      )}
    </div>
  );
}
```
Now you can set a system message like "You are a helpful assistant that explains complex topics in simple terms" and then a user prompt like "Explain quantum entanglement."
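Under the hood, the route from above would assemble a `messages` array like this for that request. A minimal sketch with the example values filled in:

```ts
import { type CoreMessage } from 'ai';

// The payload the route builds for the example request above.
const messages: CoreMessage[] = [
  {
    role: 'system',
    content: 'You are a helpful assistant that explains complex topics in simple terms.',
  },
  { role: 'user', content: 'Explain quantum entanglement.' },
];
```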
Step 3: Few-Shot Prompting
Provide examples within your prompt to show the LLM the desired output format or style.
Example User Prompt (for the API above, no system message needed here):
```
Translate the following English phrases to French:
sea otter => loutre de mer
peppermint => menthe poivrée
cheese => fromage
cherry =>
```
Given these examples, the LLM will most likely complete the pattern and output just "cerise". A short sketch of building such a prompt in code follows the list below.
This technique is powerful for tasks like:
- Classification: "Classify the sentiment of these reviews: [example positive], [example negative], Review: [new review] => Sentiment:"
- Style imitation: Provide a few sentences in a specific writing style, then ask it to continue.
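To wire this into the page above, you can assemble the few-shot prompt as a plain string and pass it to `complete`. This is a minimal sketch; `fewShotTranslate` and its example pairs are illustrative, not part of the SDK:

```ts
// Build a few-shot prompt from example pairs plus the new input.
// The trailing "=>" invites the model to complete the pattern.
function fewShotTranslate(examples: [string, string][], word: string): string {
  const shots = examples.map(([en, fr]) => `${en} => ${fr}`).join('\n');
  return `Translate the following English phrases to French:\n${shots}\n${word} =>`;
}

const prompt = fewShotTranslate(
  [
    ['sea otter', 'loutre de mer'],
    ['peppermint', 'menthe poivrée'],
    ['cheese', 'fromage'],
  ],
  'cherry',
);

// With the client from Step 2, send it like any other prompt:
// complete(prompt, { body: { temperature: 0.2, maxTokens: 20 } });
```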
Step 4: Controlling Output Parameters
- Temperature: (0.0 - 2.0, typically 0-1 for most models). Lower values (e.g., 0.2) make the output more deterministic and focused. Higher values (e.g., 0.8) make it more random and creative.
- Max Tokens (`maxTokens`): Limits the length of the generated response. Useful for controlling costs and response time.
- Top P (`topP`): Nucleus sampling. The model samples from only the smallest set of tokens whose cumulative probability exceeds `topP`.
- Frequency/Presence Penalty (`frequencyPenalty`, `presencePenalty`): Discourage repetition of tokens.
We added `temperature` and `maxTokens` to our API and client example above; the sketch below shows the remaining parameters in place.
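The remaining knobs slot into the same `streamText` call. A minimal sketch, assuming the setup from Step 2; the specific values and the prompt are illustrative:

```ts
import { createOpenAI } from '@ai-sdk/openai';
import { streamText } from 'ai';

const openai = createOpenAI({ apiKey: process.env.OPENAI_API_KEY });

// Illustrative settings: low temperature for focused output,
// nucleus sampling at 0.9, and mild penalties against repetition.
const result = await streamText({
  model: openai.chat('gpt-4o'),
  prompt: 'Summarize the benefits of unit testing in three sentences.',
  temperature: 0.2,
  maxTokens: 200,
  topP: 0.9,
  frequencyPenalty: 0.5, // Penalize tokens proportionally to how often they've appeared
  presencePenalty: 0.5, // Penalize tokens that have appeared at all
});
```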
Step 5: Experiment!
The best way to master text generation is to experiment.
- Try different personas in the system message.
- Craft few-shot prompts for various tasks (e.g., writing a tweet, generating a function signature, summarizing a news headline).
- Play with temperature to see how it affects creativity vs. predictability.
- If you're trying to get structured output (like JSON), be very specific in your prompt about the desired format, as in the sketch after this list. Hint: Part 5 will cover a better way for structured JSON!
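A format-pinned prompt might look like the following. This is an illustrative prompt string only; the field names and text are made up for the example:

```ts
// An illustrative prompt that pins down the exact JSON shape expected.
// Without schema enforcement (covered in Part 5), treat the output as untrusted.
const structuredPrompt = `Extract the product name, price, and currency from the text below.
Respond with ONLY a JSON object in exactly this format, with no extra commentary:
{"name": string, "price": number, "currency": string}

Text: The new UltraWidget 3000 is available today for $49.99.`;
```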
Key Takeaways
- Prompt engineering is a critical skill for effective LLM use.
- System messages set the stage for chat model interactions.
- Few-shot examples can significantly improve output quality and adherence to format.
- Parameters like `temperature` and `maxTokens` offer fine-grained control over the generation process.
- The Vercel AI SDK's `streamText` (with `messages`) and `useCompletion` (with a `body` for extra params) give you the flexibility to implement these advanced techniques.
What's Next?
Sometimes, you don't just want free-form text; you need structured data, like a JSON object. In Part 5, we'll explore the Vercel AI SDK's `generateObject` function, which is designed specifically for generating typed, structured objects using LLMs, often with schema validation.