Prompt design for Gemini Nano

When using the Prompt API, there are specific strategies you can use to tailor your prompts and receive optimal results. This page describes best practices for formatting prompts for Gemini Nano.

For more general prompt engineering guidance, see the Prompt Engineering whitepaper, Prompt Engineering for Generative AI, and Prompt design strategies.

Prompt design best practices

When designing prompts for the Prompt API, use the following techniques:

  • Provide examples for in-context learning. Add well-distributed examples to your prompt to show Gemini Nano the kind of result you expect (see the first sketch after this list).

    Consider using the prefix caching feature when you use in-context learning, as providing examples makes the prompt longer and increases inference time.

  • Be concise. Verbose preambles with repeated instructions can produce suboptimal results. Keep your prompt focused and to the point.

  • Structure prompts to generate more effective responses, such as this sample prompt template that clearly defines instructions, constraints, and examples (a structured prompt sketch follows this list).

  • Keep output short. LLM inference speed depends heavily on output length. Carefully consider how you can generate the shortest possible output for your use case, then post-process it in code to produce the format you need (see the keyword-extraction sketch after this list).

  • Add delimiters. Use delimiters like <background_information>, <instruction>, and ## to separate the different parts of your prompt. Using ## between components is particularly critical for Gemini Nano, as it significantly reduces the chance that the model fails to correctly interpret each component. The structured prompt sketch after this list shows delimiters in use.

  • Prefer simple logic and a more focused task. If you find it challenging to achieve good results with a prompt that requires multi-step reasoning (for example: do X first; if the result of X is A, do M, otherwise do N; then do Y...), consider breaking the task up: let each Gemini Nano call handle a more focused task, and use code to chain multiple calls together (see the chaining sketch after this list).

  • Use lower temperature values for deterministic tasks. For tasks such as entity extraction or translation that don't rely on creativity, consider starting with a temperature value of 0.2 and tuning it based on your testing (a configuration sketch follows this list).
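
As a sketch of the in-context learning advice, a few-shot prompt might look like the following. The Kotlin here only builds the prompt string; the `generate` parameter is a hypothetical stand-in for whatever Prompt API call your app uses to run inference, not part of any specific SDK.

```kotlin
// Few-shot examples show the model the exact output format expected.
// `generate` is a placeholder for your app's Prompt API inference call.
suspend fun classifySentiment(
    review: String,
    generate: suspend (String) -> String,
): String {
    val prompt = """
        Classify the sentiment of the review as POSITIVE or NEGATIVE.

        Review: The battery lasts all day and the screen is gorgeous.
        Sentiment: POSITIVE

        Review: It stopped charging after a week.
        Sentiment: NEGATIVE

        Review: $review
        Sentiment:
    """.trimIndent()
    return generate(prompt).trim()
}
```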
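
The structured-prompt and delimiter advice combine naturally. In the sketch below, tags and ## markers separate instructions, constraints, and input; the specific tag names are illustrative, not required by the API.

```kotlin
// Delimiters make the boundary of each prompt component explicit, which
// helps Gemini Nano interpret every component correctly.
fun buildSummaryPrompt(article: String): String = """
    <instruction>
    Summarize the article below in one sentence.
    </instruction>
    ##
    <constraints>
    Use at most 20 words. Do not add opinions.
    </constraints>
    ##
    <article>
    $article
    </article>
""".trimIndent()
```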
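
To keep output short, ask the model for the simplest raw format and build the structured result in code. A minimal keyword-extraction sketch, again using the hypothetical `generate` function:

```kotlin
// Requesting a comma-separated list keeps the generated output short;
// the structured List<String> is assembled in code afterward.
suspend fun extractKeywords(
    text: String,
    generate: suspend (String) -> String,
): List<String> {
    val raw = generate("List the keywords in this text, separated by commas:\n$text")
    return raw.split(",").map { it.trim() }.filter { it.isNotEmpty() }
}
```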
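
For multi-step logic, the chaining sketch below keeps each model call focused and lets ordinary Kotlin handle the branching. As before, `generate` is a hypothetical placeholder for the real inference call.

```kotlin
// Each call does one focused task; the if/else branching that would
// otherwise live inside the prompt is expressed in code instead.
suspend fun summarizeAnyLanguage(
    text: String,
    generate: suspend (String) -> String,
): String {
    val language = generate(
        "Reply with only the language name of this text:\n$text"
    ).trim()

    val english = if (language.equals("English", ignoreCase = true)) {
        text
    } else {
        generate("Translate the following text to English:\n$text")
    }

    return generate("Summarize in one sentence:\n$english")
}
```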
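
Where the temperature is set depends on the SDK surface you use. The configuration sketch below defines its own hypothetical GenerationConfig type purely to show where a value like 0.2 belongs; check your SDK's documentation for the real configuration API.

```kotlin
// Hypothetical config type: the field names mirror common generation
// options but are not taken from any specific SDK, and the topK and
// maxOutputTokens values are purely illustrative.
data class GenerationConfig(
    val temperature: Float,
    val topK: Int,
    val maxOutputTokens: Int,
)

// Start low for deterministic tasks such as entity extraction or
// translation, then tune based on your own testing.
val extractionConfig = GenerationConfig(
    temperature = 0.2f,
    topK = 16,
    maxOutputTokens = 256,
)
```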