Conversational Actions let you extend Google Assistant with your own
conversational interfaces that give users access to your products and
services. Actions leverage Assistant's powerful natural language
understanding (NLU) engine to process and understand natural language input
and carry out tasks based on that input.
Overview
A Conversational Action is a simple object that defines an
entry point (referred to as invocation) into a conversation:
- An invocation defines how users tell Assistant they want to start a
conversation with one of your Actions. An Action's invocation is defined by a
single intent that gets matched when users request the Action.
- A conversation defines how users interact with an Action after
it's invoked. You build conversations with intents, types,
scenes, and prompts.
- In addition, your Actions can delegate extra work to fulfillment: web
services that communicate with your Actions via webhooks. This lets you do
data validation, call other web services, carry out business logic, and more.
You bundle one or many Actions together, based on the use cases that are
important for your users, into a logical container called an Actions project.
Your Actions project contains your entire invocation model (the collection of
all your invocations), which lets users start at logical places in your
conversation model (all the possible things users can say and all the possible
ways you respond back to users).
Figure 1. A collection of Actions that serve as entry
points into a conversation model. Intents that are eligible for invocation
are considered to be global.
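If you build your project with the Actions SDK (the gactions CLI) instead of
the Actions console, the invocation and conversation models are represented as
files in a project directory. The following is a rough, illustrative sketch of
that layout; the comments are mine, and individual file names may vary by
project:

```
sdk/
├── settings/settings.yaml   # project settings, including the display name
├── actions/actions.yaml     # declares your Actions (the invocation model)
├── custom/
│   ├── global/              # global intents, e.g. actions.intent.MAIN.yaml
│   ├── intents/             # custom intents and their training phrases
│   ├── scenes/              # scenes that drive the conversation model
│   └── types/               # custom types used by intent parameters
└── webhooks/                # fulfillment (webhook) registration
```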
Invocation
Invocation is associated with a display name that represents a brand,
name, or persona that lets users ask Assistant to invoke your Actions.
Users can use this display name on its own (called the main invocation) or in
combination with optional, deep link phrases to invoke your Actions.
For example, users can say the following phrases to invoke three separate
Actions in a project with a display name of "Facts about Google":
"Ok Google, talk to Facts about Google"
"Ok Google, talk to Facts about Google to get company facts"
"Ok Google, talk to Facts about Google to get history facts"
The first invocation in the example is the main invocation. This
invocation is associated with a special system intent named
actions.intent.MAIN. The second and third invocations are deep link
invocations, which let you specify additional phrases so users can ask for
specific functionality. These invocations correspond to user intents that you
designated as global. Each invocation in this example provides an entry point
into a conversation and corresponds to a single Action.
Figure 2. Example of main invocation
Figure 2 describes a typical main invocation flow:
1. When users request an Action, they typically ask Assistant for it by your
display name.
2. Assistant matches the user's request to the corresponding intent. In this
case, it would be actions.intent.MAIN.
3. The Action is notified of the intent match and responds with the
corresponding prompt to start a conversation with the user.
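For instance, if you register a webhook handler for the main invocation, a
minimal fulfillment sketch using the @assistant/conversation Node.js library
might look like the following; the handler name main_invocation is a
hypothetical name you would configure in your own project:

```typescript
import {conversation} from '@assistant/conversation';

const app = conversation();

// Hypothetical handler name, wired to the actions.intent.MAIN
// invocation in your Actions project.
app.handle('main_invocation', (conv) => {
  // Respond with a prompt to start the conversation.
  conv.add('Welcome to Facts about Google! What would you like to hear about?');
});

export {app};
```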
Conversation
Conversation defines how users interact with an Action after it's invoked. You
build these interactions by defining the valid user input for your
conversation, the logic to process that input, and the corresponding prompts
to respond back to the user with. The following figure and explanation show
how a typical conversation turn works with a conversation's low-level
components: intents, types, scenes, and prompts.
Figure 3. Example of a conversation
Figure 3 describes a typical conversation turn:
1. When users say something, the Assistant NLU matches the input to an
appropriate intent. An intent is matched if the language model for that
intent can closely or exactly match the user input. You define the language
model by specifying training phrases, or examples of things users might want
to say. Assistant takes these training phrases and expands upon them to
create the intent's language model.
2. When the Assistant NLU matches an intent, it can extract parameters that
you need from the input. These parameters have types associated with them,
such as a date or number. You annotate specific parts of an intent's training
phrases to specify what parameters you want to extract.
3. A scene then processes the matched intent. You can think of scenes as the
logic executors of an Action, doing the heavy lifting and carrying out logic
necessary to drive a conversation forward. Scenes run in a loop, providing a
flexible execution lifecycle that lets you do things like validate intent
parameters, do slot filling, send prompts back to the user, and more (see the
sketch after this list).
4. When a scene is done executing, it typically sends a prompt back to users
to continue the conversation, or it can end the conversation if appropriate.
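As a sketch of how a webhook handler might participate in such a turn, the
following assumes a custom intent that extracts a factType parameter and a
scene named Facts; both are hypothetical names you would define in your own
project:

```typescript
import {conversation} from '@assistant/conversation';

const app = conversation();

// Hypothetical handler, called from a scene once the NLU matches an
// intent with a typed `factType` parameter.
app.handle('tell_fact', (conv) => {
  // Read the parameter that the NLU extracted from the user's input.
  const factType = conv.intent.params?.factType?.resolved;

  if (!factType) {
    // Reprompt; the scene's slot filling can collect the missing value.
    conv.add('Would you like a company fact or a history fact?');
    return;
  }

  conv.add(`Here's a ${factType} fact about Google: ...`);
  // Transition to another scene to keep the conversation going.
  conv.scene.next = {name: 'Facts'};
});
```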
Fulfillment
During invocation or a conversation, your Action can trigger a webhook that
notifies a fulfillment service to carry out some tasks.
Figure 4. Example of fulfillment
Figure 4 describes how you can use fulfillment to generate prompts, a common
pattern:
1. At specific points of your Action's execution, it can trigger a webhook
that sends a request to a registered webhook handler (your fulfillment
service) with a JSON payload.
2. Your fulfillment processes the request, for example by calling a REST API
to look up data or by validating data from the JSON payload. A very common
way to use fulfillment is to generate a dynamic prompt at runtime so your
conversations are more tailored to the current user.
3. Your fulfillment returns a response back to your Action containing a JSON
payload. Your Action can use the data from the payload to continue its
execution and respond back to the user (see the sketch after this list).
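Putting the pieces together, a minimal standalone fulfillment service could
look like the sketch below, which serves the @assistant/conversation app over
Express. The route path, handler name, and stubbed data source are all
illustrative assumptions:

```typescript
import {conversation} from '@assistant/conversation';
import express from 'express';

const app = conversation();

// Hypothetical handler that builds a dynamic prompt at runtime.
app.handle('dynamic_fact', async (conv) => {
  // A real service might call a REST API here; this value is a stub.
  const fact = await Promise.resolve('Google was founded in 1998.');
  conv.add(`Here's your fact: ${fact}`);
});

// The conversation() app doubles as an Express-style request handler,
// so it can be mounted directly on the route that receives the JSON payload.
const server = express().use(express.json());
server.post('/fulfillment', app);
server.listen(3000, () => console.log('Fulfillment listening on port 3000'));
```

You would then register the deployed endpoint's HTTPS URL as the webhook for
your Actions project so that Assistant knows where to send requests.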
[[["Easy to understand","easyToUnderstand","thumb-up"],["Solved my problem","solvedMyProblem","thumb-up"],["Other","otherUp","thumb-up"]],[["Missing the information I need","missingTheInformationINeed","thumb-down"],["Too complicated / too many steps","tooComplicatedTooManySteps","thumb-down"],["Out of date","outOfDate","thumb-down"],["Samples / code issue","samplesCodeIssue","thumb-down"],["Other","otherDown","thumb-down"]],["Last updated 2024-09-18 UTC."],[[["\u003cp\u003eConversational Actions extend Google Assistant, letting you create conversational interfaces for your services using natural language understanding.\u003c/p\u003e\n"],["\u003cp\u003eActions are invoked by users through specific phrases, triggering a conversation flow defined by intents, types, scenes, and prompts.\u003c/p\u003e\n"],["\u003cp\u003eFulfillment webhooks can be used to enhance Actions by validating data, calling external services, and generating dynamic prompts during conversations.\u003c/p\u003e\n"],["\u003cp\u003eActions are grouped within an Actions project which manages the invocation model and overall conversation flow.\u003c/p\u003e\n"]]],["Conversational Actions enable interactions with Google Assistant via natural language. Users initiate these interactions through **invocations**, using a display name or deep links. **Conversations** follow, where Assistant's NLU matches user input to **intents**, extracting **parameters**. **Scenes** then process these intents, executing logic and sending **prompts**. **Fulfillment** services handle tasks like data validation or dynamic prompt generation through webhooks, allowing actions to interact with web services and tailor responses. An **Actions project** bundles actions together.\n"],null,["# Conversational Actions let you extend Google Assistant with your own\nconversational interfaces that give users access to your products and\nservices. Actions leverage Assistant's powerful natural language\nunderstanding (NLU) engine to process and understand natural language input\nand carry out tasks based on that input.\n\nOverview\n--------\n\nA Conversational Action is a simple object that defines an\nentry point (referred to as invocation) into a conversation:\n\n- An **invocation** defines how users tell Assistant they want to start a conversation with one of your Actions. An Action's invocation is defined by a single [intent](/assistant/conversational/intents) that gets matched when users request the Action.\n- A **conversation** defines how users interact with an Action after it's invoked. You build conversations with [intents](/assistant/conversational/intents), [types](/assistant/conversational/types), [scenes](/assistant/conversational/scenes), and [prompts](/assistant/conversational/prompts).\n- In addition, your Actions can delegate extra work to **fulfillment**, which are web services that communicate with your Actions via webhooks. This lets you do data validation, call other web services, carry out business logic, and more.\n\nYou bundle one or many Actions together, based on the use cases that are\nimportant for your users, into a logical container called an Actions project.\nYour Actions project contains your entire invocation model (the collection of\nall your invocations), which lets users start at logical places in your\nconversation model (all the possible things users can say and all the possible\nways you respond back to users).\n**Figure 1** . A collection of Actions that serve as entry points into a conversation model. 
Intents that are eligible for invocation are considered to be *global*.\n\nInvocation\n----------\n\nInvocation is associated with a **display name** that represents a brand,\nname, or persona that lets users ask Assistant to invoke your Actions.\nUsers can use this display name on its own (called the main invocation) or in\ncombination with optional, **deep link** phrases to invoke your Actions.\n\nFor example, users can say the following phrases to invoke three separate\nActions in an project with a display name of \"Facts about Google\":\n\n- *\"Ok Google, talk to Facts about Google\"*\n- *\"Ok Google, talk to Facts about Google to get company facts\"*\n- *\"Ok Google, talk to Facts about Google to get history facts\"*\n\nThe first invocation in the example is the **main invocation** . This\ninvocation is associated with a special system intent named\n`actions.intent.MAIN`. The second and third invocations are deep link\ninvocations that let you specify additional phrases that let users ask for\nspecific functionality. These invocations correspond to user intents that you\ndesignated as global. Each invocation in this example provides an entry point\ninto a conversation and corresponds to a single Action.\n**Figure 2**. Example of main invocation\n\nFigure 2 describes a typical main invocation flow:\n\n1. When users request an Action, they typically ask Assistant for it by your display name.\n2. Assistant matches the user's request with the corresponding intent that matches the request. In this case, it would be `actions.intent.MAIN`.\n3. The Action is notified of the intent match and responds with the corresponding prompt to start a conversation with the user.\n\nConversation\n------------\n\nConversation defines how users interact with an Action after it's invoked. You\nbuild these interactions by defining the valid user input for your\nconversation, the logic to process that input, and the corresponding prompts\nto respond back to the user with. The following figure and explanation shows\nyou how a typical conversation turn works with a conversation's low level\ncomponents: [intents](/assistant/conversational/intents), [types](/assistant/conversational/types), [scenes](/assistant/conversational/scenes), and\n[prompts](/assistant/conversational/prompts).\n**Figure 3**. Example of a conversation\n\nFigure 3 describes a typical conversation turn:\n\n1. When users say something, the Assistant NLU matches the input to an appropriate intent. An intent is matched if the *language model* for that intent can closely or exactly match the user input. You define the language model by specifying *training phrases*, or examples of things users might want to say. Assistant takes these training phrases and expands upon them to create the intent's language model.\n2. When the Assistant NLU matches an intent, it can extract *parameters* that you need from the input. These parameters have *types* associated with them, such as a date or number. You annotate specific parts of an intent's training phrases to specify what parameters you want to extract.\n3. A *scene* then processes the matched intent. You can think of scenes as the logic executors of an Action, doing the heavy lifting and carrying out logic necessary to drive a conversation forward. Scenes run in a loop, providing a flexible execution lifecycle that lets you do things like validate intent parameters, do slot filling, send prompts back to the user, and more.\n4. 
When a scene is done executing, it typically sends a prompt back to users to continue the conversation or can end the conversation if appropriate.\n\nFulfillment\n-----------\n\nDuring invocation or a conversation, your Action can trigger a webhook that\nnotifies a fulfillment service to carry out some tasks.\n**Figure 4**. Example of a conversation\n\nFigure 4 describes how you can use fulfillment to generate prompts, a common\nway to use fulfillment:\n\n1. At specific points of your Action's execution, it can trigger a webhook that sends a request to a registered webhook handler (your fulfillment service) with a JSON payload.\n2. Your fulfillment processes the request, such as calling a REST API to do some data lookup or validating some data from the JSON payload. A very common way to use fulfillment is to generate a dynamic prompt at runtime so your conversations are more tailored to the current user.\n3. Your fulfillment returns a response back to your Action containing a JSON payload. It can use the data from the payload to continue it's execution and respond back to the user."]]