App Builder
The core concept behind the App is a state machine integrated with tasks that can perform AI functions, transitioning between different states through user interactions.
App Builder has significant differences from Workflow:
Essence: Workflow executes each node sequentially until a one-time completion, while the App can interact with users indefinitely without stopping, and the developer has full control over the user journey.
Editing Interface: In Workflow, the connections between nodes represent data transmission; in the App, connections between widgets represent user interactions. The content of Workflow nodes is embedded within the nodes, while in the App, most of the widget content is sourced from external forms.
Taking a chat agent as an example: in the App, a State generally corresponds to a reply message from the bot. Users can transition to different States through various actions such as text input, voice input, clicking buttons, or filling in forms.
Each State of the agent can run different AI modules (Tasks) based on the input, or collectively build the reply content (Message). It can also output some other results (Outputs) for use by other States.
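To make the state-machine idea concrete, here is a minimal Python sketch. It is only an illustration, not how ShellAgent is implemented: the State class, the step function, and the state names are all hypothetical, and each State's Tasks and Message are collapsed into a single render function.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class State:
    name: str
    # Builds the reply from the user's input (stands in for Tasks + Message).
    render: Callable[[str], str]
    # Maps a user action (for example "CHAT" or a button name) to the next State.
    transitions: Dict[str, str] = field(default_factory=dict)


def step(states, current, action, user_input=""):
    """Follow one user action and return (next_state_name, reply)."""
    # An action with no connection keeps the agent in the current State.
    next_name = states[current].transitions.get(action, current)
    reply = states[next_name].render(user_input)
    return next_name, reply


states = {
    "greeting": State("greeting", lambda _: "Hi! Ask me anything.", {"CHAT": "answer"}),
    "answer": State(
        "answer",
        lambda text: f"You said: {text}",
        {"CHAT": "answer", "restart": "greeting"},
    ),
}

current, reply = step(states, "greeting", "CHAT", "hello")
print(current, "->", reply)
```

Unlike a Workflow, nothing here runs to completion: each call to step handles one user interaction, and the conversation can go on for as long as the user keeps acting.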
The App Builder is a no-code canvas-like editor where you set up States and connect Widgets to build your App.
The left side includes two types of interactive entities:
State: Corresponds to a State in Pro Config editing mode, which can be dragged onto the canvas.
Widget: Corresponds to AnyWidgetModule in Pro Config editing mode, which can be dragged under a State's Task or added by clicking "Add" under Task. Currently, only the following two Widgets are supported; more will be added in future updates:
Workflow Runner Widget: This Widget can execute a specified Workflow
GPT Widget: This Widget can call a specified GPT model
You can set the Context; Variables under Context can be accessed by all States and can be modified by any State.
For how to access them, please refer to Using Variables in this chapter, and for how to modify them, please refer to Modifying Context Variables in this chapter.
For developers who are familiar with Pro Config mode: The Start state corresponds to Automata in Pro Config. However, Automata can set global transitions, which the current App Builder does not support.
State Input has the same functionality as the Input of the Workflow Start node; it represents the parameters accepted by the current State.
Just like in Workflow, an Input includes a name, type, and more advanced settings.
Compared to Workflow, it is more complex because Inputs have more sources:
User Input (User Input: True): This input is filled in by the user.
Form Input (Source: form): The user fills in the input in a form that pops up when clicking a button.
Chat Input (Source: IM): The user types the input in the chat input box.
Non-user Input (User Input: False): This input is set by the developer.
Transferred Input: Other States assign values to this Input when transitioning to this State.
Local Variables: Variables that will be used by this State.
To avoid unnecessary issues, we recommend that the User Inputs in a single State be either all form inputs or all IM inputs.
The State connected to the Start node is the initial state; the Input of the initial state will be ignored during the first run. You can create a greeting statement that does not require Input as the initial state, or set default values for Inputs to avoid errors caused by missing values.
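The effect of default values on the initial state can be sketched as follows. This is a hypothetical model of the behavior described above, not ShellAgent code: resolve_inputs and its arguments are invented for illustration, and None stands for "no default set".

```python
def resolve_inputs(declared, provided, first_run=False):
    """Return the values a State sees for its Inputs.

    declared: {input name: default value, or None if there is no default}
    provided: values supplied by the user or by a transition
    """
    # On the first run of the initial State, supplied values are ignored.
    supplied = {} if first_run else provided
    values = {}
    for name, default in declared.items():
        value = supplied.get(name, default)
        if value is None:
            raise ValueError(f"missing value for input {name!r}")
        values[name] = value
    return values


# With a default, the first run succeeds even though no input is used.
print(resolve_inputs({"topic": "cats"}, {"topic": "dogs"}, first_run=True))
# On later runs, the supplied value wins over the default.
print(resolve_inputs({"topic": "cats"}, {"topic": "dogs"}))
```

Without a default, the first call would raise an error, which is the situation the advice above is meant to avoid.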
A Task is a call to a Widget.
After adding a Task, click its name to edit the corresponding configuration and see its returned data structure.
You can use the Task's name to reference its returned results.
Tasks are executed sequentially; each Task can only reference the results of previously executed Tasks.
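The sequential rule can be pictured with a short Python sketch. The run_tasks helper and the task names are hypothetical, and the lambdas merely stand in for Widget calls; the point is only that each Task sees the State's Inputs plus the results of Tasks that ran before it.

```python
def run_tasks(tasks, inputs):
    """Run tasks in order; each task sees inputs plus earlier results."""
    results = {}
    for name, func in tasks:
        # Only results stored by previous iterations are visible here.
        results[name] = func({**inputs, **results})
    return results


tasks = [
    # Stand-in for a first Widget call.
    ("uppercase", lambda scope: scope["text"].upper()),
    # References the earlier task's result by its name.
    ("preview", lambda scope: scope["uppercase"][:5] + "..."),
]

results = run_tasks(tasks, {"text": "hello world"})
print(results)
```

Swapping the two tasks would fail with a KeyError, because "preview" would try to read a result that does not exist yet.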
Output is mainly used to expose variables for other States to use. Whenever data can be reused, such as Task execution results or user Inputs, you can set it as an Output.
Outputs can be created as new Output variables or by modifying existing Context variables.
Create a new Output variable
Edit its properties
Modify the Variable Name and other properties, then confirm.
Assign a value to the Output variable. The assignment method is the same as Using Variables.
After creation, this Output can be referenced in other States.
Note that any State can reference the Outputs of any other State, but there is no guarantee that the referenced State has already been executed. Therefore, be mindful of the execution order of States when referencing.
Create a new Output variable
Edit its properties
Set Variable Name mode to Select Output
Select the Context variable you want to modify and confirm
Assign a new value to the Context variable. The assignment method is the same as Using Variables
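The two kinds of Output can be modeled roughly as below. This is a simplified sketch under assumed names (context, outputs, run_state, get_output), not the editor's internal data model: new Output variables are stored per State, while Outputs in Select Output mode overwrite a shared Context variable.

```python
context = {"score": 0}  # shared Context variables, visible to every State
outputs = {}            # Outputs created by each State, keyed by State name


def run_state(name, new_outputs=None, context_updates=None):
    """Record a State's new Outputs and apply its Context modifications."""
    outputs[name] = dict(new_outputs or {})
    context.update(context_updates or {})


def get_output(state, variable):
    """Read another State's Output; fails if that State has not run yet."""
    if state not in outputs:
        raise LookupError(f"State {state!r} has not been executed yet")
    return outputs[state][variable]


run_state("quiz", new_outputs={"answer": "B"}, context_updates={"score": 5})
print(get_output("quiz", "answer"), context["score"])
```

Calling get_output for a State that has not run raises an error in this sketch, which mirrors why the execution order matters when referencing Outputs.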
Message controls what the chat agent replies to the user, which can include text, images, audio, and buttons.
Text supports using expressions to reference variables, Markdown syntax, and a few HTML elements (<img>, <audio>, <video>, <a>); you can set width and height for them.
Images support using a single image or an array of images.
Audio supports a single audio file.
You can create as many Buttons as you want; while interacting with the App, users can click buttons to transition to other States or rerun the current State.
Buttons are more complex to configure; their configuration includes the Button Name, Button Prompt, and Payload fields. After configuration, you can connect the button on the canvas to the State it should transition to when clicked (it can be the same State). Advanced transition settings are introduced in the next section on Transition.
Payload can create some variables when the user clicks. It generally has two use cases:
Perceiving User Actions: For example, two buttons (Style 1, Style 2) correspond to the same variable (style) but with different data (1, 2). This way, clicking different buttons can transition to the same State but with different input data.
Making Data Snapshots in Chat History: Since each specific button creates and records the Payload when displayed, clicking buttons in the chat history can retrieve historical snapshot data.
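Both use cases can be illustrated with a small Python sketch. The dictionary layout, render_buttons, and click are assumptions made for this example only; in particular, the "snapshot" key is an invented stand-in for whatever data a Payload captures when the button is displayed.

```python
import copy


def render_buttons(current_data):
    """Create two buttons; each Payload is a snapshot taken at display time."""
    return [
        {"name": "Style 1", "payload": {"style": 1, "snapshot": copy.deepcopy(current_data)}},
        {"name": "Style 2", "payload": {"style": 2, "snapshot": copy.deepcopy(current_data)}},
    ]


def click(button):
    """Clicking a button hands its Payload to the target State as input."""
    return dict(button["payload"])


data = {"prompt": "a cat"}
old_buttons = render_buttons(data)

# The data changes later, but buttons already shown keep their snapshot.
data["prompt"] = "a dog"

print(click(old_buttons[1])["style"])
print(click(old_buttons[0])["snapshot"]["prompt"])
```

Both buttons lead to the same State and differ only in the value of style, while a button clicked from the chat history still carries the data recorded when it was first shown.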
In the App Builder, transitions are represented by connections on the canvas. The right side represents outputs, and the left side represents inputs.
There are three types of transitions on the canvas, representing different trigger conditions:
CHAT: Transition triggered after the user inputs content in the chat input box, achieved by connecting the circle on the right side of the simulated chat input box below the State node to another State.
ALWAYS: Transition directly to another State after the State execution is complete, achieved by connecting State to State.
Custom Click Events: Triggered when the user clicks a button, achieved by connecting the circle on the right side of the button to another State.
Click the small square button at the midpoint of the connection line; a Transition editing form will pop up on the right. You can add conditional transitions and set target inputs.
Transitions can conditionally jump to different States. A set of Conditions and Target State settings is called a Transition Case. For example, in the figure below, you can transition to the winner state or continue state based on whether the user's current score is greater than 10.
Enter logical expressions in the Condition (refer to Expressions) for judgment; the result of the expression should be True or False. During execution, Transition Cases are evaluated in order (from top to bottom in the interface); the first Transition Case whose Condition evaluates to True will take effect, and subsequent Transition Cases will not be evaluated.
If the Condition is left blank, it is equivalent to setting it as default True.
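The evaluation order can be sketched in Python as follows. The pick_target helper and the (condition, target) tuples are hypothetical; a blank Condition is represented by None and treated as True, as described above.

```python
def pick_target(cases, scope):
    """Return the Target State of the first case whose Condition is true."""
    for condition, target in cases:
        # A blank Condition (None) always matches.
        if condition is None or condition(scope):
            return target
    # No Transition Case matched.
    return None


cases = [
    (lambda s: s["score"] > 10, "winner"),  # checked first
    (None, "continue"),                     # blank Condition acts as a fallback
]

print(pick_target(cases, {"score": 12}))
print(pick_target(cases, {"score": 3}))
```

Because evaluation stops at the first match, a case with a blank Condition only makes sense as the last entry; placed first, it would hide every case below it.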
When transitioning to the Target State, you can also assign values to its Inputs. This is similar to calling a function in programming and passing arguments in the call.
For many fields, we support three modes of assignment:
UI Mode: Directly edit in the form.
Ref Mode: Reference other variables.
Code Mode: Use programming language expressions to combine other variables or perform simple data processing through code.
You can usually switch modes by clicking on the right side or upper right corner of the field form item.
We support expressions, which make the state machine more controllable and let developers perform simple calculations on variables.
In modes that support expressions, expressions are enclosed within double curly braces {{ }}. Variables can be referenced using /.
Common use cases for expressions include:
Concatenating text.
Accessing part of complex data, such as getting the first item of an array.
Calculating Conditions in Transitions.
For more complex calculation scenarios, we will later support a Code Runner Widget to execute functions.
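The three use cases above look like this as plain Python expressions. The variable names are hypothetical, and the render function is only a toy stand-in for how text containing {{ }} might be filled in; it is not ShellAgent's evaluator and should not be used on untrusted input.

```python
import re

variables = {"user_name": "Ada", "images": ["a.png", "b.png"], "score": 12}


def render(template, scope):
    """Replace each {{ expression }} with the value of that Python expression."""
    def evaluate(match):
        expression = match.group(1).strip()
        # Toy evaluation with builtins disabled; for illustration only.
        return str(eval(expression, {"__builtins__": {}}, scope))

    return re.sub(r"\{\{(.*?)\}\}", evaluate, template)


# Concatenating text.
print(render("{{ 'Hello, ' + user_name + '!' }}", variables))
# Accessing part of complex data: the first item of an array.
print(render("First image: {{ images[0] }}", variables))
# A Condition for a Transition.
print(render("{{ score > 10 }}", variables))
```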
[!TIP] We currently support expressions in the Python language. JavaScript expressions will be supported later. Python expressions currently support the following libraries:
builtins
marshal
math
time
You need to click Settings to set the environment variable OPENAI_API_KEY.
To use GPT models, you currently need your own OpenAI API Key. Because ShellAgent is an open-source project, MyShell cannot provide unlimited free API keys to everyone. However, we plan to connect the MyShell account's battery system with ShellAgent in the future, so that developers can use the batteries in their MyShell account to build with ShellAgent.
This API Key is only used for building in your local open-source project. Once you export the bot's JSON file from ShellAgent to MyShell's website, all API compute costs will still be covered by MyShell.
You can use Markdown in Message.Text.
If you need multimedia elements, you can add corresponding HTML elements. For example:
Turn off the Message.Image switch, insert <img /> in Message.Text, and set its width and height. For example:
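As a rough illustration, the text placed in Message.Text might combine Markdown with an <img /> element like the string built below. The img_tag helper, the URL, and the sizes are placeholders, and writing width and height as HTML attributes is an assumption about how the size is set.

```python
def img_tag(src, width, height):
    """Build an <img /> element with an explicit width and height."""
    return f'<img src="{src}" width="{width}" height="{height}" />'


# Markdown text followed by an image with a fixed size.
message_text = "Here is your **result**:\n\n" + img_tag(
    "https://example.com/cat.png", 300, 200
)
print(message_text)
```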