Image Generation
Introduction to MyShell's image generation function: From natural language image generation to parameter image generation.
Natural Language Image Generation
MyShell's natural language image generation allows users to obtain high-quality images at extremely low cost.
You only need to provide a demand, a sentence, or even a word, and the MyShell image generation series of AI apps can generate high-quality images for you.
To achieve high-quality and fast image generation, MyShell uses the following technical solutions:
At the algorithm level, we use large language models (LLMs) to assist users in semantic understanding, so that natural language can be effectively transformed into the complex parameters required by Stable Diffusion; in addition, we have implemented model scheduling, which can call the corresponding checkpoint according to the image style required by the user, demonstrating preliminary scheduling capabilities.
At the engineering level, we use an asynchronously generated structure to ensure the usability of the image service, making image generation less likely to be interrupted and enhancing the robustness of the service.
Find five apps with specific uses and a large number of apps with special image styles here.
Quick start guide
Taking the "Image Magician" and "Avatar Magician" as examples, the following use cases can experience the ability of natural language image generation.
When you use natural language to generate images, we recommend that you provide three types of information:
Image content: This determines the main subject and background elements of your picture.
Image style: This determines which stable diffusion checkpoint the AI app will call for you to draw, and is closely related to the artistic style of the image.
Image usage: This determines the size and content quality enhancement template of your picture. Currently, we recommend that you use "Avatar", "Portrait", and "Banner". The first two usages are convenient for you to draw high-quality portraits for use as avatars and photo wall decorations, while the latter is good at drawing landscapes as banner accessories.
Parameters
In addition to natural language image generation, MyShell also enables advanced users to adjust image generation through parameters.
Prompt
Description: The main text instruction that guides the image generation process. It's the core input that the AI uses to generate images.
Enter a descriptive narrative or keywords that closely detail the desired outcome. The more specific and descriptive the prompt, the more accurate the generated image will be.
Negative Prompt
Description: Instructions specifying what should be excluded from the generated image. This helps in refining the output by avoiding unwanted elements.
List elements you wish to avoid in the output, such as "Watermark, signature, motion blur, extra fingers, bad anatomy." This feature is particularly useful for achieving more precise results and avoiding common pitfalls in image generation.
Model
Description: The image model used for generating the image. Different models may have different capabilities and specialties.
Usage: Choose from the available options. The choice of model can significantly impact the style and quality of the output.
MyShell supports a large number of drawing models, here are some of the most popular ones.
Popular Models
Image Size
Description: The dimensions of the generated image. This determines the resolution and aspect ratio of the output.
If you want to create an image for your own AI apps on MyShell, we recommend the following sizes for you:
Avatar: 750*750
Banner: 1000*200
Photo wall: 750*750, 600*800, ...
Immersive background: 1000*840, 1500 * 1260, 2000*1680...
Image Number
Description: The number of images to generate in a single run. This allows for the creation of multiple variations of an image based on the same prompt.
Select a number to generate multiple variations. This is useful for exploring different interpretations of the same prompt. Maximum support for ten images.
Advanced Parameters
Sampling Method
Description: The algorithm that drives the transformation from noise to image. Different sampling methods can produce different styles and qualities of images.
Select from options like Euler, which affect the visual style and coherence. The choice of sampler can influence the smoothness, color distribution, and edge definition in the generated images. Different sampling methods will produce different image effects, and we recommend that you try them out.
Sampling Steps
Description: The number of iterations in the image refinement process. This affects how gradually the image is developed from the initial noise.
Higher numbers typically result in more detailed images. However, too many steps might not always improve quality and can increase processing time.
CFG Scale
Description: Balances between the AI's creativity and adherence to the prompt. It's a scale that determines how closely the AI should follow the prompt.
A higher scale makes the AI closely follow the prompt, while a lower scale allows more creative freedom. This parameter is crucial for controlling the balance between fidelity to the prompt and artistic interpretation.
Seed
Description: Initializes the random noise pattern, affecting image uniqueness. It's a value that determines the starting point of the image generation process.
A specific seed value ensures reproducibility of results. This is useful for creating consistent results or variations of the same image. When you enter -1, a random seed will be used.
Last updated