Bria VGL — Full Control Over Image Generation

Define every visual attribute as structured JSON instead of hoping natural language gets it right. VGL (Visual Generation Language) gives you explicit, deterministic control over objects, lighting, camera settings, composition, and style for Bria's FIBO models.

Related Skill: Use bria-ai to execute these VGL prompts via the Bria API. VGL defines the structured control format; bria-ai handles generation, editing, and background removal.

Core Concept

VGL replaces ambiguous natural-language prompts with a JSON object that explicitly declares every visual attribute: objects, lighting, camera settings, composition, and style. Spelling these out leaves less to interpretation, which makes image generation more reproducible and controllable.

Operation Modes

| Mode | Input | Output | Use Case |
| --- | --- | --- | --- |
| Generate | Text prompt | VGL JSON | Create new image from description |
| Edit | Image + instruction | VGL JSON | Modify reference image |
| Edit_with_Mask | Masked image + instruction | VGL JSON | Fill grey masked regions |
| Caption | Image only | VGL JSON | Describe existing image |
| Refine | Existing JSON + edit | Updated VGL JSON | Modify existing prompt |
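As a quick illustration of the table above, the sketch below maps each mode to the inputs it expects and reports what is still missing. The mode names come from the table; the input labels (`text_prompt`, `masked_image`, and so on) and the helper `missing_inputs` are hypothetical names chosen for this example, not part of VGL or the Bria API.

```python
# Hypothetical helper: the mode names come from the table above,
# the input labels are illustrative and not defined by VGL itself.
MODE_INPUTS = {
    "Generate": {"text_prompt"},
    "Edit": {"image", "instruction"},
    "Edit_with_Mask": {"masked_image", "instruction"},
    "Caption": {"image"},
    "Refine": {"existing_json", "edit"},
}


def missing_inputs(mode: str, provided: set) -> set:
    """Return the inputs a mode still needs; raise for an unknown mode."""
    if mode not in MODE_INPUTS:
        raise ValueError(f"unknown VGL mode: {mode!r}")
    return MODE_INPUTS[mode] - set(provided)


if __name__ == "__main__":
    # An Edit request that supplies only the image still needs an instruction.
    print(missing_inputs("Edit", {"image"}))
```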

JSON Schema

Output a single valid JSON object with these required keys:

1. `short_description` (String)

Concise summary of image content, max 200 words. Include key subjects, actions, setting, and mood.

2. `objects` (Array, max 5 items)

Each object requires:

{
  "description": "Detailed description, max 100 words",
  "location": "center | top-left | bottom-right foreground | etc.",
  "relative_size": "small | medium | large within frame",
  "shape_and_color": "Basic shape and dominant color",
  "texture": "smooth | rough | metallic | furry | fabric | etc.",
  "appearance_details": "Notable visual details",
  "relationship": "Relationship to other objects",
  "orientation": "upright | tilted 45 degrees | facing left | horizontal | etc."
}

Human subjects add:

{
  "pose": "Body position description",
  "expression": "winking | joyful | serious | surprised | calm",
  "clothing": "Attire description",
  "action": "What the person is doing",
  "gender": "Gender description",
  "skin_tone_and_texture": "Skin appearance"
}

Object clusters add:

{
  "number_of_objects": 3
}

Size guidance: If a person is the main subject, use "medium-to-large" or "large within frame".
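The limits above (at most 5 objects, the eight base keys, descriptions of at most 100 words) can be checked mechanically. The following is a minimal sketch under those stated limits; `check_objects` is a hypothetical helper, and it does not validate the optional human-subject or cluster fields.

```python
# Base keys every entry in `objects` requires, as listed above.
BASE_KEYS = {
    "description", "location", "relative_size", "shape_and_color",
    "texture", "appearance_details", "relationship", "orientation",
}
MAX_OBJECTS = 5
MAX_DESCRIPTION_WORDS = 100


def check_objects(objects: list) -> list:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if len(objects) > MAX_OBJECTS:
        problems.append(f"too many objects: {len(objects)} > {MAX_OBJECTS}")
    for i, obj in enumerate(objects):
        missing = sorted(BASE_KEYS - set(obj))
        if missing:
            problems.append(f"objects[{i}] missing keys: {', '.join(missing)}")
        words = len(str(obj.get("description", "")).split())
        if words > MAX_DESCRIPTION_WORDS:
            problems.append(f"objects[{i}] description has {words} words")
    return problems
```

Returning a list of problems rather than raising on the first one makes it easier to report everything that needs fixing in a single pass.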

3. `background_setting` (String)

Overall environment, setting, and background elements not in objects.

4. `lighting` (Object)

{
  "conditions": "bright daylight | dim indoor | studio lighting | golden hour | blue hour | overcast",
  "direction": "front-lit | backlit | side-lit from left | top-down",
  "shadows": "long, soft shadows | sharp, defined shadows | minimal shadows"
}

5. `aesthetics` (Object)

{
  "composition": "rule of thirds | symmetrical | centered | leading lines | medium shot | close-up",
  "color_scheme": "monochromatic blue | warm complementary | high contrast | pastel",
  "mood_atmosphere": "serene | energetic | mysterious | joyful | dramatic | peaceful"
}

When a person is the main subject, specify the shot type in composition: "medium shot", "close-up", "portrait composition".

6. `photographic_characteristics` (Object)

{
  "depth_of_field": "shallow | deep | bokeh background",
  "focus": "sharp focus on subject | soft focus | motion blur",
  "camera_angle": "eye-level | low angle | high angle | dutch angle | bird's-eye",
  "lens_focal_length": "wide-angle | 50mm standard | 85mm portrait | telephoto | macro"
}

For people: Prefer "standard lens (35mm-50mm)" or "portrait lens (50mm-85mm)". Avoid wide-angle unless specified.
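The lens guidance for people can be expressed as a simple lint. This is a sketch only: `lens_warning` is a hypothetical helper, and detecting a wide-angle lens by looking for the substring "wide" is an assumption made for this example.

```python
from typing import Optional


def lens_warning(lens_focal_length: str, person_is_subject: bool,
                 wide_angle_requested: bool = False) -> Optional[str]:
    """Warn when a wide-angle lens is used on a person without being requested."""
    # Assumption: any value containing "wide" counts as a wide-angle lens.
    is_wide = "wide" in lens_focal_length.lower()
    if person_is_subject and is_wide and not wide_angle_requested:
        return ("wide-angle lens on a person; prefer a standard lens "
                "(35mm-50mm) or portrait lens (50mm-85mm)")
    return None
```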

7. `style_medium` (String)

Default to "photograph" unless explicitly requested otherwise.

8. `artistic_style` (String)

If the medium is not a photograph, describe its characteristics in at most 3 words, for example "impressionistic, vibrant, textured".

For photographs, use "realistic" or similar.
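A small check for the 3-word limit on `artistic_style` might look like the sketch below. `check_style` is a hypothetical helper; it counts comma- or space-separated words and applies the limit only when `style_medium` is not "photograph", which is one reading of the rule above.

```python
def check_style(style_medium: str, artistic_style: str) -> list:
    """Return problems with the style fields; an empty list means OK."""
    problems = []
    # Treat commas as separators so "a, b, c" counts as three words.
    words = artistic_style.replace(",", " ").split()
    if style_medium.strip().lower() != "photograph" and len(words) > 3:
        problems.append(
            f"artistic_style has {len(words)} words; use at most 3"
        )
    return problems
```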

9. `context` (String)

Describe the image type or purpose, for example:

"High-fashion editorial photograph for magazine spread"
"Concept art for fantasy video game"
"Commercial product photography for e-commerce"

10. `text_render` (Array)

Default: empty array `[]`

Populate it only if the user explicitly provides the exact text content:

{
  "text": "Exact text from user (never placeholder)",
  "location": "center | top-left | bottom",
  "size": "small | medium | large",
  "color": "white | red | blue",
  "font": "serif typeface | sans-serif | handwritten | bold impact",
  "appearance_details": "Metallic finish | 3D effect | etc."
}

Exception: Universal text integral to objects (e.g., "STOP" on stop sign).
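The default-to-empty rule can be sketched as a small builder. `build_text_render` is a hypothetical helper; it returns an empty array unless the caller passes the user's exact text, and it does not attempt to detect the stop-sign style exception.

```python
from typing import Optional


def build_text_render(user_text: Optional[str], attrs: dict) -> list:
    """Build the text_render array; empty unless exact text was provided."""
    if not user_text:
        # No explicit text from the user: keep the default empty array.
        return []
    entry = {"text": user_text}
    # Copy only the attribute keys listed in the schema above.
    for key in ("location", "size", "color", "font", "appearance_details"):
        if key in attrs:
            entry[key] = attrs[key]
    return [entry]
```

The text is passed through unchanged, so whatever the user typed is what ends up in the `text` field.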

11. `edit_instruction` (String)

Single imperative command describing the edit/generation.
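With all eleven keys listed, a basic structural check is straightforward. The sketch below parses a candidate output with Python's standard `json` module and reports which required top-level keys are absent; `missing_top_level_keys` is a hypothetical name, and the check does not validate the contents of each key.

```python
import json

# The eleven required top-level keys, in the order given above.
REQUIRED_KEYS = (
    "short_description", "objects", "background_setting", "lighting",
    "aesthetics", "photographic_characteristics", "style_medium",
    "artistic_style", "context", "text_render", "edit_instruction",
)


def missing_top_level_keys(raw: str) -> list:
    """Parse a VGL JSON string and list required keys that are absent."""
    doc = json.loads(raw)  # raises json.JSONDecodeError on invalid JSON
    if not isinstance(doc, dict):
        raise ValueError("VGL output must be a single JSON object")
    return [key for key in REQUIRED_KEYS if key not in doc]
```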

Edit Instruction Formats

For Standard Edits (no mask)

Start with an action verb, describe the changes, and never reference the "original image":

| Category | Rewritten Instruction |
| --- | --- |
| Style change | `Turn the image into the cartoon style.` |
| Object attribute | `Change the dog's color to black and white.` |
| Add element | `Add a wide-brimmed felt hat to the subject.` |
| Remove object | `Remove the book from the subject's hands.` |
| Replace object | `Change the rose to a bright yellow sunflower.` |
| Lighting | `Change the lighting from dark and moody to bright and vibrant.` |
| Composition | `Change the perspective to a wider shot.` |
| Text change | `Change the text "Happy Anniversary" to "Hello".` |
| Quality | `Refine the image to obtain increased clarity and sharpness.` |
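The two rules for standard edits (lead with an action verb, never mention the "original image") can be linted with a few lines. This is a sketch: `lint_standard_edit` is a hypothetical helper, and the verb list is only the set of verbs that appear in the table above, so a real check would likely need a longer list.

```python
# Verbs taken from the example instructions above; illustrative, not exhaustive.
ACTION_VERBS = {"turn", "change", "add", "remove", "refine"}


def lint_standard_edit(instruction: str) -> list:
    """Return rule violations for a standard (unmasked) edit instruction."""
    problems = []
    words = instruction.strip().split()
    first = words[0].lower() if words else ""
    if first not in ACTION_VERBS:
        problems.append("instruction should start with an action verb")
    if "original image" in instruction.lower():
        problems.append('instruction should not reference the "original image"')
    return problems
```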

For Masked Region Edits

Reference the "masked region" or "masked area" as the target:

| Intent | Rewritten Instruction |
| --- | --- |
| Object generation | `Generate a white rose with a blue center in the masked region.` |
| Extension | `Extend the image into the masked region to create a scene featuring...` |
| Background fill | `Create the following background in the masked region: A vast ocean extending to horizon.` |
| Atmospheric fill | `Fill the background masked area with a clear, bright blue sky with wispy clouds.` |
| Subject restoration | `Restore the area in the mask with a young woman.` |
| Environment infill | `Create inside the masked area: a greenhouse with rows of plants under glass ceiling.` |
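A matching check for masked edits only needs to confirm that the instruction names the mask as its target. In the sketch below, `targets_mask` is a hypothetical helper; the phrase "in the mask" is included alongside "masked region" and "masked area" because one of the example rows above uses that wording.

```python
# Phrases that identify the mask as the edit target. The first two come from
# the stated rule; "in the mask" is added to cover the restoration example.
MASK_PHRASES = ("masked region", "masked area", "in the mask")


def targets_mask(instruction: str) -> bool:
    """Return True if the instruction refers to the masked region or area."""
    lowered = instruction.lower()
    return any(phrase in lowered for phrase in MASK_PHRASES)
```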

Fidelity Rules

Standard Edit Mode

Preserve ALL visual properties unless explicitly changed by instruction:

Subject identity, pose, appearance
Object existence, location, size, orientation
Composition, camera angle, lens characteristics
Style/medium

Only change what the edit strictly requires.

Masked Edit Mode

Preserve all visible (non-masked) portions exactly
Fill grey masked regions to blend seamlessly with unmasked areas
Match existing style, lighting, and subject matter
Never describe grey masks—describe content that fills them
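One way to sanity-check the fidelity rules is to diff the VGL JSON before and after an edit and confirm that only the expected keys changed. The following is a minimal sketch that compares top-level keys only; `changed_keys` and `unexpected_changes` are hypothetical helpers, and which keys an edit is allowed to touch is something the caller decides.

```python
_MISSING = object()  # sentinel so an absent key differs from an explicit None


def changed_keys(before: dict, after: dict) -> list:
    """Return the sorted top-level keys whose values differ."""
    keys = sorted(set(before) | set(after))
    return [k for k in keys
            if before.get(k, _MISSING) != after.get(k, _MISSING)]


def unexpected_changes(before: dict, after: dict, allowed: set) -> list:
    """Return changed keys that the edit was not expected to touch."""
    return [k for k in changed_keys(before, after) if k not in allowed]


if __name__ == "__main__":
    before = {"style_medium": "photograph",
              "lighting": {"conditions": "dim indoor"}}
    after = dict(before, lighting={"conditions": "bright daylight"})
    # A lighting edit should leave style_medium untouched.
    print(unexpected_changes(before, after, {"lighting", "edit_instruction"}))
```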

Example Output

{
  "short_description": "A professional businesswoman in a navy blazer stands confidently in a modern glass office, holding a tablet. Natural daylight streams through floor-to-ceiling windows, creating a warm, productive atmosphere.",
  "objects": [
    {
      "description": "A confident businesswoman in her 30s with shoulder-length dark hair, wearing a tailored navy blazer over a white blouse. She holds a tablet in her left hand while gesturing naturally with her right.",
      "location": "center-right",
      "relative_size": "large within frame",
      "shape_and_color": "Human figure, navy and white clothing",
      "texture": "smooth fabric, professional attire",
      "appearance_details": "Minimal jewelry, well-groomed professional appearance",
      "relationship": "Main subject
