Auto-Configure Instructions

Last updated 4 days ago

Overview

The Auto-Configure Instructions feature lets you define how Osmos AI Data Agents (such as the AI Data Engineer) should transform and process source data, without writing any code. Whether you provide instructions manually or let the AI derive them from documentation in a folder, you stay in control of the configuration logic while the AI does the heavy lifting.

What It Is

Auto-Configure is an intelligent setup assistant that helps you:

  • Create a structured instruction set for your AI agent

  • Define destination schema and transformation logic

  • Provide either typed instructions or a folder of documentation

  • Preview and edit the AI-generated configuration before execution

It transforms your documentation, business rules, or direct input into clean, human-readable instructions the AI can follow for repeatable, controlled data processing.

Auto-configure instructions

  1. Select Add Instructions.

  2. Select Manually provide Instructions.

  3. Upload a folder, provide instructions manually, or do both.

Two Ways to Configure Instructions

1. Manual Instructions

Directly input your configuration logic using a structured form:

  • Destination Tables: Specify where your output should go. This often pre-populates from your Fabric workspace.

  • Source Files: Identify what data is being transformed.

  • Ingestion Instructions: Describe transformations, validation rules, mappings, and business logic.

✅ Best for cases where you know exactly what the AI needs to do or when dealing with new, one-off logic.
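
To make the three form fields concrete, here is a minimal Python sketch that models them as a plain data structure and checks that none was left empty. This is an illustration only: `ManualInstructions`, `missing_fields`, and the file name `pets_2024.xlsx` are hypothetical, and nothing here is an Osmos API. The form itself is filled in through the UI.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class ManualInstructions:
    """Hypothetical container mirroring the three form fields (not an Osmos API)."""

    destination_tables: list[str]
    source_files: list[str]
    ingestion_instructions: str

    def missing_fields(self) -> list[str]:
        """Return the names of form fields that were left empty."""
        missing = []
        if not self.destination_tables:
            missing.append("destination_tables")
        if not self.source_files:
            missing.append("source_files")
        if not self.ingestion_instructions.strip():
            missing.append("ingestion_instructions")
        return missing


# Illustrative values only; the file name is made up for this sketch.
draft = ManualInstructions(
    destination_tables=["employee_pets"],
    source_files=["pets_2024.xlsx"],
    ingestion_instructions="Map emp_type to Full-time, Part-time, or Contractor.",
)
print(draft.missing_fields())
```

Drafting the instructions in a structure like this before pasting them into the form can help you notice a missing destination or source early.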

2. Choose an Instructions Folder

Point the AI to a folder containing relevant documentation. The AI will:

  • Read all provided files (up to 10)

  • Extract transformation logic, schemas, and business intent

  • Generate editable instructions from:

    • Business requirements docs

    • Data models and schema designs

    • Code snippets or prior scripts (SQL, Python, etc.)

    • Sample source/output files

✅ Best for existing projects, historical context, or when repurposing prior data transformation logic.

How It Works Behind the Scenes

The Osmos AI Data Engineer uses generative AI to analyze your inputs and convert them into a structured instruction set, including:

  • Target schema details

  • Source-to-destination mapping logic

  • Transformation functions and validations

  • Edge cases and data quality checks

These instructions act as guardrails for the AI, helping it:

  • Stay aligned with business rules

  • Ensure data integrity

  • Avoid brittle or incorrect transformations

🧩 Real-World Example

Let’s say your destination is an employee_pets table, and you want the AI to extract employee and pet information from messy spreadsheets. You could either:

  • Manually configure: “Destination table is employee_pets. Use all files in the folder. Map emp_type to one of [Full-time, Part-time, Contractor]. Standardize phone numbers. Extract date from header.”

  • Use a folder: Upload a folder containing:

    • A document describing the target schema

    • A sample table of cleaned data

    • A script with useful regex patterns

    • Notes about mapping rules

The AI will parse that information and present it back to you as an editable instruction template. You can then adjust as needed.
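
For intuition about what two of the rules in the manual example ask for, the sketch below shows one way the `emp_type` mapping and phone standardization could look in Python. It is not the code the AI Data Engineer generates. The function names are hypothetical, and the phone format (10-digit numbers rendered as `555-123-4567`) is an assumption, since the example does not name a target format.

```python
from __future__ import annotations

import re

# Allowed values taken from the example instruction above.
ALLOWED_EMP_TYPES = ("Full-time", "Part-time", "Contractor")


def _letters_only(text: str) -> str:
    """Lowercase the text and drop everything except the letters a-z."""
    return re.sub(r"[^a-z]", "", text.lower())


def normalize_emp_type(raw: str) -> str | None:
    """Map a messy label to one of the allowed values, or None if unrecognized."""
    key = _letters_only(raw)
    for allowed in ALLOWED_EMP_TYPES:
        if key == _letters_only(allowed):
            return allowed
    return None


def standardize_phone(raw: str) -> str | None:
    """Render a 10-digit number as NNN-NNN-NNNN (assumed format), else None."""
    digits = re.sub(r"\D", "", raw)
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # drop a leading country code of 1
    if len(digits) != 10:
        return None
    return f"{digits[:3]}-{digits[3:6]}-{digits[6:]}"


print(normalize_emp_type(" full time "))
print(standardize_phone("+1 (555) 123-4567"))
```

Returning `None` for unrecognized values keeps bad rows visible instead of guessing, which is the kind of behavior you can also spell out in your instructions.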

🔁 Iteration & Feedback

  • After reviewing the generated instructions, you can:

    • Edit them inline

    • Add edge-case handling

    • Strengthen constraints (e.g., "fail if source columns change")

  • If the result isn't right, update your instructions and regenerate

  • Use real-time feedback to refine and guide the AI’s behavior
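
A constraint such as "fail if source columns change" can be pictured as a header check that runs before any rows are processed. The Python sketch below is a hedged illustration of that idea, not Osmos behavior. `check_columns` is a hypothetical helper, and every column name except `emp_type` is invented for the example.

```python
from __future__ import annotations

import csv
import io

# Hypothetical expected header; only emp_type comes from the example on this page.
EXPECTED_COLUMNS = ["employee_name", "emp_type", "phone", "pet_name"]


def check_columns(header: list[str], expected: list[str]) -> None:
    """Raise ValueError when columns were added or removed (order is ignored)."""
    missing = [c for c in expected if c not in header]
    unexpected = [c for c in header if c not in expected]
    if missing or unexpected:
        raise ValueError(
            f"source columns changed: missing={missing}, unexpected={unexpected}"
        )


# Read only the header row of an in-memory CSV and validate it.
sample = io.StringIO(
    "employee_name,emp_type,phone,pet_name\nAda,Full-time,555-123-4567,Rex\n"
)
header = next(csv.reader(sample))
check_columns(header, EXPECTED_COLUMNS)  # no exception: the header matches
```

Failing fast on a changed header is stricter than silently remapping columns, so state in your instructions which behavior you want.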

🔒 Folder Constraints

When using a folder to generate instructions:

  • Limit of 10 files per instruction set

  • All files must be relevant to the current data use case

  • Accepted formats include DOCX, PDF, TXT, CSV, XLSX, SQL, and code files
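
Before you point the AI at a folder, you may want to check it against these constraints yourself. The following Python sketch does that locally under some assumptions. The helper names are hypothetical, only top-level files are counted, and because the page says "code files" without listing extensions, `CODE_EXTENSIONS` is a guess you should adjust.

```python
from __future__ import annotations

from pathlib import Path

MAX_FILES = 10  # limit stated above
DOC_EXTENSIONS = {".docx", ".pdf", ".txt", ".csv", ".xlsx", ".sql"}
CODE_EXTENSIONS = {".py"}  # assumption: "code files" is not spelled out on this page


def check_file_names(names: list[str]) -> list[str]:
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    if len(names) > MAX_FILES:
        problems.append(f"{len(names)} files found; the limit is {MAX_FILES}")
    allowed = DOC_EXTENSIONS | CODE_EXTENSIONS
    for name in names:
        suffix = Path(name).suffix.lower()
        if suffix not in allowed:
            problems.append(f"{name}: extension {suffix or '(none)'} is not in the accepted list")
    return problems


def check_instruction_folder(folder: Path) -> list[str]:
    """Apply the same checks to the top-level files of a folder on disk."""
    return check_file_names(sorted(p.name for p in folder.iterdir() if p.is_file()))


print(check_file_names(["schema.docx", "rules.PDF", "notes.md"]))
```

The relevance requirement cannot be checked mechanically, so a script like this covers only the file count and file types.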