Microsoft Fabric User Data Functions Performance Remediation

Systematic guide for diagnosing and resolving performance issues with Fabric User Data Functions (UDFs). Covers cold starts, execution timeouts, capacity consumption, connection bottlenecks, and Python code optimization.

When to Use This Skill

Function invocations are slow or intermittently timing out
Capacity metrics show unexpected CU consumption from UDF operations
Functions fail with timeout, response size, or connection errors
Cold start latency is impacting downstream consumers (Pipelines, Notebooks, Power BI)
Historical logs show increasing duration trends
UDF code needs optimizing to perform better within service limits

Prerequisites

Access to the Fabric portal with permissions on the User Data Functions item
Microsoft Fabric Capacity Metrics app installed (for CU analysis)
Python 3.11+ locally (for code profiling outside Fabric)
PowerShell 7+ (for running diagnostic scripts)

Service Limits Quick Reference

Limit	Value	Impact
Request payload	4 MB	All input parameters combined
Execution timeout	240 seconds	Maximum function runtime
Response size	30 MB	Maximum return value size
Log retention	30 days	Historical invocation log window
Private library max	28.6 MB	Per `.whl` file upload
Test session timeout	15 minutes	Idle timeout in Develop mode
Daily log ingestion	250 MB	Logs may be sampled beyond this
Python version (Run)	3.11	Published functions runtime
Python version (Test)	3.12	Develop mode test runtime

Step-by-Step Remediation Workflow

Step 1: Identify the Symptom

Determine which category your issue falls into:

Symptom	Likely Root Cause	Go To
First invocation slow, subsequent fast	Cold start / initialization	Step 2
All invocations consistently slow	Code inefficiency or data volume	Step 3
Intermittent timeouts	Connection issues or capacity throttling	Step 4
Response too large error	Unbounded query results	Step 5
High CU consumption in Metrics app	Excessive execution frequency or duration	Step 6
Function fails with import errors	Library loading overhead	Step 7

Step 2: Diagnose Cold Start Latency

Fabric User Data Functions run in a serverless environment. The first invocation after a period of inactivity incurs initialization overhead.

Check historical logs for the pattern:

Switch to Run only mode in the Functions portal
Open View historical log for the target function
Compare Duration(ms) of the first invocation vs. subsequent ones
A first invocation that takes 3-10x longer than the following ones points to cold start behavior
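The comparison above can be done by hand, but a small helper makes it repeatable. The sketch below is a minimal illustration that assumes you have copied the Duration(ms) values out of the historical log into a list, oldest first; the function name `cold_start_ratio` and the sample numbers are hypothetical.

```python
from statistics import median


def cold_start_ratio(durations_ms):
    """Return the first invocation's duration divided by the median of the rest."""
    if len(durations_ms) < 2:
        raise ValueError("need at least two invocations to compare")
    first, rest = durations_ms[0], durations_ms[1:]
    return first / median(rest)


# Hypothetical Duration(ms) values from the historical log, oldest first.
sample = [4200, 610, 580, 640, 600]
ratio = cold_start_ratio(sample)
print(f"First invocation is {ratio:.1f}x the warm median")
```

Using the median of the later invocations rather than their mean keeps a single slow outlier from hiding the cold start gap.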

Mitigations:

Implement a health-check or warm-up invocation on a schedule via Pipeline
Minimize top-level imports; use lazy imports for heavy libraries
Reduce private library count and size (each .whl adds init time)
Keep PyPI dependency list minimal in definition.json
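One way to apply the lazy-import mitigation is to move the import of a heavy library into the function that needs it, so module load stays cheap and the cost is paid only on the first call that uses it. The sketch below is an illustration under assumptions: `statistics` stands in for a heavy dependency such as pandas, and `summarize` is a hypothetical helper that a UDF would call.

```python
def summarize(values):
    """Compute simple statistics, importing the helper library only when called."""
    # Deferred import: executed on the first call, then served from sys.modules,
    # so later calls pay only a dictionary lookup.
    import statistics  # stand-in for a heavy library such as pandas

    return {"mean": statistics.mean(values), "max": max(values)}


print(summarize([1, 2, 3]))
```

The trade-off is that the import cost shifts from initialization to the first request that reaches this code path, which helps most when only some functions in the item need the heavy library.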

Step 3: Profile Slow Function Code

For consistently slow functions, instrument your code with timing:

import logging
import time

import fabric.functions as fn

# Standard UDF setup; skip if your function file already defines `udf`
udf = fn.UserDataFunctions()

@udf.function()
def my_function(param: str) -> str:
    start = time.perf_counter()

    # Phase 1: Data retrieval (fetch_data is a placeholder for your own code)
    t1 = time.perf_counter()
    data = fetch_data(param)
    logging.info(f"Data retrieval: {time.perf_counter() - t1:.3f}s")

    # Phase 2: Processing (process is a placeholder for your own code)
    t2 = time.perf_counter()
    result = process(data)
    logging.info(f"Processing: {time.perf_counter() - t2:.3f}s")

    logging.info(f"Total execution: {time.perf_counter() - start:.3f}s")
    return result

Review logs in the Invocation details pane to identify the slowest phase.

Common bottlenecks and fixes:

Data source queries: Add WHERE clauses, limit columns, use parameterized queries
DataFrame operations: Filter early, avoid iterrows(), use vectorized operations
Serialization: Return only required fields, use compact formats
External API calls: Add timeouts, implement retry with backoff

See performance-optimization.md for detailed code patterns.

Step 4: Investigate Connection and Timeout Issues

Connection errors to Fabric data sources:

Verify connections in Manage connections panel
Confirm credentials are valid and not expired
Check that connected data source artifacts still exist
Test the data source independently (run a query directly in the Warehouse/Lakehouse)

Capacity throttling indicators:

Open the Microsoft Fabric Capacity Metrics app
Navigate to the Compute page
Filter to the workspace containing your UDF
Check if CU utilization exceeds 100% during the failure window
Look for HTTP 430 errors in logs: TooManyRequestsForCapacity

Timeout approaching 240s:

Break large operations into smaller chunks
Implement pagination in data retrieval
Consider moving heavy processing to a Notebook and using the UDF as a thin API layer
Use logging.warning() to flag operations exceeding thresholds
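The chunking and threshold-logging advice can be combined into a loop that stops cleanly before the limit instead of being cut off. The sketch below assumes a 30-second safety margin and a caller that can resume from the returned index; `process_in_chunks`, `handle_chunk`, and the margin are hypothetical and should be tuned to your workload.

```python
import logging
import time

TIMEOUT_S = 240        # execution timeout from the service limits table
SAFETY_MARGIN_S = 30   # assumed headroom, not a documented value


def process_in_chunks(items, handle_chunk, chunk_size=500,
                      budget_s=TIMEOUT_S - SAFETY_MARGIN_S,
                      clock=time.perf_counter):
    """Process items chunk by chunk and stop early once the time budget is spent.

    Returns (results, next_index); next_index == len(items) means all items were handled.
    """
    start = clock()
    results = []
    index = 0
    while index < len(items):
        if clock() - start >= budget_s:
            logging.warning("Time budget reached at item %d of %d", index, len(items))
            break
        chunk = items[index:index + chunk_size]
        results.extend(handle_chunk(chunk))
        index += len(chunk)
    return results, index


doubled, next_index = process_in_chunks(list(range(10)), lambda c: [x * 2 for x in c],
                                        chunk_size=3)
print(doubled, next_index)
```

Returning `next_index` lets the caller issue a follow-up invocation for the remaining items, which keeps each call short and the UDF usable as a thin API layer.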

Step 5: Resolve Response Size Issues

Functions that return unbounded result sets can exceed the 30 MB response limit.

Diagnostic approach:

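import json
import logging

import fabric.functions as fn

# Standard UDF setup; skip if your function file already defines `udf`
udf = fn.UserDataFunctions()

WARN_BYTES = 25 * 1024 * 1024  # 25 MB warning threshold

@udf.function()
def my_query_function() -> list:
    results = execute_query()  # placeholder for your own query code
    # Measure the encoded JSON payload; sys.getsizeof would report the size
    # of the Python string object instead
    size_bytes = len(json.dumps(results).encode("utf-8"))
    logging.info(f"Response size estimate: {size_bytes / (1024*1024):.2f} MB")

    if size_bytes > WARN_BYTES:
        logging.warning("Response approaching 30 MB limit")

    return results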

Mitigations:

Add TOP/LIMIT clauses to queries
Implement pagination with offset parameters
Return summary/aggregated data instead of raw rows
Compress or filter response fields
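Pagination with offset parameters could follow the shape sketched below. It slices an in-memory list only to show the contract between function and caller; in a real UDF the offset and page size would be pushed into the query so that only one page is read from the source. The name `paginate` and the response keys are hypothetical.

```python
def paginate(rows, offset=0, page_size=1000):
    """Return one page of rows plus the offset of the next page (None when done)."""
    if offset < 0 or page_size <= 0:
        raise ValueError("offset must be >= 0 and page_size must be > 0")
    page = rows[offset:offset + page_size]
    next_offset = offset + len(page)
    return {
        "rows": page,
        # None tells the caller there is nothing left to fetch.
        "next_offset": next_offset if next_offset < len(rows) else None,
    }


first_page = paginate(list(range(5)), offset=0, page_size=2)
print(first_page)
```

The caller keeps invoking the function with the returned `next_offset` until it receives `None`, so no single response has to carry the full result set.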

Step 6: Analyze Capacity Consumption

UDF operations reported in the Fabric Capacity Metrics app:

Operation	Type	Trigger
User Data Functions Execution	Interactive	Function invoked by portal, Fabric item, or external app
User Data Functions Portal Test	Interactive	Testing in Develop mode (minimum 15-min session)
User Data Functions Static Storage	Background	Metadata stored in OneLake (always-on cost)
User Data Functions Static Storage Read	Background	Metadata read after inactivity period
User Data Functions Static Storage Write	Background	Every publish operation

Cost reduction strategies:

Reduce invocation frequency from calling items (Pipelines, Notebooks)
Cache results in the caller when data doesn't change frequently
Optimize function duration (execution time directly impacts CU consumption)
Consolidate multiple small functions into fewer, more efficient ones
Avoid unnecessary publishes (each triggers storage write operations)
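Caching in the caller, as suggested above, can be as simple as a time-to-live cache around the function invocation. The sketch below assumes the caller is Python code such as a Notebook; `TTLCache` is a hypothetical helper, and the `fetch` callable stands in for whatever code actually invokes the UDF.

```python
import time


class TTLCache:
    """Cache values per key and refetch them once they are older than ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (stored_at, value)

    def get_or_call(self, key, fetch):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # still fresh: no invocation, no CU consumed
        value = fetch()
        self._entries[key] = (now, value)
        return value


cache = TTLCache(ttl_seconds=300)
rates = cache.get_or_call("rates", lambda: {"EUR": 1.0})  # placeholder for a UDF call
print(rates)
```

Choose the TTL from how stale the data may be for the consumer; a longer TTL means fewer invocations and therefore less execution time billed against the capacity.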

Run the capacity-analysis.ps1 script to generate a capacity usage summary.

Step 7: Resolve Library Loading Issues

Heavy or numerous libraries increase initialization time and can cause import errors.

Best practices:

Use only libraries you actually need in definition.json
Pin specific versions to
