Universal Dataset Import for FiftyOne
Overview
Import any dataset into FiftyOne regardless of media type, label format, or folder structure. Automatically detects and handles:
- All media types: images, videos, point clouds, 3D scenes
- All label formats: COCO, YOLO, VOC, CVAT, KITTI, OpenLABEL, and more
- Multimodal groups: Multiple cameras + LiDAR per scene (autonomous driving, robotics)
- Complex folder structures: Nested directories, scene-based organization
Use this skill when:
- Importing datasets from any source or format
- Working with autonomous driving data (multiple cameras, LiDAR, radar)
- Loading multimodal data that needs grouping
- The user doesn't know or specify the exact format
- Importing point clouds, 3D scenes, or mixed media types
Prerequisites
- FiftyOne MCP server installed and running
- @voxel51/io plugin for importing data
- @voxel51/utils plugin for dataset management
Key Directives
ALWAYS follow these rules:
1. Scan folder FIRST
Before any import, deeply scan the directory to understand its structure:
# Use bash to explore
find /path/to/data -type f | head -50
ls -la /path/to/data
2. Auto-detect everything
Detect media types, label formats, and grouping patterns automatically. Never ask the user to specify format if it can be inferred.
3. Detect multimodal groups
Look for patterns that indicate grouped data:
- Scene folders containing multiple media files
- Filename patterns with common prefixes (e.g., scene_001_left.jpg, scene_001_right.jpg)
- Mixed media types that should be grouped (images + point clouds)
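The filename-prefix pattern can be sketched in a few lines of Python. This is a minimal illustration, not the skill's actual detection logic: the `<group>_<slice>.<ext>` convention and the helper name `group_by_prefix` are assumptions made for the example.

```python
import re
from collections import defaultdict
from pathlib import PurePosixPath

# Assumed convention: "<group_id>_<slice>.<ext>", e.g. scene_001_left.jpg
PATTERN = re.compile(r"^(?P<group>.+)_(?P<slice>[^_]+)$")


def group_by_prefix(paths):
    """Map group id -> {slice name: path} for files that share a prefix."""
    groups = defaultdict(dict)
    for path in paths:
        match = PATTERN.match(PurePosixPath(path).stem)
        if match:
            groups[match["group"]][match["slice"]] = path
    # A prefix seen only once is not a group, so drop it
    return {g: s for g, s in groups.items() if len(s) > 1}


example = group_by_prefix([
    "data/scene_001_left.jpg",
    "data/scene_001_right.jpg",
    "data/scene_002_left.jpg",
])
```

Here `example` contains a single group, `scene_001`, with `left` and `right` slices; `scene_002` is dropped because only one file carries that prefix.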
4. Detect and install required packages
Many specialized dataset formats require external Python packages. After detecting the format:
- Identify required packages based on the detected format
- Check if packages are installed using pip show <package>
- Search for installation instructions if needed (use web search or FiftyOne docs)
- Ask user for permission before installing any packages
- Install required packages (see installation methods below)
- Verify installation before proceeding
Common format-to-package mappings:
| Dataset Format | Package Name | Install Command |
|---|---|---|
| PandaSet | pandaset | pip install "git+https://github.com/scaleapi/pandaset-devkit.git#subdirectory=python" |
| nuScenes | nuscenes-devkit | pip install nuscenes-devkit |
| Waymo Open | waymo-open-dataset-tf | See Waymo docs (requires TensorFlow) |
| Argoverse 2 | av2 | pip install av2 |
| KITTI 3D | pykitti | pip install pykitti |
| Lyft L5 | l5kit | pip install l5kit |
| A2D2 | a2d2 | See Audi A2D2 docs |
Additional packages for 3D processing:
| Purpose | Package Name | Install Command |
|---|---|---|
| Point cloud conversion to PCD | open3d | pip install open3d |
| Point cloud processing | pyntcloud | pip install pyntcloud |
| LAS/LAZ point clouds | laspy | pip install laspy |
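The installed-package check can also be done from Python instead of pip show. The sketch below uses the standard-library importlib.metadata module; the `FORMAT_PACKAGES` dictionary copies only a few rows of the table above, and the helper name `missing_packages` is hypothetical.

```python
from importlib import metadata

# Subset of the format-to-package table: detected format -> PyPI distribution
FORMAT_PACKAGES = {
    "nuScenes": "nuscenes-devkit",
    "Argoverse 2": "av2",
    "KITTI 3D": "pykitti",
    "Lyft L5": "l5kit",
}


def missing_packages(detected_formats):
    """Return {format: package} for mapped packages that are not installed."""
    missing = {}
    for fmt in detected_formats:
        package = FORMAT_PACKAGES.get(fmt)
        if package is None:
            continue  # unmapped format: needs the package discovery workflow
        try:
            metadata.version(package)
        except metadata.PackageNotFoundError:
            missing[fmt] = package
    return missing
```

Anything this returns still needs the user's permission before installation.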
Installation methods (in order of preference):
- PyPI - Standard pip install:
pip install <package-name>
- GitHub URL - When package is not on PyPI:
# Standard GitHub install
pip install "git+https://github.com/<org>/<repo>.git"
# With subdirectory (for monorepos)
pip install "git+https://github.com/<org>/<repo>.git#subdirectory=python"
# Specific branch or tag
pip install "git+https://github.com/<org>/<repo>.git@v1.0.0"
- Clone and install - For complex builds:
git clone https://github.com/<org>/<repo>.git
cd <repo>
pip install .
Dynamic package discovery workflow:
If the format is not in the table above:
- Search PyPI for <format-name>, <format-name>-devkit, or <format-name>-sdk
- Search GitHub for <format-name> devkit or <format-name> python
- Search web for "FiftyOne import <format-name>" or "<format-name> python tutorial"
- Check the dataset's official website for developer tools/SDK
- Present findings to user with installation options
After installation:
- Verify the package is installed: pip show <package-name>
- Test import in Python: python -c "from <package> import ..."
- Search for FiftyOne integration examples or write custom import code
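The import test can be run without executing the package's code. This is a small sketch using the standard library; `can_import` is a hypothetical helper name. Note that the module name to import can differ from the pip distribution name, so check the package's documentation for the right one.

```python
from importlib import util


def can_import(module_name):
    """True if `module_name` resolves to an importable top-level module."""
    return util.find_spec(module_name) is not None
```

For example, `can_import("json")` is true on any standard Python installation, while a name that is not installed yields false.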
5. Confirm before importing
Present findings to user and explicitly ask for confirmation before creating the dataset. Always end your scan summary with a clear question like:
- "Proceed with import?"
- "Should I create the dataset with these settings?"
Wait for user response before proceeding. Do not create the dataset until the user confirms.
6. Check for existing datasets
Before creating a dataset, check if the proposed name already exists:
list_datasets()
If the dataset name exists, ask the user:
- Overwrite: Delete existing and create new
- Rename: Use a different name (suggest alternatives like dataset-name-v2)
- Abort: Cancel the import
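For the Rename option, one way to produce a suggestion is sketched below. The function name `suggest_name` is hypothetical, and `existing` is assumed to be the list of names returned by list_datasets().

```python
def suggest_name(proposed, existing):
    """Return `proposed` if free, else the first free `<proposed>-vN` (N >= 2)."""
    taken = set(existing)
    if proposed not in taken:
        return proposed
    version = 2
    while f"{proposed}-v{version}" in taken:
        version += 1
    return f"{proposed}-v{version}"
```

The result is only a suggestion; the user still chooses between overwrite, rename, and abort.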
7. Validate after import
Compare imported sample count with source file count. Report any discrepancies.
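A count comparison alone does not say which files are missing. The sketch below compares path sets instead. It assumes both lists use the same path form (for example, absolute paths), and `find_discrepancies` is a hypothetical helper, not part of FiftyOne.

```python
def find_discrepancies(source_paths, imported_paths):
    """Compare source media paths with the paths that ended up in the dataset."""
    source, imported = set(source_paths), set(imported_paths)
    return {
        "source_count": len(source),
        "imported_count": len(imported),
        "missing": sorted(source - imported),      # on disk, not imported
        "unexpected": sorted(imported - source),   # imported, not in the scan
    }
```

An empty `missing` and `unexpected` means the import matches the scan; anything else should be reported to the user.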
8. Report errors minimally to user
Keep error messages simple for the user. Use detailed error info internally to diagnose issues.
Complete Workflow
Step 1: Deep Folder Scan
Scan the target directory to understand its structure:
# Count files by extension
find /path/to/data -type f | sed 's/.*\.//' | sort | uniq -c | sort -rn
# List directory structure (2 levels deep)
find /path/to/data -maxdepth 2 -type d
# Sample some files
ls -la /path/to/data/* | head -20
# IMPORTANT: Scan for ALL annotation/label directories
ls -la /path/to/data/annotations/ /path/to/data/labels/ 2>/dev/null
Build an inventory of:
- Media files by type (images, videos, point clouds, 3D)
- Label files by format (JSON, XML, TXT, YAML, PKL)
- Directory structure (flat vs nested vs scene-based)
- ALL annotation types present (cuboids, segmentation, tracking, etc.)
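The same extension count can be built in Python when the inventory needs to be reused by later steps. This is a minimal sketch using only the standard library; `extension_inventory` is a hypothetical helper name.

```python
from collections import Counter
from pathlib import Path


def extension_inventory(root):
    """Count files under `root` by lowercase extension ('' if none)."""
    return Counter(
        p.suffix.lower() for p in Path(root).rglob("*") if p.is_file()
    )
```

Only the final suffix is counted, so a file such as annotation.pkl.gz is tallied under `.gz`.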
For 3D/Autonomous Driving datasets, specifically check:
# List all annotation subdirectories
find /path/to/data -type d \( -name "annotations" -o -name "labels" \) | xargs -I {} ls -la {}
# Sample an annotation file to understand its structure
python3 -c "import pickle, gzip; print(pickle.load(gzip.open('path/to/annotation.pkl.gz', 'rb'))[:2])"
Step 2: Identify Media Types
Classify files by extension:
| Extensions | Media Type | FiftyOne Type |
|---|---|---|
| .jpg, .jpeg, .png, .gif, .bmp, .webp, .tiff | Image | image |
| .mp4, .avi, .mov, .mkv, .webm | Video | video |
| .pcd, .ply, .las, .laz | Point Cloud | point-cloud |
| .fo3d, .obj, .gltf, .glb | 3D Scene | 3d |
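The table translates directly into a lookup. The sketch below copies the extensions and type strings from the table above; the names `MEDIA_TYPES` and `media_type` are hypothetical.

```python
from pathlib import PurePath

# Extensions and type strings copied from the table above
MEDIA_TYPES = {
    "image": {".jpg", ".jpeg", ".png", ".gif", ".bmp", ".webp", ".tiff"},
    "video": {".mp4", ".avi", ".mov", ".mkv", ".webm"},
    "point-cloud": {".pcd", ".ply", ".las", ".laz"},
    "3d": {".fo3d", ".obj", ".gltf", ".glb"},
}
EXT_TO_MEDIA = {
    ext: media for media, exts in MEDIA_TYPES.items() for ext in exts
}


def media_type(path):
    """Return the media type for `path`, or None for non-media files."""
    return EXT_TO_MEDIA.get(PurePath(path).suffix.lower())
```

Files that return None (for example .json or .txt) are candidates for label files.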
Step 3: Detect Label Format
Identify label format from file patterns:
| Pattern | Format | Dataset Type |
|---|---|---|
| annotations.json or instances*.json with COCO structure | COCO | COCO |
| *.xml files with Pascal VOC structure | VOC | VOC |
| *.txt per image + classes.txt | YOLOv4 | YOLOv4 |
| data.yaml + labels/*.txt | YOLOv5 | YOLOv5 |
| *.txt per image (KITTI format) | KITTI | KITTI |
| Single annotations.xml (CVAT format) | CVAT | CVAT Image |
| *.json with OpenLABEL structure | OpenLABEL | OpenLABEL Image |
| Folder-per-class structure | Classification | Image Classification Directory Tree |
| *.csv with filepath column | CSV | CSV |
| *.json with GeoJSON structure | GeoJSON | GeoJSON |
| .dcm DICOM files | DICOM | DICOM |
| .tiff with geo metadata | GeoTIFF | GeoTIFF |
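Several rows depend on file structure rather than file name, which means opening the file. The sketch below is a best-effort heuristic covering only a few rows of the table. The structural markers are assumptions drawn from the public format conventions (COCO top-level keys, the VOC and CVAT root elements, the GeoJSON `type` field, the OpenLABEL root key), and `sniff_label_file` is a hypothetical helper.

```python
import json
import xml.etree.ElementTree as ET
from pathlib import Path


def sniff_label_file(path):
    """Best-effort guess of one label file's format; None if unknown."""
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix == ".json":
        with open(path) as f:
            data = json.load(f)
        if isinstance(data, dict):
            if {"images", "annotations", "categories"}.issubset(data):
                return "COCO"
            if data.get("type") == "FeatureCollection":
                return "GeoJSON"
            if "openlabel" in data:
                return "OpenLABEL"
    elif suffix == ".xml":
        root_tag = ET.parse(path).getroot().tag
        if root_tag == "annotation":   # Pascal VOC: one <annotation> per image
            return "VOC"
        if root_tag == "annotations":  # CVAT: a single <annotations> document
            return "CVAT"
    elif path.name == "data.yaml":
        return "YOLOv5"
    return None
```

A None result means the file needs a closer look (or the user's input), not that it carries no labels.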
Specialized Autonomous Driving Formats (require external packages):
| Directory Pattern | Format | Required Package |
|---|---|---|
camera/, lidar/, annotations/cuboids/ wi |