FlowIO: Flow Cytometry Standard File Handler
Overview
FlowIO is a lightweight Python library for reading and writing Flow Cytometry Standard (FCS) files. Parse FCS metadata, extract event data, and create new FCS files with minimal dependencies. The library supports FCS versions 2.0, 3.0, and 3.1, making it ideal for backend services, data pipelines, and basic cytometry file operations.
When to Use This Skill
This skill should be used when:
- FCS files requiring parsing or metadata extraction
- Flow cytometry data needing conversion to NumPy arrays
- Event data requiring export to FCS format
- Multi-dataset FCS files needing separation
- Channel information extraction (scatter, fluorescence, time)
- Cytometry file validation or inspection
- Pre-processing workflows before advanced analysis
Related Tools: For advanced flow cytometry analysis including compensation, gating, and FlowJo/GatingML support, recommend FlowKit library as a companion to FlowIO.
Installation
uv pip install flowio
Requires Python 3.9 or later.
Quick Start
Basic File Reading
from flowio import FlowData
# Read FCS file
flow_data = FlowData('experiment.fcs')
# Access basic information
print(f"FCS Version: {flow_data.version}")
print(f"Events: {flow_data.event_count}")
print(f"Channels: {flow_data.pnn_labels}")
# Get event data as NumPy array
events = flow_data.as_array() # Shape: (events, channels)
Creating FCS Files
import numpy as np
from flowio import create_fcs
# Prepare data
data = np.array([[100, 200, 50], [150, 180, 60]]) # 2 events, 3 channels
channels = ['FSC-A', 'SSC-A', 'FL1-A']
# Create FCS file
create_fcs('output.fcs', data, channels)
Core Workflows
Reading and Parsing FCS Files
The FlowData class provides the primary interface for reading FCS files.
Standard Reading:
from flowio import FlowData
# Basic reading
flow = FlowData('sample.fcs')
# Access attributes
version = flow.version # '3.0', '3.1', etc.
event_count = flow.event_count # Number of events
channel_count = flow.channel_count # Number of channels
pnn_labels = flow.pnn_labels # Short channel names
pns_labels = flow.pns_labels # Descriptive stain names
# Get event data
events = flow.as_array() # Preprocessed (gain, log scaling applied)
raw_events = flow.as_array(preprocess=False) # Raw data
Memory-Efficient Metadata Reading:
When only metadata is needed (no event data):
# Only parse TEXT segment, skip DATA and ANALYSIS
flow = FlowData('sample.fcs', only_text=True)
# Access metadata
metadata = flow.text # Dictionary of TEXT segment keywords
print(metadata.get('$DATE')) # Acquisition date
print(metadata.get('$CYT')) # Instrument name
Handling Problematic Files:
Some FCS files have offset discrepancies or errors:
# Ignore offset discrepancies between HEADER and TEXT sections
flow = FlowData('problematic.fcs', ignore_offset_discrepancy=True)
# Use HEADER offsets instead of TEXT offsets
flow = FlowData('problematic.fcs', use_header_offsets=True)
# Ignore offset errors entirely
flow = FlowData('problematic.fcs', ignore_offset_error=True)
Excluding Null Channels:
# Exclude specific channels during parsing
flow = FlowData('sample.fcs', null_channel_list=['Time', 'Null'])
Extracting Metadata and Channel Information
FCS files contain rich metadata in the TEXT segment.
Common Metadata Keywords:
flow = FlowData('sample.fcs')
# File-level metadata
text_dict = flow.text
acquisition_date = text_dict.get('$DATE', 'Unknown')
instrument = text_dict.get('$CYT', 'Unknown')
data_type = flow.data_type # 'I', 'F', 'D', 'A'
# Channel metadata
for i in range(flow.channel_count):
pnn = flow.pnn_labels[i] # Short name (e.g., 'FSC-A')
pns = flow.pns_labels[i] # Descriptive name (e.g., 'Forward Scatter')
pnr = flow.pnr_values[i] # Range/max value
print(f"Channel {i}: {pnn} ({pns}), Range: {pnr}")
Channel Type Identification:
FlowIO automatically categorizes channels:
# Get indices by channel type
scatter_idx = flow.scatter_indices # [0, 1] for FSC, SSC
fluoro_idx = flow.fluoro_indices # [2, 3, 4] for FL channels
time_idx = flow.time_index # Index of time channel (or None)
# Access specific channel types
events = flow.as_array()
scatter_data = events[:, scatter_idx]
fluorescence_data = events[:, fluoro_idx]
ANALYSIS Segment:
If present, access processed results:
if flow.analysis:
analysis_keywords = flow.analysis # Dictionary of ANALYSIS keywords
print(analysis_keywords)
Creating New FCS Files
Generate FCS files from NumPy arrays or other data sources.
Basic Creation:
import numpy as np
from flowio import create_fcs
# Create event data (rows=events, columns=channels)
events = np.random.rand(10000, 5) * 1000
# Define channel names
channel_names = ['FSC-A', 'SSC-A', 'FL1-A', 'FL2-A', 'Time']
# Create FCS file
create_fcs('output.fcs', events, channel_names)
With Descriptive Channel Names:
# Add optional descriptive names (PnS)
channel_names = ['FSC-A', 'SSC-A', 'FL1-A', 'FL2-A', 'Time']
descriptive_names = ['Forward Scatter', 'Side Scatter', 'FITC', 'PE', 'Time']
create_fcs('output.fcs',
events,
channel_names,
opt_channel_names=descriptive_names)
With Custom Metadata:
# Add TEXT segment metadata
metadata = {
'$SRC': 'Python script',
'$DATE': '19-OCT-2025',
'$CYT': 'Synthetic Instrument',
'$INST': 'Laboratory A'
}
create_fcs('output.fcs',
events,
channel_names,
opt_channel_names=descriptive_names,
metadata=metadata)
Note: FlowIO exports as FCS 3.1 with single-precision floating-point data.
Exporting Modified Data
Modify existing FCS files and re-export them.
Approach 1: Using write_fcs() Method:
from flowio import FlowData
# Read original file
flow = FlowData('original.fcs')
# Write with updated metadata
flow.write_fcs('modified.fcs', metadata={'$SRC': 'Modified data'})
Approach 2: Extract, Modify, and Recreate:
For modifying event data:
from flowio import FlowData, create_fcs
# Read and extract data
flow = FlowData('original.fcs')
events = flow.as_array(preprocess=False)
# Modify event data
events[:, 0] = events[:, 0] * 1.5 # Scale first channel
# Create new FCS file with modified data
create_fcs('modified.fcs',
events,
flow.pnn_labels,
opt_channel_names=flow.pns_labels,
metadata=flow.text)
Handling Multi-Dataset FCS Files
Some FCS files contain multiple datasets in a single file.
Detecting Multi-Dataset Files:
from flowio import FlowData, MultipleDataSetsError
try:
flow = FlowData('sample.fcs')
except MultipleDataSetsError:
print("File contains multiple datasets")
# Use read_multiple_data_sets() instead
Reading All Datasets:
from flowio import read_multiple_data_sets
# Read all datasets from file
datasets = read_multiple_data_sets('multi_dataset.fcs')
print(f"Found {len(datasets)} datasets")
# Process each dataset
for i, dataset in enumerate(datasets):
print(f"\nDataset {i}:")
print(f" Events: {dataset.event_count}")
print(f" Channels: {dataset.pnn_labels}")
# Get event data for this dataset
events = dataset.as_array()
print(f" Shape: {events.shape}")
print(f" Mean values: {events.mean(axis=0)}")
Reading Specific Dataset:
from flowio import FlowData
# Read first dataset (nextdata_offset=0)
first_dataset = FlowData('multi.fcs', nextdata_offset=0)
# Read second dataset using NEXTDATA offset from first
next_offset = int(first_dataset.text['$NEXTDATA'])
if next_offset > 0:
second_dataset = FlowData('multi.fcs'