Explore skills
5,086 skills found
exploratory-data-analysis
Performs comprehensive exploratory data analysis on scientific data files across 200+ formats. It automatically detects file types and generates detailed markdown reports with format-specific analysis, quality metrics, and downstream analysis recommendations to understand data structure, content, and quality.
deeptools
NGS analysis toolkit for BAM to bigWig conversion, QC (correlation, PCA, fingerprints), and visualization of ChIP-seq, RNA-seq, and ATAC-seq data through heatmaps/profiles (TSS, peaks).
matlab
MATLAB and GNU Octave for numerical computing, including matrix operations, data analysis, visualization, and scientific computing. Use for scripts involving linear algebra, signal/image processing, differential equations, optimization, statistics, or scientific visualizations, as well as for syntax help, functions, or conversions in MATLAB.
torch-geometric
PyTorch Geometric (PyG) for graph neural networks: node/link/graph classification, message passing (GCN, GAT, GraphSAGE, GIN), heterogeneous graphs, neighbor sampling, and custom datasets. Use it specifically for torch_geometric code, not for general NetworkX analytics or non-graph PyTorch models.
cellxgene-census
Programmatically query the CELLxGENE Census (61M+ cells) to get expression data across tissues, diseases, or cell types from the largest curated single-cell atlas. It is best for population-scale queries and reference atlas comparisons.
dask
Parallel and distributed computing for larger-than-RAM pandas/NumPy workflows, scaling existing code beyond memory or across clusters. Ideal for parallel file processing and distributed ML with minimal changes to existing pandas code.
glycoengineering
Analyze and engineer protein glycosylation. Scan sequences for N-glycosylation sequons (N-X-S/T, where X is any residue except proline), predict O-glycosylation hotspots, and access curated glycoengineering tools (NetOGlyc, GlycoShield, GlycoWorkbench) for glycoprotein engineering, therapeutic antibody optimization, and vaccine design.
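The sequon scan mentioned in the glycoengineering entry can be sketched in a few lines of plain Python. This is only an illustration, not the skill's own implementation: the function name `find_sequons` is hypothetical, and the sketch assumes the usual convention that X in N-X-S/T is any residue except proline.

```python
import re

# N-X-S/T sequon: Asn, then any residue except Pro, then Ser or Thr.
# The lookahead lets overlapping sequons (e.g. "NNTS") all be reported.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(sequence):
    """Return (1-based position of the Asn, matched tripeptide) pairs.

    Hypothetical helper for illustration; input is a one-letter
    amino-acid string.
    """
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(seq)]


if __name__ == "__main__":
    # "NPT" is skipped because proline sits in the X position.
    for position, motif in find_sequons("MKNGSAANPTLLNNTSQ"):
        print(position, motif)
```

A match only marks a candidate site; whether the asparagine is actually glycosylated depends on structural and cellular context that a pattern scan does not capture.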
imaging-data-commons
Query and download public cancer imaging data from NCI Imaging Data Commons using idc-index. Access large-scale radiology (CT, MR, PET) and pathology datasets for AI training or research without authentication. Query by metadata, visualize in the browser, and check licenses.
deepchem
Molecular ML for property prediction (ADMET, toxicity) using diverse featurizers and pre-built datasets, supporting traditional ML or GNNs. It's excellent for quick experiments with pre-trained models and extensive featurization, often leveraging MoleculeNet benchmarks.
hugging-science
Hugging Science is a curated catalog of scientific datasets, models, blog posts, and interactive Spaces for AI/ML work across scientific domains such as biology, chemistry, and physics.
seaborn
Statistical visualization with pandas integration for quick exploration of distributions, relationships, and categorical comparisons. It's best for box plots, violin plots, pair plots, and heatmaps, built on matplotlib.
usfiscaldata
Query the U.S. Treasury Fiscal Data REST API for federal financial data; no API key is required. Access national debt, Treasury statements, securities auctions, interest and foreign exchange rates, savings bonds, and U.S. government revenue and spending statistics.
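As a rough sketch of what a usfiscaldata query looks like, the snippet below only builds a request URL and does not call the network. The base URL, the `v2/accounting/od/debt_to_penny` endpoint, the field names, and the `fields`/`filter`/`sort`/`page[size]` parameters are assumptions recalled from the public Fiscal Data documentation and should be checked there; `build_query_url` is a hypothetical helper, not part of the skill.

```python
from urllib.parse import urlencode

# Assumed base URL of the Fiscal Data API; verify against the official docs.
BASE_URL = "https://api.fiscaldata.treasury.gov/services/api/fiscal_service"


def build_query_url(endpoint, fields=None, filters=None, sort=None, page_size=None):
    """Assemble a query URL (hypothetical helper, no request is sent)."""
    params = {}
    if fields:
        params["fields"] = ",".join(fields)
    if filters:
        # Each filter is assumed to use the field:operator:value form.
        params["filter"] = ",".join(filters)
    if sort:
        params["sort"] = sort
    if page_size:
        params["page[size]"] = str(page_size)
    # Keep commas, colons, and brackets readable instead of percent-encoding them.
    query = urlencode(params, safe=",:[]")
    url = f"{BASE_URL}/{endpoint.strip('/')}"
    return f"{url}?{query}" if query else url


if __name__ == "__main__":
    print(
        build_query_url(
            "v2/accounting/od/debt_to_penny",
            fields=["record_date", "tot_pub_debt_out_amt"],
            filters=["record_date:gte:2024-01-01"],
            sort="-record_date",
            page_size=5,
        )
    )
```

The resulting URL can be fetched with any HTTP client; since no API key is required, no authentication headers are needed.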