Case Studies
Featured Projects
Real-world bioinformatics solutions delivered for biotech, pharma, and academic clients. All projects follow best practices for reproducibility, documentation, and scientific rigor.
113-Gene Classifier for Lung Adenocarcinoma
Client Challenge
Roche/Genentech needed to identify patient subgroups in lung adenocarcinoma who would respond to MEK inhibitors. Over 800 patient samples across multiple cohorts and clinical trials (TCGA, OAK, POPLAR) required comprehensive transcriptional subtyping.
Our Solution
Used consensus non-negative matrix factorization (NMF) to identify three distinct molecular subtypes (MUC, PRO, MES) with differential drug response, then developed a 113-gene PAM (Prediction Analysis for Microarrays) classifier to assign samples to those subtypes.
Technical Approach
Data Integration:
- Harmonized >800 patient samples across TCGA, OAK (Phase III), POPLAR, and gp28363 trials
- Batch effect correction and normalization
- Multi-cohort validation design
Analysis Methods:
- Consensus NMF for subtype discovery
- PAM classifier with 113-gene signature
- Cross-validation with independent cohorts (87-91% accuracy)
- Survival analysis (Cox proportional hazards)
- Gene set enrichment (Camera + MSigDB Hallmark)
- Drug response modeling (526 compounds, 89 cell lines)
Validation:
- 175 cell lines, 232 PDX models, 108 GEMM tumors
- Cross-cohort validation (4-fold larger validation set)
- Independent clinical trial testing
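As a rough illustration of the classifier step, PAM is a nearest shrunken centroid method: class centroids are pulled toward the overall centroid, and genes whose class differences fall below a threshold stop contributing. The sketch below is a textbook-style version in plain Python, not the project's R code. The toy expression values, the class labels "A" and "B", and the threshold `delta=2.0` are all hypothetical.

```python
import math
from collections import defaultdict


def soft_threshold(value, delta):
    """Shrink value toward zero by delta; anything within delta becomes 0."""
    return math.copysign(max(abs(value) - delta, 0.0), value)


def fit_shrunken_centroids(samples, labels, delta):
    """Return (shrunken class centroids, per-gene scale, class priors)."""
    n, n_genes = len(samples), len(samples[0])
    by_class = defaultdict(list)
    for row, label in zip(samples, labels):
        by_class[label].append(row)
    classes = sorted(by_class)

    overall = [sum(row[g] for row in samples) / n for g in range(n_genes)]
    centroids = {
        k: [sum(row[g] for row in rows) / len(rows) for g in range(n_genes)]
        for k, rows in by_class.items()
    }

    # Pooled within-class standard deviation for each gene.
    pooled = []
    for g in range(n_genes):
        ss = sum(
            (row[g] - centroids[label][g]) ** 2
            for row, label in zip(samples, labels)
        )
        pooled.append(math.sqrt(ss / (n - len(classes))))
    s0 = sorted(pooled)[n_genes // 2]  # roughly the median; guards tiny variances
    scale = [s + s0 for s in pooled]

    shrunken = {}
    for k in classes:
        # Standard-error factor in the 1/n_k - 1/n form.
        m_k = math.sqrt(1.0 / len(by_class[k]) - 1.0 / n)
        shrunken[k] = []
        for g in range(n_genes):
            d = (centroids[k][g] - overall[g]) / (m_k * scale[g])
            shrunken[k].append(
                overall[g] + m_k * scale[g] * soft_threshold(d, delta)
            )
    priors = {k: len(by_class[k]) / n for k in classes}
    return shrunken, scale, priors


def predict(sample, shrunken, scale, priors):
    """Assign the class with the smallest standardized distance score."""
    def score(k):
        dist = sum(
            ((x - c) / s) ** 2 for x, c, s in zip(sample, shrunken[k], scale)
        )
        return dist - 2.0 * math.log(priors[k])
    return min(shrunken, key=score)


# Hypothetical toy data: 3 genes, 2 classes; gene 3 carries no class signal.
toy_samples = [
    [5.0, 1.0, 3.0], [5.2, 1.1, 2.9], [4.8, 0.9, 3.1],
    [1.0, 5.0, 3.0], [1.2, 5.1, 3.1], [0.8, 4.9, 2.9],
]
toy_labels = ["A", "A", "A", "B", "B", "B"]
model = fit_shrunken_centroids(toy_samples, toy_labels, delta=2.0)
print(predict([4.9, 1.2, 3.0], *model))
```

In this toy case the third gene is shrunk to the overall mean in both classes, so it no longer affects the prediction. That is how the method arrives at a compact gene signature.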
Results & Impact
- Published in Clinical Cancer Research (2021), a high-impact journal
- 87-91% accuracy across independent validation cohorts
- Identified MES subtype as primary beneficiary of MEK inhibition
- Clinical utility for patient stratification in future trials
- Open source code available on GitHub
Technologies: R/Bioconductor, NMF, PAM, limma, survival (survfit), ggplot2, ComplexHeatmap
Genome-Wide CRISPR Screen Analysis Platform
Client Challenge
Identify genes involved in tumor immune evasion using a genome-wide CRISPR knockout screen with ~160,000 sgRNAs. The project needed a robust analysis pipeline for hit calling, validation, and mechanism characterization.
Our Solution
Built a comprehensive CRISPR screen analysis pipeline using the crisprVerse workflow, with TMM normalization on non-essential genes, limma-voom for differential abundance, and integration with ChIP-seq and RNA-seq data.
Technical Approach
Screen Analysis:
- ~160,000 sgRNAs processed (screenCounter)
- TMM normalization on non-essential genes
- Robust empirical Bayes statistics (limma-voom)
- FDR correction for multiple testing
- Hit prioritization based on fold-change and statistical significance
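The last two steps above can be sketched in a few lines. The source does not name the FDR method, so this example assumes Benjamini-Hochberg adjustment. The gene names, p-values, log fold-changes, and both cutoffs are hypothetical placeholders, not values from the screen.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values in the original input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def call_hits(genes, log_fold_changes, p_values, fdr_cutoff=0.05, lfc_cutoff=1.0):
    """Keep genes passing both the FDR and the absolute fold-change cutoff."""
    adjusted = benjamini_hochberg(p_values)
    hits = [
        (gene, lfc, q)
        for gene, lfc, q in zip(genes, log_fold_changes, adjusted)
        if q <= fdr_cutoff and abs(lfc) >= lfc_cutoff
    ]
    return sorted(hits, key=lambda hit: hit[2])  # most significant first


# Hypothetical gene-level results.
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
lfcs = [2.0, 0.3, -1.5, 1.2]
pvals = [0.01, 0.04, 0.03, 0.005]
for gene, lfc, q in call_hits(genes, lfcs, pvals):
    print(f"{gene}\tlogFC={lfc:+.2f}\tFDR={q:.3f}")
```

Requiring both a significance cutoff and an effect-size cutoff screens out guides that are statistically detectable but have little biological effect.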
Mechanistic Validation:
- RNA-seq analysis (HTSeqGenie pipeline)
- ChIP-seq integration (public GEO data: GSE102616)
- Pathway enrichment (MSigDB Hallmark)
- TCGA survival modeling (Cox proportional hazards)
- Clinical correlation analysis
Results & Impact
- Published in iScience (2025)
- Identified ZFX transcription factor as key regulator of immune evasion
- Biomarker discovery for anti-PD-1 response prediction
- Mechanistic insights into apoptosis pathway regulation
- Public data available (GEO: GSE276964, GSE276965)
Technologies: crisprVerse, screenCounter, limma-voom, edgeR, BWA, TCGA integration
Single-Cell RNA-seq Pipeline for CRISPR Screens
Client Challenge
Academic and pharma clients needed a reproducible, automated pipeline for processing single-cell CRISPR perturbation screens and generating ML-ready datasets with proper quality control and normalization.
Our Solution
Developed a production-grade Snakemake pipeline automating the entire workflow from FASTQ to analysis-ready perturbation datasets. The pipeline implements comprehensive QC, normalization using Scanpy, and balanced class sampling for downstream machine learning applications.
Technical Approach
Pipeline Features:
- Automated quality control and filtering
- Scanpy-based normalization workflow
- Balanced control/perturbation group sampling
- Integration of multiple samples and batches
- Dimensionality reduction (PCA, UMAP)
- Cell type annotation support
- Multiple output formats (CSV, h5ad/AnnData)
- ML-ready feature matrices
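The balanced control/perturbation sampling step could look something like the following. The source does not say how the balancing is done, so this sketch assumes simple downsampling of every group to the size of the smallest one. The cell barcodes, group names, and seed are hypothetical.

```python
import random
from collections import defaultdict


def balanced_sample(cell_ids, group_labels, seed=0):
    """Downsample every group to the size of the smallest group."""
    groups = defaultdict(list)
    for cell, label in zip(cell_ids, group_labels):
        groups[label].append(cell)
    target = min(len(cells) for cells in groups.values())
    rng = random.Random(seed)  # fixed seed keeps the draw reproducible
    return {
        label: sorted(rng.sample(cells, target))
        for label, cells in groups.items()
    }


# Hypothetical barcodes: 7 control cells and 3 perturbed cells.
cells = [f"cell_{i}" for i in range(10)]
labels = ["control"] * 7 + ["perturbed"] * 3
balanced = balanced_sample(cells, labels, seed=42)
print({label: len(ids) for label, ids in balanced.items()})
```

Fixing the seed matters for a reproducible pipeline: rerunning the workflow on the same input then yields the same training set.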
Reproducibility:
- Complete Docker containerization
- Snakemake workflow management
- Comprehensive configuration files
- Automated Quarto report generation
- Version-controlled codebase (GitHub)
- Continuous integration testing (GitHub Actions)
Key Technologies:
- Snakemake for workflow orchestration
- Python/Scanpy for scRNA-seq analysis
- Docker for reproducible environments
- Quarto for dynamic documentation
- GitHub Actions for CI/CD
Results & Impact
- Open source on GitHub
- Fully documented with comprehensive guide
- Production-ready for multiple client projects
- Scalable from hundreds to tens of thousands of cells
- Reproducible across HPC and cloud environments
- Flexible configuration for different experimental designs
Technologies: Snakemake, Python, Scanpy, Docker, Quarto, GitHub Actions
GitHub Repository | Documentation
NK Cell Activation Analysis for Bispecific Antibody
Client Challenge
Characterize NK cell activation mechanisms following T cell-dependent bispecific antibody (TDB) treatment using RNA-seq, flow cytometry, and multiplex cytokine analysis.
Our Solution
Integrated multi-modal analysis combining RNA-seq (Smart-Seq V4), flow cytometry, and Luminex cytokine profiling to identify IFN, TNF, and IL2/IL10 signaling axes driving NK cell activation.
Technical Approach
RNA-seq Analysis:
- Smart-Seq V4 ultra-low input (2 ng total RNA)
- HTSeqGenie processing pipeline
- edgeR (logCPM) and voom/limma differential expression
- Baseline vs 24h TDB treatment comparison
Pathway Analysis:
- fgsea with MSigDB collections (c2, c5, c7)
- IFN, TNF, interleukin pathway enrichment
- Cytokine receptor signaling analysis
Data Integration:
- Flow cytometry gating and quantification
- Luminex multiplex cytokine data (17-plex panel)
- Correlation analysis across data types
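Correlating measurements across data types usually means comparing values on very different scales, for example expression in logCPM against cytokine concentration. The source does not specify the correlation method. The sketch below assumes a rank-based (Spearman) correlation, which is a common choice in that situation. The sample values are hypothetical.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # average position of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-sample values: gene expression vs. cytokine level.
expression_logcpm = [0.1, 0.5, 0.9, 2.0]
cytokine_pg_ml = [10.0, 40.0, 45.0, 300.0]
print(round(spearman(expression_logcpm, cytokine_pg_ml), 3))
```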
Results & Impact
- Published in Cancer Immunology Research (2024)
- Identified IL2/IL10 axis as key NK activation mechanism
- Enhanced ADCC observed in vitro and in vivo
- Mechanistic understanding for TDB development
Technologies: HTSeqGenie, edgeR, limma-voom, fgsea, FlowJo, Luminex
Multi-Omics Integration for Tumor Microenvironment
Client Challenge
Understand KRAS-driven lung cancer immune escape mechanisms using integrated analysis of bulk RNA-seq, single-cell RNA-seq, and whole exome sequencing (WES).
Our Solution
Multi-modal analysis combining temporal bulk RNA-seq (8w, 12w, endpoint), 10X Genomics scRNA-seq (40,000 cells, 22 clusters), and WES (COSMIC signature analysis) to map immune evasion pathways.
Technical Approach
Bulk RNA-seq:
- Multiple timepoints (longitudinal analysis)
- Metacore GeneGO pathway analysis
- Differential expression across treatment groups
Single-Cell Analysis:
- 10X Genomics Chromium platform
- 22 clusters identified using ImmGen databrowser
- PANTHER overrepresentation analysis
- EGFR/ERBB signaling pathway enrichment (33.5-fold, FDR 1.06e-2)
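For readers unfamiliar with overrepresentation analysis, a fold-enrichment figure like the one above is the observed number of pathway genes in a hit list divided by the number expected by chance. The sketch below is a minimal version that assumes a one-sided hypergeometric test, which is equivalent to a one-sided Fisher's exact test. It is not PANTHER's implementation. The counts are hypothetical and do not reproduce the reported 33.5-fold result.

```python
from math import comb


def overrepresentation(hits_in_set, n_hits, set_size, n_background):
    """Return (fold enrichment, one-sided hypergeometric p-value)."""
    expected = n_hits * set_size / n_background
    fold = hits_in_set / expected
    # P(X >= hits_in_set) when drawing n_hits genes without replacement.
    upper = min(n_hits, set_size)
    tail = sum(
        comb(set_size, k) * comb(n_background - set_size, n_hits - k)
        for k in range(hits_in_set, upper + 1)
    )
    return fold, tail / comb(n_background, n_hits)


# Hypothetical counts: 20 background genes, a 5-gene pathway,
# and a 4-gene hit list of which 3 fall in the pathway.
fold, p_value = overrepresentation(
    hits_in_set=3, n_hits=4, set_size=5, n_background=20
)
print(f"fold enrichment = {fold:.1f}, p = {p_value:.4f}")
```

In a real analysis the p-values for all tested pathways would then be adjusted for multiple testing to give the reported FDR.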
WES Integration:
- COSMIC signature analysis (APOBEC mutagenesis)
- Mutation burden assessment
- Driver mutation identification
Results & Impact
- Preprint on bioRxiv (2023)
- Identified ERBB ligand upregulation (AREG, HBEGF) as immune escape mechanism
- Afatinib restores immune infiltration post-αPD-1 resistance
- Therapeutic strategy for combination therapy
Technologies: 10X Genomics, Seurat, ImmGen, PANTHER, GATK, COSMIC
R Package Development for Pharma Workflows
Client Challenge
Internal pharma teams needed a standardized, documented R package for Fluidigm qPCR data analysis with comprehensive testing and version control.
Our Solution
Developed a production-grade R package following CRAN/Bioconductor standards, with comprehensive documentation (roxygen2), unit tests (testthat, >80% coverage), and a pkgdown website.
Technical Approach
Package Development:
- devtools/usethis workflow
- roxygen2 documentation for all functions
- testthat unit tests (>80% coverage)
- Vignettes with worked examples
- S3/S4 class systems for data structures
- Error handling and input validation
Documentation:
- pkgdown website with GitHub Pages
- Comprehensive function reference
- Multiple vignettes (basic usage, advanced workflows)
- NEWS.md for version tracking
Quality Assurance:
- R CMD check passes with no errors/warnings
- Continuous integration (GitHub Actions)
- Code coverage monitoring
- CRAN standards compliance
Results & Impact
- Internal deployment at Roche/Genentech
- Standardized workflows across analysis teams
- Reduced analysis time through automation
- Comprehensive documentation reducing onboarding time
- Maintainable codebase with unit tests
Technologies: R, devtools, testthat, roxygen2, pkgdown, GitHub Actions
Interested in Similar Projects?
We can apply these same methodologies and best practices to your research:
- NGS Analysis: RNA-seq, scRNA-seq, ChIP-seq, CRISPR screens, WGS/WES
- R Package Development: Custom packages with testing and documentation
- Pipeline Development: Snakemake/Nextflow with Docker containers
- AI Integration: ML-ready datasets and predictive modeling
- Multi-Omics Analysis: Integrated analysis of multiple data types
Research Output
Four of the six case studies above resulted in publications: three peer-reviewed papers and one bioRxiv preprint.
View Full Publication List
Ready to Start Your Project?
Let's discuss how we can apply our expertise to your bioinformatics challenges.
Email: kontakt@actn3.pl
Response Time: Within 24 hours