Restricting datasets to classifiable samples augments discovery of immune disease biomarkers

Glehr, Gunther and Riquelme, Paloma and Kronenberg, Katharina and Lohmayer, Robert and Lopez-Madrona, Victor J. and Kapinsky, Michael and Schlitt, Hans J. and Geissler, Edward K. and Spang, Rainer and Haferkamp, Sebastian and Hutchinson, James A. (2024) Restricting datasets to classifiable samples augments discovery of immune disease biomarkers. Nature Communications, 15 (1): 5417. ISSN 2041-1723

Full text not available from this repository.

Abstract

Immunological diseases are typically heterogeneous in clinical presentation, severity and response to therapy. Biomarkers of immune diseases often reflect this variability, especially compared to their regulated behaviour in health. This leads to a common difficulty that frustrates biomarker discovery and interpretation: namely, unequal dispersion of immune disease biomarker expression between patient classes necessarily limits a biomarker's informative range. To solve this problem, we introduce dataset restriction, a procedure that splits datasets into classifiable and unclassifiable samples. Applied to synthetic flow cytometry data, restriction identifies biomarkers that are otherwise disregarded. In advanced melanoma, restriction finds biomarkers of immune-related adverse event risk after immunotherapy and enables us to build multivariate models that accurately predict immunotherapy-related hepatitis. Hence, dataset restriction augments discovery of immune disease biomarkers, increases predictive certainty for classifiable samples and improves multivariate models incorporating biomarkers with a limited informative range. This principle can be directly extended to any classification task.

Immune disease-associated biomarker values are commonly more variable in affected compared to unaffected patient populations, which limits a biomarker's informative range. Here, the authors formalise a computational solution that splits datasets into informative and uninformative subsets to improve biomarker discovery and performance of multivariate predictive models.
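
The idea of a limited informative range can be illustrated with a small sketch. The code below is a toy illustration only, not the authors' published procedure: the helper names `auc` and `restrict`, the `min_per_class` parameter, the cutoff scan and the made-up data are all assumptions introduced here. It scans candidate cutoffs on a single biomarker, keeps only the samples on one side of the cutoff, and reports the restriction whose kept subset separates the two classes best by AUC.

```python
def auc(pos, neg):
    """Mann-Whitney estimate of P(pos > neg); ties count as 0.5."""
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def restrict(values, labels, min_per_class=5):
    """Toy restriction scan (hypothetical helper, not the published algorithm).

    For every observed cutoff, keep only the samples on the "high" side
    (value >= cutoff) or the "low" side (value <= cutoff) and score the
    kept subset by its AUC. Return the restriction whose AUC is farthest
    from 0.5, or the unrestricted data if nothing beats it.
    """
    pairs = list(zip(values, labels))
    best = {
        "side": "all",
        "cutoff": None,
        "auc": auc([v for v, y in pairs if y == 1], [v for v, y in pairs if y == 0]),
    }
    for cutoff in sorted(set(values)):
        for side in ("high", "low"):
            kept = [(v, y) for v, y in pairs
                    if (v >= cutoff if side == "high" else v <= cutoff)]
            pos = [v for v, y in kept if y == 1]
            neg = [v for v, y in kept if y == 0]
            if len(pos) < min_per_class or len(neg) < min_per_class:
                continue  # too few samples of one class to score this subset
            score = auc(pos, neg)
            if abs(score - 0.5) > abs(best["auc"] - 0.5):
                best = {"side": side, "cutoff": cutoff, "auc": score}
    return best


# Made-up data: controls are tightly regulated, patients are widely dispersed.
controls = list(range(40, 61))      # 40, 41, ..., 60
patients = list(range(0, 101, 5))   # 0, 5, ..., 100
values = controls + patients
labels = [0] * len(controls) + [1] * len(patients)

overall = auc(patients, controls)   # AUC on the full, unrestricted dataset
best = restrict(values, labels)     # best one-sided restriction found by the scan
print(overall, best)
```

On this made-up data the unrestricted AUC is 0.5, so the biomarker looks uninformative over its whole range, whereas the samples beyond the chosen cutoff are separated far more clearly. A naive scan like this selects the cutoff on the same data it scores, so a real analysis would need a significance test or held-out validation before trusting the restricted result.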

Item Type: Article
Uncontrolled Keywords: FLOW-CYTOMETRY; PARTIAL AUC; CURVES; TOOLS; AREA; SELF
Subjects: 600 Technology > 610 Medical sciences Medicine
Divisions: Medicine > Lehrstuhl für Chirurgie
Medicine > Lehrstuhl für Dermatologie und Venerologie
Medicine > Institut für Funktionelle Genomik > Lehrstuhl für Statistische Bioinformatik (Prof. Spang)
Informatics and Data Science > Department Computational Life Science > Lehrstuhl für Statistische Bioinformatik (Prof. Spang)
Depositing User: Dr. Gernot Deinzer
Date Deposited: 24 Jul 2025 07:32
Last Modified: 24 Jul 2025 07:32
URI: https://pred.uni-regensburg.de/id/eprint/63578
