In the United States, colorectal cancer is the second leading cause of cancer death, and accurate early detection and identification of high-risk patients are a high priority. Although fecal screening tests are available, the close relationship between colorectal cancer and the gut microbiome has generated considerable interest. We describe a machine learning method for gut microbiome data to assist in diagnosing colorectal cancer. Our methodology integrates feature engineering, mediation analysis, statistical modeling, and network analysis into a novel unified pipeline. Simulation results illustrate the value of the method in comparison to existing methods. For predicting colorectal cancer in two real datasets, this pipeline showed an 8.7% higher prediction accuracy and a 13% higher area under the receiver operating characteristic curve than other published work. Additionally, the approach highlights important colorectal cancer-related taxa for prioritization, for example high levels of Bacteroides fragilis, which can help elucidate disease pathology.
Dr. Zhou is a tenured faculty member in Biological Sciences and Statistics at N.C. State University, USA, where she was appointed as part of the Chancellor’s Faculty of Excellence Program. She is an Associate Director of the Bioinformatics Research Center, which has 19 faculty spanning multiple colleges and departments, including Computer Science, Biological Sciences, and Statistics. She was an Associate Editor for Biometrics from 2018 to 2021 and is an active Associate Editor for Biostatistics. Her current research interests include machine learning, data science, causal inference, precision medicine, and environmental health sciences. Dr. Zhou created and received funding for the High Dimensional Predictive Biology Lab, and her team has developed numerous software packages that significantly impact the fields of public health and biostatistics.