Abstract
TITLE: A Method to Identify At-Risk Individuals When They Have Developed Colorectal Cancer by Machine Learning Analysis of Their Multiple Longitudinal Self-Reported Symptoms
ABSTRACT: Background. Surveillance of at-risk individuals and identification of those who have developed cancer could improve treatment success and reduce mortality. Colorectal cancer (CRC) is the second most common cause of cancer death worldwide. Aim. The purpose of this project is to demonstrate a new noninvasive surveillance regimen that could detect when individuals at risk for colorectal cancer have developed early-stage colorectal cancer. This heuristic example is intended to encourage national health systems to begin collecting such surveillance data. Method. At-risk individuals could include military veterans who were exposed to toxic environmental hazards and non-veterans who have a higher risk than the general population, as determined by the online Colorectal Cancer Risk Assessment Tool. Individuals would be instructed to self-report each week how frequently each of multiple symptoms occurred. The recorded longitudinal data would be analyzed every two weeks using multiple logistic regression as the machine learning classifier. Results. The frequency of occurrence of multiple symptoms in each individual would be compared with that of a reference population to stratify individuals into two classes: those with a high probability of having developed early-stage colorectal cancer and those without. Individuals with a high probability of having developed early-stage colorectal cancer would be alerted and advised to seek medical attention. Conclusion. A noninvasive surveillance procedure is described that could alert at-risk individuals when they have a high probability of having developed early-stage colorectal cancer. The procedure is based on the frequency of occurrence of their multiple self-reported symptoms, analyzed every two weeks using multiple logistic regression.
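The biweekly classification step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The symptom names, the synthetic reference population, the 14-day scaling, and the 0.8 alert threshold are all assumptions made for illustration, and the helper names (`fit_logistic`, `cancer_probability`) are hypothetical. The model is an ordinary multiple logistic regression fitted by batch gradient descent on reference data and then applied to one individual's two-week symptom counts.

```python
import math
import random

# Hypothetical placeholders: the abstract does not name specific symptoms.
SYMPTOMS = ["symptom_a", "symptom_b", "symptom_c"]
DAYS_PER_WINDOW = 14     # assumed: two weekly reports form one analysis window
ALERT_THRESHOLD = 0.8    # assumed cut-off for "high probability"


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def scale(counts):
    """Convert days-with-symptom counts (0..14) to fractions of the window."""
    return [c / DAYS_PER_WINDOW for c in counts]


def fit_logistic(X, y, lr=1.0, epochs=1000):
    """Fit multiple logistic regression by batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Synthetic reference population (fabricated for illustration only):
# label 0 = no cancer, label 1 = early-stage colorectal cancer.
random.seed(0)
controls = [[random.randint(0, 3) for _ in SYMPTOMS] for _ in range(50)]
cases = [[random.randint(4, 10) for _ in SYMPTOMS] for _ in range(50)]
X_ref = [scale(row) for row in controls + cases]
y_ref = [0] * len(controls) + [1] * len(cases)

weights, bias = fit_logistic(X_ref, y_ref)


def cancer_probability(counts):
    """Estimated probability for one individual's two-week symptom counts."""
    x = scale(counts)
    return sigmoid(bias + sum(wj * xj for wj, xj in zip(weights, x)))


# Two hypothetical individuals: frequent symptoms versus rare symptoms.
p_high = cancer_probability([9, 8, 10])
p_low = cancer_probability([0, 1, 0])
for name, p in (("frequent symptoms", p_high), ("rare symptoms", p_low)):
    action = "alert" if p >= ALERT_THRESHOLD else "no alert"
    print(f"{name}: p = {p:.3f} -> {action}")
```

In a real deployment the reference population would come from collected surveillance data rather than random numbers, and the alert threshold would need to be chosen to balance missed cancers against false alarms.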