Researchers from MIT and Dana-Farber Cancer Institute have unveiled a computational artificial intelligence (AI) model that analyses the sequence of about 400 genes to predict a tumour's origin within the body.
This development has the potential to transform treatment decisions for a specific subset of cancer patients where the primary tumour source remains elusive.
For some cancer patients, pinpointing the exact origin of their disease is a challenge, leading to difficulties in selecting the most apt treatment.
Moreover, certain cancer medications are specifically formulated for defined cancer types.
The research team showed that the newly developed OncoNPC model could make high-confidence predictions of the origin for 40% of previously indeterminate tumours in a sample set of 900 patients.
As a result, the pool of patients eligible for targeted treatments, based on the identified origin of their cancer, grew 2.2-fold.
Personalised treatment for cancer
MIT graduate student and the study's lead author Intae Moon said: “That was the most important finding in our paper, that this model could be potentially used to aid treatment decisions, guiding doctors toward personalised treatments for patients with cancers of unknown primary origin.
“About 3 to 5% of all cancer patients, particularly those whose tumours have metastasised, are diagnosed with cancers of unknown primary (CUP).”
Determining a tumour's origin matters because precision drugs are developed for specific cancers and generally cause fewer side effects than the broader treatments often prescribed to CUP patients.
What is the training data?
Together with Alexander Gusev, an associate professor at Harvard Medical School and Dana-Farber, Moon and the team leveraged genetic data that Dana-Farber regularly collects.
The data set consisted of genetic sequences of around 400 genes that are frequently mutated in cancer. The team used it to train the OncoNPC model on data from nearly 30,000 patients diagnosed with one of 22 known cancer types.
The training data included patient samples from institutions like the Memorial Sloan Kettering Cancer Center, Vanderbilt-Ingram Cancer Center, and Dana-Farber.
Prediction rates
In its evaluation phase, the model predicted the origin of about 7,000 tumours with 80% accuracy.
When it came to high-confidence predictions, which made up about 65% of the total, the model's accuracy soared to roughly 95%.
Upon analysing 900 CUP tumours, the model provided high-confidence predictions for 40% of these cases.
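To make these figures concrete: 65% of about 7,000 tumours is roughly 4,550 high-confidence predictions, and 40% of 900 CUP tumours is about 360 cases. The Python sketch below illustrates how coverage and accuracy can be reported separately for high-confidence predictions. It is a minimal illustration, not the study's method: the function names, the 0.5 probability cut-off, and the toy probabilities are all assumptions made for this example.

```python
from typing import Dict, List, Optional, Tuple


def top_prediction(probs: Dict[str, float]) -> Tuple[str, float]:
    """Return the most probable cancer type and its probability."""
    label = max(probs, key=probs.get)
    return label, probs[label]


def summarise(
    predictions: List[Dict[str, float]],
    truths: List[str],
    threshold: float = 0.5,  # hypothetical cut-off, not taken from the study
) -> Tuple[float, float, Optional[float]]:
    """Return (coverage, overall accuracy, high-confidence accuracy).

    Coverage is the share of tumours whose top probability reaches the
    threshold. High-confidence accuracy is computed on that subset only.
    """
    correct_all = 0
    n_high = 0
    correct_high = 0
    for probs, truth in zip(predictions, truths):
        label, p = top_prediction(probs)
        hit = label == truth
        correct_all += hit
        if p >= threshold:
            n_high += 1
            correct_high += hit
    n = len(truths)
    coverage = n_high / n
    acc_all = correct_all / n
    acc_high = correct_high / n_high if n_high else None
    return coverage, acc_all, acc_high


# Toy, made-up class probabilities for four tumours with known origins.
toy_predictions = [
    {"lung": 0.90, "breast": 0.05, "colorectal": 0.05},
    {"lung": 0.10, "breast": 0.80, "colorectal": 0.10},
    {"lung": 0.40, "breast": 0.35, "colorectal": 0.25},
    {"lung": 0.20, "breast": 0.10, "colorectal": 0.70},
]
toy_truths = ["lung", "breast", "breast", "colorectal"]

coverage, acc_all, acc_high = summarise(toy_predictions, toy_truths)
print(coverage, acc_all, acc_high)
```

In this toy example the one low-confidence prediction is also the one wrong prediction, so accuracy on the high-confidence subset is higher than overall accuracy. The reported jump from 80% to roughly 95% follows the same pattern.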
What's more
To further validate the model's predictions, the researchers compared them with germline (inherited) mutations and found that the predictions aligned closely with the type of cancer most strongly suggested by those mutations.
Furthermore, CUP patient survival correlated with the model's predictions: patients whose predicted cancer type generally carries a poor prognosis had shorter survival times.
Conversely, those with cancer types that usually have a better prognosis demonstrated longer survival durations.
The study also found that an additional 15% of the patients could have benefited from targeted treatments had the origin of their cancer been determined earlier. Instead, they underwent more generic chemotherapy treatments.
Highlighting the potential real-world implications of these findings, Gusev mentioned: “This population can now be eligible for precision treatments that already exist.”
Looking forward, the research team aims to refine the model by incorporating other data types, such as pathology and radiology images, for an even more comprehensive tumour analysis.