MIT researchers are employing novel machine learning techniques to improve the quality of life for patients by reducing toxic chemotherapy and radiotherapy dosing for glioblastoma, the most aggressive form of brain cancer. Patients must endure a combination of radiation therapy and multiple drugs taken every month. Medical professionals generally administer maximum safe drug doses to shrink the tumor as much as possible. But these strong pharmaceuticals still cause debilitating side effects in patients.

In a paper, MIT Media Lab researchers detail a model that could make dosing regimens less toxic but still effective. Powered by a “self-learning” machine learning technique, the model looks at treatment regimens currently in use, and iteratively adjusts the doses. Eventually, it finds an optimal treatment plan, with the lowest possible potency and frequency of doses that should still reduce tumor sizes to a degree comparable to that of traditional regimens. “We kept the goal, where we have to help patients by reducing tumor sizes but, at the same time, we want to make sure the quality of life — the dosing toxicity — doesn’t lead to overwhelming sickness and harmful side effects,” says Pratik Shah, a principal investigator at the Media Lab who supervised this research.

Rewarding good choices

The researchers’ model uses a technique called reinforcement learning (RL), a method inspired by behavioral psychology, in which a model learns to favor behavior that leads to a desired outcome. The technique involves artificially intelligent “agents” that complete “actions” in an unpredictable, complex environment to reach a desired “outcome.” Whenever it completes an action, the agent receives a “reward” or “penalty,” depending on whether the action works toward the outcome. The agent then adjusts its actions accordingly to achieve that outcome.
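The reward-and-adjust loop described above can be illustrated with a minimal sketch. It is not the researchers’ model: it uses tabular Q-learning, a standard textbook RL method, on an invented toy “corridor” in which the agent must walk to the last position. The function name, the environment, and all parameter values are assumptions chosen for illustration.

```python
import random


def train_q_table(n_states=5, episodes=500, alpha=0.5, gamma=0.9,
                  epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical corridor.

    The agent starts at position 0 and the goal is the last position.
    Action 0 steps left, action 1 steps right.
    """
    rng = random.Random(seed)
    steps = (-1, +1)
    goal = n_states - 1
    # q[state][action] estimates the long-run reward of taking that action.
    q = [[0.0, 0.0] for _ in range(n_states)]

    for _ in range(episodes):
        state = 0
        while state != goal:
            # Explore occasionally; otherwise take the best-known action.
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                action = max((0, 1), key=lambda a: q[state][a])

            next_state = min(max(state + steps[action], 0), goal)
            # Reward for reaching the goal, small penalty for every other step.
            reward = 1.0 if next_state == goal else -0.01

            # Nudge the estimate toward the reward plus discounted future value.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = train_q_table()
# After training, "step right" should score higher than "step left"
# in every non-goal position.
learned_policy = [max((0, 1), key=lambda a: row[a]) for row in q_table[:-1]]
```

Here the reward signal alone, with no explicit instructions, is what steers the agent toward stepping right.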

The researchers adapted an RL model for glioblastoma treatments that use a combination of the drug temozolomide (TMZ) and the three-drug regimen of procarbazine, lomustine, and vincristine (PVC), administered over weeks or months. The model’s agent combs through traditionally administered regimens. These regimens follow protocols that have been used clinically for decades and are based on animal testing and various clinical trials. Oncologists use these established protocols to predict what doses to give patients based on weight.

The researchers also had to make sure the model doesn’t just dish out a maximum number and potency of doses. Whenever the model chooses to administer all full doses, it is therefore penalized, so it instead chooses fewer, smaller doses. “If all we want to do is reduce the mean tumor diameter, and let it take whatever actions it wants, it will administer drugs irresponsibly,” Shah says. “Instead, we said, ‘We need to reduce the harmful actions it takes to get to that outcome.’”
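One way to picture this penalty is as a reward that credits tumor shrinkage but subtracts a cost that grows with the dose. The article does not give the researchers’ actual reward formula, so the sketch below is only an assumed form: the function name `dosing_reward`, the `penalty_weight` parameter, and all numbers are hypothetical and are not taken from the study.

```python
def dosing_reward(tumor_before_mm, tumor_after_mm, dose_fraction,
                  penalty_weight=1.0):
    """Hypothetical reward: tumor shrinkage minus a dose-dependent penalty.

    dose_fraction is the share of a full dose given (0.0 = skipped,
    1.0 = full dose). This is an assumed form, not the study's formula.
    """
    if not 0.0 <= dose_fraction <= 1.0:
        raise ValueError("dose_fraction must be between 0 and 1")
    shrinkage = tumor_before_mm - tumor_after_mm
    return shrinkage - penalty_weight * dose_fraction


# Invented numbers: a full dose shrinks the tumor slightly more than a
# half dose, but the penalty makes the half dose the better-scoring action.
full_dose_reward = dosing_reward(50.0, 48.0, dose_fraction=1.0)
half_dose_reward = dosing_reward(50.0, 48.2, dose_fraction=0.5)
```

With `penalty_weight` set to zero, the full dose would score higher again, which mirrors the “irresponsible” behavior Shah describes when tumor reduction is the only objective.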

Optimal regimens

The researchers designed the model to treat each patient individually, as well as in a single cohort, and achieved similar results (medical data for each patient was available to the researchers). Traditionally, the same dosing regimen is applied to groups of patients, but differences in tumor size, medical histories, genetic profiles, and biomarkers can all change how a patient is treated. These variables are not considered during traditional clinical trial designs and other treatments, often leading to poor responses to therapy in large populations, Shah says.

“We said [to the model], ‘Do you have to administer the same dose for all the patients?’ And it said, ‘No. I can give a quarter dose to this person, half to this person, and maybe we skip a dose for this person.’ That was the most exciting part of this work, where we are able to generate precision medicine-based treatments by conducting one-person trials using unorthodox machine learning architectures,” Shah says.

Source: Massachusetts Institute of Technology