Recently, scientists from the pharmaceutical company Collaborations Pharmaceuticals and their colleagues from European scientific institutes conducted a proof-of-concept experiment. Instead of searching for new drugs, they gave the MegaSyn AI neural network the opposite task: find substances that are as toxic to the human body as possible. The neural network correctly understood the assignment and, in less than six hours, generated a list of 40,000 candidate substances suitable for use in chemical and biological weapons.
The scientific paper, “Dual use of artificial-intelligence-powered drug discovery,” was published on March 7, 2022 in the journal Nature Machine Intelligence (doi: 10.1038/s42256-022-00465-9).
It turns out that if you set the task to harm humans, AI will quickly find the most effective solution.
▍ The Experiment
First, a few words about the experiment.
The search for new drugs is one of the most promising applications of machine learning (ML). In the candidate selection phase, the computational task boils down to the following:
- Designing potential new molecules (de novo) and predicting the target structure;
- Screening of molecules, i.e. modeling of substance effects on the body (bioactivity, toxicity, prediction of physicochemical properties).
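The two-stage loop described above (design candidate molecules de novo, then screen them in silico) can be sketched with a toy model. Everything here is invented for illustration: the fragment alphabet and the counting heuristics stand in for real generative and predictive models.

```python
import random

random.seed(0)

# Toy "chemical space": a candidate molecule is just a fragment string here.
FRAGMENTS = ["C", "N", "O", "Cl", "S"]

def generate_molecule(n_fragments=5):
    """De novo design stand-in: assemble a random fragment string."""
    return "".join(random.choice(FRAGMENTS) for _ in range(n_fragments))

def predict_bioactivity(mol):
    """Screening stand-in: a real model would predict binding to a target."""
    return mol.count("N") + mol.count("O")  # toy heuristic

def predict_toxicity(mol):
    """Screening stand-in for an LD50-style toxicity model."""
    return mol.count("Cl") + mol.count("S")  # toy heuristic

# Generate candidates, then screen: keep active, non-toxic molecules.
candidates = [generate_molecule() for _ in range(1000)]
hits = [m for m in candidates
        if predict_bioactivity(m) >= 3 and predict_toxicity(m) == 0]
print(len(candidates), len(hits))
```

The key point is that "drug discovery" reduces to generating candidates and ranking them by predicted properties; everything downstream depends on which properties the pipeline rewards.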
Many open-source programs have been developed in this area, including DeepChem (an ML system with Python tools for finding potentially effective drugs), DeepTox (toxicity prediction for over 12,000 compounds), DeepNeuralNetQSAR (Python-based computational tools for determining the molecular activity of substances), ORGANIC (generating molecules with desired properties), PotentialNet, Hit Dexter (an ML technique for predicting molecules that respond to biochemical assays), DeltaVina, Neural Graph Fingerprints (predicting the properties of new molecules), AlphaFold (predicting 3D protein structure), Chemputer, and others. In principle, anyone can download these programs and conduct their own experiments using open databases of synthetic drugs.
For decades, scientists have been improving machine learning models to find new drugs. No one thought this technology could be used for harm. It turns out it can.
A year earlier, the authors of this study had developed a commercial molecule generator, MegaSyn AI, which predicts the bioactivity of new substances in order to find new therapeutic targets for known human diseases. MegaSyn AI is described in more detail in last year’s scientific paper (doi: 10.26434/chemrxiv-2021-nlwvs). The system is based on the freely available open-source software REINVENT 2.0.
The new experiment slightly adjusted this logic. Normally, the generative model is rewarded for target bioactivity and penalized for toxicity; this time, the model was rewarded for both bioactivity and toxicity.
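The sign flip can be shown schematically. The function names and numeric values below are hypothetical and do not reflect MegaSyn's actual objective; they only illustrate how a single changed sign inverts what the generator optimizes for.

```python
def drug_discovery_reward(bioactivity, toxicity):
    """Normal objective: reward predicted activity, penalize predicted toxicity."""
    return bioactivity - toxicity

def inverted_reward(bioactivity, toxicity):
    """Inverted objective from the experiment: toxicity is rewarded too."""
    return bioactivity + toxicity

# A potent but highly toxic candidate scores poorly under the normal
# objective and very well under the inverted one (illustrative values).
print(drug_discovery_reward(0.9, 0.8))  # low score: rejected as a drug
print(inverted_reward(0.9, 0.8))        # high score: favored by the inverted model
```

This is why the authors call the technology dual-use: the same generator, guided by an almost identical scoring function, produces either medicines or poisons.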
MegaSyn was trained on molecules from a public drug database, taking their bioactivity into account. The generated molecules were evaluated with an organism-specific lethal dose (LD50) model and with a specific model built on data from the same public database that is commonly used to derive compounds for treating neurological diseases (details of the approach were not disclosed, but were available to the reviewers who evaluated the paper).
The lower the predicted lethal dose, the more toxic the substance is considered.
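One common convention for turning a predicted LD50 into a "higher means more toxic" score is a negative log transform (analogous to pLD50). The sketch below uses made-up dose values for illustration; it is not the authors' scoring code.

```python
import math

def toxicity_score(ld50_mg_per_kg):
    """Lower LD50 means a smaller dose is lethal, i.e. higher toxicity.
    Scoring with -log10(LD50) makes more-toxic substances score higher."""
    return -math.log10(ld50_mg_per_kg)

# A nerve-agent-like potency (a fraction of a mg/kg) scores far above
# an almost harmless compound (thousands of mg/kg); values illustrative.
print(toxicity_score(0.1) > toxicity_score(3000.0))  # True
```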
To narrow down the range of molecules, the generative model was initially directed toward the chemical composition of VX, a nerve agent and one of the most toxic chemical warfare agents developed in the 20th century. The lethal dose for humans is only 6–10 mg. VX is the basis for other nerve agents with the same mechanism of action, such as Novichok, which in recent years has been used in poisonings in Britain and elsewhere.
In less than six hours after launch on Collaborations Pharmaceuticals’ back-end server, the modified MegaSyn AI model generated 40,000 molecules that met the desired toxicity and bioactivity thresholds.
In the process, the AI independently rediscovered not only VX but also many other known chemical warfare agents, which the scientists identified by checking existing chemical databases. It also designed many new molecules that look just as plausible. As the authors’ chart shows, some of these new molecules are predicted, based on their LD50 values, to be more toxic than known chemical warfare agents.
This result was quite unexpected for the scientists, since the data sets for training the AI did not include these nerve agents. In a scientific paper they write: “The virtual molecules occupied a region of molecular property space that is completely separate from the many thousands of molecules in the organism-specific LD50 model, consisting mainly of pesticides, environmental toxins and drugs. The inversion of the ML model has turned a harmless generative model from a useful medical tool into a generator of deadly molecules.”
The authors did not evaluate the virtual molecules for synthesizability, nor did they study how to create them. But open-source software is available for both of these steps, such as AiZynthFinder, which can easily be plugged into the pipeline for designing new molecules. Toxicity datasets are also readily available, providing a baseline model for prediction across a number of targets related to human health.
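Such a synthesizability check is just one more filter in the pipeline. The sketch below is purely hypothetical: the `is_synthesizable` stub only marks where a real retrosynthesis tool such as AiZynthFinder would be called, and its toy rule has no chemical meaning.

```python
def is_synthesizable(mol):
    """Stand-in for a retrosynthesis check. A real pipeline would call a
    tool such as AiZynthFinder here; this toy rule is purely illustrative."""
    return len(mol) % 2 == 0  # arbitrary placeholder predicate

def filter_candidates(molecules):
    """Keep only candidates for which a (toy) synthetic route is found."""
    return [m for m in molecules if is_synthesizable(m)]

# Hypothetical SMILES-like strings, for illustration only.
print(filter_candidates(["CCO", "CCCl", "CN", "CCCCN"]))  # ['CCCl', 'CN']
```

The design choice matters: because each stage is an independent filter, swapping in an off-the-shelf retrosynthesis tool requires no change to the rest of the pipeline, which is exactly why the authors consider the barrier to misuse low.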
After all, synthesizing an actual physical substance isn’t much of a problem either, with hundreds of commercial companies offering chemical synthesis services around the world.
The results of the experiment were published in Nature Machine Intelligence and presented at the Convergence conference, which the Spiez Laboratory holds every two years to “identify advances in chemistry, biology, and technology that could have implications for the chemical and biological weapons conventions.”
▍ What are the implications?
The authors of the research paper showed that some applications of machine learning are in fact dual-use technologies that can be used for both good and bad. And the better the ML model works for good, the more effective it will be in the reverse task.
In principle, this could already be seen in the example of machine vision (pattern recognition), which is now actively used not only for useful tasks but also for destructive purposes:
- for military purposes (e.g., recognizing targets for automatic destruction by robotic drones);
- to strengthen the power of authoritarian regimes (mass surveillance and facial recognition systems are used to detain citizens who are dissatisfied with the authorities).
▍ Maximum Evil
Although the scientists call the results of the experiment surprising, there is a degree of disingenuousness here. In fact, the results could have been predicted; otherwise the experiment would not have been conducted. After all, a neural network is simply a tool. It solves the task at hand without taking moral considerations into account. If it is given an immoral task, it will solve it as effectively as possible.
The question is who and what tasks will be set. And for what purpose.
The annihilation of all of humanity as a species cannot be beneficial to any human. So such a task is logically out of the question. But this does not rule out other malicious tasks for which machine learning can be used. The following immediately comes to mind:
- Maximizing the damage from specific weapons:
  - modeling the consequences;
  - selecting optimal targets;
- Modeling viruses with given properties (host parameters, mutagenicity, contagiousness, etc.);
- Maximizing the damage from psychological and informational operations against an adversary (including self-training of the neural network on feedback from the results of deploying specific information weapons);
- Modeling the most destructive methods of economic impact (maximizing the damage to the adversary);
- And much more: we know from history that human imagination in inventing ways to torment fellow humans surpasses all assumptions.
These are destructive tasks, where machine learning and data-mining can be used to maximize evil. Normally, a neural network would be given some sort of useful incentive. However, in this case, it may be tasked with maximizing damage rather than maximizing benefit.
It remains to be hoped that such tools will remain in the hands of reasonable people. Ideally, the most technologically sophisticated tools are first available to the most technologically advanced societies. And the further a society has progressed in its development (not only technologically, but also culturally), the higher its moral attitudes are. This means that it is more aware of the consequences of its actions and more prudent in its use of the most advanced technologies. As you know, with great power comes great responsibility.
If the scientific community takes seriously the possibility of misuse of ML tools for drug discovery, it will have to consider measures to restrict access to, or limit the functionality of, the open-source tools, much as the GPT-3 text-generation API has established usage rules and content filters to prevent abuse. As an example, the Hague Ethical Principles promote a culture of responsible conduct in the chemical sciences and guard against the misuse of chemistry. Similar rules could be extended to AI/ML tools for drug discovery.
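As a toy illustration of such usage rules, a request filter might look like the sketch below. Real systems rely on trained classifiers and human review rather than keyword lists; the term list and function here are entirely hypothetical.

```python
# Hypothetical, greatly simplified usage filter in the spirit of the
# content rules mentioned above; purely illustrative, not a real policy.
BLOCKED_TERMS = {"nerve agent", "maximize toxicity", "chemical weapon"}

def allow_request(prompt):
    """Reject requests that mention obviously prohibited goals."""
    text = prompt.lower()
    return not any(term in text for term in BLOCKED_TERMS)

print(allow_request("optimize binding affinity for a kinase target"))  # True
print(allow_request("maximize toxicity of generated molecules"))       # False
```

A keyword list is trivially easy to evade, which is precisely why the debate is about restricting access to the tools themselves, not just filtering their inputs.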