On the ethics of machine learning applications in clinical neuroscience

By Philipp Kellmeyer

Dr. med. Philipp Kellmeyer, M.D., M.Phil. (Cantab) is a board-certified neurologist working as a postdoctoral researcher in the Intracranial EEG and Brain Imaging group at the University of Freiburg Medical Center, Germany. His current projects include the preparation of a clinical trial for using a wireless brain-computer interface to restore communication in severely paralyzed patients. In neuroethics, he works on ethical issues of emerging neurotechnologies. He is a member of the Rapid Action Task Force of the International Neuroethics Society and the Advisory Committee of the Neuroethics Network.

What is machine learning, you ask? 
As a brief working definition up front: machine learning refers to software that can learn from experience and is thus particularly good at extracting knowledge from data and generating predictions [1]. One particularly powerful variant, called deep learning, has become the staple of much recent progress (and hype) in applied machine learning. Deep learning uses biologically inspired artificial neural networks with many processing stages (hence the word “deep”). These deep networks, together with ever-growing computing power and larger datasets for learning, now deliver groundbreaking performance at many tasks. For example, Google’s AlphaGo program, whose comprehensive win against a Go champion was reported in January 2016, combines deep learning with reinforcement learning (analyzing 30 million Go moves and playing against itself). Despite these spectacular (and media-friendly) successes, however, the interaction between humans and algorithms may also go badly awry.

The software engineers who designed ‘Tay,’ a chatbot based on machine learning, for instance, surely had high hopes that it might hold its own in Twitter’s unforgiving world of high-density human microblogging. Soon, however, these hopes turned to dust when seemingly coordinated interactions between Twitter users and Tay turned the ideologically blank slate of a program into a foul display of racist and sexist tweets [2].

These examples reflect diverse efforts to create more and more “use-cases” for machine learning, such as predictive policing (using machine learning to proactively identify potential offenders) [3], earthquake prediction [4], self-driving vehicles [5], autonomous weapons systems [6], or even creative purposes like the composition of Beatles-like songs or lyrics. Here, I focus on some aspects of machine learning applications in clinical neuroscience that, in my opinion, warrant particular scrutiny.

Machine learning applications in clinical neuroscience 
In recent years, leveraging computational methods for the modeling of disorders has become a particularly fruitful strategy for research in neurology and psychiatry [7], [8]. In clinical neuroimaging, for example, machine learning algorithms have been shown to detect morphological brain changes typical of Alzheimer’s dementia [9], identify brain tumor types and grades [10], predict language outcome after stroke [11], or distinguish typical from atypical Parkinson’s syndromes [12]. In psychiatric research, examples of applied machine learning include the prediction of outcomes in psychosis [13] and of the persistence and severity of depressive symptoms [14]. More generally, most current applications follow one of three rationales: (1) to distinguish between healthy and pathological tissue in images, (2) to distinguish between different variants of a condition, or (3) to predict the outcome of a particular condition. While these are potentially helpful tools for assisting doctors in clinical decision-making, they are not yet routinely used in clinics. It is safe to predict, however, that machine learning based programs for automated image processing, diagnosis, and outcome prediction will play a significant role in the near future.
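To make the second rationale a little more concrete, here is a minimal sketch of what “distinguishing between variants” can look like in code. It is a toy nearest-centroid classifier, not the method used in any of the cited studies, and every number, feature, and label in it is invented for illustration.

```python
# Toy nearest-centroid classifier. All numbers are invented for illustration;
# they are NOT real imaging measurements, and the feature names are hypothetical.
from statistics import mean

def fit_centroids(samples, labels):
    """Return {label: centroid}, where each centroid is the per-feature mean."""
    centroids = {}
    for label in set(labels):
        rows = [s for s, l in zip(samples, labels) if l == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids

def predict(centroids, sample):
    """Assign the label whose centroid is closest (squared Euclidean distance)."""
    def dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(sample, centroid))
    return min(centroids, key=lambda label: dist(centroids[label]))

# Hypothetical two-feature vectors, e.g. (normalized volume, texture score).
train_x = [[0.9, 0.2], [1.0, 0.3], [0.8, 0.1],
           [0.4, 0.8], [0.5, 0.9], [0.3, 0.7]]
train_y = ["typical", "typical", "typical",
           "atypical", "atypical", "atypical"]

model = fit_centroids(train_x, train_y)
print(predict(model, [0.85, 0.25]))  # lies near the "typical" examples
print(predict(model, [0.35, 0.75]))  # lies near the "atypical" examples
```

The “learning from experience” here is nothing more than averaging labeled examples. Real clinical systems use far richer features and models, but the basic pattern of fitting on labeled cases and then predicting for a new case is the same.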

Some of the ethical challenges 
One area in which intelligent systems may create ethical challenges is their impact on the autonomy and accountability of clinical decision-making. As long as machine learning software for computer-aided diagnosis merely assists radiologists and the clinician keeps the authority over clinical decision-making, it would seem that there is no profound conflict between autonomy and accountability. If, on the other hand, decision-making were to be delegated to the intelligent system, to any degree whatsoever, we may indeed face the problem of an “accountability gap” [15]. After all, who (or what) would need to be held accountable in the case of a grave system error resulting in misdiagnosis: the software engineer, the company, or the regulatory body that allowed the software to enter the clinic?

Another problem may arise from the potential for malicious exploitation of an adaptive, initially “blank”, machine learning algorithm – as in the case of Tay, the chatbot. Machine learning software in its initial, untrained state would perhaps be particularly vulnerable to exploitation by interacting users with malicious intent. Nevertheless, it still requires some leap of the imagination to go from collectively trolling a chatbot into becoming racist or sexist, to scenarios referred to as “neurohacking” in which hackers viciously exploit computational weaknesses of neurotechnological devices for improper purposes. Despite this potential for misuse, the adaptiveness of modern machine learning software may, with appropriate political oversight and regulation, work in favor of developing programs that are capable of ethically sound decision-making.
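The vulnerability of a “blank” adaptive system can be sketched in a few lines. The example below is a deliberately naive online learner that trusts every piece of user feedback equally. It is not a description of how Tay actually worked; the class name, the scoring rule, and the placeholder word are all assumptions made for illustration.

```python
# Toy online learner that trusts every label it receives.
# Hypothetical design for illustration only; not how any real chatbot works.
from collections import defaultdict

class OnlineFilter:
    """Learns per-word 'acceptable' scores from user feedback, one example at a time."""

    def __init__(self):
        # Untrained ("blank") state: every word starts with a score of 0.0.
        self.score = defaultdict(float)

    def update(self, text, acceptable):
        # Move each word's score up (+1) or down (-1) according to the feedback.
        for word in text.lower().split():
            self.score[word] += 1.0 if acceptable else -1.0

    def is_acceptable(self, text):
        # A message is accepted if its words have a non-negative total score.
        return sum(self.score[w] for w in text.lower().split()) >= 0

bot = OnlineFilter()

# Benign feedback: a few users flag a placeholder offensive word.
for _ in range(3):
    bot.update("badword", acceptable=False)
before = bot.is_acceptable("badword")

# Coordinated feedback: many users label the same word as acceptable.
for _ in range(10):
    bot.update("badword", acceptable=True)
after = bot.is_acceptable("badword")

print(before, after)
```

Because the learner has no notion of who is giving feedback or how trustworthy it is, a coordinated majority simply outvotes the earlier, well-meant signal. Safeguards such as weighting trusted trainers more heavily or limiting how fast scores can change are exactly the kind of design choice that oversight could ask for.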

While intelligent systems based on machine learning software perform increasingly complex tasks, designing a “moral machine” [16] (see also the previous discussion on this blog) – a computer program with a conscience, if you will – remains elusive. A rigid set of algorithms will most likely perform poorly in the face of uncertainty, in ethically ambiguous or conflicting scenarios, and will not improve its behavior through its experiences. From an optimistic point of view, the “innate” learning capabilities of machine learning may enable software to develop ethically responsible behavior if given appropriate data sets for learning. For example, having responsible and professionally trained humans interact with and train intelligent systems – “digital parenting” – may enhance the moral conduct of machine learning software and immunize it against misuse [17].

While the limited scope here precludes an in-depth reconstruction of this debate, I encourage you to ponder how the extent of and relationship between autonomy, intentionality, and accountability, when exhibited by an intelligent system, may influence our inclination to consider it a moral agent. Meanwhile, one interesting ancillary benefit that arises from this increasing interest in teaching ethics to machines is that we study the principles of human moral reasoning and decision-making much more intensely [18].

Suggestions for political regulation and oversight of machine learning software
To prevent maladaptive system behavior and malicious interference, we need close regulation and oversight that appreciate the complexities of machine learning applications in medical neuroscience. In analogy to ethical codes for the development of robotic systems – the concept of “responsible robotics” [19] – I would emphasize the need for such an ethical framework to include non-embodied software – “responsible algorithmics,” if you will. From the policy-making perspective, the extent of regulatory involvement in developing intelligent systems for medical applications should be proportionate to the degree of autonomous system behavior and the potential harm caused by these systems. We may also consider whether the regulatory review process for novel medical applications based on machine learning should include a specialized commission of experts in clinical medicine, data and computer science, engineering, and medical ethics.

Instead of merely remaining playful children of the Internet age, we may eventually grow up to become “digital parents”, teaching intelligent systems to behave responsibly and ethically – just as we would with our actual children.

I thank Julia Turan (Science Communicator, London, @JuliaTuran) and the editors of The Neuroethics Blog for valuable discussions of the text and editing. I also thank Robin Schirrmeister (Department of Computer Science, University of Freiburg) for clarifications and discussions on machine learning. Remaining factual and conceptual shortcomings are entirely my own.


1. Russell, S. & Norvig, P. Artificial Intelligence: A Modern Approach. (Prentice Hall, 2013).

2. Staff & agencies. Microsoft ‘deeply sorry’ for racist and sexist tweets by AI chatbot. The Guardian (2016).

3. Lartey, J. Predictive policing practices labeled as ‘flawed’ by civil rights coalition. The Guardian (2016).

4. Adeli, H. & Panakkat, A. A probabilistic neural network for earthquake magnitude prediction. Neural Netw. 22, 1018-1024 (2009).

5. Surden, H. & Williams, M.-A. Technological Opacity, Predictability, and Self-Driving Cars. (Social Science Research Network, 2016).

6. Thurnher, J. S. in Targeting: The Challenges of Modern Warfare (eds. Ducheine, P. A. L., Schmitt, M. N. & Osinga, F. P. B.) 177-199 (T.M.C. Asser Press, 2016).

7. Maia, T. V. & Frank, M. J. From reinforcement learning models to psychiatric and neurological disorders. Nat. Neurosci. 14, 154-162 (2011).

8. Fletcher, P. C. & Frith, C. D. Perceiving is believing: a Bayesian approach to explaining the positive symptoms of schizophrenia. Nat. Rev. Neurosci. 10, 48-58 (2009).

9. Li, S. et al. Hippocampal Shape Analysis of Alzheimer Disease Based on Machine Learning Methods. Am. J. Neuroradiol. 28, 1339-1345 (2007).

10. Zacharaki, E. I. et al. Classification of brain tumor type and grade using MRI texture and shape in a machine learning scheme. Magn. Reson. Med. 62, 1609-1618 (2009).

11. Saur, D. et al. Early functional magnetic resonance imaging activations predict language outcome after stroke. Brain 133, 1252-1264 (2010).

12. Salvatore, C. et al. Machine learning on brain MRI data for differential diagnosis of Parkinson’s disease and Progressive Supranuclear Palsy. J. Neurosci. Methods 222, 230-237 (2014).

13. Young, J., Kempton, M. J. & McGuire, P. Using machine learning to predict outcomes in psychosis. Lancet Psychiatry 3, 908-909 (2016).

14. Kessler, R. C. et al. Testing a machine-learning algorithm to predict the persistence and severity of major depressive disorder from baseline self-reports. Mol. Psychiatry 21, 1366-1371 (2016).

15. Kellmeyer, P. et al. Effects of closed-loop medical devices on the autonomy and accountability of persons and systems. Camb. Q. Healthc. Ethics (2016).

16. Wallach, W. & Allen, C. Moral Machines: Teaching Robots Right from Wrong. (Oxford University Press, 2008).

17. Floridi, L. & Sanders, J. W. On the Morality of Artificial Agents. Minds Mach. 14, 349-379 (2004).

18. Skalko, J. & Cherry, M. J. Bioethics and Moral Agency: On Autonomy and Moral Responsibility. J. Med. Philos. 41, 435-443 (2016).

19. Murphy, R. R. & Woods, D. D. Beyond Asimov: The Three Laws of Responsible Robotics. IEEE Intell. Syst. 24, 14-20 (2009).

Want to cite this post?

Kellmeyer, P. (2016). On the ethics of machine learning applications in clinical neuroscience. The Neuroethics Blog. Retrieved on , from



