The Unbearable Shallow Understanding of Deep Learning

Plebe, Alessio; Grasso, Giorgio

doi:10.1007/s11023-019-09512-8

This paper analyzes the rapid and unexpected rise of deep learning within Artificial Intelligence and its applications. It tackles the possible reasons for this remarkable success, providing candidate paths towards a satisfactory explanation of why it works so well, at least in some domains. A historical account is given for the ups and downs, which have characterized neural networks research and its evolution from “shallow” to “deep” learning architectures. A precise account of “success” is given, in order to sieve out aspects pertaining to marketing or sociology of research, and the remaining aspects seem to certify a genuine value of deep learning, calling for explanation. The alleged two main propelling factors for deep learning, namely computing hardware performance and neuroscience findings, are scrutinized, and evaluated as relevant but insufficient for a comprehensive explanation. We review various attempts that have been made to provide mathematical foundations able to justify the efficiency of deep learning, and we deem this is the most promising road to follow, even if the current achievements are too scattered and relevant for very limited classes of deep neural models. The authors’ take is that most of what can explain the very nature of why deep learning works at all and even very well across so many domains of application is still to be understood and further research, which addresses the theoretical foundation of artificial learning, is still very much needed.