Josef "Sepp" Hochreiter (born 14 February 1967) is a German computer scientist. Since 2018 he has led the Institute for Machine Learning at the Johannes Kepler University of Linz, after having led the Institute of Bioinformatics from 2006 to 2018. In 2017 he became head of the Linz Institute of Technology (LIT) AI Lab. Hochreiter is also a founding director of the Institute of Advanced Research in Artificial Intelligence (IARAI).[1] Previously, he worked at the Technical University of Berlin, the University of Colorado Boulder, and the Technical University of Munich. He chairs the Critical Assessment of Massive Data Analysis (CAMDA) conference.[2]
Hochreiter developed the long short-term memory (LSTM) neural network architecture in his 1991 diploma thesis, leading to the main publication in 1997.[3][4] LSTM overcomes the numerical instability that arises when training recurrent neural networks (RNNs) and prevents them from learning from long sequences (the vanishing or exploding gradient problem).[3][8][9] In 2007, Hochreiter and others successfully applied LSTM with an optimized architecture to very fast protein homology detection without requiring a sequence alignment.[10] LSTM networks have also been used in Google Voice for transcription[11] and search,[12] and in the Google Allo chat app for generating response suggestions with low latency.[13]
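The gating mechanism by which LSTM sidesteps the vanishing gradient can be sketched as follows. This is a minimal single-cell forward step in Python with NumPy; the dimensions and random weights are purely illustrative and do not come from the original publication:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM forward step.

    The cell state c is updated additively (f * c_prev + i * g),
    so gradients are not forced through a squashing nonlinearity
    at every time step -- the core idea that lets LSTM learn
    long-range dependencies.
    """
    n = h_prev.shape[0]
    z = W @ x + U @ h_prev + b       # joint pre-activations, shape (4*n,)
    i = sigmoid(z[0:n])              # input gate
    f = sigmoid(z[n:2*n])            # forget gate
    o = sigmoid(z[2*n:3*n])         # output gate
    g = np.tanh(z[3*n:4*n])          # candidate cell update
    c = f * c_prev + i * g           # additive cell-state update
    h = o * np.tanh(c)               # new hidden state
    return h, c

# Illustrative sizes: 3 inputs, 2 hidden units.
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 3))
U = rng.normal(size=(8, 2))
b = np.zeros(8)
h, c = lstm_step(rng.normal(size=3), np.zeros(2), np.zeros(2), W, U, b)
```

Production LSTM variants add details (peephole connections, forget-gate bias initialization), but the additive cell-state path shown above is the common core.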
Other machine learning contributions
Beyond LSTM, Hochreiter has developed "Flat Minimum Search" to increase the generalization of neural networks[14] and introduced rectified factor networks (RFNs) for sparse coding,[15][16] which have been applied in bioinformatics and genetics.[17] Hochreiter introduced modern Hopfield networks with continuous states[18] and applied them to the task of immune repertoire classification.[19]
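The retrieval update of these continuous-state Hopfield networks can be illustrated with a short sketch: one application of the rule ξ ← X·softmax(β·Xᵀξ) from the cited paper. The stored patterns, query, and β value below are made up for illustration:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def hopfield_update(X, xi, beta=8.0):
    """One update of a modern (continuous-state) Hopfield network.

    X:  matrix whose columns are the stored patterns.
    xi: the (possibly noisy) query state.
    With sufficiently large beta, a single update typically
    retrieves the stored pattern closest to the query.
    """
    return X @ softmax(beta * (X.T @ xi))

# Three illustrative patterns stored as columns of X.
X = np.array([[ 1.0, -1.0,  1.0],
              [ 1.0,  1.0, -1.0],
              [-1.0,  1.0,  1.0],
              [ 1.0, -1.0, -1.0]])
query = np.array([0.9, 1.1, -0.8, 0.9])   # noisy version of column 0
retrieved = hopfield_update(X, query)      # ~ equal to X[:, 0]
```

The same update is mathematically equivalent to the attention mechanism of transformers, which is the connection the cited paper's title alludes to.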
Hochreiter worked with Jürgen Schmidhuber in the field of reinforcement learning on actor-critic systems that learn by "backpropagation through a model".[6][20]
In 2006, Hochreiter and others proposed an extension of the support vector machine (SVM), the "Potential Support Vector Machine" (PSVM),[24] which can be applied to non-square kernel matrices and used with kernels that are not positive definite. Hochreiter and his collaborators have applied the PSVM to feature selection, including gene selection for microarray data.[25][26][27]
Awards
Hochreiter was awarded the IEEE CIS Neural Networks Pioneer Prize in 2021 for his work on LSTM.[28]
^ Arjona-Medina, J. A.; Gillhofer, M.; Widrich, M.; Unterthiner, T.; Hochreiter, S. (2018). "RUDDER: Return Decomposition for Delayed Rewards". arXiv:1806.07857 [cs.LG].
^ Hochreiter, S. (1998). "The Vanishing Gradient Problem During Learning Recurrent Neural Nets and Problem Solutions". International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems. 6 (2): 107–116. doi:10.1142/S0218488598000094. ISSN 0218-4885. S2CID 18452318.
^ Hochreiter, S.; Bengio, Y.; Frasconi, P.; Schmidhuber, J. (2000). Kolen, J. F.; Kremer, S. C. (eds.). "Gradient flow in recurrent nets: the difficulty of learning long-term dependencies". A Field Guide to Dynamical Recurrent Networks. New York City: IEEE Press. pp. 237–244. CiteSeerX 10.1.1.24.7321.
^ Clevert, D.-A.; Mayr, A.; Unterthiner, T.; Hochreiter, S. (2015). "Rectified Factor Networks". Advances in Neural Information Processing Systems 29. arXiv:1502.06464.
^ Ramsauer, H.; Schäfl, B.; Lehner, J.; Seidl, P.; Widrich, M.; Gruber, L.; Holzleitner, M.; Pavlović, M.; Sandve, G. K.; Greiff, V.; Kreil, D.; Kopp, M.; Klambauer, G.; Brandstetter, J.; Hochreiter, S. (2020). "Hopfield Networks is All You Need". arXiv:2008.02217 [cs.NE].
^ Widrich, M.; Schäfl, B.; Ramsauer, H.; Pavlović, M.; Gruber, L.; Holzleitner, M.; Brandstetter, J.; Sandve, G. K.; Greiff, V.; Hochreiter, S.; Klambauer, G. (2020). "Modern Hopfield Networks and Attention for Immune Repertoire Classification". arXiv:2007.13505 [cs.LG].
^ Hochreiter, S.; Obermayer, K. (2006). "Nonlinear Feature Selection with the Potential Support Vector Machine". Feature Extraction, Studies in Fuzziness and Soft Computing. pp. 419–438. doi:10.1007/978-3-540-35488-8_20. ISBN 978-3-540-35487-1.