In this lecture, we will look at the problem of modeling sequences with neural networks. We will first see recurrent neural networks and then move on to a commonly used variant of them, namely the long short-term memory (LSTM) network.
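As a preview of the recurrence at the heart of these models, here is a minimal sketch (not taken from the lecture) of an Elman-style RNN forward pass in NumPy: the hidden state is updated at each step as h_t = tanh(W x_t + U h_{t-1} + b), with the same weights reused across time. The dimensions and names (input_dim, hidden_dim, rnn_forward) are illustrative choices, not part of the readings.

```python
import numpy as np

rng = np.random.default_rng(0)
input_dim, hidden_dim = 4, 3

# Parameters of a single recurrent cell (illustrative initialization).
W = rng.normal(scale=0.1, size=(hidden_dim, input_dim))   # input-to-hidden weights
U = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))  # hidden-to-hidden weights
b = np.zeros(hidden_dim)

def rnn_forward(xs):
    """Run the recurrence over a sequence xs of shape (T, input_dim)."""
    h = np.zeros(hidden_dim)              # initial hidden state h_0
    states = []
    for x_t in xs:
        h = np.tanh(W @ x_t + U @ h + b)  # same weights reused at every time step
        states.append(h)
    return np.stack(states)               # one hidden state per time step

# Example: a toy sequence of 5 input vectors.
sequence = rng.normal(size=(5, input_dim))
print(rnn_forward(sequence).shape)        # (5, 3)
```

The LSTM covered later in the lecture replaces this single tanh update with gated updates to an additional cell state, which is what the Hochreiter and Schmidhuber paper below introduces.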
Lectures and readings
Readings
- The chapter titled RNNs and LSTMs in Jurafsky and Martin’s textbook.
- Chapter 14 of Goldberg, Yoav. Neural Network Methods for Natural Language Processing. Synthesis Lectures on Human Language Technologies 10, no. 1 (2017): 1-309.
Papers
- Elman, Jeffrey L. “Finding structure in time.” Cognitive Science 14, no. 2 (1990): 179-211.
- Pearlmutter, Barak A. “Gradient calculations for dynamic recurrent neural networks: A survey.” IEEE Transactions on Neural Networks 6, no. 5 (1995): 1212-1228.
- Hochreiter, Sepp, and Jürgen Schmidhuber. “Long short-term memory.” Neural Computation 9, no. 8 (1997): 1735-1780.
- Schuster, Mike, and Kuldip K. Paliwal. “Bidirectional recurrent neural networks.” IEEE Transactions on Signal Processing 45, no. 11 (1997): 2673-2681.
- Lipton, Zachary C., John Berkowitz, and Charles Elkan. “A critical review of recurrent neural networks for sequence learning.” arXiv preprint arXiv:1506.00019 (2015).
- (*) Chung, Junyoung, Caglar Gulcehre, Kyunghyun Cho, and Yoshua Bengio. “Gated feedback recurrent neural networks.” In International Conference on Machine Learning, pp. 2067-2075. 2015.
- (*) Greff, Klaus, Rupesh K. Srivastava, Jan Koutník, Bas R. Steunebrink, and Jürgen Schmidhuber. “LSTM: A search space odyssey.” IEEE Transactions on Neural Networks and Learning Systems 28, no. 10 (2017): 2222-2232.
Blog posts
Several blog posts explain and explore recurrent neural networks from various angles.
- The Unreasonable Effectiveness of Recurrent Neural Networks: A blog post that talks