In this lecture, we will look at the problem of modeling sequences with neural networks. We will first see recurrent neural networks and then move on to a commonly used variant of them, namely the long short-term memory (LSTM) network.
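As a preview of the recurrence at the heart of these models, here is a minimal sketch (not taken from the lecture) of an Elman-style RNN forward pass in NumPy: the hidden state is updated at each step as h_t = tanh(W x_t + U h_{t-1} + b), with the same weights reused across time. The dimensions and names (input_dim, hidden_dim, rnn_forward) are illustrative choices, not part of the readings.

```python
import numpy as np

rng = np.random.default_rng(0)
input_dim, hidden_dim = 4, 3

# Parameters of a single recurrent cell (illustrative initialization).
W = rng.normal(scale=0.1, size=(hidden_dim, input_dim))   # input-to-hidden weights
U = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))  # hidden-to-hidden weights
b = np.zeros(hidden_dim)

def rnn_forward(xs):
    """Run the recurrence over a sequence xs of shape (T, input_dim)."""
    h = np.zeros(hidden_dim)              # initial hidden state h_0
    states = []
    for x_t in xs:
        h = np.tanh(W @ x_t + U @ h + b)  # same weights reused at every time step
        states.append(h)
    return np.stack(states)               # one hidden state per time step

# Example: a toy sequence of 5 input vectors.
sequence = rng.normal(size=(5, input_dim))
print(rnn_forward(sequence).shape)        # (5, 3)
```

The LSTM covered later in the lecture replaces this single tanh update with gated updates to an additional cell state, which is what the Hochreiter and Schmidhuber paper below introduces.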
Lectures and readings
Readings
- The chapter titled RNNs and LSTMs in Jurafsky and Martin’s textbook.
- Chapter 14 of Goldberg, Yoav. Neural Network Methods for Natural Language Processing. Synthesis Lectures on Human Language Technologies 10, no. 1 (2017): 1-309.
Papers
- Elman, Jeffrey L. “Finding structure in time.” Cognitive Science 14, no. 2 (1990): 179-211.
- Pearlmutter, Barak A. “Gradient calculations for dynamic recurrent neural networks: A survey.” IEEE Transactions on Neural Networks 6, no. 5 (1995): 1212-1228.
- Hochreiter, Sepp, and Jürgen Schmidhuber. “Long short-term memory.” Neural Computation 9, no. 8 (1997): 1735-1780.
- Schuster, Mike, and Kuldip K. Paliwal. “Bidirectional recurrent neural networks.” IEEE Transactions on Signal Processing 45, no. 11 (1997): 2673-2681.
- Lipton, Zachary C., John Berkowitz, and Charles Elkan. “A critical review of recurrent neural networks for sequence learning.” arXiv preprint arXiv:1506.00019 (2015).
- (*) Chung, Junyoung, Caglar Gulcehre, Kyunghyun Cho, and Yoshua Bengio. “Gated feedback recurrent neural networks.” In International Conference on Machine Learning, pp. 2067-2075. 2015.
- (*) Greff, Klaus, Rupesh K. Srivastava, Jan Koutník, Bas R. Steunebrink, and Jürgen Schmidhuber. “LSTM: A search space odyssey.” IEEE Transactions on Neural Networks and Learning Systems 28, no. 10 (2017): 2222-2232.
Blog posts
Several blog posts explain and explore recurrent neural networks from various angles.
- The Unreasonable Effectiveness of Recurrent Neural Networks: A blog post that talks