NLP with Neural Networks

CS 6957, Fall 2023

home
information
lectures
homeworks
resources

Vanishing gradient revisited: Highway/Residual connections

In this lecture, we will revisit the vanishing gradient problem. The techniques that make recurrent networks robust to vanishing gradients can also be applied to deep neural networks in general.
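To make the idea concrete, here is a minimal sketch of the two kinds of connections named in the title, written in plain Python. It is an illustration under simplifying assumptions, not the implementation from either paper: each layer is a single affine map followed by tanh, and the helper names (`affine`, `residual_layer`, `highway_layer`) and parameter names (`W_h`, `b_h`, `W_t`, `b_t`) are made up for this example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, b, x):
    # W is a list of rows, b and x are lists of floats; returns W x + b.
    return [sum(w * x_j for w, x_j in zip(row, x)) + b_i
            for row, b_i in zip(W, b)]


def residual_layer(x, W, b):
    # Residual connection: y = x + f(x), here with f(x) = tanh(W x + b).
    # The identity term gives the gradient a direct path around f.
    h = [math.tanh(z) for z in affine(W, b, x)]
    return [x_i + h_i for x_i, h_i in zip(x, h)]


def highway_layer(x, W_h, b_h, W_t, b_t):
    # Highway connection: y = t * h + (1 - t) * x, where the transform
    # gate t = sigmoid(W_t x + b_t) decides, per dimension, how much of
    # the transformed input h to use and how much of x to carry through.
    h = [math.tanh(z) for z in affine(W_h, b_h, x)]
    t = [sigmoid(z) for z in affine(W_t, b_t, x)]
    return [t_i * h_i + (1.0 - t_i) * x_i
            for t_i, h_i, x_i in zip(t, h, x)]


if __name__ == "__main__":
    x = [1.0, -2.0]
    zeros_W = [[0.0, 0.0], [0.0, 0.0]]
    zeros_b = [0.0, 0.0]
    print(residual_layer(x, zeros_W, zeros_b))
    print(highway_layer(x, zeros_W, zeros_b, zeros_W, [-50.0, -50.0]))
```

In this sketch, a strongly negative gate bias drives the gate toward zero, so the highway layer passes its input through almost unchanged. Both layer types can fall back to (nearly) the identity function in this way, which is why stacking many of them does not by itself shrink the signal or its gradient.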

Lectures and readings

Lecture slides
Srivastava, Rupesh K., Klaus Greff, and Jürgen Schmidhuber. “Training very deep networks.” In Advances in neural information processing systems, pp. 2377-2385. 2015.
He, Kaiming, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. “Deep residual learning for image recognition.” In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 770-778. 2016.
