Gradient descent aligns the layers of deep linear networks

Ziwei Ji and Matus Telgarsky. ICLR 2019.