Analytical Solution of Three-layer Network with Matrix Exponential Activation Function
Kuo Gai · Shihua Zhang
Abstract
It is known that in practice deeper networks tend to be more powerful than shallow ones, but this has not been understood theoretically. In this paper, we show that a three-layer network with matrix exponential activation function, i.e., $$f(X)=W_3\exp(W_2\exp(W_1X)), \quad X\in \mathbb{C}^{d\times d},$$ admits analytical solutions for the equations $$\begin{cases}Y_1=f(X_1) \\ Y_2=f(X_2) \end{cases}$$ for $X_1,X_2,Y_1,Y_2$ under only invertibility assumptions. Our proof shows the power of depth and of the non-linear activation function, since a one-layer network can solve only one equation, i.e., $Y=WX$.
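To make the contrast concrete, the following is a minimal numerical sketch, not the construction from the paper. It evaluates the forward map $f$ and checks that a one-layer network $Y=WX$ fitted to $(X_1,Y_1)$ generically fails on a second pair $(X_2,Y_2)$. The helper names `expm_eig`, `f`, and `fit_one_layer` are hypothetical, and the matrix exponential is computed by eigendecomposition under the assumption that its argument is diagonalizable.

```python
import numpy as np

def expm_eig(A):
    """Matrix exponential via eigendecomposition.

    Assumption: A is diagonalizable (true for generic random matrices).
    """
    w, V = np.linalg.eig(A)
    return V @ np.diag(np.exp(w)) @ np.linalg.inv(V)

def f(X, W1, W2, W3):
    """Forward map f(X) = W3 exp(W2 exp(W1 X)) as written in the abstract."""
    return W3 @ expm_eig(W2 @ expm_eig(W1 @ X))

def fit_one_layer(X, Y):
    """Solve Y = W X for W, assuming X is invertible."""
    return Y @ np.linalg.inv(X)

rng = np.random.default_rng(0)
d = 3

def random_complex(shape):
    # Generic complex matrix; invertible with probability one.
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

X1, X2, Y1, Y2 = (random_complex((d, d)) for _ in range(4))

# A single linear layer has only one matrix of unknowns, so it is
# fully determined by the first equation Y1 = W X1.
W = fit_one_layer(X1, Y1)
residual_1 = np.linalg.norm(W @ X1 - Y1)  # near zero by construction
residual_2 = np.linalg.norm(W @ X2 - Y2)  # generically far from zero

# Sanity check of the forward map: with W1 = W2 = 0 and W3 = I,
# both exponentials reduce to the identity, so f(X) = I.
Z = np.zeros((d, d))
identity_output = f(X1, Z, Z, np.eye(d))

print("one-layer residual on (X1, Y1):", residual_1)
print("one-layer residual on (X2, Y2):", residual_2)
```

The second residual stays large because $W$ is already fixed by the first equation; the three-layer map has three weight matrices, which is the extra freedom the abstract's claim relies on.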