Are the two meanings of "innovation" in Wiener filtering the same?


This question is related to A question about Wiener filter based on Linear Estimation by Kailath, both based on the textbook Linear Estimation by Kailath. In that question I describe how I first learned what an innovation is, but in the next class we learned a different "innovation," and I wonder how these two seemingly different concepts are related.

As I first learned it, the whitening filter of $y(s)$ is the transfer function from $y(s)$ to a white-noise process $v(s)$, whereas the inverse transfer function, from $v(s)$ back to $y(s)$, is the innovations filter.

In this next class, however, we learned the relationship among estimates $\hat{x}$ based on different measurements $y_1, y_2, \dots$ Knowing only $y_1$, we can form the estimate
$$\hat{x}_{y_1} = E[x y_1^T]\,(E[y_1 y_1^T])^{-1} y_1,$$
which is analogous to projecting $x$ onto the space spanned by $y_1$. Then, once we also know $y_2$, we can remove the redundant information by first projecting $y_2$ onto the space of $y_1$,
$$\hat{y}_{2|y_1} = E[y_2 y_1^T]\,(E[y_1 y_1^T])^{-1} y_1,$$
and then using the residual
$$e_2 = \tilde{y}_{2|y_1} = y_2 - \hat{y}_{2|y_1}$$
to form a second estimate component $\hat{x}_{e_2}$, which we add to the previous estimate $\hat{x}_{y_1}$ to get
$$\hat{x} = \hat{x}_{e_2} + \hat{x}_{y_1}.$$

But why is this process, which is basically Gram-Schmidt (creating orthogonal bases $e_1 = y_1$, $e_2 = \tilde{y}_{2|y_1}$, etc. and projecting onto them), also called an innovations process? The professor said an innovation is like adding new information, but I thought that had to do with going back and forth with the white noise in the s-domain. I don't see the connection between the two concepts.
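To make the second ("Gram-Schmidt") meaning concrete, here is a minimal numerical sketch I put together; it is my own toy example, not from Kailath or the lecture. The variables `x`, `y1`, `y2` and the noise levels are made up, and expectations are replaced by sample averages. It checks that the two-step innovations recursion reproduces the joint linear MMSE estimate computed directly from $y = [y_1;\, y_2]$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setup: scalar zero-mean x, and two noisy measurements
# y1, y2 of it (correlated with each other through x).
n = 100_000
x = rng.standard_normal(n)
y1 = x + 0.5 * rng.standard_normal(n)
y2 = x + 0.8 * rng.standard_normal(n)

# Sample estimate of E[a b] for zero-mean scalar sequences.
E = lambda a, b: np.mean(a * b)

# Step 1: estimate x from y1 alone (projection onto span{y1}).
x_hat_y1 = E(x, y1) / E(y1, y1) * y1

# Step 2: innovation e2 = the part of y2 orthogonal to y1
# (one Gram-Schmidt step); this is the "new information" in y2.
y2_hat_y1 = E(y2, y1) / E(y1, y1) * y1
e2 = y2 - y2_hat_y1

# Step 3: add the estimate component carried by the innovation.
x_hat = x_hat_y1 + E(x, e2) / E(e2, e2) * e2

# Check: this matches the joint LLMSE estimate E[x y^T](E[y y^T])^{-1} y
# with the stacked measurement y = [y1; y2].
Y = np.vstack([y1, y2])                    # 2 x n
Ryy = Y @ Y.T / n                          # sample E[y y^T]
Rxy = x[None, :] @ Y.T / n                 # sample E[x y^T]
x_hat_joint = (Rxy @ np.linalg.inv(Ryy) @ Y).ravel()

print(np.max(np.abs(x_hat - x_hat_joint)))  # ~0, up to floating-point error
```

The agreement is exact (up to machine precision) because both computations project onto the same span $\{y_1, y_2\}$; the innovations recursion just does it in an orthogonalized basis, so each new measurement contributes only its non-redundant part.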