Frequency-Domain Better than Time-Domain for Causal Structure Recovery in Dynamical Systems on Networks
Abstract
Learning causal effects from data is a fundamental and well-studied problem across science, especially when the cause-effect relationship is static in nature. However, causal effects are less explored when there are dynamical dependencies, i.e., when dependencies exist between entities across time. In general, it is not possible to reconstruct the causal graph from data alone. Conventional static causal structure recovery algorithms employ tests such as the Fisher-z test and the chi-square test to assess the conditional independence (CI) of data, which forms the basis for recovering Markov Equivalent Graphs (MEGs), wherein the causal structure can be recovered only partially. For data that are dynamically related, multivariate least-squares estimation based on Wiener filters (WFs), which rely on second-order statistics to estimate one data stream from the other streams, provides a means of recovering the influence structure of the directed network underlying the data. WF-based projections can be determined in the time domain or in the frequency domain; the question this article sets out to answer is which is better. We obtain concentration bounds on the accuracy of WF estimation for both the time-domain and frequency-domain approaches. Exploiting the computational speed of the Fast Fourier Transform (FFT), we establish that the frequency domain provides distinct advantages. Moreover, frequency-domain projections involve complex numbers; we establish that the phase properties of the resulting estimates can be effectively leveraged for better recovery of the MEG in a large class of networks, whereas the time domain has no analogue of phase. Thus, we report that the "Wiener-Phase" algorithm provides the best accuracy as well as computational advantages. We validate the theoretical analysis with numerical results. Performance comparisons with state-of-the-art algorithms are also provided.
Further, the proposed algorithms are validated on a real-world field dataset, the "river-runoff" dataset obtained from the CauseMe online repository, and on measurement data from transistor-based circuits.