Causal networks over non-Gaussian data have been widely applied in economics, biology, environmental science, and other disciplines. The DirectLingam (Direct Method for Learning a Linear Non-Gaussian Structural Equation Model) algorithm is a classical method for estimating such networks, but its identification rate for exogenous variables is low once the data reaches 25 or more dimensions. This in turn produces a cascade effect: the estimation error of the whole network grows as the number of layers increases. To address this, a DirectLingam algorithm that selects exogenous variables locally on the basis of negentropy (LS-DirectLingam) is proposed. It takes the non-Gaussianity of a variable as the criterion for exogenous variable selection and measures that non-Gaussianity with negentropy. The k variables with the largest negentropy are stored in a local target variable set Lv, and the exogenous variable is then searched for within Lv, which improves the identification rate of exogenous variables. An experimental comparison with the basic DirectLingam algorithm shows that LS-DirectLingam outperforms it.
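The local selection step described in the abstract (score each variable by negentropy, i.e. negative entropy, and keep the k highest-scoring variables as the set Lv) can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the abstract does not say which negentropy estimator is used, so the code assumes Hyvärinen's maximum-entropy approximation, and the function names `negentropy` and `select_local_candidates` are hypothetical. The subsequent search for the exogenous variable inside Lv is not shown.

```python
import numpy as np

def negentropy(x):
    """Approximate negentropy of a 1-D sample (assumed estimator, not from the paper).

    Uses Hyvarinen's maximum-entropy approximation on the standardized sample;
    the value is close to 0 for Gaussian data and larger for non-Gaussian data.
    """
    x = np.asarray(x, dtype=float)
    u = (x - x.mean()) / x.std()           # zero mean, unit variance
    k1, k2, gamma = 79.047, 7.4129, 0.37457
    term1 = (np.mean(np.log(np.cosh(u))) - gamma) ** 2
    term2 = np.mean(u * np.exp(-u ** 2 / 2)) ** 2
    return k1 * term1 + k2 * term2

def select_local_candidates(X, k):
    """Return the column indices of the k variables with the largest negentropy.

    X is an (n_samples, n_variables) array; the returned list plays the role
    of the local target variable set Lv. Scores for all variables are
    returned as well.
    """
    X = np.asarray(X, dtype=float)
    scores = np.array([negentropy(X[:, j]) for j in range(X.shape[1])])
    order = np.argsort(scores)[::-1]        # indices sorted by descending score
    return order[:k].tolist(), scores
```

For example, calling `select_local_candidates(X, k=5)` on a 30-variable data matrix would return five column indices to search for the exogenous variable, rather than all 30. How k should be chosen is not stated in the abstract.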