ICA algorithm

Independent component analysis (ICA) is a powerful data analysis tool that has emerged in recent years (Hyvarinen A, Karhunen J, Oja E, 2001; Roberts SJ, Everson R, 2001). The idea was first proposed by Herault and Jutten in 1986, and in 1994 Comon gave a relatively rigorous mathematical definition of ICA.

On the concept, essence, and process of the ICA algorithm

Although ICA has not been around for long, it has received more and more attention in both theory and application, and has become a research hot spot both in China and abroad.

Independent component analysis (ICA) is a method for finding hidden factors or components in multivariate (multidimensional) statistical data. It can be regarded as an extension of principal component analysis (PCA) and factor analysis (FA). For the blind source separation problem, ICA refers to an analysis process that separates, or approximately separates, the source signals from the observations without knowing the source signals, the noise, or the mixing mechanism.

ICA algorithm concept:

ICA (Independent Component Analysis) is a statistical technique for discovering the hidden factors that underlie sets of random variables. ICA defines a generative model for the observed data: the observed variables are assumed to be linear mixtures of latent variables, produced by an unknown mixing system. The latent factors are further assumed to be non-Gaussian and mutually independent, and they are called the independent components of the observed data.

ICA is related to PCA, but it is better suited to identifying latent factors. It can be applied to digital images, document databases, economic indicators, psychometric measurements, and so on.

The essence of the ICA algorithm:

The essence of ICA is to find the mutually independent components (not merely orthogonal ones) that make up a signal, which corresponds to an analysis based on higher-order statistics. According to ICA theory, the observed mixed data matrix X is obtained by linearly weighting the independent sources S through a mixing matrix A. The goal of ICA is to find, from X alone, a separation matrix W such that the signal Y obtained by applying W to X is an optimal approximation of the independent sources S. This relationship can be expressed by the following formula:

Y = WX = WAS, A = inv(W)
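As a minimal illustrative sketch of this model (the particular sources, mixing matrix, and sample size below are hypothetical choices, not taken from any specific application), the following MATLAB fragment builds X = AS and recovers Y = WX in the ideal case where W = inv(A):

% Minimal sketch of the mixing/unmixing model (all numbers chosen for illustration).
n  = 1000;
s1 = rand(1, n) - 0.5;                   % a uniform (non-Gaussian) source
s2 = sign(randn(1, n)) .* rand(1, n);    % another non-Gaussian source
S  = [s1; s2];                           % stack the independent sources row-wise
A  = [1.0 0.6; 0.4 1.0];                 % mixing matrix (unknown in practice)
X  = A * S;                              % observed mixed signals, X = AS
W  = inv(A);                             % ideal separating matrix, A = inv(W)
Y  = W * X;                              % Y = WX = WAS recovers the sources

In practice A is unknown, so W must be estimated from X alone; that estimation is what the rest of this article is about.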

Compared with PCA, ICA can characterize the higher-order statistical properties of the variables and can suppress Gaussian noise.

From the perspective of linear algebra, both PCA and ICA find a set of basis vectors. This basis spans a feature space, and data processing amounts to mapping the data into this new space.

ICA theoretical basis:

The theoretical basis of ICA is as follows:

1) Orthonormal basis

2) Whitening (sphering); see the sketch after this list

3) Gradient descent
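Since whitening is also used in the algorithm flow further below, a minimal MATLAB sketch of it is given here (it assumes the observed data X is stored as a d-by-n matrix with one signal per row; the variable names are chosen only for illustration):

% Minimal whitening (sphering) sketch.
Xc = X - mean(X, 2);                     % centering: remove the mean of each row
[E, D] = eig(cov(Xc'));                  % eigendecomposition of the d-by-d covariance
Z = diag(1 ./ sqrt(diag(D))) * E' * Xc;  % whitened data: cov(Z') is close to the identity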

ICA objective function:

The objective function of ICA is as follows:

minimize over W:   sum over the training samples x of ||W x||_1

That is, the L1 norm of the linearly transformed sample W x describes the features extracted from the sample data x by the parameter matrix W, and ICA minimizes the sum of these norms over the training samples.

With the orthonormality constraint added, independent component analysis is equivalent to solving the following optimization problem:

minimize over W:   sum over the training samples x of ||W x||_1,   subject to   W W^T = I

This is the objective function of orthonormal ICA. As is usual in deep learning, this problem has no simple analytical solution, so gradient descent is needed; and because of the orthonormality constraint, the new basis must be mapped back into the space of orthonormal bases after each gradient descent iteration. This is what guarantees that the orthonormality constraint is satisfied.
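As a small sketch of how this objective can be evaluated (the smoothed form and the variable names W, X, and epsilon are assumptions made here for illustration; the smoothing itself is discussed in the disadvantages section below):

% Evaluate the smoothed orthonormal-ICA objective for a given W.
% Assumes X is d-by-n with one sample per column and epsilon is a small smoothing parameter.
J = sum(sum(sqrt((W * X).^2 + epsilon)));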

ICA optimization parameters:

For the objective function and constraint of ICA, gradient descent can be used, with a projection step added after each gradient descent update so that the orthonormality constraint is satisfied. The process is as follows:
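A minimal sketch of one such gradient-plus-projection iteration (the step size alpha, the smoothing parameter epsilon, and the variable names are assumptions made for illustration):

% One projected-gradient iteration for orthonormal ICA.
% Assumes whitened data X (d-by-n), a current matrix W, step size alpha, smoothing epsilon.
WX = W * X;
G  = (WX ./ sqrt(WX.^2 + epsilon)) * X';   % gradient of the smoothed L1 objective
W  = W - alpha * G;                        % ordinary gradient descent step
W  = real((W * W')^(-0.5)) * W;            % projection step: restore W * W' = I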

ICA algorithm process:

Let the source signal be S, and let the signal after transformation by the mixing matrix be X = AS. Given only the mixed signal X, solve for an unmixing matrix W such that the components of Y = WX are as independent as possible. The W obtained in this way is not necessarily an approximation of the inverse of A, and Y is not necessarily an approximation of the signal S; the requirement is only that the components of Y be mutually independent. The purpose is to find a demixing matrix starting from the observed data X alone.

Common methods: the InfoMax method (uses a neural network to maximize information), and the FastICA method (a fixed-point algorithm that seeks a direction W such that the projection of the data, (W^T) * X, maximizes non-Gaussianity).
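As a hedged illustration of "maximizing non-Gaussianity" (using excess kurtosis as the non-Gaussianity measure; the variable names are assumptions for this sketch):

% Excess kurtosis of the data projected onto a direction w: it is close to 0 for
% Gaussian data, so directions with large |k| are strongly non-Gaussian.
y = w' * X;                              % 1-by-n projection of the observations
k = mean(y.^4) / mean(y.^2)^2 - 3;       % excess kurtosis of the projection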

The main algorithm flow is as follows:
1. Pre-processing part:
1) Centering: remove the mean so that X is zero-mean.
2) Sphering (whitening).

Multiply by the sphering (whitening) matrix S so that the rows of Z = SX are orthonormal, i.e. ZZ^T = I. (Here S denotes the whitening matrix, not the source signal.)

2. Core algorithm part: seek an unmixing matrix U such that the rows of Y = UZ are as independent as possible (according to an independence criterion function G).
1) Since the components of Y are independent, its rows must be uncorrelated; U is also usually chosen so that each row of Y has unit variance, so U is an orthogonal transform.
2) The pre-processing part of all the algorithms is the same; from this point on, the input is the whitened data Z, and an orthogonal matrix U is sought so that the components of Y = UZ are independent.

Different choices of the independence criterion function G, and different iteration steps, lead to different independent component analysis methods.

3. The idea of the FastICA algorithm:

Idea: exploratory projection pursuit.

Purpose: take the whitened data Z as input, process it with an orthogonal matrix U, and output Y = UZ.
1) Take the whitened data Z as input and project it onto one row vector ui of the orthogonal matrix to extract one independent component yi.

2) Remove (deflate) this component and repeat the extraction in turn to obtain all yi and ui; a sketch of this deflation scheme is given after this list.

3) Collect the row vectors ui into the matrix U of independent basis directions; the estimated components are then

Y = UZ (equivalently Y = WX, where W combines U with the whitening transform)
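A minimal, hedged sketch of this deflation-style FastICA loop (the contrast g(u) = u^3, the iteration limit, the tolerance, and the variable names are assumptions made here for illustration and are not part of the toolbox code quoted below):

% Deflation-style FastICA sketch on whitened data Z (d-by-n), kurtosis-based contrast.
[d, n] = size(Z);
U = zeros(d, d);                              % rows will hold the directions u_i
for i = 1:d
    u = randn(d, 1); u = u / norm(u);         % random unit start vector
    for it = 1:200
        y = u' * Z;                           % current projection (1-by-n)
        u_new = (Z * (y.^3)') / n - 3 * u;    % fixed-point update for g(u) = u^3
        u_new = u_new - U(1:i-1, :)' * (U(1:i-1, :) * u_new);  % deflation step
        u_new = u_new / norm(u_new);
        if abs(abs(u_new' * u) - 1) < 1e-6    % converged when the direction stops changing
            u = u_new; break;
        end
        u = u_new;
    end
    U(i, :) = u';
end
Y = U * Z;                                    % rows of Y: estimated independent components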

The FastICA algorithm program is as follows:

function [Out1, Out2, Out3] = fastica(mixedsig, varargin)
% FASTICA(mixedsig) estimates the independent components from given
% multidimensional signals. Each row of matrix mixedsig is one
% observed signal.
% icasig = FASTICA(mixedsig); the rows of icasig contain the
% estimated independent components.
% [icasig, A, W] = FASTICA(mixedsig); outputs the estimated separating
% matrix W and the corresponding mixing matrix A.

mixedsig is the input matrix of mixed signals, and icasig holds the estimated independent components (the solution).

A is the mixing matrix, and it can be verified that mixedsig = A × icasig.

W is the unmixing matrix, and it can be verified that icasig = W × mixedsig.
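A hedged usage sketch (it assumes the FastICA toolbox is on the MATLAB path and that mixedsig holds one observed signal per row):

% Call the toolbox function and check the relations described above; note that the
% recovered components match the sources only up to permutation and scaling.
[icasig, A, W] = fastica(mixedsig);
reconstruction_error = norm(mixedsig - A * icasig, 'fro');   % mixedsig = A * icasig
unmixing_check       = norm(icasig - W * mixedsig, 'fro');   % icasig  = W * mixedsig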

Advantages of the ICA algorithm:

1) Fast convergence speed.

2) It supports parallel and distributed computation, requires little memory, and is easy to use.

3) By using a nonlinear function g, it can directly find independent components with any non-Gaussian distribution.

4) Its performance can be optimized by choosing an appropriate nonlinear function g; in particular, estimators with the smallest variance can be obtained.

5) Only a few (not all) of the independent components need to be estimated, which can greatly reduce the amount of computation.

Disadvantages of the ICA algorithm:

1) If the number of rows of the feature matrix W (that is, the number of basis vectors) is greater than the dimension of the original data, the optimization becomes difficult and training takes too long;

2) The objective function of the ICA model involves an L1 norm, which is not differentiable at 0; this hinders the application of gradient-based methods.

Note: although shortcoming 2) can be avoided by methods other than gradient descent, it can also be handled by using a "smoothed" approximation of the L1 norm, that is, replacing |x| with (x^2 + ε)^(1/2), where ε is the "smoothing parameter".

The difference between ICA and PCA:

1) PCA reduces the dimension of the original data and extracts uncorrelated attributes, while ICA reduces the dimension of the original data and extracts mutually independent attributes.

2) The purpose of PCA is to find a set of component representations that minimizes the reconstruction error, that is, the features most representative of the original data. The purpose of ICA is to find a set of component representations in which the components are as independent as possible, so that hidden factors can be discovered. This shows that the conditions of ICA are stronger than those of PCA.

3) ICA looks for the directions of maximum independence, and its components are independent; PCA looks for the directions of maximum variance, and its components are orthogonal.

4) ICA assumes that the observed signal is a linear combination of several statistically independent components, and what ICA performs is a demixing process. PCA is an information extraction process that reduces the dimension of the original data, and it is often used as a pre-processing step for ICA to standardize (whiten) the data.

Applications of the ICA algorithm:

From an application point of view, the application fields and prospects of ICA are very broad; it is currently used mainly in blind source separation, image processing, speech recognition, communications, biomedical signal processing, functional brain imaging research, fault diagnosis, feature extraction, financial time-series analysis, and data mining.

Conclusion:

ICA is a commonly used data analysis method, a powerful tool in the field of blind signal analysis, and a way to find the hidden factors behind non-Gaussian data. From the sample-feature perspective, the prerequisite for using ICA is that the sample data are generated by hidden factors that are independent and non-Gaussian. The ICA algorithm has been widely used in blind source separation, image processing, speech recognition, communications, biomedical signal processing, functional brain imaging research, fault diagnosis, feature extraction, financial time-series analysis, and data mining.
