A Self-Organizing Map (SOM) performs unsupervised learning and clustering of data. The idea is simple: in essence, it is a neural network with only an input layer and a hidden layer, in which each hidden node represents a cluster. Training uses competitive learning: each input example finds the hidden node that matches it best, called its activation node or winning neuron, and the parameters of that node are then updated by stochastic gradient descent. Nodes close to the winning neuron also update their parameters, to a degree that depends on their distance from it.
As a result, the hidden nodes of a SOM are topologically related, and this topology must be chosen in advance. For a one-dimensional model, the hidden nodes are connected in sequence to form a line; for a two-dimensional topology, they are arranged in a plane.
The SOM service provided by CD ComputaBio can map input of any dimension onto a one-dimensional or two-dimensional discrete space (higher-dimensional maps are possible but uncommon). Every node in the computation (hidden) layer is fully connected to the input layer.
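The choice of topology can be illustrated with a minimal Python sketch. The helper names `make_lattice` and `lattice_distance` are hypothetical and chosen only for this illustration; they are not part of any CD ComputaBio tool. The sketch assumes nodes sit on integer grid positions and that the distance between two nodes is the Euclidean distance between those positions.

```python
from itertools import product
from math import dist


def make_lattice(shape):
    """Return the lattice position of every hidden node.

    shape=(5,) gives a 1-D line of 5 nodes; shape=(3, 4) gives a 3x4 plane.
    """
    return list(product(*(range(n) for n in shape)))


def lattice_distance(a, b):
    """Distance between two nodes, measured on the lattice (not in input space)."""
    return dist(a, b)


line = make_lattice((5,))      # five nodes connected in a line
plane = make_lattice((3, 4))   # twelve nodes arranged in a 3x4 plane

# Neighbors on the lattice are close, regardless of their current weights.
step_apart = lattice_distance(plane[0], plane[1])
```

Each node would additionally carry a weight vector with one entry per input dimension, which is what the full connection between the input layer and the computation layer amounts to.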
Once the topology is fixed, the computation proceeds in roughly four steps:
1) Initialization: each node initializes its parameters randomly. Each node has as many parameters as the input has dimensions.
2) Competition: for each input sample, CD ComputaBio finds the node that matches it best. Assuming the input is D-dimensional, that is, X={x_i, i=1,...,D}, the discriminant function can be the Euclidean distance between the input and each node's parameter vector; the node with the smallest distance wins.
3) Cooperation: after finding the winning node I(x), CD ComputaBio also updates its neighboring nodes. Let S_ij denote the distance between nodes i and j. Each node near I(x) is assigned an update weight that decreases with its distance from I(x).
Simply put, the update applied to a neighboring node is discounted according to its distance from the winner.
4) Adaptation: the parameters of the nodes are updated by gradient descent, and steps 2 to 4 are repeated until convergence.
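The four steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the implementation used by the service: the neighborhood weight is assumed to be Gaussian, exp(-S^2 / (2 sigma^2)); the learning rate and radius are assumed to decay exponentially; and a fixed number of epochs stands in for a convergence check. The function names and default values are hypothetical.

```python
import math
import random


def squared_distance(x, w):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((xi - wi) ** 2 for xi, wi in zip(x, w))


def best_matching_unit(x, weights):
    """Step 2: index of the node whose weight vector is closest to x."""
    return min(range(len(weights)), key=lambda j: squared_distance(x, weights[j]))


def train_som(data, rows, cols, epochs=50, lr0=0.5, seed=0):
    """Train a rows x cols map on data; returns (lattice coords, weights)."""
    rng = random.Random(seed)
    dim = len(data[0])
    coords = [(r, c) for r in range(rows) for c in range(cols)]
    # Step 1: one random D-dimensional weight vector per node.
    weights = [[rng.random() for _ in range(dim)] for _ in coords]
    sigma0 = max(rows, cols) / 2.0  # assumed initial neighborhood radius
    total = epochs * len(data)
    step = 0
    for _ in range(epochs):
        samples = list(data)
        rng.shuffle(samples)  # present samples in random order (stochastic updates)
        for x in samples:
            decay = math.exp(-3.0 * step / total)  # assumed decay schedule
            lr, sigma = lr0 * decay, sigma0 * decay
            win = best_matching_unit(x, weights)
            for j, w in enumerate(weights):
                # Step 3: Gaussian discount based on lattice distance to the winner.
                s2 = squared_distance(coords[j], coords[win])
                h = math.exp(-s2 / (2.0 * sigma ** 2))
                # Step 4: move the node toward the input, scaled by lr and h.
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
            step += 1
    return coords, weights


data = [[0.1, 0.1], [0.15, 0.05], [0.9, 0.95], [0.85, 0.9]]
coords, weights = train_som(data, rows=3, cols=1)  # a 1-D line of three nodes
clusters = [coords[best_matching_unit(x, weights)] for x in data]
```

Here `clusters` holds, for each sample, the lattice position of its winning node, which is the discretized representation of that sample. Passing `cols` greater than 1 gives a two-dimensional map instead of a line.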
Compared with K-means, SOM differs in the following respects:
(1) K-means requires the number of classes, K, to be fixed in advance. SOM does not: some nodes in the hidden layer may end up with no input data assigned to them. K-means is therefore affected more strongly by initialization.
(2) After K-means finds the most similar class for an input sample, it updates only that class's parameters, whereas SOM also updates the neighboring nodes. K-means is therefore more sensitive to noisy data, while SOM may be less accurate than K-means because adjacent nodes are updated as well.
(3) SOM is easier to visualize, since it produces an elegant topology map.
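The difference described in point (2) can be made concrete with a single update step. The sketch below is illustrative only: it uses an online (one sample at a time) variant of K-means rather than the usual batch centroid recomputation, assumes a Gaussian neighborhood on a one-dimensional line of nodes, and uses hypothetical function names and parameter values.

```python
import math


def nearest(weights, x):
    """Index of the prototype closest to x (squared Euclidean distance)."""
    return min(range(len(weights)),
               key=lambda j: sum((a - b) ** 2 for a, b in zip(x, weights[j])))


def kmeans_style_step(weights, x, lr):
    """Online K-means style step: only the winning prototype moves toward x."""
    win = nearest(weights, x)
    return [[w + lr * (xi - w) for w, xi in zip(wj, x)] if j == win else list(wj)
            for j, wj in enumerate(weights)]


def som_style_step(weights, x, lr, sigma):
    """SOM step on a 1-D line: neighbors move too, discounted by lattice distance."""
    win = nearest(weights, x)
    updated = []
    for j, wj in enumerate(weights):
        h = math.exp(-((j - win) ** 2) / (2.0 * sigma ** 2))  # h == 1 for the winner
        updated.append([w + lr * h * (xi - w) for w, xi in zip(wj, x)])
    return updated


prototypes = [[0.0], [1.0], [2.0]]  # three nodes on a line, 1-D inputs
sample = [1.2]                      # closest to the middle node
after_kmeans = kmeans_style_step(prototypes, sample, lr=0.5)
after_som = som_style_step(prototypes, sample, lr=0.5, sigma=1.0)
```

In `after_kmeans` only the middle prototype changes, while in `after_som` the two outer nodes are also pulled toward the sample, by a smaller amount than the winner.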
Project name | SOM service |
---|---|
Our advantages | |
Sample requirements | Our SOM service requires you to provide specific requirements. |
Screening cycle | Decide according to your needs. |
Deliverables | We provide you with raw data and analysis service. |
Price | Inquiry |
CD ComputaBio's SOM service can significantly reduce the cost and labor of subsequent experiments. It is a personalized, customized scientific research service: each project must be evaluated before the corresponding analysis plan and price can be determined. Note that SOM can yield misleading results in cases where multivariate analysis is more appropriate. If you would like to know more about service prices or technical details, please feel free to contact us.