Comparison of the social network weight measurements

To extract social networks from information sources, it is necessary to measure similarities. Several measurements can be applied to form a social network from information sources. On the other hand, information sources vary in their conditions and are not homogeneous, although it is possible to cluster them into a homogeneous form. However, such measurements require proof. A comparison of the results from various measurements provides one form of that proof.


Introduction
Extracting a social network from the Web with the help of a search engine requires extracting the information contained in the information sources [1,2,3]. Information sources such as the Web contain information that is not only dynamic but also ambiguous and unstructured. Using search engines to access social network information, that is, the information about social actors and their relationships, requires a strategy or approach that varies according to interests [4,5]. In general, however, the approach requires a measurement, commonly referred to as a similarity.
Many similarity measures are used in extracting social networks from the Web. Similarity targets various data objects. Search engines such as Geevv, Google, Yahoo, Bing, and others return hit counts in answer to a query. The query contains the information asked for by the search engine user. Clustering that involves queries, search engines, and the Web generates social networks. This paper aims to show the results of several measurements from the Web.

Some Similarities
After the existence of social actors has been identified [6], a social network exists when the existence of relationships between the social actors has a proof [7]. The existence of a social actor is established through clustering that involves a query: the query contains the name of the social actor, is submitted to the search engine, and produces a hit count for the occurrence, which we denote |Ω_a|, where a ∈ A and A is a set of social actors [8]. The existence of a social relationship is established through a query that contains a pair of social actor names; it is submitted to the search engine and produces a hit count for the co-occurrence [9], which can be stated as |Ω_a ∩ Ω_b|, a, b ∈ A [10].
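The two kinds of hit count described above can be illustrated with a minimal sketch. It assumes a tiny in-memory list of strings in place of the Web and a substring match in place of a real search engine; the corpus, the actor names, and the helper names (`hits`, `occurrence`, `co_occurrence`) are hypothetical and are not part of the original method.

```python
# Toy stand-in for a search engine: each "document" is a plain string.
# The corpus and the actor names are hypothetical illustrations.
CORPUS = [
    "alice and bob wrote a paper",
    "alice gave a talk",
    "bob and carol met",
    "alice, bob and carol share a lab",
]


def hits(name):
    """Return the set of document indices that mention `name` (the set Omega_a)."""
    return {i for i, doc in enumerate(CORPUS) if name in doc}


def occurrence(a):
    """Hit count |Omega_a| for a single social actor."""
    return len(hits(a))


def co_occurrence(a, b):
    """Hit count |Omega_a ∩ Omega_b| for a pair of social actors."""
    return len(hits(a) & hits(b))


if __name__ == "__main__":
    print(occurrence("alice"), occurrence("bob"), co_occurrence("alice", "bob"))
```

In this sketch the co-occurrence count can never exceed either occurrence count, because it is the size of an intersection of the two hit sets.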
For the graph-based representation of social networks [11]: once a node v ∈ V has been declared, an edge e ∈ E can only be expressed by calculating similarity weights, so that each e ∈ E represents a relation based on a weight calculated from two occurrences and one co-occurrence [12]. In the early stages of developing social network extraction from the Web, researchers used the Jaccard coefficient [13] to determine the weight of a relation [14]. However, the existence of a relation between two social actors is based on the matching coefficient, i.e. a relation exists if M_c > 0 [15]. Mutual information [16] measures the mutual dependence between two actors, or the dependence of a social actor on other social actors. The Dice coefficient [17] serves to maintain the heterogeneity of sensitive information between two social actors [18,4].
The overlap coefficient measures the overlap of information between two social actors [19], in terms of who influences and where the influence comes from: |Ω_a| > |Ω_b| is taken to mean that the influence comes from social actor a ∈ A and acts on the other social actor b ∈ A. Besides, the cosine similarity [20] is a measurement to determine the opposition of a social actor to other social actors, whereby the cosine of the angle expresses that opposition [18,4]. ... 1], and m an arbitrary number such that M_i ∈ [0, 1]. For example, for the two social actors Mahyuddin K. M. Nasution and Shahrul Azman Mohd Noah, we obtain occurrences and co-occurrences (Co-Occ) as in Figure 1, and by using Eqs. (1), (3), (4), (5), (6) and an approach (Appr) to it we get the results shown in Table 1 [21].

Table 1. Calculation for two social actors based on some similarities.
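As a hedged illustration of how such weights can be computed from two occurrences and one co-occurrence, the sketch below uses the standard textbook forms of the Jaccard, Dice, overlap, and cosine measures on hit counts. These forms are an assumption: they may differ in detail from the paper's Eqs. (1)-(6), and the function names and the sample counts are hypothetical.

```python
import math


def jaccard(a, b, c):
    """Jaccard coefficient: c / (a + b - c), with a, b occurrences and c co-occurrence."""
    union = a + b - c
    return c / union if union else 0.0


def dice(a, b, c):
    """Dice coefficient: 2c / (a + b)."""
    return 2 * c / (a + b) if (a + b) else 0.0


def overlap(a, b, c):
    """Overlap coefficient: c / min(a, b)."""
    smaller = min(a, b)
    return c / smaller if smaller else 0.0


def cosine(a, b, c):
    """Cosine similarity on hit counts: c / sqrt(a * b)."""
    return c / math.sqrt(a * b) if a and b else 0.0


if __name__ == "__main__":
    # Hypothetical hit counts, not taken from Table 1.
    a, b, c = 100, 50, 25
    for measure in (jaccard, dice, overlap, cosine):
        print(measure.__name__, round(measure(a, b, c), 4))
```

All four functions return values in [0, 1] whenever c does not exceed either occurrence count, which holds for counts derived from an intersection of hit sets.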

Simulation
To express the behaviour of each similarity in Eqs. (1), (3), (4), (5) and (6), we generate a set of random numbers divided into three parts

to represent two occurrences and one co-occurrence. The parts are arranged so that there is one set of maximum numbers and two sets of twin numbers. The set of maximum numbers represents the hit counts for the social actor a, which is H_a = {|Ω_a|_i | i = 1, ..., n}. The first set of twin numbers represents the hit counts for the social actor b, which is H_b = {|Ω_b|_i | i = 1, ..., n}. The second set of twin numbers represents the hit counts for the two social actors at once, namely H_c = {|Ω_a ∩ Ω_b|_i | i = 1, ..., n}. The three sets of numbers are sorted from smallest to largest. Thus, in general, the members of the three arrangements satisfy |Ω_a ∩ Ω_b|_i ≤ |Ω_a|_i and |Ω_a ∩ Ω_b|_i ≤ |Ω_b|_i for i = 1, ..., n [22].
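One possible way to generate such data is sketched below. The sampling scheme is an assumption, since the text does not specify how the random numbers are drawn: each triple is drawn so that the co-occurrence does not exceed the smaller occurrence, and the three sets are then sorted independently. The function name `simulate` and the parameters `max_hits` and `seed` are hypothetical.

```python
import random


def simulate(n, max_hits=1000, seed=0):
    """Generate sorted lists H_a, H_b, H_c of n simulated hit counts each.

    Assumed scheme: draw a, then b <= a, then c <= b, so every triple
    satisfies c <= b <= a before sorting.
    """
    rng = random.Random(seed)
    h_a, h_b, h_c = [], [], []
    for _ in range(n):
        a = rng.randint(1, max_hits)  # occurrence of actor a (the maximum)
        b = rng.randint(1, a)         # occurrence of actor b
        c = rng.randint(0, b)         # co-occurrence of a and b
        h_a.append(a)
        h_b.append(b)
        h_c.append(c)
    # Sorting each list separately keeps the element-wise ordering:
    # if c_k <= b_k <= a_k for every triple, the i-th smallest values
    # satisfy the same inequalities.
    return sorted(h_a), sorted(h_b), sorted(h_c)


if __name__ == "__main__":
    h_a, h_b, h_c = simulate(5)
    print(h_a, h_b, h_c)
```

The sorted lists can then be passed position by position to any similarity measure to trace its behaviour over i = 1, ..., n.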
Numerically, for example, the simulation approach is as in Table 2. The line of J_c runs diagonally from the point 0 (lower left) to the point 1 (upper right) and cuts the area bounded by [0, 1] and the number n into two equal parts [23]. In other words, the area is divided symmetrically into two parts by J_c, see Figure 2(a). In contrast, the line formed by D_c from the point 0 to the point 1 also divides the same area into two parts, but it forms a curve. The curve is reinforced at the initial numbers and descends towards the other end, see Figure 2(b). Likewise, the line formed by computing S_cos experiences more reinforcement at the starting numbers and decreases again at the end, see Figure 2(c). An approach may have the opposite effect on the numbers in a simulation, namely reduced strength at the beginning and reinforcement towards the end.
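The ordering of the three curves can be checked with a short derivation. It assumes the standard forms of the Jaccard, Dice, and cosine measures, which may differ in detail from the paper's equations, and writes a = |Ω_a|, b = |Ω_b|, c = |Ω_a ∩ Ω_b| as shorthand.

```latex
\begin{align*}
J_c &= \frac{c}{a+b-c}
  \;\Longrightarrow\; c = \frac{J_c\,(a+b)}{1+J_c}, \\
D_c &= \frac{2c}{a+b} = \frac{2J_c}{1+J_c} \;\ge\; J_c
  \quad\text{for } J_c \in [0,1], \text{ since } 1+J_c \le 2, \\
S_{\cos} &= \frac{c}{\sqrt{ab}} \;\ge\; \frac{2c}{a+b} = D_c
  \quad\text{since } \sqrt{ab} \le \tfrac{1}{2}(a+b).
\end{align*}
```

Under these assumptions, plotting D_c against J_c gives a concave curve that lies on or above the diagonal, and S_cos lies on or above D_c, which agrees with the description of the Dice and cosine curves as reinforced relative to the Jaccard line.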

Discussion about similarity for social networks
Similarity has an important role in generating social networks from information sources. The role is to determine the relation between social actors in the form of weights [24,25,26]. As such, this role largely determines whether social networks present information that can be trusted. To that extent, similarity is a decisive factor alongside the input from information sources. The input considered here involves only the hit count [2,18]. The validity of a hit count depends on the capabilities and information structure of the search engine [4]. The comparison between several similarity measurements in a simulation is shown in Figure 3, together with an approach to similarity measurement that has so far been expressed as the proposed approach [27].
An experiment that produces social networks from an information space with several measures of similarity has produced some descriptions of the behaviour of those measures [28]. An edge exists with the help of two occurrences and one co-occurrence for a pair of social actors, and a set of edges is then produced for a number of social actors. A set of edges behaves as in Figure 4, and it is the result of the similarity used [4].
Comparing two measures of similarity with the help of relative probability shows the different behaviour of each measure, whereas comparing two similar comparison structures shows different data behaviour [4,21,29].

o c according to the distribution density, see Figure 4(b) [3].
The proposed similarity measure Sim_our, which has been proposed on different occasions [2,30], shows different behaviour when the same data are applied to it, see Figure 4(c). The edges that result from Sim_our are classified differently, following the curve C o according to the order of relative probabilities. Likewise, S_cos under the same treatment produces the same pattern as Sim_our, but it generates a more detailed classification of edges. This shows that each similarity has its own characteristics and, with different goals, gives different results [31]. Presumably there may be a measure of similarity based on the line pattern that borders the edges presented in Figure 4(b); it is simulated in Figure 3 as a proposed similarity.

Conclusion
Measurement of the weight of relations in a social network varies according to the emphasis on the meaning and proof to be disclosed. Several measurements of similarity are used, and comparisons between them indicate that each has its own specificity. Based on the results of the comparison there are three positions: the Jaccard coefficient lies on the midline, the cosine and Dice coefficients lie on the upper arch, and the proposed measurement lies on the lower arch. Comparisons with relative probability showed that the measurements of relation weights in social networks behave differently.