Asymmetric ties are directed ties: they indicate that the relationship is one-sided and runs in a specified direction (hence “directed”). In other words, the relationship between two nodes joined by an asymmetric tie is not mutual, and the tie is directed from one node to the other. An advice-seeking tie is an example. An asymmetric, advice-seeking tie between two nodes indicates that node #1 seeks advice from node #2, but node #2 does not seek advice from node #1, so the relationship is one-sided rather than mutual.
Symmetric ties, or undirected ties, show mutual relationships in which the good that the tie represents is exchanged in both directions between the two nodes. These ties are called undirected because they have no specific direction describing the movement of the good, since both nodes receive the same good. A friendship tie is an example: it describes a good that each node receives from the other. Friendship is (hopefully) regarded as a mutual exchange in which both nodes give and receive friendship, thereby creating an undirected, symmetric tie between the nodes involved.
To decide whether a tie is symmetric or asymmetric, first identify the type of tie, and then ask whether the relationship represents a good that is mutually exchanged or a good that one node gives and the other receives. If the relationship is mutual, the tie is undirected and symmetric. If the relationship is one-sided, with one node giving and the other receiving, the tie is an asymmetric, directed tie.
A) See Figure 1
B) Cutpoints are critical nodes that hold the graph together: removing a cutpoint splits the network into more components than it had before. A component is a maximal group of nodes in which every node can reach every other node through some path of ties. A cutpoint therefore connects groups of nodes that are already linked internally but that have no other path to one another. This type of node has a critically important role as a liaison between groups of nodes within a network and allows ties to form between those groups. If the cutpoints were removed, the network would become less connected, because the groups that used to be joined through a cutpoint would no longer have a path between them.
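The definition of a cutpoint can be checked mechanically. The following is a minimal Python sketch, not the procedure used to produce Figure 1: it counts components with a breadth-first search and flags any node whose removal increases that count. The toy network and the function names (count_components, cutpoints) are hypothetical and are not taken from the MSOAC data.

```python
from collections import deque


def count_components(adj, removed=None):
    """Count connected components, ignoring the node `removed` if one is given."""
    seen = set() if removed is None else {removed}
    components = 0
    for start in adj:
        if start in seen:
            continue
        components += 1
        seen.add(start)
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in adj[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
    return components


def cutpoints(adj):
    """Return the nodes whose removal increases the number of components."""
    baseline = count_components(adj)
    return sorted(n for n in adj if count_components(adj, removed=n) > baseline)


# Hypothetical toy network: two triangles that are joined only through node "E".
toy = {
    "A": {"B", "E"},
    "B": {"A", "E"},
    "E": {"A", "B", "C", "D"},
    "C": {"D", "E"},
    "D": {"C", "E"},
}

print(cutpoints(toy))  # ['E']
```

In the toy network, removing "E" leaves the pairs A-B and C-D with no path between them, which is the same liaison role described for the cutpoints above. Removing one node at a time is simple but slow; it is meant only to make the definition concrete.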
In Figure 1, the network, which illustrates organizational consortium communication, has 5 cutpoints. These cutpoints connect the rest of the participating organizations to each other and act as the primary liaisons between organizations that would otherwise not interact. In a consortium that aims to improve the development process for Multiple Sclerosis drugs, the cutpoints are essential because they allow consortium-related communication, and the collaboration that follows from it, among all the organizational representatives in the network as they work towards a common goal. If the cutpoints in this network were removed, organizations that used to communicate with each other would most likely be unable to do so, or would find it difficult to establish a tie with an organization that is unknown to them. This is because the cutpoint would no longer exist as a liaison that connects them and allows them to begin communicating and collaborating towards the common goal of improving the drug development process.
Tie strength is also an important feature to look at in Figure 1. Tie strength is a measure of how frequently the participating organizations in the scientific consortium communicated. The consortium was centered around improving the Multiple Sclerosis drug development process, and its organizations communicated and collaborated with one another. The strongest ties in the network are represented by the thickest lines, which indicate the most frequent communication: the two organizations communicated, on average, four times per month. Conversely, the weakest ties in the network are represented by the thinnest lines, which indicate infrequent communication: the two organizations communicated, on average, once per month.
In Figure 1, it is evident that the most frequent communication occurred between the cutpoint nodes, and between cutpoint nodes and other nodes in the network. This reinforces the cutpoints' critical position as liaisons between nodes that might not otherwise communicate with each other. A cutpoint that communicates frequently with other nodes allows information to flow easily between those nodes, which also establishes the cutpoint's position as an important node for connecting otherwise disconnected parts of the network. For example, node Z is an organizational node that communicates frequently with cutpoint node E. Through this interaction, node Z has access to a communication path to node AG and cutpoint G, because cutpoint node E communicates frequently with node AG. Referring back to the significance of cutpoints in this network, if cutpoint node E were removed, then node Z (and any other organizational node connected to cutpoint node E) would likely not have a path to node AG or to any other node on the right side of the network diagram. This demonstrates the cutpoints' importance in connecting otherwise disconnected parts of the network. Additionally, in this network, where collaboration is critical, the frequent communication of cutpoints with each other and with other organizations underscores their importance in facilitating collaboration and communication in the network.
Figure 1: MSOAC network of scientific consortium-related communication
A) By calculating the degree centrality of the MSOAC network, we highlight the number of organizations that each organization in the network is tied to and communicates with. Organizations with a high degree centrality communicate directly with a larger number of other organizations, which establishes them as central nodes in the communication network of this scientific consortium. In this degree centrality analysis, nodes G, H, AG, and D (AG and D are tied with the same normalized degree centrality score) are the organizations with the highest normalized degree centrality values. We use the normalized score instead of the raw degree centrality score because the normalized score accounts for network size, which allows it to be compared with scores from other networks. The normalized degree centrality score varies between 0 and 1, where 0 indicates that the node has no ties to any other node in the network and 1 indicates that it is tied to every other node in the network. For the MSOAC network, Table 1 lists the nodes with the three highest normalized degree centrality scores. Node G has a normalized degree centrality score of 0.413, which means that node G has ties to 41.3% of the other nodes in the network and establishes it as a critical and central node. Node G is clearly a highly central node and has significant involvement in this network, both in collaborating and in communicating with other nodes to improve the drug development process. Node G is also central in providing information to other organizations in the network that would otherwise be unable to access that information and collaborate effectively. Similarly, node H has a normalized degree centrality score of 0.348, and nodes AG and D each have a score of 0.326, which means that these nodes are tied to and communicating with roughly a third of the other nodes in this network.
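The normalization can be illustrated with the values in Table 1. This sketch assumes the common definition of normalized degree centrality (raw degree divided by n - 1, the number of other nodes) and a network of 47 nodes. The network size is an inference from 19 / 0.413 ≈ 46, not a figure stated in the output, and the function name is hypothetical.

```python
def normalized_degree(degree, n_nodes):
    """Raw degree divided by the number of other nodes in the network."""
    return degree / (n_nodes - 1)


# Assumption: 47 nodes, inferred from Table 1 (19 / 0.413 is about 46 = n - 1).
N_NODES = 47

# Raw degree scores as reported in Table 1.
raw_degrees = {"G": 19, "H": 16, "AG": 15, "D": 15}

normalized = {
    node: round(normalized_degree(degree, N_NODES), 3)
    for node, degree in raw_degrees.items()
}
print(normalized)  # {'G': 0.413, 'H': 0.348, 'AG': 0.326, 'D': 0.326}
```

Under this assumption the computed values match the normalized column of Table 1, which is why a score of 0.413 can be read directly as "tied to 41.3% of the other nodes".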
Bonacich power centrality (with a negative beta) measures the centrality of an organization with regard to its communication ties to organizations in the MSOAC that have low degree centrality. In other words, with a negative beta a node scores higher when the nodes it is connected to have few ties of their own. This measure aims to capture the nodes that have many ties to peripheral nodes, in order to assess the power that certain nodes have in controlling the flow of information or the exchange of ties in a network. In the MSOAC network, nodes R, P, and L have the highest Bonacich centrality, with normalized scores of 3.072, 2.183, and 2.105, respectively. Node R's score, the highest in the network, is especially significant because it indicates that this organization has a substantial amount of power in the sense of control: it is connected to the largest number of organizations that do not have many ties elsewhere, that is, organizations with low degree centrality. As a result, these low-degree organizations rely on node R for information within the network, as well as for ties to other nodes for the purpose of collaboration and communication. Through this measure, it is evident that node R is quite “central” in providing, and effectively controlling, information to peripheral nodes that do not have many communication ties to other organizations in the network.
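To make the effect of a negative beta concrete, the sketch below approximates Bonacich power centrality with a truncated power series, c = alpha * sum over k >= 0 of beta^k A^(k+1) 1, which is one standard way of writing the measure. The beta value, the scaling constant alpha, the toy network, and the function name are all assumptions for illustration; the beta and the normalization that produced Table 2 are not stated in the output, so this will not reproduce those scores.

```python
def bonacich_power(adj, beta, alpha=1.0, n_terms=200):
    """Approximate c = alpha * sum_{k>=0} beta^k A^(k+1) 1 with a truncated series.

    The series only converges when |beta| < 1 / (largest eigenvalue of A).
    """
    nodes = list(adj)
    # First term of the series: A * 1, which is each node's degree.
    term = {v: float(len(adj[v])) for v in nodes}
    total = dict(term)
    for _ in range(n_terms):
        # Next term: beta * A * (previous term).
        term = {v: beta * sum(term[u] for u in adj[v]) for v in nodes}
        for v in nodes:
            total[v] += term[v]
    return {v: alpha * total[v] for v in nodes}


# Hypothetical toy network: a three-node path A - B - C.
path = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}}

scores = bonacich_power(path, beta=-0.5)
print({node: round(score, 3) for node, score in scores.items()})
```

In this toy path, B is tied to two nodes that have no other ties and ends up with a score of about 2, while A and C, whose only tie is to the better-connected B, end up with a score of about 0. This is the same pattern described for node R: with a negative beta, being connected to poorly connected nodes raises a node's power.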
B) Both measures (degree centrality and negative-beta Bonacich power centrality) are important in analyzing this network, because it is essential to know which organizations are critical and central: they are the most important in facilitating collaboration and allowing communication to flow through the network. However, for this network, the Bonacich power centrality measure (with negative beta) seems to make the most sense, given that the network exists for collaboration to improve the drug development process. Collaboration of this sort requires that all organizations participate and contribute in order to improve the outcome. In this network, the nodes represent organizations from different parts of the healthcare system: government, for-profit, and non-profit. This diverse representation demonstrates that there are vested interests in this network and that each organization has a different perspective, piece of information, or idea to offer. As a result, it is important that each organization is effectively integrated into the collaboration. Bonacich centrality (with a negative beta) allows us to understand how certain nodes help to integrate other nodes into the network and are central to providing information to peripheral nodes that, given their low degree centrality, may not have communication ties to others in the network. Nodes R, P, and L have high Bonacich power centrality scores, which suggests that they are influential in providing information to the other nodes and in connecting them to the larger network, thereby facilitating collaboration and communication. This is a key aspect of this particular network.
Table 1: Degree Centrality Output for MSOAC network
Organization (node) | Degree centrality score | Normalized degree centrality score
G                   | 19                      | 0.413
H                   | 16                      | 0.348
AG                  | 15                      | 0.326
D                   | 15                      | 0.326
Table 2: Bonacich (negative beta) Power Centrality Output for MSOAC network
Organization (node) | Bonacich power centrality score | Normalized Bonacich centrality score
R                   | 11.822                          | 3.072
P                   | 8.401                           | 2.183
L                   | 8.102                           | 2.105
Figure 3: Degree Centrality Graph for MSOAC network
Figure 4: Bonacich (negative beta) Power Centrality Graph for MSOAC network
A) Reciprocity can be described as the mutual exchange of a tie between the two nodes of a dyad (a pair of nodes). It reflects the likelihood that a node will return a tie that it receives from another node, thereby creating a symmetric and mutual exchange of the tie. In this network of high school students, the percentage of dyads that are reciprocal is 100%. This is not surprising because the tie in this network is friendship, which is a symmetric and reciprocal type of tie. Both nodes give friendship to and receive friendship from the other node in the dyad.
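As an illustration of how a dyad-based reciprocity percentage can be computed, the sketch below takes a list of directed ties and returns the share of connected dyads in which the tie runs in both directions. The function name and both toy networks are hypothetical; they are not the Adolescent Health data, and the software used for the analysis may define its reciprocity options differently.

```python
def dyad_reciprocity(arcs):
    """Share of connected dyads whose tie runs in both directions."""
    arc_set = {(a, b) for a, b in arcs if a != b}  # ignore self-ties
    dyads = {frozenset(arc) for arc in arc_set}
    if not dyads:
        return 0.0
    mutual = 0
    for dyad in dyads:
        a, b = tuple(dyad)
        if (a, b) in arc_set and (b, a) in arc_set:
            mutual += 1
    return mutual / len(dyads)


# Hypothetical friendship ties: every tie is returned.
friendship = [(1, 2), (2, 1), (2, 3), (3, 2)]

# Hypothetical advice ties: the tie from 2 to 3 is not returned.
advice = [(1, 2), (2, 1), (2, 3)]

print(dyad_reciprocity(friendship))  # 1.0
print(dyad_reciprocity(advice))      # 0.5
```

A value of 1.0 corresponds to the 100% reported for the friendship network, where every tie in a dyad is returned.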
B) Homophily refers to the tendency of individuals to form ties with people who are similar to them on some attribute. It is related to cohesion because individuals who share an attribute are likely to form ties with one another, thereby forming groups and a more cohesive network. Using the Adolescent Health dataset, we examine race as an attribute that may act as a homophily-generating force in this network. In other words, we hypothesize that individuals of the same race are likely to form friendship ties with one another. Upon running the homophily test, we found that the expected value of the E-I index is -0.836. However, we have to take into account network attributes such as size and density when determining the E-I index of this particular network, so the rescaled E-I index value of -0.843 is the appropriate measure for this analysis. The negative value of this rescaled index, and its proximity to the index's minimum value of -1, initially suggests a tendency towards racial homophily in the network. This initial reading suggests that high school students in this network form friendship ties mostly with others of the same race.
However, we have to evaluate the observed rescaled E-I index against what would be expected by chance, where the friendship ties in the network are randomly distributed. The expected E-I index value is -0.836, with a standard deviation of 0.045. This gives a range of -0.881 to -0.791 within one standard deviation of the expected value. The observed rescaled E-I index of -0.843 falls well inside this range, which suggests that it is not statistically significant. Despite its initial indication of racial homophily, the observed value is consistent with what random tie formation would produce, so we cannot conclude that race is a homophily-generating force in this network.
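The reasoning in this answer can be sketched in a few lines. The first function uses the standard E-I index formula, (E - I) / (E + I), where E is the number of ties between groups and I is the number of ties within groups. The second mirrors the comparison made above (observed value against expected value plus or minus one standard deviation) and is not a formal permutation-test p-value. The toy ties, group labels, and function names are hypothetical, and the rescaling step applied by the software is not reproduced here.

```python
def ei_index(edges, group):
    """E-I index: (external ties - internal ties) / (all ties)."""
    external = sum(1 for a, b in edges if group[a] != group[b])
    internal = len(edges) - external
    return (external - internal) / (external + internal)


def within_one_sd(observed, expected, sd):
    """True when the observed value lies within one SD of the expected value."""
    return abs(observed - expected) <= sd


# Hypothetical toy network: three within-group ties and one between-group tie.
toy_edges = [(1, 2), (2, 3), (4, 5), (3, 4)]
toy_group = {1: "x", 2: "x", 3: "x", 4: "y", 5: "y"}
print(ei_index(toy_edges, toy_group))  # -0.5

# Values reported in the analysis above.
print(within_one_sd(observed=-0.843, expected=-0.836, sd=0.045))  # True
```

The second call returns True because -0.843 differs from -0.836 by only 0.007, far less than the standard deviation of 0.045.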
A) Asymmetric (directed) data can be symmetrized in two ways: with the minimum function or with the maximum function. The minimum function takes the direction of the tie into account and is the least inclusive option. If two nodes have an asymmetric relationship, where the tie is directed from one node to the other and is not returned, the minimum function does not record a symmetric tie, since there is no mutual exchange. It is the least inclusive option because it keeps only the ties that are exchanged in both directions. Alternatively, the maximum function is the most inclusive option because it keeps ties regardless of their direction. Even if the tie is one-sided, directed from one node to the other, the maximum function records it as a tie. As long as a relationship exists in either direction, the maximum function treats it as a tie.
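The two symmetrizing rules can be shown on a small adjacency matrix. In this sketch each cell of the symmetrized matrix is the minimum (or maximum) of the two directed cells for that pair of nodes. The matrix and the function name are hypothetical and are not the seventh-grade data.

```python
def symmetrize(matrix, rule):
    """Apply `rule` (min or max) to each pair of cells (i, j) and (j, i)."""
    n = len(matrix)
    return [[rule(matrix[i][j], matrix[j][i]) for j in range(n)] for i in range(n)]


# Hypothetical directed network: rows send ties, columns receive them.
# Node 0 and node 1 choose each other; 0 -> 2 and 2 -> 1 are one-sided.
directed = [
    [0, 1, 1],
    [1, 0, 0],
    [0, 1, 0],
]

sym_min = symmetrize(directed, min)
sym_max = symmetrize(directed, max)

print(sym_min)  # [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
print(sym_max)  # [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
```

The minimum rule keeps only the mutual tie between nodes 0 and 1, while the maximum rule keeps all three pairs, which is why the minimum is the least inclusive option and the maximum the most inclusive.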
B) In running a clique analysis with a minimum size of 3, we are looking for subgroups of at least three nodes in which every node is directly connected to every other node, that is, at a geodesic distance of 1. This means that each node is only one tie away from every other node in the subgroup. In this analysis, 8 cliques were found. Nodes 8 and 11 were each in 4 of the 8 cliques, which indicates their integral position as liaisons between cohesive groups. They are directly tied to the other nodes of those 4 cliques, which demonstrates their significant position as preferred partners for school work. In the context of the network, nodes 8 and 11 represent seventh grade students who were chosen by their classmates as people they wanted to work with. These findings demonstrate that nodes 8 and 11 are students who are popular among their peers and who work directly with other students in the network. These nodes can be a source of knowledge for their peers and can also facilitate information flow within the network of seventh grade students.
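A clique analysis of this kind can be sketched with the basic Bron-Kerbosch algorithm, which enumerates maximal cliques in an undirected network. This is an illustrative implementation under stated assumptions, not the routine used for the results above; the toy network, node labels, and function name are hypothetical.

```python
from collections import Counter


def maximal_cliques(adj, min_size=3):
    """List maximal cliques with at least `min_size` members (basic Bron-Kerbosch)."""
    found = []

    def expand(clique, candidates, excluded):
        if not candidates and not excluded:
            # Nothing can extend this clique, so it is maximal.
            if len(clique) >= min_size:
                found.append(sorted(clique))
            return
        for node in list(candidates):
            expand(clique | {node}, candidates & adj[node], excluded & adj[node])
            candidates = candidates - {node}
            excluded = excluded | {node}

    expand(set(), set(adj), set())
    return sorted(found)


# Hypothetical symmetrized network: two triangles that share node 3.
toy = {
    1: {2, 3},
    2: {1, 3},
    3: {1, 2, 4, 5},
    4: {3, 5},
    5: {3, 4},
}

cliques = maximal_cliques(toy, min_size=3)
membership = Counter(node for clique in cliques for node in clique)

print(cliques)        # [[1, 2, 3], [3, 4, 5]]
print(membership[3])  # 2
```

Counting how many cliques each node belongs to, as in the last two lines, is the same tally that singles out nodes 8 and 11 in the analysis above.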
C) In this 2-clique analysis, 9 2-cliques were found, which means that there are 9 subgroups in which the maximum geodesic distance between any two nodes is 2. This differs from the clique analysis because nodes can be connected at a geodesic distance of up to 2, rather than 1, which relaxes the criterion for clique formation. The 2-clique measures cohesiveness through the connectivity of nodes across the whole network, so it also counts paths that run through intermediaries who are not members of the subgroup. In other words, an intermediary node does not have to belong to the subgroup as long as it connects two nodes that belong to the 2-clique. In this analysis, node 22 is a part of 4 of the 9 2-cliques, which suggests that node 22 has an important role in working with their peers and is also a liaison between different groups of students with regard to work and academic information exchange. It also suggests that node 22 works with, or is preferred as a work partner by, many students in these 2-cliques, or is at least connected to them through one intermediary student. This places node 22 in a unique position in the network: they are connected to many other students and can work both with students included in these 2-cliques and with intermediaries who are not members of them. Additionally, nodes 14, 20, 21, and 22 appear together in 3 of the 2-cliques. This demonstrates that these students preferred to work with each other and likely worked together frequently. Nodes 14, 20, and 21 also hold a position similar to node 22's, in that they are connected to other students either directly or through at most one intermediary.
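One way to sketch a 2-clique search is to first connect every pair of nodes whose geodesic distance in the whole network is at most 2, and then find the maximal cliques of that derived network. The code below does this with the basic Bron-Kerbosch algorithm. It is an illustration under stated assumptions rather than the routine used for the results above, and the toy network and function names are hypothetical.

```python
def within_two_steps(adj):
    """Link every pair of nodes whose geodesic distance is 1 or 2."""
    reach = {}
    for node, neighbors in adj.items():
        two_step = set(neighbors)
        for neighbor in neighbors:
            two_step |= adj[neighbor]
        two_step.discard(node)
        reach[node] = two_step
    return reach


def two_cliques(adj):
    """Maximal groups whose members are all within distance 2 of each other."""
    reach = within_two_steps(adj)
    found = []

    def expand(clique, candidates, excluded):
        if not candidates and not excluded:
            found.append(sorted(clique))
            return
        for node in list(candidates):
            expand(clique | {node}, candidates & reach[node], excluded & reach[node])
            candidates = candidates - {node}
            excluded = excluded | {node}

    expand(set(), set(reach), set())
    return sorted(found)


# Hypothetical symmetrized network: a path 1 - 2 - 3 - 4.
path = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}

print(two_cliques(path))  # [[1, 2, 3], [2, 3, 4]]
```

The path contains no ordinary clique of size 3, yet it yields two overlapping 2-cliques, because nodes 1 and 3 (and nodes 2 and 4) are joined through one intermediary. Nodes 1 and 4 are three steps apart, so they never share a 2-clique.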
D) The 2-clique analysis is the most interesting, in my opinion, because the research question asks about the work preferences of seventh graders, which has implications for who they have worked with and who has a reputation in the network for being a good worker. Since much of the work in grade school is project-based and collaborative, work preferences are important among students. A 2-clique is a more interesting way to understand these work preferences, as the clique analysis is too rigid given that all members must be at a geodesic distance of 1 from one another. The 2-clique analysis expands the clique definition to subgroups whose members may be linked through intermediaries, and it shows how students overlap in their work preferences. Also, given that we used the minimum function to symmetrize the data, it is likely that many ties were not counted as connections between nodes. There could be some bias from students forgetting to list others, and because of the minimum function such one-sided nominations were not counted as ties even though the relationship existed in reality. The 2-clique analysis, as opposed to the general clique analysis, provides a more inclusive measure with which we can assess the work-preference relationships between seventh grade students, including those that run through intermediaries.
Figure 4: 2-clique graph of seventh grader work preferences