BSDS3002 Social Computing Methods and Applications - L6 - Hands-on Network Analysis Practice
BSDS3002 L6—Hands-on Network Analysis Practice (20’)
Deliverables:
- 1) Python code in .ipynb format for Part I and Part II
- 2) A short report for Part II (around 800 words)
Part I (8’):
When we construct networks from real data, it is common to import the edge and node information from a CSV or Excel file using pandas. We introduced the nx.from_pandas_edgelist() function for constructing an undirected and unweighted network.
Q1. Construct an undirected and unweighted network from the Polblog dataset using Part1_edges_L6.csv, and set the “political_orientation” node attribute from the “value” column in Part1_nodes_L6.csv
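One possible way to approach Q1 is sketched below on a tiny in-memory stand-in for the two CSV files. The edge column names "source" and "target" and the node id column "id" are assumptions (only the “value” column is named in the task), so check the actual headers before reusing this.

```python
import networkx as nx
import pandas as pd

# Toy stand-ins for Part1_edges_L6.csv and Part1_nodes_L6.csv.
# ASSUMPTION: the columns "source", "target" and "id" are hypothetical names;
# only the "value" column is mentioned in the assignment.
edges = pd.DataFrame({"source": [1, 2, 3, 5], "target": [2, 3, 1, 6]})
nodes = pd.DataFrame({"id": [1, 2, 3, 4, 5, 6], "value": [0, 0, 1, 1, 0, 1]})
# With the real files, something like pd.read_csv("Part1_edges_L6.csv") would replace the toy frames.

# from_pandas_edgelist builds an undirected, unweighted nx.Graph by default.
G = nx.from_pandas_edgelist(edges, source="source", target="target")

# Optionally keep nodes that appear in the node file but have no edges.
G.add_nodes_from(nodes["id"])

# Attach the "value" column as the "political_orientation" node attribute.
orientation = dict(zip(nodes["id"], nodes["value"]))
nx.set_node_attributes(G, orientation, name="political_orientation")

print(G.number_of_nodes(), G.number_of_edges())
```

If the network should contain only nodes that appear in at least one edge, the add_nodes_from() line can be dropped.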
Q2. Identify the largest component of the network
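A minimal sketch of one way to extract the largest component of an undirected graph, shown on a toy graph rather than the Polblog data:

```python
import networkx as nx

# Toy undirected graph: a triangle, a separate edge, and one isolated node.
G = nx.Graph([(1, 2), (2, 3), (3, 1), (5, 6)])
G.add_node(4)

# connected_components yields one set of nodes per component;
# the largest component is the set with the most nodes.
largest_nodes = max(nx.connected_components(G), key=len)

# Copy the induced subgraph so it can be modified independently of G.
G_lcc = G.subgraph(largest_nodes).copy()

print(G_lcc.number_of_nodes(), G_lcc.number_of_edges())
```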
Q3. Visualize the largest component of the network
- 1) color the nodes based on their ‘political_orientation’ attribute
- 2) plot the network in the spring layout, nx.spring_layout()
- 3) do not show the node labels in the graph
- 4) plot edge_color to be ‘grey’
- 5) set edgecolors = ‘w’
- 6) plot node_size to be 90
- 7) save the network visualization to be a pdf file
Hint: the expected network visualization:
You can find more information about nx.draw_networkx() at the link below. https://networkx.org/documentation/stable/reference/generated/networkx.drawing.nx_pylab.draw_networkx.html
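The drawing options listed in Q3 could be combined roughly as follows. This is a sketch on a toy graph: the colour palette, the figure size, the layout seed, and the output file name "largest_component.pdf" are all illustrative assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs as a script; omit in a notebook
import matplotlib.pyplot as plt
import networkx as nx

# Toy stand-in for the largest component, with the node attribute already set.
G_lcc = nx.Graph([(1, 2), (2, 3), (3, 1), (3, 4)])
nx.set_node_attributes(G_lcc, {1: 0, 2: 0, 3: 1, 4: 1}, name="political_orientation")

# ASSUMPTION: an arbitrary two-colour palette keyed by the attribute value.
palette = {0: "tab:blue", 1: "tab:red"}
node_colors = [palette[G_lcc.nodes[n]["political_orientation"]] for n in G_lcc.nodes]

pos = nx.spring_layout(G_lcc, seed=42)  # fixed seed so the layout is reproducible
fig, ax = plt.subplots(figsize=(8, 8))
nx.draw_networkx(
    G_lcc,
    pos=pos,
    ax=ax,
    with_labels=False,       # hide node labels
    node_color=node_colors,  # colour by political_orientation
    node_size=90,
    edge_color="grey",
    edgecolors="w",          # white outline around each node
)
ax.set_axis_off()
fig.savefig("largest_component.pdf", bbox_inches="tight")
plt.close(fig)
```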
Part II (12’):
Bipartite network analysis
In this part, you will construct a bipartite network modeling insect-plant pollination relationships and analyze it. You will then write a short report (around 800 words, 3’) that presents and elaborates on the analysis results for questions Q4-Q12 (you may use methods beyond network analysis to address the questions, if appropriate).
- We learned bipartite network analysis in L5. A bipartite network consists of two sets of nodes, and its edges only connect nodes from different sets. https://networkx.org/documentation/stable/reference/algorithms/bipartite.html
- Dataset: plant-pollinator network (on Moodle)
This network consists of two types of nodes. In the Part2_node_L6.csv file, insects are the nodes with ‘pollinator’ equal to 1, and plants are the nodes with ‘pollinator’ equal to 0.
- More information about the dataset can be found in this article:
Kato, M., Kakutani, T., Inoue, T., & Itino, T. (1990). Insect-flower relationship in the primary beech forest of Ashu, Kyoto: An overview of the flowering phenology and the seasonal pattern of insect visits. Contributions from the Biological Laboratory, Kyoto University, 27(4), 309-376.
Questions
Q4. Import the dataset, construct the bipartite insect-plant network, and report the summary information of the bipartite network (the two items below). Please make sure to interpret the results in context (e.g., the number of nodes in node_set_1 is 1000, which means there are 1000 insects in the dataset).
- Number of nodes in each node set
- Number of edges in the bipartite network
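As a rough illustration of Q4, the sketch below builds a small bipartite graph from in-memory tables and reports the size of each node set. The node id column "id" and the edge columns "source" and "target" are assumptions; only the ‘pollinator’ column is described in the assignment, and the species names are made up.

```python
import networkx as nx
import pandas as pd
from networkx.algorithms import bipartite

# Toy stand-ins for the Part II node and edge files.
# ASSUMPTION: "id", "source" and "target" are hypothetical column names.
nodes = pd.DataFrame({
    "id": ["bee", "fly", "moth", "rose", "lily"],
    "pollinator": [1, 1, 1, 0, 0],
})
edges = pd.DataFrame({
    "source": ["bee", "bee", "fly", "moth"],
    "target": ["rose", "lily", "rose", "rose"],
})

B = nx.Graph()
for node_id, is_pollinator in zip(nodes["id"], nodes["pollinator"]):
    # Store the raw value and the conventional "bipartite" attribute used by NetworkX.
    B.add_node(node_id, pollinator=int(is_pollinator), bipartite=int(is_pollinator))
B.add_edges_from(zip(edges["source"], edges["target"]))

insects = {n for n, d in B.nodes(data=True) if d["pollinator"] == 1}
plants = set(B) - insects

# Sanity check: every edge should connect an insect to a plant.
assert bipartite.is_bipartite_node_set(B, insects)

print(f"{len(insects)} insects, {len(plants)} plants, {B.number_of_edges()} pollination links")
```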
Q5. Which plants are pollinated by the most insects? Which insects pollinate the most plants?
Q6. Which pair of insects are the strongest competitors? (They pollinate the largest number of the same plants.)
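One way to think about Q5 and Q6: in the bipartite network, a node's degree counts its partners in the other set, and a weighted projection onto the insects counts how many plants each pair of insects shares. The sketch below uses made-up species on a toy graph.

```python
import networkx as nx
from networkx.algorithms import bipartite

insects = ["bee", "fly", "moth"]
plants = ["rose", "lily", "iris"]
B = nx.Graph()
B.add_nodes_from(insects, pollinator=1)
B.add_nodes_from(plants, pollinator=0)
B.add_edges_from([
    ("bee", "rose"), ("bee", "lily"), ("bee", "iris"),
    ("fly", "rose"), ("fly", "lily"),
    ("moth", "rose"),
])

# Q5: bipartite degree = number of partners in the other node set.
plant_degree = {p: B.degree(p) for p in plants}
insect_degree = {i: B.degree(i) for i in insects}
top_plant = max(plant_degree, key=plant_degree.get)
top_insect = max(insect_degree, key=insect_degree.get)

# Q6: edge weight in the insect projection = number of shared plants.
W = bipartite.weighted_projected_graph(B, insects)
u, v, data = max(W.edges(data=True), key=lambda e: e[2]["weight"])

print(top_plant, top_insect, (u, v), data["weight"])
```

Note that max() returns a single winner, so ties on real data would need an explicit check for all nodes or pairs that reach the maximum.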
Q7. Project the bipartite network into two one-mode networks
- Interpret the two projected networks (what do the edges mean in each projected network?).
- Provide the summary information of the two projected networks (e.g. number of nodes, number of edges, average degree) as well as interpret this information in context.
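The projection step in Q7 can be sketched as follows on a toy graph. The helper summarize() is a hypothetical name introduced only for this illustration.

```python
import networkx as nx
from networkx.algorithms import bipartite

insects = ["bee", "fly", "moth"]
plants = ["rose", "lily", "iris"]
B = nx.Graph()
B.add_nodes_from(insects, pollinator=1)
B.add_nodes_from(plants, pollinator=0)
B.add_edges_from([("bee", "rose"), ("bee", "lily"), ("fly", "rose"), ("moth", "iris")])

# Two insects are linked if they share at least one plant, and vice versa.
insect_net = bipartite.projected_graph(B, insects)
plant_net = bipartite.projected_graph(B, plants)

def summarize(G):
    """Return node count, edge count, and average degree (2m / n) of an undirected graph."""
    n, m = G.number_of_nodes(), G.number_of_edges()
    return {"nodes": n, "edges": m, "average_degree": 2 * m / n if n else 0.0}

print("insects:", summarize(insect_net))
print("plants:", summarize(plant_net))
```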
Q8. Identify the nodes with the largest degree centrality in both projected networks
- Interpret the meaning of degree centrality within this context
- Compare the results with Q5, discuss the differences of these two results
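The toy example below hints at why the Q8 ranking can differ from Q5: an insect that visits the most plants can still have zero degree centrality in the projection if no other insect shares those plants. Species and links are made up for illustration.

```python
import networkx as nx
from networkx.algorithms import bipartite

insects = ["bee", "fly", "moth", "wasp"]
plants = ["rose", "lily", "iris"]
B = nx.Graph()
B.add_nodes_from(insects, pollinator=1)
B.add_nodes_from(plants, pollinator=0)
B.add_edges_from([
    ("bee", "rose"), ("fly", "rose"), ("moth", "rose"),
    ("wasp", "lily"), ("wasp", "iris"),
])

# Bipartite view (Q5): which insect visits the most plants?
most_plants = max(insects, key=B.degree)

# Projected view (Q8): degree centrality = degree / (n - 1) in the insect network.
insect_net = bipartite.projected_graph(B, insects)
centrality = nx.degree_centrality(insect_net)
top_value = max(centrality.values())
top_nodes = [n for n, c in centrality.items() if c == top_value]  # keeps ties

print(most_plants, top_nodes, centrality)
```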
Q9. Identify the isolated nodes in each projected network (if there are any), and regardless, interpret the meaning of isolated nodes in context.
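Isolated nodes can be listed with nx.isolates(), as in this small sketch on a toy graph:

```python
import networkx as nx
from networkx.algorithms import bipartite

insects = ["bee", "fly", "moth"]
plants = ["rose", "lily", "iris"]
B = nx.Graph()
B.add_nodes_from(insects, pollinator=1)
B.add_nodes_from(plants, pollinator=0)
B.add_edges_from([("bee", "rose"), ("bee", "lily"), ("fly", "rose"), ("moth", "iris")])

insect_net = bipartite.projected_graph(B, insects)
plant_net = bipartite.projected_graph(B, plants)

# nx.isolates yields the nodes with degree zero.
isolated_insects = list(nx.isolates(insect_net))
isolated_plants = list(nx.isolates(plant_net))

print(isolated_insects, isolated_plants)
```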
Q10. Analyze the subgroups in the two projected networks by selecting an appropriate cluster analysis approach, and explain the results (refer to Lecture L4)
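The assignment leaves the choice of clustering method open. As one illustrative option (an assumption, not necessarily the approach covered in Lecture L4), the sketch below applies greedy modularity maximization to a toy graph standing in for a projected network.

```python
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities, modularity

# Toy stand-in for a projected network: two triangles joined by one bridge edge.
G = nx.Graph([(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)])

# ASSUMPTION: greedy modularity maximization is just one possible method.
communities = greedy_modularity_communities(G)
groups = sorted(sorted(c) for c in communities)

# Modularity summarizes how well the partition separates dense groups.
q = modularity(G, communities)

print(groups, round(q, 3))
```

Other methods may give different partitions, so the report should name the chosen method and justify it.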
Q11. Compute and report the degree assortativity coefficient in the two projected networks and explain the result in context
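For Q11, the degree assortativity coefficient can be computed with nx.degree_assortativity_coefficient(). The sketch below uses a star graph as a toy example because its value is easy to reason about: every edge joins the high-degree hub to a degree-1 leaf, so the network is perfectly disassortative.

```python
import networkx as nx

# Toy example: a hub connected to three leaves.
G = nx.star_graph(3)

# r > 0: high-degree nodes tend to link to other high-degree nodes;
# r < 0: high-degree nodes tend to link to low-degree nodes.
r = nx.degree_assortativity_coefficient(G)

print(round(r, 3))
```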
Q12. Visualize the bipartite network as well as the two projected networks.
a. In the bipartite network, color the nodes based on their “pollinator” value
b. In the two projected networks, scale the nodes based on the degree centrality.
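A possible layout for the three plots in Q12 is sketched below on a toy graph. The colours, the size scaling (100 + 1500 × centrality), and the helper name draw_projection() are arbitrary illustrative choices.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs as a script; omit in a notebook
import matplotlib.pyplot as plt
import networkx as nx
from networkx.algorithms import bipartite

insects = ["bee", "fly", "moth"]
plants = ["rose", "lily", "iris"]
B = nx.Graph()
B.add_nodes_from(insects, pollinator=1)
B.add_nodes_from(plants, pollinator=0)
B.add_edges_from([("bee", "rose"), ("bee", "lily"), ("fly", "rose"), ("moth", "iris")])

fig, axes = plt.subplots(1, 3, figsize=(15, 5))

# (a) Bipartite network: one column per node set, coloured by "pollinator".
colors = ["tab:orange" if B.nodes[n]["pollinator"] == 1 else "tab:green" for n in B.nodes]
nx.draw_networkx(B, pos=nx.bipartite_layout(B, insects), ax=axes[0], node_color=colors)
axes[0].set_title("Bipartite network")

def draw_projection(G, ax, title):
    """Draw G with node sizes scaled by degree centrality; return the sizes used."""
    centrality = nx.degree_centrality(G)
    sizes = [100 + 1500 * centrality[n] for n in G.nodes]  # ASSUMPTION: arbitrary scaling
    nx.draw_networkx(G, pos=nx.spring_layout(G, seed=42), ax=ax, node_size=sizes)
    ax.set_title(title)
    return sizes

# (b) Projected networks, sized by degree centrality.
insect_sizes = draw_projection(bipartite.projected_graph(B, insects), axes[1], "Insect projection")
plant_sizes = draw_projection(bipartite.projected_graph(B, plants), axes[2], "Plant projection")

for ax in axes:
    ax.set_axis_off()
plt.close(fig)
```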
There is no need to include the interpretation of the results in the Jupyter notebook file. The interpretation of the analysis results should be placed in the written report.