This page contains notes, materials, and resources for the faculty learning community.
-
What the FLC is about:
FLC Goals
Build interdisciplinary, cross-institutional community centered around teaching
Discuss exemplars in network science education and applications
Contribute to the development or evaluation of modules
Discussion:
The idea is to build on network science and complexity because a solution to one problem can solve problems in many other areas, yet communication across fields isn’t always strong
An example at the education level is the Northwestern Network Institute of Technology and Organization (all engineering students must take an intro-level course, and the aim is to make it mandatory for all freshmen)
Theoretical aspect: relational approach to understanding the world
A good focal point for interdisciplinary work because it can be introduced easily by social sciences
Discussion: Can network science be applied to all fields?
Discussion: Agent Based Models
Small Groups: 3 interesting questions a student with no background in networks would want to address
Group 1:
Can a network (structure or network relationships) affect/influence behavior?
Which types of characteristics matter for networks? What types of features are important? Strong/weak ties? Centrality? Context?
Network scale: the size of a network affects what you can do with it.
Boundary specification. Who’s in your network?
Networks as static or dynamic processes? What types of questions can you ask from each perspective?
Context: everyday lives, putting it in terms of their disciplines, (Facebook)
Group 2:
Group 3:
Students gather and map their own data (Facebook friends/University network)
DARPA Weather Balloon Experiment at MIT (asking students to design a similar experiment)
Using data (how are Wikipedia articles linked together, information traversal)
Kauffman paper (Biological Boolean networks used to replicate differences in cells)
Best time:
Expected Outputs:
Courses
Feedback
Contributions
-
By Miller McPherson, Lynn Smith-Lovin, and Matthew Brashears JSTOR Link
Used the General Social Survey to obtain national data that was then analyzed to answer a relevant research question
The survey, which has been conducted since the 1970s, supports longitudinal or demographic research on the United States; its website is easy to use and accessible to everyone
This particular article researches core social networks and how the number of confidants with whom one shares important information has changed over a twenty-year span
More people in 2004 than 1985 discuss important matters with a spouse, yet fewer discuss important matters with other family or non-family members
People are more likely in 2004 to have a confidant of a different race
Loss of community or neighborhood confidants and stronger bonds within the nuclear family have arisen
-
By Peter Bearman, James Moody, and Katherine Stovel JSTOR Link
Analyzes the structure of teenage romantic and sexual networks in one town’s high school
Compares generated simulations of what is expected to happen to real life outcomes; discovered that current models and the process used to construct them are not easily translated into real-world scenarios
The article goes in depth on the specific models of disease transmission found in the high school:
Sexual partner choices were not random (homophily principle), and models should take this into account; at the time they did not, because randomization was widely accepted
Types of physical models generated by the researchers: the Core Model, the Bridging Processes Model, and the Spanning Tree Model
The outcome of the research has implications for the development of preventative health measures and how they could/should be modeled to accommodate different settings
By Mark Mizruchi and Blyden Potts ScienceDirect
Behavior and attitude are relational in a group setting based on a person’s surroundings
The researchers use group decision making in order to analyze the power of centrality and whether it is as easy for a central individual to obtain desired outcomes as previously believed
“The model is based on a distinction between what we call ‘individual’ and ‘structural’ interests. An individual interest is a preference, for a particular outcome, that is exogenously formed. A structural interest is a preference, for a particular outcome, resulting from identifiable social constraints or influence, that may differ from what one’s preference would be in the absence of such constraints or influence. In this formulation, an actor has an initial preference for an outcome.”
The zero-sum principle was used to even the playing field so that no subject had more incentive to win
Findings:
Centrality matters in understanding an actor’s power, but it does not always matter in the same way across different settings or situations
Exactly how centrality affects an actor’s power is heavily determined by the structure of the network, not who is in it
A significant aspect of the network structure involves the number of subgroups, and whether the central actor is in a position to break a deadlock amongst competing groups
*The study of Sociology
*The General Social Survey:
It is an anonymous survey representative of the English-speaking population within the United States above the age of 18
It is conducted via face-to-face interviews
Has the highest response rate of any national survey
Conducted yearly
Consists of a set of basic questions and a yearly onslaught of extra questions that can be submitted by researchers
Anyone can access the General Social Survey and its data (and can run the exact models)
*McPherson/Smith-Lovin Paper
Asked an ego-centered question & published paper
Noticed that many people now (2004) have zero network contacts with whom to discuss important matters
The researchers couldn’t believe the result, so they tried to disprove it
The paper was reviewed by Fischer, who wrote a response
Discrepancies were later discovered in the 2004 data (41 out of 1493 cases; 25% had reported zeros in general)
Now the GSS is going back into the field and they will go in with the same questions asked before by McPherson and Smith-Lovin
Voluntary association questions, but more streamlined this time
Places the question as early in the non-core items as it can possibly be (early in the survey to counteract fatigue issues/as soon as possible after the core questions)
*Network Analysis:
*Ideas to implement with students:
Introduce the homophily structure (introduced by the context in which you meet people)
Can demonstrate in the classroom
Tracing of communicative acts (emails, cell phone logs, etc)
Given data sets such as the GSS, what types of modeling could students do? What types of questions would be interesting for them to answer? What experiments could be conducted?
*Date: April 2nd
Administrative stuff
Summer workshop July 8-9
FLC Honoraria, make sure you send information to Camelia Pierson Eaves
revp@cs.duke.edu
Identify areas of common interest
Break up into 3 groups based on interest in a problem domain
Process by which you analyze data
workplace teams
Important concepts
degree centrality
homophilous attributes (gender, race, religion)
density
content of relationships
Questions/Problems
comparing 2004 vs. 2006
ideas: alcohol use, religion,
where do social isolates get their news from?
do gun owners have more friends than non-gun owners?
introduce with question
students hypothesize with answer
actually answering
Study your own networks
Cluster/bin people by their type of profile
Identify problems rather than solve problems
Problem: Come up with a strategy for binning friends
e.g., if someone lists more than 5 music categories but only 2 movie categories, bin them under music
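One possible binning strategy, as a minimal sketch: assign each friend to the interest area in which their profile lists the most categories. The profile layout, the function names `bin_friend` and `bin_friends`, and the rule for ties are illustrative assumptions, not something the group settled on.

```python
def bin_friend(profile):
    """Assign a friend to the interest area with the most listed categories.

    `profile` maps an interest area (e.g. "music") to the list of
    categories the friend lists under it. Hypothetical data layout.
    """
    if not profile:
        return "unclassified"
    counts = {area: len(items) for area, items in profile.items()}
    top = max(counts.values())
    winners = sorted(area for area, n in counts.items() if n == top)
    # Ties and all-empty profiles are left unclassified rather than forced into a bin.
    if top == 0 or len(winners) > 1:
        return "unclassified"
    return winners[0]


def bin_friends(profiles):
    """Group friend names by their assigned bin."""
    bins = {}
    for name, profile in profiles.items():
        bins.setdefault(bin_friend(profile), []).append(name)
    return bins


# Made-up profiles for illustration only.
example_profiles = {
    "friend_1": {
        "music": ["rock", "jazz", "folk", "pop", "blues", "soul"],
        "movies": ["drama", "comedy"],
    },
    "friend_2": {"music": ["rock"], "movies": ["horror"]},
}
example_bins = bin_friends(example_profiles)
```

Students could then argue about what the tie rule should be, which fits the goal of identifying problems rather than solving them.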
Groups presented their ideas for a module.
David Banks, Ketan Mayer-Patel, and Jim Moody
Background: This is a set of lesson plans for one hour classes on network science. The target audience is undergraduates who have taken an introductory college course in mathematics, statistics, computer science, or some other field that conveys an understanding of modeling.
At the end of this module students will
understand issues related to data collection and data quality in the context of network surveys.
understand how summary statistics can distinguish visually significant and interpretable differences between networks.
understand how statistical modeling can iteratively construct formulae that provide useful and meaningful approximations to the observed networks, enabling insight into the processes that produced those networks.
Before the lecture, each student is given the name of a celebrity and asked to track their sexual contacts, and their sexual contacts’ contacts.
The lecture will show two-step diagrams for some celebrity names, to illustrate the visualization.
The lecture will discuss contact tracing for STDs and other diseases (e.g., the CDC smallpox protocol, SARS, etc.).
The lecture will say that other things flow along networks—information, money, drugs.
Introduce the “six degrees of separation” concept.
The lecture will discuss the issue of data quality. Have all hook-ups been recorded? Have some hook-ups been faked? (Mention the recent _30 Rock_ episode in which Jenna is hired to fake a celebrity romance to dispel rumors about a star’s fondness for Japanese dolls.) How does one define a hook-up, and should there be a distinction between, say, a long-term marriage partner and a short-term affair?
Illustrate the ideas using the hook-up database at http://www.whosdatedwho.com/people/. Show how the network changes when one adds noise to the data. Show how the flow through a network is affected.
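The effect of noise on flow can be sketched on a toy network. The example below assumes one particular kind of noise (a random share of ties goes unrecorded) and measures flow crudely as the set of people reachable from a starting person. The edge list is made up, and the function names are hypothetical.

```python
import random
from collections import deque


def build_adjacency(edges):
    """Turn a list of undirected ties into a node -> neighbors mapping."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def reachable(adj, start):
    """People reachable from `start` (including `start`), via breadth-first search."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adj.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


def drop_edges(edges, fraction, seed=0):
    """Simulate unrecorded ties by keeping a random subset of the edges."""
    rng = random.Random(seed)
    keep = len(edges) - round(len(edges) * fraction)
    return rng.sample(edges, keep)


# Made-up chain of ties, not real data.
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "F")]
full_reach = reachable(build_adjacency(edges), "A")
noisy_edges = drop_edges(edges, 0.4)
noisy_reach = reachable(build_adjacency(noisy_edges), "A")
```

Comparing `full_reach` with `noisy_reach` shows how a few missing ties can cut off part of the network. Faked ties could be simulated the same way by adding random edges instead of removing them.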
Define vertex degree, and calculate it for the Hollywood Hook-Up network.
Relate average node degree to flow through a network. Remind people of the STD example.
Compare average node degree for the Hollywood hook-up network in different decades. Compare the average node degree to the “Jefferson High” data. Are students more/less promiscuous than actors?
Compare the average node degree to the on-screen relationship data found at http://www.whosdatedwho.com/couples/on-screen-couples.asp.
Discuss other mathematical functions of networks: centrality, betweenness, diameter, triad completion. Which provide interpretable information in this context?
Assignment: Calculate these summary measures for 3-step neighborhoods of different celebrities.
Assignment: Do the six-degrees of separation game for the hook-up data; it is available at http://www.whosdatedwho.com/dating/six-degrees.asp.
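The degree calculations and the six degrees game in this lecture can be sketched in a few lines. The edge list below is a made-up toy network rather than data from the hook-up site, and the function names are hypothetical.

```python
from collections import deque


def adjacency(edges):
    """Turn a list of undirected ties into a node -> neighbors mapping."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def degrees(adj):
    """Vertex degree: the number of distinct partners of each node."""
    return {node: len(neighbors) for node, neighbors in adj.items()}


def average_degree(adj):
    """Mean vertex degree, equal to 2 * (number of edges) / (number of nodes)."""
    return sum(len(neighbors) for neighbors in adj.values()) / len(adj)


def separation(adj, source, target):
    """Number of ties on a shortest path, or None if the two are not connected."""
    if source == target:
        return 0
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adj.get(node, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                if neighbor == target:
                    return dist[neighbor]
                queue.append(neighbor)
    return None


# Made-up toy network for illustration.
toy_edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E")]
toy_adj = adjacency(toy_edges)
toy_degrees = degrees(toy_adj)
toy_average = average_degree(toy_adj)
toy_separation = separation(toy_adj, "A", "E")
```

Running the same functions on edge lists from different decades, or on the "Jefferson High" data, gives the comparisons suggested above.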
Describe the p_1 model.
Fit the p_1 model, and estimate the “extroversion” parameter for various celebrities.
Elicit from the class covariates pertinent to the hook-up network.
With the covariates “gender” and “co-starring”, refit the p_1 model.
Discuss the issue of fit. Compare the models visually.
Discuss the problem of unknown/unmeasured covariates.
Assignment: Students will use instructor-provided code to fit models with other covariates: A-list vs. B-list, age.
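For reference, the p_1 model is commonly written for directed ties in the form below, following Holland and Leinhardt. The notation is the usual textbook one rather than anything fixed by these notes, and the notes do not say how the model is adapted to the undirected hook-up data.

```latex
\log P(Y_{ij} = y_{ij},\, Y_{ji} = y_{ji})
  = \lambda_{ij}
  + y_{ij}\,(\theta + \alpha_i + \beta_j)
  + y_{ji}\,(\theta + \alpha_j + \beta_i)
  + \rho\, y_{ij}\, y_{ji},
\qquad \sum_i \alpha_i = \sum_j \beta_j = 0
```

Here $Y_{ij} = 1$ if $i$ names $j$, $\theta$ controls overall density, $\alpha_i$ is the tendency of $i$ to send ties (presumably the "extroversion" parameter mentioned above), $\beta_j$ is the tendency of $j$ to receive ties, $\rho$ captures reciprocity, and $\lambda_{ij}$ is a normalizing constant for each pair.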
At the end of this module, students will
Know the definitions of the basic measures used in egocentric network analysis.
Have learned how to calculate the basic egocentric network measures.
Have a better understanding of the issues involved in collecting network data via survey methods.
Have experience analyzing network data using a statistical software program.
Gain an understanding of hypothesis development and testing procedures.
Learn how to use network analysis to solve sociological problems.
Audience
Time period
Goals
Introduce egocentric network measurement, data collection, and analysis
Provide an understanding of hypothesis development and testing
Generate interest in using network analysis to solve sociological problems
Assessments
Problem
How do the social networks of women differ from the networks of men? Specifically, to what extent do male and female networks differ in terms of size, density, homophily, and kinship?
Data sources
Prerequisite skills
Some familiarity in working with databases
Some experience with statistical software (SPSS, SAS, Stata, or R)
Expected outcomes
Complete survey on ego-network characteristics
Test hypotheses about sex differences in networks using national data
Compare the results from the two data sources
Basic egocentric network measurements
Ego-network data collection strategies
Survey data analysis with statistical software program
Hypothesis development and testing
Network measurement calculation
Statistical significance testing via paired samples t-tests
Introduce the problem.
Discuss differences between men and women, male and female networks.
Discuss the various ego-network measures:
Degree centrality, density, homophily, alter attributes, relationship form/content
Option: Include a broader set of network measures (see the list below)
Option: Split the class into groups to deal with different measures.
Option: Organize groups based on similar hypotheses.
Have students develop their own hypotheses for sex differences in networks.
Discuss how one might test the hypotheses with data.
Provide instructions for completing first part of the homework assignment.
Develop a hypothesis for each of the proposed sex differences in network characteristics.
Option: have students use a source (research article, news article, blog post) to support at least one of the hypotheses.
Complete a web survey that replicates questions from the GSS.
Option: tell students to get 2 of their friends to fill out the survey (knowing the point of the assignment might affect how they respond to the survey).
Use the data from the web survey to compute network size, density, homophily, and kinship for male and female students.
Assess the consistency between the hypotheses and the findings.
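The four measures in this assignment can be computed per respondent as in the sketch below. The definitions used here are common ones but are assumptions: size is the number of alters, density is the share of alter pairs who know each other, homophily is the share of alters of the same sex as ego, and kinship is the share of alters who are kin. The record layout and the name `ego_measures` are hypothetical.

```python
from itertools import combinations


def ego_measures(ego_sex, alters, alter_ties):
    """Compute basic egocentric measures for one respondent.

    alters: list of dicts with keys "name", "sex", and "kin" (bool).
    alter_ties: set of frozensets, each holding two alter names who know each other.
    """
    n = len(alters)
    possible = n * (n - 1) // 2  # number of alter pairs
    names = [alter["name"] for alter in alters]
    observed = sum(
        1 for pair in combinations(names, 2) if frozenset(pair) in alter_ties
    )
    return {
        "size": n,
        "density": observed / possible if possible else None,
        "sex_homophily": sum(a["sex"] == ego_sex for a in alters) / n if n else None,
        "kin_share": sum(a["kin"] for a in alters) / n if n else None,
    }


# Made-up survey response for illustration.
example_alters = [
    {"name": "Ann", "sex": "F", "kin": True},
    {"name": "Bob", "sex": "M", "kin": False},
    {"name": "Cat", "sex": "F", "kin": False},
    {"name": "Dee", "sex": "F", "kin": True},
]
example_ties = {
    frozenset(("Ann", "Bob")),
    frozenset(("Ann", "Dee")),
    frozenset(("Bob", "Cat")),
}
example_measures = ego_measures("F", example_alters, example_ties)
```

Averaging each measure separately over male and female respondents gives the class-level comparison.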
Describe the General Social Survey and the 2004 network module.
Explain how to calculate the network measures in the statistical software program of choice.
Explain how to conduct a t-test to assess statistical significance.
Provide details of the second part of the homework assignment.
Calculate network measures from GSS.
Conduct paired samples t-tests to assess sex differences in network characteristics.
Option: use a graph to show the gender differences
Note which hypotheses may be accepted or rejected.
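A rough sketch of the significance test, using only the Python standard library. The notes call for paired samples t-tests; because male and female respondents are different people, this sketch swaps in Welch's two-sample t statistic instead, which is a substitution and not the module's stated method. The network sizes below are made up, and the p-value would still be looked up in the statistical software of choice.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's t statistic and approximate degrees of freedom.

    Assumes each sample has at least two values and nonzero variance.
    """
    na, nb = len(sample_a), len(sample_b)
    # Squared standard error contributed by each group.
    va = variance(sample_a) / na
    vb = variance(sample_b) / nb
    t = (mean(sample_a) - mean(sample_b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Made-up network sizes, not GSS data.
women_sizes = [3, 4, 2, 5, 3, 4]
men_sizes = [2, 3, 1, 2, 3, 1]
t_stat, dof = welch_t(women_sizes, men_sizes)
```

A large absolute `t_stat` relative to the t distribution with `dof` degrees of freedom would count as evidence of a sex difference in that measure.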
What are the differences/similarities between the GSS and classroom estimates?
What might explain the differences?
Other variables in the 2004 network module which could be examined include alter characteristics (sex, race, education, age, religion), social role of the alter (parent, child, spouse, sibling, coworker, group member, neighbor, friend, advisor), form of the relationship (closeness, frequency of contact).
Instructors might also do a similar exercise using the 2006 network module, which asks about the number of acquaintances who have various characteristics.
Instructors could also have students compare the differences between the 2004 findings and the findings from the 1985 GSS network module.
Owen Astrachan, Gary Marchionini, and Jeff Forbes
More to come...