Faculty Learning Community 2009-2010

This page contains notes, materials, and resources for the faculty learning community.

Meeting 1 - December 4, 2009

Networks as an Introduction to Computing

  • What the FLC is about:
    • Bringing people together to work on a program together
    • Build a community around networks as a potential introduction to computing
  • FLC Goals
    • Build interdisciplinary, cross-institutional community centered around teaching
    • Discuss exemplars in network science education and applications
      • What are great ideas in network science?
      • What problems best encapsulate these great ideas?
    • Contribute to the development or evaluation of modules
  • Discussion:
    • The idea is to build on network science and complexity because a solution to one problem can solve problems in many areas, even though communication across fields isn’t always strong
    • An example at the educational level is the Northwestern Network Institute of Technology and Organization (all engineering students must take an intro-level course, and there is a push to make it mandatory for all freshmen)
    • Theoretical aspect: relational approach to understanding the world
    • A good focal point for interdisciplinary work because it can be introduced easily by social sciences
  • Discussion: Can network science be applied to all fields?
    • No goodness of fit test
    • There is a similarity across fields (underlying structure) that allows the same tools to be used; the difference is in the interpretation and dynamics
  • Discussion: Agent Based Models
  • Small Groups: 3 interesting questions a student with no background in networks would want to address
    • Group 1:
      • Can a network (structure or network relationships) affect/influence behavior?
      • Which types of characteristics matter for networks? What types of features are important? Strong/weak ties? Centrality? Context?
      • Network scale: it affects what you can do with networks.
      • Boundary specification. Who’s in your network?
      • Networks as static or dynamic processes? What types of questions can you ask from each perspective?
      • Context: everyday lives, putting it in terms of their disciplines, (Facebook)
    • Group 2:
      • Pedagogical strategies:
        • Study the structure of networks (get data and ask how can it be characterized?)
        • Ask students to simulate processes on networks (how do you model that and what are the rules and activity behind it?)
        • Conduct experiments on the classroom network (record information spread, force a group decision/consensus) and be able to record every interaction
      • Local vs. Global Emergent Phenomena
    • Group 3:
      • Students gather and map their own data (Facebook friends/University network)
        • DARPA Weather Balloon Experiment at MIT (asking students to design a similar experiment)
      • Using data (How are Wikipedia articles linked together, information traversal)
      • Kauffman paper (Biological Boolean networks used to replicate differences in cells)
        • Students could simulate this experiment and run it
  • Best time:
    • Friday Mornings
    • 3 more meetings between now and May
  • Expected Outputs:
    • Courses
    • Feedback
    • Contributions
  • Weekly Network Group Mailing List: Email James Moody mailto:jmoody77@soc.duke.edu
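
Group 3's suggestion that students simulate a Boolean network could start from something like the following Python sketch. It is a minimal illustration, not code from the Kauffman paper: the network size, the number of inputs per node, the synchronous update rule, and all function names are assumptions chosen for the example. Each node gets random inputs and a random truth table, and the simulation runs until a state repeats, which marks an attractor cycle.

```python
import random

def make_network(n, k, rng):
    """Give each of n nodes k random input nodes and a random truth table."""
    inputs = [rng.sample(range(n), k) for _ in range(n)]
    tables = [[rng.randint(0, 1) for _ in range(2 ** k)] for _ in range(n)]
    return inputs, tables

def step(state, inputs, tables):
    """Synchronously update every node from the current state."""
    new_state = []
    for node_inputs, table in zip(inputs, tables):
        index = 0
        for source in node_inputs:
            # Read the input bits as a binary number that indexes the truth table.
            index = index * 2 + state[source]
        new_state.append(table[index])
    return tuple(new_state)

def find_attractor(state, inputs, tables):
    """Iterate until a state repeats; return (transient length, cycle length)."""
    seen = {}
    t = 0
    while state not in seen:
        seen[state] = t
        state = step(state, inputs, tables)
        t += 1
    return seen[state], t - seen[state]

rng = random.Random(0)  # fixed seed so a class can reproduce the same run
inputs, tables = make_network(n=8, k=2, rng=rng)
start = tuple(rng.randint(0, 1) for _ in range(8))
transient, cycle = find_attractor(start, inputs, tables)
print("transient steps:", transient, "cycle length:", cycle)
```

Students could vary n and k, or the starting state, and record how the number and length of attractors change.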

Meeting 2 - February 12, 2010

Articles - Spring Session 1

Social Isolation in America: Changes in Core Discussion Networks over Two Decades

By Miller McPherson, Lynn Smith-Lovin, and Matthew Brashears JSTOR Link

  • Used the General Social Survey to obtain national data that was then analyzed to answer a relevant research question
  • Can do longitudinal or demographic research on the United States with the survey, which has been conducted since the 1970s; the website is easy to use and accessible to everyone
  • This particular article researches core social networks and how the number of confidants with whom one shares important information has changed within a twenty-year span
    • More people in 2004 than 1985 discuss important matters with a spouse, yet fewer discuss important matters with other family or non-family members
    • People are more likely in 2004 to have a confidant of a different race
    • Loss of community or neighborhood confidants and stronger bonds within the nuclear family have arisen
  • Comment-and-reply about the stats modeling in American Sociological Review

Chains of Affection: The Structure of Adolescent Romantic and Sexual Networks

By Peter Bearman, James Moody, and Katherine Stovel JSTOR Link

  • Analyzes the structure of teenage romantic and sexual networks in one town’s high school
  • Compares generated simulations of what is expected to happen to real life outcomes; discovered that current models and the process used to construct them are not easily translated into real-world scenarios
    • Carries implications for disease transmission and public policy because new models potentially need to be made in terms of outbreak and spread of disease
  • The article goes in depth on the specific models of disease transmission found in the high school:
    • Sexual partner choices were not random (homophily principle), and models should take this into account; at the time they did not, because randomization was widely accepted
    • Types of physical models generated by the researchers: the Core Model, the Bridging Processes Model, and the Spanning Tree Model
  • The outcome of the research has implications for the development of preventative health measures and how they could/should be modeled to accommodate different settings

Centrality and Power Revisited: Actor Success in Group Decision Making

By Mark Mizruchi and Blyden Potts ScienceDirect

  • Behavior and attitude are relational in a group setting based on a person’s surroundings
  • The researchers use group decision making in order to analyze the power of centrality and whether it is as easy for a central individual to obtain desired outcomes as previously believed
    • “The model is based on a distinction between what we call ‘individual’ and ‘structural’ interests. An individual interest is a preference, for a particular outcome, that is exogenously formed. A structural interest is a preference, for a particular outcome, resulting from identifiable social constraints or influence, that may differ from what one’s preference would be in the absence of such constraints or influence. In this formulation, an actor has an initial preference for an outcome.”
  • The zero-sum principle was used to even the playing field so that no subject had more incentive to win
  • Findings:
    1. Centrality matters in understanding an actor’s power, but it does not always matter in the same way across different settings or situations
    2. Exactly how centrality affects an actor’s power is heavily determined by the structure of the network, not who is in it
    3. A significant aspect of the network structure involves the number of subgroups, and whether the central actor is in a position to break a deadlock amongst competing groups

Meeting Minutes

*The study of Sociology

  • Early on there were limited research capabilities
    • Limited to very small, basic social structures (monasteries, schools, etc.)
  • The development of network science (General Social Survey – McPherson’s paper on isolation)
    • To Dwell Among Friends study (Claude Fischer): talked about the features of a network (1970s)
    • The approach was generalized by the GSS to study “ego-networks” – a probability study of an individual taking into account their relations to contacts (aka network alters) – 1985
      • This method was chosen because it elicits the strongest ties and is able to pull in a wide variety of information outside of just the individual subject
      • Also, the method limits the number of such ties so that the researcher does not run into boundaries
      • Ego-networks allow researchers to address local questions (such as density, transitivity) by asking specific questions about the network alter

*The General Social Survey:

  • It is an anonymous survey representative of the English-speaking population within the United States above the age of 18
  • It is conducted via face-to-face interviews
  • Has the highest response rate of any national survey
  • Conducted yearly
  • Consists of a set of basic questions and a yearly onslaught of extra questions that can be submitted by researchers
  • Anyone can access the General Social Survey and the data (can run the exact models)

*McPherson/Smith-Lovin Paper

  • Asked an ego-centered question & published paper
  • Noticed that many people now (2004) have zero network contacts with whom to discuss important matters
  • Couldn’t believe the result, so the researchers tried to break it
  • The paper was reviewed by Fischer, who wrote a response
  • Discrepancies were later discovered in the 2004 data (41 out of 1493 cases; 25% had reported zeros in general)
  • Now the GSS is going back into the field and they will go in with the same questions asked before by McPherson and Smith-Lovin
    • Context experiment: same question that was originally asked will be asked in three different randomly assigned contexts
      1. The question is placed in a context as close to that of 2004 as possible (embedded in a module of questions focused on voluntary associations)
      2. Voluntary association questions, but more streamlined this time
      3. The question is placed as early in the non-core items as it can possibly be (early in the survey to counteract fatigue issues/as soon as possible after the core questions)
  • Current Battle: statistical models vs. tables and cross tabulations (battle of the generations)
    • Survey research is inherently less reliable and suffers from low response rates
    • Argument: sociologists must try their hardest to model these situations
    • The ability to think about social change: the meanings of the terms themselves change over time, and it is difficult to keep meanings the same in a longitudinal study

*Network Analysis:

  • Can the GSS survey do any higher order network science?
  • Is there a way to better analyze distinct social groups?
    • A grad student (Jeff Smith) is trying to construct the entire network of the United States based on the social isolation theory

*Ideas to implement with students:

  • Introduce the homophily structure (introduced by the context in which you meet people)
    • Survey them on their closest friends and get the friends’ data
  • Can demonstrate in the classroom
    • Defensive nature
  • Tracing of communicative acts (emails, cell phone logs, etc.)

Next Meeting: Think about the following questions

Given data sets such as the GSS, what types of modeling could students do? What types of questions would be interesting for them to answer? What experiments could be conducted?

*Date: April 2nd

Meeting 3 - April 2, 2010

Agenda

  1. Administrative stuff
    • Summer workshop July 8-9
    • FLC Honoraria, make sure you send information to Camelia Pierson Eaves revp@cs.duke.edu
  2. Identify areas of common interest
  3. Break up into 3 groups based on interest in a problem domain

Hollywood Hookup

  • 4 lectures
    1. Acquiring a Hollywood Hookup dataset
      • biases in acquisition, Scientology, etc.
    2. low-level network analysis & visualization
    3. dynamic graphs
    4. Compare Kevin Bacon graphs to Jefferson High
  • salacious question: who is more promiscuous: celebrities or high school students?
  • walk through the standard network analysis tools for one dataset
  • Compare across years

General Social Survey

  1. Process by which you analyze data
    1. workplace teams
  2. Important concepts
    1. degree centrality,
    2. homophilous attributes (gender, race, religion)
    3. density
    4. content of relationships
  3. Questions/Problems
    1. comparing 2004 vs. 2006
      1. ideas: alcohol use, religion,
    2. where do social isolates get their news from?
    3. do gun owners have more friends than non-gun owners?
  4. introduce with question
    1. students hypothesize with answer
    2. actually answering

Facebook/LinkedIn

  • Study your own networks
    • Terms of service issues
    • Archaeology: Getting the data
  • Cluster/bin people by their type of profile
    • Demographics of your friends
  • Identify problems rather than solve problems
  • Problem: Come up with a strategy for binning friends
    • Honesty?, word counts
    • Pick a profile attribute (e.g. music)
      • Count # music types
      • Genres
      • Diversity of music interest
    • Correlations between different attributes
  • Possible types: Masqueraders, exhibitionists, reclusives
    • How do you detect?
      • Take small number that you know - look for indicators
        • Small set of if/then rules
          • if someone has more than 5 music categories but only 2 movie categories, then music
  • Techniques
    • What’s hard to automate?
    • looking at words vs. phrases
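
The "small set of if/then rules" idea above could be prototyped along these lines. This is a hypothetical Python sketch: the profile layout (a dict of category lists), the reading of "only 2" as "2 or fewer", the mirror-image movie rule, and the labels are all assumptions made for illustration. Only the first rule comes from the notes.

```python
def bin_profile(profile):
    """Assign an illustrative interest label from counts of listed categories."""
    music = len(profile.get("music", []))
    movies = len(profile.get("movies", []))
    # Rule from the notes: more than 5 music categories but only 2 movie categories.
    if music > 5 and movies <= 2:
        return "music"
    # Assumed mirror-image rule, added only to show how a rule set would grow.
    if movies > 5 and music <= 2:
        return "movies"
    return "unclassified"

# Made-up profile used only to exercise the rules.
example = {
    "music": ["jazz", "blues", "folk", "rock", "soul", "funk"],
    "movies": ["drama", "comedy"],
}
print(bin_profile(example))
```

Starting from a handful of friends whose type is already known, students could check which rules fire and refine the thresholds by hand.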

Next Meeting: Write up and present results for problem

  • Date: May 13, 9:30-11:00

Meeting 4 - May 13, 2010

Groups presented their ideas for a module.

Hollywood Hook-Up Module

David Banks, Ketan Mayer-Patel, and Jim Moody

Goal

Background: This is a set of lesson plans for one-hour classes on network science. The target audience is undergraduates who have taken an introductory college course in mathematics, statistics, computer science, or some other field that conveys an understanding of modeling.

At the end of this module students will

  • understand issues related to data collection and data quality in the context of network surveys.
  • understand how summary statistics can distinguish visually significant and interpretable differences between networks.
  • understand how statistical modeling can iteratively construct formulae that provide useful and meaningful approximations to the observed networks, enabling insight into the processes that produced those networks.

Class 1: Data Collection and Quality

Before the lecture, each student is given the name of a celebrity and asked to track their sexual contacts, and their sexual contacts’ contacts.

The lecture will show two-step diagrams for some celebrity names, to illustrate the visualization.

The lecture will discuss contact tracing for STDs and other diseases (e.g., the CDC smallpox protocol, SARS, etc.).

The lecture will say that other things flow along networks—information, money, drugs.

Introduce the “six degrees of separation” concept.

The lecture will discuss the issue of data quality. Have all hook-ups been recorded? Have some hook-ups been faked? (Mention the recent _30 Rock_ episode in which Jenna is hired to fake a celebrity romance to dispel rumors about a star’s fondness for Japanese dolls.) How does one define a hook-up, and should there be a distinction between, say, a long-term marriage partner and a short-term affair?

Illustrate the ideas using the hook-up database http://www.whosdatedwho.com/people/. Show how the network changes when one adds noise to the data. Show how the flow through a network is affected.
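
One way to demonstrate the effect of noise is sketched below in Python. It is an illustration under stated assumptions, not the module's code: the node labels are placeholders rather than real people, the noise model (drop each tie with a fixed probability, then add a few spurious ties) is one simple choice among many, and "flow" is approximated by the set of nodes reachable from a starting node.

```python
import random
from collections import defaultdict

def add_noise(edges, nodes, drop_prob, n_fake, rng):
    """Drop each recorded tie with probability drop_prob, then try to add n_fake spurious ties."""
    noisy = {tuple(sorted(e)) for e in edges if rng.random() >= drop_prob}
    attempts = 0
    while n_fake > 0 and attempts < 1000:  # bounded so a dense graph cannot loop forever
        attempts += 1
        tie = tuple(sorted(rng.sample(nodes, 2)))
        if tie not in noisy:
            noisy.add(tie)
            n_fake -= 1
    return noisy

def reachable(edges, start):
    """Return every node that can be reached from start along undirected ties."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    seen = {start}
    frontier = [start]
    while frontier:
        node = frontier.pop()
        for other in neighbors[node]:
            if other not in seen:
                seen.add(other)
                frontier.append(other)
    return seen

# Placeholder network: letters stand in for names from the data set.
nodes = ["A", "B", "C", "D", "E", "F"]
recorded = [("A", "B"), ("B", "C"), ("C", "D"), ("E", "F")]
noisy = add_noise(recorded, nodes, drop_prob=0.25, n_fake=1, rng=random.Random(1))
print("reachable from A, recorded:", sorted(reachable(recorded, "A")))
print("reachable from A, noisy:   ", sorted(reachable(noisy, "A")))
```

Repeating the run with different seeds and noise levels shows how a single missing or faked tie can split or join whole groups.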

Class 2: Mathematical Properties of the Hook-Up Network

Define vertex degree, and calculate it for the Hollywood Hook-Up network.

Relate average node degree to flow through a network. Remind people of the STD example.

Compare average node degree for the Hollywood hook-up network in different decades. Compare the average node degree to the “Jefferson High” data. Are students more/less promiscuous than actors?
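
The degree calculations above can be sketched in a few lines of Python. The two edge lists are made-up placeholders, not the Hollywood or "Jefferson High" data, and the function names are illustrative. Note that people with no recorded ties do not appear in an edge list, so this average is taken over connected people only.

```python
from collections import Counter

def degrees(edges):
    """Count how many ties each person has in an undirected edge list."""
    deg = Counter()
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg

def average_degree(edges):
    """Mean degree over the people who appear in at least one tie."""
    deg = degrees(edges)
    return sum(deg.values()) / len(deg) if deg else 0.0

# Placeholder networks: a hub with three partners, and a chain of five people.
hub_network = [("A", "B"), ("A", "C"), ("A", "D")]
chain_network = [("P", "Q"), ("Q", "R"), ("R", "S"), ("S", "T")]

print("hub degrees:", dict(degrees(hub_network)))
print("hub average:", average_degree(hub_network))
print("chain average:", average_degree(chain_network))
```

The same two functions could be applied to one edge list per decade to make the comparison across years.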

Compare the average node degree to the on-screen relationship data found at http://www.whosdatedwho.com/couples/on-screen-couples.asp.

Discuss other mathematical functions of networks: centrality, betweenness, diameter, triad completion. Which provide interpretable information in this context?

Assignment: Calculate these summary measures for 3-step neighborhoods of different celebrities.

Assignment: Do the six-degrees of separation game for the hook-up data; it is available at http://www.whosdatedwho.com/dating/six-degrees.asp.
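
For students who want to see what the six-degrees game computes, a breadth-first search gives the smallest number of ties separating two people. The Python sketch below uses a placeholder edge list and a hypothetical function name; it is not how the website implements its game.

```python
from collections import deque

def degrees_of_separation(edges, source, target):
    """Return the fewest ties linking source to target, or None if unconnected."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    if source not in neighbors or target not in neighbors:
        return None
    distance = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return distance[node]
        for other in neighbors[node]:
            if other not in distance:
                # First visit is along a shortest path, so record it once.
                distance[other] = distance[node] + 1
                queue.append(other)
    return None

# Placeholder ties: letters stand in for names from the data set.
ties = [("A", "B"), ("B", "C"), ("C", "D"), ("E", "F")]
print(degrees_of_separation(ties, "A", "D"))
print(degrees_of_separation(ties, "A", "F"))
```

The largest such distance over all connected pairs is the diameter mentioned in this class.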

Class 3: Mathematical Modeling

Describe the p_1 model.

Fit the p_1 model, and estimate the “extroversion” parameter for various celebrities.
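
For reference, the p_1 model of Holland and Leinhardt is usually written for a directed graph with adjacency matrix x as the log-linear form below. The notes do not say how the module adapts it to hook-up ties, which are naturally undirected, so this standard directed form is given as background rather than as the module's exact specification.

```latex
\[
P(X = x) \;=\; \frac{1}{\kappa(\theta, \rho, \alpha, \beta)}
\exp\Big( \theta\, x_{++} \;+\; \rho\, m \;+\; \sum_{i} \alpha_i\, x_{i+} \;+\; \sum_{j} \beta_j\, x_{+j} \Big),
\qquad \sum_i \alpha_i = \sum_j \beta_j = 0 .
\]
```

Here x_{++} is the total number of ties, m is the number of mutual (reciprocated) pairs, x_{i+} is the out-degree of node i, and x_{+j} is the in-degree of node j. The parameter θ governs overall density, ρ governs reciprocity, α_i is the expansiveness of node i (presumably what these notes call the “extroversion” parameter), β_j is the attractiveness of node j, and κ is the normalizing constant.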

Elicit from the class covariates pertinent to the hook-up network.

With the covariates “gender” and “co-starring”, refit the p_1 model.

Discuss the issue of fit. Compare the models visually.

Discuss the problem of unknown/unmeasured covariates.

Assignment: Students will use instructor-provided code to fit models with other covariates: A-list vs. B-list, age.

Sex Differences in Social Connectedness

At the end of this module, students will

  • Know the definitions of the basic measures used in egocentric network analysis.
  • Have learned how to calculate the basic egocentric network measures.
  • Have a better understanding of the issues involved in collecting network data via survey methods.
  • Have experience analyzing network data using a statistical software program.
  • Gain an understanding of hypothesis development and testing procedures.
  • Learn how to use network analysis to solve sociological problems.

BACKGROUND

Audience

  • Introductory network analysis class for undergraduate or graduate students

Time period

  • 2 weeks

Goals

  • Introduce egocentric network measurement, data collection, and analysis
  • Provide an understanding of hypothesis development and testing
  • Generate interest in using network analysis to solve sociological problems

Assessments

  • Completion of online survey
  • Homework assignment

PROJECT DESCRIPTION

Problem

  • How do the social networks of women differ from the networks of men? Specifically, to what extent do male and female networks differ in terms of size, density, homophily, and kinship?

Data sources

  • General Social Survey
  • Classroom network data (collected via web survey)

Prerequisite skills

  • Some familiarity in working with databases
  • Some experience with statistical software (SPSS, SAS, Stata, or R)

Expected outcomes

  • Complete survey on ego-network characteristics
  • Test hypotheses about sex differences in networks using national data
  • Compare the results from the two data sources

MATERIAL PROVIDED

  • GSS 2004 dataset
  • Access to web survey
  • Access to statistical software program

DISCIPLINARY KNOWLEDGE AND SKILLS

  • Basic egocentric network measurements
  • Ego-network data collection strategies
  • Survey data analysis with statistical software program
  • Hypothesis development and testing
  • Network measurement calculation
  • Statistical significance testing via paired samples t-tests

TIMELINE

Week 1
  • Introduce the problem.
  • Discuss differences between men and women, male and female networks.
  • Discuss the various ego-network measures:
    • Degree centrality, density, homophily, alter attributes, relationship form/content
    • Option: Include a broader set of network measures (see the list below)
    • Option: Split the class into groups to deal with different measures.
    • Option: Organize groups based on similar hypotheses.
  • Have students develop their own hypotheses for sex differences in networks.
  • Discuss how one might test the hypotheses with data.
  • Provide instructions for completing first part of the homework assignment.
    • Develop a hypothesis for each of the proposed sex differences in network characteristics.
    • Option: have students use a source (research article, news article, blog post) to support at least one of the hypotheses.
    • Complete a web survey that replicates questions from the GSS.
    • Option: tell students to get 2 of their friends to fill out the survey (knowing the point of the assignment might affect how they respond to the survey).
Week 2
  • Use the data from the web survey to compute network size, density, homophily, and kinship for male and female students.
  • Assess the consistency between the hypotheses and the findings.
  • Describe the General Social Survey and the 2004 network module.
  • Explain how to calculate the network measures in the statistical software program of choice.
  • Explain how to conduct a t-test to assess statistical significance.
  • Provide details of the second part of the homework assignment.
    • Calculate network measures from GSS.
    • Conduct paired samples t-tests to assess sex differences in network characteristics.
    • Option: use a graph to show the gender differences
    • Note which hypotheses may be accepted or rejected.
    • What are the differences/similarities between the GSS and classroom estimates?
    • What might explain the differences?
Additional options
  • Other variables in the 2004 network module which could be examined include alter characteristics (sex, race, education, age, religion), social role of the alter (parent, child, spouse, sibling, coworker, group member, neighbor, friend, advisor), form of the relationship (closeness, frequency of contact).
  • Instructors might also do a similar exercise using the 2006 network module, which asks about the number of acquaintances who have various characteristics.
  • Instructors could also have students compare the differences between the 2004 findings and the findings from the 1985 GSS network module.
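
The Week 2 calculations of network size, density, homophily, and kinship could look roughly like the Python sketch below. The record layout is a hypothetical one invented for the example (it is not the GSS file format), the respondents are made up, and the definitions used are the common egocentric ones: density as the share of alter pairs who know each other, homophily as the share of alters of the same sex as the respondent, and kinship as the share of alters who are kin. Significance testing is left to the statistical package.

```python
from itertools import combinations
from statistics import mean

def ego_measures(ego):
    """Compute size, density, sex homophily, and kinship share for one respondent."""
    alters = ego["alters"]
    size = len(alters)
    pairs = list(combinations(range(size), 2))
    known = {frozenset(tie) for tie in ego["alter_ties"]}
    density = sum(frozenset(p) in known for p in pairs) / len(pairs) if pairs else None
    homophily = sum(a["sex"] == ego["sex"] for a in alters) / size if size else None
    kinship = sum(a["kin"] for a in alters) / size if size else None
    return {"size": size, "density": density, "homophily": homophily, "kinship": kinship}

def mean_size_by_sex(egos):
    """Average network size separately for each respondent sex."""
    sizes = {}
    for ego in egos:
        sizes.setdefault(ego["sex"], []).append(ego_measures(ego)["size"])
    return {sex: mean(values) for sex, values in sizes.items()}

# Made-up respondents; alter_ties lists pairs of alters (by position) who know each other.
respondents = [
    {"sex": "F",
     "alters": [{"sex": "F", "kin": True}, {"sex": "M", "kin": False}, {"sex": "F", "kin": False}],
     "alter_ties": [(0, 1)]},
    {"sex": "M",
     "alters": [{"sex": "M", "kin": False}],
     "alter_ties": []},
]
for person in respondents:
    print(person["sex"], ego_measures(person))
print(mean_size_by_sex(respondents))
```

Measures that are undefined for a respondent (for example, density with fewer than two alters) are returned as None so they can be excluded before any averaging or testing.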

Fakebook

Owen Astrachan, Gary Marchionini, and Jeff Forbes

More to come...

 
harambenet/flc09.txt · Last modified: 2010/06/01 18:02 by forbes
 