Networks Faculty Learning Community

This Faculty Learning Community (FLC) will design, implement, and evaluate curricular modules on networks. The Duke Center for Instructional Technology (CIT) will provide administrative support for the FLC. CIT states that FLCs provide a venue for faculty to develop their knowledge of teaching, learning, and pedagogy; build a supportive community for each other; and create one or more products useful to the University community and beyond. The FLC for this project will meet monthly over the course of the 2007-2008 academic year.

Resources

  • Papers for FLC meeting on December 13, 2007
    • D.L. Banks and G. Constantine. “Metric Models for Random Graphs.” Journal of Classification Vol. 15 (1998): 199-224.
      • Here is a summary of the general points in Professor Banks’ article on Metric Analyses.
    • James Moody, Daniel McFarland, and Skye Bender-deMoll. “Dynamic Network Visualization.” American Journal of Sociology Vol. 110, No. 4 (January 2005): 1206-1241.
      • Here is an overview of Professor Moody’s article.

Background

Conferences & Meetings

  • Sunbelt
  • NetSci

This page is very much under construction

Goals

Each of us should enter our FLC goals here.

Astrachan

  • Owen Astrachan, Professor of the Practice of Computer Science

Solve problems.

Banks

  • David Banks, Professor of the Practice of Statistics

Cummings

  • Jonathon Cummings, Associate Professor of Management, Fuqua School of Business

Forbes

  • Jeff Forbes, Assistant Professor of the Practice of Computer Science
  • Module idea
    • How can users’ listening history and profile on Facebook, specifically friendship connections and listed musical interests, help a system effectively recommend new songs to a user?
    • How is the induced music neighbor network similar to the Facebook network?
    • Given that information on Facebook links will be incomplete due to limited access and participation, how can we estimate centrality values for the actual network?
    • Does Chatter Matter? The Impact of User-Generated Content on Music Sales: Can we verify the result from this paper? What if we look at the actual songs users listen to? How does position in the network affect order of adoption (e.g. are central users early adopters?)?

Lobo

  • Miguel Lobo, Assistant Professor of Decision Sciences, Fuqua School of Business

Moody

  • James Moody, Associate Professor of Sociology
    1. Which centrality measures are most useful for different sorts of diffusion? This module would build on two dimensions: (1) the nature of the good being diffused and (2) the stability of the network’s edges.
    2. Are there particular ways to visualize a network that are best suited to conveying information about centrality? How does this differ by the type of centrality?
    3. Are there features of the local network (within 1 or 2 steps of a focal node) that can reasonably capture most of the centrality information from a global (i.e. fully collected) network? This matters particularly for system-level centrality scores, such as information, betweenness, and closeness centrality (it is not an interesting question for degree centrality, which is already local).
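
Question 3 can be tried out on a toy example. The sketch below is a minimal illustration, not FLC material: it compares closeness centrality, which needs the whole graph, with a purely local count of the nodes within two steps of a focal node. The five-node chain and the function names are assumptions made up for the example.

```python
from collections import deque

def bfs_distances(adj, source):
    """Return hop counts from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def closeness(adj, node):
    """Global measure: (n - 1) / sum of distances; assumes a connected graph."""
    total = sum(bfs_distances(adj, node).values())
    return (len(adj) - 1) / total if total else 0.0

def two_step_reach(adj, node):
    """Local measure: how many other nodes lie within two steps of node."""
    return sum(1 for d in bfs_distances(adj, node).values() if 1 <= d <= 2)

# Hypothetical toy network: a simple chain a-b-c-d-e.
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

for node in sorted(adj):
    print(node, round(closeness(adj, node), 3), two_step_reach(adj, node))
```

On this chain the local count orders the nodes the same way closeness does (the middle node scores highest on both). Whether that agreement holds on real, larger networks is exactly the empirical question the module would ask.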

Rodger

  • Susan Rodger, Associate Professor of the Practice of Computer Science

Socolar

  • Joshua Socolar, Associate Professor of Physics
    • Explaining the concepts of betweenness and community (modular) structure. I’d like it to explore some issues in data gathering as well as the analytical tools.
      • An example of what I have in mind would be a module dealing with the following questions: Based on co-authorship data for Duke professors in the natural sciences, can one detect communities defined by departmental boundaries? Do some professors play special roles in establishing the community structure? Are there any identifiable communities that are interdepartmental in nature?
      • The module would have to teach the concept of betweenness centrality and modularity. I suppose it would start with simple examples where the answers one seeks are obvious, then introduce more formal algorithms for calculating quantities that reflect those “obvious” intuitions, and finally apply the algorithms to more complex network data. The latter portion of the module would deal also with the problems associated with discovering or constructing the network data.
    I have strong interests in nonlinear dynamics and complex systems in general and 
    systems biology (genetic regulatory networks) in particular.
    
    My goal for this FLC project is to collaborate on development of modules that will 
    introduce undergraduate students to fundamental concepts of network structure and dynamics 
    early in their academic careers.  I am particularly interested in modules for teaching about 
    percolation transitions and long-time behavior of Boolean networks.
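
As a starting point for the "simple examples where the answers are obvious" step, here is a minimal sketch of betweenness centrality for an unweighted, undirected graph, using the standard Brandes accumulation scheme. The six-node graph (two triangles joined by one bridge) and all names are illustrative assumptions, not the Duke co-authorship data.

```python
from collections import deque

def betweenness(adj):
    """Unnormalized betweenness for an unweighted, undirected graph."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        order = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating dependencies.
        delta = {v: 0.0 for v in adj}
        while order:
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted once from each endpoint.
    return {v: b / 2 for v, b in bc.items()}

# Hypothetical toy graph: triangles {a, b, c} and {d, e, f} joined by c-d.
edges = [("a", "b"), ("a", "c"), ("b", "c"),
         ("c", "d"),
         ("d", "e"), ("d", "f"), ("e", "f")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

bc = betweenness(adj)
for node in sorted(bc):
    print(node, bc[node])
```

The two bridge endpoints, c and d, carry every shortest path between the triangles and so get the only nonzero scores, which matches the "obvious" intuition about who connects the two groups.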

FLC Minutes April 14, 2008

Agenda

  • Last Official Meeting Agenda
    • Do we want to get together again to put the module together?
      • Working sessions
    • Data that we have
    • End of project bookkeeping

Next Session

  • Working Sessions
    • Proposed May 2, 9, 16, 23, or 1, 8, 15, 22
    • Lunch time meetings
    • Looks like Thursday May 8th would work for everyone

Module We Have and Where It Is Going

  • Modules
    • A module should be defined as focused around a problem.
    • Given our current co-authorship network, how can we detect community structure? What is our overall goal?
      • Each person from FLC brings different perspective
      • We have the data from Duke
        • Have list of nodes
        • Have subjects of interest
    • What are the different Modules that may come out?
      • First module asks certain questions and answers them: how does this network work? What are its pitfalls? Why do we care about these pitfalls?
      • A second module gets into deeper issues: what data-gathering problems will make a difference for this algorithmic process?
        • Is it interesting to look at triple structures versus binary structures?
      • A third module could be for more advanced levels: what are the hidden assumptions here from a statistician’s point of view? What assumptions are we making that aren’t actually true? How can we correct for this?
        • May not be able to give whole analysis, give a flavor of this type of problem, create a module accessible to undergraduates
    • What can we actually put in front of undergraduates in modules?
      • Students will learn the most from playing with algorithms, asking them to look into problems, explore issues
    • What is everyone going to actually do?
      • Socolar: working with Mathematica
        • Creating an example for students to get their hands on; will eventually have to be moved from Mathematica
        • Have to find a problem that stumps the current program in Mathematica
      • Banks: Focused on problem of triads
      • Moody: different models for community finding that are out there
      • Cummings: Why is community structure important? What does finding this data do for us?
    • Everyone works on their piece, don’t focus much on introductions
    • In next meeting we will figure out where to put ideas together, where things can be collaborated and commented on
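
One way to make "how can we detect community structure?" concrete for students is to score a proposed grouping with modularity. The sketch below is a minimal illustration under stated assumptions: it uses the usual per-community form of modularity, Q = sum over communities of (internal edges / m) - (total degree / 2m)^2, on a made-up six-node graph. The graph, the labels, and the function name are hypothetical.

```python
def modularity(edges, community):
    """Modularity of a partition of an undirected, unweighted graph.

    edges: list of (u, v) pairs; community: dict mapping node -> label.
    """
    m = len(edges)
    internal = {}    # edges with both ends in the same community
    degree_sum = {}  # total degree of the nodes in each community
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(internal.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree_sum.items())

# Hypothetical toy graph: two triangles joined by a single edge.
edges = [("a", "b"), ("a", "c"), ("b", "c"),
         ("c", "d"),
         ("d", "e"), ("d", "f"), ("e", "f")]

by_triangle = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_group = {node: 0 for node in by_triangle}

q_split = modularity(edges, by_triangle)
q_single = modularity(edges, one_group)
print(round(q_split, 4), round(q_single, 4))
```

Splitting along the bridge scores 5/14 (about 0.357), while lumping everyone into one group scores 0, so the measure prefers the grouping a reader would pick by eye. A community-finding method then amounts to searching for partitions with high scores.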

Data For Next Year

  • Everyone should check to make sure their first payments have been given
    • Second payments due when finished (probably end of May)
    • Will be a finishing survey
  • Goals for next year
    • Want to continue this project next year
    • Want to include people from UNC, NC A&T
    • Next year meetings will be once every two months

FLC Minutes March 24, 2008

Agenda

  • Looking at the data, thinking through a module
  • Now that we have the data, how do we want to use it to make a module?

Data and Our Ideas About it

  • Idea: Consider all faculty in the sciences at Duke and look at co-authorship networks (direct co-authorship, and sharing a third co-author)
    • Could you tell by looking at relationships if the university is organized in a logical way? Do organizational patterns emerge?
    • Is there data available already to help us get at this? We are hoping co-authorship will solve this
    • How do you determine community structures?
  • How can different perspectives contribute to this problem?
    • David Banks: there is going to be clumping within departments, will this cloud the data?
    • How do you detect communities within graphs?
      • Can also look at this from different departmental perspectives
      • Is this important educationally?
      • How do these skills apply to other problems?
    • Extracting community structure should be at the heart of this
  • Could we better deal with information about students?
    • There would be very dense graphs
    • Could get data on who took classes together
    • What questions could you ask about this? This is a more intractable problem
  • We could also look at data for keyword similarities in titles
    • This might provide more interesting data
  • Banks has data on Wikipedia
    • This would be a much longer problem
  • Data we are getting is from faculty database
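
The co-authorship idea above can be sketched in a few lines. This is a minimal illustration that assumes the faculty database can be reduced to one author list per paper; the names and papers are invented, and the function names are hypothetical. It builds the direct co-authorship graph and then lists pairs who have never written together but share a third co-author.

```python
from itertools import combinations

def coauthor_graph(papers):
    """Link every pair of authors who appear on the same paper."""
    adj = {}
    for authors in papers:
        for a in authors:
            adj.setdefault(a, set())
        for a, b in combinations(sorted(set(authors)), 2):
            adj[a].add(b)
            adj[b].add(a)
    return adj

def shared_coauthor_pairs(adj):
    """Pairs with no joint paper who nevertheless share a co-author."""
    pairs = set()
    for neighbors in adj.values():
        for a, b in combinations(sorted(neighbors), 2):
            if b not in adj[a]:
                pairs.add((a, b))
    return pairs

# Hypothetical author lists, one per paper.
papers = [["Ada", "Ben", "Cy"], ["Cy", "Dee"], ["Dee", "Eve"]]

adj = coauthor_graph(papers)
indirect = shared_coauthor_pairs(adj)
print(sorted(indirect))
```

Real data would first need the name disambiguation step discussed in these minutes (deciding "who is who"), which this sketch skips entirely.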

Creating Our Modules

  • Now we have data, what are we going to do with it?
    • Might be interesting to have a module just about gathering data
      • There are interesting problems about how to determine who is who and what data is valid
    • Original idea to pair up and work on parts of the module
      • One group needs to work on background information
      • Need pairs to work on different parts and Forbes hopes to compile as finished modules
    • How do we want to compile our ideas?
      • Something like an instructor’s manual
      • Need to get at assignments, how is this going to be used?
      • Want to be able to give this to non-experts and allow it to be used by all/many departments
      • Learning Objectives
    • Do we want to work up through smaller problems or do we want to work on definitions?
      • Issue, work through small problems, think through issues, work up to larger problems that might be more department oriented
    • Our modules need to be for instructors, not students, at least yet
    • We want to generate as much raw material as possible
    • Look at the probabilistic method, the Newman method, etc.
    • We want different perspectives to bring them all together
    • There isn’t one way to do it, we want to see different sides of the same problem
    • We want smaller problems, things that are reasonable for students to come to different conclusions about
  • Want problems that different disciplines can examine and solve in different ways, so that students can take that knowledge back into their own disciplines and apply it to different papers, theories, etc.

FLC Minutes February 8, 2008

Agenda

  • What are the themes we want to deal with in our module?
  • How can we use these themes to make modules?
  • What is each person going to contribute?

Themes

  • Centrality
    • Meaningfulness of Centrality in Citation Networks by Jeffrey Demaine (Sunbelt)
  • Position in the Network
    • Network flows
  • Problems become a lot harder with larger networks
    • Sampling methods
    • Statistics, adaptive sampling
    • Sociology, RDS (respondent-driven sampling) – Heckathorn
    • Recall is also difficult
  • Finding data sources and exploiting those
    • Facebook
    • Email networks
    • International (trade networks, etc)
    • Problem with this is that we are exploiting networks that already exist; maybe we should be constructing the problems first, then collecting data
      • Interesting ideas about anonymity
      • Do you use local network, how much local data do you need to recreate a global scale?
    • What kind of other data can we find?
      • Maybe cell phone data

Using Themes to Make Modules

  • What can we do in terms of making these things modules?
    • What is in a module?
      • Centering on a question or problem (or a set)
        • Do not want to have a “centrality” module
        • Argument to this: in order to use in a class you need themes, concrete problems
        • Something you want to have the answer to
      • Should contain some computer science
      • Should be relatively self-contained as an idea

How Do We Make Modules?

  • How should we go about making these modules?
    • We picked centrality as a starting point for a reason
    • Need backup information
    • We want to use our outlined themes
    • Work in pairs to ensure things get done
    • Suggest that everyone creates 1-2 modules, need to respond with 2-3 concrete points so we can make pairs
      • Need to come up with unique ideas revolving around three main themes: centrality, position in the networks, large networks
      • You need to come prepared to talk about what part of the module you want to work on
    • Questions about coordination and how much time needs to be spent on getting these ideas
      • Need 1 hour’s worth of thinking on this

FLC minutes Dec 13, 2007

Metric Models for Random Graphs

  1. Social network as the application
    • Random graphs are independent and identically distributed (IID) draws from a distribution that is unknown in advance
    • Identical distribution—all samples drawn from the same distribution (repeated draws from the same population)
    • Estimate the center of distribution, dispersion (“standard deviation” or variance) around the center
    • Classroom of students, edges b/w people who are friends, no edges b/w people who aren’t friends, nodes are fixed, just choosing edges
    • Try to pin down what a “friendship” is in order to assign edges
    • Problem: don’t know true distribution
      • Build a model with normal distribution (p.201)
      • g*: the typical (central) graph; the closer g is to g*, the more likely the graph g is
      • How tightly graphs concentrate around g* is governed by the parameter “tau”
      • Represents maximum entropy
      • Makes probability as normal as possible given mean, g, g*
    • Example: difference is difference in number of edges between 2 graphs
      • Can create a distance metric in which a girl-boy difference does not count the same as a boy-boy difference
    • Phylogenetic (development of speciation) tree—take samples of hemoglobin and build a tree
      • Build another different tree, shouldn’t be too different
      • Concerned about high differences being big differences
    • Cluster analysis of word count
      • Want a metric that penalizes discrepancies at the bottom of the tree
      • Metric should reflect the scientific purpose
    • 2 choices—Hamming and Euclidean, but then there are choices about components of vectors
    • Bernoulli graphs are a special case (p.203)
    • Eq. 7 (p.204): Hamming metric allows for geometry of graphs
    • Eq. 8 (p.204): majority of sample has an edge between A and B, then you report that A and B are friends
    • Problem: different students’ perception of “friendship” but if normalized then it would change values (slightly adds to error value)
    • Assessing goodness of fit (2 ways)
      1. looks at symmetry around vertex in hypercube as you move away from vertex (no symmetry=need new model)
      2. looks at tails to see if there is large exponential difference
    • Mixture model: try to account for unknown variables (e.g., if gender is unknown)
    • Establishing confidence region (p.206)
  2. Metric models for tree-valued random graphs: looking at classification trees
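
The Hamming metric and the majority rule described in these notes can be sketched as follows. This is a minimal illustration, not the paper's code: each reported friendship graph is a set of undirected edges on a fixed set of students, the Hamming distance counts the edges on which two graphs disagree, and the central graph keeps an edge when more than half of the samples report it. The probability model is an assumption for the example, written in the form P(g) proportional to exp(-tau * d(g, g*)); for the Hamming metric its normalizing constant works out to (1 + exp(-tau))^M, with M the number of possible edges. The student names and all function names are hypothetical.

```python
import math
from itertools import combinations

def edge(u, v):
    """An undirected edge, stored so that (u, v) equals (v, u)."""
    return frozenset((u, v))

def hamming(g1, g2):
    """Number of edges present in exactly one of the two graphs."""
    return len(g1 ^ g2)

def majority_graph(samples):
    """Keep an edge if more than half of the sampled graphs contain it."""
    counts = {}
    for g in samples:
        for e in g:
            counts[e] = counts.get(e, 0) + 1
    return {e for e, c in counts.items() if c > len(samples) / 2}

def log_prob(g, center, tau, n_pairs):
    """Log-probability under the assumed model exp(-tau * Hamming distance)."""
    return -tau * hamming(g, center) - n_pairs * math.log(1 + math.exp(-tau))

# Hypothetical classroom of three students, as reported by three observers.
nodes = ["ann", "bob", "cat"]
possible_edges = [edge(u, v) for u, v in combinations(nodes, 2)]
samples = [
    {edge("ann", "bob"), edge("bob", "cat")},
    {edge("ann", "bob")},
    {edge("ann", "bob"), edge("bob", "cat"), edge("ann", "cat")},
]

center = majority_graph(samples)
distances = [hamming(g, center) for g in samples]
print(sorted(tuple(sorted(e)) for e in center), distances)
```

Here the central graph keeps ann-bob and bob-cat, and the three reports sit at Hamming distances 0, 1, and 1 from it. A larger tau in the assumed model concentrates more probability on graphs close to the center.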

Dynamic Network Visualization

  1. Open issues in visualizing dynamic networks
    • Aggregates of past events
    • Problems of time
    • Thousands of bits of information
    • Complex models
    • Distorted imaging
  2. Visualization examples: movie or static?
    • A static layout with emerging edges can work well when the cumulative graph is sparse.
    • If static, the standard set of aesthetic criteria apply but we need a way to capture change over time
    • Multiple plots—suffers from anchoring problem
    • Use one dimension to represent time (lost the spatial correlate to distance and structure)
    • Here abstract from points and lines to groups (cliques) active at any given time. Lines follow nodes, colors denote group.
    • Examples: teacher as vertex, distances represents interactions with students
  3. Open problems with dynamic movies
    • Visual memory—details lost due to memory
    • Replication—given the random nature of algorithms it is difficult to ensure the “same” movie runs
    • Spurious movement—system level changes affect local level movement and could lead to some bouncing around
    • Measuring fit
      • Specific: each algorithm is optimizing a feature, want to match that feature to a quantitative measure of fit to know if one layout is really accurate/better than another
      • General: want to know if screen image really reflects underlying social process accurately (currently use stress statistic, comparing observed to actual)
    • Mapping effects of temporal change
      • Solution: dynamic line graph (convert every edge to a node and draw a directed arc between edges that share a node and precede each other in time; if they are concurrent, the arc is made symmetric)
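
The dynamic line graph construction in the last bullet can be sketched directly. This is a minimal illustration under a simplifying assumption: each edge carries a single time stamp rather than a start and end time. Every timed edge becomes a node, an arc runs from the earlier edge to the later one when they share an endpoint, and two concurrent edges get arcs in both directions. The data and the function name are made up for the example.

```python
from itertools import combinations

def dynamic_line_graph(timed_edges):
    """Return arcs (i, j) between indices of timed edges (u, v, t)."""
    arcs = set()
    indexed = list(enumerate(timed_edges))
    for (i, (u1, v1, t1)), (j, (u2, v2, t2)) in combinations(indexed, 2):
        if not {u1, v1} & {u2, v2}:
            continue  # the two edges share no endpoint
        if t1 <= t2:
            arcs.add((i, j))  # edge i precedes (or coincides with) edge j
        if t2 <= t1:
            arcs.add((j, i))  # edge j precedes (or coincides with) edge i
    return arcs

# Hypothetical contact data: (person, person, time step).
timed_edges = [("a", "b", 1), ("b", "c", 2), ("c", "d", 2), ("a", "d", 3)]

arcs = dynamic_line_graph(timed_edges)
print(sorted(arcs))
```

Following arcs in this graph traces time-respecting paths: something passed along a-b at time 1 can continue over b-c at time 2, but nothing from a-d at time 3 can flow backward into the earlier edges.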

FLC minutes November 26, 2007

  • Everyone could give a lecture on a topic they find pertinent
    • Everyone produces a module
    • Extract general principles
    • What ideas can maximize utility for all
    • Different departments learning from one another
    • Links between technology and policy
    • Main Goal: to produce modules!!
    • Find support from other faculty
  • Module definition: what you do in your course
    • How important is it to be web enabled? This is an ultimate goal
    • Can be as simple as pdf links, there can also be interactive modules
    • Active models can be helpful
    • Homework, visualization applets, evaluation
    • Portable, adaptable
    • Problem-based, have something that you are trying to solve, easy to motivate
    • People have to want to solve it
  • We are not building a course, we are developing modules for our own courses
    • Is it important to coordinate?
    • How important is it to define language?
    • The modules’ goal is to be able to be picked up by other courses
    • We need a better idea of who the audience is going to be, who is this module for?
    • We could pick a specific problem about networks and brainstorm perspectives we could bring to it, what questions it raises, how we deal with this problem
      • Alternative: we all prepare modules or things we think are important, but it is unclear how useful feedback from this community would be
  • How many problems are we going to look at?
    • Are there problems that are useful to all disciplines?
    • Systems that we are all interested in
  • Kearns’s Networked Life class
    • Course for everybody, no prerequisites
    • Applies modules from many disciplines to computer science
  • What can be learned from different disciplines
    • Getting different perspectives on the same problem
    • We might all change our perspectives
  • Everyone can bring in a problem, set up for ten minutes, then floor open for discussion, different views on this problem
    • We can discuss problems, then hope to develop them into modules
    • We don’t need a module from every professor, a few modules can be developed by the whole group later
      • Or one person can develop a module that they really find intriguing
    • One goal from last meeting was to have modules that were useful in multiple settings
      • Fellowship is also an important goal, but it is important to have an end result
      • A good first step is to talk about network problems
  • Everyone suggests a topic for discussion/something to be developed into a module
    • Statistical inference of graph data (Banks Stat)
      • Data observations on networks
      • Inferences about average, spread, SD
      • Finding functions of the network
      • Each observation is a different network, ask a group of people about a network, see what the different results are that are produced
      • Developing different definitions of “random graphs”
        • Random is going to be a difficult word for this community, be careful with your usage
    • Understand paper by Kleinberg and Backstrom (“Wherefore Art Thou R3579X?”) (Astrachan CS)
      • Network issues from a CS background
      • Steganography (hiding information)
    • Longitudinal Data (Lobo Business School), Causal Inference on Graphs
      • 3 different networks: Interactions between people, Assessments of competence and Measures of “like”, affect
      • How do you make statistically solid inferences about causality? How do you control for network structure?
      • Learn about networks in order to get rid of networks
      • Is it possible that you can’t get rid of the network?
      • Do we want to do methodological or application work? What is easier to understand, getting people interested in…
    • Tension between which structures are good for individuals and which are good for the collective (CS)
      • Looking at effects of centrality etc
      • Can see this problem in many different areas, individual and collective needs
      • What is a good network for a particular group?
      • Understanding the tradeoffs between different networks, what is driving the performance of a group?
      • Could also be framed as a centrality problem
    • Network of molecules in a cell (Socolar bio)
      • Network of causal effects
      • Transcription factor networks
      • Deterministic relationships
      • Think about what a link represents, relationships between nodes
      • Questions: when I have a complex network, how is it organized so that a cell can carry out a variety of functions? Think about processing information, how it is encoded in the network.
        • Suppose I just have random Boolean networks, what do they do? Qualitatively different functions, network dynamics change, structure is fixed, when you let it run what does it do?
      • Math problem out of biology
    • Dynamic networks (Moody Sociology)
      • How edges come on and off in graphs
      • Understanding rules of how edges are created
      • How do things move through graphs? (disease etc)
    • Citations (Forbes CS)
      • Who wrote what paper?
      • Network of people that have written papers together
      • Paper from U Maryland that looks at this issue
      • Disambiguate different authors
      • Visualization of networks
  • Next Meeting: Thursday 13th December 12:15pm
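
The random Boolean network question raised under the cell-network topic ("when you let it run what does it do?") can be explored with a short simulation. The sketch below is a minimal illustration under assumptions chosen for the example: 8 nodes, 2 inputs per node, a random truth table per node, synchronous updates, and a fixed seed so the run is repeatable. All names and parameter values are hypothetical.

```python
import random

def random_boolean_network(n, k, seed):
    """Pick k input nodes and a random truth table for each of n nodes."""
    rng = random.Random(seed)
    inputs = [rng.sample(range(n), k) for _ in range(n)]
    tables = [[rng.randint(0, 1) for _ in range(2 ** k)] for _ in range(n)]
    return inputs, tables

def step(state, inputs, tables):
    """Update every node at once from the current values of its inputs."""
    new_state = []
    for ins, table in zip(inputs, tables):
        index = 0
        for i in ins:
            index = index * 2 + state[i]
        new_state.append(table[index])
    return tuple(new_state)

def find_attractor(state, inputs, tables):
    """Run until a state repeats; return (transient length, cycle length)."""
    first_seen = {}
    t = 0
    while state not in first_seen:
        first_seen[state] = t
        state = step(state, inputs, tables)
        t += 1
    return first_seen[state], t - first_seen[state]

inputs, tables = random_boolean_network(n=8, k=2, seed=1)
start = (0,) * 8
transient, cycle_length = find_attractor(start, inputs, tables)
print("transient:", transient, "cycle length:", cycle_length)
```

Because the structure is fixed and the update rule is deterministic, every run must end in a cycle. Students can vary n, k, and the seed and compare how long the cycles get.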

FLC minutes Sept 24, 2007

  • Networked Life class at UPenn
    • Steven Johnson’s Book
      • Why is pop culture now more complex?
      • Use GEM model to analyze shows
      • Does this have any meaning in social science
      • How are the nodes connected?
      • How are we going to analyze these graphs? Morphing
      • MAGE (Duke tool, biology)
  • Larger Networks
    • How are you going to look at the diameter of the WWW?
      • Can you find this without looking at everything? No
      • Counterintuitive that the diameter is so small
      • Looking at nodes, there are not only a few key nodes
    • Idea of six degrees
      • How can you measure how far apart people are?
      • What is the furthest apart you can be?
      • Abuse of power laws, issue to think about
    • Power Laws (observation of natural phenomena)
      • Routers, autonomous systems
      • People and number of acquaintances
      • Words and number of characters
      • City size, population
      • The fact that you know something is a power law doesn’t necessarily help you
        • What properties are causing this power law
      • Students need to know how to do this
        • Also need to see a counter example
      • Social science networks provide useful data for this, real world situations
        • What are the motivations for why we create these
        • NetSci bringing together social sciences and physicists
      • Centrality
        • How does something work on one graph and not another?
          • Centrality affects this
          • Being a part of a group allows things to work
        • Different measures of centrality
          • Not degree, closeness centrality
          • Problem based idea, how do you explain a situation
        • Would want to build around this in a class
        • GUESS finds degree, closeness, betweenness, PageRank
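
The questions above about how far apart two nodes can be are questions about graph diameter. The sketch below is a minimal illustration using a made-up example: it computes the exact diameter by running a breadth-first search from every node, which is why the answer cannot be found "without looking at everything", and it shows how a couple of shortcut edges shrink the diameter of a ring. The graphs and function names are assumptions for the example.

```python
from collections import deque

def eccentricity(adj, source):
    """Largest hop count from source to any node (connected graph assumed)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values())

def diameter(adj):
    """Exact diameter: one breadth-first search per node."""
    return max(eccentricity(adj, s) for s in adj)

def ring(n):
    """A cycle on nodes 0..n-1, each linked to its two neighbors."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}

def add_edge(adj, u, v):
    adj[u].add(v)
    adj[v].add(u)

plain = ring(8)
shortcut = ring(8)
add_edge(shortcut, 0, 4)  # two hypothetical long-range links
add_edge(shortcut, 2, 6)

d_plain = diameter(plain)
d_shortcut = diameter(shortcut)
print(d_plain, d_shortcut)
```

The plain ring has diameter 4 and the ring with two shortcuts has diameter 3. On a network the size of the WWW this all-pairs approach is far too slow, which is what motivates sampling and estimation.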

Group Work

  • Developing Materials
    • Building up skills, modules
    • Teachers need to develop materials
    • Fit modules together
    • Have to establish a minimum skill base
      • Want to keep it as open as possible
        • Could look at the same problems from different views
    • Bring in different interest groups
    • Get at some big concepts
    • What core concepts matter?
      • Computability and complexity
      • Algorithms, ways of evaluating
      • Degree, closeness, centrality
    • Show implications of concepts in all disciplines
      • Force CS people to see social implications
      • Could use Ethical Implications
      • Privacy issues
    • Important to keep problems fresh and relevant
      • Online social networks
      • Doug Lawson book
    • How to interpret, not to get misled
      • FundRace.org, neighbor search
        • Look at neighbors
        • Attributes about nodes
      • FOAF (xml format)
        • Lists of members
    • Recommender Systems
      • Netflix challenge
    • Need to get information from students
    • What should/should not be in the module?
      • Modules can look like textbooks
      • Don’t want to dictate to people what they are going to do
      • Need to keep it accessible
      • Motivated by a problem
        • Many applications of one problem
        • We hope that the topic/problem will get people motivated
    • Sociology, different perspectives, getting women involved, women seem to be more attracted to design elements, make sure the title doesn’t scare people away
  • Which tools are going to be useful for a course? How would you use them?
    • What books will students actually read?
    • GUESS
      • Visualizing graphs
      • Can manipulate the graph to make it easier to understand
    • LGL (used for graphs)
    • NetVis
      • Network analysis
      • Social science research networks
      • Easy to use, fill in data
      • Are there privacy issues with this program?
    • Recommender systems
      • Can use social systems to benefit these
    • SocialAction
      • Adam Perer
      • Making sense of social networks
      • Dynamic query
      • Degree rankings
      • Choose to rank based on different features, filter results

New Courses to add:

  • http://Ibiblio.org/fred/inls_490
  • INSNA courses

HW:

* What would your module be for a network class?

  • Associated around a problem
  • Concepts
  • Prerequisites

* There will be a workshop

  • Does this workshop make sense?
  • Evaluate the materials
  • Meeting to discuss this, what we have done so far etc

* Feedback

  • Lunch meeting was confusing
    • Different goals
    • That group was just building modules
    • Our goal is to increase interest in CS
    • Our hope is to create modules for other courses also
  • Look into who these modules are being developed for
  • Who is going to actually build the modules?
  • Who from Duke is participating?
 
harambenet/flc.txt · Last modified: 2008/04/29 13:58 by sfj2
 