This Faculty Learning Community (FLC) will design, implement, and evaluate curricular modules on networks. The Duke Center for Instructional Technology (CIT) will provide administrative support for the FLC. CIT states that FLCs provide a venue for faculty to develop their knowledge of teaching, learning, and pedagogy, build a supportive community for each other, and create products useful to the University community and beyond. The FLC for this project will meet monthly over the course of the 2007-2008 academic year.
This page is very much under construction
Each of us should enter our FLC goals here.
I have strong interests in nonlinear dynamics and complex systems in general and
systems biology (genetic regulatory networks) in particular.
My goal for this FLC project is to collaborate on development of modules that will
introduce undergraduate students to fundamental concepts of network structure and dynamics
early in their academic careers. I am particularly interested in modules for teaching about
percolation transitions and long-time behavior of Boolean networks.
Working Sessions
Proposed: May 2, 9, 16, or 23 (Fridays), or May 1, 8, 15, or 22 (Thursdays)
Lunch time meetings
Looks like Thursday May 8th would work for everyone
Modules
A module should be defined as focused around a problem.
Given our current co-authorship network, how can we detect community structure? What is our overall goal?
What are the different Modules that may come out?
First module asks certain questions and answers them: how does this network work? What are its pitfalls? Why do we care about these pitfalls?
A second module gets into deeper issues: what are the data-gathering problems that will make a difference for this algorithmic process?
A third module could be for more advanced levels: what are the hidden assumptions here from a statistician’s point of view, what assumptions are we making that aren’t actually true? How can we correct for this?
What can we actually put in front of undergraduates in modules?
What is everyone going to actually do?
Socolar: working with Mathematica
Banks: Focused on problem of triads
Moody: different models for community finding that are out there
Cummings: Why is community structure important? What does finding this data do for us?
Everyone works on their own piece; don’t focus much on introductions
In next meeting we will figure out where to put ideas together, where things can be collaborated and commented on
Goals for next year
Want to continue this project next year
Want to include people from UNC and NC A&T
Next year meetings will be once every two months
Looking at the data, thinking through a module
Now that we have the data, how do we want to use it to make a module
Idea: Consider all faculty in the sciences at Duke and look at co-authorship networks (direct co-authorship, and pairs who share a third co-author)
Could you tell by looking at relationships if the university is organized in a logical way? Do organizational patterns emerge?
Is there data available already to help us get at this? We are hoping co-authorship will solve this
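A minimal sketch of how the co-authorship idea above could be turned into a graph, assuming the faculty data can be reduced to one list of author names per paper. The names, the toy data, and both function names are hypothetical placeholders, not the actual faculty database format.

```python
from collections import defaultdict
from itertools import combinations


def coauthorship_graph(papers):
    """Map each author to the set of people they have co-authored with."""
    graph = defaultdict(set)
    for authors in papers:
        unique = sorted(set(authors))
        for author in unique:
            graph[author]  # touch the key so solo authors appear as isolated nodes
        for a, b in combinations(unique, 2):
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


def shared_third_author(graph):
    """Pairs who never co-authored directly but share at least one co-author."""
    pairs = set()
    for neighbors in graph.values():
        for a, b in combinations(sorted(neighbors), 2):
            if b not in graph[a]:
                pairs.add((a, b))
    return pairs


# Hypothetical toy data: each inner list is the author list of one paper.
papers = [["Ada", "Ben"], ["Ben", "Cy"], ["Cy", "Dee", "Eve"], ["Flo"]]
graph = coauthorship_graph(papers)
indirect = shared_third_author(graph)
print(sorted(indirect))
```

Keeping the direct and indirect relations separate lets a module compare how much structure each one reveals.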
How do you determine community structures?
How can different perspectives contribute to this problem?
David Banks: there is going to be clumping within departments, will this cloud the data?
How do you detect communities within graphs?
Can also look at this from different departmental perspectives
Is this important educationally?
How do these skills apply to other problems?
Extracting community structure should be at the heart of this
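One way to make "community structure" concrete for students is to score a proposed grouping. The sketch below computes a modularity score in the style of Newman and Girvan for an undirected graph; this is only one of several possible quality measures, and the toy graph and groupings are hypothetical.

```python
def modularity(graph, communities):
    """Modularity of a partition of an undirected graph.

    graph: dict mapping node -> set of neighbors (symmetric).
    communities: iterable of node collections that partition the graph.
    """
    m = sum(len(nbrs) for nbrs in graph.values()) / 2  # number of edges
    q = 0.0
    for community in communities:
        members = set(community)
        # Each internal edge is seen from both endpoints, so halve the count.
        internal = sum(1 for u in members for v in graph[u] if v in members) / 2
        degree_sum = sum(len(graph[u]) for u in members)
        q += internal / m - (degree_sum / (2 * m)) ** 2
    return q


# Hypothetical toy graph: two triangles joined by a single bridge (2-3).
toy = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}
q_split = modularity(toy, [{0, 1, 2}, {3, 4, 5}])
q_mixed = modularity(toy, [{0, 1, 3}, {2, 4, 5}])
q_whole = modularity(toy, [{0, 1, 2, 3, 4, 5}])
print(q_split, q_mixed, q_whole)
```

The grouping that follows the two triangles scores higher than a mixed grouping or a single lump, which gives students a small case to reason about by hand before moving to department-sized data.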
Could we better deal with information about students?
There would be very dense graphs
Could get data on who took classes together
What questions could you ask about this? This is a more intractable problem
We could also look at data for keyword similarities in titles
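A rough sketch of one possible keyword-similarity measure for titles, assuming simple word overlap (Jaccard similarity) after dropping a few common words. The stopword list and the example titles are hypothetical; a real module might prefer stemming or weighted terms.

```python
STOPWORDS = frozenset({"a", "an", "and", "for", "in", "of", "on", "the", "to"})


def title_keywords(title):
    """Lowercased words of a title, minus punctuation and common stopwords."""
    words = (w.strip(".,:;?!()").lower() for w in title.split())
    return {w for w in words if w and w not in STOPWORDS}


def jaccard(a, b):
    """Size of the intersection divided by size of the union (0.0 if both empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical titles used only for illustration.
kw1 = title_keywords("Dynamics of Boolean Networks")
kw2 = title_keywords("Community Structure in Social Networks")
similarity = jaccard(kw1, kw2)
print(similarity)
```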
Banks has data on Wikipedia
Data we are getting is from faculty database
Now we have data, what are we going to do with it?
Might be interesting to have a module just about gathering data
Original idea to pair up and work on parts of the module
How do we want to compile our ideas?
Something like an instructor’s manual
Need to get at assignments, how is this going to be used?
Want to be able to give this to non-experts and allow it to be used by all/many departments
Learning Objectives
Do we want to work up through smaller problems or do we want to work on definitions?
Issue: work through small problems, think through issues, then work up to larger problems that might be more department oriented
Our modules need to be for instructors, not students, at least yet
We want to generate as much raw material as possible
Look at probabilistic method, Newman method, etc.
We want different perspectives to bring them all together
There isn’t one way to do it, we want to see different sides of the same problem
We want smaller problems, things that are reasonable for students to come to different conclusions about
Want problems that different disciplines can examine and reach different solutions on; students should then be able to take that knowledge into their own disciplines and apply it to different papers, theories, etc.
What are the themes we want to deal with in our module?
How can we use these themes to make modules?
What is each person going to contribute?
Metric Models for Random Graphs
Social network as the application
Random graphs drawn as independent and identically distributed (IID) samples (but the distribution is unknown in advance)
Identical distribution—all samples drawn from the same distribution (repeated draws from the same population)
Estimate the center of distribution, dispersion (“standard deviation” or variance) around the center
Classroom of students: edges between people who are friends, no edges between people who aren’t friends; nodes are fixed, just choosing edges
Try to pin down what a “friendship” is in order to assign edges
Problem: don’t know true distribution
Build a model with normal distribution (p.201)
g*: typical distribution, the closer g is to g* the more likely the graph is
How close g is to g* is represented by “tau” (τ)
Represents maximum entropy
Makes probability as normal as possible given mean, g, g*
Example: difference is difference in number of edges between 2 graphs
Phylogenetic (development of speciation) tree—take samples of hemoglobin and build a tree
Build another different tree, shouldn’t be too different
Concerned about high differences being big differences
Cluster analysis of word count
2 choices—Hamming and Euclidean, but then there are choices about components of vectors
Bernoulli graphs are a special case (p.203)
Eq. 7 (p.204): Hamming metric allows for geometry of graphs
Eq. 8 (p.204): majority of sample has an edge between A and B, then you report that A and B are friends
Problem: students differ in their perception of “friendship”; normalizing for this would change the values (slightly adds to the error value)
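The Hamming-metric ideas above can be sketched in a few lines. This assumes the model has the form P(g) proportional to exp(-tau * d(g, g*)), with d the Hamming distance between edge sets on a fixed node set; that form, the majority-rule estimate of g*, and the toy classroom data are assumptions for illustration, not a transcription of the equations on the cited pages. Under the Hamming metric the normalizing constant is (1 + exp(-tau))^N for N possible edges, which follows from the binomial theorem.

```python
import math
from itertools import combinations


def all_pairs(nodes):
    """Every possible undirected edge on a fixed node set."""
    return list(combinations(sorted(nodes), 2))


def hamming(g, h):
    """Number of node pairs on which two edge sets disagree."""
    return len(set(g) ^ set(h))


def majority_graph(samples):
    """Report an edge when more than half of the sampled graphs contain it."""
    counts = {}
    for g in samples:
        for edge in g:
            counts[edge] = counts.get(edge, 0) + 1
    return frozenset(e for e, c in counts.items() if 2 * c > len(samples))


def probability(g, center, tau, n_pairs):
    """exp(-tau * d(g, center)) normalized over all graphs on n_pairs possible edges."""
    return math.exp(-tau * hamming(g, center)) / (1 + math.exp(-tau)) ** n_pairs


# Hypothetical data: three students each report the friendships they perceive.
samples = [
    frozenset({("A", "B"), ("B", "C")}),
    frozenset({("A", "B")}),
    frozenset({("A", "B"), ("A", "C")}),
]
center = majority_graph(samples)
pairs = all_pairs("ABC")
p_center = probability(center, center, tau=1.0, n_pairs=len(pairs))
p_far = probability(samples[0], center, tau=1.0, n_pairs=len(pairs))
print(sorted(center), p_center, p_far)
```

Larger tau concentrates probability near the central graph; tau close to zero makes all graphs nearly equally likely.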
Assessing goodness of fit (2 ways)
looks at symmetry around vertex in hypercube as you move away from vertex (no symmetry=need new model)
looks at tails to see if there is large exponential difference
Mixture model: try to account for unknown variables (e.g., if gender is unknown)
Establishing confidence region (p.206)
Metric models for tree-valued random graphs: looking at classification trees
Dynamic Network Visualization
Open issues in visualizing dynamic networks
Visualization examples: movie or static?
A static layout with emerging edges can work well when the cumulative graph is sparse.
If static, the standard set of aesthetic criteria apply but we need a way to capture change over time
Multiple plots—suffers from anchoring problem
Use one dimension to represent time (loses the spatial correlate to distance and structure)
Here abstract from points and lines to groups (cliques) active at any given time. Lines follow nodes, colors denote group.
Examples: teacher as vertex, distances represent interactions with students
Open problems with dynamic movies
Visual memory—details lost due to memory
Replication—given the random nature of algorithms it is difficult to ensure the “same” movie runs
Spurious movement—system level changes affect local level movement and could lead to some bouncing around
Measuring fit
Specific: each algorithm is optimizing a feature, want to match that feature to a quantitative measure of fit to know if one layout is really accurate/better than another
General: want to know if screen image really reflects underlying social process accurately (currently use stress statistic, comparing observed to actual)
Mapping effects of temporal change
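As a concrete reading of the stress statistic mentioned under measuring fit, the sketch below compares distances drawn on screen with shortest-path distances in the graph. The normalization used here (root of squared error over squared graph distance) is one common variant and an assumption; layout packages differ in the exact formula. The graph and coordinates are hypothetical.

```python
import math
from collections import deque


def bfs_distances(graph, source):
    """Shortest-path length (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def layout_stress(graph, positions):
    """Normalized mismatch between drawn distances and graph distances."""
    nodes = sorted(graph)
    num = den = 0.0
    for i, u in enumerate(nodes):
        dist = bfs_distances(graph, u)
        for v in nodes[i + 1:]:
            if v not in dist:
                continue  # skip pairs in different components
            drawn = math.dist(positions[u], positions[v])
            num += (drawn - dist[v]) ** 2
            den += dist[v] ** 2
    return math.sqrt(num / den) if den else 0.0


# Hypothetical path graph a-b-c drawn two ways.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
straight = {"a": (0, 0), "b": (1, 0), "c": (2, 0)}
folded = {"a": (0, 0), "b": (1, 0), "c": (0, 0)}
stress_straight = layout_stress(path, straight)
stress_folded = layout_stress(path, folded)
print(stress_straight, stress_folded)
```

For a movie, the same number could be tracked frame by frame to see whether apparent motion reflects real change or layout noise.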
Everyone could give a lecture on a topic they find pertinent
Everyone produces a module
Extract general principles
What ideas can maximize utility for all
Different departments learning from one another
Links between technology and policy
Main Goal: to produce modules!!
Find support from other faculty
Module definition: what you do in your course
How important is it to be web enabled? This is an ultimate goal
Can be as simple as pdf links, there can also be interactive modules
Active models can be helpful
Homework, visualization applets, evaluation
Portable, adaptable
Problem-based, have something that you are trying to solve, easy to motivate
People have to want to solve it
We are not building a course, we are developing modules for our own courses
Is it important to coordinate?
How important is it to define language?
The goal is for modules to be easy for other courses to pick up
We need a better idea of who the audience is going to be, who is this module for?
We could pick a specific problem about networks and brainstorm perspectives we could bring to it, what questions it raises, how we deal with this problem
How many problems are we going to look at?
Kearns’s Networked Life class
What can be learned from different disciplines
Everyone can bring in a problem, set up for ten minutes, then floor open for discussion, different views on this problem
We can discuss problems, then hope to develop them into modules
We don’t need a module from every professor, a few modules can be developed by the whole group later
One goal from last meeting was to have modules that were useful in multiple settings
Fellowship is also an important goal, but it is important to have an end result
A good first step is to talk about network problems
Everyone suggests a topic for discussion/something to be developed into a module
Statistical inference of graph data (Banks Stat)
Data observations on networks
Inferences about average, spread, SD
Finding functions of the network
Each observation is a different network, ask a group of people about a network, see what the different results are that are produced
Developing different definitions of “random graphs”
Understand the paper by Kleinberg and Backstrom (“Wherefore Art Thou R3579X?”) (Astrachan CS)
Longitudinal Data (Lobo Business School), Causal Inference on Graphs
3 different networks: interactions between people, assessments of competence, and measures of “like” (affect)
How do you make statistically solid inferences about causality? How do you control for network structure?
Learn about networks in order to get rid of networks
Is it possible that you can’t get rid of the network?
Do we want to do methodological or application work? What is easier to understand, getting people interested in…
Tension between which structures are good for individuals and which are good for the collective (CS)
Looking at effects of centrality, etc.
Can see this problem in many different areas, individual and collective needs
What is a good network for a particular group?
Understanding the tradeoffs between different networks, what is driving the performance of a group?
Could also be framed as a centrality problem
Network of molecules in a cell (Socolar bio)
Network of causal effects
Transcription factor networks
Deterministic relationships
Think about what a link represents, relationships between nodes
Questions: when I have a complex network, how is it organized so that a cell can carry out a variety of functions? Think about processing information, how it is encoded in the network.
Suppose I just have random Boolean networks, what do they do? Qualitatively different functions, network dynamics change, structure is fixed, when you let it run what does it do?
Math problem out of biology
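The "let it run and see what it does" question can be tried directly. A minimal sketch, assuming synchronous updates and each node reading k randomly chosen inputs through a random truth table; the function names and parameters are illustrative choices, not a fixed definition from the discussion.

```python
import random


def random_boolean_network(n, k, seed=None):
    """Return (inputs, tables): k input nodes and a random truth table per node."""
    rng = random.Random(seed)
    inputs = [rng.sample(range(n), k) for _ in range(n)]
    tables = [[rng.randint(0, 1) for _ in range(2 ** k)] for _ in range(n)]
    return inputs, tables


def step(state, inputs, tables):
    """Synchronously update every node from the current state."""
    new_state = []
    for ins, table in zip(inputs, tables):
        index = 0
        for i in ins:
            index = (index << 1) | state[i]  # pack input bits into a table index
        new_state.append(table[index])
    return tuple(new_state)


def find_attractor(state, inputs, tables):
    """Run until a state repeats; return (transient length, cycle length)."""
    seen = {}
    t = 0
    while state not in seen:
        seen[state] = t
        state = step(state, inputs, tables)
        t += 1
    return seen[state], t - seen[state]


# Hand-built example: two nodes that each copy the other.
swap_inputs = [[1], [0]]
swap_tables = [[0, 1], [0, 1]]
swap_result = find_attractor((0, 1), swap_inputs, swap_tables)

# Random example with fixed structure; only the state changes over time.
inputs, tables = random_boolean_network(n=8, k=2, seed=1)
transient, cycle = find_attractor((0,) * 8, inputs, tables)
print(swap_result, transient, cycle)
```

Because the state space is finite and the update is deterministic, every run ends on a fixed point or a cycle, which is the long-time behavior students can tabulate for different n and k.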
Dynamic networks (Moody Sociology)
How edges come on and off in graphs
Understanding rules of how edges are created
How do things move through graphs? (disease, etc.)
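A small sketch of why edge timing matters for spreading, assuming contacts are recorded as (time, person, person) and that something passes along every contact with certainty once one side already carries it. Both the data format and the certain-transmission rule are simplifying assumptions for illustration.

```python
def spread_over_time(contacts, source):
    """Return {node: time first reached} for time-stamped undirected contacts.

    A node can pass things on only in contacts strictly later than
    the time it was reached itself.
    """
    reached_at = {source: float("-inf")}
    for t, u, v in sorted(contacts):
        for a, b in ((u, v), (v, u)):
            if a in reached_at and reached_at[a] < t and b not in reached_at:
                reached_at[b] = t
    return reached_at


# Same two edges, opposite order in time (hypothetical contacts).
early_first = [(1, "a", "b"), (2, "b", "c")]
late_first = [(2, "a", "b"), (1, "b", "c")]
reach_1 = spread_over_time(early_first, "a")
reach_2 = spread_over_time(late_first, "a")
print(reach_1, reach_2)
```

In the static picture both cases are the same path a-b-c, yet only the first ordering lets anything travel from a to c.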
Citations (Forbes CS)
Who wrote what paper?
Network of people that have written papers together
Paper from U Maryland that looks at this issue
Disambiguate different authors
Visualization of networks
Next Meeting: Thursday 13th December 12:15pm
Group Work
Developing Materials
Building up skills, modules
Teachers need to develop materials
Fit modules together
Have to establish a minimum skill base
Bring in different interest groups
Get at some big concepts
What core concepts matter?
Computability and complexity
Algorithms, ways of evaluating
Degree, closeness, centrality
Show implications of concepts in all disciplines
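Two of the core concepts listed above, degree and closeness, are short enough to compute by hand or in a few lines. A minimal sketch for an unweighted, undirected graph stored as a dictionary of neighbor sets; the star graph is a hypothetical example, and closeness is computed within the reachable component only.

```python
from collections import deque


def bfs_distances(graph, source):
    """Shortest-path length (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def degree_centrality(graph):
    """Number of neighbors of each node."""
    return {u: len(nbrs) for u, nbrs in graph.items()}


def closeness_centrality(graph):
    """(Reachable nodes - 1) divided by the total distance to them."""
    result = {}
    for u in graph:
        dist = bfs_distances(graph, u)
        total = sum(dist.values())
        result[u] = (len(dist) - 1) / total if total > 0 else 0.0
    return result


# Hypothetical star: one hub connected to three leaves.
star = {"hub": {"x", "y", "z"}, "x": {"hub"}, "y": {"hub"}, "z": {"hub"}}
degree = degree_centrality(star)
closeness = closeness_centrality(star)
print(degree, closeness)
```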
Important to keep problems fresh and relevant
Online social networks
Doug Lawson book
How to interpret, and how not to be misled
Recommender Systems
Need to get information from students
What should/should not be in the module?
Modules can look like textbooks
Don’t want to dictate to people what they are going to do
Need to keep it accessible
Motivated by a problem
Sociology, different perspectives, getting women involved, women seem to be more attracted to design elements, make sure the title doesn’t scare people away
New courses to add:
http://Ibiblio.org/fred/inls_490
INSNA courses
HW:
What would your module be for a network class?
There will be a workshop
Feedback
Lunch meeting was confusing
Different goals
That group was just building modules
Our goal is to increase interest in CS
Our hope is to create modules for other courses also
Look into who these modules are being developed for
Who is going to actually build the modules?
Who from Duke is participating?