Ultimate Game Theory

An introduction to the melted, gooey mind of a post-finals PhD student

In the days preceding my game theory final, I was quarantined in my Cambridge apartment. The heat was on and pages of yellow legal paper decorated with inky matrices and tree diagrams ruled my kitchen counters. Swaddled in some convex combination of polar fleece and section notes, I would only leave my warm fortress for two activities: (1) to throw $4 at an increasingly hard-to-please chai tea habit; and (2) to play and train for my sport of choice–that is, ultimate frisbee.

When I would return from ultimate, residual thoughts about the game lingered at the edges of my legal pads. The combination of studying for my exam and ultimate exposure in the throes of winter madness led me to the inevitable: reframing game theory concepts as they apply to aspects of ultimate! While I didn’t have the time to parse out examples of “Ultimate” Game Theory back in Cambridge, I’m on winter break in San Francisco now… which means two things: (1) I am still wearing lots of fleece; and (2) I have time to tease out all the kitschy alt-sport applications of game theory that my heart desires.

To discuss game theoretic concepts in this context, I build out two games that are based in the ultimate frisbee universe.[1] First, I use The “call lines” Game to discuss some popular, well-known concepts–namely, the prisoner’s dilemma and pure Nash equilibrium. I also use this framework to talk about repeated games and subgame perfect equilibrium. By adding the concepts of offense and defense, I refine the game so that it is no longer symmetric, and provide an example of how to solve for mixed Nash equilibrium. The second game, which I create herein, is The “throw it to the girl” Game. This game is much more complex and interesting than the former–it is a dynamic signaling game with imperfect information that allows me to illustrate how to solve for perfect Bayesian equilibrium. The “throw it to the girl” Game allows us to model one kind of dynamic that can pop up in the social context of co-ed sports.

Game I: The “call lines” Game

a. The Game Set-up

First things first, I present a simple game based on “calling lines” during an ultimate frisbee game. Ultimate is played with two teams. Each team needs to put 7 people “on the line” to play any given point. However, teams themselves consist of more than 7 people since otherwise those 7 people would probably not be super into playing this sport. (People need some rest!) In my set-up, I assume there are two teams, 1 and 2, that are identical and each always has two lines to choose from: a strong line and a weak line. The payoffs are determined by strategies employed rather than the identity of those teams employing them. In effect, the normal form of this game is a 2×2 symmetric matrix. (This is 2×2 since there are two players–team 1 and team 2–as well as two choices of lines–weak and strong.)

In order to determine the payoffs in this matrix, I need to make assumptions about the team outcomes. In expectation (which is how payoffs in a normal form matrix are presented–as expected Bernoulli utility), weak lines lose to strong lines and lines of the same type win or lose to one another with equal probability. A team gets +3/-3 for winning/losing a point. (If two lines of the same type play, they receive 0 in expectation since the probability of a win is 0.5.) Moreover, I assume that teams do not want to overuse their strong lines. Ie, teams do not want to wear out their best players for fear of fatigue or injury. Therefore, teams also receive payoffs of +1/-1 for playing a weak/strong line. Given these simple and linear assumptions,[2] the following represents the normal form game for “call lines”:

tab1.png
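As a quick sanity check (my own sketch, not part of the original post), the payoffs in Table 1 can be derived mechanically from the assumptions above:

```python
# Sketch: derive Table 1's payoffs from the stated assumptions:
# +3/-3 for winning/losing a point (0 in expectation when same-type
# lines meet) and +1/-1 for fielding a weak/strong line.

def point_payoff(mine, theirs):
    """Expected point payoff for a team playing line `mine` vs. `theirs`."""
    if mine == theirs:
        return 0  # same-type lines win/lose with probability 0.5
    return 3 if mine == "strong" else -3

def payoff(mine, theirs):
    line_bonus = 1 if mine == "weak" else -1  # rest the stars: +1 weak, -1 strong
    return point_payoff(mine, theirs) + line_bonus

for a in ("weak", "strong"):
    for b in ("weak", "strong"):
        print(f"team 1 {a}, team 2 {b}: ({payoff(a, b)}, {payoff(b, a)})")
```

Running this reproduces the four cells of Table 1: (1,1), (-2,2), (2,-2), and (-1,-1).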

b. Prisoner’s Dilemma Form & Solving for Pure Nash Equilibrium

The normal form of the “call lines” game might look very familiar. While conceptually different, it is mathematically identical to everyone’s favorite simple non-cooperative game: the prisoner’s dilemma! Note that the prisoner’s dilemma has infinitely many representations with respect to the specific payoffs. The overarching requirement is that the game is symmetric across the two players and that the following strict ranking of payoffs holds, where “defecting” means playing a strong line and “cooperating” means playing a weak line: [defect while the other cooperates] > [cooperate while the other cooperates] > [defect while the other defects] > [cooperate while the other defects].[3] In table 1 we can see this holds since 2>1>-1>-2. I could replace these payoffs in the normal form matrix with any set that maintains the same strict inequality and the game would remain a prisoner’s dilemma.

In the prisoner’s dilemma context, the relevant solution concept is the well-known concept of Nash equilibrium. In Nash equilibrium, no agent (team in this case) has an incentive to deviate if the agent knows the other’s strategy. In order to solve for Nash equilibrium, I underline the best responses of both teams to each other’s strategies:

tab2.png

(Quick refresher as to how to find these marked best responses: Imagine team 1 plays a weak line; then the payoffs to team 2 are either 1 (if it plays weak) or 2 (if it plays strong). Since 2>1, team 2 will play strong. Imagine team 1 plays a strong line; then the payoffs to team 2 are either -2 (if it plays weak) or -1 (if it plays strong). Since -1>-2, team 2 will play strong. The same logic then applies to team 1 since the game is symmetric.)

Since both payoffs in the (-1,-1) box of the matrix are underlined, it is evident that neither team has an incentive to deviate from the strong strategy given that the other team is playing strong. Thus, strong-strong is the sole pure Nash equilibrium in the “call lines” game. However, note that the weak-weak strategy, which yields payoffs (1,1), while not Nash, is Pareto optimal (no other payoff pair gives both players a higher payoff) and, accordingly, Pareto dominates (-1,-1). As Prof. Maskin’s lecture slides wisely say, this “illustrates the tension between efficiency and individual maximization.”
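To double-check the underlining exercise, here is a small brute-force sketch (mine, with the payoffs copied from Table 1) that flags a cell as pure Nash only if neither team gains from a unilateral deviation:

```python
# Brute-force pure Nash check for Table 1.
payoffs = {  # (team 1 line, team 2 line) -> (team 1 payoff, team 2 payoff)
    ("weak", "weak"): (1, 1),    ("weak", "strong"): (-2, 2),
    ("strong", "weak"): (2, -2), ("strong", "strong"): (-1, -1),
}
lines = ("weak", "strong")

def is_pure_nash(a, b):
    u1, u2 = payoffs[(a, b)]
    no_dev_1 = all(payoffs[(a2, b)][0] <= u1 for a2 in lines)  # team 1 can't gain
    no_dev_2 = all(payoffs[(a, b2)][1] <= u2 for b2 in lines)  # team 2 can't gain
    return no_dev_1 and no_dev_2

nash = [cell for cell in payoffs if is_pure_nash(*cell)]
print(nash)  # [('strong', 'strong')]
```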

c. Repeated Game Prisoner’s Dilemma & Solving for Subgame Perfect Equilibrium

While the original set-up of this game was in a static context, I can also render “call lines” a repeated game and end up with a different solution concept than the traditional Nash equilibrium previously described. Let’s assume that the same normal form game shown in Table 1 will be played infinitely–this generates an “iterated prisoner’s dilemma.” In this context, I use a solution concept known as subgame perfect equilibrium. Given repetition and recall of previous outcomes/actions, teams now have the opportunity to penalize each other for previous decisions. In the “call lines” context, I investigate the following strategy: play a weak line until someone plays a strong line (play strong from then on). This is also called a “grim trigger strategy,” which alters the choice of lines if someone chooses to deviate from cooperation (playing weak lines). This strategy, therefore, incentivizes cooperation since otherwise the players punish one another by forcing reduced payoffs for the rest of the infinitely repeated game.

This strategy yields efficiency in subgame perfect equilibrium–a point I show below. Imagine teams have discount factors, meaning they discount future utility flows from points played. The following break-down illustrates how the “grim trigger strategy” is a subgame perfect equilibrium (given some condition on the discount factor):

condcoop copy.png

Thus, if the discount factor is greater than one-third, the grim trigger strategy is a subgame perfect equilibrium for the “call lines” game. However, note that if the number of repetitions of the game is finite and known to both teams, then (by backwards induction) the two players will play strong lines in every period. Therefore, the solution concept is the same as in the static context if the repetition is finite and known, but can diverge if the repetition is infinite and the discount factor meets some requirement. (For a more complete discussion of repeated games and cooperation, check out these slides.)
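The cutoff can be sanity-checked numerically. In this sketch (my own; the per-period numbers come from Table 1), cooperating forever yields 1 each period, while deviating yields 2 once and -1 in every period thereafter:

```python
# Grim trigger: compare discounted payoffs from cooperating vs. deviating.

def cooperate_value(delta):
    return 1 / (1 - delta)  # 1 + delta + delta^2 + ...; each period pays 1

def deviate_value(delta):
    return 2 + delta * (-1) / (1 - delta)  # 2 today, then punished at -1 forever

for delta in (0.2, 1/3, 0.5, 0.9):
    sustained = cooperate_value(delta) >= deviate_value(delta)
    print(f"delta = {delta:.3f}: cooperation sustained? {sustained}")
```

Cooperation is sustainable exactly at and above the one-third cutoff derived above.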

d. Adding Offense and Defense & Solving for Mixed Nash Equilibrium

I now refine the “call lines” game by adding the concepts of offense and defense. This addition will change the payoffs in the normal form matrix. Assume that team 1 is on offense and team 2 is on defense. When a team starts a point on offense (meaning the other team pulls the disc downfield to them–ultimate’s analog of a kick-off in football), it has an advantage for scoring. Assume accordingly that a weak offense will beat a weak defense and a strong offense will beat a strong defense. Therefore, the only offense that loses in a match-up is a weak offense against a strong defense. Maintaining the same +3/-3 for winning/losing a point and the same +1/-1 for weak/strong lines, the normal form game with player 1 on offense is as follows:

tab3.png

Given this change, the game is no longer symmetric. It is no longer a prisoner’s dilemma, and moreover, there is no longer a pure Nash equilibrium. This can be illustrated with the best responses marked below (ie, there is no box with both payoffs underlined):

tab4.png
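Mirroring the earlier sketch, we can rebuild the Table 3 payoffs from the offense/defense assumptions and confirm that no cell survives the best-response check (again, my own illustrative code, not from the post):

```python
# Rebuild the asymmetric payoffs: team 1 (offense) wins every matchup
# except weak offense vs. strong defense; +3/-3 per point, +1/-1 for
# fielding a weak/strong line.
lines = ("weak", "strong")

def payoffs(off, dfn):
    off_wins = not (off == "weak" and dfn == "strong")
    point = 3 if off_wins else -3
    cost = lambda line: 1 if line == "weak" else -1
    return (point + cost(off), -point + cost(dfn))

def is_pure_nash(a, b):
    u1, u2 = payoffs(a, b)
    return (all(payoffs(a2, b)[0] <= u1 for a2 in lines)
            and all(payoffs(a, b2)[1] <= u2 for b2 in lines))

print([(a, b, payoffs(a, b)) for a in lines for b in lines])
print(any(is_pure_nash(a, b) for a in lines for b in lines))  # False: no pure Nash
```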

While there is no pure Nash equilibrium, we know that all finite games have at least one Nash equilibrium (theorem of existence of Nash equilibrium). Therefore, there must be some mixed Nash equilibrium. Mixed Nash equilibrium is made up of mixed strategies, which are those by which a team plays its available pure strategies (play a weak line, play a strong line) with certain probabilities. In solving for mixed Nash, we consider three possibilities (only team 1 uses a mixed strategy, only team 2 uses a mixed strategy, both use mixed strategies) and make use of the indifference condition as follows:

mixed.png

There is therefore one single mixed Nash equilibrium in which team 1 plays a weak line with probability 1/3 (and so a strong line with probability 2/3) and team 2 plays a weak line with probability 2/3 (and so a strong line with probability 1/3). (Intuitively: since a weak defensive line loses the point no matter what, team 2 usually saves its legs with the weak line; facing a defense that sometimes goes strong, team 1 leans on its strong line.)
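The indifference conditions themselves can be solved in a few lines. In this sketch (mine; the payoff matrix follows from the assumptions above), each team's mixing probability is pinned down by making the opponent indifferent between its two lines:

```python
from fractions import Fraction

# (team 1 line, team 2 line) -> (team 1 payoff, team 2 payoff), per Table 3
payoffs = {
    ("weak", "weak"): (4, -2),   ("weak", "strong"): (-2, 2),
    ("strong", "weak"): (2, -2), ("strong", "strong"): (2, -4),
}

def weak_prob(u_ww, u_ws, u_sw, u_ss):
    """Probability p of 'weak' solving p*u_ww + (1-p)*u_sw = p*u_ws + (1-p)*u_ss."""
    return Fraction(u_ss - u_sw, u_ww - u_sw - u_ws + u_ss)

# Team 1's mix equalizes team 2's payoffs across its two lines...
p1 = weak_prob(payoffs[("weak", "weak")][1], payoffs[("weak", "strong")][1],
               payoffs[("strong", "weak")][1], payoffs[("strong", "strong")][1])
# ...and team 2's mix equalizes team 1's.
p2 = weak_prob(payoffs[("weak", "weak")][0], payoffs[("strong", "weak")][0],
               payoffs[("weak", "strong")][0], payoffs[("strong", "strong")][0])
print(p1, p2)  # 1/3 2/3
```

Note that each team's equilibrium mix is pinned down by the opponent's indifference condition, not its own.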

e. Recap of “calling lines”

In sum, we have used the original and refined “call lines” set-ups and their corresponding normal forms in order to discuss the prisoner’s dilemma, pure Nash equilibrium, repeated games, subgame perfect equilibrium, and mixed Nash equilibrium. In moving to a more complex and interesting set-up, I now transition to the “throw it to the girl” game.

Game II: The “throw it to the girl” Game

a. The Game Set-up

Ultimate is played in a myriad of circumstances. The most casual form of ultimate frisbee is pick-up–that is, a group of people, who often don’t know each other, getting together to play. Pick-up is often mixed gender, meaning men and women are playing together, which, while empowering and fun, can often lead to some noticeable gender dynamics. For instance, playing pick-up in a mixed gender setting can lead to women being “looked off” by male players. [See here for an article on this exact subject that a fellow female frisbee friend recently shared!] In other words, men sometimes do not throw to open women…which can lead to the classic “throw it to the girl!” remark from the sideline as a woman appears open upfield but the dude with the disc chooses to holster the throw instead. The reasons for this trend (preference for bigger, more dramatic plays in the form of hucks to big dudes, implicit bias, etc.) are not the focus of this discussion…rather, it suffices to note that, yeah, this is a dynamic.

In my own personal experience as a female pickup player, I’ve found that calling for the disc when open is a solid way to signal that I am more experienced or confident and that men shouldn’t hesitate to throw to me. In learning about dynamic signaling games in game theory, I quickly realized that this calling/throwing situation could easily be melded into game theoretic form. Consider the moment when a male player with a disc is looking upfield for a throw. Assume there is an open female cutter upfield. In this moment, the female cutter (player 1 to us) has a choice: she can (1) call for the disc, signaling that she wants to be thrown to, or (2) remain silent, in which case she will not be thrown to.

This set-up is a two-player dynamic signaling game. While conceptually distinct, note that this game is identical to the well-known “gift game”! Player 1 has two types: she is either (1) dirty, or (2) a scrub. (Yeah, frisbee vernacular. Let’s go.) In this world, we are assuming that a dirty woman is better than the average male cutter on the pick-up team, while a scrub woman is worse than the average male cutter on the team. We assume that with probability 0.7 nature makes the woman dirty and with probability 0.3 nature makes her a scrub. [This was an arbitrary choice–open to edits on this.] Once the cutter has chosen to yell out or not, the dude with the disc (player 2) has a choice. Player 2 only has one type. He has no choice if the woman is silent since he will unambiguously not throw to her, but if she calls out, he can choose to throw to her or holster (not throw to her).

  • If the woman is silent, the payoffs to both players are 0 regardless of player 1 type since no one gains from this and both players continue functioning at the status quo.
  • If the woman calls out, the payoffs are different depending on her type:
    • Let’s say she is dirty:
      • If the dude throws to her, she gains 2 since she is happy she was thrown to and she played the disc well; the dude in this case is happy since she played the disc better than the average male cutter would have and gets a payoff of 1.
      • If the dude does not throw to her, then she gets a payoff of -1. (This assumes, based on personal and shared experience, that women feel more ignored or disrespected when looked off after being openly vocal than after being silent.) Meanwhile, the dude in this case goes on with the status quo and gets a payoff of 0.
    • Let’s say she is a scrub:
      • If the dude throws to her, she gains 1 since she is happy she was thrown to. (But she doesn’t gain as much as the dirty woman since she’s not as dope at frisbee. I am assuming that people gain more utility from playing when they are dirty.) The dude, in this case, is unhappy since she doesn’t play the disc as well as the average male cutter so he gets payoff of -1.
      • If the dude does not throw to her, she again gets a payoff of -1 and he again gets a payoff of 0. (We are assuming that dirty women and scrubs receive the same payoffs when ignored, but differ in payoffs when they get to play the disc.)

Given these above assumptions for payoffs and dynamics, I used the TikZ package in LaTeX to build out an extensive form of this game. [Thank you to Dr. Chiu Yu Ko who has an incredible set of TikZ Templates openly available–Here is the signaling game one that I built off of.] See figure 1 for the extensive form of this game:

tree2 copy.png

b. Solving for Perfect Bayesian Equilibrium

In the context of such dynamic games with incomplete information, the equilibrium concept of interest is perfect Bayesian equilibrium (a refinement of Bayesian Nash equilibrium and subgame perfect equilibrium).

In order to solve for perfect Bayesian equilibrium (PBE from here on), I must investigate all possible strategies for our women in the pick-up game. Since we have two types of women (dirty players/scrubs) as well as two possible actions (call out/be silent), there are four possible strategies. Two of these are what we call “separating strategies,” in which the two types choose different actions:

  • dirty player is silent/scrub calls (Figure 2)
  • dirty player calls/scrub is silent (Figure 3).

The other two are called “pooling strategies” in which both types choose the same action:

  • dirty player is silent/scrub is silent (Figure 4)
  • dirty player calls/scrub calls (Figure 5)

For each of the woman’s four possible strategies, I then determine the beliefs and accordingly the optimal response of the dude with the disc. Given that optimal response, I check to see if either of our types of women would like to deviate. If not, then we have a perfect Bayesian equilibrium. I will now go through this systematically for the four strategies.

f2.png

The above illustrates the separating strategy in which the dirty player is silent and the scrub calls for the disc. (These actions for the two types of women are illustrated in red.) In a separating equilibrium, the action of player 1 signals the type, meaning that if the dude hears a “hey,” he knows she is a scrub. The dude’s strategy (recall he only gets to make a choice when there has been a call for the disc) is then to holster the throw since 0>-1. (Thus, holster is highlighted in red in the left information set.) Note that given that optimal response from the dude, the scrub female player could improve her payoff by remaining silent instead since 0>-1. In effect, this is not a PBE.

f3.png

The next strategy we consider is that in which a dirty player calls for the disc and a scrub remains silent. In this separating case, the dude knows that if he hears a “hey,” the woman is dirty. So, the dude’s strategy is to throw since 1>0. (Throw is highlighted in red in the left information set.) Given this optimal response from the dude, the scrub female player could improve her payoff by deviating from silence to calling since 1>0. In effect, this is not a PBE.

f4.png

The above figure illustrates the total silence strategy. In such a pooling equilibrium, the dude’s beliefs when hearing a disc called for can be arbitrary since hearing a “hey!” occurs with 0 probability and therefore Bayes’ rule doesn’t apply in this context. In effect, if the dude’s beliefs as to the woman’s type are adequately pessimistic (he believes with more than 50% certainty that she’s a scrub), then his strategy is to holster the throw (holster highlighted in the left information set). (So, the diagram is drawn for adequately pessimistic beliefs on the part of the dude.) Regardless of the probabilities determined by nature (0.7 and 0.3), neither player can improve by deviating since (-1,0) is inferior to (0,0). Therefore, this is a PBE.

f5.png

The last strategy to look into is the all call strategy. In this pooling equilibrium, the dude’s beliefs as to the woman’s type are based on the nature a priori probabilities. The payoff from throwing is thus (1)(0.7)+(-1)(0.3)=0.4 and the payoff from holstering is (0)(0.7)+(0)(0.3)=0. Since 0.4>0, the optimal response for the dude is to throw (as marked by the red). Since 2>0 and 1>0, neither type of woman wants to deviate from the prescribed strategy. In effect, this is a PBE.
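The arithmetic for this pooling case is compact enough to verify directly (a sketch of mine using the Figure 1 payoffs and the 0.7/0.3 prior):

```python
# Verify the "all call" pooling equilibrium.
p_dirty = 0.7  # prior probability that nature makes the woman dirty

# Dude's expected payoffs after hearing a call, given both types call:
throw = p_dirty * 1 + (1 - p_dirty) * (-1)  # dirty plays it well, scrub doesn't
holster = 0.0
assert throw > holster  # 0.4 > 0, so his best response is to throw

# Given he throws, neither type gains by deviating to silence (payoff 0):
u_dirty_call, u_scrub_call = 2, 1
assert u_dirty_call > 0 and u_scrub_call > 0
print("all call survives both deviation checks: PBE")
```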

c. Refining the Set of Perfect Bayesian Equilibria

In summary, there are two PBEs for this “throw it to the girl” game: the total silence and all call strategies. However, note that the total silence strategy is not Pareto efficient while the all call strategy is. Ie, the expected payoffs of 1.7 for the woman and 0.4 for the dude (all call strategy) are larger than the 0 payoffs for both (total silence strategy). Moreover, the total silence strategy fails “the intuitive criterion,” a refinement of the set of equilibria proposed by Cho and Kreps (1987). The concept of this requirement is to restrict the set of equilibria to those with “reasonable” off-equilibrium beliefs. This allows me (as the creator of the model) to choose between the multiple PBEs previously outlined. For a PBE to satisfy the intuitive criterion, there must be no type of woman who would strictly prefer to deviate from the originally chosen strategy, given that the dude best-responds to that deviation.

Let’s explain why the total silence strategy does not satisfy this requirement. Imagine a deviation for the dirty player to calling. If the woman now calls, the best response for the dude is to throw to her, which yields a payoff of 2 for the woman, which is strictly greater than 0. So, the woman prefers this deviation and the intuitive criterion is not satisfied. However, the all call strategy passes this criterion. Imagine a deviation to silence for the dirty player. Then there is no best response for the dude since the payoffs are automatically 0 and 0. Since 2>0, the woman doesn’t prefer the deviation. Similarly, a deviation to silence for the scrub yields 0 instead of 1, which is not preferred either. Thus, the all call strategy satisfies the intuitive criterion. In effect, when we refine the set of equilibria in this way, we have both types of women calling for the disc and the dude making the throw… Sounds like a pretty good equilibrium to me![4]

d. Recap of “throw it to the girl”

We have used this “throw it to the girl” set-up and its corresponding extensive form in order to discuss dynamic signaling games, solving for perfect Bayesian equilibrium, and refining the set of equilibria using the intuitive criterion.

Hard cap is on! [In frisbee parlance, it’s time to wrap this all up]

There are endless ways to extend or reform these games in the world of game theoretic concepts. My formulations for “calling lines” and “throw it to the girl” are simple by design so that they lend themselves to discussing some subset of useful concepts. However, despite the simplicity of the model builds, I’m happy to be able to arrive at conclusions that involve social behaviors as complex as gender dynamics… For example, next time, instead of yelling “throw it to the girl!” from the sideline, you can always shout: “assuming a gift-giving game payoff structure, it is a perfect Bayesian equilibrium satisfying the intuitive criterion for you to throw to open women when they call for it!” No worries–if they don’t understand, you can always womansplain the concept during the next time-out.

Code

Check out the relevant Github repository for all tex files necessary for reproducing the tables, tree diagrams, and solution write-ups!

Footnotes

[1] The good news is that since I’m pretty sure some nontrivial percentage of ultimate players have studied math, I don’t have to worry too much about this discussion being for some empty intersection of individuals.

[2] Comments on how to improve this are very welcomed. For this introductory context, I feel these payoffs suffice since it allows me to get into the prisoner’s dilemma and some useful simple equilibrium concepts.

[3] These requirements render the game a non-cooperative one. Prisoner’s dilemma terminology is often used for contexts that in fact would be better categorized as cooperative games such as Stag hunt. In the Stag hunt (or cooperative game) payoff matrix, the inequality relationship would instead be: [the payoff to a player who “cooperates” while the other “cooperates”] >[the payoff to a player who “defects” while the other “cooperates”]  >= [the payoff to a player who “defects” while the other “defects”] > [the payoff to a player who “cooperates” while the other “defects”]

[4] More generally, this will be the case as long as the nature a priori probabilities have the probability of the woman being dirty as 0.5 or greater.


© Alexandra Albright and The Little Dataset That Could, 2017. Unauthorized use and/or duplication of this material without express and written permission from this blog’s author and/or owner is strictly prohibited. Excerpts, accompanying visuals, and links may be used, provided that full and clear credit is given to Alex Albright and The Little Dataset That Could with appropriate and specific direction to the original content.

Anxious Optimization

On the morning of my dynamic programming midterm, I tried to calm myself by recalling the golden rules of grad school that have been passed down through the generations[1]:

“grades don’t matter… nothing matters.” 

However, I quickly realized that by adopting this perspective I was simply subbing out test anxiety for existential anxiety. And, come to think of it, the latter didn’t seem to be helpful, while the former could actually aid in my short-term goals–namely, jotting down lots of equations into a collection of small blue booklets.

In considering the roles that angst can play in day-to-day life, I started to become curious about whether I could model optimal choices of anxiety using dynamic programming techniques. Allow me to channel my inner Ed Glaeser[2] using a David Laibson-esque framework[3] and wax poetic on the specifics of something you could call (using very flexible terminology) an “economic model”…

Let’s assume that anxiety has some negative cost (say, to my overarching mental health); however, its presence in my life also often gets me to achieve goals that I do genuinely want to achieve. Therefore, anxiety factors into my personal utility function in both positive and negative ways. In other words, it is not some force in my life that I want to erase entirely since it can lead to incredibly positive outcomes.

Let’s say, for the sake of model simplicity and for the sake of accuracy since I’m in academia now[4], that my utility function is simply equated to my academic success. Imagine that academic success is some definable and quantifiable concept–we could define this as some weighted average of the number of papers published in quality journals, the number of good relationships with faculty members, etc. Let’s also assume that this type of success is a function of (and increasing in) two items: idea creation and execution of ideas. This seems reasonable to me. The next piece is where the real controversial and strict assumptions come in with respect to anxiety: I assume that idea creation is a function of (and increasing in) existential anxiety, while execution is a function of (and increasing in) time/test anxiety. Assume that the functions with respect to the anxiety types have positive first derivatives and negative second derivatives–this is equivalent to assuming concavity. [Note: In reality, there is most definitely some level of both angsts that stops being productive… noting that this is the case calls for more complex assumptions about the functional forms beyond assuming simple concavity… suggestions are welcome!]

Then, given these assumptions and the framework of dynamic programming, the optimization problem of interest is equivalent to solving a maximization problem over the lifecycle.

max_prob
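Purely as an illustrative guess at the form such a problem might take (every symbol below is my own invention, not taken from the image above):

```latex
\max_{\{a^{E}_{t},\, a^{T}_{t}\}_{t=0}^{\infty}}
  \sum_{t=0}^{\infty} \beta^{t}\, S\!\big(I(a^{E}_{t}),\, X(a^{T}_{t})\big),
  \qquad 0 < \beta < 1,
```

where $S$ is academic success (increasing in both arguments), $I$ is idea creation with $I'>0$ and $I''<0$ in existential anxiety $a^{E}$, and $X$ is execution with $X'>0$ and $X''<0$ in time/test anxiety $a^{T}$.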
Explicitly solving this optimization problem requires more assumptions about functional forms and the like. Ed, I’m open to your ideas! Sure, it’d be much simpler to somehow make this a one-variable maximization problem–a transformation we are often able to achieve by exploiting some budget constraint and Walras’ law–however, I do not believe that anxiety measures should add up to some value beyond human choice. Other potential questions: Do we think our state variables follow an Ito process? Ie, I could see the existential anxiety variable following geometric Brownian motion since the drift maybe should rise exponentially with time?

Back to reality, an implication of my model build that comes straight out of my assumptions (don’t even need first order conditions for this) is that I should not be thinking about how “nothing matters” when there’s an upcoming test. A test falls into the category of execution tasks, rather than the realm of idea creation. The existential anxiety that grows out of repeating the mantra “nothing matters” to myself over and over would only help in coming up with ideas… In fact, this whole model idea and post did come from continuing down the path of some existential thought process! So, perhaps the real question should be: is my blogging engrained in the weighted average measure for “academic success”? If so, I’m feeling pretty optimized.
Footnotes

[1] Thank you to the G2’s (second-year’s) for the soothing (yet still terrifying) words in your recent emails.

[2] Microeconomics professor of “verbal problem” fame

[3] Macroeconomics professor for the “dynamic programming” quarter of our sequence

[4] I kid, I kid. I’m off to a frisbee tournament for this entire weekend, so clearly my utility function must be more complex.


© Alexandra Albright and The Little Dataset That Could, 2016. Unauthorized use and/or duplication of this material without express and written permission from this blog’s author and/or owner is strictly prohibited. Excerpts, accompanying visuals, and links may be used, provided that full and clear credit is given to Alex Albright and The Little Dataset That Could with appropriate and specific direction to the original content.

Building Visualizations Using City Open Data: Philly School Comparisons

Intro

There is a collection of notes that accompanies me throughout my day, slipped into the deep pockets of my backpack. The collection consists of small notebooks and post-its featuring sentence fragments written in inky Sharpie or scratched down frantically using some pen that was (of course) dying at the time. Ideas, hypotheses, some jokes. Mostly half baked and sometimes completely raw. Despite this surplus of scribbles, I often struggle when it comes to acting on the intention of the words that felt so quick and simple to jot down… In fact, I often feel myself acting within the confines of this all too perfect graphical representation of project development:

14063489_163173454089228_1445505577_n

via the wonderful young cartoonist Liana Finck

One topic of interest–comparisons of charter and district public schools–has been on my (self-imposed) plate for over a year now. The topic was inspired by a documentary webseries that a friend actually just recently completed. [Plugs: Sivahn Barsade will be screening her documentary webseries Charter Wars this weekend in Philadelphia! Check it out if you’re around.] Given that she is currently wrapping up this long-term project, I am doing the same for my related mini-project. In other words, some post-its are officially being upgraded to objects on the internet.

To quote the filmmakers, “Charter Wars is an interactive documentary that examines the ideologies and motivations driving the charter school debate in Philadelphia.” Ah, yes, charter schools… a handful of slides glided by me on the topic in my morning Labor Economics class just this past Wednesday. Check out the intertwined and state-of-the-art Dobbie-Fryer (2013) and Fryer (2014) if you’re interested in charter school best practices and their implementation in other school environments.[1] However, despite the mention of these papers, I am not going to use this space in order to critique or praise rigorous academic research on the subject. Instead, I will use this space as a playground for the creation of city open data visualizations. Since Sivahn focuses her Charter Wars project on Philadelphia, I decided to do the same, which turned out to be a great idea since OpenDataPhilly is a joy to navigate, especially in comparison to other city data portals. After collecting data of interest from their site (details on that process available here), I used ggplot2 in R (praise Hadley!) to create two visualizations comparing district and charter schools in the city.

Think of this post as a quasi-tutorial inspired by Charter Wars; I’ll present a completed visual and then share the heart of the code in the text with some brief explanation as to the core elements therein. (I will also include links to code on my Github repo, which presents the full R scripts and explains how to get the exact data from OpenDataPhilly that you would need to replicate visuals.)

Visualization #1: Mapping out the city and schools

First things first, I wanted to map the location of public schools in the city of Philadelphia. Open data provides workable latitudes and longitudes for all such schools, so this objective is entirely realizable. The tricky part in mapping the schools is that I also had to work with shape files that yield the city zip code edges and consequently build the overarching map on which points (representing the schools) can be plotted. I color schools based on four categories: Charter (Neighborhood), Charter (Citywide), District (Neighborhood), and District (Citywide);[2] and then break the plots up so that we can compare across the school levels: Elementary School, Middle School, High School, K-8 School (rather than plotting hundreds of points all on one big map). Here is my eventual result generated using R:

mappingschools

The reality is that most of the labor in creating these visuals is in figuring out both how to make functions work and how to get your data in the desired workable form. Once you’ve understood how the functions behave and you’ve reshaped your data structures, you can focus on your ggplot command, which is the cool piece of your script that you want to show off at the end of the day:

ggplot() +
geom_map(data = spr1, aes(map_id = Zip.Code), map = np_dist, fill="gray40", color="gray60") +
expand_limits(x = np_dist$long, y = np_dist$lat)+
my_theme()+
geom_point(data=datadistn, aes(x=X, y=Y, col="District (Neighborhood)"), size=1.5, alpha=1)+
geom_point(data=datachartn, aes(x=X, y=Y, col="Charter (Neighborhood)"), size=1.5, alpha=1)+
geom_point(data=datadistc, aes(x=X, y=Y, col="District (Citywide)"), size=1.5, alpha=1)+
geom_point(data=datachartc, aes(x=X, y=Y, col="Charter (Citywide)"), size=1.5, alpha=1)+
facet_wrap(~Rpt.Type.Long, ncol=2)+
ggtitle(expression(atop(bold("Mapping Philly Schools"), atop(italic("Data via OpenDataPhilly; Visual via Alex Albright (thelittledataset.com)"),""))))+
scale_colour_manual(values = c("Charter (Citywide)"="#b10026", "District (Citywide)"="#807dba","Charter (Neighborhood)"="red","District (Neighborhood)"="blue"), name="Type of School")+
labs(y="", x="")

This command creates the map I had previously presented. The basic process with all these sorts of ggplot commands is that you want to start your plot with ggplot() and then add layers with additional commands (after each +). The above code uses a number of functions and geometric objects that I identify and describe below:

  • ggplot()
    • Start the plot
  • geom_map()
    • Geometric object that maps out Philadelphia with the zip code lines
  • my_theme()
    • My customized function that defines style of my visuals (defines plot background, font styles, spacing, etc.)
  • geom_point()
    • Geometric object that adds the points onto the base layer of the map (I use it four times since I want to do this for each of the four school types using different colors)
  • facet_wrap()
    • Function that says we want four different maps in order to show one for each of the four school levels (Middle School, Elementary School, High School, K-8 School)
  • ggtitle()
    • Function that specifies the overarching plot title
  • scale_colour_manual()
    • Function that maps values of school types to specific aesthetic values (in our case, colors!)
  • labs()
    • Function to change axis labels and legend titles–I use it to get rid of default axes labels for the overarching graph

Definitely head to the full R script on Github to understand what the arguments (spr1, np_dist, etc.) are in the different pieces of this large aggregated command. [Recommended resources for those interested in using R for visualization purposes: a great cheat sheet on building up plots with ggplot, the incredible collection of FlowingData tutorials, & Prabhas Pokharel’s helpful post on this type of mapping in R]
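To make the layering idiom concrete without the Philly data, here is a minimal, self-contained sketch using R’s built-in mtcars data (the variables and title here are purely illustrative and unrelated to the scripts above):

```r
library(ggplot2)

# Base layer: data plus aesthetic mappings; each `+` stacks another layer on top
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(aes(col = factor(cyl)), size = 1.5) +  # points, colored by a category
  facet_wrap(~am, ncol = 2) +                       # one panel per group
  ggtitle("Layering demo") +                        # overarching title
  labs(x = "Weight (1000 lbs)", y = "MPG")          # axis labels
```

The same skeleton (base map layer, then geom_point layers, then facet_wrap, title, and labels) underlies the school-mapping command above.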

Visualization #2: Violin Plots

My second creation illustrates the distribution of school scores across the four aforementioned school types: Charter (Neighborhood), Charter (Citywide), District (Neighborhood), and District (Citywide). (Note that the colors match those used for the points in the previous maps.) To explore this topic, I create violin plots, which can be thought of as sideways density plots, which can in turn be thought of as smooth histograms.[3] Alternatively, according to Nathan Yau, you can think of them as the “lovechild between a density plot and a box-and-whisker plot.” Similar to how the previous graph broke the school plots into four panels based on level of schooling, I now break the plots up by score type: overall, achievement, progress, and climate. See below for the final product:

scores

The core command that yields this graph is as follows:

ggplot(data_new, aes(x = factor(Governance0), y = Score))+
geom_violin(trim=TRUE, adjust=.2, aes(fill=Governance0))+
geom_boxplot(width=0.1, color="orange", aes(fill=Governance0))+
my_theme()+
scale_fill_manual(values = pal2, name="School Type") +
ylim(0,100)+
labs(x="", y="")+
facet_wrap(~Score_type, ncol=2, scales="free")+
ggtitle(expression(atop(bold("Comparing Philly School Score Distributions"), atop(italic("Data via OpenDataPhilly (2014-2015); Visual via Alex Albright (thelittledataset.com)"),""))))

Similar to before, I will briefly explain the functions and objects that we combine into this one long command:

  • ggplot()
    • Begin the plot with aesthetics for score and school type (Governance0)
  • geom_violin()
    • Geometric object that specifies that we are going to use a violin plot for the distributions (also decides on the bandwidth parameter)
  • geom_boxplot()
    • Geometric object that generates a basic boxplot over the violin plot (so we can get an alternative view of the underlying data points)
  • my_theme()
    • My customized function that defines the style of visuals
  • scale_fill_manual()
    • Function that fills in the color of the violins by school type
  • ylim()
    • Short-hand function to set y-axis to always show 0-100 values
  • labs()
    • Function to get rid of default axes labels
  • facet_wrap()
    • Function that separates plots out into one for each of the four score types: overall, achievement, progress, climate
  • ggtitle()
    • Specifies the overarching plot title

Again, definitely head to the full R script to understand the full context of this command and the structure of the underlying data. (Relevant resources for looking into violin plots in R can also be found here and here.) 
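For a runnable taste of the violin-plus-boxplot structure without the OpenDataPhilly data, here is a sketch on R’s built-in iris data (species and measurement names stand in for school types and scores):

```r
library(ggplot2)

v <- ggplot(iris, aes(x = Species, y = Sepal.Length)) +
  geom_violin(trim = TRUE, adjust = 0.5, aes(fill = Species)) +  # smaller adjust = less smoothing
  geom_boxplot(width = 0.1) +                                    # compact boxplot inside each violin
  labs(x = "", y = "Sepal length (cm)")
```

Tweaking the adjust argument is a quick way to see the bandwidth sensitivity flagged in footnote [3].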

It took me many iterations of code to get to the current builds that you can see on Github, especially since I am not an expert with mapping–unlike my better half, Sarah Michael Levine. See the below comic for an accurate depiction of current-day-me (the stick figure with ponytail) looking at the code that July-2015-me originally wrote to produce some variant of these visuals (stick figure without ponytail):

code_quality

Via XKCD

Hopefully current-day-me was able to improve the style to the extent that it is now readable to the general public. (Do let me know if you see inefficiencies though and I’m happy to iterate further! Ping me with questions too if you so desire.) Moreover, in intensively editing code created by my past self over the past string of days, I also quickly recalled that the previous graphical representation of my project workflow needed to be updated to more accurately reflect reality:

manic2

adapted from Liana Finck with the help of snapchat artistic resources

On a more serious note, city open data is an incredible resource for individuals to practice using R (or other software). In rummaging around city variables and values, you can maintain a sense of connection to your community while floating around the confines of a simple two-dimensional command line.

Plugs section [important]
  1. Thanks to Sivahn for communicating with me about her Charter Wars documentary webseries project–good luck with the screening and all, Si!
  2. If you like city open data projects, or you’re a New Yorker, or both… check out Ben Wellington’s blog that focuses on NYC open data.
  3. If you’d like to replicate elements of this project, see my Github documentation.
Footnotes

[1] Yes, that’s right; I’m linking you to the full pdfs that I downloaded with my university access. Think of me as Robin Hood with the caveat that I dole out journal articles instead of $$$.

[2] Note from Si on four school categories: Wait, why are there four categories? While most people, and researchers, divide public schools into charter-run and district-run, this binary is lacking vital information. For some district and charter schools, students have to apply and be selected to attend. It wouldn’t be fair to compare a charter school to a district magnet school, just like it wouldn’t be fair to compare a performing arts charter school to a neighborhood district school (this is not a knock against special admit schools, just their effect on data analysis). The additional categories don’t allow for a perfect apples-to-apples comparison, but at least you’ll know that you’re comparing an apple to an orange.

[3] The efficacy or legitimacy of this sort of visualization method is potentially contentious in the data visualization community, so I’m happy to hear critiques/suggestions–especially with respect to best practices for determining bandwidth parameters!


© Alexandra Albright and The Little Dataset That Could, 2016. Unauthorized use and/or duplication of this material without express and written permission from this blog’s author and/or owner is strictly prohibited. Excerpts, accompanying visuals, and links may be used, provided that full and clear credit is given to Alex Albright and The Little Dataset That Could with appropriate and specific direction to the original content.

Go East, young woman

We’ll always have Palo Alto[1]

It is 9:30pm PST on Friday evening and my seat belt is buckled. The lights are a dim purple as they always are on Virgin America flights. As if we are all headed off to a prom on the opposite side of the country together. My favorite safety video in the industry starts to play–an accumulation of visuals and beats that usually gives me a giddy feeling that only Beyoncé videos have the power to provoke–however, in this moment, I begin to tear up despite the image of a shimmying nun displayed in front of me. In my mind, overlaying the plane-inspired choreography is a projection of Rick Blaine reminding me, in my moments of doubt, that I belong on this plane[2]: “If that plane leaves the ground and you’re not [in it], you’ll regret it. Maybe not today. Maybe not tomorrow, but soon and for the rest of your life.” I whisper “here’s looking at you, kid” to the screen now saturated with dancing flight attendants and fade into a confused dreamscape: Silicon Valley in black and white–founders still wear hoodies, but they have tossed on hats from the ’40s.

A few days later, I am now living in Cambridge, MA. While my senses are overcome by a powerful ensemble of changes, some more discreet or intangible than others, there is one element of the set that is clear, striking, and quantifiable. The thickness and heat in the air that was missing from Palo Alto and San Francisco. After spending a few nights out walking (along rivers, across campuses, over and under bridges, etc.) in skirts and sandals without even the briefest longing for a polar fleece, I am intent on documenting the difference between Boston and San Francisco temperatures. Sure, I can’t quantify every dimension of change that I experience, but, hey, I can chart temperature differences.

Coding up weather plots

In order to investigate the two cities and their relevant weather trends, I adapted some beautiful code that was originally written by Bradley Boehmke in order to generate Tufte-inspired weather charts using R (specifically making use of the beloved ggplot2 package). The code is incredible in how simple it is to apply to any of the cities that have data from the University of Dayton’s Average Daily Temperature archive.[3] Below are the results I generated for SF and Boston, respectively[4]:

SF_plot

Boston_plot

While one could easily just plot the most recent year’s temperature data (2015, as marked by the black time series in this case), it is quickly evident that making use of historical temperature data helps to both smooth over the picture and put 2015 temperatures in context. The light beige for each day in the year shows the range from historical lows to historical highs over the period 1995-2014. Meanwhile, the grey range presents the 95% confidence interval around daily mean temperatures for that same period. Lastly, the blue and red dots mark the days in 2015 that were record lows or highs relative to the past two decades. While Boston had a similar number of red and blue dots for 2015, SF is overpowered by red. Almost 12% of SF days were record highs relative to the previous twenty years. Only one day was a record low.
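The record-dot logic is easy to sketch in R: compare each 2015 day against its 1995-2014 extremes. (The data below are synthetic and the column names are illustrative; the real scripts pull from the University of Dayton archive.)

```r
set.seed(1)
# Twenty years of synthetic daily temperatures (the 1995-2014 baseline)
hist_temps <- data.frame(
  day  = rep(1:365, times = 20),
  temp = rnorm(365 * 20, mean = 55, sd = 10)
)
# One synthetic year to compare against the baseline
temps_2015 <- data.frame(day = 1:365, temp = rnorm(365, mean = 58, sd = 10))

hist_max <- tapply(hist_temps$temp, hist_temps$day, max)  # per-day historical high
hist_min <- tapply(hist_temps$temp, hist_temps$day, min)  # per-day historical low

record_high <- temps_2015$temp > hist_max  # days that would earn a red dot
record_low  <- temps_2015$temp < hist_min  # days that would earn a blue dot
share_high  <- mean(record_high)           # share of record-high days (cf. SF's ~12% in 2015)
```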

While this style of visualization is primarily intuitive for comparing a city’s weather to its own historical context, there are also a few quick points that strike me from simple comparisons across the two graphs. I focus on just three quick concepts that are borne out by the visuals:

  1. Boston’s seasons are unmistakable.[5] While the normal range (see darker swatches on the graph) of temperatures for SF varies between 50 (for winter months) and 60 degrees (for late summer and early fall months), the normal range for Boston is notably larger and ranges from the 30’s (winter and early spring months) to the 70’s (summer months). The difference in the curve of the two graphs makes this difference throughout the months painfully obvious. San Francisco’s climate is incredibly stable in comparison with east coast cities–a fact that is well known, but still impressive to see in visual form!
  2. There’s a reason SF can have Ultimate Frisbee Beach League in the winter. Consider the relative wonderfulness of SF in comparison to Boston during the months of January to March. In 2015, SF was anywhere from 10 to 55 degrees warmer than Boston over those months (the 55-degree gap fell on a particularly toasty February day). In general, most day-to-day differences are around +20 to +40 degrees for SF.
  3. SF Summer is definitely ‘SF Winter’ if one defines its temperature relative to that of other climates. In 2015, the summer months in SF were around 10 degrees colder than the summer months in Boston. While SF summer is warmer than actual SF winter in absolute terms, comparing its temperatures to those of other areas of the country quickly marks SF summer as the relatively chilliest stretch of the year.

Of course, it is worth noting that the picture from looking at simple temperature alone is not complete. More interesting than this glance at basic temperature would be an investigation into the “feels like” temperature, which usually takes into account factors such as wind speeds and humidity. Looking into these more complex measurements would very likely heighten the clear distinction in Boston seasons as well as potentially strengthen the case for calling SF summer ‘SF winter’, given the potential stronger presence of wind chill during the summer months.[6]

The coldest winter I ever spent…[7]

It is 6:00am EST Saturday morning in Boston, MA. Hot summer morning is sliced into by divine industrial air conditioning. Hypnotized by luggage seemingly floating on the baggage claim conveyor belt and slowly emerging from my black and white dreams, I wonder if Ilsa compared the weather in Lisbon to that in Casablanca when she got off her plane… after contacts render the lines and angles that compose my surroundings crisp again, I doubt it. Not only because Ilsa was probably still reeling from maddeningly intense eye contact with Rick, but also because Lisbon and Casablanca are not nearly as markedly different in temperature as are Boston and San Francisco.

Turns out that the coldest winter I will have ever spent will be winter in Boston. My apologies to summer in San Francisco.

Footnotes

[1] Sincere apologies to those of you in the Bay Area who have had to hear me make this joke a few too many times over the past few weeks.

[2] Though definitely not to serve as a muse to some man named Victor. Ah, yes, the difference 74 years can make in the purpose of a woman’s travels.

[3] Taking your own city’s data for a spin is a great way to practice getting comfortable with R visualization if you’re into that sort of thing.

[4] See my adapted R code for SF and Boston here. Again, the vast majority of credit goes to Bradley Boehmke for the original build.

[5] Speaking of seasons

[6] I’d be interested to see which US cities have the largest quantitative difference between “feels like” and actual temperature for each period (say, month) of the year…

[7] From a 2005 Chronicle article: “‘The coldest winter I ever spent was a summer in San Francisco,’ a saying that is almost a San Francisco cliche, turns out to be an invention of unknown origin, the coolest thing Mark Twain never said.”


© Alexandra Albright and The Little Dataset That Could, 2016. Unauthorized use and/or duplication of this material without express and written permission from this blog’s author and/or owner is strictly prohibited. Excerpts, accompanying visuals, and links may be used, provided that full and clear credit is given to Alex Albright and The Little Dataset That Could with appropriate and specific direction to the original content.

Where My Girls At? (In The Sciences)

Intro

In the current educational landscape, there is a constant stream of calls to improve female representation in the sciences. However, the call to action is often framed within the aforementioned nebulous realm of “the sciences”—an umbrella term that ignores the distinct environments across the scientific disciplines. To better understand the true state of women in “the sciences,” we must investigate representation at the discipline level in the context of both undergraduate and doctoral education. As it turns out, National Science Foundation (NSF) open data provides the ability to do just that!

The NSF’s Report on Women, Minorities, and Persons with Disabilities in Science and Engineering includes raw numbers on both undergraduate and doctoral degrees earned by women and men across all science disciplines. With these figures in hand, it’s simple to generate measures of female representation within each field of study—that is, percentages of female degree earners. This NSF report spans the decade 2002–2012 and provides an immense amount of raw material to investigate.[1]
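Computing representation from the raw counts is a one-liner; here is a sketch with made-up counts (the NSF report supplies the real figures):

```r
degrees <- data.frame(
  field  = c("physics", "chemistry", "anthropology"),
  female = c(1200, 6100, 5500),   # hypothetical counts of female degree earners
  male   = c(5100, 6800, 2300)    # hypothetical counts of male degree earners
)
# Female representation = female share of all degree earners, in percent
degrees$pct_female <- 100 * degrees$female / (degrees$female + degrees$male)
```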

The static picture: 2012

First, we will zero in on the most recent year of data, 2012, and explicitly compare female representation within and across disciplines.[2]

fig1

The NSF groups science disciplines with similar focus (for example, atmospheric and ocean sciences both focus on environmental science) into classified parent categories. In order to observe not only the variation within each parent category but also across the more granular disciplines themselves, the above graph plots percentage female representation by discipline, with each discipline colored with respect to its NSF classified parent category.

The variation within each parent category can be quite pronounced. In the earth, atmospheric, and ocean sciences, female undergraduate representation ranges from 36% (atmospheric sciences) to 47% (ocean sciences) of total graduates. Among PhD graduates, female representation ranges from 39% (atmospheric sciences) to 48% (ocean sciences). Meanwhile, female representation in the physical sciences has an undergraduate range from 19% (physics) to 47% (chemistry) and a PhD range from 20% (physics) to 39% (chemistry). However, the social sciences have the largest spread of all, with undergraduate female representation ranging from 30% (economics) to 71% (anthropology) and PhD representation ranging from 33% (economics) to 64% (anthropology).

In line with conventional wisdom, computer sciences and physics are overwhelmingly male (undergraduate and PhD female representation lingers around 20% for both). Other disciplines in which female representation notably lags include: economics, mathematics and statistics, astronomy, and atmospheric sciences. Possible explanations behind the low representation in such disciplines have been debated at length.

Interactions between “innate abilities,” mathematical content, and female representation

Relatively recently, in January 2015, an article in Science “hypothesize[d] that, across the academic spectrum, women are underrepresented in fields whose practitioners believe that raw, innate talent is the main requirement for success, because women are stereotyped as not possessing such talent.” While this explanation was compelling to many, another group of researchers quickly responded by showing that once measures of mathematical content were added into the proposed models, the measures of innate beliefs (based on surveys of faculty members) shed all their statistical significance. Thus, the latter researchers provided evidence that female representation across disciplines is instead associated with the discipline’s mathematical content “and that faculty beliefs about innate ability were irrelevant.”

However, this conclusion does not imply that stereotypical beliefs are unimportant to female representation in scientific disciplines—in fact, the same researchers argue that beliefs of teachers and parents of younger children can play a large role in silently herding women out of math-heavy fields by “becom[ing] part of the self-fulfilling belief systems of the children themselves from a very early age.” Thus, the conclusion only objects to the alleged discovery of a robust causal relationship between one type of belief, university/college faculty beliefs about innate ability, and female representation.

Despite differences, both assessments demonstrate a correlation between measures of innate capabilities and female representation that is most likely driven by (1) women being less likely than men to study math-intensive disciplines and (2) those in math-intensive fields being more likely to describe their capacities as innate.[3]

The second point should hardly be surprising to anyone who has been exposed to mathematical genius tropes—think of all those handsome janitors who write up proofs on chalkboards, their talents innate rather than learned. The second point is also incredibly consistent with the assumptions that underlie “the cult of genius” described by Professor Jordan Ellenberg in How Not to Be Wrong: The Power of Mathematical Thinking (p. 412):

The genius cult tells students it’s not worth doing mathematics unless you’re the best at mathematics, because those special few are the only ones whose contributions matter. We don’t treat any other subject that way! I’ve never heard a student say, “I like Hamlet, but I don’t really belong in AP English—that kid who sits in the front row knows all the plays, and he started reading Shakespeare when he was nine!”

In short, subjects that are highly mathematical are seen as more driven by innate abilities than are others. In fact, describing someone as a hard worker in mathematical fields is often seen as an implicit insult—an implication I very much understand as someone who has been regularly (usually affectionately) teased as a “try-hard” by many male peers.

The dynamic picture: 2002–2012

Math-intensive subjects are predominately male in the static picture for the year 2012, but how has the gender balance changed over recent years (in these and all science disciplines)? To answer this question, we turn to a dynamic view of female representation over a recent decade by looking at NSF data for the entirety of 2002–2012.

fig2

The above graph plots the percentages of female degree earners in each science discipline for both the undergraduate and doctoral levels for each year from 2002 to 2012. The trends are remarkably varied with overall changes in undergraduate female representation ranging from a decrease of 33.9% (computer sciences) to an increase of 24.4% (atmospheric sciences). Overall changes in doctoral representation ranged from a decline of 8.8% (linguistics) to a rise of 67.6% (astronomy). The following visual more concisely summarizes the overall percentage changes for the decade.

fig3

As this graph illustrates, there were many gains in female representation at the doctoral level between 2002 and 2012. All but three disciplines experienced increased female representation—seems promising, yes? However, substantial losses at the undergraduate level should yield some concern. Only six of the eighteen science disciplines experienced undergraduate gains in female representation over the decade.

The illustrated increases in representation at the doctoral level are likely extensions of gains at the undergraduate level from the previous years—gains that are now being eroded given the presented undergraduate trends. The depicted losses at the undergraduate level could very well lead to similar losses at the doctoral level in the coming decade, which would hamper the widely shared goal to tenure more female professors.

The change for computer sciences is especially important since it provides a basis for the vast, well-documented media and academic focus on women in the field. (Planet Money brought the decline in percentage of female computer science majors to the attention of many in 2014.) The discipline experienced a loss in female representation at the undergraduate level that was more than twice the size of that in any other subject, including physics (-15.6%), earth sciences (-12.2%), and economics (-11.9%).
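A quick note on the arithmetic: the overall changes quoted here are most naturally read as relative changes in the female share itself, not percentage-point differences (a 33.9-point drop would be implausible given 2012 levels). A sketch with hypothetical shares:

```r
share_2002 <- 27.0  # hypothetical percent female at the undergraduate level in 2002
share_2012 <- 17.9  # hypothetical percent female in 2012

# Relative (percent) change in representation over the decade
overall_change <- 100 * (share_2012 - share_2002) / share_2002
# negative => representation declined; here roughly a one-third drop
```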

While the previous discussion of innate talent and stereotype threat focused on math-intensive fields, a category computer sciences fall into, I would argue that this recent decade has seen the effect of those forces on a growing realm of code-intensive fields. The use of computer programming and statistical software has become a standard qualification for many topics in physics, statistics, economics, biology, astronomy, and other fields. In fact, completing degrees in these disciplines now virtually requires coding in some way, shape, or form.

For instance, in my experience, one nontrivial hurdle that stands between students and more advanced classes in statistics or economics is the time necessary to understand how to use software such as R and Stata. Even seemingly simple tasks in these two programs require some basic level of comfort with structuring commands—an understanding that is not taught in these classes, but rather mentioned as a quick and seemingly obvious sidebar. Despite my extensive coursework in economics and mathematics, I am quick to admit that I only became comfortable with Stata via independent learning in a summer research context, and with R via pursuing projects for this blog many months after college graduation.

The implications of coding’s expanding role in many strains of scientific research should not be underestimated. If women are not coding, they are not just missing from computer science—they will increasingly be missing from other disciplines which coding has seeped into.

The big picture: present–future

In other words, I would argue academia is currently faced with the issue of improving female representation in code-intensive fields.[4] As is true with math-intensive fields, the stereotypical beliefs of teachers and parents of younger children “become part of the self-fulfilling belief systems of the children themselves from a very early age,” discouraging women from even attempting to enter code-intensive fields. These beliefs, when combined with Ellenberg’s described “cult of genius” (a mechanism that surrounded mathematics and now also applies to the atmosphere in computer science), are especially dangerous.

Given the small percentage of women in these fields at the undergraduate level, there is limited potential growth in female representation along the academic pipeline—that is, at the doctoral and professorial levels. While coding has opened up new, incredible directions for research in many of the sciences, its evolving importance also can yield gender imbalances due to the same dynamics that underlie underrepresentation in math-intensive fields.

Footnotes

[1] Unfortunately, we cannot extend this year range back before 2002 since earlier numbers were solely presented for broader discipline categories, or parent science categories—economics and anthropology would be grouped under the broader term “social sciences,” while astronomy and chemistry would be included under the term “physical sciences.”

[2] The NSF differentiates between science and engineering as the latter is often described as an application of the former in academia. While engineering displays an enormous gender imbalance in favor of men, I limit my discussion here to disciplines that fall under the NSF’s science category.

[3] The latter viewpoint does have some scientific backing. The paper “Nonlinear Psychometric Thresholds for Physics and Mathematics” supports the notion that while greater work ethic can compensate for lesser ability in many subjects, those below some threshold of mathematical capacities are very unlikely to succeed in mathematics and physics coursework.

[4] On a positive note, atmospheric sciences, which often involves complex climate modeling techniques, has experienced large gains in female representation at the undergraduate level.

Speaking of coding…

Check out my relevant Github repository for all data and R scripts necessary for reproducing these visuals.

Thank you to:

Ally Seidel for all the edits over the past few months! & members of NYC squad for listening to my ideas and debating terminology with me.


© Alexandra Albright and The Little Dataset That Could, 2016. Unauthorized use and/or duplication of this material without express and written permission from this blog’s author and/or owner is strictly prohibited. Excerpts, accompanying visuals, and links may be used, provided that full and clear credit is given to Alex Albright and The Little Dataset That Could with appropriate and specific direction to the original content.

Alex’s Adventures in Academia-land

Intro

While I usually use this blog to work on casual pet projects, I wanted to take an opportunity to use the medium to discuss academic research. Writing is an instinctive mechanism for me to process my thoughts on a topic, but it is one that I use sparingly to discuss the meta-narrative of my own decisions and behavior. The impetus for this self-reflection is the following exciting news: I’ll be pursuing my PhD in economics at Harvard starting this fall! The decision has naturally prompted me to think about my adventures thus far in the academic sphere and the scope of my ambitions and interests.

Think of this as a more organized and open outlet for many of the words (written, spoken, and silently thought) that have bounced around my head throughout the (now 100% finished!) applications process. This post contains a mixture of excerpts from academic personal statements from PhD applications as well as even undergraduate ones (turns out the overwhelming majority of my college application essays involved math in some way, shape, or form).[1] The purpose of this piece is multi-pronged: I’m hoping to (Part I) introduce my interest in economics research on a personal level, (Part II) clearly outline research questions and topics that I have worked on, and (Part III) describe potential eventual research ambitions.[2]

Part I: The number friends

A framed piece of legal paper hung in my parents’ room for nearly a dozen years. The numbers 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9 were etched onto the page. Each held a shade or two of color, leaked from a gifted box of gel pens, within its skinny outline. A speech bubble burst from each number so it could introduce itself or articulate feelings that might be beyond its self-quantification. ‘9’ philosophizes that “it is hard to be odd,” while ‘1’ grumbles over lonesomeness. Atop this paper is the simply written title “The Number Friends.”[3]

Many of my childhood memories are inseparably intertwined with numbers. Learning the exponents of 2 while poking my head into the ice cream freezer at our local deli. Multiplying numbers by wicking my finger against moisture on my grandmother’s 1980 Volvo. Calculating and writing up the winning percentages of baseball teams on the white board in our living room. (It’s an understatement to say that the 2000 subway series was an exciting time in my early life.) To cut to the chase, I was always fond of numbers. My numbers—those I played with as though they were a set of stuffed animals in my living room—hardly resemble those many people groan about—their dusty gray, corporate counterparts.

Despite my interest in the numbers that were stacked on top of each other to fill the standings in the sports section, I grew up ignoring a word that often found itself on adjacent pages of the inky paper— “economics.” The word always seemed to be coming from the lips of men in suits who carried leather briefcases and drank dark, serious coffees. It was a word that I did not associate with anything but Mr. Monopoly—that is, until my senior year of high school when I took an economics class for the first time. Carrying the weightless tools of microeconomics outside of the classroom, I quickly found myself internally modeling the grumpiness of my classmates based on the outside temperature, the day’s schedule type, the quality of their chosen lunch, and the morning delays (or, hopefully, lack thereof) on their regular subway line; explaining my teenage decisions to my parents by implicitly highlighting our very different utility functions; and even debating how one could “optimally” match up students for prom.[4] Imagine my joy in 2012 when Alvin E. Roth won the Economics Nobel for work that redesigned the mechanism for students to select into my exam high school (Stuyvesant High School[5])! The eventual knowledge that groundbreaking work in game theory and market design had implicitly played a role in my presence at that school and, accordingly, in my first foray into economics was incredibly exciting and inspiring. My innate adoration of mathematics and logic combined with my attention to the dynamics of the human world around me molded me into a young economist.[6]

Part IIa: Early research exposure

In my undergraduate studies, I eagerly continued formulating arguments and theories using the building blocks of microeconomic theory and began to seek out academic opportunities to explore these interests. In particular, my fondness for behavioral economics was solidified when I earned a job as a Research Assistant to Professor Sarah Jacobsen during my junior year and discovered how assumptions of rational choice do not necessarily hold in human decision-making. In helping evaluate the results of experimental economic studies, I was intrigued by the gap between seemingly concrete theory and the realities of human behavior.[7] I dove deeper into economics research by working on campus at Williams the following summer as a 1957 Research Fellow for Professor Yung Suk Lee, focusing on a project about the expansion of exam schools and opportunities for academic achievement. In this role, I used knowledge of exam cutoffs for admission into specialized New York exam schools and compared academic outcomes for students who were at the margin (both above and below cutoffs) to investigate the much-debated impact of these schools on later academic success. As well as exposing me to statistical methodologies such as regression discontinuity design, the summer taught me how to work independently and probe assumptions and logical frameworks at the core of well-respected studies.
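For the curious, the regression discontinuity logic above can be sketched in a few lines: fit a line on each side of the cutoff and compare the intercepts. This is a toy illustration with simulated data, not the actual exam-school analysis; the cutoff, bandwidth, and planted effect size are all made up.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 2000
score = rng.uniform(-1, 1, n)   # distance from a (made-up) admission cutoff at 0
true_jump = 2.0                 # planted effect of crossing the cutoff
outcome = 1.0 + 0.5 * score + true_jump * (score >= 0) + rng.normal(0, 0.1, n)

def rd_estimate(score, outcome, bw=0.5):
    """Local linear RD: fit a separate line on each side of the cutoff
    within a bandwidth and compare the two intercepts at the cutoff."""
    left = (score < 0) & (score > -bw)
    right = (score >= 0) & (score < bw)
    fit_left = np.polyfit(score[left], outcome[left], 1)
    fit_right = np.polyfit(score[right], outcome[right], 1)
    return np.polyval(fit_right, 0.0) - np.polyval(fit_left, 0.0)

effect = rd_estimate(score, outcome)  # should land near the planted jump of 2.0
```

The comparison is credible only for students near the cutoff, which is exactly why the marginal students above and below mattered in the project.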

Part IIb: Wine economics as senior thesis

At the end of my junior year, I was lucky enough to be awarded the Carl Van Duyne Prize in Economics and received funding to pursue a senior honors thesis; this opportunity was the catalyst for the start of my self-directed economics research. My project focused on the intersection of applied econometrics and behavioral economics and examined the dynamic response of prices in the wine market to wine critic reviews. Since consumers have often not experienced a given wine ex ante when considering what to buy, reviews and ratings of quality play a consequential role in shaping consumer and market dynamics. My fascination with this subject was derived from the knowledge that, though ratings measure quality, they also influence consumers independent of their accuracy; for this reason, my curiosity about how researchers could disentangle the concepts of hype and quality grew.

While other economists have studied similar topics, no previous work had defined hype and quality as unobserved concepts. Because I defined these two dimensions of a product as unobserved, a naive cross-sectional regression would not have sufficed for comparing their respective roles. Instead, I used a panel structural vector autoregression methodology to approach this topic from a new angle. (For more on this method, see Pedroni 2013.) I exploited knowledge of the dynamics of an online wine community (CellarTracker) as well as the behavior of the consumer rating mechanism in order to construct short-run restrictions to identify structural shocks. By combining substantive knowledge of wine and the wine-drinking community with statistical techniques, I was able to work on a novel approach to a continuously intriguing problem.

I continue to work with my advisor Professor Peter Pedroni on translating the concepts beyond the scope of wine to broader research pertaining to high-end goods. In fact, I’m going to the American Association of Wine Economists Meeting in Bordeaux to present on this in June![8] In preparing a paper for conference submission, we treat information from expert reviews of high-end goods as a part of a broader signal extraction problem tackled by consumers of such goods. (More to come on this soon…) During June 2015, I presented this ongoing work at the interdisciplinary Connected Life Conference at Oxford University, which fostered collaboration with computer scientists, sociologists, and other researchers.[9]

Part IIc: Working at the intersection of law and economics @ Stanford

Since graduating from Williams, I have worked with Professor John Donohue at Stanford Law School as a Research Fellow.[10] In this pre-doctoral role, I work on projects at the intersection of law and economics, with a particular focus on the economics of crime and related econometric and statistical methodologies. For instance, I got to play a large role in developing and reviewing the paper “The Empirical Evaluation of Law: The Dream and the Nightmare” (published in the American Law and Economics Review).[11] This paper charts the enormous advances in estimating causal effects of laws and policies in the past few decades and points out the frequency of conflicting studies on identical questions. Given the conflicting nature of many studies, it can be hard to know what should be believed, and the media, think tanks, and others often exploit this difficulty to promote certain studies for private political or social agendas. Accordingly, in discussing the methodological soundness of various approaches, this article seeks to begin a discussion about how we want to manage the translation between research and media coverage, especially when it comes to politically contentious topics.

On a related note, I am currently working on a project that uses a statistical technique called synthetic controls (see Abadie & Gardeazabal 2003 and Abadie, Diamond, & Hainmueller 2009) to look at the impact of right-to-carry laws on crime in the United States. The impact of right-to-carry gun laws on crime has been debated within both the academic community and the public sphere for decades. Panel data studies are often extremely sensitive to minor changes in the choice of explanatory variables. To address some of these inherent weaknesses of panel data models, we are using the synthetic controls methodology, which generates counterfactual units by creating a weighted combination of control units that are similar (in terms of the pre-treatment period) to the treated unit. By working on new approaches to these sorts of questions, we seek out methods that generate robust results and have the potential to help guide policy decisions in pivotal areas, where slicing and dicing numbers can be done to fit virtually any policy agenda. The broader impact of creating robust decision-making processes for analyzing controversial policies is one of the aspects of economics about which I am most passionate.
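The synthetic controls idea described above (choosing a weighted combination of control units to match the treated unit's pre-treatment path) can be sketched as a constrained least-squares problem. This is a toy illustration with simulated data, not the right-to-carry analysis; the donor pool and the planted true weights are made up.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
T_pre, n_controls = 20, 5
controls = rng.normal(size=(T_pre, n_controls))     # pre-treatment outcomes for donors
treated = 0.6 * controls[:, 0] + 0.4 * controls[:, 1]  # treated unit built from known weights

def synth_weights(y_pre, X_pre):
    """Find non-negative weights summing to 1 whose weighted combination of
    control units best matches the treated unit's pre-treatment path."""
    k = X_pre.shape[1]
    res = minimize(
        lambda w: np.sum((y_pre - X_pre @ w) ** 2),
        np.full(k, 1.0 / k),
        bounds=[(0.0, 1.0)] * k,
        constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1.0}],
        method="SLSQP",
    )
    return res.x

w = synth_weights(treated, controls)  # should recover weights near (0.6, 0.4, 0, 0, 0)
```

Post-treatment, the gap between the treated unit and `controls @ w` serves as the estimated effect; the simplex constraint on the weights is what keeps the counterfactual an interpolation of real donors rather than an extrapolation.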

Part IIIa: Potential research ambitions in economics

During PhD visits, it is common to pitch your interests to professors. At the macro level (and using some slick economics jargon), I am most interested in behavioral economics and applied microeconomics. Applied microeconomics is a lovably large umbrella term that easily contains both urban economics and law and economics; therefore, the previous sentence adequately articulates both my interest in the effects of psychological/social/cognitive/emotional factors on decision making and my interest in the application of microeconomic theory to the study of crime, cities, law, and education. (That undoubtedly leaves space for a lot of potential research topics!)

While I have a number of continuing interests, such as the reputational influence of experts in networks as investigated in the wine project (in the behavioral realm), or economics of crime topics at Stanford, I believe one of the ripest and most important areas for economic research is actually a union of behavioral economics with the economics of crime. That is, further investigating how people find themselves participating in crime.

I am often struck by how often individuals, myself included, buy into illusions of choice. It is tempting to view one’s accomplishments as essentially a function of personal social/academic merit. This is especially true among the more privileged among us—those of us who grew up benefitting from the financial success of family members, the color of our skin, and, overall, positive reinforcement in most facets of our lives. I became aware of the influence of environmental behavioral factors while observing my own behaviors in a school context. In high school, I was lucky enough to be a beneficiary of overwhelmingly positive forces (driven/ambitious peers and thoughtful/encouraging teachers). The profound influence of positive classrooms like my own can be easily seen in a recent study by Card and Giuliano. The study found that participation by “non-gifted” students in a “gifted” classroom led to significant achievement gains for minority students (gains of 0.5 standard deviations in reading/math). Incredibly, the authors did not attribute the gains to teacher quality or peer effects, but rather to “a combination of factors like teacher expectations and negative peer pressure that lead high-ability minority students to under-perform in regular classes but are reduced in a GHA classroom environment”!

While education topics are increasingly receiving a behavioral treatment in the literature (due in part to the ability to fashion experiments in classrooms and, potentially, due to the less politically contentious nature of education), the current state of the economics of crime is still deeply entrenched in Beckerian ideas of deterrence–criminals make cost-benefit calculations in their minds and then use these to inform decisions. This type of reasoning (which is not incorrect so much as it is lacking in dimensions of the human experience) has, over the past decades, led to piles and piles of papers trying to separate the impact of sentence enhancements (seen around the time of the 1990’s crime decline) into an incapacitation effect (people are off the street in prison and thus incapable of committing crimes) and a deterrence effect (people are scared off of committing crimes because of the greater cost). What with our improved notions of behavioral mechanisms and the current well-deserved focus on incarceration levels, policies from the 1990’s (specifically, the 1994 crime bill), and interactions between police and disadvantaged communities, there is no doubt that further studies of the social interactions in crime networks (see the classic Glaeser 1996 paper) as well as environmental factors (think Reyes’ work on lead exposure) are warranted to better inform policy as well as our core human understanding of how people’s lives diverge so starkly. Illusions of choice are powerful (as well as enticing to those at the helm of the ship) and are accordingly worth a hefty dose of skepticism from the community at large. (There are many more ideas to develop and papers to cite in these paragraphs, but I’ll let this marinate as it is for the moment.)

On herd behavior in particular: I have no qualms in asserting that I have benefited immensely from herding behaviors that harm others who simply gained consciousness in a different social/economic environment. The same strains of herd behavior, which pulse through networks (those of academics and those of drug traffickers alike), lead to disparate outcomes depending on the starting point and environment in which they occur.

Beyond behavior and crime, I have a number of other developing research interests on my eventual topic wishlist.

Part IIIb: Things are about to get meta

On a somewhat meta note, I feel strongly about making economics research (and, more generally, data-driven research) replicable and accessible to the public. I believe that open sourcing datasets and code for projects not only facilitates how different projects can build off of one another but also encourages a more diverse group of individuals to explore quantitative methods.[12] By making work publicly accessible, researchers can challenge themselves to defend their ideas and assertions to any interested individuals, rather than limiting themselves to discussion in academic bubbles. I strongly believe that this renders research dynamics fundamentally more efficient, as public-facing projects allow for a faster and smoother exchange of ideas, which can lead to superior projects in the long run. This sort of openness on the part of researchers often allows for great collaborations—my wonderful friend/public speaking inspiration Sarah Michael Levine and I originally bonded via Twitter (!) and then ended up writing a paper together on the shortcomings of mainstream data science when applied to social good projects (which we got to present at the Bloomberg Data for Good Exchange 2015). In my personal experience, making work and ideas available to a larger audience has led to a number of incredible opportunities to work with talented people on a range of captivating questions that engage the public and illustrate the fundamental creativity that is inherent to, but often ignored in, quantitative work.

Endnote

In reviewing this writing, I am acutely aware of the fact that I tend to over-narrativize my own experiences, injecting meaning into the twists and turns that may just be segments of a random walk. However, while there might not be some grand meaning in an individual’s path towards the degree that we call a PhD, I do strongly believe in the profound nature of social science research more generally—self-awareness is fundamentally human and our ability to study our own machinations is something that we find irresistible.[13] The letters we desire to have traipse behind our names are trivial in the long run, but the questions we ask in pursuit of them ultimately stem from the core of personhood—consciousness and the curiosity that comes with it.[14][15]

Footnotes

[1] Concretely describing motivations, processes, and goals for research is an element of communication in academia that I believe can be much improved by embracing non-traditional/technologically-driven mediums of discussion. So, why not take the time to try and practice communicating with the transparency and openness that I often crave from other researchers? (Warning: this is going to be long! I am working through caches of thoughts that have managed to build themselves into some pretty hefty structures over the years.)

[2] In thinking about that oft-cited 2 x 2 matrix that contains four quadrants dedicated to simple/complex ideas vs. simple/complex writing, the dream is to eventually make it into that slick complex ideas & simple writing quadrant.

[3] Oh, the trials and tribulations of being an only child… (“Some imaginary friends you never outgrow.”)

[4] Think utility maximization problems. If the application of mathematical concepts to questions of romance is interesting to you: check out the marriage problem.

[5] Go Vixens!/Go Phoenix!/Go Renegades! (The last one was a much needed improvement from the softball team’s previous mascot—the Chipmunks.)

[6] In this vein of personal narrative, see also Claudia Goldin’s “The Economist as Detective.”

[7] In technical terms, I ran paired t-tests and signed-rank tests in order to analyze each survey participant’s level of consistency in terms of his or her risk-taking decisions.
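For the curious, this kind of paired consistency check can be sketched with scipy. This is a hypothetical example with simulated scores; the original analysis used different data and software.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Hypothetical paired risk-taking scores for 40 participants, measured twice,
# with a planted systematic shift between the two measurements.
first = rng.normal(5.0, 1.0, size=40)
second = first + rng.normal(0.5, 0.5, size=40)

t_stat, t_p = stats.ttest_rel(first, second)   # parametric paired t-test
w_stat, w_p = stats.wilcoxon(first, second)    # nonparametric Wilcoxon signed-rank test
```

Running both is standard practice: the signed-rank test serves as a robustness check that does not lean on the normality assumption behind the t-test.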

[8] Hopefully, I will soon have some slides that can help in communicating the relevant ideas.

[9] Check out the call for papers!

[10] I originally found out about Prof Donohue through reading Freakonomics (a commonly cited catalyst for people’s realization that economics can be clever and creative!) my sophomore year, since the abortion and crime chapter is based on “The Impact of Legalized Abortion on Crime,” one of his articles with Steven Levitt of UChicago.

[11] I saw the journal that contained this article (and my name typed within it) in the flesh a few weeks ago at Harvard before some meetings. That experience immediately quashed some hefty feelings of impostor syndrome.

[12] Papers, data, and methods should be available to the public rather than only available to those at institutions of higher education…or, even worse, only available through asking nicely via email with shiny credentials. (Once, a professor I emailed for data responded that he was retiring and moving across the country, so he had thrown out all his papers and thus could not help me. I often feel more like an investigative reporter than an academic when tracking down data!)

[13] Research in this context should not be solely interpreted as academic research! In fact, I would argue that every individual conducts casual research in the day-to-day, while the PhD is an example of an institutionalized and formal medium for research.

[14] Listen to this recent episode of Radiolab for the following relevant quote and much more: “Consciousness—for some reason, for some reason, one animal on the planet, and only one that we can know, seems to string into this very elaborate sense of self-awareness—we don’t know how it happened, we don’t know why it happened, it just did”

[15] Insightful discussions that stem from that very curiosity should not be limited to only those with a PhD. So, social network, let’s talk.


© Alexandra Albright and The Little Dataset That Could, 2016. Unauthorized use and/or duplication of this material without express and written permission from this blog’s author and/or owner is strictly prohibited. Excerpts, accompanying visuals, and links may be used, provided that full and clear credit is given to Alex Albright and The Little Dataset That Could with appropriate and specific direction to the original content.

The Curious Case of The Illinois Trump Delegates

Intro

This past Wednesday, after watching Hillary Clinton slow-motion strut into the Broad City universe and realizing that this election has successfully seeped into even the most intimate of personal rituals, I planned to go to sleep without thinking any more about the current presidential race. However, somewhere in between Ilana’s final “yas queen” and hitting the pillow, I saw David Wasserman’s FiveThirtyEight article “Trump Voters’ Aversion To Foreign-Sounding Names Cost Him Delegates.”

Like many readers, I was immediately drawn to the piece’s fundamentally ironic implication that Trump could have lost delegates in Illinois due to the very racial resentment that he espouses and even encourages among his supporters. The possibility that this could be more deeply investigated was an energizing idea, one which had already inspired Evan Soltas to do just that as well as to make public his rich-in-applications-and-possibilities dataset. With this dataset in hand, I tried to complement the ideas from the Wasserman and Soltas articles by building some visual evidence. (Suffice it to say, I did not end up going to sleep for a while.)

To build on the meaningful work that the two articles have done, I will first quickly outline their scope and conclusions, and then present the visuals I’ve built using Soltas’ publicly available data. Consider this a politically timely exercise in speedy R scripting!

Wasserman’s FiveThirtyEight piece & Soltas’ blog post

In the original article of interest, Wasserman discusses the noteworthy Illinois Republican primary. He explains that,

Illinois Republicans hold a convoluted “loophole” primary: The statewide primary winner earns 15 delegates, but the state’s other 54 delegates are elected directly on the ballot, with three at stake in each of the state’s 18 congressional districts. Each campaign files slates of relatively unknown supporters to run for delegate slots, and each would-be delegate’s presidential preference is listed beside his or her name.

Given that the delegates are “relatively unknown,” one would assume that delegates in the same district who list the same presidential preference would earn similar numbers of votes. However, surprisingly, Wasserman found that this did not seem to be the case for Trump delegates. In fact, there is a striking pattern in the Illinois districts with the 12 highest vote differentials: “[i]n all 12 cases, the highest vote-getting candidate had a common, Anglo-sounding name” while “a majority of the trailing candidates had first or last names most commonly associated with Asian, Hispanic or African-American heritages.” These findings, while admittedly informal, strongly suggest that Trump supporters are racially biased in their delegate voting behaviors.

Soltas jumps into this discussion by first creating a dataset on all 458 people who ran for Illinois Republican delegate spots. He merges data on the individuals’ names, districts, and candidate representation with a variable that could be described as a measure of perceived whiteness–the non-Hispanic white percentage of people sharing the individual’s last name, as determined from 2000 US Census data. The inclusion of this variable is what makes the dataset so exciting (!!!) since, as Soltas explains, it gives us an “objective measure to test the phenomenon Wasserman discovered.”

The article goes on to confirm the legitimacy of Wasserman’s hypothesis. In short, “Trump delegates won significantly more votes when they had ‘whiter’ last names relative to other delegates in their district,” and this type of effect does not exist for the other Republicans.

Visual evidence time

I now present a few visuals I generated using the aforementioned dataset to see Soltas’ conclusions for myself. First things first, it’s important to note that some grand underlying mechanism does not jump out at you when you simply look at the association between perceived whiteness and vote percentage for all of Trump’s Illinois delegates:

[Figure 1: Vote percentage vs. perceived whiteness for all of Trump’s Illinois delegates, pooled across districts]

The above graph does not suggest any significant relationship between these two numbers attached to each individual delegate. This is because the plot shows delegates across all different districts, which vote for Trump at different levels, but compares their absolute variable levels. What we actually care about is comparing voting percentages within the same district, but across different individuals who all represent the same presidential hopeful. In other words, we need to think about the delegates relative to their district-level context. To do this, I calculate vote percentages and whiteness measures relative to the district: the percentage point difference between a Trump delegate’s vote (or whiteness) percentage and the average vote (or whiteness) percentage among Trump delegates in that district. (Suggestions welcome on different ways of doing this for visualization’s sake!)
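The district-relative transformation described above amounts to demeaning each variable within its district. A minimal sketch in pandas (the original analysis was done in R, and the column names here are made up):

```python
import pandas as pd

# Made-up delegate data: vote share and perceived whiteness by district.
df = pd.DataFrame({
    "district": [1, 1, 1, 2, 2, 2],
    "vote_pct": [40.0, 35.0, 30.0, 20.0, 25.0, 15.0],
    "whiteness": [90.0, 80.0, 70.0, 60.0, 95.0, 85.0],
})

# Subtract each district's mean from its members: the "relative to district" measure.
for col in ["vote_pct", "whiteness"]:
    df[col + "_rel"] = df[col] - df.groupby("district")[col].transform("mean")
```

By construction, the relative measures average to zero within every district, which strips out cross-district differences in Trump support.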

[Figure 2: Relative vote percentage vs. relative perceived whiteness for Trump delegates, measured within districts]

Now that we are measuring these variables (vote percentage and whiteness measure) relative to the district, there is a statistically significant association beyond even the 0.1% level. (The simple linear regression Y~X in this case yields a t-statistic of 5.4!) In the end, the interpretation of the simplistic linear regression is that a 10 percentage point increase in a Trump delegate’s perceived whiteness relative to the district yields a 0.12 percentage point increase in the delegate’s vote percentage relative to the district. (I’m curious if people think there is a better way to take district levels into account for these visuals–let me know if you have any thoughts that yield a simpler coefficient interpretation!)
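The simple regression of relative vote percentage on relative whiteness can be sketched as follows. This uses simulated data with a slope planted near the reported interpretation (0.12 percentage points per 10-point whiteness change, i.e. a slope of roughly 0.012); it illustrates the calculation rather than replicating the actual result.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
# Simulated district-relative measures with a planted slope of 0.012.
rel_whiteness = rng.normal(size=200)
rel_vote = 0.012 * rel_whiteness + rng.normal(scale=0.01, size=200)

fit = stats.linregress(rel_whiteness, rel_vote)
t_stat = fit.slope / fit.stderr  # analogous to the reported t-statistic
```

A t-statistic above roughly 3.3 corresponds to significance beyond the 0.1% level in a sample this size, which is the threshold the reported t-statistic of 5.4 clears comfortably.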

The last dimension of this discussion requires comparing Trump to the other Republican candidates. Given the media’s endless coverage of Trump, I would not have been surprised to learn that this effect impacts other campaigns as well but was simply never reported. But Wasserman and Soltas argue that this is not the case. Their claims are further bolstered by the following visual, which recreates the most recent Trump plot for all 9 candidates who had sufficient data (it excludes Gilmore, Huckabee, and Santorum):

[Figure 3: Relative vote percentage vs. relative perceived whiteness by candidate, with regression lines and 95% confidence intervals]

It should be immediately clear that Trump is the only candidate for whom there is a positive, statistically significant association between the two relative measures. While Kasich has an upward-sloping regression line, the corresponding 95% confidence interval demonstrates that the coefficient on relative perceived whiteness is not statistically significantly different from 0. Employing the whiteness measure in this context allows us to provide quantitative evidence for Wasserman’s original intuition that this effect is unique to Trump–thus, “lend[ing] credibility to the theory that racial resentment is commonplace among his supporters.”

The role of perceptions of whiteness

Wasserman’s article has incited an outpouring of genuine interest over the past few days. The fascinating nature of the original inquiry combined with Soltas’ integration of a perceived whiteness measure into the Illinois delegate dataset provides a perfect setting in which to investigate the role racial resentment is playing in these particular voting patterns, and in the election on the whole.

Code

My illinois_delegates Github repo has the R script and csv file necessary to replicate all three visuals! (We know data, we have the best data.)

