On ego and sharing ideas

Line Charts, Models
A behavioral Nobel means a behavioral blog post

In honor of Richard Thaler’s recent Nobel prize win, I give you a post on a behavioral economics topic! Welcome to round 2 of A G2 Talks Models. Today’s topic: ego utility and the decision to speak up in class à la Koszegi 2006, an approach to belief-based utility adapted from Psychology and Economics lecture.

An admission of ego

I have a confession to make. I want you to think I’m smart. There, I said it. It is important to my self-image that you (yes, you on the other side of this screen!), my academic peers, and even the man (boy?) who tipsily mansplained the Monty Hall problem to me perceive me as intelligent. That is, intelligence as signaled by the occasional insightful comment, deep question, or quality idea that I get up the nerve to share.

Classrooms are environments in which lots of signaling of such smarts takes place. Professors ask questions, both rhetorical and not. They let us marinate in pregnant pauses and make a call for ideas. There’s a beat in which the tiny neuron-bureaucrat who is tasked with managing and organizing my brain activity flips through some nascent concepts and responses. Is this any good? she asks. Her supervisor isn’t sure either. Is this relevant? Yeah, but is it too obvious? The supervisor prods her: time is of the essence. Internal hesitation over whether or not to share an idea in class still plagues me even after multiple decades of participation in the exercise. The difference is that now, in 18th grade, I can explicitly model that very idea-sharing decision.

A twist on classical utility: Enter ego…

To model this decision, I enter into a belief-based utility world. I define my utility function as follows: u = r – e + g√p, with r being the classroom response to the idea, e being the effort cost of sharing the idea, p being the probability that I think it’s a quality idea, and g being a parameter for “ego utility.” In classical utility world, this g√p term would not exist; I would simply weigh the benefit of sharing the idea r against the cost of sharing the idea e. Moreover, in a departure from classical economics assumptions, this form of belief-based utility displays information aversion: since √p is concave, I would rather sit with my uncertain belief than risk having it resolved. Thus the square root on the p.

Now, let’s run through the outcomes based on class participation or class non-participation. If I take the jump and share my idea, I always expend some amount of effort e>0. Meanwhile, the benefit I derive depends on the ex post observable quality of the idea, as measured by the classroom response. If the idea was high quality, I gain r=1. If instead it was lacking or, shall we say, basic, I gain r=b where 1>b>0. If I keep my thoughts to myself, then e=0 and r=0.

In effect, if I share my idea, I receive expected utility u = p(1 – e + g√1) + (1 – p)(b – e + g√0). The first term on the right-hand side is my perceived probability that the idea is of high quality multiplied by the associated payoff, while the second term is my perceived probability that the idea is basic multiplied by that associated payoff. Rearranging terms, u = p(1+g) + (1-p)b – e. Meanwhile, if I stay silent, my payoff is simply u = g√p.

Using this simple framework, I will share my idea with the class if and only if p(1+g) + (1-p)b – e > g√p. Simplified, I share my idea if and only if g(p-√p) + p + (1-p)b – e > 0.
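This decision rule is simple enough to sketch in code. Below is a minimal Python sketch (the function name and structure are my own, not from Koszegi 2006):

```python
import math

def shares_idea(p, g, b, e):
    """Share the idea iff g*(p - sqrt(p)) + p + (1 - p)*b - e > 0."""
    return g * (p - math.sqrt(p)) + p + (1 - p) * b - e > 0

# With no ego utility (g = 0), the rule collapses to the classical
# cost-benefit comparison p + (1 - p)*b > e:
print(shares_idea(0.5, g=0, b=0.5, e=0.01))    # True
# With a huge ego-utility parameter, I stay silent at any interior p:
print(shares_idea(0.5, g=100, b=0.5, e=0.01))  # False
```

The two calls preview the limiting cases discussed next: no ego utility recovers the classical trade-off, while a massive g paralyzes me entirely.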

Given this inequality, we can see that if g goes to infinity (i.e., my ego utility is huge), and p is not 0 or 1, the inequality will never hold, as p-√p will always be negative (since p is a fraction between 0 and 1); this means I will never raise my hand to share my ideas because I am so paralyzed by my massive ego utility. Meanwhile, as p approaches 0 or 1, the g(p-√p) term goes to 0, leaving the decision up to the inequality p + (1-p)b > e. Thus, I will speak if the expected value of the payoff to my comment exceeds the effort cost. (Recall that this is exactly what I do in the classical utility case in which I have no ego utility.)

While both of the above conclusions seem predictable, the model also makes one notably intriguing prediction. You might expect that the greater my perceived probability that the idea is high quality, the more likely I am to share the idea. Well, this is not true; there is non-monotonicity in p. Say I have a moderate level of ego utility g and my p grows from a low to a higher level. This positive change in p could cause me to put my hand down even though I am now more confident in the quality of my idea. Weird! Ego utility allows there to be a negative correlation between my confidence in my idea’s quality and my willingness to share said idea.

Intuitively, as I become more confident in an idea, not only is there a higher expected benefit to sharing the idea but there is also a higher possible loss of utility due to the ego utility term. The way these two opposing effects spar with one another can lead my hand to go up, down, and up again as my confidence in an idea increases.

Let’s illustrate this surprising concept graphically. We can parametrize the model and make visually explicit how the decision to raise my hand changes with p. Let’s set g=3, b=0.5, e=0.01. Given these values, I will speak my idea if and only if 3(p-√p) + p + (1-p)0.5 – 0.01 > 0; i.e., iff 3.5p – 3√p + 0.49 > 0. As such, I can graph this function over the full range of possible p values from 0 to 1 and accordingly color areas depending on whether or not they correspond to sharing an idea. (I share an idea if the function yields a value greater than 0; otherwise, I do not.)

[Graph: utility from sharing as a function of p, with the regions in which I share shaded in blue]

The above illustrates that I am willing to share an idea when my perceived probability that it is high quality is very low, but no longer willing once that probability reaches a moderately low value. This is evidence of the non-monotonicity in p in this model; I might lower my hand in class to protect my ego.
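The sign changes behind the shaded regions can be reproduced numerically. Here is a quick Python sketch under the same parametrization (g=3, b=0.5, e=0.01); the helper name is mine:

```python
import math

def net_utility(p, g=3, b=0.5, e=0.01):
    # Payoff from sharing minus payoff from silence: g*(p - sqrt(p)) + p + (1-p)*b - e
    return g * (p - math.sqrt(p)) + p + (1 - p) * b - e

# Scan p across [0, 1] and record where the share/no-share decision flips.
grid = [i / 1000 for i in range(1001)]
share = [net_utility(p) > 0 for p in grid]
flips = [grid[i] for i in range(1, len(grid)) if share[i] != share[i - 1]]
print(flips)  # two sign changes: share at low p, silence in between, share at high p
```

With these parameters the crossings land near p ≈ 0.05 and p ≈ 0.41, so my hand goes down precisely as my confidence climbs out of the lowest range.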

Anecdotal evidence, dynamics, and blog posting

I find ego utility fascinating and very believable when reflecting on my own experiences. For one, I have noticed that I often become more silent as conversations sway from topics in which I am a novice to topics about which I am moderately more knowledgeable. I feel acutely aware of the aforementioned tensions in the model; yes, I am more confident in my ideas in this realm, but I now have more to lose if I choose to share them. This is also a hesitation I feel internally when I talk with professors and friends about ideas. If the idea is undeveloped, there is really no harm in sharing it (p is low at that point); but, if I have been working on it and have a higher p, there is now a chance I might realize that my idea was not up to snuff. In this sense, I can sometimes feel myself keeping ideas or projects to myself, as then they can’t be externally revealed to be low quality. I can sit on the sidelines and nurture my pet projects without a care in the world, stroking the ego-related term in my utility function.

But, in a more complex model, perhaps one that better represents my reality, idea quality is improved with idea sharing and collaboration. The model at hand is a one-shot game. I have an idea and I decide whether or not to share it. (The end.) But, in my flesh-and-blood/Stata-and-R universe, ideas do not disappear after that first instance of sharing; they develop dynamically. If I imagine refitting the model to mimic my reality, it is clear that silence for ego appeasement is a strategy that does not pay off long term…

I like to think that this is one reason why I write these posts — to share and accordingly develop ideas. In fact, when I started sharing R code online almost three years ago, I was such a novice that I had a very low p regarding my data visualization capacities. In this way, ego utility was not able to hold me back from openly sharing my scripts. I was a strong advocate for transparency (still am) and at that time didn’t mind at all if my code looked like “a house built by a child using nothing but a hatchet and a picture of a house.” However, if I were to imagine starting blogging now, I could see holding off, as I perceive my probability of being a decent coder as much larger than I did three years ago.

In the end, I am very happy that I chose to start sharing my work when I had a very small p. In fact, if you squint really hard, you can probably see me lounging on the utility function curve, fumbling to use ggplot2, somewhere in that first blue chunk of the graph.

Endnote

This post adapts model mechanics from Koszegi 2006. A Psychology and Economics lecture explicitly inspired and informed this piece. Lastly, here is the R notebook used to create the graphic in this post.


© Alexandra Albright and The Little Dataset That Could, 2017. Unauthorized use and/or duplication of this material without express and written permission from this blog’s author and/or owner is strictly prohibited. Excerpts, accompanying visuals, and links may be used, provided that full and clear credit is given to Alex Albright and The Little Dataset That Could with appropriate and specific direction to the original content.

A Bellman Equation About Nothing

Line Charts, Models

Cold Open [Introduction]

A few years ago I came across a short paper that I desperately wanted to understand. The magnificent title was “An Option Value Problem from Seinfeld” and the author, Professor Avinash Dixit (of Dixit-Stiglitz model fame), therein discussed methods of solving for “sponge-worthiness.” I don’t think I need to explain why I was immediately drawn to an academic article that focuses on Elaine Benes, but for those of you who didn’t learn about the realities of birth control from this episode of 1990s television, allow me to briefly explain the relevant Seinfeld-ism. The character Elaine Benes[1] loyally uses the Today sponge as her preferred form of contraception. However, one day it is taken off the market and, after trekking all over Manhattan, our heroine manages to find only one case of 60 sponges to purchase. The finite supply of sponges poses a daunting question to Elaine… namely, when should she choose to use a sponge? I.e., when is a given potential partner sponge-worthy?

JERRY: I thought you said it was imminent.

ELAINE: Yeah, it was, but then I just couldn’t decide if he was really sponge-worthy.

JERRY: “Sponge-worthy”?

ELAINE: Yeah, Jerry, I have to conserve these sponges.

JERRY: But you like this guy, isn’t that what the sponges are for?

ELAINE: Yes, yes – before they went off the market. But I mean, now I’ve got to re-evaluate my whole screening process. I can’t afford to waste any of ’em.

–“The Sponge” [Seinfeld Season 7 Episode 9]

As an undergraduate reading Professor Dixit’s introduction, I felt supremely excited that an academic article was going to delve into the decision-making processes of one of my favorite fictional characters. However, the last sentence in the introduction gave me pause: “Stochastic dynamic programming methods must be used.” Dynamic programming? Suffice it to say that I did not grasp the methodological context or mathematical machinery embedded in the short and sweet paper. After a few read-throughs, I filed wispy memories of the paper away in some cluttered corner of my mind… Maybe one day this will make more sense to me… 

Flash forward to August 2016. Professor David Laibson, the economics department chair, explains to us fresh-faced G1’s (first-year PhD’s) that he will be teaching us the first part of the macroeconomics sequence… Dynamic Programming. After a few days of talking about Bellman equations, I started to feel as if I had seen related work in some past life. Without all the eeriness of a Westworld-esque robot, I finally remembered the specifics of Professor Dixit’s paper and decided to revisit it with Professor Laibson’s lectures in mind. Accordingly, my goal here is to explain the simplified model set-up of the aforementioned paper and illustrate how basics from dynamic programming can be used in “solving for spongeworthiness.”

Act One [The Model]

Dynamic programming refers to taking a complex optimization problem and splitting it up into simpler recursive sub-problems. Consider Elaine’s decision as to when to use a sponge. We can model this as an optimal stopping problem–ie, when should Elaine use the sponge and thus give up the option value of holding it into the future? The answer lies in the solution to a mathematical object called the Bellman equation, which will represent Elaine’s expected present value of her utility recursively.

Using a simplified version of the framework from Dixit (2011), we can explain the intuition behind setting up and solving a Bellman equation. First, let’s lay out the modeling framework. For the sake of computational simplicity, assume Elaine managed to acquire only one sponge rather than the case of 60 (Dixit assumes she has a general m sponges in his set-up, so his computations are more complex than mine). With that one glorious sponge in her back pocket, Elaine goes about her life meeting potential partners, and yada yada yada. To make the yada yada’s explicit, we say Elaine lives infinitely and meets one new potential partner every day t who is of some quality Qt. Elaine is not living a regular continuous-time life; instead, she gets one romantic option each time period. This sets up the problem in discrete time, since Elaine’s decisions are day-by-day rather than infinitesimally-small-moment-by-infinitesimally-small-moment. If we want to base this assumption somewhat in reality, we could think of Elaine as using Coffee Meets Bagel, a dating app that yields one match per day. I.e., one “bagel” each day.

Dixit interprets an individual’s quality as the utility Elaine receives from sleeping with said person. Now, in reality, Elaine would only be able to make some uncertain prediction of a person’s quality based on potentially noisy signals. The corresponding certainty equivalent [the true quality metric] would be realized after Elaine slept with the person. In other words, there would be a distinction between ex post and ex ante quality assessments—you could even think of a potential partner as an experience good in this sense. (Sorry to objectify you, Scott Patterson.) But, to simplify our discussion, we assume that true quality is observable to Elaine—she knows exactly how much utility she will gain if she chooses to sleep with the potential partner of the day. In defense of that assumption, she does vet potential partners pretty thoroughly.

Dixit also assumes quality is drawn from a uniform distribution over [0,1] and that Elaine discounts the future exponentially by a factor of δ in the interval (0,1). Discounting is a necessary tool for agent optimization problems since preferences are time dependent. Consider the following set-up for illustrative purposes: say Elaine gains X utils from eating a box of jujyfruits today; then, using our previously defined discount factor, she would gain δX from eating the box tomorrow, δ²X from eating it the day after tomorrow, and so on. In general, she gains δⁿX utils from consuming it n days into the future—thus the terminology “exponential discounting.” Given the domain for δ, we know unambiguously that X > δX > δ²X > … and on. That is, if the box of candy doesn’t change between periods (it is always X) and yields positive utility (which clearly it must, given questionable related life decisions), Elaine will prefer to consume it in the current time period. I.e., why wait if there is no gain from waiting? On the other hand, if Elaine wants to drink a bottle of wine today that yields Y utils, but the wine improves by a factor of w>1 each day, then whether she prefers to drink it today or tomorrow depends on whether Y—the present utility gain of the current state of the wine—or δ(wY)—the discounted utility gain of the aged (improved) wine—is greater. (I.e., if δw>1, she’ll wait for tomorrow.) If Elaine also considers up until n days into the future, she will be comparing Y, δ(wY), δ²(w²Y), …, and δⁿ(wⁿY).
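The candy-versus-wine comparison is easy to check with a few lines of code. A minimal Python sketch (the specific numbers for δ, X, w, and Y are illustrative values of my own, not from the post):

```python
delta = 0.9     # daily discount factor (illustrative value)
X = 10          # utils from the box of candy, identical every day
w, Y = 1.2, 10  # wine improves by factor w each day; drinking today yields Y

candy_path = [delta**n * X for n in range(4)]  # X, dX, d^2*X, d^3*X
print(candy_path[0] == max(candy_path))        # True: eat the static good today

# For the improving good, waiting one day pays off iff delta*w > 1:
print(delta * (w * Y) > Y)  # True here, since delta*w = 1.08 > 1
```

Flipping w below 1/δ reverses the second comparison, which is exactly the δw > 1 condition in the text.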

In our set-up Elaine receives some quality offer each day that is neither static (as in the candy example) nor deterministically growing (as in the wine example); rather, the quality is drawn from a defined distribution (the uniform distribution on the unit interval—mainly chosen to allow for straightforward computations). While quality is observable in the current period, the future draws are not, meaning that Elaine must compare her current draw with an expectation of future draws. In short, every day Elaine has the choice whether to use the sponge and gain Qt through her utility function, or hold the sponge for a potentially superior partner in the future. In other words, Elaine’s current value function is expressed as a choice between the “flow payoff” Qt and the discounted “continuation value function.” Since she is utility maximizing, she will always choose the higher of these two options. Again, since the continuation value function is uncertain, as future quality draws are from some distribution, we must use the expectation operator in that piece of the maximization problem. Elaine’s value function is thus:

V(Qt) = max{ Qt , δE[V(Qt+1)] }

This is the Bellman equation of lore! It illustrates a recursive relationship between the value functions for different time periods, and formalizes Elaine’s decision as a simple optimal stopping problem.
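One way to see the Bellman equation at work is to solve it numerically by value function iteration: start from a guess for V, apply the max operator repeatedly, and let the continuation value settle down. Here is a small Python sketch (not from Dixit's paper; the grid size and δ = 0.8 are choices of mine for concreteness):

```python
# Value function iteration for V(Q) = max(Q, delta * E[V(Q')]),
# with Q' ~ U[0,1] approximated by an evenly spaced grid.
delta = 0.8
n = 10001
grid = [i / (n - 1) for i in range(n)]
V = [0.0] * n                        # initial guess: V = 0 everywhere

for _ in range(200):                 # iterate the Bellman operator to a fixed point
    EV = sum(V) / n                  # E[V(Q')] under the uniform draw
    V = [max(q, delta * EV) for q in grid]

continuation = delta * sum(V) / n    # option value of holding the sponge
print(round(continuation, 3))        # -> 0.5
```

For δ = 0.8 the iteration converges to a continuation value of 0.5, which is exactly the sponge-worthiness threshold derived analytically below in Act Three.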

Act Two [Solving for Sponge-worthiness]

To solve for sponge-worthiness, we need to find the value function that solves the Bellman equation and derive the associated optimal policy rule. Our optimal policy rule is a function that maps each point in the state space (the space of possible quality draws) to the action space such that Elaine achieves payoff V(Qt) for all feasible quality draws in [0,1]. The distribution of Qt+1 is stationary and independent of Qt, as the draws are perpetually from U[0,1]. (Note to the confounded reader: don’t think of the space of quality draws as akin to some jar of marbles in conventional probability puzzles—those in which the draw of a red marble means there are fewer red to draw later—since our distribution does not shift between periods. For more on other possible distributions, see Act Four.) Due to the aforementioned stationarity and independence, the value of holding onto the sponge [δEV(Qt+1)] is constant for all days. By this logic, if a potential partner of quality Q’ is sponge-worthy, then Q’ ≥ δEV(Qt+1)! Note that for all Q” > Q’, Q” > δEV(Qt+1), so any partner of quality Q” must also be considered sponge-worthy. Similarly, if a person of quality Q’ is not sponge-worthy, then δEV(Qt+1) ≥ Q’ and for all Q” < Q’, Q” < δEV(Qt+1), so any partner of quality Q” must also not be sponge-worthy. Thus, the functional form of the value function is:

V(Qt) = Qt if Qt ≥ Q*, and V(Qt) = δEV(Qt+1) = Q* if Qt < Q*

In other words, our solution will be a threshold rule where the optimal policy is to use the sponge if Qt > Q* and hold onto the sponge otherwise. The free parameter we need to solve for is Q*, which we can conceptualize as the all-powerful quality level that separates the sponge-worthy from the not!

Act Three [What is Q*?]

When Qt = Q*, Elaine should be indifferent between using the sponge and holding onto it. This means that the two arguments in the maximization should be equal–that is, the flow payoff [Q*] and the discounted continuation value function [δEV(Qt+1)]. We can thus set Q* = δEV(Qt+1) and exploit the fact that we defined Q ~ U[0,1] to make the following calculations:

Q* = δEV(Qt+1) = δ[ Q*·Pr(Qt+1 ≤ Q*) + E(Qt+1 | Qt+1 > Q*)·Pr(Qt+1 > Q*) ]
= δ[ (Q*)² + ((1+Q*)/2)(1−Q*) ] = δ[ (Q*)² + (1−(Q*)²)/2 ]

⟹ δ(Q*)² − 2Q* + δ = 0 ⟹ Q* = [1 ± √(1−δ²)] / δ

The root with the plus sign yields a Q* > 1, which would mean that Elaine never uses the sponge. This cannot be the optimal policy, so we eliminate this root. In effect, we end up with the following solution for Q*:

Q* = [1 − √(1−δ²)] / δ

Given this Q*, it is optimal to use the sponge if Qt > Q*, and it is optimal to hold the sponge if Q* ≥ Qt. Thus, as is required by the definition of optimal policy, for all values of Qt:

V(Qt) = max{ Qt , δEV(Qt+1) } = max{ Qt , Q* }

We can interpret the way the Q* threshold changes with the discount factor δ using basic economic intuition. As δ approaches 1 (Elaine approaches peak patience), Q* then approaches 1, meaning Elaine will accept no partner but the one of best possible quality. At the other extreme, as δ approaches 0 (Elaine approaches peak impatience), Q* then approaches 0, meaning Elaine will immediately use the sponge with the first potential partner she meets.
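This comparative static is easy to trace numerically with the closed form we just derived. A quick Python sketch (the sample δ values are my own picks):

```python
import math

def q_star(delta):
    # Closed-form threshold: Q* = (1 - sqrt(1 - delta**2)) / delta
    return (1 - math.sqrt(1 - delta**2)) / delta

for d in (0.1, 0.5, 0.8, 0.99):
    print(d, round(q_star(d), 3))
# Q* rises monotonically in delta: from near 0 (impatience: any partner
# will do) toward 1 (patience: only the best is sponge-worthy).
```

The printed thresholds climb with δ, matching the intuition above; δ = 0.8 returns exactly the Q* = 0.5 used in the graph that follows.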

To make this visually explicit, let’s use a graph to illustrate Elaine’s value function for some set δ. Take δ=0.8, then Q*=0.5, a clear-cut solution for the sponge-worthiness threshold. Given these numbers, the relationship between the value function and quality level can be drawn out as such:

[Graphs: the two pieces of the value function (the flow payoff Qt and the constant continuation value Q* = 0.5), and the full value function V(Qt) = max{Qt, Q*} traced in black]

What better application is there for the pgfplots package in LaTeX?!

The first diagram illustrates the two pieces that make up Elaine’s value function, while the second uses the black line to denote the value function itself, as the value function takes on the maximum value across the space of quality draws. Whether the value function conforms to the red or green line hinges on whether we are in the sponge-worthy range or not. As explained earlier, before the sponge-worthiness threshold, the option value of holding the sponge is the constant Q* = δEV(Qt+1). After hitting the magical point of sponge-worthiness, the value function moves one-for-one with Qt. Note that alternative choices for the discount rate would yield different Q*’s, which would shift the red line up or down depending on the value, which in turn impacts the leftmost piece of the value function in the second graph. These illustrations are very similar to diagrams we drew in Professor Laibson’s module, but with some more advanced technical graph labelings than what we were exposed to in class (i.e., “no sponge for you” and “sponge-worthy”).

Act Four [Extensions]

In our set-up, the time and resource dependence of the value function is simple, since there is one sponge and Elaine is infinitely lived. However, we could solve for a value function with more complex time and resource dependence. This could yield a more realistic solution that takes into account Elaine’s age and mortality and the 60 sponges in the valuable case of contraception. We could even perform the sponge-worthiness calculations for Elaine’s monotonically increasing string of sponge quantity requests: 3, 10, 20, 25, 60! (These numbers, grounded in Seinfeld canon, clearly should have been in the tabular calculations performed by Dixit.)

For computational purposes, we also assumed that quality is drawn independently each period (day) from a uniform distribution on the unit interval. (Recall that a uniform distribution over some interval assigns equal probability density to each value in the interval.) We could alternatively consider a normal distribution, which would likely do a better job of approximating the population quality in reality. Moreover, the quality of partners could be drawn from a distribution whose bounds deterministically grow over time, as there could be an underlying trend upward in the quality of people Elaine is meeting. Perhaps Coffee Meets Bagel gets better at matching Elaine with bagels, as it learns about her preferences.

Alternatively, we could try and legitimize a more specific choice of a distribution using proper Seinfeld canon. In particular, Season 7 Episode 11 (“The Wink,” which is just 2 episodes after “The Sponge”) makes explicit that Elaine believes about 25% of the population is good looking. If we assume Elaine gains utility only from sleeping with good looking people, we could defend using a distribution such that 75% of quality draws are exactly 0 and the remaining 25% of draws are from a normal distribution ranging from 0 to 1.  (Note that Jerry, on the other hand, believes 95% of the population is undateable, so quality draws for Jerry would display an even more extreme distribution–95% of draws would be 0 and the remaining 5% could come from a normal distribution from 0 to 1.)
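We can even solve for the threshold under this canon-inspired distribution. The sketch below simulates quality draws (the normal's mean of 0.5 and standard deviation of 0.15, and the choice to clip rather than truncate to [0,1], are my own assumptions, since the episode is silent on the moments) and iterates on the indifference condition Q* = δE[max(Q, Q*)]:

```python
import random

random.seed(7)
delta = 0.8

def draw_quality():
    # 75% of potential partners yield zero utility; the remaining 25% draw
    # from a normal (mean 0.5, sd 0.15 -- assumed) clipped to [0, 1].
    if random.random() < 0.75:
        return 0.0
    return min(1.0, max(0.0, random.gauss(0.5, 0.15)))

sample = [draw_quality() for _ in range(100_000)]

# Fixed-point iteration on the continuation value c = delta * E[max(Q, c)]:
c = 0.0
for _ in range(60):
    c = delta * sum(max(q, c) for q in sample) / len(sample)

print(round(c, 2))  # sponge-worthiness threshold in the "25% good looking" world
```

Under these assumed parameters the threshold comes out near 0.25, well below the 0.5 of the U[0,1] case: when most days yield nothing, the option value of holding the sponge shrinks, so Elaine lowers her standards for the good-looking minority.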

Regardless of the specific distribution or time/resource constraint choices, the key take-away here is the undeniably natural formulation of this episode’s plot line as an optimal stopping problem. Throughout the course of our six weeks with Professor Laibson, our class used dynamic programming to approach questions of growth, search, consumption, and asset pricing… while these applications are diverse and wide-ranging, don’t the methods seem even more powerful when analyzing fictional romantic encounters!?

[Image: Elaine Benes]

Speaking of power

References

As explained earlier, this write-up is primarily focused on the aforementioned Dixit (2011) paper, but also draws on materials from Harvard’s Economics 2010D sequence. In particular, “Economics 2010c: Lecture 1 Introduction to Dynamic Programming” by David Laibson (9/1/2016) & “ECON 2010c Section 1” by Argyris Tsiaras (9/2/2016).
