
Allocating online experimental subjects to cells—doing better than random

Most people running experiments understand that they need to randomize subjects to their experimental groups. When I teach causal inference to my undergraduates, I try to drive this point home, and I tell them horror stories about would-be experimenters "randomizing" on things like last name or, even worse, effectively letting people opt into their treatment cell by conditioning assignment on some behaviour the user has taken.

But this randomization requirement for valid inference is actually a noble lie of sorts: what you really need is "unconfoundedness." In experiments I have run on Mechanical Turk, I don't randomize but rather stratify, allocating subjects sequentially to experimental groups based on their "arrival" at my experiment. In other words, if I had two cells, treatment and control, my assignments would go like this:

  • Subject 1 goes to Treatment
  • Subject 2 goes to Control
  • Subject 3 goes to Treatment
  • Subject 4 goes to Control
  • Subject 5 goes to Treatment

So long as subjects do not know their relative arrival order and cannot condition on it (which is certainly the case), this method of assignment, while not random, is OK for valid causal inference. In fact, the arrival-time stratification approach does better than OK: it gives you more precise estimates of the treatment effect for a given sample size.
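The alternating scheme above is trivial to implement. Here is a minimal Python sketch (my own code, not from the post; the post's own simulation was written in R, and the function name here is my invention):

```python
# Assign subjects to cells in strict rotation by arrival order,
# rather than at random. Cell 0 is treatment, cell 1 is control.
def assign_by_arrival(n_subjects, n_cells=2):
    """Return the cell index for each subject, cycling through the cells."""
    return [i % n_cells for i in range(n_subjects)]

# With two cells, assignment alternates exactly as in the list above:
print(assign_by_arrival(5))  # [0, 1, 0, 1, 0]
```

A nice side effect is that cell sizes can never differ by more than one, no matter how many cells there are.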

The reason is that this stratification ensures your experiment is better balanced on arrival times, which are very likely correlated with user demographics (because of time zones) and behaviour (early arrivals are more likely to be heavy users of the site). For example, suppose on Mechanical Turk you run your experiment over several days and the country of each subject, in order, is:

  1. US
  2. US
  3. US
  4. US
  5. India
  6. India
  7. India
  8. India

With randomization, there is a non-zero chance you could get an assignment of Treatment, Treatment, Treatment, Treatment, Control, Control, Control, Control, which is as badly biased as possible (all Americans in the treatment, all Indians in the control). There are a host of other assignments that are better, but not by much, still giving us bad balance. Of course, our calculated standard errors take this possibility into "account," but we don't have to do this to ourselves: with the stratified-on-arrival-time method, we get a perfectly balanced experiment on this one important user attribute, and hence more precision. As the experiment gets larger and larger this matters less and less, but at any sample size, we do better with stratification if we can pull it off.

We could show the advantages of stratification mathematically, but we can also just see it through simulation. Suppose we have some unobserved attribute of experimental subjects (in my simulation, the R code of which is available at the end of this post, the variable x) that has a strong effect on the outcome y (in my simulation the effect is 3 * x), and we have a treatment with a constant treatment effect (in my simulation, 1) whenever it is applied. And let us suppose that subjects arrive in x order (smallest to largest). Below I plot the average improvement in the absolute difference between the actual treatment effect (which is 1) and the experimental estimate from using stratification rather than randomization for assignment. As we can see, stratification always gives us an improvement, though the advantage is declining in the sample size.
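The simulation described above can be sketched in Python (the original code is in R; this is my own re-implementation under the same assumptions: subjects arrive in x order, y = 3x + treatment + noise, true effect 1):

```python
import random
import statistics

def simulate(n, reps=2000, seed=0):
    """Return the average absolute estimation error under random assignment
    minus that under alternating (stratified-by-arrival) assignment.
    Positive values mean stratification did better."""
    rng = random.Random(seed)
    errs_strat, errs_rand = [], []
    for _ in range(reps):
        # Subjects arrive in x order (smallest to largest).
        x = sorted(rng.gauss(0, 1) for _ in range(n))
        noise = [rng.gauss(0, 1) for _ in range(n)]

        def estimate(treat):
            # y = 3*x + 1*treatment + noise; estimate = difference in means.
            y = [3 * xi + ti + ei for xi, ti, ei in zip(x, treat, noise)]
            t_mean = statistics.mean(yi for yi, ti in zip(y, treat) if ti)
            c_mean = statistics.mean(yi for yi, ti in zip(y, treat) if not ti)
            return t_mean - c_mean

        strat = [i % 2 == 0 for i in range(n)]  # alternate by arrival
        rand = rng.sample(strat, n)             # random permutation of the same labels
        errs_strat.append(abs(estimate(strat) - 1))
        errs_rand.append(abs(estimate(rand) - 1))
    return statistics.mean(errs_rand) - statistics.mean(errs_strat)

print(simulate(20))  # positive: stratification beats randomization
```

Because x is sorted and assignment alternates, the two cells are nearly perfectly balanced on x, while random assignment leaves a chance imbalance whose 3x effect contaminates the estimate.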



Marketing in Networks: The Case of Blue Apron

For the past several months, I’ve been a very happy Blue Apron customer. I’ve always liked to cook, but my interest and devotion to cooking has waxed and waned over the years, with the main determinant being the (in)convenience of shopping. Blue Apron basically solves the shopping problem: they send you a box of all the ingredients you need to make a number of meals, together with detailed and visual preparation instructions.

But this blog post isn't about cooking; it's about Blue Apron's interesting marketing strategy. Blue Apron makes extensive use of free meals that existing customers can send to friends and family who aren't already signed up. My anecdotal impression is that this meal-sharing approach is remarkably successful: it's how my family joined, and how we've since gotten siblings, parents, extended family and many friends to join. They in turn have gushed about Blue Apron on Facebook and earned their own free boxes to send, and so on.

What I think is interesting about Blue Apron's marketing is that although free samples are costly, these customer-chosen free samples are amazingly well targeted: presumably their customers know how likely their friends and family are to like the service and thus can tailor their invitations accordingly. This is particularly important for a product that has (somewhat) niche appeal. It would be interesting to compare the conversion rate of these social referrals to that of other acquisition channels.

This marketing strategy also raises some design questions. First, which of your existing customers are the best ones to target with free boxes to give away? Presumably the most enthusiastic ones are better targets (the ones that never cancel deliveries, order many meals a week, and so on). It also makes sense to target customers with lots of social connections to the target Blue Apron demographic. However, it is perhaps not as simple as identifying super-users. These customers, while likely to give great word-of-mouth endorsements, might also be less persuasive with would-be customers, who could infer that the service is not for them based on the others' enthusiasm: if the gourmet-chef aunt loves the service, the PB&J-fan nephew may infer that it's only for experts and abstain.

Even once you identify the super-users, how do you prompt them to optimally allocate the free samples? Presumably customers will start by sending the meals to the people they think will value them the most and then work their way down, offering diminishing returns to Blue Apron. However, individuals probably have differently shaped curves, with some who rapidly deplete their interested friends and others whose networks are "deep" with potential customers. Blue Apron also has to consider not just the probability that a free sample leads to a new customer, but also the probability that the new customer will bring in other customers, and so on.

There is some academic work on marketing in networks. For example, Sinan Aral and Dylan Walker have a great paper on the design of products to encourage viral growth. One of their conclusions is that more passive broadcast messaging is more effective than "active-personalized" features because the passive broadcast approach is used more often, and thus the greater usage outweighs the lower per-message effectiveness. It would be really interesting to compare analogous interventions in the Blue Apron context: say, a Facebook experiment where the customer chooses whom to send their free meals to versus one where they just generically share that they have free meals to give, and individuals in their network self-select based on their interest.

Trusting Uber with Your Data


There is a growing concern about companies that possess enormous amounts of our personal data (such as Google, Facebook and Uber) using it for bad purposes. A recent NYTimes op-ed proposed that we need information fiduciaries that would ensure data was being used properly:

Codes of conduct developed by companies are a start, but we need information fiduciaries: independent, external bodies that oversee how data is used, backed by laws that ensure that individuals can see, correct and opt out of data collection.

I’m not convinced that there has been much harm from all of this data collection—these companies usually want to collect this data for purposes no more nefarious than up-selling you or maybe price discriminating against you (or framed in more positive terms, offering discounts to customers with a low willingness to pay). More commonly, they collect data because they need it to make the service work—or it just gets captured as a by-product of running a computer-mediated platform and deleting the data is more hassle than storing it.

Even if we assume there is some great harm from this data collection, injecting a third party to oversee how the data is used seems very burdensome. For most of these sites, nearly every product decision touches upon data that will be collected or using data already collected. When I was employed at oDesk I worked daily with the database to figure out precisely what “we” were capturing and how it should or could be used in the product. I also designed features that would capture new data. From this experience, I can’t imagine any regulatory body being able to learn enough about a particular site to regulate it efficiently, unless they want a regulator sitting in every product meeting at every tech company—and who also knows SQL and has access to the firm’s production database.

One could argue that the regulator could stick to broad principles. But if their mandate is to decide who can opt out of certain kinds of data collection and what data can be used for what purpose, then they will need to make decisions at a very micro level. Where would you get regulators that could operate at this micro-level and simultaneously make decisions about what was good for society? I think you couldn’t and the end result would probably be either poor, innovation-stifling mis-regulation or full-on regulatory capture—with regulations probably used as a cudgel to impose large burdens on new entrants.

So should sites just be allowed to do whatever they want data-wise? Probably, at least until we have more evidence of actual rather than hypothetical harm. If there are systematic abuses, these sites, with their millions of identifiable users, would make juicy targets for class action lawsuits. The backlash (way overblown, in my opinion) from the Facebook experiment illustrated how popular pressure can change policies and how sensitive these companies are to customer demands: the enormous sums they pay for upstart would-be rivals suggest they see themselves as being in an industry with low switching costs. We can also see market forces leading to new entrants whose specific differentiator is supposedly better privacy protection, e.g., DuckDuckGo and Ello.

Not fully trusting Uber is not a good enough reason to introduce a regulatory body that would find it nearly impossible to do its “job”—and more likely, this “job” would get subverted into serving the interests of the incumbents they are tasked with regulating.

Human Capital in the “Sharing Economy”

Most of my academic research has focused on online labor markets. Lately, I've been getting interested in another kind of online service, namely one for the transfer of human capital, or in non-econ jargon, teaching. There have been a number of new companies in this space (Coursera, edX, Udacity and so on), but one that strikes me as fundamentally different, and different in an important way, is Udemy.

Unlike other ed tech companies, Udemy is an actual marketplace: instructors from around the world can create online courses in whatever area they have expertise and then let students access those courses, often for a fee but not always. Instructors decide the topic, the duration and the price; students are free to explore the collection of courses and decide which courses are right for their needs and their budget. The main reason I think this marketplace model is so important is that it creates strong incentives for instructors to create new courses, thus partially fixing the "supply problem" in online education (which I'll discuss below).

Formal courses are a great way to learn a topic, but not everything worth learning has been turned into a course. Some topics are just too new for a course to exist yet. It takes time to create courses, and in fields that change very rapidly (technology being a notable example) no one has had the time to create one. The rapid change in these fields also reduces the incentives for would-be instructors (many of whom likely do not even think of themselves as teachers) to make courses, as the material can rapidly become obsolete. Universities can and do create new courses, but it's hard to get faculty to take on more work. Further, the actual knowledge that needs to be "course-ified" is often held by practitioners, not professors.

I recently worked with Udemy to develop a survey of their instructors. We asked a host of questions (and I hope to blog about some of the other interesting ones) but one that I think is particularly interesting was “Where did you acquire the knowledge that you teach in your course?” We wanted to see whether a lot of what was being taught on Udemy was knowledge that was acquired through some means other than formal schooling. In the figure below, I plot the fraction of respondents selecting different answers.


We can see that the most common answer is a mixture of formal education and on-the-job experience (about 45%). The next most common answer was strictly on-the-job experience, at a little less than 30%. Less than 10% of instructors were teaching things they had learned purely in school.

These results strongly support the view that Udemy is in fact taking knowledge acquired from non-academic sources and turning it into formal courses. Given that Udemy is filling a gap left by traditional course offerings, it is perhaps not surprising that the answers skew towards "on-the-job training," but the skew is even more pronounced than I would have expected. I also think it's interesting that the most common answer was a "mixture," suggesting that for instructors, on-the-job training was a complement to their formal education.

Online education and ed tech are exciting in general: they promise to overcome the Baumol's cost disease characterization of the education sector and let us educate a lot more people at a lower cost. However, I suspect that business models that simply take offline courses and move them online will not create the incentives needed to bring large amounts of practical knowledge into the course format; by creating a marketplace, Udemy creates those incentives. By having an open platform for instructors, it can potentially tap the expertise of a much larger cross-section of the population. Expertise does not reside only within academia, and Udemy, unlike platforms that simply put traditional courses online, can unearth this expertise and catalyze its transformation into courses. And by forcing these courses to compete in a market for students, it creates strong incentives for both quality and timeliness.

Although having a true marketplace has many advantages, running a marketplace business is quite difficult: it creates challenges like setting pricing policies, building and maintaining a reputation system, ensuring product quality without controlling production, mediating disputes and so on. But taking on these challenges seems worth it, particularly as businesses are getting better at running marketplaces (see Uber, Lyft, Airbnb, Elance-oDesk, etc.). In future blog posts, I hope to talk about some other interesting aspects of the survey and how they relate to market design. There are some really interesting questions raised by Udemy: how instructors should position their courses vis-a-vis what's already offered, how they should set prices, and, on the Udemy side, how to share revenue, how to market courses, how the reputation/feedback system should work, how to screen courses and so on. It's a really rich set of interesting problems.

Labor in the sharing economy

There is much to like in this article “In the Sharing Economy, Workers Find Both Freedom and Uncertainty” by Natasha Singer.

The best part has to be this quote and parenthetical comment from the author:

“These are not jobs, jobs that have any future, jobs that have the possibility of upgrading; this is contingent, arbitrary work,” says Stanley Aronowitz, director of the Center for the Study of Culture, Technology and Work at the Graduate Center of the City University of New York. “It might as well be called wage slavery in which all the cards are held, mediated by technology, by the employer, whether it is the intermediary company or the customer.”

(Disclosure: For two weeks in the summer of 1988, I had a gig as the au pair for Professor Aronowitz’s daughter, then a toddler.)

Despite getting quotes from a labor economist (though certainly not a mainstream one), the article fails to mention several ideas from labor economics that seem quite relevant to understanding these marketplaces.

1. Most of the work being done through these platforms is paid work that probably would not have occurred without the changes in technology that lowered market transaction costs.

Obviously people have been hiring drivers, getting houses cleaned, having packages delivered, etc., for a long time before these platforms sprung up. However, in earlier times, the transaction costs associated with hiring someone to do this stuff were high. As a result, people wanting these kinds of services either hired a firm (if they *really* needed the service) or, more commonly, did it themselves or went without. I suspect that most of the sharing economy labor is "new" work (i.e., the buyers would have gone without or done it themselves) rather than labor that is just shifted around.

2. Wages in these markets are almost certainly determined by supply and demand. 

These are marketplaces where decentralized, individual buyers and sellers make decisions about one-off, spot transactions. We might not like the allocation that results (and the lingering effects of the Great Recession have probably led to markets with more supply than demand), but it is useful to appreciate where prices are coming from and to regard them as prices. When we think of them as prices, it becomes easier to think about what different policies are likely to do.

3. Wages in these markets are probably artificially high.

Because most of these platforms make money by taking some percentage of each labor transaction, they have an incentive to tilt the platform (through policies, recommendations, search algorithms and so on) in favor of the worker.

4. When we observe people making a choice between several options without coercion, it probably means they value whatever option they selected more than their next best option.

When we see someone working as a TaskRabbit, Lyft driver, Postmates delivery person, etc., it means they value everything about that job, positive and negative, more than their next best option. It would be great if they had a better option, and presumably many people doing this kind of work are looking for or trying to create other options, but you aren't going to make them better off by foreclosing the option they already have.

5. Workers “pay” for amenities (through lower wages) and are paid for disamenities (through higher wages).

People often fail to see this because high-paying jobs sometimes also come with a nice set of amenities (think free food and pleasant offices at a tech company) while low-paying jobs have disamenities (loud, noisy workplaces, irregular hours, shift work and so on). However, these comparisons are deeply misleading, in that they compare very different labor markets. We can see the point about "paying" for amenities and disamenities by imagining the same job in the same industry, but varying its attributes. If, for some reason, Google offices absolutely had to have a loud industrial turbine right near where the programmers sat, and the office was dusty and hot, Google jobs would pay even more. Similarly, if a fast food place can make the job more valuable (and hence pay lower wages) by letting workers take food at the end of their shift, they will (and do).

6. Workers sort into jobs that offer the collection of amenities and disamenities that are particularly attractive to them. 

Being a Lyft driver sounds like hell to me: I hate driving, have a bad sense of direction, and making small talk is not something I enjoy. There are plenty of people who like driving, have a great sense of direction and love chatting and meeting new people. These are the people that will "sort" into being a Lyft driver. Similarly, workers who want lots of flexibility and have a taste for variety in tasks are going to be much more open to these kinds of informal work arrangements.

Does Airbnb hosting beggar your neighbor?

tl;dr version: I wrote a short paper on one of the policy issues raised by Airbnb: namely, does hosting on Airbnb impose a cost on neighbors that makes Airbnb hosting socially inefficient? My answer is no.

Yesterday I read a column in the Guardian by Dean Baker titled “Don’t buy the ‘sharing economy’ hype: Airbnb and Uber are facilitating rip-offs.” The title pretty much sums up his views. I did not think very much of the policy arguments (one of the key points is that they hurt local tax revenue, which seems eminently fixable and not the most important concern anyway) save for one: the notion that Airbnb hosts impose a negative externality on their fellow apartment renters. This seems plausible and it is exactly the kind of market failure governments are often needed to remedy.

I have heard many New Yorkers say something to this effect: "I don't want to live next to random people coming and going." And while the main costs of hosting bad guests fall on the host (see the New York Post article, of course, "Airbnb renter returns to 'overweight orgy'"), presumably neighbors also bear some costs. The natural policy question is whether these negative externalities get internalized.

Presumably individual tenants considering being Airbnb hosts don't fully consider the costs on their neighbors, but the decision to list is not wholly up to them: landlords certainly have some say and presumably have incentives both to (a) let renters earn extra revenue, as they can capture some of it through rents, and (b) minimize the costs that these quasi-sub-letters impose on other tenants (for the same rent-related reason).

I tried thinking through these issues and wrote a little paper that sits halfway between an academic paper and a blog post. The main conclusion I draw is that if tenants can sort across apartments based on the landlord's Airbnb-hosting policy, the negative externality of hosting will be internalized. In other words, there will not be "too much" Airbnb hosting.

See the paper for the details, but the basic idea is that when the rental market is in equilibrium, tenants have to be indifferent between apartment buildings that allow Airbnb hosting and those that do not, which means that the benefit a tenant gets from being an Airbnb host equals the cost of everyone else in the building being Airbnb hosts. And when this condition is met, private benefits equal social costs, which is what one needs for the externality to be internalized. Obviously this is a pretty stylized argument, but hopefully it can be a starting point for thinking more seriously about the policy implications of Airbnb.
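To make the indifference logic concrete, here is a toy numeric sketch (the function, the symbols b, c, n, and all the numbers are my own illustration, not taken from the paper):

```python
# In a hosting-allowed building with n identical tenants, each tenant earns
# hosting benefit b but bears nuisance cost c from each of the n - 1 hosting
# neighbors. If tenants can sort across buildings, rents adjust until the
# marginal tenant is indifferent between building types, so the rent premium
# on hosting-allowed buildings equals this net value.
def net_value_of_hosting_building(b, c, n):
    """Extra value, relative to a no-hosting building, of living where
    everyone (including you) can host."""
    return b - (n - 1) * c

# Tenants choose hosting buildings only when b >= (n - 1) * c, i.e., when the
# private hosting benefit covers the total cost imposed on the neighbors:
# exactly the condition for the externality to be internalized.
# Example: $200/month of hosting income, $10/month of nuisance per hosting
# neighbor, 10-unit building:
print(net_value_of_hosting_building(200, 10, 10))  # 110
```

With those illustrative numbers hosting raises total surplus, and sorting delivers it; flip the numbers (say b = 50) and the net value turns negative, tenants sort into no-hosting buildings, and hosting disappears where it is inefficient.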






Do job-seekers know how much competition they face?

In some matching models of the labor market, a source of friction is that job-seekers do not know how many other workers are also applying (or will apply) to a job. The job-seeker cannot condition their application decisions accordingly; e.g., they cannot easily skip jobs that are over-subscribed or seek out jobs where the competition is thinner.

I was curious whether job-seekers know anything about the count of other candidates even after they apply to a job and, if they do, how they came to know it. If they don't know the count of applicants even for jobs they applied to, or only learn it during the process, then it seems unlikely they know the application counts for jobs they did not apply to. I ran a little survey on MTurk last night asking 50 people the following question:

“The last time you applied for a job, did you know how many other workers the firm was considering, and if so, how did you know?”


A little more than 40% of the respondents reported knowing something about the count of other applicants. Of those that knew, their reasons fell into basically three buckets: (1) the interviewer told them, at 42%; (2) a friend or associate at the firm told them, at 32%; and (3) they inferred it from something they observed, such as seeing an interview list or overhearing a phone call, at 26%.

The usual caveats about convenience samples aside, in a majority of cases, job-seekers did not know the count and never learned it. Among those that did learn but were not told by the firm, social connections to existing employees mattered. These connections seem to grease the hiring process with information at multiple points. The additional fact that a substantial fraction reported being able to infer the count is (perhaps unsurprising) evidence that job-seekers do care about the count and try to infer it from whatever sources are available. The raw data as a CSV and my R code for analyzing it are available at:

Free text Responses by MTurk Respondents

  • Yes, because it was through school and I knew the particular amount of interview spots that were open.
  • Yes the last time I applied for a job they were considering two people. My friend that work there told me.
  • The last time I tried to apply for a job, I did not know how many people the firm was considering since it was a blind interview.
  • No, I had no idea. The only indication was I was not told anything other than they had “several” other people to interview. “Several” could mean anything, although I doubt they would interview more than 10-15.
  • I had no idea how many candidates were being considered for the position. I was only aware that the position was available.
  • no, I had no idea.
  • Yes, I knew how many because associate conducting the interview mentioned that there were two other applicants coming in for interviews in the coming days so the final decision would not come for a few day afterwards.
  • I knew of at least 2 because I overheard the interviewer talking on the phone while I was waiting.
  • I went to to a group interview, so I saw all of the other applicants.
  • No, I didn’t know how many other workers were being considered. It was not an open job process at all. It was a job where there were only a small number of positions in my area, and lots of people who were qualified and who would want it–so I assumed there would be a lot of competition. (I didn’t get it.)
  • Yes, the last time I applied for a job, I knew how many other workers the firm was considering, because it was an internship opportunity in which there was an information session held beforehand that told us how many applicants would be considered.
  • i had no idea how many they were considering/wasn’t disclosed to me
  • I only new a rough estimate of around 10-20 and that info came from the interviewer.
  • No I did not know how many other workers were being considered.
  • No, I did not know. Applications and resumes were submitted in person or through fax, any number of people could have been considered.
  • I knew the place where I was hired was considering one other person because I was working there in a temporary capacity at the time and someone gave me inside information.
  • The last time I turned in an application for a job I had no idea how many others were considered.
  • The last time I applied for a job I did not know how many other workers they were considering hiring.
  • Yes, I knew that there was one other person that the firm was considering because the office manager who was interviewing me told me so at the end of the interview. The office manager ended up offering me the job several days later after all because that first choice person did not accept the job when it was offered to her.
  • 3
  • I did know, but only after I was at the interview. I was told by the hiring manager how many candidates there were.
  • The firm was considering two other workers. The reason that I knew was that I over heard the interviewers talking after I completed my interview and left them room.
  • I was applying for a bank teller job and in their wanted ad they stated they were looking for 4 positions to be filled.
  • i am currently interviewing and the interviewers have indicated that there are 2-3 other candidates.
  • no unfortunately i do know know
  • Don’t know. They never like to give exact numbers.
  • Yes, I knew how many others were being considered.
    I had an acquaintance at the firm who checked the list and told me.
  • yes, word of mouth from people who worked there.
  • I knew how many were applying because I knew one of the people who worked inside the company. He was a very close friend.
  • Yes, four others, they told me.
  • I was unsure of how many other workers the firm was considering.
  • There were about 25 that were invited back for a second interview. It was down to me and the daughter of a lady who already worked there. The mother was screening the calls to make sure no one got through to speak to the department head looking for a secretary hoping her daughter would get the job. It just so happened my call came while someone else was on the switchboard for a few minutes and was not in on the call blocking plan. Long story short, I got the job and worked there 25 years. It was a great job with great benefits. I found out when I was working with the payroll department doing W-2’s and came across the folder of applications for my job.
  • Yes. The guy who interviewed me told me how many other applicants there were.
  • I was not aware of how many others were being considered for the position.
  • Yes, I knew how many other people were being considered because I had a friend who worked for the company. He was able to find out how many other candidates there were and he told me.
  • I did not know how many other workers the firm was considering.
  • Yes I was working for the job and they were expending and hiring from within.
  • No.
  • I did not know how many others the company was considering.
  • Yes, I knew how many they were considering, as my 2nd cousin was in charge of the hiring process.
  • No, I didn’t know.
  • I applied for an accounting job, five people applied. I knew because I was friends with the HR director. I didn’t get the position.
  • I did not know how many other workers the firm was considering. I did know that there were other applicants, as I was told they had to “interview more people”, but that was the extent of my knowledge.
  • Yes, I knew. It was for a civil service position and when interviews were given out I was sent a packet showing mine and the other interviewee’s time slots. It was intimidating seeing how many people I was competing against. I would have rather not known.
  • No.
  • The last time I applied for a position, I was unsure how many candidates the firm considered. I had received notification of other candidates, but not a specific number.
  • No I did not know, but i would assume at least twenty others
  • I didn’t know how many other people were being considered.
  • I did not know an exact number of how many other workers were being considered, but the hiring manager did advise me that they had several interviews to complete and would contact me when they were finished.
  • NO, when I applied for my last job I had no clue how many others where applying.

Chinese freelancers on oDesk seem to avoid the number 4 in their proposed hourly rates

There are numerous examples of Chinese speakers avoiding the number “4” because it sounds like the Mandarin word for death; the Wikipedia page on “Tetraphobia” documents several examples. My NYU colleague Adam Alter recently published an article in the New Yorker discussing some of his work on how cultural beliefs can have economic and financial significance. After I read his article, I was curious whether Chinese freelancers on oDesk are less likely than expected to use the number 4 in their profiles. It seems like they are:


Note that we would expect different countries to “naturally” have different fractions of 4’s given that different countries tend to occupy different parts of the oDesk wage distribution. To make this plot, I took a sample of countries with at least as many freelancers as China.
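As a rough sketch of the comparison (the rates and the country list below are invented for illustration; the actual oDesk data and country sample are not reproduced here), the share of posted hourly rates containing the digit 4 can be tallied by country like this:

```python
# Sketch: share of posted hourly rates containing the digit 4, by country.
# The rates below are made up; the real analysis used a sample of oDesk
# profiles from countries with at least as many freelancers as China.
from collections import defaultdict

rates = [
    ("China", 15.0), ("China", 25.0), ("China", 8.0),
    ("India", 4.0),  ("India", 14.0), ("India", 10.0),
    ("US", 40.0),    ("US", 35.0),    ("US", 45.0),
]

def contains_four(rate):
    """True if the digit '4' appears anywhere in the posted rate."""
    return "4" in f"{rate:g}"

counts = defaultdict(lambda: [0, 0])  # country -> [rates_with_4, total]
for country, rate in rates:
    counts[country][0] += contains_four(rate)
    counts[country][1] += 1

for country, (with_4, total) in sorted(counts.items()):
    print(f"{country}: {with_4 / total:.2f}")
```

With real data, a lower-than-expected share for China relative to countries at a similar point in the wage distribution would be the tetraphobia signature.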

Market Clearing Without Consternation?:
The Case of Uber’s “Surge” Pricing

The on-demand car service Uber has a “surge” pricing policy: during periods of peak demand, such as during snowstorms and on New Year’s Eve, prices have increased as much as 8x. Uber’s stated goal of the policy is to increase the supply of drivers and ensure that cars are still available. Although would-be riders know about the policy (and even have to confirm the fare multiplier on their iPhones), it has generated a great deal of mostly negative media attention. To many, dynamic pricing is just price gouging and immoral.

Uber’s CEO has argued that other companies practice dynamic pricing (e.g., hotels, airlines, clubs, etc.) and that Uber’s core, unwavering goal is to ensure that a car is always available. These justifications sound reasonable, but are they persuasive?

An MTurk experiment

I decided to test whether different “elaborations” such as the CEO’s can alter opinions about the morality of surge pricing. To do this, I created a series of HITs (human intelligence tasks) on Amazon Mechanical Turk. In each HIT:

  1. I first described how surge pricing worked and asked respondents to rate the policy on a five-point scale from “Very Wrong” to “Very Right.”
  2. I then had respondents read one of 18 “elaborations” that they were instructed to regard as true. Some of these elaborations were similar to claims made by the Uber CEO or his critics. Each elaboration was a separate HIT and respondents only considered one at a time.
  3. In response to the elaboration, respondents were asked to rate how it changed their baseline view, on a five-point scale from “Much More Wrong” to “Much Less Wrong.”

Baseline views on the morality of surge pricing

As expected, a large chunk of respondents don’t like surge pricing:
Notes: For each bin, a 95% CI is shown for that proportion. The first response of each worker was used.
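For reference, the 95% CIs on those proportions can be computed with the usual normal approximation for a binomial proportion; the counts below are placeholders, not the actual survey tallies:

```python
import math

def proportion_ci(successes, n, z=1.96):
    """Normal-approximation 95% CI for a binomial proportion,
    clipped to [0, 1]."""
    p = successes / n
    se = math.sqrt(p * (1 - p) / n)
    return p, max(0.0, p - z * se), min(1.0, p + z * se)

# e.g., 120 of 300 respondents in one bin (made-up numbers)
p, lo, hi = proportion_ci(120, 300)
print(f"{p:.2f} [{lo:.2f}, {hi:.2f}]")  # → 0.40 [0.34, 0.46]
```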

Do “elaborations” change views?

Price setting. Uber has been a bit opaque about how surge prices are actually arrived at. Given people’s taste for procedural justice and their tendency to view acts of omission and commission differently, I suspected that details about how the surge price was obtained would matter. I created three HITs related to price-setting:

  • “The actual surge price is set by an algorithm designed to make sure there is always a car available within a city within 20 minutes.”
  • “Uber lowers prices during periods of slack demand.”
  • “The actual surge price is set by Uber based on their guess about how much more demand there will be.”

In the figure below, we can see that the (tautological) statement that Uber lowers prices during periods of slack demand improves opinions substantially. Further, stating that the price was determined by an algorithm working against a liquidity constraint also strongly improved views. In contrast, stating that the price was set on the basis of a “guess” by Uber about likely demand made views much more negative.

Tighter parallelism between questions would be ideal, but I suspect that three factors made the “algorithm” perspective more persuasive: (1) with the algorithm, high prices were a byproduct rather than a goal, (2) moral agency is transferred from Uber employees to “the algorithm” and (3) the phrasing “guess” implies a lack of care and diligence.

Notes: For each bin, a 95% CI is shown for that proportion. The red line shows the predicted proportions from a fitted ordered logistic regression model.
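As a sketch of what the fitted model does, an ordered logit maps a linear predictor into category proportions via cutpoints: P(Y ≤ k) = logistic(α_k − xβ). The cutpoints and coefficient below are hypothetical, not the fitted values:

```python
import math

def ordered_logit_probs(cutpoints, xb):
    """Category probabilities from an ordered logit model:
    P(Y <= k) = 1 / (1 + exp(-(alpha_k - xb))), differenced to
    give the probability of each category."""
    def cdf(alpha):
        return 1.0 / (1.0 + math.exp(-(alpha - xb)))
    cum = [cdf(a) for a in cutpoints] + [1.0]
    return [cum[0]] + [cum[i] - cum[i - 1] for i in range(1, len(cum))]

# Hypothetical cutpoints for the five response categories and a
# hypothetical elaboration effect xb = 0.5.
probs = ordered_logit_probs([-2.0, -1.0, 1.0, 2.0], 0.5)
print([round(p, 3) for p in probs])
```

An elaboration that shifts xb upward moves predicted mass toward the “Much Less Wrong” end of the scale, which is what the red line traces across elaborations.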

Competitive landscape. The CEO of Uber has repeatedly discussed how dynamic pricing is commonplace in other industries. I created three HITs about the strategies used by firms—two about Uber’s competitors and one about an industry that practices dynamic pricing:

  • “Hotels in the area also use “surge” pricing to meet increases in demand.”
  • “Uber has a number of competitors, and they also use surge pricing.”
  • “Uber has a number of competitors, and they do not use surge pricing.”

In the figure below, we can see that neither hotels nor competitors practicing dynamic pricing does much to improve opinions. However, the elaboration that Uber practices surge pricing when its competitors do not has a strongly negative effect: the distribution of responses is very left-shifted toward even dimmer views. It would seem Uber’s best bet would be to hope Lyft starts their own surge pricing (oh look, they did).


Uber’s revenue. Other work looking at how individuals judge the morality of market actions has found that people view actions to increase already-positive profits differently from actions to “save” a company (Kahneman, Knetsch & Thaler, or KKT). The two revenue related elaborations were:

  • “Uber as a whole is losing money.”
  • “Uber as a whole has high profits.”

In the figure below, we can see that the KKT result is strongly replicated: views are much more negative when Uber is profitable than when it is losing money.

Miscellaneous. I tried a few other elaborations that do not fit neatly into a bucket:

  • “People needing cars had other alternatives during surge pricing, such as public transportation.”
  • “Uber has told users about the policy up-front and it is very clear to them what the price will be.”
  • “A majority of economists view this surge pricing as an efficient way to allocate scarce goods.”

We can see that all of these elaborations “work” in improving views. It seems Uber is wise to point out that transactions are entered into freely. This appears to be better than showing that alternatives existed (which also had a positive effect). The respondents were also surprisingly (to me, at least) open to the idea that expert opinion on pricing might change their views. Maybe this is a good idea…



There are several complexities that I glossed over in this post. For example, there is the question of how respondents’ prior beliefs affect their willingness to change views (in short, people with negative initial views are less likely to be persuaded, but they show the same directional effects as people with positive views). If there’s interest, I’ll follow up with some more details in a longer post. I’ll also share a repository with the code, data and experimental materials.

Economics for skeptical social scientists

I recently gave a talk at the “Training school on Virtual Work,” which was held at the University of Malta. The participants were mostly graduate students and junior faculty at European universities studying some aspect of virtual work (e.g., Wikipedia editors, gold farmers, Current TV contributors, MTurk workers, etc.). Most were coming from very different methodological backgrounds than my own and the people I usually work with—sociology, anthropology, media studies, gender studies, etc. I think it is fair to say that most participants have a fairly dim view of economics.

One of the organizers felt that few participants would have encountered the economic perspective on online work. I was asked to present a kind of non-straw-man version of economics and lay out the basic tools economists use to think about labor markets. Below is the result—a kind of apologia for economics, combined with a smattering of basic labor economics. I’m not the best judge, obviously, but I think it was reasonably well received.

Economics and Online Work (a slightly misleading title though – see description) from John Horton

PS – I should write more about the school later, but one of the main take-aways for me was (a) how pervasive the acceptance of the labor theory of value was among participants and (b) how this leads to very different conclusions about almost everything that matters with respect to online work. It would be interesting to try to analyze a couple of different online work phenomena using the LTV and the marginalist approach to value.