Author Archives: johnjhorton

New post by email test

This is a test post.

Documenting your work as you go with GNU Make

I’ve long been a convert to using ‘Make’ to turn LaTeX into a PDF. However, you can also easily use Make to back up your work as you go along and take “snapshots” of what a draft looked like at a moment in time (see example Makefile below). This complements using GitHub, Dropbox and some external backup (I might add another block to the Makefile that pushes a snapshot to Amazon S3).

My folder structure for a project:

writeup (where I store the LaTeX & BibTeX)
code
data
backups (where I store entire snapshots of the directory, w/o backups included, for obvious reasons)
snapshots (where just the PDF draft is stored)
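
The snapshot and backup steps for this folder layout can be sketched in a few lines. This is not the Makefile from the post: it is a rough Python illustration, and the function names, the default file name draft.pdf and the timestamp format are all assumptions made for the example.

```python
import shutil
from datetime import datetime
from pathlib import Path
from typing import Optional


def _stamp(stamp: Optional[str]) -> str:
    # Use the caller's label if given, otherwise the current date and time.
    return stamp or datetime.now().strftime("%Y-%m-%d_%H%M%S")


def snapshot_pdf(project: Path, pdf_name: str = "draft.pdf",
                 stamp: Optional[str] = None) -> Path:
    """Copy writeup/<pdf_name> into snapshots/ under a timestamped name."""
    src = project / "writeup" / pdf_name  # hypothetical draft location
    dest_dir = project / "snapshots"
    dest_dir.mkdir(exist_ok=True)
    dest = dest_dir / f"{src.stem}_{_stamp(stamp)}{src.suffix}"
    shutil.copy2(src, dest)
    return dest


def backup_project(project: Path, stamp: Optional[str] = None) -> Path:
    """Copy the whole project into backups/<stamp>, skipping backups itself."""
    dest = project / "backups" / _stamp(stamp)
    # ignore_patterns skips anything named "backups" at any depth, so old
    # backups are never copied into new ones.
    shutil.copytree(project, dest, ignore=shutil.ignore_patterns("backups"))
    return dest
```

Calling snapshot_pdf(Path(".")) after each build would accumulate dated PDFs in snapshots, while backup_project(Path(".")) copies everything except the backups folder.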

How much should MTurk cost (if you’re Amazon)?

Amazon recently announced that they are going to double their percentage fee, from 10% to 20%. At least among the people I follow on Twitter (lots of academics who use MTurk for research), this has caused much consternation. A price increase is clearly bad for workers and bad for requesters (who gets hurt worse will depend on relative elasticities), but what about Amazon?

When I first got interested in online labor markets, I wrote a short paper (“Online Labor Markets”) in which I tried to figure out the optimal ad valorem charge from the platform’s perspective. The relevant section is below, but the main conclusion was that most online platforms were pricing as if demand and/or supply was highly elastic. In other words, they were pricing as if even a small increase in price would send nearly all customers elsewhere.

[Screenshot: the relevant section of the “Online Labor Markets” paper]

The basic reason is that when the platform doubles its fee from 10% to 20%, it doubles its revenue (if everything else stays the same) but only increases the cost of using MTurk for users by about 10%. Of course, there is some decline in usage (demand curves slope down), which reduces profits, but it has to be a huge reduction to offset the direct increase in revenue to the platform. This is more or less the same argument for why cutting taxes doesn’t increase revenue unless tax rates are incredibly high: the consensus estimate is that the revenue-maximizing tax rate is in the mid-70% range.
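
The arithmetic behind this argument can be checked in a few lines. This is a back-of-the-envelope sketch, not the model from the paper: it assumes the fee is charged on top of the worker’s payment, that the payment itself does not change, and the function names are made up for the example.

```python
def requester_cost(wage: float, fee_rate: float) -> float:
    """What a requester pays per task: the worker's wage plus the platform fee."""
    return wage * (1 + fee_rate)


def platform_revenue(wage: float, fee_rate: float, tasks: float) -> float:
    """Platform revenue: the fee on each task times the number of tasks."""
    return wage * fee_rate * tasks


old_fee, new_fee = 0.10, 0.20
wage = 1.0  # normalize the worker's payment to 1

# Per-task cost to the requester rises from 1.10 to 1.20, roughly 9%.
cost_increase = requester_cost(wage, new_fee) / requester_cost(wage, old_fee) - 1

# Revenue per task doubles, so volume can fall to old_fee / new_fee of its
# old level before the platform is back where it started.
break_even_volume = old_fee / new_fee
break_even_drop = 1 - break_even_volume

# Ratio of the break-even drop in volume to the rise in cost: a rough measure
# of how elastic demand would have to be for the increase not to pay off.
implied_elasticity = -break_even_drop / cost_increase
```

Under these assumptions, usage would have to fall by half in response to a roughly 9% rise in cost before the platform’s revenue dropped below its old level.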

My guess is that MTurk has fairly few substitutes and someone at Amazon decided it should be making more money as a service. Fortunately for us as observers, because of the great work of Panos Ipeirotis, we’ll get to see what happens.

The impact & potential of online work – some new references

In the last month, two new reports have come out looking at online work and online labor markets, with a focus on their potential for economic development.

There is:

A report from the McKinsey Global Institute, “Connecting talent with opportunity in the digital age.” The McKinsey report covers a lot of the economic rationale for online work and digitization a bit more broadly.
A report from the World Bank, “Jobs without Borders” (pdf), which focuses more on what these markets could do for workers in less developed countries. It brings together quite a bit of disparate data on the size of online marketplaces, worker composition and so on.

The McKinsey report relies on some of the work that went into my NBER working paper w/ Ajay Agrawal, Liz Lyons and Nico Lacetera on “Digitization and the Contract Labor Market,” which in turn leans heavily on data from oDesk (now Upwork). This paper, along with all the others from the conference, is now available as a book from the University of Chicago Press.

Allocating online experimental subjects to cells—doing better than random

Most people running experiments understand that they need to randomize subjects to their experimental groups. When I teach causal inference to my undergraduates, I try to drive this point home and I tell them horror stories about would-be experimenters “randomizing” on things like last name or even worse, effectively letting people opt into their treatment cell by conditioning assignment on some behaviour the user has taken.

But this randomization requirement for valid inference is actually a noble lie of sorts: what you really need is “unconfoundedness.” In the experiments I have run on Mechanical Turk, I don’t randomize but rather stratify, allocating subjects sequentially to experimental groups based on their “arrival” to my experiment. In other words, if I had two cells, treatment and control, my assignments would go like this:

Subject 1 goes to Treatment
Subject 2 goes to Control
Subject 3 goes to Treatment
Subject 4 goes to Control
Subject 5 goes to Treatment
…

So long as subjects do not know relative arrival order and cannot condition on it (which is certainly the case), this method of assignment, while not random, is OK for valid causal inference. In fact, the arrival-time stratification approach does better than OK: it gives you more precise estimates of the treatment effect for a given sample size.

The reason is that this stratification ensures your experiment is better balanced on arrival times, which are very likely correlated with user demographics (because of time-zones) and behaviour (early arrivals are more likely to be heavy users of the site). For example, suppose on Mechanical Turk you run your experiment over several days and the country of each subject, in order, is:

US
US
US
US
India
India
India
India.

With randomization, there is a non-zero chance you could get an assignment of Treatment, Treatment, Treatment, Treatment, Control, Control, Control, Control, which is as badly biased as possible (all Americans in the treatment, all Indians in the control). There are a host of other assignments that are better, but not by much, still giving us bad balance. Of course, our calculated standard errors take this possibility into “account” but we don’t have to do this to ourselves—with the stratified on arrival time method, we get a perfectly balanced experiment on this one important user attribute, and hence more precision. As the experiment gets larger and larger this matters less and less, but at any sample size, we do better with stratification if we can pull it off.
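
The eight-subject example is small enough to enumerate. The sketch below assumes “randomization” means choosing exactly four of the eight subjects for treatment with equal probability (independent coin flips would give different counts), and the variable names are made up for the example.

```python
from collections import Counter
from itertools import combinations

# Arrival order from the example: four US subjects, then four Indian subjects.
countries = ["US"] * 4 + ["India"] * 4

# Count how many of the equally likely 4-of-8 assignments put 0, 1, 2, 3 or 4
# US subjects in the treatment group.
us_treated_counts = Counter()
for treated in combinations(range(len(countries)), 4):
    us_treated = sum(1 for i in treated if countries[i] == "US")
    us_treated_counts[us_treated] += 1

total = sum(us_treated_counts.values())
# Worst case: all US in treatment, or all US in control.
fully_segregated = us_treated_counts[0] + us_treated_counts[4]
# Best case: two US and two Indian subjects in each cell.
perfectly_balanced = us_treated_counts[2]

# Alternating assignment by arrival order (subjects 1, 3, 5, 7 treated).
alternating = [i for i in range(len(countries)) if i % 2 == 0]
alternating_us_treated = sum(1 for i in alternating if countries[i] == "US")
```

Under this assumption only about half of the possible random assignments are perfectly balanced on country, whereas alternating by arrival order always is for this arrival sequence.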

We could show the advantages of stratification mathematically, but we can also just see it through simulation. Suppose we have some unobserved attribute of experimental subjects (in my simulation, the R code of which is available at the end of this post, the variable x) that has a strong effect on the outcome, y (in my simulation the effect is 3 * x), and we have a treatment that has a constant treatment effect (in my simulation, 1) whenever it is applied. And let us suppose that subjects arrive in x order (smallest to largest). Below I plot the average improvement in the absolute difference between the actual treatment effect (which is 1) and the experimental estimate from using stratification rather than randomization for assignment. As we can see, stratification always gives us an improvement, though the advantage is declining in the sample size.
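
A simulation along these lines can be sketched as follows. This is a Python approximation written for illustration, not the R code referred to above: the standard-normal draws for x and for the outcome noise, the number of replications and the function names are all assumptions.

```python
import random
from statistics import mean


def estimate(y, treated):
    """Difference in mean outcomes between treated and control subjects."""
    t = [yi for yi, ti in zip(y, treated) if ti]
    c = [yi for yi, ti in zip(y, treated) if not ti]
    return mean(t) - mean(c)


def mean_abs_errors(n, reps=500, seed=0, effect=1.0, slope=3.0):
    """Average |estimate - true effect| under random vs. alternating assignment."""
    rng = random.Random(seed)
    # Alternating assignment by arrival: subject 1 treated, subject 2 control, ...
    alternating = [i % 2 == 0 for i in range(n)]
    err_random, err_alternating = [], []
    for _ in range(reps):
        # Subjects arrive in x order, smallest to largest.
        x = sorted(rng.gauss(0, 1) for _ in range(n))
        noise = [rng.gauss(0, 1) for _ in range(n)]
        # Random assignment with the same number of treated subjects.
        shuffled = alternating[:]
        rng.shuffle(shuffled)
        for assign, errs in ((shuffled, err_random),
                             (alternating, err_alternating)):
            y = [slope * xi + effect * ti + ei
                 for xi, ti, ei in zip(x, assign, noise)]
            errs.append(abs(estimate(y, assign) - effect))
    return mean(err_random), mean(err_alternating)
```

Calling mean_abs_errors(n) for a range of sample sizes and comparing the two returned averages gives the kind of gap described above.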

Marketing in Networks: The Case of Blue Apron

For the past several months, I’ve been a very happy Blue Apron customer. I’ve always liked to cook, but my interest and devotion to cooking has waxed and waned over the years, with the main determinant being the (in)convenience of shopping. Blue Apron basically solves the shopping problem: they send you a box of all the ingredients you need to make a number of meals, together with detailed and visual preparation instructions.

But this blog post isn’t about cooking–it’s about Blue Apron’s interesting marketing strategy. Blue Apron makes extensive use of free meals that existing customers can send to friends and family who aren’t already signed up. My anecdotal impression is that this meal-sharing approach is remarkably successful: it’s how my family joined, and how we’ve since gotten siblings, parents, extended family and many friends to join. They in turn have gushed about Blue Apron on Facebook and earned their own free boxes to send, and so on.

What I think is interesting about Blue Apron’s marketing is that although free samples are costly, this kind of customer-chosen free sample is amazingly well targeted: presumably their customers know how likely their friends and family are to like the service and thus can tailor their invitations accordingly. This is particularly important for a product that has (somewhat) niche appeal. It would be interesting to compare the conversion of these social referrals to other acquisition channels.

This marketing strategy also raises some design questions. First, which of your existing customers are the best ones to target with free boxes to give away? Presumably the most enthusiastic ones are better targets (the ones that never cancel deliveries, order many meals a week, and so on). It also makes sense to target customers with lots of social connections to the target Blue Apron demographic. However, it is perhaps not as simple as identifying super-users. These customers, while likely to give great word-of-mouth endorsements, might also not be as persuasive with would-be customers who might infer that the service is not for them based on the others’ enthusiasm: if the gourmet chef aunt loves the service, the pb&j-fan nephew may infer that it’s only for experts and abstain.

Even once you identify the super-users, how do you prompt them to optimally allocate the free samples? Presumably customers will start by sending the meals to people they think will value them the most and then work their way down, offering diminishing returns to Blue Apron. However, individuals probably have differently-shaped curves, with some who rapidly deplete their interested friends and others that are “deep” with potential customers. Blue Apron also has to consider not just the probability that a free sample leads to a new customer, but also the probability that the new customer will bring in other customers, and so on.

There is some academic work on marketing in networks. For example, Sinan Aral and Dylan Walker have a great paper on the design of products to encourage viral growth. One of their conclusions is that more passive broadcast messaging is more effective than “active-personalized” features because the passive broadcast approach is used more often & thus the greater usage outweighs the lower effectiveness. It would be really interesting to compare analogous interventions in the Blue Apron context, say a Facebook experiment where the customer could choose who to send their free meals to versus one where they just generically share that they have free meals to give, and then individuals in their network could self-select based on their interest.

Trusting Uber with Your Data

There is a growing concern about companies that possess enormous amounts of our personal data (such as Google, Facebook and Uber) using it for bad purposes. A recent NYTimes op-ed proposed that we need information fiduciaries that would ensure data was being used properly:

Codes of conduct developed by companies are a start, but we need information fiduciaries: independent, external bodies that oversee how data is used, backed by laws that ensure that individuals can see, correct and opt out of data collection.

I’m not convinced that there has been much harm from all of this data collection—these companies usually want to collect this data for purposes no more nefarious than up-selling you or maybe price discriminating against you (or framed in more positive terms, offering discounts to customers with a low willingness to pay). More commonly, they collect data because they need it to make the service work—or it just gets captured as a by-product of running a computer-mediated platform and deleting the data is more hassle than storing it.

Even if we assume there is some great harm from this data collection, injecting a third party to oversee how the data is used seems very burdensome. For most of these sites, nearly every product decision touches upon data that will be collected or using data already collected. When I was employed at oDesk I worked daily with the database to figure out precisely what “we” were capturing and how it should or could be used in the product. I also designed features that would capture new data. From this experience, I can’t imagine any regulatory body being able to learn enough about a particular site to regulate it efficiently, unless they want a regulator sitting in every product meeting at every tech company—and who also knows SQL and has access to the firm’s production database.

One could argue that the regulator could stick to broad principles. But if their mandate is to decide who can opt out of certain kinds of data collection and what data can be used for what purpose, then they will need to make decisions at a very micro level. Where would you get regulators that could operate at this micro-level and simultaneously make decisions about what was good for society? I think you couldn’t and the end result would probably be either poor, innovation-stifling mis-regulation or full-on regulatory capture—with regulations probably used as a cudgel to impose large burdens on new entrants.

So should sites just be allowed to do whatever they want data-wise? Probably, at least until we have more evidence of some actual rather than hypothetical harm. If there are systematic abuses, these sites, with their millions of identifiable users, would make juicy targets for class action lawsuits. The backlash (way overblown, in my opinion) from the Facebook experiment was illustrative of how popular pressure can change policies and that these companies are sensitive to customer demands: the enormous sums these companies pay for upstart would-be rivals suggest they see themselves as being in an industry with low switching costs. We can also see market forces leading to new entrants whose specific differentiator is supposedly better privacy protections, e.g., DuckDuckGo and Ello.

Not fully trusting Uber is not a good enough reason to introduce a regulatory body that would find it nearly impossible to do its “job”—and more likely, this “job” would get subverted into serving the interests of the incumbents they are tasked with regulating.

Human Capital in the “Sharing Economy”

Most of my academic research has focused on online labor markets. Lately, I’ve been getting interested in another kind of online service, namely one for the transfer of human capital, or in non-econ jargon, teaching. There have been a number of new companies in this space (Coursera, edX, Udacity and so on), but one that strikes me as fundamentally different, and different in an important way, is Udemy.

Unlike other ed tech companies, Udemy is an actual marketplace: instructors from around the world can create online courses in whatever their area of expertise and then let students access those courses, often for a fee but not always. Instructors decide the topic, the duration and price: students are free to explore the collection of courses and decide what courses are right for their needs and their budget. The main reason I think this marketplace model is so important is that it creates strong incentives for instructors to create new courses, thus partially fixing the “supply problem” in online education (which I’ll discuss below).

Formal courses are a great way to learn some topic, but not everything worth learning has been turned into a course. Some topics are just too new for a course to exist yet. It takes time to create courses, and for fields that change very rapidly (technology being a notable example), no one has had the time to create a course. The rapid change in these fields also reduces the incentives for would-be instructors (many of whom likely do not even think of themselves as teachers) to make courses, as the material can rapidly become obsolete. Universities can and do create new courses, but it’s hard to get faculty to take on more work. Further, the actual knowledge that needs to be “course-ified” is often held by practitioners and not professors.

I recently worked with Udemy to develop a survey of their instructors. We asked a host of questions (and I hope to blog about some of the other interesting ones) but one that I think is particularly interesting was “Where did you acquire the knowledge that you teach in your course?” We wanted to see whether a lot of what was being taught on Udemy was knowledge that was acquired through some means other than formal schooling. In the figure below, I plot the fraction of respondents selecting different answers.

We can see that the most common answer is a mixture of formal education and on-the-job experience (about 45%). The next most common answer was strictly on-the-job experience, at a little less than 30%. Less than 10% of instructors were teaching things they had learned purely in school.

These results strongly support the view that Udemy is in fact taking knowledge acquired from non-academic sources and turning it into formal courses. Given that Udemy is filling in a gap left by traditional course offerings, it is perhaps not surprising that the answers skew towards “on the job training” but it is even more pronounced than I would have expected. I also think it’s interesting that the most common answer was a “mixture” suggesting that for instructors their on-the-job training was a complement to their formal education.

Online education and ed tech is exciting in general: it promises to potentially overcome the Baumol’s cost disease characterization of the education sector and let us educate a lot more people at a lower cost. However, I suspect that business models that simply take offline courses and move them online will not create the incentives needed to bring the large amounts of practical knowledge into the course format; by creating a marketplace, Udemy creates those incentives. By having an open platform for instructors, it can potentially tap the expertise of a much larger cross section of the population. Expertise does not just reside within academia, and Udemy (unlike platforms that simply take traditional courses and put them online) can unearth this expertise and catalyze its transformation into courses. And by forcing these courses to compete in a market for students, it creates strong incentives for both quality and timeliness.

Although having a true marketplace has many advantages, running marketplace businesses is quite difficult: they create challenges like setting pricing policies, building and maintaining a reputation system, ensuring product quality without controlling production, mediating disputes and so on. But taking on these challenges seems worth it, particularly as businesses are getting better at running marketplaces (see Uber, Lyft, Airbnb, Elance-oDesk, etc.). In future blog posts, I hope to talk about some other interesting aspects of the survey and how they relate to market design. There are some really interesting questions raised by Udemy, such as how instructors should position their courses vis-a-vis what’s already offered, how they set prices, and on the Udemy side, how you share revenue, how you market courses, how you have your reputation/feedback system work, how you decide to screen courses and so on. It’s a really rich set of interesting problems.

Labor in the sharing economy

There is much to like in this article “In the Sharing Economy, Workers Find Both Freedom and Uncertainty” by Natasha Singer.

The best part has to be this quote and parenthetical comment from the author:

“These are not jobs, jobs that have any future, jobs that have the possibility of upgrading; this is contingent, arbitrary work,” says Stanley Aronowitz, director of the Center for the Study of Culture, Technology and Work at the Graduate Center of the City University of New York. “It might as well be called wage slavery in which all the cards are held, mediated by technology, by the employer, whether it is the intermediary company or the customer.”

(Disclosure: For two weeks in the summer of 1988, I had a gig as the au pair for Professor Aronowitz’s daughter, then a toddler.)

Despite getting quotes from a labor economist (though certainly not a mainstream one), the article fails to mention several ideas from labor economics that seem quite relevant to understanding these marketplaces.

1. Most of the work being done through these platforms is paid work that probably would not have occurred without the changes in technology that lowered market transaction costs.

Obviously people have been hiring drivers, getting houses cleaned, having packages delivered, etc., for a long time before these platforms sprang up. However, in earlier times, the transaction costs associated with hiring someone to do this stuff were high. As a result, people wanting these kinds of services hired a firm if they *really* needed the service but, more commonly, either did it themselves or went without. I suspect that most of the sharing economy labor is “new” work (i.e., the buyers would have gone without or done it themselves) rather than labor that is just shifted around.

2. Wages in these markets are almost certainly determined by supply and demand.

These are marketplaces where decentralized, individual buyers and sellers make decisions about one-off, spot transactions. We might not like the allocation that results (and the lingering effects of the Great Recession probably have led to markets with more supply than demand), but it is useful to appreciate where prices are coming from and regard them as prices. When we think of them as prices, it becomes easier to think about what different policies are likely to do.

3. Wages in these markets are probably artificially high.

Because most of these platforms make money by taking some percentage of each labor transaction, they have an incentive to tilt the platform (through policies, recommendations, search algorithms and so on) in favor of the worker.

4. When we observe people making a choice between several options without coercion, it probably means they value whatever option they selected more than their next best option.

When we see someone working as a TaskRabbit, Lyft Driver, Postmates delivery person etc., they value everything about that job, positive and negative, more than their next best option. It would be great if they had a better option—and presumably many people doing this kind of work are looking for or trying to create other options—but you aren’t going to make them better off by foreclosing the option they already have.

5. Workers “pay” for amenities (through lower wages) and are paid for disamenities (through higher wages).

People often fail to see this because sometimes high-paying jobs also come with a nice set of amenities (think free food and pleasant offices at a tech company) and low-paying jobs have disamenities (loud, noisy workplaces, irregular hours, shift work and so on). However, these comparisons are deeply misleading, in that they compare very different labor markets. We can see the point about “paying” for amenities & disamenities by imagining the same job in the same industry, but varying its attributes. If, for some reason, Google offices absolutely had to have a loud industrial turbine right near where the programmers sat and the office was dusty and hot, Google jobs would pay even more. Similarly, if a fast food place can make the job more valuable (and hence pay lower wages) by letting workers take food at the end of their shift, they will (and do).

6. Workers sort into jobs that offer the collection of amenities and disamenities that are particularly attractive to them.

Being a Lyft driver sounds like hell to me: I hate driving, have a bad sense of direction and don’t enjoy making small talk. There are plenty of people who like driving, have a great sense of direction and love chatting/meeting new people. These are the people who will “sort” into being a Lyft driver. Similarly, workers who want lots of flexibility, have a taste for variety in tasks and so on are going to be much more open to these kinds of informal work arrangements.

Does Airbnb hosting beggar your neighbor?

tl;dr version: I wrote a short paper on one of the policy issues raised by Airbnb, namely whether hosting on Airbnb imposes a cost on neighbors that makes Airbnb as a development socially inefficient. My answer is no.

Yesterday I read a column in the Guardian by Dean Baker titled “Don’t buy the ‘sharing economy’ hype: Airbnb and Uber are facilitating rip-offs.” The title pretty much sums up his views. I did not think very much of the policy arguments (one of the key points is that they hurt local tax revenue, which seems eminently fixable and not the most important concern anyway) save for one: the notion that Airbnb hosts impose a negative externality on their fellow apartment renters. This seems plausible and it is exactly the kind of market failure governments are often needed to remedy.

I have heard many New Yorkers say something to this effect: “I don’t want to live next to random people coming and going.” And while the main cost of hosting bad guests falls on the host (see, of course, the New York Post article “Airbnb renter returns to ‘overweight orgy’”), presumably neighbors also bear some costs. The natural policy question is whether these negative externalities get internalized.

Presumably individual tenants considering being Airbnb hosts don’t fully consider the costs on their neighbors, but the decision to list is not wholly up to them: landlords certainly have some say and presumably have incentives to both (a) let renters earn extra revenue, as they can capture some of it through rents and (b) minimize the costs that these quasi-sub-letters impose on other tenants (for the same rent-related reason).

I tried thinking through these issues and wrote a little paper that sits halfway between an academic paper and blog post. The main conclusion I draw is that if tenants can sort across apartments based on the landlord’s Airbnb-hosting policy, the negative externality of hosting will be internalized. In other words, there will not be “too much” Airbnb hosting.

See the paper for the details, but the basic idea is that when the rental market is in equilibrium, tenants have to be indifferent between apartment buildings that allow Airbnb hosting and those that do not, which means that the benefits they would get from being an Airbnb host equal the costs from everyone else in the apartment being Airbnb hosts. And when this condition is met, private benefits equal social costs, which is what one needs for the externality to be internalized. Obviously this is a pretty stylized argument, but hopefully it can be a starting point for thinking more seriously about the policy implications of Airbnb.
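
The indifference condition can be written out in symbols. This is only a sketch of the argument as described in this post, not the paper’s own notation: v, r, b and c are labels introduced here, and it assumes for simplicity that both kinds of building charge the same rent.

```latex
% Requires amsmath. Assumed labels: v = value of the apartment itself,
% r = rent, b = a tenant's private benefit from hosting,
% c = the cost a tenant bears from everyone else in the building hosting.
\begin{align*}
\underbrace{v - r + b - c}_{\text{building that allows hosting}}
  &= \underbrace{v - r}_{\text{building that bans hosting}} \\
\Longrightarrow \quad b &= c
\end{align*}
```

If rents differed across the two kinds of building, the same indifference condition would instead set b minus c equal to the rent gap.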