Today I spoke on a panel on something called “impact sourcing” at the BPO World Forum. The idea of impact sourcing, in a nutshell, is that online work is a tool for development and that for-profit firms outsourcing some part of their business should look beyond traditional BPO firms and consider non-profits like Samasource and Digital Divide Data. It was a good audience for this pitch, as many of the attendees were CIOs from big companies that are accustomed to signing multi-million dollar IT outsourcing deals with traditional BPO firms like Wipro, Infosys and Tata Consultancy.
Monthly Archives: June 2012
Resources for online social science
The Economist recently had an article about the growing use of online labor markets as subject pools in psychology research; ReadWriteWeb wrote a follow-up. If you’ve been following this topic, there wasn’t very much new, and if you’re a researcher who would like to use these methods, the articles were pretty light on useful links. This blog post is an attempt to point out some of the resources/papers available. This is my own very biased, probably idiosyncratic view of the resources, so hopefully people will send me corrections/additions and I can update this post.
Blogs
- There is the “Follow the Crowd” blog, which I believe is associated with the HCOMP conference. It’s definitely more CS than Social Science, but I think it’s filled with good examples of high-quality research done w/ MTurk and with other markets.
- There’s Gabriel Paolacci’s (now at Erasmus University) “Experimental Turk” blog, which was mentioned in the article and is probably the best resource for examples of social and psychological science research being done with MTurk.
- Panos Ipeirotis (at NYU, and now academic-in-residence at oDesk) has a great blog, “Behind-enemy-lines”, that’s basically all things relating to online work.
- The defunct “Deneme blog” by Greg Little (who also works at oDesk) and Lydia Chilton (at University of Washington).
Guides / How-To (Academic Papers)
A number of researchers have written guides to using MTurk for research. I think the first stop for social scientists should be the paper by Jesse Chandler, Gabriel Paolacci and Panos Ipeirotis:
Chandler, J., Paolacci, G. and Ipeirotis, P. Running Experiments on Mechanical Turk,
Judgment and Decision Making (paper) (bibtex)
My own contribution is a paper with Dave Rand (who will be starting as a new assistant professor at Yale) and Richard Zeckhauser (at Harvard). The paper contains a few replication studies, but the real meat, and the part I think is most important, is the discussion of precisely why/how you can do valid causal inference online (I’m stealing this write-up/links of the paper from Dave’s website):
Horton JJ, Rand DG, Zeckhauser RJ. (2011) The Online Laboratory: Conducting Experiments in a Real Labor Market. Experimental Economics. 14 399-425. (PDF) (bibtex)
Press: NPR’s Morning Edition Marketplace, The Atlantic, Berkman Luncheon Series, National Affairs, Crowdflower, Marginal Revolution, Experimental Turk, My Heart’s in Accra, Joho blog, Veracities blog
Software
Basically, it lets you give subjects a single link that automatically redirects them (at random) to one of a collection of URLs you’ve specified. I made the first really crummy version of this and then got a real developer to re-do it so it runs on Google App Engine.
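The core logic of a redirector like this is tiny. Here is a minimal sketch in Python of how random assignment by redirect could work. To be clear, this is not the code of the tool described above: the URLs, the names `TREATMENT_URLS`, `pick_redirect`, `RandomRedirectHandler` and `serve`, and the port are all made up for illustration, and it uses only the Python standard library rather than App Engine.

```python
import random
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical destination URLs, one per experimental arm.
# Replace these with your own survey or task links.
TREATMENT_URLS = [
    "https://example.com/survey?arm=control",
    "https://example.com/survey?arm=treatment",
]


def pick_redirect(urls, rng=random):
    """Return one URL chosen uniformly at random from `urls`."""
    if not urls:
        raise ValueError("need at least one destination URL")
    return rng.choice(urls)


class RandomRedirectHandler(BaseHTTPRequestHandler):
    """Answer every GET with a 302 redirect to a randomly chosen arm."""

    def do_GET(self):
        self.send_response(302)
        self.send_header("Location", pick_redirect(TREATMENT_URLS))
        self.end_headers()


def serve(port=8000):
    """Run the redirector locally; give subjects http://localhost:<port>/."""
    HTTPServer(("localhost", port), RandomRedirectHandler).serve_forever()
```

Calling `serve()` starts the redirector. Note that this sketch draws a fresh random arm on every request, so a subject who reloads the link may land in a different arm; if that matters for your design, you would need to key the assignment on something stable, such as a worker ID.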
“QuickLime“
This is a tool for quickly setting up LimeSurvey (an open source alternative to Qualtrics & Surveymonkey) on a new EC2 machine. This was made courtesy of oDesk research. I haven’t fully tested it yet, so as with all this software, caveat oeconomus.
“oDesk APIs“
There haven’t been a lot of experiments done on oDesk by social scientists, but there’s no reason it cannot be done. While it currently is not as convenient or as low-cost as doing experiments on MTurk, I think long-term oDesk workers would make a better subject pool: you can more carefully control experiments, it’s easier to get everyone online at the same time to participate in an experiment, there are no spammers, etc. If you’re looking for some ideas or pointers, feel free to email me.
“Boto“
This is a Python toolkit for working with Amazon Web Services (AWS). It’s fantastic and saved me a lot of time when I was doing lots of MTurk experiments.
“Seaweed“
This was Lydia Chilton’s master’s thesis. The idea was to create tools for conducting economics experiments online. I don’t think it ever moved beyond the beta stage, but if you (a) have some grant money and (b) are thinking about porting z-tree to the web, you should email Lydia and see where the codebase is & if anyone is working on it.
Here’s a little javascript snippet I wrote for doing randomization within the page of an MTurk task.
People
I’m not going to try to do a who-is-who of Crowdsourcing, but if you’re looking for some contacts of other people (particularly those in CS) who are doing work in this field, you can check out the list of recent participants at “CrowdCamp” which was a workshop prior to HCI.
History
Probably the first paper I’m aware of that pointed out that experiments (i.e., user studies) were possible on MTurk was by Ed Chi, Niki Kittur and Bongwon Suh. As far as I know, the first social science done on MTurk was Duncan Watts and Winter Mason‘s paper on financial incentives and the performance of crowds.
The Innovation of StackOverflow
So as I write this, there is an egg timer ticking away next to me, set with 10 minutes of time. What am I waiting for? 10 minutes is how much time I predicted it would take to get my programming question answered on StackOverflow (SO):
The back story was that I was writing some R code and I got to a point where I was stuck: there was something I wanted to do and I remembered that there was a built-in function that could accomplish my goal. Unfortunately, I couldn’t remember that function’s name. After some fruitless googling, I posted the question on SO.