Given my recent interest in crowdsourcing the names of colors with Pigment (while trying out Heroku), I thought I would review the inspiration for the application: the Dolores Labs Color Survey.
Dolores Labs, based in San Francisco and founded by Lukas Biewald and Chris Van Pelt, is a six-month-old business built around a service for crowdsourcing market research problems, predominantly using the Amazon Mechanical Turk platform. They provide a number of examples of their work, including tagging the sentiment of comments, classification of documents, events, online price extraction, search relevance, and image retrieval. It seems their approach generally involves using human click workers on MTurk to do the grunt work that is difficult for computers, then cleaning up the results for application using statistics and relevant data mining or machine learning processes.
The site provides a blog that highlights interesting work relevant to their core service, including what seem to be ad hoc internal projects that double as marketing and promotion for their services. Some recent examples promoted by the blog include: FaceStat for collecting statistics on human faces (an advanced hotornot), media bias for classifying articles on US presidential candidates, classification of the covers of Sports Illustrated, and a color naming survey.
The initial findings from the survey were released in mid-March this year (2008) with the title Where does blue end and red begin? (which made me think of the Chemical Brothers song "Where Do I Begin"). The post provided a brief description of the project, some screenshots, a link to their in-browser color explorer (very slow), and a reference to the World Color Survey at Berkeley. Follow-up posts that month included the release of a color dataset with 10,000 color-name pairs, and a reference to a cloud view of the dataset by an IBM researcher. A further follow-up post was made more than a month later (late April), highlighting additional interesting visualisations of the original dataset, including a color flower, a series of network and cluster visualisations, a 3D fly-through of the cloud view, and the application of names to images.
The release of the color data with 10K instances revealed further detail about the experiment, including that colors were sampled from a biased HSV color space rather than RGB (all rendered as sRGB), and that multiple colors were collected on each submission (so the names are biased by human comparison between the colors shown).
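To make the sampling idea concrete, here is a minimal sketch of drawing a batch of colors in HSV and converting them to hex strings for display. The details of how the survey biased its HSV sampling are not given, so this assumes plain uniform sampling over hue, saturation and value; the function names (hsv_to_hex, sample_batch) and the batch size are my own inventions for illustration, not anything from the released data.

```python
import colorsys
import random


def hsv_to_hex(h, s, v):
    """Convert HSV components (each in [0, 1]) to a '#rrggbb' string."""
    r, g, b = colorsys.hsv_to_rgb(h, s, v)
    return "#{:02x}{:02x}{:02x}".format(
        round(r * 255), round(g * 255), round(b * 255)
    )


def sample_batch(n, seed=None):
    """Draw n colors uniformly in HSV (an assumption, not the survey's
    actual sampling scheme) and return them as hex strings, mimicking a
    single submission that shows several colors at once."""
    rng = random.Random(seed)
    return [hsv_to_hex(rng.random(), rng.random(), rng.random())
            for _ in range(n)]


if __name__ == "__main__":
    # One hypothetical submission of five colors to be named.
    for color in sample_batch(5, seed=2008):
        print(color)
```

Note that even uniform sampling in HSV is not uniform in RGB: every color with a low value component lands near black regardless of hue, so dark shades turn up far more often than they would if the red, green and blue channels were drawn independently.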
The color survey is a great idea for a crowdsourced experiment, and generally an interesting and fun idea for a company.
I remember that when I read about the survey and first learned about the company, the story was very popular, hitting the front page on most social news websites. As a concept, I think a day of brainstorming would highlight many applications for such services as well as gimmick-based experiments, although as a business I would focus on web- and technology-centric market research.
This is the kind of business I would like to (and may still) start. As such, I can see a few areas where the site (the company's interface to the humans) needs improvement, as follows:
- Organisation of Promotion Studies: One of the best marketing tools this group has is the development and promotion of in-house studies. This is because they clearly show the power and potential of the service, which may be difficult to grasp for those not familiar with the model. I would rename "examples" to "recent happy customers" or something similar, and create a whole new "examples" section of the site. Each project would have its own sub-section (a well-defined container) to house all relevant information. At the moment, these in-house studies are presented ad hoc in blog posts, and feel very messy and inaccessible.
- Rigor: Elaborating on the first point, these guys need some (visible) scientific rigor. Defining and presenting each in-house project as an experiment means that a clear aim, methods, results and analysis procedure should be defined and made available. Analysis may be highlighted in blog posts, but should be written up in a scientific manner and made available somewhere on the site. This would promote professionalism regarding the offered services (trust), and provide a higher class of promotional material (managers and engineers would love it) in addition to lower-class material like blog posts. Rigor may also allow the use of the data in scientific publications, providing additional high-class promotion and the potential for collaborating with research groups directly (additional revenue streams).
- Community: The popularity of the color survey, and the resultant comments and additional visualisations, highlights that a community should be built around each project. Most importantly, it would drive traffic and interest in the services of the company. Just as importantly, it would provide a different level of crowdsourcing, where analysis is performed by the community, freeing staff (founders) to complete paid work. This could take the form of a forum or comments system, along with base visualisations that can be manipulated and commented on (like IBM Many Eyes).
I am unsure of the direction of the company, but a quick assessment of their recent FaceStat service suggests that perhaps the company is trying to create and cash in on a single community (like a hotornot), rather than focusing on their core service. Nevertheless, it will be interesting to see what these guys come up with next.


