A post on O'Reilly Radar today prompted me to think hard about databases of discrete human contributions. The post, by Tim O'Reilly, made connections between ubiquitous computing and web 2.0. It was not the general topic that got me thinking, but rather the concise reiteration of the attributes of collective intelligence.
Tim clarified that the breakthroughs of web 2.0 came from adding meaning to existing data via algorithms, statistics, and meta-information, rather than from adding new data. He touched on the different ways of creating databases (the classical Cornucopia of the Commons), focused on the shared-collective approach, and described the methods of an architecture of participation, where contributions are implicit and driven by the design of the system.
I have visited these concerns recently, and their discussion has motivated recent web application projects (humanTSPsolver and Pigment). Although these applications are small exercises in learning specific frameworks and technologies, two things struck me:
- Both applications required explicit user contribution.
- Both applications focus on data collection only.
Considering the first limitation: implicit user contributions were considered for both applications, but the explicit contribution mechanism was constructed first, and thus remains. This highlights that if automatic user contribution mechanisms are desired, they must be designed and implemented first, sidelining the easier explicit contribution mechanisms.
An excellent automatic contribution mechanism for the humanTSPsolver application is games. Therefore, the web site should have been designed around the notion of small addictive games, user scores, and user experience generally. The aggregate contributions and any derived scientific value should have been relegated to a small corner of the site. A good automatic contribution mechanism for the Pigment colour naming site would be to allow users to define and prepare colour profiles, perhaps for their own websites. Tagging of prepared colours would provide the colour name contributions. Again, this is a complete shift in the focus of the website.
I initially believed that the second limitation was an artifact of the early stage of development of both projects. I now think that claims of "it is unclear what use such data will have until after we collect it" are bogus. I think that the full extent of the insights and implications is unknown a priori (naturally), but to not think about and build first-pass tools for harnessing the data in aggregate is simply lazy.
Early experiments for the humanTSPsolver showed that feasible and complete tours can be constructed from aggregate contributions, and that such information can be used to seed probabilistic methods. The current site does not provide even this primitive capability; rather, it focuses on a simple (although pretty) visualisation of contribution data. Now, after many thousands of explicit user contributions have been made, and relevant background research has been considered, there is scope for more interesting applications, such as testing hypotheses about integrating multiple convex hull based sub-tours. This would be an awesome research starting point for someone beginning an Honors or Masters project.
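To make the idea of building a tour from aggregate contributions concrete, here is a minimal sketch in Python. It is not the method used in the humanTSPsolver experiments; it assumes that each contribution is an ordered list of city labels (a full or partial tour), and it swaps in a plain greedy edge heuristic in which the most frequently contributed edges are taken first. The names `aggregate_edges` and `build_tour`, and the sample data, are hypothetical.

```python
from collections import Counter
from itertools import combinations
import math


def aggregate_edges(contributions):
    """Count how often each undirected edge appears across all contributions."""
    counts = Counter()
    for path in contributions:
        for a, b in zip(path, path[1:]):
            if a != b:
                counts[frozenset((a, b))] += 1
    return counts


def build_tour(cities, counts):
    """Greedy edge heuristic: most-contributed edges first, ties broken by length.

    `cities` maps a label to (x, y) coordinates (an assumed data layout).
    Returns a visiting order; the tour closes from the last city to the first.
    """
    parent = {c: c for c in cities}  # union-find, used to reject early cycles

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def length(edge):
        a, b = edge
        return math.dist(cities[a], cities[b])

    neighbours = {c: [] for c in cities}
    candidates = sorted(
        combinations(sorted(cities), 2),
        key=lambda e: (-counts[frozenset(e)], length(e)),
    )
    for a, b in candidates:
        # Keep every city at degree <= 2 and never close a cycle early,
        # so the accepted edges end up forming one path through all cities.
        if len(neighbours[a]) < 2 and len(neighbours[b]) < 2 and find(a) != find(b):
            parent[find(a)] = find(b)
            neighbours[a].append(b)
            neighbours[b].append(a)

    # Walk the path from one of its endpoints.
    start = next(c for c in sorted(cities) if len(neighbours[c]) < 2)
    tour, prev, cur = [start], None, start
    while True:
        onward = [n for n in neighbours[cur] if n != prev]
        if not onward:
            break
        prev, cur = cur, onward[0]
        tour.append(cur)
    return tour


# Hypothetical sample data: four cities on a unit square, four partial tours.
cities = {"A": (0, 0), "B": (0, 1), "C": (1, 1), "D": (1, 0)}
contributions = [["A", "B", "C"], ["B", "C", "D"], ["C", "D"], ["A", "B"]]
tour = build_tour(cities, aggregate_edges(contributions))
print(tour)
```

Because every pair of cities is a candidate edge, the result is always a complete tour even when contributions are sparse; pairs that nobody contributed simply fall back to being ranked by length. The same edge counts could plausibly serve as initial weights for a probabilistic search, although that step is not shown here.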
The Pigment application does provide primitive first-pass tools for exploiting the contributions in aggregate, in the form of searches that map human colour names to computer values, and computer colour values to human names. Naturally, these services should be promoted as web services, and demonstration applications provided.
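The two searches can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than Pigment's actual implementation: contributions are assumed to be (name, (r, g, b)) pairs, a name is resolved to the per-channel mean of its contributed values, and a value is resolved by a majority vote among its nearest contributed colours in plain RGB space. The function names and sample data are hypothetical.

```python
from collections import Counter


def name_to_value(contributions, name):
    """Map a human colour name to the mean of the RGB values contributed for it."""
    matches = [rgb for n, rgb in contributions if n.lower() == name.lower()]
    if not matches:
        return None
    # Average each channel separately; crude, but enough for a first pass.
    return tuple(round(sum(channel) / len(matches)) for channel in zip(*matches))


def value_to_name(contributions, rgb, k=3):
    """Map an RGB value to the most common name among its k nearest contributions."""
    def squared_distance(other):
        return sum((a - b) ** 2 for a, b in zip(rgb, other))

    nearest = sorted(contributions, key=lambda c: squared_distance(c[1]))[:k]
    if not nearest:
        return None
    return Counter(n.lower() for n, _ in nearest).most_common(1)[0][0]


# Hypothetical sample contributions.
contributions = [
    ("red", (255, 0, 0)),
    ("red", (245, 10, 10)),
    ("Red", (250, 20, 0)),
    ("blue", (0, 0, 255)),
    ("navy", (0, 0, 128)),
]
print(name_to_value(contributions, "red"))
print(value_to_name(contributions, (240, 5, 5)))
```

Distances in raw RGB do not track perceived colour difference well, so a perceptual colour space would likely be a better choice for anything beyond a demonstration.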
I would argue that seeking value in the data is the paramount concern of human-powered database applications. The application of statistics, the development of algorithms, and the linking of meta-data are the tasks required for, firstly, thinking about and, secondly, figuring out what value and use the collected data has. Collection mechanisms can start out rudimentary and explicit, and can be shrouded in user experience and marketed once it is clear what that value is and how it will be used.
Maybe. There is also the school of thought that suggests opening everything up and allowing your users to define the use and value of your data.
Finally, it occurs to me that there are many models for such applications, not limited to O'Reilly's constraints. For example: (1) the classical selfish user experience, where aggregate implicit contributions are provided as an additional related or unrelated user service (GWAP, social photos, and social bookmarks), and (2) the less selfish user experience, where users still have a context (they get attribution) although all contributions are made explicitly to a core consumable service (social news).
These are small degrees of difference, although they importantly highlight the trade-offs along the continua of im/explicit contributions, in/visibility of aggregation, ir/relevance of the aggregate-powered service to the primary interaction, and so on.

