Friday, July 30, 2010

Service Idea: Tender problems for a given solution

Most of my side projects have been a solution looking for a problem. My research was really a technique looking for an application. Most of my startup work has really been technology looking for a domain. Not surprisingly, I feel a lot more engaged and motivated by, and even look back fondly on, those projects that were focused on tackling a problem, particularly someone else's problem.

I propose a website where users can ask the question:

 "Given a technique or a technology that I want to use and a set of constraints (a solution), what should I work on (what problem)?"
I want a service where I can list a technique, some constraints, and a general approach and have someone interested suggest an appropriate problem or project. The "someone interested" part is key. Ideally this someone would be a person with the exact opposite conundrum: a person with a problem looking for a solution. Alternatively (additionally!), the community of problem-suggesters could be fellow 'looking for a problem' posters. This second case is likely a far richer source of underutilized ideas - technology/entrepreneur people who cannot propose problems for themselves, although I expect they are quite happy telling other people what they could be working on.

Some ways to solve this right now:

  • Myself: Think of (contrive or locate) a problem for the project. Locating a problem may involve searching for standard problems or perhaps searching on a freelance job listing site for a piece of work that could fit. More often than not, I contrive something and feel like I've wasted effort at the end.
  • Tender Comment: Post a request for advice on a forum and hope that a project can be materialized from the resultant discussion. Rarely can I find interested parties to comment. 

Use Cases:

  • A student looking for a project or research topic: I want to write a paper/thesis on genetic programming, what project should I do?
  • A programmer/hacker looking to explore and learn a new or popular technique: I want to learn clojure, what's a good idea for a small project?
  • An entrepreneur looking for a start-up idea: I want to do something in health on the iPad, got any suggestions?

Additionally:

  • It would be nice if the kernel of the request was small and tweetable with a small link back to the full request (additionally suggesting some well structured proposal forms to fill out).
  • Listing problems in search of a solution could be provided as well with attempts at automatic (machine-based) match-ups/suggestions
  • A request is designed to generate discussion and suggestions. The requester can take away whatever they wish from the discussion, although a well defined answer is not required (it is not a Q/A stack exchange)
  • Voting on suggestions could be interesting, at the very least to expose which direction the group thinks is the most promising.
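
The automatic match-up idea in the list above could start out very simply. Here is a minimal sketch, assuming each request and each listed problem carries a set of free-form tags, and ranking problems by tag overlap (Jaccard similarity). The function names, tags, and problem titles are all hypothetical and invented for illustration; a real service would likely need something smarter than exact tag matches.

```python
def jaccard(tags_a, tags_b):
    """Overlap between two tag sets: |intersection| / |union|, in [0, 1]."""
    a, b = set(tags_a), set(tags_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def suggest_problems(solution_tags, problems, top_n=3):
    """Return up to top_n problem titles, ranked by tag overlap with the request."""
    scored = [(jaccard(solution_tags, tags), title) for title, tags in problems.items()]
    # Drop problems that share no tags at all with the request.
    scored = [(score, title) for score, title in scored if score > 0]
    # Highest score first; ties broken alphabetically by title.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored[:top_n]]


# Hypothetical 'problem looking for a solution' listings.
problems = {
    "Track medication schedules for elderly patients": {"health", "ipad", "reminders"},
    "Evolve trading rules from price history": {"genetic-programming", "finance"},
    "Visualise clinic waiting times": {"health", "charts"},
}

# Hypothetical 'solution looking for a problem' request.
request = {"health", "ipad"}
suggestions = suggest_problems(request, problems)
print(suggestions)
```

The machine suggestions would only seed the discussion; the human suggestions and voting remain the point.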

Problem solving via a solution-first methodology might be an all round bad idea. Although posts might generate some interesting comment, the whole movement might be addressing laziness or a response to technology marketing rather than be a viable path to a satisfactory project. It's hard to see - is there something here or should I just hit the forums harder next time?

Wednesday, July 28, 2010

Book Idea: Movies by the numbers

While I've been slogging through my current book project I've had a number of follow-on book project ideas. Rather than letting them rot in my head, I thought I'd leave a note to future me (or anyone else looking for a good side project to tackle).

One of these ideas is to write a workbook on machine learning that uses the Netflix Prize dataset as the focus. The objective of the book is to learn how to apply a number of techniques at a number of different levels of complexity on a real problem domain.

The beauty of the Netflix Prize dataset is that there is a lot of information out there that can be drawn together, from detailed forum posts to peer reviewed publications. This information can be located, sifted and reduced into the book structure, highlighting findings and strategies for addressing the difficult classification/prediction tasks.

The second great aspect of the competition is that it had a clear objective, a 10% improvement in RMSE over the baseline technique. The narrative of the book can lead the reader from basic 'getting to know the dataset', to technique application (ensembles), to the development and application of advanced tuning regimes, with pit stops in necessary areas such as cross-validation test harness development. Hopefully, the work would climax by putting all the knowledge learned throughout the journey together and building the final classifier system that achieved the winning result (or close to it!).
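
To make the objective concrete, here is a minimal sketch of how RMSE and the relative improvement over a baseline could be computed. The ratings and predictions are toy values invented for illustration (not from the real dataset), and the function names are my own.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equal-length rating sequences."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    squared_error = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared_error / len(actual))


def improvement(baseline_rmse, candidate_rmse):
    """Fractional improvement of a candidate over the baseline (0.10 == 10%)."""
    return (baseline_rmse - candidate_rmse) / baseline_rmse


# Toy ratings, invented for illustration only.
actual = [4, 3, 5, 2, 4]
baseline = [3.6] * 5                    # predict the global mean for every rating
candidate = [3.9, 3.2, 4.5, 2.6, 3.8]   # some hypothetical better model

baseline_rmse = rmse(baseline, actual)
candidate_rmse = rmse(candidate, actual)
gain = improvement(baseline_rmse, candidate_rmse)
print(f"baseline={baseline_rmse:.4f} candidate={candidate_rmse:.4f} gain={gain:.1%}")
print("meets the 10% target" if gain >= 0.10 else "below the 10% target")
```

Every technique in the book could be scored through the same two functions, which would make the narrative climb towards the 10% mark easy to follow.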

The book would be fun to write (and read!!!) because there is so much to do and to learn. A practical workbook means that the chapters would be littered with functional explanation of how techniques work, how to analyze the results that are produced and relate them to the broader problem, and most importantly (complete!?) sample code for achieving each step along the way.

I'd expect samples would be in SQL (initial dataset explorations), Ruby or Python (for exploring different techniques), perhaps a heavier language like C++ or Java for core/slow techniques that are built up towards achieving the 10% improvement, and perhaps R or similar for result analysis and interpretation. Unfortunately, the source code for the winning approaches is not available, although a best possible approximation of the systems used would be sufficient for demonstration purposes.

Regarding the book structure: I'd like to see a staged hill climb of complexity, where the limit of a direction or an approach is reached by the end of a section or chapter and a new approach picks up in the following chapter or section where the previous left off. It's really hard to come up with a preliminary table of contents without first immersing myself in the facts that were explored in this competition, so instead I'll highlight some topics I'd like to see covered in the book:

  • The Prize : background on the competition, who participated, and how it played out
  • Recommendation: background of collaborative filtering / recommendation systems, approaches typically used and how they work.
  • Preliminary Analysis : preliminary stats on the dataset using SQL and/or a scripting language.
  • Classical : Application of classical approaches: SVD, rule systems, correlations, etc - perhaps a range of parametric approaches
  • Mainstream : Application of mainstream approaches: SVM, kNN, etc - perhaps a range of non-parametric approaches
  • Ensembles : Combining classifiers, tuning the mixture of experts, etc
  • Meta : Boosting, bagging and related approaches and their benefit.
  • Lessons : Extracting key points that could contribute towards building a competitive recommendation or machine learning system (or at the very least for addressing machine learning competitions)

Amongst the topics you can tease out a general structure of: Data Analysis->Classical Approaches->Mainstream Approaches->Meta Techniques->Tuning. I would rather the technique sections focus on actual approaches adopted by teams during the competition. To find this out, I believe it will require a lot of questioning of participants. From memory there was heavy use of SVD based systems, kNN, ensembles, and optimization of ensemble systems.
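
As a taste of the 'Ensembles' and 'Tuning' material, here is a minimal sketch of blending two models with a single weight chosen by grid search on hold-out ratings. This is a generic illustration, not a reconstruction of what any team actually did; the two models, their predictions, and the ratings are invented.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equal-length rating sequences."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))


def blend(preds_a, preds_b, weight):
    """Linear blend: weight * a + (1 - weight) * b for each rating."""
    return [weight * a + (1 - weight) * b for a, b in zip(preds_a, preds_b)]


def best_blend_weight(preds_a, preds_b, actual, steps=20):
    """Grid-search the blend weight in [0, 1] that minimises hold-out RMSE."""
    candidates = [i / steps for i in range(steps + 1)]
    return min(candidates, key=lambda w: rmse(blend(preds_a, preds_b, w), actual))


# Invented hold-out ratings and predictions from two hypothetical models.
actual = [4, 3, 5, 2, 4]
model_a = [4.5, 3.5, 5.5, 2.5, 4.5]   # consistently rates too high
model_b = [3.5, 2.5, 4.5, 1.5, 3.5]   # consistently rates too low

w = best_blend_weight(model_a, model_b, actual)
blended_rmse = rmse(blend(model_a, model_b, w), actual)
print(f"weight={w} blended={blended_rmse:.4f} "
      f"a={rmse(model_a, actual):.4f} b={rmse(model_b, actual):.4f}")
```

The toy data is deliberately contrived so that the errors of the two models cancel; on real data the gain from blending would be far smaller, and the weight should be tuned on ratings that neither model was trained on.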

"Movies by the numbers: Practical Machine Learning with the Netflix Prize dataset" (a title plucked from thin air) - coming to a book store near you in 2012, possibly.

If you would like to see this book exist or have an opinion about its content, leave a comment and let me know!

Wednesday, July 21, 2010

Chris Anderson's most profound TED talks

People love TED talks. I thought I would point out the four most profound TED talks according to TED curator Chris Anderson. These were listed in a Q/A with Chris via Reddit in early 2010 entitled "TED's Chris Anderson answers Reddit's questions".

In addition to providing direct links, the following are the (cleaned-up) notes I took the first time I watched the videos.

  • Barry Schwartz on the paradox of choice
    • Official dogma - maximize welfare of citizens - maximize freedom (we do things on our own)
    • the way to maximize freedom is to maximize choice
    • choice->freedom->welfare
    • examples of choice 
      • buying things: supermarket choice, electronics, phone services -  crazy
      • health care... doctors give you choice - patient autonomy
      • identity - we get to invent/reinvent ourselves
      • family, used to be who, now everything - consuming questions
      • work - every minute of every day, from anywhere on the planet, decide whether we should be working
    • 2 negative effects
      • produces paralysis rather than liberation - people cannot choose at all - don't want to make the wrong decision
      • we end up less satisfied with the choice when there are more choices - could have made a different choice - regret subtracts from decision made
    • we just don't want to miss the opportunity
    • escalation of expectation 
      • do better but feel worse
      • expectations go up when there are so many options
      • never pleasantly surprised
    • The secret to happiness is low expectations
    • who is responsible
      • you when there is a lot of choice
      • if there are few options - the world is responsible
    • official dogma is all wrong
      • more choice is better is totally wrong, there is a sweet spot
    • it's a problem of modern affluent societies 
    • abundance of choice: doesn't help, actually hurts
    • if everything is possible, you increase paralysis and decrease satisfaction
  • David Deutsch on our place in the cosmos
    • solar system is highly tailored to our survival
      • spaceship earth - we're safe in here
      • we're chemical scum on the outside of a typical rock, around typical star, etc.
    • both at odds and both false
    • typical place in the universe is emptiness - space
      • intergalactic space - a typical space, nothingness, light, vacuum, etc
    • we (humans) can explain things
      • we can observe, track, model, explanatory model, causal structure
      • our brains can create knowledge and grow it (explanations)
      • not physics, we create an open ended stream of explanations
    • we are very different 
      • we are a hub - we can work out the structure of everything else
      • amazing, given just in the laws of physics
    • we do it with 3 things: matter (computation), energy, evidence
      • evidence is everywhere for the taking
    • out in the normal part of the universe - none of these things are there (Wrong!)
    • intergalactic space does provide the prerequisites (hydrogen atoms)!
      • just missing knowledge 
    • cosmic knowledge based view
    • we can survive, and we can fail to survive
    • we just need the suitable knowledge to survive; without it, everything goes extinct
      • we want to be the exception 
      • our only hope, to create new knowledge
    • global warming - too late
      • been too late for a long time
      • we cannot always know - disaster exists and how to solve it
    • we need to focus on fixes
      • problems are solvable, problems are inevitable 
  • Dan Gilbert asks, Why are we happy?
    • human brain has nearly tripled in mass over last 2 million years
    • we gained new structures - frontal lobe, prefrontal cortex
    • prefrontal cortex is an experience simulator
    • lottery winners and paraplegics are equally happy with their lives
    • impact bias - simulator can work badly
      • we expect different outcomes to be more different than they really are
      • bad things have far less impact than we expect them to have - major life trauma
    • happiness can be synthesized
    • psychological immune system
      • help change views of the world to feel better about the worlds they are in
    • we think happiness is a thing to be found
    • 2 types of happiness
      • natural happiness - what we get when we get what we wanted
      • synthetic happiness - when we are happy with what we have
    • synthetic is every bit as real as natural
    • synthetic happiness - actually a change in the brain
      • demonstrated with people who cannot make new memories
    • dogma: freedom - choose - path to natural happiness
      • it is the enemy of synthesized happiness
    • accept the things you cannot change
    • reversible and irreversible decisions in test subjects
      • people stuck/limited are happier
      • people with choice are far less happy 
    • people prefer to have the choice but choice will lead to less happiness
    • preferences can be good
      • we can overrate the differences between options
      • when unbounded we do crazy things, bounded is controlled
  • Steven Pinker on the myth of violence
    • recent history gives the impression that we have been horrifically violent
    • we think historically things were peaceful and harmonious existence
    • thesis
      • history was far more violent than we believe
      • there is a decline in violence over recent time
      • this is/may be the most peaceful time in history 
    • log time period to provide cases
    • millennium scale
      • hunter gatherer societies: more than likely to be killed by another
      • early civilizations: looking at the bible, very violent  
    • century scale
      • death penalty for most things, very violent 
      • log graph from middle ages to now, mass drop
      • elbow of graph (drop in violence) was the early 16th century
    • decade scale
      • decrease in wars, etc
      • decrease in deaths per year per war
    • year scale
      • decrease in homicide, small increase in the 60s, back down in the 90s
    • so many people are so wrong about something so important
      • we have much better press
      • there is a cognitive illusion - we simply remember the infrequent occurrences because they're shocking
      • guilt about indigenous peoples
    • why has violence declined?
      • anarchy - strike first out of fear 
        • can have a pact with neighbours, ends in bloodshed
        • the Leviathan - authority, an agency: the state
          • we now see anarchy mainly in failed states, mafias, etc
      • life is cheap
        • life appreciated more if violence is seen (media)
      • non-zero sum game
        • cooperation can benefit both parties
        • other people are more valuable alive than dead
      • expanding circle
        • empathy, naturally only applied extremely locally
        • more recently the circle has expanded family..clan...etc  
    • implications
      • why is there war/peace
      • what are we doing wrong/right
Note that the Reddit post provides many links to a host of excellent TED talks.

Monday, July 19, 2010

TED talks for startups

I came across a RWW post entitled "10 Inspiring TED Talks for Startups", and being an avid TED watcher I thought I'd take them all for a spin. I pulled a muscle in my back and it only hurts when I move/breathe, so I didn't feel guilty watching TED videos for a bunch of hours on end.

There were some I'd seen before, and some that I don't want to watch again. Long story short, the TED website slices and dices its content a zillion different ways, providing many ways to discover and consume the 'best' talks, including a top 10. My advice is that there are more interesting talks for the intellectually curious, including those who have a bent for starting a startup.

Nevertheless, if all of the talks on the list are new to you, my picks would include: 1, 3,  4, 7, 9.

Like most TED talks, there is a key thesis or idea in each talk and a whole lot of analogy and case studies designed to make you understand/think you had the idea yourself. Naturally, I didn't consume them passively, see below for my 'workings out' (notes).

  1. Simon Sinek: How great leaders inspire action.
    • Golden Circle: Why, How, What (start with what and work in)
    • Great leaders invert the circle (start with why and work out)
    • People don't buy what you do, they buy why you do it
    • Examples: Wright Brothers, Apple, Martin Luther King
    • Why speaks to decision making centers in the brain (limbic), What speaks to rational/language centers (prefrontal cortex).
    • Communicate your "why" (belief, purpose, vision) and let people buy in for themselves, then try to sell them the "what" and "how".
    • Sell to people, hire people who believe what you believe.
    • There are leaders and those who lead (power vs inspiration).
  2. Adora Svitak: What adults can learn from kids.
    • Childish - dreams for perfection (unconstrained optimism)
    • Adult - reasons not to do things (focus on constraints)!
    • Kids think of good ideas, not thinking within traditional limitations.
    • Learning between adults-kids should be reciprocal.
  3. Larry Lessig on laws that choke creativity.
    • User generated content - how to open it up.
    • Avoid top-down, professionalized, read-only culture, seek read-write and participative.
    • cases: talking machines killing culture, flights as trespassers over the land below, broadcasting, BMI as a more democratic management of music content.
    • Revive the read-write culture using digital technology.
    • User generated content - amateur culture - produce for the love not the money.
    • Remixing content - not piracy, recreating with existing content.
    • Democratizing techniques (digital techniques - tools of technology become tools of literacy) - say things differently.
    • Architecture of copyright law - makes remix illegal, everything is a copy online.
    • Adopt permissive non-commercial licenses, need private solution like BMI, let competition in the marketplace solve this problem.
  4. Dan Pink on the surprising science of motivation.
    • Candle problem - example of overcoming cognitive fixedness.
    • Control vs rewards/competition, reward motivation can make people perform worse.
    • On lots of tasks incentives don't help and can make things worse.
    • Incentives really only work well for simple problems, narrow focus with well defined constraints.
      • rewards narrow focus, hamper creativity 
    • The creative problems are more of the types of problems we do now - the easy problems are outsourced and automated.
    • Use intrinsic motivation: autonomy, mastery, purpose  over extrinsic (carrot and stick).
    • Engagement: self-direction works better than management.
    • Examples: Atlassian and Google (20% time), Wikipedia vs MS Encarta
  5. Rory Sutherland: Sweat the small stuff.
    • Worry about the little things (marketing and if you want to make a difference).
    • The little things are the things that people remember.
    • Behavior change inverse to the amount of force applied.
    • The big stuff has money and is done well.
      • too powerful, too high level
      • people with the power want to do the big expensive things
      • makes the high salaries seem worth while
    • The small stuff is done badly - the user interface.
    • We want input and change to be proportional (Newtonian)
      • Reality is much more complex
    • This is not the world, small change has big effect. (complexity theory)
    • Need to look for those small risky things that can have a huge effect.
      • Need a name for this, very important.
      • chief detail officer 
  6. Seth Godin on standing out.
    • Case study on sliced bread patent focused on how to make it, was a failure for 15 years, took good marketing to take off.
    • This is the century of ideas diffusion (those who are best at it, win).
    • We are currently focused on trying to get the front page on Google, how to grab attention.
    • Advertising is currently all about interrupting the consumer.
    • Consumers don't care, too many choices, too little time, they ignore stuff.
    • Purple cow - you need to stand out - is it remarkable (worth commenting on).
    • We're all in the fashion business now - not about interrupting people.
      • old: mass marketing: average products for normal people
      • new: never market to normal people, market to innovators and early adopters - then word of mouth will push it into mainstream
    • You need a group that cares about what you have to say.
    • Find out what people really want, then give it to them.
    • Riskiest thing you can do is being safe - the safe thing now is being remarkable
      • Being very good is boring. (get scrappy?)
  7. Malcolm Gladwell on spaghetti sauce.
    • Breakthrough: There is no best Pepsi, there are best Pepsis.
    • Spaghetti sauce.
      • Created a search space, tested the space.
      • Measured the modality of the space - the peaks
      • three groups: plain, spicy, extra chunky
    • pickles: regular and zesty
    • You need to provide choice - multimodal not unimodal distributions of preference.
    • This changed the way the food industry makes you happy
      • old way: what do you want in product? always wrong, people don't know what they want
      • new way: horizontal segmentation. trial lots of things, collect data and analyze 
    • Mustards: French and golden, then Dijon
      •  lesson: make them aspire to something - a better product, sophistication (wrong!)
    • It is not a hierarchy, it is a plane, a spectrum with clusters of preference.
    • You no longer have a platonic dish.
      • authentic dish was the way to go, seeking cooking universals
    • Now - the understanding of variability
      • interested in the details of the differences 
  8. Jan Chipchase on our mobile phones.
    • Types of possessions: owned, consider, carry, use
    • 3 most important things: keys, money, mobile phone
    • core reason: survival (lowest level in Maslow's hierarchy of needs)
    • mobile phone can transcend space and time (call, and sync messages)
      • personal and convenient
    • methods for avoiding forgetting: tap your pockets, turn around, rituals
      • center of gravity - where you look for things
    • never forget: have nothing to remember
      • art of delegation
    • Turn phone into ATM in Africa
      •  decentralized, street innovation
    • people's identity is mobile
    • effects of everyone having a mobile phone: immediacy of ideas, immediacy of objects (adoption), the street will innovate despite you in ways you cannot anticipate (design), direction of conversation (learning how to listen)
  9. Clay Shirky: How cognitive surplus will change the world.
    • 20th Century taught us how to be good consumers, now new media gives us all an ability to also be producers 
    • Cognitive Surplus - enabling ability of digital media and contribution of people's free time (ability to create, ability to share)
    • Examples: Ushahidi and LOL cats. Difference is LOLCats is communal value (the group), Ushahidi is civic value (good of society).
    • Once you allow creation/experiments, you have to allow the spectrum, the Ushahidi and the LOL Cats (gap is between doing anything and doing nothing)
    • Design for generosity - manage economic contracts (money for goods and services) and your social contract (being good) - they are different and incompatible.
    • Only going to get more participation, both types of value, but the latter value will change our society.
  10. Chip Conley: Measuring what makes life worthwhile.
    • Finding meaning makes you happy.
    • Hierarchy of needs - Maslow
      • apply hierarchy of the individual to the business
      • survival, success, transformation: for business/life
    • Adoption results in lowered turnover, increased customer loyalty
    • manage what you can measure - old way
    • need to manage the intangible (top of the pyramid)
      • we think they are important - no idea how to measure them
    • an alternative measure of success: gross national happiness (GNH)
      • start measuring and monitoring happiness 
    • about creating the conditions for happiness to happen
      • have lots of indicators, questions, etc
    • emotional equation
      •  happiness: wanting what you have (gratitude) / having what you want (gratification)
      • USA/West are a bottom heavy culture
      • happiness is not an object
    • GDP is important although doesn't count a lot of things that matter to us in life
    • maximizing GDP optimizes tangible success, but not intangible happiness
    • need to create conditions for happiness
      • can have inspired employees and tangible profits
      • what counts? 

Saturday, July 17, 2010

LOST, productivity

It's winter here. The days were getting shorter, the mornings and evenings darker and colder, and in late May I started watching the TV show Lost with my partner. The show had recently aired its last episode, so we promptly acquired all six seasons and started watching. It had a small science fiction / fantasy bent and lots of mystery so it sucked us in.

Having all the episodes of a show that is designed to suck your attention is dangerous. Lost totally killed my productivity over the last ~2 months. I've created almost nothing, and my reading has dropped off slightly as well. Our TV consumption jumped to a solid 2-3 hours an evening and perhaps more on weekends, eating up almost all valuable non-work time.

I'm taking stock now that we've finished. I'll be fair and say the show was interesting in the first two seasons. On reflection, the story took a downward trend, and although the final episode/season technically resolves the plot, I felt the whole fifth and sixth seasons were kind of bland. The mystery and mystique of the history of the island and the physics questions (the parts I spent the most time thinking about) were slowly revealed, as were the backstories of the Dharma Initiative and the hostiles/others, which were hacked at rather than peeled away.

The Lostpedia has been pointed out to me, so I might spend a little time verifying some assumptions and learning about the augmented reality games that accompanied the broadcast.

Whatever, we had fun. I think I would feel less dirty had I consumed an average of perhaps 1 episode per day and spent more time tangibly producing rather than consuming. Easy access made it all too easy to press play on the next episode. That is definitely it for high-rate TV consumption for a while now - time to get back into the book project in a big and furious way!