I was listening to a talk by Nicholas Negroponte regarding his One Laptop Per Child Project, and was struck by his description of stripping back the idea of a laptop and reformulating it into something that would be a cheap and effective tool for learning in developing countries. A powerful vision and a brilliant man, no doubt about it. I was really interested in the notion of redesigning a common consumer product for a similar but different and more specific purpose, and the insights that provides on what that product is and how it is conventionally used. I was also thinking about the methodology those guys used to even be able to get into the right 'head space': principally stripping away commercial perspectives on technology design and construction, such as feature bloat and lock-in.
The rapid development and release of primitive versions of ideas followed by incremental improvement is the mantra of agile software engineering, and of some of the Godfathers of startups. This process is a methodology for distilling an idea for a product or service. The realities of implementation, technical minutiae, and related time pressures force the distillation. I would advocate stopping there. Keep the minimal version of the product, and exploit incremental improvement to improve the adaptive fit of the system for its application (however the users help you to define what that means). Specifically, I'm referring to the sometimes irresistible urge to add features that are not core to a problem's solution (what the hell is an AJAX?).
I like visiting big websites that have heaps of features, although I don't want to live there. I use them like shopping centers to meet a specific need. I may browse or window shop a little, although I usually bail once my need is fulfilled. Simply, I have better ways to spend my time. I 'live' on minimal websites. These are the online properties that get the most so-called 'face-time' in my browser because they help me to get things done, even if that 'thing' is optimised procrastination. Google products, for example, let me do things. Beyond a minimal design, the service sites of choice are minimal in features. For example, when I'm collating some notes, Docs serves me better than Notepad, and trims the gigabytes of features from MS Word that I have never touched, and never will. I fill my digital world with an ecosystem of these minimal service solutions. Their simplicity comforts me: as a user I feel like I have some control and that the service is transparent. Whether these perceptions are true or not doesn't matter.
A minimal product or service represents a perfect case for a technology startup. A well defined problem with a no frills solution. A solution that nails the problem, irrespective of whether the problem is for a low percentage of the web users with the problem. Realising ideas as tangible products and services, like anything, is a skill that improves with practice. You educate yourself, and train on manageable cases of increasing difficulty (ambition). Plans of glory are checked at the door, although success comes to those that work hard.
Around 2005 I was managing my bookmarks in an XML document which I would run through a simple XSLT to produce a webpage I could access at work and home. Serendipity revealed del.icio.us to me, which solved exactly this problem for me. I no longer value bookmarks (search is king), so bookmarking applications no longer appeal to me. A few months of pushing links to the database highlighted the point. I got a lot more value mining the corpus for relevant pages (yet another way for me to search), and using the 'just posted' page as a pseudo news aggregation site. I liked and used it because it was simple: it did little other than capture my bookmarks for later location-independent use. It was not part of some big machine which tried to sell me other services or blind me with features I didn't understand or want (we're all just dumb users after all, just in different contexts).
del.icio.us is a startup wannabe's wet dream. It is a simple site, with an enormous user base, built and maintained by one guy, that was bought out for lots of cash. As a case study it is important to point out that 'the guy' did a whole lot of web-based problem solving before del.icio.us, most likely spending years refining his capability to realise and launch ideas as viable products and services. Popularity, and therefore the user base and buyout, may have been lucky, but you can bet that his execution of the project was close to clinical. Other seminal one-man cases include Richard Cameron's citeulike for academic references (a killer application as far as I'm concerned), and Gabriel Jeffrey's grouphug (procrastination or research into the human condition, you choose).
The pattern, like writing a research paper, is: one problem, one solution, no frills. The solution is arrived at not by coincidence or luck, but systematically and as a result of training. I would advocate not just minimalism in the presentation-level design (an implementation detail), but rather the methodical and informed removal of choice. Specifically, I'm referring to Richard Gabriel's notions of "worse is better", and Barry Schwartz's take on "less is more". A persistent minimal solution is not one-size-fits-all (case in point), but it is a viable perspective and a good start for startups.
Monday, March 31, 2008
Beyond Minimal Design: Minimal Solutions
Sunday, March 30, 2008
Software Doesn't Rot, But What If It Could?
Code is not magical; rather, when you have no clue what is going on 'under the covers' it can seem that way. For example, use any Google product, and stuff just works. Functions correlate with intuition such that you are totally task focused (find x, email y, blog z) rather than sidelining your task to deal with the particulars of an esoteric perspective of a process. In particular, I felt the decoupled nature of 'software' and 'activity' when I used to play computer games, and more recently when observing the real-time sampling behaviour of a stochastic optimisation algorithm. The game environment becomes my temporary reality, or I treat a complex process as an entity independent of the code I have been slaving over for weeks. Even with an acknowledgment of computability theory, it is easy to forget the fine grained design determinism of the software we are using, and focus on its capability to facilitate communication and automated computation.
Problems regarding the maintainability of software are referred to as software rot, which basically describes the mismatch of a software system to its use. It is an emotive analogy from software engineering, and like Software Evolution, it provides a useful context for aggregating related concerns, although it can cause problems when pushed too far. For example, I recently had a discussion with a colleague who was interested in investigating software development and release cycles as an evolutionary process, an idea I quickly dissuaded him from pursuing. Likewise, software rot can be pushed too far when people start using terms like software getting 'worn out' to describe software maintenance problems. Although software is automated logic execution, it is not mechanical (although the mechanical perspective is another attractive perspective); it does not wear in the physical sense. I was reading about the inherent lack of value of discrete logic in software when I was reminded of a notion I had a number of months ago: what if you created a system in which everyday software can evolve and can be worn out?
Naturally you may jump to notions of evolutionary algorithms, specifically Genetic Programming, which is a nice automated inductive logic generating process, although it scales poorly with regard to complexity. I'm sure that all the smart kids that are hooked on GP have a similar dream for software, at least initially. This was not the first place I went; instead I was thinking more generally. Irrespective of the technology used to achieve the effect, the problem is that of 'automatically' (using computation) rather than 'explicitly' (via human designed and coded updates) refining the adaptive fit of a software system to its application. The problem, as is usually the case with interesting problems of adaptation, comes down to that of credit assignment. Specifically: how to identify and measure the adaptive fit of the software system? For example, is it the features used or logic executed that is important? I suspect that the problem is domain dependent, and also that it is possible to devise a general 'good enough' method. I believe that there are a lot of indicators that could be exploited, although selecting 'useful measurements' is by far the hardest part of the problem.
Wear can be a good thing in physical systems, especially mechanical systems made of metal. It can help find the sweet spot, although tends to need lubrication and eventual replacement. Evolution also has its benefits such as ruthlessly although slowly adapting species to a (typically changing) ecological niche.
Two important influences on my thinking are both web based (the Internet is an everyday application right?): distributed generative art, and real-time interactive art. Specifically, I am referring to the use of automated inductive processes (like genetic algorithms) that exploit feedback signals from multiple users (computers), such as Picbreeder, Electric Sheep, and others. Regarding real-time interactive art, I am referring to web pages that allow many users to work together on some product like The Broth, Beauty in Chaos, drawball, and others (I posted in '07 about these technologies [pdf]). I like these examples because the art focus makes the underlying idea of collaborative product adaptation far more transferable. The examples also highlight that the collaboration can be explicit or implicit, the latter of which is likely the most powerful (less intrusive).
Forget the method for now, assume we can measure well and do all our induction and reasoning reliably. I have a fast computer that typically does nothing, so let's further assume that the computation for such adaptation is local for both local and distributed applications, although likely reasoned from an aggregated corpus of collected information. This provides a platform to begin reasoning about the behaviour and effects of such software systems.
The notion of developing a browser or similarly complex logic from scratch is preposterous, although consider a browser augmented with such capability. Beyond smart menus, consider a population of browsers (the entire user base) refining themselves to their users, likely streamlining away all those menu options that are never clicked and logic never executed. Think bigger, such as biases on the network connections that can be made (only visiting US sites, or reddit on Friday afternoon), or even the mechanisms by which URLs are perceived and manipulated. The adaptation of fit is not constrained to adding and removing features, but rather extends to the more subtle influence over the interaction of the user with their application process based on individual or aggregate behavior. A raw HCI task focus of interactions provides a powerful optimising lens with which to automate the refinement of software.
Away from science fiction for a second and back to the web. As highlighted by the AI Effect, and acknowledged by any software engineer who has had to write user interface validation logic, there is a disparity between perceived system behavior and perceived underlying complexity. Good software is easy to use and it is hard to make, although does it have to be a human that makes it? When I visit a web page for the nth time, I want it to recognise me, remind me of all the good times we have had together, and suggest a relevant current offering of information to digest. I want the process of reading news to be like an old friend telling me gossip that he thinks I'm interested in hearing. Instead pages are generally static, or dynamically generated for the masses or 'people like me', not for me. I never want my emails listed only by order of arrival; I want them prioritised by the maximum pay-off of me responding to them. I think putting up with software rot (more likely maladaptations) is worth the expected gains of such primitive improvements.
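To make the 'streamlining away' idea a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the streamline_menu function, the min_share threshold, and the idea that usage is available as a flat list of clicked item names. It is not any real browser's API, and it sidesteps the hard part (deciding what to measure) by simply counting clicks.

```python
from collections import Counter


def streamline_menu(menu_items, click_log, min_share=0.01):
    """Split menu items into (visible, hidden) by share of logged clicks.

    menu_items: ordered list of item names (hypothetical).
    click_log:  iterable of clicked item names, from one user or
                aggregated over a whole user base (hypothetical).
    min_share:  assumed cut-off; items below this share are hidden.
    """
    known = set(menu_items)
    counts = Counter(click for click in click_log if click in known)
    total = sum(counts.values())
    if total == 0:
        # No evidence yet, so leave the menu untouched.
        return list(menu_items), []
    visible, hidden = [], []
    for item in menu_items:
        share = counts[item] / total
        (visible if share >= min_share else hidden).append(item)
    return visible, hidden


menu = ["Back", "Bookmarks", "Print", "Page Setup"]
log = ["Back"] * 90 + ["Bookmarks"] * 9 + ["Print"]
visible, hidden = streamline_menu(menu, log, min_share=0.05)
print("keep:", visible)
print("tuck away:", hidden)
```

Hiding rather than deleting is a deliberate choice in the sketch: a wrong guess about fit then costs the user a little searching rather than a lost feature.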
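The email wish can be sketched in a few lines of Python, assuming (generously) that something upstream has already estimated how likely a reply is to matter and what it is worth. The Email class, its p_reply_matters and payoff fields, and the prioritise function are all hypothetical names; producing those estimates is exactly the measurement problem discussed earlier and is not attempted here.

```python
from dataclasses import dataclass


@dataclass
class Email:
    subject: str
    arrival: int            # arrival order, lower means earlier
    p_reply_matters: float  # assumed estimate in [0, 1]
    payoff: float           # assumed value of responding


def expected_payoff(email):
    """Expected gain from responding: probability times value."""
    return email.p_reply_matters * email.payoff


def prioritise(inbox):
    """Order by expected pay-off (highest first), then by arrival."""
    return sorted(inbox, key=lambda e: (-expected_payoff(e), e.arrival))


inbox = [
    Email("Newsletter", arrival=1, p_reply_matters=0.05, payoff=1.0),
    Email("Supervisor: draft comments", arrival=2, p_reply_matters=0.9, payoff=10.0),
    Email("Conference deadline moved", arrival=3, p_reply_matters=0.6, payoff=8.0),
]
for email in prioritise(inbox):
    print(email.subject)
```

Arrival order is kept only as a tie-breaker, so the list degrades to the familiar chronological view when the estimates carry no information.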
Saturday, March 29, 2008
Cycling Hobbies to Day Jobs: A Personal Assessment of Starting a Startup
When I was exposed to programming in an official capacity (university) I hated it. At, and before that time my hobbies included learning about computer networking, hacking around in Linux, and modding computer games, much of which I later realised involved 'programming' (compiling drivers, changing and compiling game code, writing scripts, etc.). Later, after I made this association, and more importantly when I saw the creative power of the medium, my spare time was filled primarily with programming projects and the rapid assimilation of anything and everything that would help me create better or more interesting programming artifacts. Work products from my spare time crept into my student life (code examples, awesome assignments, etc.) and I was quickly employed to do it for a living.
This recurrent process of transitioning 'hobby' to 'day job' has been a common theme in my interactions with computers. For example, I was exposed to networking technologies (which I disliked intensely) when I spent my spare time playing games, and networking progressively overtook my interest in games after participating in and organising LAN parties and modem variations of the games I was playing. Presently my 'day job' is research in Artificial Intelligence, which was my hobby when I worked as a software engineer (about the most interesting thing from the study of programming I could find). Observing this trend, one may be inclined to consider the exploitation of this organic process: what is your hobby (something you love, or enjoy at the least), and how can it pay the rent (become a day job)?
Postgraduate life is awesome, although having to deliver work product in an official capacity can suck the fun from the experience without discipline. For example, I dislike publishing my research (an academic career killer), although I love writing it up, particularly for the refinement of my ideas that it affords. The important aspect of having a stake in the research project (essentially being the sole stakeholder) is that you actually care about the end product. As a result, I have found that my hobby and my day job have merged. This is not completely accurate, as I have invested time in researching the hell out of interesting pockets of work that do not directly contribute to my project, and that may be considered a 'hobby' in the 80/20 consideration of work (for example complex adaptive systems, satisficing, human computation, and all kinds of methodologies). From a more abstract perspective, my hobby over the last 3 years may have been a form of project management, while my day job has been to 'do research'. Besides reading about esoteric facets of computer science, and stressing about research project scheduling, I have pondered the latest web technologies and principles of being involved in a technology startup.
A rational assessment suggests that: Software Engineer + Intelligent Systems Research = Job Building Intelligent System Software. I'm cool with that, although I question whether the only options for such work are academia and big (read: someone else's) business. The responsibility of management and control has impaired my desire (not ability mind you) to do someone else's bidding. (Un?)Fortunately there are many resources out there for people thinking along these lines, not the least of which are Paul Graham's page-rank prominent Essays. Frankly, I enjoy his writing style and the optimism, and I agree with much of what he has to say regarding education and technology, although I'm pragmatic enough to not take his retrospective advice too literally. Specifically, I subscribe to the messages of 'work on what you love' (I live and breathe 'always produce'), 'having the guts to try', and 'work on hard problems', which he revisits many times in his writings.
Regarding 'work on what you love', I have had the fortune, and now the bias, of thinking that if you work hard enough on your personal passions, good things will come, like getting a day job. An observed flip side is that I keep needing new hobbies, because when fun work is structured with responsibility and expectation, it's no longer as fun. Regarding 'having the guts to try', it's a no-brainer for anyone ambitious, although it is a hard lesson to take on board, especially when pushed by someone whose current day job depends on it. By the way, I think it is brilliant to start a startup that helps people start startups. If only you could mitigate the risk of running such a thing remotely (franchise?), I'm sure the worldwide market would be huge! Anyway, having attractive opportunities fall in your lap removes the need to make this important and hard decision. I am at this crossroad now, and I equate it to the crossroad I faced nearly four years ago of whether to give up a perfectly good job (and relatively linear career) as a consultant to research AI. I think rational decision making is not the problem; rather it is the objectivity of the process, specifically the emotional impact of the unknowns involved. Finally, regarding 'working on hard problems', Hamming's advice has influenced me since the day I first read it, and hat tip to anyone spreading the good word.
Two cases pushed me over the line and helped me decide to pursue my own technology startup before seeking classical employment: (1) As a tool, intelligent systems are an underutilised powerhouse which translates to competitive advantage (Graham's popularised example makes this point), and (2) An ex-colleague deliberated on the decision, gave it a go, and successfully launched a product (Elimatta) in 3 months (localised confirmation that it can be done). Additional no-brainer points include my own financial security, the low and continually decreasing cost of bringing web products and services to market, and my lack of broader commitments.
I am left with questions regarding the relationship between hobbies and day jobs, particularly given my biased perspective. My suspicion is that their present merger will hold for some time yet, as I'm sure starting a startup, much like graduate research, does not promote consideration of additional pleasurable workload.
Friday, March 28, 2008
LaTeX and a Strategy for Overcoming the Learning Curve
In October 2007 I made the decision to start writing up my dissertation, and the platform I chose was LaTeX. Frankly, I chose LaTeX because the quality of the documents produced puts other products to shame! Specifically, my only other choice was MS Word (or equivalent), which has treated me well, but from initial testing was clearly not up to the task (master documents suck!). Reading up on popular news sites at the time also provided me with further evidence. Presentation in the thesis is everything. It represents the pinnacle of 3+ years of research and is the primary method by which the work produced during the PhD is assessed. The decision to use LaTeX was made after checking out a number of theses and papers produced by the platform, after which my primary aim was to address the single pain point claimed by others that had made the decision: the learning curve. I documented the progress of the strategy I adopted (links and opinions), which involved two main thrusts: (1) reading seminal LaTeX literature, and (2) learning by doing.
I started out by reading up on LaTeX and TeX in Wikipedia, the result of which suggested to me that there was plenty of free documentation on the web to exploit (voiding the need to hit the library), and that getting into TeX was not necessary. The first two big problems I wanted to address were the choice of an IDE, and the specifics of reference migration. Googling revealed the popularity of the MiKTeX distribution, reinforced by the longevity and number of downloads in the project's statistics. Reading up on popular LaTeX IDEs led me to TeXnicCenter, which also had convincing project statistics. Googling on these two packages in combination supported their compatibility and combined popularity. All of my references were stored in ProCite, which I learned I wanted to migrate to BibTeX format. I chose JabRef to maintain my references (in particular the WebStart version for automatic deployment of updates). Reference migration was a pain: I exported from ProCite, imported into EndNote, then exported from EndNote and imported into JabRef using this guide (RI format). Some data cleaning in JabRef was required (I had ~1000 entries at that time), although overall the migration was pretty smooth.
After I was up and running, I educated myself on the basics of getting things done in LaTeX. I printed a copy of the LaTeX Cheat Sheet and memorised as much as I could. I downloaded the introductory guides from the LaTeX Project, and read through them end-to-end. Specifically, LaTeX: an introduction and The (Not So) Short Introduction to LaTeX2e. I realised I was going to have two main problems: graphics and tables. I flicked through Using Imported Graphics in LaTeX2e to get up to speed on graphics, and Tables in LaTeX: packages and methods to get up to speed on tables. Regarding tables, I chose to prepare my data in MS Excel, then use LaTable to convert the CSV into a LaTeX table. Regarding graphics, I had already created many images in MS Word. I copied these from MS Word into MS Visio and exported as WMF. I then imported them into TPX, exported as EPS, and included them in my documents using the graphics package. Regarding images that were in JPEG, I used Inkscape to convert them to EPS for usage in my documents. Graphics were the weakest part of my tool chain, and looking back on the monotony, I should have invested more time in this area (heed my advice!). A final area in which I invested time early was algorithm representation (pseudo code). I looked at a lot of examples, and came across Algorithm2e, which I adopted because I thought it looked the best, especially after further tweaking.
I wanted to use an off-the-shelf thesis template. Searching revealed a number, although I settled on an integration of three specific templates from around the web: here, here, and here. The most important principle was separation: specifically, the main thesis file displays nothing, and each chapter (including appendices, front and back matter) is maintained in a separate file. This allows a lot of control over the working set (using \includeonly) for compiling and drafting. I also modified the template based on my school's requirements (Swinburne), and documented all of these specifications in the template to continually remind me. I managed to find lots of tips and tricks all over the web for both LaTeX (wikibook, tutorials), and TeXnicCenter. I used an Australian dictionary file which I continually added to with domain specific words. I had a lot of problems with too many words, which I measured using the excellent web-based LaTeX word count script. Going forward, I plan on pushing my LaTeX out to the web as a template and thesis source to help future "other me's". Regarding IDEs, I have played with TeXlipse, which I have found to be quite good, although not as mature as TeXnicCenter (yet). Specifically, I love the Eclipse platform, and all the cool tools that come with it, so I would love to see this variant of the tool rise up and take over, just like Eclipse did with Java IDEs.
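For anyone without LaTable to hand, the CSV-to-table step is easy enough to approximate. The following is a rough Python stand-in under my own assumptions, not how LaTable itself works: csv_to_tabular and escape_latex are hypothetical helper names, every column is left-aligned, and the first CSV row is treated as the header.

```python
import csv
import io

# LaTeX special characters and their escaped forms.
_SPECIALS = {
    "\\": r"\textbackslash{}", "&": r"\&", "%": r"\%", "$": r"\$",
    "#": r"\#", "_": r"\_", "{": r"\{", "}": r"\}",
    "~": r"\textasciitilde{}", "^": r"\textasciicircum{}",
}


def escape_latex(text):
    """Escape one character at a time so nothing is escaped twice."""
    return "".join(_SPECIALS.get(ch, ch) for ch in text)


def csv_to_tabular(csv_text):
    """Convert CSV text into a simple left-aligned tabular environment."""
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    if not rows:
        raise ValueError("no rows in CSV input")
    ncols = max(len(row) for row in rows)
    lines = [r"\begin{tabular}{" + "l" * ncols + "}", r"\hline"]
    for index, row in enumerate(rows):
        # Pad short rows so every line has the same number of cells.
        cells = [escape_latex(cell.strip()) for cell in row]
        cells += [""] * (ncols - len(cells))
        lines.append(" & ".join(cells) + r" \\")
        if index == 0:
            lines.append(r"\hline")  # rule under the header row
    lines += [r"\hline", r"\end{tabular}"]
    return "\n".join(lines)


print(csv_to_tabular("Algorithm,Mean\nGA,0.91\nRandom_search,50%\n"))
```

The output can be pasted into a table environment as-is; column alignment and captions are left to hand-tweaking, which is where most of the time goes anyway.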
The general strategy I adopted was the same one I used back in my consulting days to get productive on a new technology as fast as possible. Generally the procedure used in this case was as follows:
- Preliminary Reading: This step is all about coming to terms with the capability and placement of the domain in the context of the problem to be solved. What is it, how does it help me?
- Environment: All about getting a standard operating environment up and running as quickly as possible. Importantly, there needs to be some level of trust in the tools used.
- Detailed Introductory Reading: This step is about coming to terms with the principles of the technology. Specifically, the details of how to use it, best practices for completing common tasks, and so on. This foundational education is critical for future troubleshooting.
- Project Structure: The skeleton design of the scope of the project using the selected tools and initial application of acquired best practices. This includes a well structured project directory, naming conventions, and initial content. This structure houses all future work on the project until completion.
- Learn While Doing: Do the work, and acquire specific details as required.
Importantly, the outlined 'learning by doing' strategy works. It has given me a functional understanding of LaTeX and a coherent written-up dissertation (most of the research was complete prior to the switch) in about six months. For anyone out there on the fence, my advice is: if the document matters to you and you want it to look professional, make the switch and get on with it!
Thursday, March 27, 2008
Rants as the Basis of Knowledge
I was editing my thesis this morning and relatively pessimistically thought of the whole lot as a giant rant. Sure, it's systematic and contains published research (it may even be called scientific), although referring to the mass of typed pages as a rant feels strangely accurate. I do not mean the term in the emotional sense, as the work is objective; rather I mean it in the sense that the mass of pages represents my biased investigation and resulting biased perspective on the particular research question. Books and all (let's say written for now) communication are rants, just really well structured and written by people who know more than the average (usually) about a given domain.
Extrapolation of this perspective suggests that the basis of human knowledge is the summation of biased perspectives. Two obvious clarifications are, firstly, the biased context for communicating knowledge, and secondly, the biased context for interpreting communicated knowledge. Pushing the perspective too far suggests the ability to support arbitrary 'facts' with a seemingly reasonable argument of hand-picked (bias with intent) evidence. This does occur (mass media), although it is generally averaged out, where knowledge is defined as commonly accepted (easily associated into the framework) facts. This is likely the basis of the perspective that large jumps (facts that are difficult to associate or integrate into the framework, like 'a perpetual motion engine') are treated skeptically, even in the presence of 'evidence'. It also hints at work that is referred to as 'before its time', as a suitable context or framework had not yet been devised.
Ultimately, higher education (for example some history or philosophy of science) teaches you about the fluid nature of facts and knowledge, although the subtleties and implications (in my case) seep in over a longer period of time, particularly the structures for integrating, and frameworks for assessing, facts (self correcting frameworks are cool). The observation this morning reminded me of a discussion with a friend a number of months ago. She was reading Bryson's A Short History of Nearly Everything (highly recommended!) and said she 'felt cheated' by her education (she's a practicing vet). The core of the problem was that Bryson's treatment of the history of science highlighted pointedly the fluidity of the 'facts' that define the conceptual frameworks taught and used in K-12 and beyond. The disgruntled perspective comes from the point that the (potential) instability of 'foundational knowledge' was only recently revealed to her, and serendipitously, through reading a seemingly innocuous book.
Thinking about it deeper, it is clear that the perspectives that define the presented 'foundation knowledge' are those arrived at by consensus, basically perspectives that are the most amenable to a broader sample of humans (easy to associate or integrate) using popular methods. I was talking this over with D in the context of what we agree on as seminal dissertations and reference texts in our field, and it is clear that for a rant to rise to the stature of 'seminal' there is a set of objectives to maximise, not limited to readability and precedence, as well as the simplicity (elegance!) and distinctness of the presented perspective of the material. Now, if only I could crack that nut...
Wednesday, March 26, 2008
Top-Down and Bottom-Up Acquisition of Principles
A colleague of mine recently went to a job interview, before which it was suggested that he bone up on basic programming-based problem solving and related things (discrete math, Big O, algorithms, and data structures). The boning up was suggested because the interview was for a Software Engineering role and involved a skills assessment. He's a trained Electronic Engineer with some Computer Science and Research mixed in, and was reasonably concerned with the trade-off of not misrepresenting himself whilst aiming to nail the interview. The situation resulted in a discussion on the quandary of rapidly learning the first principles of a field of study.
A first pass at the problem led to the natural observation of the skills assessment approach, and the general philosophy of: if you breeze through the easy problems, then trust is instilled in your capabilities with harder problems. All quite reasonable. For example, if I were to hand over commit rights on my baby to an unknown programmer, I'd need to know that they were aware of Evolutionary Computation 1 and 2 (or equivalent seminal references), and knew the principles of building good software. I related two anecdotes from my modest time out in the world developing software. The first advocated the need for first principles, involving a story of how a grounding in concurrent programming helped me solve a sticky synchronization problem. The second, more interesting example highlighted the need to perpetually acquire relevant principles in unfamiliar domains, where I related a generic .NET web services problem and the general problem solving-based strategy for rapid self education (the mantra of the consultant?).
A deeper consideration highlighted the differences and inter-relatedness of the two approaches to acquiring information. The first case of classical education may be considered bottom-up, providing a slow, systematic, although generalised treatment of a domain (the classroom). The second case of as-needed self education may be considered top-down: it is highly motivated, fast, and importantly extracts specific principles from a domain (life long learning). The directedness of the two extremes highlights the manner in which the principles of a domain are acquired: submerged within or scavenging from above. In the case of the job interview, the expectation was a bottom-up education in programming and software engineering, although most of the excellent consultants I had the privilege to work with were trained engineers (seems topical) and most likely acquired the principles of software engineering in a top-down manner (at least initially).
The slow speed of the bottom-up approach promotes strong contemplation and likely improved integration via association (building the structures with which to associate). The generality and lack of personalized motivation (at least in my case) suggest a broader acquisition of the domain, of which only residual structures are accessible, although fuller understandings are derivable (for example you can't remember specifics, although they can be obtained through prompting). Top-down information acquisition, on the other hand, is sticky in that the occasions are the basis of anecdotes, interestingly in my case both in the application of latent and the acquisition of new principles. The specificity of the top-down approach heavily biases the coverage of the domain, and therefore the principles observed and acquired, and the broader perspective of the domain that results. A discussed example was the PhD itself, where although you become a world-class expert in an extremely narrow field of study, the amount of self education required to get there leaves a feeling of incompleteness ('what have I missed?') and a general sense of resentment of the generalised scope of bottom-up education ('why wasn't topic X covered in subject Y?' or 'why wasn't there a subject on X?').
Anyway, I am unclear on how these ponderings fit into the scope of the so-called philosophy of education (if that is even the right context in which to consider such thoughts - a concern of top-down). Nevertheless it is interesting to consider some personal examples of this perspective on education. First-language acquisition is an example of top-down education, although I feel somewhat cheated that the rules of English (grammar for example) were not treated in far more detail in secondary school. This highlights an important interplay between the extreme perspectives, that is the need to make the time to take it slow and acquire the principles of the broader domain after a top-down approach has been taken. One can imagine an agile life-long learning methodology of frequent top-down acquisition and infrequent periods of consolidation with bottom-up reinforcement. In fact, I have little doubt that effective people adopt a related strategy, either unconsciously or by design. The method of acquisition is important because, although application of information is discrete and specific, competence is likely defined by having acquired information by both top-down and bottom-up means.
Tuesday, March 25, 2008
Writing to Clear the 'Idea Buffer'
This site has been live for a week now and I thought I would reflect both on my motivations for writing and intentions for the future of this blog (yes I know it's a cliche!). Firstly, my motivations for writing are selfish. The two specific interrelated problems that this blog addresses are: (1) capturing ideas that may otherwise never see the light of day, and (2) purging my 'idea buffer' to foster progress.
I talk a lot of crap with a lot of people (typically neighbours, colleagues, and friends), which usually results in all kinds of distilled opinions and project ideas. Sometimes those ideas are written up, and sometimes those written up ideas are acted upon. It has taken me a number of years to realise that sadly many of the specifics from interesting discussions are lost, although the useful general principles are usually recycled. Posting on ideas and distillations of conversations provides an excellent free-form documentation process, with all the great things that the medium encourages, such as cross-referencing and comment amendments. The most important feature of the medium is 'public by default', promoting sharing with interested parties, and if I'm lucky comment from a broader community. Further, from my previous blogging exercises, I have found that the converse to 'public by default' is that generally people don't care, which mitigates so-called 'sensitive issues' such as priority on ideas for projects, an effect that increases with the specialty of the subject matter.
Another lesson that has taken me a long time to learn and effectively exploit, is that my 'idea buffer' is finite, and once clogged up with a few interesting observations or principles for projects, all further thinking is biased. Additional idea generation is painted by whatever is clogging the buffer (for example if I'm interested in MapReduce, then everything will look like a divide and conquer), the elaboration of ideas in the buffer is stunted, and new ideas are assessed way too aggressively. Writing obsessively purges my buffer, and as a process has treated me well with regard to base work product for my PhD dissertation.
Frankly, selfish writing sucks (blogs included)! I try to address this by setting up topics, and ruthlessly cross-referencing both previous posts and related material. I have much to learn, although it helps to have less selfish writers as role models (for example Paul, Kevin, and Jeff). I think of it in terms of a multi-objective problem (for example attributes of 'audience'), where non-dominated solutions satisfy specific sub-groups (can't please everyone).
The plan for the future is to produce original content related loosely to software engineering and artificial intelligence, and average a post per day. I have few illusions regarding the construction of a passive income stream, although I have included Amazon Associate content. I like books and I discover and buy books based on the discussions and recommendations of others. Therefore, referenced books link to Amazon, and the blog includes a single banner advertisement of suggested products based on the posted content (the so-called Omakase widget), the latter of which can be ignored using the excellent (and free) AdBlock Plus extension in Firefox.
Monday, March 24, 2008
AI as Pushing the Bounds on Automated Computation
I was reading the post by KK on the acceptance of disruptive technology, where he comments on the ease in computer science and the difficulty in medicine and biology. What I found interesting was the setup comment on things computers do well (better than humans). Specifically, he lists automated computation activities such as arithmetic, spelling and OR.
This got me thinking about two things, how automated computation is taken for granted, and the other things we exploit computers for besides automation. Regarding the first point, a raw interpretation of HCI promotes the idea that peripherals such as displays, keyboards, and software are tools for optimally automating our computation (cheap automated computation is no doubt responsible for the information age). Fair enough, although my paraphrasing clearly presents an antiquated perspective. Computers are also our note books, source of entertainment, epicentre for personal research, and perhaps most importantly our portal to broader social interaction.
Anyway, back to computation. Computers are great at repeating well formed procedures, quickly. Even with lots and lots of computers, though, there are problems that are still really hard. One may simplify things such that progress may be achieved through work on rephrasing problems to make them amenable to automation (problem focus), work on computational systems that can be applied to automation tasks (algorithm focus), or both at the same time.
An intersection that has always fascinated me is automated induction with humans in the loop, such as interactive evolutionary computation (unfortunately great examples don't extend beyond art), and more recently popular interactive machine learning. These are excellent technologies exploiting the principle of using humans for parts of the problem for which they excel, and machines for fast automated computation. Two other examples that I am extremely fond of include: the use of computers to harness large diverse sets of human problem solvers (crowdsourcing such as the Mechanical Turk), and the genius move of exploiting such a pool to both solve individual human problems and broader problems at the same time (so-called Human Computation, with reCAPTCHA as the seminal example).
Kelly's comment also provoked thoughts on other computation humans don't do well, no doubt reflected in what our automatons cannot do well. In fact it is fair to say the majority of work we get computers to do is just scaled up (time and/or space) automations of what we would get humans to do. Anyway, two examples of what humans are generally bad at include long term planning, and understanding influences on short term decision making, although once identified, we can improve with frameworks and training. For example, see The Long Now Foundation (in which KK is involved), which promotes long term thinking, and Gladwell's Blink on our limited understanding of instinctive decision making.
An automated computation perspective helps simplify things, especially when considering Artificial Intelligence. It naively suggests that all we have to do is apply ruthless reductionism to complex systems, distill them to salient information processing qualities, and realise them as procedures in computers. As uncomfortable as this may appear at first glance, it works, as long as you use a good methodology. It's hard work, and we have much to learn, although with so much cheap computation around, we can start to study emergence in very large simulations (such as IBM Blue Brain simulating complete portions of the neocortex).
Sunday, March 23, 2008
Toward a Standardised Description of Optimization Techniques
In considering the insane notion to implement all computational intelligence techniques, one can't help but think about standardisation, simply as a method for reducing effort. A particular standardisation I have been using in my thesis, and talking over with D is that of the description of biologically inspired optimisation algorithms, that naturally extends to all of CI.
The idea was born out of the question as to how to communicate an approach in one page. Specifically, the salient features and capabilities of an approach (also valid for a class of approaches) would be distilled into a single A4 page, along the lines of a cheat sheet or a one page resume. The reason for such compression is not to communicate the ins and outs of a given approach (there are plenty of works out there that do that well enough). Instead, the idea was proposed out of the necessity to compress the information on approaches in order to effectively communicate the choice between approaches.
The general framework involved describing each approach in terms of four key perspectives:
- Metaphor: The inspiration that motivated the human in the development of the approach, providing a common context for the strategy.
- Strategy: The information processing properties and emergent behaviours of the approach, describing the principles and expectations for the operations.
- Operations: A technical summation of the bottom-up procedures for achieving the strategy, sufficient for specialisation and implementation.
- Further Reading: Lists primary sources and seminal works supporting and elaborating the compressed descriptions.
The collection and communication of the approaches in this method would provide insights into both important similarities and differences between approaches, potentially promoting improved hybridisation, informed extension, and at the very least a broader perspective than that of the Computational Intelligence or Metaheuristics perspectives (or subsidiaries).
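As a rough illustration, the four perspectives could be captured in a small record that renders to a single page. This is only a sketch: the class name `AlgorithmDescription`, its field names, and the `to_page` method are hypothetical, and the Genetic Algorithm entry is a hand-written toy example rather than a finished description.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AlgorithmDescription:
    """Hypothetical one-page description of an approach."""
    name: str
    metaphor: str                # inspiration behind the approach
    strategy: str                # information processing properties
    operations: List[str]        # bottom-up procedure, step by step
    further_reading: List[str] = field(default_factory=list)

    def to_page(self) -> str:
        """Render the description as a plain-text cheat sheet."""
        lines = [self.name, "=" * len(self.name), ""]
        lines += ["Metaphor: " + self.metaphor, ""]
        lines += ["Strategy: " + self.strategy, ""]
        lines.append("Operations:")
        lines += [f"  {i}. {step}" for i, step in enumerate(self.operations, 1)]
        lines += ["", "Further Reading:"]
        lines += ["  - " + ref for ref in self.further_reading]
        return "\n".join(lines)


# Toy entry, written only to exercise the structure.
ga = AlgorithmDescription(
    name="Genetic Algorithm",
    metaphor="Evolution by natural selection acting on a population.",
    strategy="Repeatedly bias sampling toward better candidate solutions.",
    operations=[
        "Create a random population of candidate solutions.",
        "Evaluate each candidate against the objective.",
        "Select parents, recombine and mutate them into offspring.",
        "Replace the population and repeat until a stop condition.",
    ],
    further_reading=[
        "J. H. Holland, Adaptation in Natural and Artificial Systems, 1975.",
    ],
)

page = ga.to_page()
```

Keeping every approach in the same shape is what would make side-by-side comparison (and a community-editable web version) straightforward.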
The vision for presenting this work is a book (Algorithm Atlas) and a web page, potentially hooked into the mapping of NFL, and potentially generally editable toward community-based improvement.
Saturday, March 22, 2008
Mapping 'no free lunch'
An idea I have been rolling around recently with my colleague (Dan) is to race a stack of machine learning algorithms (starting with computational intelligence algorithms on optimization problems). This basic idea has expanded to include notions of full automation of case integration and execution (upload your own systems and problems), and beyond.
Firstly, racing is a bad idea. We have No Free Lunch, and we have 60+ years of advice from related disciplines telling us it is fundamentally anti-intellectual. Frankly, we are not interested in a winner (silver bullet black box algorithm), rather in the distribution of winners. Specifically, in using automated statistical tools to collect, maintain, provide, and promote the relationships between quantifiable measures, problems, and algorithms. It's an ambitious goal, as the scope of things to assess in the literature is massive (although finite)! The notion was born out of the general observations of the difficulty of reproducing results and the lack of consistency in methods and results. For example, the state is so bad in CI that one would be considered crazy these days to refer to the results from another paper (trust is very low).
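To make 'distribution of winners' concrete, here is a minimal sketch that counts how often each algorithm comes out best across a set of problems. The function name `winner_distribution`, the algorithm labels "A" and "B", the problem labels, and every score are made up purely for illustration, and lower scores are assumed to be better.

```python
from collections import Counter
from typing import Dict, Tuple


def winner_distribution(scores: Dict[Tuple[str, str], float]) -> Counter:
    """Count, per algorithm, the problems on which it scored best.

    `scores` maps (problem, algorithm) to a score where lower is better.
    Ties go to whichever algorithm was seen first.
    """
    best: Dict[str, Tuple[str, float]] = {}
    for (problem, algorithm), score in scores.items():
        if problem not in best or score < best[problem][1]:
            best[problem] = (algorithm, score)
    return Counter(algorithm for algorithm, _ in best.values())


# Placeholder scores only; these are not real experimental results.
toy_scores = {
    ("p1", "A"): 0.1, ("p1", "B"): 0.5,
    ("p2", "A"): 0.9, ("p2", "B"): 0.2,
    ("p3", "A"): 0.3, ("p3", "B"): 0.4,
}

distribution = winner_distribution(toy_scores)
```

A real system would replace the simple 'lowest score wins' rule with proper statistical tests over many runs, but the output has the same shape: a tally over algorithms rather than a single champion.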
Many have tried to do similar things. There was statlog (in fact I was inspired by Duch's result listing), there are competitions, and there are awesome libraries of standard measures and algorithms. Two reasons why I think this has not been addressed on a large scale are (1) there's no money in it, and (2) it's hard, real hard.
Regarding funding, I think the data is valuable. First tier, you make the software and results freely available and seek simple advertisement revenue. Popularity is promoted bottom-up through publications and symposiums. Maybe there's grant funding that could be funneled in if the 'right perspective' can be devised for a given project. The competitive advantage of the first tier is weak, with generation and availability based on a mishmash of in house and open source software (academic software is generally public domain). Second tier is about consulting based on the skills and knowledge acquired developing the resource. Specifically, consulting on the application of the technology using in house tools and methods, and most likely custom-built solutions. A hard business to enter, and all the money is in the custom builds. The alternative route is academic, which given the amount of information that could be mined would no doubt provide a steady stream of publications.
Regarding difficulty, I think the best way to address the complexities and scope is decomposition with aggressive iteration. With a good design, algorithms, problems, and measures could be added in dynamically, where all execution is farmed out in small jobs, and all human analysis is performed with RDBMS queries. The important point is that results are permanent, in that once a measure is calculated for a run-algorithm combination, it is available for all future analysis. This addresses the enormous problem (as I see it) of duplicated effort, in particular software engineering by scientists (poor software), and research by software engineers (poor method). Further, a good design also instills longevity and self-maintenance into the system. For example, full automation of measure/algorithm/problem submissions could be achieved with code-reviews, publication evidence, and reputation systems to promote decentralised trust and control in the system.
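The 'results are permanent' idea can be sketched with Python's built-in sqlite3 module standing in for the RDBMS. The table layout, the column names, and the helpers `open_store`, `record`, and `mean_by_pair` are all assumptions made up for this illustration, as are the stored values.

```python
import sqlite3
from typing import List, Tuple


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the result store; one row per measured run."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS results (
               algorithm TEXT NOT NULL,
               problem   TEXT NOT NULL,
               run       INTEGER NOT NULL,
               measure   TEXT NOT NULL,
               value     REAL NOT NULL,
               PRIMARY KEY (algorithm, problem, run, measure)
           )"""
    )
    return conn


def record(conn: sqlite3.Connection, algorithm: str, problem: str,
           run: int, measure: str, value: float) -> bool:
    """Store a measure once; return False if it was already recorded."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO results VALUES (?, ?, ?, ?, ?)",
        (algorithm, problem, run, measure, value),
    )
    conn.commit()
    return cur.rowcount == 1


def mean_by_pair(conn: sqlite3.Connection,
                 measure: str) -> List[Tuple[str, str, float]]:
    """Human analysis as a query: mean of a measure per pairing."""
    return conn.execute(
        """SELECT algorithm, problem, AVG(value)
           FROM results WHERE measure = ?
           GROUP BY algorithm, problem
           ORDER BY algorithm, problem""",
        (measure,),
    ).fetchall()


# Placeholder values only, to show that a repeated job is not stored twice.
store = open_store()
first = record(store, "A", "p1", 1, "best_score", 1.0)
second = record(store, "A", "p1", 2, "best_score", 3.0)
repeat = record(store, "A", "p1", 1, "best_score", 99.0)  # ignored
summary = mean_by_pair(store, "best_score")
```

The primary key is what makes a result permanent here: a small job that re-submits an existing measurement changes nothing, so later analysis always sees the first recorded value.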
Beyond the funding and the difficulty, the contribution of such an effort would not only be extremely fun, it would raise the level of quality and accessibility of knowledge in the field immediately and permanently, potentially (given the uptake) influencing the way contributions are made or perceived.
Friday, March 21, 2008
Perspectives on Intelligent Systems
When I first started my PhD back in 2005 (even my Masters before that), it was not clear to me how the AI I was working on fit into the broader and popular notions of the field. I knew it all fit together somehow, although I lacked the perspective or the inclination to piece it together (a situation of not seeing the forest for the trees). Two things that occurred to me recently, as I outlined my latest understanding of the patchwork of AI related to my work, are the importance of the broader perspective for motivating work, and the apparent lack of papers out there outlining the various perspectives and how they relate. I guess I'm all about abstractions and metaphors so I may have overestimated the topic's importance.
I have tackled this problem a few times in the past, the results of which colour my current understandings. Academically, I was introduced to intelligent systems in three ways: (1) first principles such as logic and reasoning, (2) soft computing such as neural and evolutionary computing, and (3) intelligent agents and multiagent systems. A typical formal undergraduate education I suspect. I like the messy stuff over the logic or game theoretic stuff, so that was the perspective I pursued.
In my own time, I ruthlessly pursued my interests, which exposed me to different perspectives including (1) game programming such as Bots and non-playable characters for first person shooters, (2) data mining, specifically classification and visualisation, and (3) optimisation for programming competitions and related activities. The first and last examples I suspect any game playing programmer type would have investigated.
Early during my project I investigated a perspective that was popular in my research group called Biologically Inspired Computation. Later I investigated a related and more recent perspective called Metaheuristics. As I mentioned, these perspectives, and adaptive systems theory shaped my understanding and therefore the motivations of my own research.
More recently, as I have pulled everything together to paint the picture (write the narrative) of my work for the submission, I have realised that all these 'fields' have the same general aim, although importantly they emphasise different qualities to promote different (usually distinct) perspectives. For example, Soft Computing was all about less rigid systems compared to GOFAI, Machine Learning focused on pattern recognition and learning paradigms, Biologically Inspired approaches motivated abstractions from the natural world, Computational Intelligence unified disparate strategy-based fields, Agents incorporated rational models, Multiagent Systems unified bottom-up Distributed AI, Metaheuristics integrated Operations Research into the strategy focused approaches, and so on.
The acknowledgement that multiple perspectives exist, and are a good thing, should negate petty field-rivalries. The key for me was to assess the structure of the range of popular AI books, and to integrate the definitions of 'sub fields' drawn from seminal papers. Now the beautiful thing is, I believe the same internalisation could be achieved simply from a comprehensive reading of the overviews provided in Wikipedia. The AI Portal for example is cool.
Anyway, I think the perspective a-ha moment is critically important to being able to exploit findings and even whole paradigms across sub-fields. This is a huge problem, where, as soon as the nomenclature or terminology is shifted, novelty is (re)discovered. That's cool for the empire builders, although it sucks if you want to piece together a bigger picture, or more interestingly have an impact.
Thursday, March 20, 2008
Chunking Structures and Brains
I have been thinking about something I read a long time ago: that the key to efficiently absorbing information is to get good at picking out and internalising salient features, and specifically that this is the capability difference between the novice and the master (slow/fast, dumb/smart) in a given field (it was some publicly digestible take on the research. Ring any bells?). These thoughts were provoked by Jeff Moser's post on becoming a grand master software developer via an active chunking methodology. It is a great post, although I am less interested in the notions as an approach toward being an awesome developer, and more interested in the broader notions as they relate to efficient associative systems and data structures. Specifically, Jeff's coverage of hierarchical chunking suggests (to me) procedures toward automated abstraction along the lines of a transition from the sub-symbolic toward the symbolic, meaningful for a specific context (the grand visions of vast hierarchical Artificial Neural Networks or Genetic Programs).
Toward this end, I was thinking about the TED talk by Neuroanatomist Jill Bolte Taylor on her experiences studying her own massive stroke 12 years ago. It was a great talk, although I was fascinated by her initial description of the brain. Specifically, she described the two hemispheres in terms of the analytical left focused on association and anticipation, and the right concerned with focusing and integrating sensory input (she phrased it all much better than I can, watch the video). Anyway, I was interested in the computational properties of such a system, specifically if such assumptions are made in the design of a toy system, how do you get the two parts to cooperate to do useful things? I was also thinking along the lines of how such a system would get started, specifically that associations cannot be made without a knowledge base. The usefulness of the associations (in terms of forward and backward propagation of the associations for memory reuse and anticipation) would be biased by the extent and accuracy of the association-driven knowledge base. Further, could such knowledge be built up using a hierarchical chunking method, starting with sub-symbolic cases until associations start being useful - symbolic in the sense that they or their effects are meaningful in the world?
Anyway, I was mulling on this and came across the notion that children's memory may be more reliable than that of adults because of the lack of prior association bias (they are more specific, and call it Fuzzy Trace Theory). These ideas were familiar to me because of the similar features you see in adaptive systems like the immune system (something I've learned a few things about), where something like the finite capability of novel antibody generation is used up by the specific pathogenic exposures of a host's environment (along the lines of time of exposure bias in the case of Antigenic Sin, and the ongoing biasing of the capability of the repertoire).
Wednesday, March 19, 2008
Publish or Push?
In an effort to prompt one of my fellow colleagues into working more on his research, my supervisor made a comment along the lines of "you may have many great ideas in your thesis, but unless you publish them, only about three people on the planet will know about them". Although the comment was somewhat facetious, this immediately got my cogs turning, as I too have published little from my thesis at this stage, although I have pushed many of my ideas out in the form of small reusable technical reports.
I am a major advocate of getting ideas out at any cost, particularly with regard to personal attribution or first-pass correctness taking a back seat. The thing I think is great about peer-review is that many eyes make the bug pool shallow, and when institutionalised it means humans are forced to sit down and at the very least read your ponderings, and at best pick through your arguments and findings. Unfortunately, from my limited perspective the so-called 'standardisation of quality' promoted by the process is somewhat lacking, specifically with regard to my readings in computer science (without naming names, I read broadly across themes in artificial intelligence). On the one hand I see publication spamming as a mechanism for academic empire building (an axiom of the scholarly method and aggressive extension of 'publish or perish') and a revenue stream for large publishing houses. On the other hand, I suspect the mediocrity of the majority of published work is caused by the sheer volume of annual contributors, the effect this has on expanding the pool of reviewers, the monetary incentives for publishing houses to accommodate the size of the market, and the diverse state of computer science education and promoted research methods.
Anyway, all of that aside, there are plenty of avenues for pushing ideas out beyond 'feeding the machine'. Specifically, there are open access options and pre-print archives that promote dissemination of information, obviously there are technical reports, and more interestingly, there is the increasing trend of academic blogging. I think of this last point in terms of the researchers of yesteryear keeping notebooks and writing letters (often posthumously published). In particular the openness of a website or a blog (public by default) and the searchability of the web make this previously private commentary available on demand.
Two things I have observed with this are that everyone (the new and the old school of academics) hits Google (or equivalent) for a first pass instead of the library or a search on the publishing houses' websites, and that the credibility of results is still assessed by the conventional peer-reviewed publishing houses' standards. This means you can push all your crazy ideas out there and people will find them, although they are (generally) only trusted if you can back them up with some kind of publication. It is generally the case that seminal white papers or technical reports are supplanted by journal papers a number of years later.
Two excellent examples came out of a discussion with a colleague (Dan) regarding this point on credibility. The first was Wikipedia, which is used as a general 'first port of call' for general information, although it is criticised against the old standards for generally being written by anonymous non-experts and therefore having low credibility. The second example was the recent trend of publishing albums online: the first case was Radiohead, with a proven (classical) track record (credible and with an existing fan base), whose album In Rainbows was an online success, and the second was Saul Williams (helped by Reznor, who with NIN later 'pulled a Radiohead'), who had no such credibility and was not a success.
The point is I think that the on demand availability of information will clobber the conventional standards of evaluation, where it will be the amount of information disseminated and amount of reuse (PageRank? or Whuffie?) that will define future credibility of pushed research (scale or scope of reuse defining the discrete usefulness). This is simply an on-demand (dynamic) re-phrasing of the classical source evaluation and citation system in print media. If that is the case, then surely Google is on the right track.
Tuesday, March 18, 2008
Hard Working Genius?
Over beer, a colleague of mine (Dave) made the comment along the lines of "only great contributions to science can come from autistic-smart people". Here, contributions generally refers to innovations (in the context we were talking about science and paradigm shifts) and so-called 'autistic smart' refers to a higher level of intelligence credited to a level of autism, or "autistic genius" in that you must be autistic to be that smart (in the colloquial sense such as specialist intelligence).
Firstly, I have got to point out that autism is no picnic: it characterises a spectrum of structural malformations of the brain, typically during development and genetic in origin. It is easy to assert the premise based on the correlation of general properties, such as obsession, underdeveloped social skills, and intelligence quotient between geniuses and autistic people (autistic savants for the latter property). This correlation is so effortless that people publish books analysing dead geniuses through the lens of autism, for example see Genius Genes: How Asperger Talents Changed the World.
There is also a spectrum of genius, and the kind I have always admired are those famed for scientific contributions. Specifically, when it comes to ‘hard work’ and ‘genius’ I think of Darwin toiling for about 25 years on the principles and evidence that became the Origin. I also think of Newton and the papers that became his Principia Mathematica. These guys were way smart, but they also did a lot of work! I also get images of the hundreds of Nobel laureates who generally are credited with their contributions when they were young and ambitious (read as: bright, well positioned, and energetic enough to follow through on their ambitions).
I generally disagree with the premise, as my philosophy has always been 'hard work pays off', and being a trained software engineer means that my hard work in the short term is directed toward maximising my laziness (potential for less work in the longer term). For example, abstracting and solving generalisations of problems so I can maximise my effort through solution reuse. I’m not advocating that hard work makes you a genius, but rather that it maximises the contribution potential of a person, genius or otherwise. Beyond the technical definition of the term (IQ > 140), you may be credited as a genius for the products you create simply by sticking with (obsessing about) a problem for long enough to see it through.
How to Think
I came across How To Think by Ed Boyden via O'Reilly Radar and it resonated with my attitudes, specifically in the context of working toward my PhD.
Two points I particularly liked were Synthesize new ideas constantly and Document everything obsessively as these are principles I have adopted in recent years (see my recent technical reports as evidence). Even his notion of write up best-practices protocols I have found useful in my report on Debrief of PhD Project Practices, and an upcoming report on Dissertation Writing Best Practices.
Anyway, Ed's post prompted me to create this blog with the basic premise to never read passively. Hopefully I can capture some writings more coherent than my previous blogging efforts in Pensive Pondering, and the CIS Lab Blog.

