Tuesday, May 17, 2011

House Price Regression: Vermont South, Melbourne

While looking for a house I maintained statistics on the main suburbs we were visiting, and more specifically, on the houses we looked at.

Each house we visited had about four bedrooms and generally the same kinds of attributes - attributes we thought we wanted in a house. For each house we visited, I recorded the address, size of land in square meters, date of sale, sale type (auction, private), asking price, sale price, and other assorted details. An additional contrived measure was the driving distance to main shops reported by Google Maps, in kilometers.

I also supplemented the dataset with additional matching houses in the area when data was available. Prices were sometimes available from the auction results, although in other cases I had to call to find out and/or scour the web.

Rather than let this information go to waste, I thought I would share some of the collected data. This post provides data I collected for the Melbourne suburb of Vermont South.

The following graph simply shows sale price by date, quite boring.

The following graph shows the sale price by land size in square meters.

The following graph shows the sale price by distance to a specific set of shops in kilometers.

I found the data generally useful for plugging in new places and using simple linear regression to help answer questions about expected price at auction or private sale.
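
As a rough illustration of the kind of calculation involved, here is a minimal sketch of a simple (one-variable) least-squares regression of sale price on land size. The function names and the numbers are made up for the example and are not the data I collected; the same fit could be done in a spreadsheet.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of co-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Expected y for a new x under the fitted line."""
    return intercept + slope * x


# Hypothetical values for illustration only, NOT the Vermont South data.
land_sizes_m2 = [600, 650, 700, 750, 800]
sale_prices = [700_000, 725_000, 750_000, 775_000, 800_000]

slope, intercept = fit_line(land_sizes_m2, sale_prices)

# "Plug in" a new place: expected price for a 720 square meter block.
expected_price = predict(slope, intercept, 720)
print(f"slope: {slope:.0f} dollars per square meter")
print(f"expected price: {expected_price:.0f}")
```

Swapping land size for distance to the shops (or sale date as a number) gives the other two views from the graphs. With only a handful of sales per suburb, the result is a sanity check on an asking price rather than a forecast.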

Some of this data may be available for purchase from various retail data providers, but I found collecting and entering the data myself made it a lot more personal and gave me some additional focus when inspecting properties and talking to agents about trends.

Monday, May 16, 2011

So, we bought a house

So we finally bought a house. We've been looking on and off for about 12 months although things got serious about 3 months ago.

We first looked at the place we bought last Saturday, and walking in the door I knew it was a strong contender. We looked at three other places that day, and they all paled in comparison.

The place had been passed in at auction nearly three months before, and we were told that initially the vendor's expectations were too high. We saw this as a good opportunity to negotiate and try to broker a good deal. The market had been slumped for a few months and the early figures for the quarter had indicated a ~2.5% drop in median house price for the city.

We did another inspection on the following Tuesday and enlisted all of the troops (extended family) to give the place a good once over. We then sat down and signed a formal offer. It was rejected. We upped the offer by $5K and it was accepted, although at the insistence of my wife we made the offer contingent on the outcome of a builder's inspection.

I found a company in the yellow pages and had the inspection done on the last day of the 3-day cooling-off period. The building and pest report was incredibly detailed, providing photos and a room-by-room summary, inside and out. We learned a lot about the types of preventative maintenance the place will need over the next 5-10 years, and more importantly, we learned that the upstairs balcony had some major structural problems.

The report said that the wood used was popular in the decade that the balcony was built and was known to rot unless properly treated. Rather than expecting the vendor to return the balcony to new condition, we made an offer to split the difference, deducting half of the cost of the repair from our offer.

All of the negotiating occurred on the last day of the cooling-off period, a Friday. I had my wife on one hand, adamant that she didn't want to pay a thing to have the balcony fixed, and the agent on the other hand, threatening to open the property for inspection on the next day. I really liked the place and I was feeling totally strung out (to say the least).

We managed to broker a deal in the end and initialled the final amendment on the Saturday, one week from our first inspection. With previous auctions and negotiations, I tried to remain emotionless: time was on our side and we could wait for a deal. This time, I really liked the place, and it was beginning to dawn on me that our remaining time to find a place (before the baby came) had shrunk to a matter of a few months. We're both happy we finally got there and have high hopes for turning the property into our home.

We learned a lot throughout the process. My analysis of median house prices, suburb selection, crime rates, and even travel time studies months ago were interesting, although in the end they did not directly affect the outcome. Even the detailed suburb house price regressions I was building up were not used, as we ended up buying in a completely different suburb, inspecting the house on a whim.

I was told early on that buying a home is different from buying an investment, and it bit me in the end, because it's emotional. If/when there is a next time, at least our expectations - that it is a long hard emotional roller coaster - will mean we'll be better prepared. Hopefully.

Rather than letting the regressions go to waste, I'll post some regression analysis for a selected suburb soon.

Tuesday, May 3, 2011

Quake AI Programming Book

I intend to write a follow-up book to the Nature-Inspired Clever Algorithms book on Machine Learning. I have a lot going on this year, so I was thinking of postponing it until 2012. If I do decide to go down this road, I was thinking of taking on a different project in 2011 that would be smaller in scope, less taxing, although still interesting and rewarding.

I have been thinking of writing a book about the AI in the Quake series of computer games. I was thinking of either writing a book that analysed the Artificial Intelligence architecture in each game in the series, or one that analysed the AI in the bot modifications. Perhaps both. The book would walk through monster or bot case studies and describe how they fit together, think, and behave. Perhaps with small experiments and demonstrations along the way. The kind of book that would have captured me as a game programming hacker 15 years ago.

In pondering this idea, I thought it prudent to explore other books written on or related to this idea. The following is a list of books that I found:

Quake Series Programming Books

Related Programming Books
These are by no means the cream of the crop of game AI programming, and there are in fact many level design books in there as well.

All of these books are focused on teaching some form of programming or game development using an existing game as a medium. The advantage of the Quake series is that the source code is released under the GPL. The Unreal series and the Half-Life (Source Engine) series are not released as open source, although they do provide access to some aspects of the source under restricted licence for the modding community.

It is clear that there is interest/demand for books on game development based on the Unreal series, which makes a lot of sense given their general success in licencing the technology.

Some concerns about tackling such a project include:
  • Interest: The games in the Quake series are old (10-15 years). The methods may be outdated, they may not be relevant to modern computer games, and it is more than likely that no one will care.
  • Low Barrier: It is more than likely that no one has undertaken such a project because the barrier is so low. One can simply read the code and understand what is happening; no analysis is necessary.
  • Copyright: Although the source code is released under the GPL, the game assets are not. One may have to acquire a licensed copy of the game to do any meaningful development. Additionally, my use of game screenshots may be restricted (fair use!?).
There is some effort required to produce such a work. Getting each project set up may be involved, especially across the three main platforms (Windows, Mac, Linux). The work would be primarily analysis: reading source code, experimenting and communicating what is happening with diagrams and descriptions. This tinker-write cycle is slightly more relaxed than the deep research needed for each algorithm in a machine learning text.

Is there interest in the market? Would you read or skim such a book?
Let me know what you think in a comment or email.

Monday, May 2, 2011

The Little Taxonomist

We are expecting our first child in a bunch of months and I have been thinking about all kinds of science experiments to perform on/with the bub. I had an idea last week for what I think is a cool little web app that allows a parent and their child to catalogue the native species around their home and learn more about their local environment. I am referring to this idea as "Little Taxonomist".

I'll present the idea in the context of some stories:

Story #1
A father and son are curious about the plants and animals in and around their house and neighbourhood. In an effort to learn more, they decide to begin to catalogue the things they see in their backyard. They select a subject (say a flower, tree, or insect), photograph it, and note down a few descriptive phrases. They enter this information into a web application. The web application accepts the image and structured description and makes informed guesses (based on location and time of year and subjects collected by others) as to the exact species of the subject. The web application also suggests interesting subjects that are known or expected to exist in their area (probabilistically, based on entries made by others in the area), and some information about where they might be found, creating a context-sensitive scavenger hunt. Slowly, a subject every few days, more on weekends, over weeks, they build up a catalogue of plants and animals in and around their home. Together they have learned more about the specifics of the suburban flora and fauna.

Story #2
A primary school class have a week or month long assignment that is a scavenger hunt of genera and/or species in the school ground. The class is split into pairs or small groups and allocated a flip camera (or equivalent single button point-and-shoot). They have a list of hints or requirements as well as blank pages, structured to capture basic taxonomic descriptions. Students return to the classroom and use the computer to copy the photos from the camera, drag them into the web application and add their descriptions. Students are awarded badges and points for the breadth and depth of species described, teams are ranked, and some indication of what other groups are finding is provided.

Vision
A web application for children to be completed in groups or with a guardian. The objective is to describe subjects in the local area (home or neighbourhood) and in so doing learn more about the local flora and fauna. The system encourages the collection of species by intelligently guessing, based on brief descriptions and photos, the actual known species. A gamification layer is provided that includes badges, points, leaderboards, and similar extrinsic motivators. Additionally, the system uses the localized information in aggregate to suggest subjects to look for (to "collect"), and a probabilistic expectation that they can be spotted (you have a 90% chance of seeing a fruit bat between 5pm and 8pm by looking up). This probabilistic understanding of what can be seen in the local area would be coupled with the gamification system, highlighting rare finds. The system would provide all data in aggregate (anonymized), allowing kids to explore what others are finding in their neighbourhood and the types of descriptions being used.
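
To make the "probabilistic expectation" a little more concrete, here is a minimal sketch of one way it might be estimated: treat each outing as an observation session with an hour of day and a set of species spotted, and report the fraction of sessions in a time window that included the species. The session structure, the function name, and the sample records are all assumptions made up for illustration; nothing like this has been built.

```python
def sighting_probability(sessions, species, start_hour, end_hour):
    """Fraction of sessions with start_hour <= hour < end_hour that saw the species.

    Returns None when there are no sessions in the window, since there is
    nothing to base an estimate on.
    """
    in_window = [s for s in sessions if start_hour <= s["hour"] < end_hour]
    if not in_window:
        return None
    hits = sum(1 for s in in_window if species in s["species"])
    return hits / len(in_window)


# Hypothetical aggregated sessions for one neighbourhood (24-hour clock).
sessions = [
    {"hour": 17, "species": {"fruit bat", "magpie"}},
    {"hour": 18, "species": {"fruit bat"}},
    {"hour": 19, "species": {"possum"}},
    {"hour": 19, "species": {"fruit bat", "possum"}},
    {"hour": 9, "species": {"magpie"}},
]

# Chance of spotting a fruit bat between 5pm and 8pm, from the sample above.
bat_evening = sighting_probability(sessions, "fruit bat", 17, 20)
print(f"fruit bat, 5pm-8pm: {bat_evening:.0%}")
```

A low value from the same calculation could be what flags a "rare find" for the gamification layer. A real system would also need to account for location, season, and how few sessions exist for a given area.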

The following are some mock screens I hacked together in Google Docs:

Mock: Add Subject

Mock: List of Subjects


Forget kids, I want to use this. There might be a general case for adults with smart phones.

I am not sure yet whether I will build it; I figure it needs a 5-10 year old child to make it fun. It could make money by selling some cheap cameras with a website subscription, or maybe through targeted advertising (kids+science).

A friend pointed me to Project Noah, which is a similar idea, but not the same.

I'm eager to hear what people think. Have you seen anything like this? Would you use it yourself or with a child?

Sunday, May 1, 2011

May Challenge: Touch-type Faster!

We learned touch-typing in high school. I think we did it for two or three years. Nevertheless, I suck at touch-typing, which may be considered embarrassing for a programmer.

I can type without looking at the keyboard, so technically, I touch-type all day long. What I don't do is type the way I was taught was "correct". I think that as a consequence I am starting to get some wrist pain.

Anyway, in the constant battlefield of self-improvement, I thought I would take the month of May as an opportunity to improve my touch-typing and hopefully start typing "correctly" from the end of May onward in day-to-day computer interaction.

The May Challenge is to perform one lesson in touch-typing each day and to measure a standard typing word count each day. Hopefully this word count and/or accuracy will improve by the end of the month and my confidence in correctly touch typing will also improve.

At this stage, I believe I will use the free online service www.typingweb.com because it provides lessons and ad word speed tests, and of course it's free. I just took a test using the "correct" home-row based method and scored: 18 WPM average, 23 WPM gross, and 96% accuracy. It would be interesting to see what my score would be with my unorthodox method - a bad idea, I suspect it may reinforce the bad method.

Again, as with April, I will adopt a penalty-based approach to the challenge. For each day that I miss, I will have to donate $20 to an open source project of choice (not a .NET project as with last month). This is less relaxed (I support open source, generally), because I am concerned that the commitment required may mean that I miss a day or two here and there. We'll see. $20 a day is still a decent disincentive.

I'll be sure to summarize progress at the end of the month.

Image copyright Wouter Verhelst.