Introduction
I had the priviledge of presenting at CodeStock. It was absolutely great. I was surprised and humbled at the reception of my session regarding Machine Learning. As such, I wanted to do a series of posts regarding what it is I wish to accomplish.
Machine Learning is Hard
Because the stuff is so intriguing, I have spent the last number of years trying to figure the stuff out! I would certainly not classify myself as an expert (by any means), but I think I have a general idea of the field.
Machine learning can be seperated into roughly 3 classifications:
- Supervised Learning - learning from labeled examples
- Unsupervised Learning - learning from unlabeled examples
- Other - Hybrid of the above two, structured prediction, reinforcement learning, etc.
The (perceived) source of difficulty in these three areas is the enormous amount of math, probability, and statistics that is needed in order to begin solving problems via this approach. I remember sitting with my adviser where we both wondered alound: "Why aren't these things used more?" I thought I had an idea that might work. Before we get to that, I though I would share the general idea of machine learning.
A better way
Our standard approach to solving programming problems is to sit down and come up with a series of finite steps that lead to a solution (an algorithm so-to-speak). The idea behind machine learning is to simply provide the machine with data and let it decide how to solve the problem. Although this sounds mildly creepy, the principles behind this type of approach are not at all foreign to us.
In summary, (as stated by my adviser): we are "replacing humans writing code with humans supplying data" and letting the machine decide the best approach.
Skynet (or generalization)
If you are currently fearing for your life because you think we are sowing the seeds of our future destruction, fear not. The main impetus behind machine learning is generalization. In other words, can we find patterns that can be exploited in the data to produce the desired results? Absolutely! Will those patterns always have the ability to produce the desired results? Unfortunately not. It turns out the machine will only be as smart as you!. The machine cannot generalize useful information from unimportant data.
Show us something already!
The second source of difficulty is knowing how to use the data we have to generalize properly. This process is often referred to as feature selection. What pieces of data do we use to create the model? How do we convert these features into the corresponding mathematical representations?
This is were our ideas came in. As a developer, we always represent our data in classes. Always have and always will (well... see F#). Is there an easy way to represent things in classes whereby a machine learning algorithm can automatically make the appropriate conversions to the mathematical representations and also select the appropriate learning algorithm? I think so; take a look:
public class Student
{
[Feature]
public string Name { get; set; }
[Feature]
public Grade Grade { get; set; }
[Feature]
public double GPA { get; set; }
[Feature]
public int Age { get; set; }
[Feature]
public bool Tall { get; set; }
[Feature]
public int Friends { get; set; }
[Learn]
public bool Nice { get; set; }
}
The idea here is that we have a student with 6 features (Name, Grade, GPA, Age, Tall, Friends) and want to learn whether the features can predict whether he or she is Nice (target). In an effort to help the machine learn this, we will provide it with a list of students with all properties filled out, and ask it to predict the niceness of future students. This is the representation I have chosen for automatic feature selection. I would love feedback on this representation. The purpose of this representation is to attempt to reduce the friction in using ml algorithms because of the difficulty of feature selection and feature conversions.
Wrapping it up
The current incarnation of this code can be found ~~at http://machine.codeplex.com/~~ GitHub. There is also a drop with this very example. In future posts I will get into the general ideas behing supervised and unsupervised learning as well as the particular algorithms I have implemented to test this theory. I would love your feedback.
Does it work?
I got an interesting set of emails (only 3 days after my demo) where there were already some truly novel things being done using this approach to machine learning. Needless to say I was very impressed! Perhaps they will let me use their particular implementations as a case study. In short - YES! This stuff totally works! All I have done is create a new way of interacting with tried and proven machine algorithms.
- Does it make sense?
- Did it help you solve a problem?
- Were you looking for something else?