Terminology and key issues with Machine Learning

These are some of the terms which are used in machine learning algorithms.

  • Training Example: An example of the form [x, f(x)]. Statisticians call it ‘Sample’. It is also called ‘Training Instance’.
  • Target Function: This is the true function ‘f’, that we are trying to learn.
  • Target Concept: It is a boolean function where
    • f(x) = 1 are called positive instances
    • f(x) = 0 are called negative instances
  • Hypothesis: In every algorithm we will try to come up with some hypothesis space which is close to target function ‘f’.
  • Hypothesis Space: The space of all hypothesis that can be output by a program. Version Space is a subset of this space.
  • Classifier: It’s a discrete valued function.
    • Classifier is what a learner outputs. A learning program is a program where output is also a program.
    • Once we have the classifier we replace the learning algorithm with the classifier.
    • Program vs Output and Learner vs Classifier are same

Some of the notations commonly used in Machine Learning related white papers

ml_reps

Some of the key issues with machine learning algos

  • What is a good hypothesis space? Is past data good enough?
  • What algorithms fit to what spaces? Different spaces need different algorithms
  • How can we optimize the accuracy of future data points? (this is also called as ‘Problem of Overfitting‘)
  • How to select the features from the training examples? (this is also called ‘Curse of Dimentionality‘)
  • How can we have confidence in results? How much training data is required to find accurate hypothesis (it’s a statistics question)
  • Are learning problems computationally intractable? (Is the solution scalable)
  • Engineering problem? (how to formulate application problems into ML problems)

Note: Problem of Overfitting and Curse of Dimentionality will be there with most of the real time problems, we will look into each of these problems while studying individual algorithms.

REFERENCES

http://www.cs.waikato.ac.nz/Teaching/COMP317B/Week_1/AlgorithmDesignTechniques.pdf

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Mawazo

Mostly technology with occasional sprinkling of other random thoughts

amintabar

Amir Amintabar's personal page

101 Books

Reading my way through Time Magazine's 100 Greatest Novels since 1923 (plus Ulysses)

Seek, Plunnge and more...

My words, my world...

ARRM Foundation

Do not wait for leaders; do it alone, person to person - Mother Teresa

Executive Management

An unexamined life is not worth living – Socrates

Diabolical or Smart

Nitwit, Blubber, Oddment, Tweak !!

javaproffesionals

A topnotch WordPress.com site

thehandwritinganalyst

Just another WordPress.com site

coding algorithms

"An approximate answer to the right problem is worth a good deal more than an exact answer to an approximate problem." -- John Tukey