07 | November | 2015 | coding algorithms

Terminology and key issues with Machine Learning

November 7, 2015 by Niranjan Tallapalli

These are some of the terms commonly used when describing machine learning algorithms.

Training Example: An example of the form [x, f(x)], where x is the input and f(x) is its known output. Statisticians call it a ‘Sample’. It is also called a ‘Training Instance’.
Target Function: The true function ‘f’ that we are trying to learn.
Target Concept: A boolean target function, where
- instances with f(x) = 1 are called positive instances
- instances with f(x) = 0 are called negative instances
Hypothesis: A candidate function proposed by the learner as an approximation of the target function ‘f’. Every algorithm tries to find a hypothesis that is close to ‘f’.
Hypothesis Space: The space of all hypotheses that a learning program can output. The Version Space is the subset of this space that is consistent with the training examples.
Classifier: A discrete-valued function.
- A classifier is what a learner outputs. A learning program is a program whose output is also a program.
- Once we have the classifier, we use it in place of the learning algorithm to label new inputs.
- The relationship between Learner and Classifier is the same as the one between Program and Output.
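The terms above can be tied together in a small sketch. Everything here is an illustrative assumption rather than part of the original post: the one-dimensional inputs, the threshold hypothesis space, and the names `learn_threshold`, `training` and `classifier` are all hypothetical.

```python
from typing import Callable, List, Tuple

Example = Tuple[float, int]          # a training example [x, f(x)]
Classifier = Callable[[float], int]  # a discrete-valued function


def learn_threshold(examples: List[Example]) -> Classifier:
    """Learner: a program whose output is another program (the classifier).

    Assumed hypothesis space: h_t(x) = 1 if x >= t else 0, with t taken
    from the training inputs. The learner returns the hypothesis that
    makes the fewest mistakes on the training examples.
    """
    candidates = sorted(x for x, _ in examples)

    def training_errors(t: float) -> int:
        return sum((1 if x >= t else 0) != y for x, y in examples)

    best_t = min(candidates, key=training_errors)
    return lambda x: 1 if x >= best_t else 0


# Positive instances have f(x) = 1, negative instances have f(x) = 0.
training = [(1.0, 0), (2.0, 0), (3.0, 1), (4.0, 1)]

# After learning, only the classifier is needed to label new inputs.
classifier = learn_threshold(training)
```

Calling `classifier(3.5)` labels a new input without running the learner again, which is the Learner vs Classifier distinction described above.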

Some of the notations commonly used in Machine Learning related white papers

Some of the key issues with machine learning algorithms

What is a good hypothesis space? Is past data good enough?
Which algorithms fit which spaces? Different spaces need different algorithms.
How can we optimize accuracy on future data points rather than only on the training data? (Failing at this is called the ‘Problem of Overfitting’.)
How do we select features from the training examples? (Having too many features leads to the ‘Curse of Dimensionality’.)
How can we have confidence in the results? How much training data is required to find an accurate hypothesis? (This is a statistics question.)
Are learning problems computationally intractable? (Is the solution scalable?)
Engineering problem: how do we formulate application problems as ML problems?

Note: The Problem of Overfitting and the Curse of Dimensionality show up in most real-world problems. We will look into each of them while studying individual algorithms.

REFERENCES

http://www.cs.waikato.ac.nz/Teaching/COMP317B/Week_1/AlgorithmDesignTechniques.pdf

Filed under Machine Learning

Machine Learning and its overall territory

November 7, 2015 by Niranjan Tallapalli

Machine Learning is automating automation, or getting computers to program themselves.
* Sometimes writing the software by hand is the bottleneck (as in face detection, handwriting-to-ASCII mapping, stock prediction, etc.), so let the data do the work instead.

Every machine learning algorithm has 3 components.

Representation
Evaluation
Optimization

Representation
Just as programmers need a programming language such as Java or Scala to develop a program, machine learning needs a language in which to express what is learned. The common ones are:

Decision Trees: These are very widely used in ML.
Set of Rules: A simple set of rules (like a bunch of if-else conditions).
Instances: One of the easiest, oldest and simplest forms of ML, also known as lazy learning.
Graphical Models: Bayes and Markov nets.
Neural Networks
Support Vector Machines: These are very important in business applications and use fairly sophisticated mathematics.
Model Ensembles: These are the newest ones (e.g. Gradient Boosting Machine).

New representations come along much less often than new ideas in the next phases, ‘Evaluation’ and ‘Optimization’, so choosing one is mostly a one-time effort. Once a representation is chosen, we have chosen a language, and the actual work starts with ‘Evaluation’.
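To make the idea of a representation concrete, here is a minimal sketch of two of the listed options applied to the same toy problem: a set of rules and instance-based (lazy) learning. The data, the cut-off value 3.0 and the function names `rule_set` and `nearest_neighbor` are hypothetical, chosen only for illustration.

```python
from typing import List, Tuple

# Toy training examples of the form (x, f(x)); hypothetical data.
examples: List[Tuple[float, int]] = [(1.0, 0), (2.0, 0), (3.0, 1), (4.0, 1)]


def rule_set(x: float) -> int:
    """Set of Rules representation: the learned knowledge is an
    explicit bunch of if-else conditions (the cut-off is assumed)."""
    if x >= 3.0:
        return 1
    return 0


def nearest_neighbor(stored: List[Tuple[float, int]], x: float) -> int:
    """Instances representation ('lazy' learning): nothing is built in
    advance; the stored example closest to x supplies the label."""
    _, closest_label = min(stored, key=lambda ex: abs(ex[0] - x))
    return closest_label
```

Both functions label 3.6 as positive here, but the knowledge is stored differently: the rule set keeps a compact condition, while the nearest-neighbor version keeps all the training examples and defers the work to prediction time.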

Evaluation
Evaluation explains how to measure how good an ML program is. These are some of the measures we use (consider the spam detection example):

Accuracy: The fraction of all messages classified correctly, i.e. spams marked as spam plus non-spams marked as non-spam, divided by the total number of messages.
Precision: What fraction of our predicted spams are actually spams (a value between 0 and 1).
Recall: What fraction of the actual spams we predicted as spam (a value between 0 and 1).
Squared Error: Square of the difference between the predicted value and the actual value.
Likelihood: How likely is what we are seeing according to our model. Likelihood is good when the learning is not very powerful.
Posterior Probability: A combination of Likelihood and ‘Prior Probability’. Along with the likelihood, it gives weight to our prior beliefs.
Cost/Utility: We should consider the costs of ‘false positives’ and ‘false negatives’ separately, as they can be very different and expensive.
Margin: If we draw a line separating spams from non-spams, the margin is the distance between that line and the nearest spams/non-spams.
Entropy: It is used to measure the degree of uncertainty.
KL Divergence: It measures how much one probability distribution differs from another.
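Several of these measures reduce to a few lines of arithmetic. The sketch below uses the standard textbook definitions on a made-up spam example (1 = spam, 0 = non-spam). The function names and the sample labels are assumptions for illustration, and the precision and recall functions assume at least one predicted spam and one actual spam so that the denominators are non-zero.

```python
import math
from typing import List, Sequence, Tuple


def confusion_counts(actual: Sequence[int], predicted: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (true positives, false positives, false negatives, true negatives)."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == 1 and p == 1 for a, p in pairs)
    fp = sum(a == 0 and p == 1 for a, p in pairs)
    fn = sum(a == 1 and p == 0 for a, p in pairs)
    tn = sum(a == 0 and p == 0 for a, p in pairs)
    return tp, fp, fn, tn


def accuracy(actual: Sequence[int], predicted: Sequence[int]) -> float:
    tp, fp, fn, tn = confusion_counts(actual, predicted)
    return (tp + tn) / (tp + fp + fn + tn)


def precision(actual: Sequence[int], predicted: Sequence[int]) -> float:
    tp, fp, _, _ = confusion_counts(actual, predicted)
    return tp / (tp + fp)  # fraction of predicted spams that are spam


def recall(actual: Sequence[int], predicted: Sequence[int]) -> float:
    tp, _, fn, _ = confusion_counts(actual, predicted)
    return tp / (tp + fn)  # fraction of actual spams that were caught


def squared_error(predicted_value: float, actual_value: float) -> float:
    return (predicted_value - actual_value) ** 2


def entropy(probs: Sequence[float]) -> float:
    """Uncertainty of a distribution, in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """KL(p || q) in bits; assumes q[i] > 0 wherever p[i] > 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical labels: 3 actual spams, 5 actual non-spams.
actual: List[int] = [1, 1, 1, 0, 0, 0, 0, 0]
predicted: List[int] = [1, 1, 0, 1, 0, 0, 0, 0]
```

On this sample, 6 of 8 messages are classified correctly, so accuracy is 0.75, while precision and recall are both 2/3 because one spam was missed and one non-spam was flagged.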

Optimization
The type of optimization to use depends on what type of Representation and Evaluation we are using.

Combinatorial Optimization: We use this if the representation is discrete (e.g. greedy search).
Convex Optimization: We use this if the representation is continuous (e.g. gradient descent).
Constrained Optimization: Continuous optimization subject to constraints (in a way, a combination of the two above; e.g. Linear Programming).
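The difference between the first two kinds of optimization can be sketched on toy problems. Both functions below are simplified illustrations under stated assumptions, not a specific algorithm from any library: `greedy_search` hill-climbs over non-empty bit vectors (a discrete space), and `gradient_descent` minimizes a one-dimensional convex function (a continuous space). The target pattern, learning rate and step count are arbitrary choices.

```python
from typing import Callable, List


def greedy_search(score: Callable[[List[int]], float], start: List[int]) -> List[int]:
    """Combinatorial optimization: repeatedly flip the single bit that
    improves the score the most, and stop when no flip helps."""
    current = list(start)
    while True:
        neighbors = []
        for i in range(len(current)):
            candidate = current.copy()
            candidate[i] = 1 - candidate[i]
            neighbors.append(candidate)
        best = max(neighbors, key=score)
        if score(best) <= score(current):
            return current
        current = best


def gradient_descent(grad: Callable[[float], float], x0: float,
                     learning_rate: float = 0.1, steps: int = 100) -> float:
    """Continuous optimization: step against the gradient repeatedly."""
    x = x0
    for _ in range(steps):
        x -= learning_rate * grad(x)
    return x


# Discrete toy problem: score is the number of bits that match a
# hypothetical target pattern.
target = [1, 0, 1, 1]
best_bits = greedy_search(lambda bits: sum(b == t for b, t in zip(bits, target)),
                          [0, 0, 0, 0])

# Continuous toy problem: f(x) = (x - 3)^2 is convex, with gradient 2(x - 3).
x_min = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
```

Greedy search can get stuck in a local optimum on harder score functions, whereas gradient descent on a convex function with a suitable learning rate approaches the single global minimum; this toy score has no local optima, so both reach the best answer here.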

This gives the complete picture of Machine Learning, and we will dive into each ML algorithm going forward.

Filed under Machine Learning Tagged with Machine Learning
