ID3 Implementation of Decision Trees

There are different implementations given for Decision Trees. Major ones are

  • ID3: Iternative Dichotomizer was the very first implementation of Decision Tree given by Ross Quinlan.
  • C4.5: Advanced version of ID3 algorithm addressing the issues in ID3. Some of issues it addressed were
    • Accepts continuous features (along with discrete in ID3)
    • Normalized Information Gain
    • Missing Value Imputation: Handling missing values
    • Pruning: Overfitting problem
    • Applying different weights to the features
  • CART: Classification and Regression Trees are similar to C4.5 in many ways except these trees are built on numeric splitting criteria.

We will see the basic implementation of ID3 now and we will extend it from here going further.

Features, training and testing data sets are taken from

FEATURES (sample)

Total: 22
// A listing of the features and their possible values. All features are binary.
age_lte_30 t f
age_betw_31_50 t f
age_gt_50 t f
sex m f


Total: 102
trainPatient1 survived   t f f f n y y y y n y y y y y t t t f f f n
trainPatient2 survived   f t f m n y n y y n y y y y y t f t f f t n
trainPatient3 survived   f f t m y y n y y y y y y y y t t t f f f n
trainPatient4 survived   f t f m y n y y y y y y y y y t t t f f f n


Total: 52
testPatient1 survived    f t f m y y y y y y y y y y y t t f t f f n
testPatient2 survived    f f t m n n y y y n n y y y y t f f f t t y
testPatient3 survived    t f f f y y n n y y n y y y y t f f f t t n
testPatient4 died        f t f m y y n y y y n n y n n t f f t f t y
testPatient5 survived    f f t m n y n n n y n n n y n f f f f t f y

Implementation of the ID3 ALGORITHM

  • Load the data: Load the features, training and testing data.
  • Build the Decision Tree
  • Test the Decision Tree
  • Print the constructed Decision Tree

Build the Decision Tree

 - Check for the base cases
 - For each attribute, find the Information Gain and choose the attribute 
     with Maximum Info Gain or Minimum Entropy (aka Best Attribute)
 - Create a Decision Node on the best attribute chosen in above step.
 - Recursively run above 3 steps for remaining attributes.

public DTNode buildDecisionTree(Features features, Instances trainExamples, 
			String[] outputLabels) {

		if(trainExamples.getExamples().size() ==0 || features.getFeatures().size() ==0) {
			return null;

		Map<String, Integer> outputLabelCt = getOutputLabelDistribution(features.getFeatures(), trainExamples.getExamples(), outputLabels);
		int negCt = outputLabelCt.get(outputLabels[0]);
		int posCt = outputLabelCt.get(outputLabels[1]);
		String debug = posCt + "+ve/" + negCt + "-ve";
		String outputLabel = outputLabels[1];
		if(posCt > negCt) {
			outputLabel = outputLabels[0];

		if(posCt == 0) {
			DTNode node = new DTNode();
			return node;

		if(negCt == 0) {
			DTNode node = new DTNode();
			return node;
		Feature bestFeature = getBestFeature(features.getFeatures(), trainExamples.getExamples(), outputLabels, outputLabelCt);
		Features remainingFtrs = getRemainingFtrs(features.getFeatures(), bestFeature);
		//System.out.println("remainingFtrs size "+remainingFtrs.getFeatures().size());

		// build a node based on the best feature
		DTNode root = new DTNode();

		// traverse left side and construct
		//System.out.println("traversing left");
		Instances filteredExOnBestFtr = getFilteredExamplesOnBestFtr(trainExamples.getExamples(), bestFeature, bestFeature.getfOptions()[0]);
		root.setLeftNode(buildDecisionTree(remainingFtrs, filteredExOnBestFtr, outputLabels));

		// traverse right side and construct
		//System.out.println("traversing right");
		filteredExOnBestFtr = getFilteredExamplesOnBestFtr(trainExamples.getExamples(), bestFeature, bestFeature.getfOptions()[1]);
		root.setRightNode(buildDecisionTree(remainingFtrs, filteredExOnBestFtr, outputLabels));

		// return root of the tree
		return root;

Test the Decision Tree: Tests the decision tree which is built in previous step and prints the accuracy (In our implementation accuracy is around 92%)

Successfully predicted testPatients: 48
Failed example is testPatient5;survived;[f, f, t, m, n, y, n, n, n, y, n, n, n, y, n, f, f, f, f, t, f, y]
Failed example is testPatient19;died;[f, t, f, m, n, n, n, n, y, y, y, y, n, y, y, t, t, f, t, f, t, y]
Failed example is testPatient24;survived;[f, f, t, m, y, y, n, y, y, y, n, n, n, y, n, t, f, t, f, f, t, y]
Failed example is testPatient34;survived;[f, t, f, m, n, y, n, n, n, y, y, y, y, y, y, t, f, f, f, t, f, y]
Number of failed examples: 4, out of total test examples: 52
Accuracy achieved is 92.3076923076923%

OUTPUT Decision Tree

│       ┌── survived#ONLYNEG#1+ve/0-ve
│   ┌── histology#Normal#1+ve/6-ve
│   │   └── died#ONLYPOS#0+ve/6-ve
└── varices#Normal#88+ve/14-ve
    │               ┌── survived#ONLYNEG#4+ve/0-ve
    │           ┌── sex#Normal#9+ve/5-ve
    │           │   │   ┌── died#ONLYPOS#0+ve/4-ve
    │           │   └── sgot_lte_59#Normal#5+ve/5-ve
    │           │       │       ┌── died#ONLYPOS#0+ve/1-ve
    │           │       │   ┌── age_betw_31_50#Normal#1+ve/1-ve
    │           │       │   │   └── survived#ONLYNEG#1+ve/0-ve
    │           │       └── anorexia#Normal#5+ve/1-ve
    │           │           └── survived#ONLYNEG#4+ve/0-ve
    │       ┌── alk_phosphate_lte_127#Normal#35+ve/8-ve
    │       │   │       ┌── survived#ONLYNEG#3+ve/0-ve
    │       │   │   ┌── antivirals#Normal#5+ve/3-ve
    │       │   │   │   │       ┌── survived#ONLYNEG#2+ve/0-ve
    │       │   │   │   │   ┌── sgot_gt_156#Normal#2+ve/1-ve
    │       │   │   │   │   │   └── died#ONLYPOS#0+ve/1-ve
    │       │   │   │   └── liver_big#Normal#2+ve/3-ve
    │       │   │   │       └── died#ONLYPOS#0+ve/2-ve
    │       │   └── spiders#Normal#26+ve/3-ve
    │       │       └── survived#ONLYNEG#21+ve/0-ve
    │   ┌── age_lte_30#Normal#48+ve/8-ve
    │   │   └── survived#ONLYNEG#13+ve/0-ve
    └── fatigue#Normal#87+ve/8-ve
        └── survived#ONLYNEG#39+ve/0-ve

Complete source code can be downloaded at


3 Responses to ID3 Implementation of Decision Trees

  1. audio hartama says:

    how to combine the input dataset file?, and what are ve/ and ve- mean?

  2. audiohart says:

    how to combine the input dataset file without splitting them into traing and testing?, and what are ve/ and ve- mean?

  3. dioooooo says:

    update for C4.5 please

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s


Mostly technology with occasional sprinkling of other random thoughts


Amir Amintabar's personal page

101 Books

Reading my way through Time Magazine's 100 Greatest Novels since 1923 (plus Ulysses)

Seek, Plunnge and more...

My words, my world...

ARRM Foundation

Do not wait for leaders; do it alone, person to person - Mother Teresa

Executive Management

An unexamined life is not worth living – Socrates

Diabolical or Smart

Nitwit, Blubber, Oddment, Tweak !!


A topnotch site


Just another site

coding algorithms

"An approximate answer to the right problem is worth a good deal more than an exact answer to an approximate problem." -- John Tukey