Homework for IT642, Spring 2002

Homework 1: (due Jan 31)

Run the following five classifiers on five datasets of your choice. Report a table of the best accuracy for each classification method and dataset. For each dataset, draw a graph showing the effect of the following parameter on accuracy for each classification method. The five datasets may be chosen from the UCI machine learning repository (TAs will show you how to acquire these). One of the five datasets should be from the KDD repository instead of the machine learning repository.
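
As a rough illustration of the reporting step, the sketch below collects the best accuracy per classification method and dataset from a list of experiment records. The record layout (method, dataset, parameter value, accuracy) and the function names best_accuracy_table and format_table are assumptions made for this example, not part of the assignment.

```python
from collections import defaultdict


def best_accuracy_table(runs):
    """Return {method: {dataset: best accuracy}}.

    `runs` is an iterable of (method, dataset, parameter_value, accuracy)
    tuples; this record layout is a hypothetical one chosen for the sketch.
    """
    best = defaultdict(dict)
    for method, dataset, _param, acc in runs:
        prev = best[method].get(dataset)
        # Keep only the highest accuracy seen for this (method, dataset) pair.
        if prev is None or acc > prev:
            best[method][dataset] = acc
    return {m: dict(d) for m, d in best.items()}


def format_table(table, datasets):
    """Render the table as plain text, one row per method."""
    header = "method".ljust(12) + "".join(d.rjust(10) for d in datasets)
    lines = [header]
    for method in sorted(table):
        cells = "".join(
            (f"{table[method][d]:.3f}" if d in table[method] else "-").rjust(10)
            for d in datasets
        )
        lines.append(method.ljust(12) + cells)
    return "\n".join(lines)
```

The same records, grouped by dataset and sorted by parameter value, can also feed the per-dataset graphs.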

Homework 2: (due Feb 27)

Implement an efficient and scalable algorithm for constructing decision tree classifiers.
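
One ingredient such an algorithm typically needs is a fast search for the best split on a numeric attribute. The sketch below is only an illustration under stated assumptions: it uses Gini impurity as the split criterion (the assignment does not name one), and the function names gini and best_numeric_split are hypothetical. It sorts the attribute once and then scans it with running class counts, so each candidate threshold is scored without recounting the records.

```python
from collections import Counter


def gini(counts, total):
    """Gini impurity of a partition described by its class counts."""
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


def best_numeric_split(values, labels):
    """Return (threshold, weighted Gini) of the best binary split.

    Assumption for this sketch: one numeric attribute, Gini as criterion.
    Returns (None, inf) when all attribute values are identical.
    """
    pairs = sorted(zip(values, labels), key=lambda p: p[0])
    n = len(pairs)
    left = Counter()                          # class counts left of the split
    right = Counter(label for _, label in pairs)  # class counts right of it
    best_score, best_threshold = float("inf"), None
    for i in range(n - 1):
        value, label = pairs[i]
        # Move one record from the right partition to the left partition.
        left[label] += 1
        right[label] -= 1
        next_value = pairs[i + 1][0]
        if value == next_value:
            continue  # no valid threshold between equal attribute values
        n_left = i + 1
        n_right = n - n_left
        score = (n_left * gini(left, n_left) + n_right * gini(right, n_right)) / n
        if score < best_score:
            best_score = score
            best_threshold = (value + next_value) / 2
    return best_threshold, best_score
```

A full tree builder would apply this search to every attribute at each node and recurse on the resulting partitions; handling datasets that do not fit in memory is a separate design question this sketch does not address.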