Homework 3
Due: February 22, 2005
You may work with others on this assignment, but you should turn in separate writeups, and you should understand the solutions. Consult the book and your professor for help if you need it.
This assignment must be written up in LaTeX and turned in as a printout of the resulting PostScript or PDF file.
Announcements
- Friday, February 11, 2005
- The assignment has been posted. Please start on it now, and ask questions if something is unclear. This should be a fun assignment!
Assignment
- Reading. You should read chapter 4 thoroughly. Sections 4.1-4.3 are especially important for this assignment.
- Implement LDA. Implement linear discriminant analysis in Matlab. Your book (section 4.3) should be a good resource for this.
- Simple LDA experiments. Using your LDA software, run LDA on this data. The data has 5 classes, and the class label is in the first column. The remaining two columns are the two input features. Show a plot similar to this that demonstrates where the decision boundaries are, along with the training data. I think the easiest way to produce this plot is to classify every point on a dense grid covering the input space into one of the five classes, and then plot all points of the same class in the same color.
- LDA on vowel classification. Use your LDA software to classify the vowel classification dataset, which can be found on the textbook's website. You should read the info file to become familiar with what the data means (see the section labeled "Application to Vowel Recognition," which is halfway through the document). Note that this is a difficult problem! The best error rates on the test data for these types of classifiers are above 50% (see page 85 in your textbook). How does your LDA classifier compare to the error rates given in the book? Describe your experiments in detail, and analyze the results. Use visual elements like graphs, plots, and confusion matrices to explain your results.
- Variance of LDA. Use 10-fold cross-validation on the vowel training set to estimate the variance of the error of LDA. Describe your experiments in detail, and analyze the results.
- Extra credit: linear regression versus LDA. Implement linear regression on an indicator matrix (see section 4.2 in your book) for the vowel classification problem. Compare the classification error rates on the test data using linear regression and LDA. Can you find evidence of the masking problem of linear regression? (Hint: look at the confusion matrix.)
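For the "Implement LDA" item, the linear discriminant from section 4.3 of the book is delta_k(x) = x' inv(Sigma) mu_k - 0.5 mu_k' inv(Sigma) mu_k + log(pi_k), where Sigma is the covariance pooled over all classes, mu_k is the class mean, and pi_k is the class prior. The following is only a minimal sketch of that rule, written in Python for illustration (your submission still needs to be in Matlab). The function names `solve`, `fit_lda`, and `predict` and the toy data are my own assumptions, not part of any provided software.

```python
import math

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][j] * z[j] for j in range(r + 1, n))
        z[r] = (M[r][n] - tail) / M[r][r]
    return z

def fit_lda(X, y):
    """Return {class: (w, b)} so that delta_k(x) = w . x + b."""
    classes = sorted(set(y))
    n, d = len(X), len(X[0])
    stats = {}
    pooled = [[0.0] * d for _ in range(d)]
    for k in classes:
        rows = [x for x, label in zip(X, y) if label == k]
        mu = [sum(col) / len(rows) for col in zip(*rows)]
        stats[k] = (len(rows) / n, mu)
        # Accumulate the pooled within-class covariance (divisor n - K).
        for x in rows:
            diff = [xi - mi for xi, mi in zip(x, mu)]
            for i in range(d):
                for j in range(d):
                    pooled[i][j] += diff[i] * diff[j] / (n - len(classes))
    model = {}
    for k, (prior, mu) in stats.items():
        w = solve(pooled, mu)  # w = inv(Sigma) * mu_k
        b = -0.5 * sum(m * wi for m, wi in zip(mu, w)) + math.log(prior)
        model[k] = (w, b)
    return model

def predict(model, x):
    """Pick the class with the largest discriminant value."""
    def score(k):
        w, b = model[k]
        return sum(xi * wi for xi, wi in zip(x, w)) + b
    return max(model, key=score)

# Made-up toy data: two well-separated classes with two features each.
X = [(0, 0), (1, 0), (0, 1), (1, 1), (5, 5), (6, 5), (5, 6), (6, 6)]
y = [0, 0, 0, 0, 1, 1, 1, 1]
model = fit_lda(X, y)
```

Because the covariance is shared, each discriminant is linear in x, which is why the boundaries between classes come out as straight lines.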
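For the "Simple LDA experiments" item, the grid idea can be sketched as follows. This is an illustration in Python under my own assumptions: `classify_grid` is a hypothetical helper, and `toy_classify` is a made-up stand-in for your trained LDA classifier. Each group of points it returns would then be plotted in its own color.

```python
def classify_grid(classify, x_min, x_max, y_min, y_max, steps=50):
    """Classify every point of a steps-by-steps grid and group points by label."""
    regions = {}
    for i in range(steps):
        for j in range(steps):
            x = x_min + (x_max - x_min) * i / (steps - 1)
            y = y_min + (y_max - y_min) * j / (steps - 1)
            regions.setdefault(classify((x, y)), []).append((x, y))
    return regions

def toy_classify(point):
    """Hypothetical stand-in classifier with a single linear boundary."""
    return 1 if point[0] + point[1] > 0 else 0

regions = classify_grid(toy_classify, -1.0, 1.0, -1.0, 1.0, steps=5)
```

A finer grid (larger `steps`) gives sharper-looking boundaries at the cost of more classifier calls.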
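For the vowel classification item, a confusion matrix and the error rate derived from it can be computed along these lines. This is a small sketch in Python, with function names and example labels that are my own assumptions. Rows are true classes and columns are predicted classes.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Count how often each true class (row) is predicted as each class (column)."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted_labels):
        matrix[index[t]][index[p]] += 1
    return matrix

def error_rate(matrix):
    """Fraction of examples that fall off the diagonal."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return 1 - correct / total

# Made-up labels, for illustration only.
true_labels = [1, 1, 2, 2, 3, 3]
predicted_labels = [1, 2, 2, 2, 3, 1]
cm = confusion_matrix(true_labels, predicted_labels, classes=[1, 2, 3])
err = error_rate(cm)
```

Large off-diagonal entries show which pairs of classes the classifier confuses most often.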
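For the "Variance of LDA" item, one possible reading is to compute the error on each of the 10 held-out folds and report the sample variance of those 10 numbers. The sketch below, in Python, assumes that reading. `k_fold_indices` and `cross_validate` are hypothetical helpers, and the `train_fn` and `predict_fn` arguments stand in for your own LDA training and prediction routines.

```python
import random
import statistics

def k_fold_indices(n, k=10, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_validate(X, y, train_fn, predict_fn, k=10, seed=0):
    """Return the mean and sample variance of the per-fold error rates."""
    errors = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train_X = [x for i, x in enumerate(X) if i not in held]
        train_y = [t for i, t in enumerate(y) if i not in held]
        model = train_fn(train_X, train_y)
        wrong = sum(predict_fn(model, X[i]) != y[i] for i in held_out)
        errors.append(wrong / len(held_out))
    return statistics.mean(errors), statistics.variance(errors)

# Made-up data and a trivial stand-in "classifier" that is always right.
X = list(range(30))
y = [x % 3 for x in X]
mean_err, var_err = cross_validate(
    X, y,
    train_fn=lambda train_X, train_y: None,
    predict_fn=lambda model, x: x % 3,
)
```

Fixing the seed makes the fold assignment repeatable, which helps when you describe your experiments in the writeup.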
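For the extra credit item, the masking problem is easiest to see with one input feature and three classes lined up along it. The sketch below, in Python, fits a separate least-squares line to each column of the indicator matrix and predicts the class with the largest fitted value. The data and the function names are made up for illustration; this is not the vowel data.

```python
def fit_line(xs, ts):
    """Ordinary least squares for t = intercept + slope * x."""
    n = len(xs)
    mx, mt = sum(xs) / n, sum(ts) / n
    sxy = sum((x - mx) * (t - mt) for x, t in zip(xs, ts))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return mt - slope * mx, slope

def fit_indicator_regression(xs, labels, classes):
    """Regress each class's 0/1 indicator column on x."""
    return {
        k: fit_line(xs, [1.0 if label == k else 0.0 for label in labels])
        for k in classes
    }

def predict(model, x):
    """Predict the class whose fitted indicator value is largest."""
    return max(model, key=lambda k: model[k][0] + model[k][1] * x)

# Three made-up classes in a row along one feature; class 1 sits in the middle.
xs = [-4.5, -4.0, -3.5, -3.0, -0.75, -0.25, 0.25, 0.75, 3.0, 3.5, 4.0, 4.5]
labels = [0] * 4 + [1] * 4 + [2] * 4
model = fit_indicator_regression(xs, labels, classes=[0, 1, 2])
predictions = [predict(model, x) for x in xs]
```

On this toy data the fitted line for the middle class is essentially flat, so one of the outer classes always has the larger fitted value and class 1 is never predicted. In a confusion matrix this shows up as a column of zeros for the masked class.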
Copyright © 2005 Greg Hamerly.
Computer Science Department
Baylor University