You must develop a classifier that implements the Perceptron learning algorithm. The classifier must be written in Java and must extend Weka’s Classifier class so it can be invoked from within the Weka framework to exercise and evaluate its performance. It will need to implement only a few skeletal methods, as described below.
The classifier will be tested against several test cases, each with a specific data file in Weka’s ARFF format, a number of training epochs, and a learning constant. These values will be furnished as command line parameters when the program is run. All data files will involve decimal real feature values and nominal (i.e., enumerated) binary classifications. They will be drawn from the sample data files that you can see in the data folder in your Weka installation.
You will be provided with the [url removed, login to view] driver program for exercising the classifier and obtaining results. The driver class will instantiate your classifier and pass the command line arguments as Weka options. The driver will also invoke the Weka evaluation methods for assessing performance. Each run of the program will exercise one test case.
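As a rough illustration of handling the three command-line values (data file name, number of epochs, learning constant) before they are passed on as options, the sketch below validates and converts them. The class name RunConfig and its fields are hypothetical and are not part of the furnished driver, which may handle this differently. The Weka option flags the driver uses are not shown here.

```java
/** Hypothetical holder for the three command-line values; not part of the furnished driver. */
final class RunConfig {
    final String dataFile;      // path to the ARFF data file
    final int epochs;           // number of training epochs
    final double learningRate;  // learning constant

    RunConfig(String dataFile, int epochs, double learningRate) {
        this.dataFile = dataFile;
        this.epochs = epochs;
        this.learningRate = learningRate;
    }

    /** Converts the raw arguments, rejecting a wrong count or malformed numbers. */
    static RunConfig parse(String[] args) {
        if (args.length != 3) {
            throw new IllegalArgumentException(
                "Expected: <data file> <epochs> <learning constant>");
        }
        // parseInt and parseDouble throw NumberFormatException on bad input,
        // which is itself a kind of IllegalArgumentException.
        int epochs = Integer.parseInt(args[1]);
        double learningRate = Double.parseDouble(args[2]);
        if (epochs < 0) {
            throw new IllegalArgumentException("Epoch count must not be negative");
        }
        return new RunConfig(args[0], epochs, learningRate);
    }
}
```

Failing early on a malformed argument keeps the error close to its cause, rather than letting it surface later during training.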
You will find helpful information in the “Writing a New Classifier” section in the “Extending Weka” chapter in the Weka manual that is included in your Weka distribution. There is also much material online about Weka and how to use it.
With the above in mind, here are the specific requirements:
1. Your zip file must contain only your Java source files, including the version of [url removed, login to view] that you wish us to use with your classifier, even if you have not modified it. Do NOT include any class files, [url removed, login to view], or any IDE project files.
2. You must use the [url removed, login to view] driver file to exercise your program. The program entry point MUST BE the main method in this class, and it must take the following three command line parameters, in the following order: (a) the data file name (a String), (b) the number of training epochs (an integer), and (c) the learning constant (a decimal real value).
3. Your classifier must be defined in a class called “Perceptron”, which must extend the [url removed, login to view] class and implement the [url removed, login to view] interface, which are in the [url removed, login to view] file that came with your Weka distribution. You will need to add [url removed, login to view] to your IDE project classpath, but do not include it in the zip file. This class must explicitly implement the Perceptron training algorithm.
4. A minimal set of methods that your classifier must implement is: (a) buildClassifier, (b) distributionForInstance, (c) setOptions, and (d) toString. The buildClassifier method will train the classifier on the data set using the Perceptron algorithm, running for the number of epochs and using the learning rate constant specified by the values retrieved by the setOptions method. The distributionForInstance method will simply return a value of one for the predicted classification of an instance and zero for the other class.
5. The buildClassifier method must report intermediate results as shown in the [url removed, login to view] file furnished with this assignment. Specifically, for each training epoch, the classifier must report the epoch number (e.g., “Iteration 0:”) followed by a binary string containing a value of 1 for each data instance that is successfully classified, or a value of 0 if classification is unsuccessful, requiring that the weights be updated.
6. The toString method must report the following data in the format shown in the [url removed, login to view] sample output file: (a) the source file; (b) the number of iterations (epochs); (c) the learning rate used; (d) the total number of times that weight updates were performed during training; and (e) the final weight values.
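The training and reporting behaviour in requirements 4 to 6 can be sketched in plain Java without the Weka classes. This is only a sketch under assumptions the assignment does not state: weights start at zero, a bias weight is stored at index 0, class labels are encoded as 0 and 1, and an instance is assigned class 1 when the weighted sum is at least zero. The class name PerceptronSketch and its methods are hypothetical. An actual submission would place the same logic inside buildClassifier and distributionForInstance and read feature values from Weka instances.

```java
/** Hypothetical stand-alone sketch of Perceptron training; does not use Weka. */
final class PerceptronSketch {
    private final double[] weights; // weights[0] is the bias (assumption)
    private int updateCount = 0;    // total number of weight updates performed

    PerceptronSketch(int numFeatures) {
        weights = new double[numFeatures + 1]; // all zero initially (assumption)
    }

    /** Returns the predicted class index, 0 or 1. */
    int predict(double[] x) {
        double sum = weights[0];
        for (int i = 0; i < x.length; i++) {
            sum += weights[i + 1] * x[i];
        }
        return sum >= 0.0 ? 1 : 0;
    }

    /** Runs one epoch; returns '1' per correct instance and '0' per update. */
    String trainEpoch(double[][] xs, int[] ys, double rate) {
        StringBuilder outcome = new StringBuilder();
        for (int n = 0; n < xs.length; n++) {
            int predicted = predict(xs[n]);
            if (predicted == ys[n]) {
                outcome.append('1');
                continue;
            }
            outcome.append('0');
            // Perceptron rule: move the weights toward the target class.
            double delta = rate * (ys[n] - predicted);
            weights[0] += delta;
            for (int i = 0; i < xs[n].length; i++) {
                weights[i + 1] += delta * xs[n][i];
            }
            updateCount++;
        }
        return outcome.toString();
    }

    /** Trains for a fixed number of epochs, printing one report line per epoch. */
    void train(double[][] xs, int[] ys, int epochs, double rate) {
        for (int e = 0; e < epochs; e++) {
            System.out.println("Iteration " + e + ": " + trainEpoch(xs, ys, rate));
        }
    }

    /** Returns 1.0 for the predicted class and 0.0 for the other class. */
    double[] distribution(double[] x) {
        double[] dist = new double[2];
        dist[predict(x)] = 1.0;
        return dist;
    }

    int getUpdateCount() {
        return updateCount;
    }

    double[] getWeights() {
        return weights.clone();
    }
}
```

Keeping the update counter and the weights as fields means a toString method could report them after training without recomputing anything. The exact output format must still follow the furnished sample file.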