
Bayesian Algorithm Implementation in the Weka Tool

Procedure:
Step 1: We begin the experiment by loading the data (employee.arff) into Weka.
Step 2: Next, we select the “Classify” tab and click the “Choose” button to select the “NaiveBayes” classifier (listed under “bayes”).
Step 3: Now we specify the various parameters. These can be set by clicking in the text box to the right of the “Choose” button. In this example, we accept the default values. Because Naive Bayes does not build a tree, there are no pruning options to set.
Step 4: Under “Test options” in the main panel, we select 10-fold cross-validation as our evaluation approach. Since we don’t have a separate evaluation data set, this is necessary to get a reasonable idea of the accuracy of the generated model.
Step 5: We now click “Start” to generate the model. The text version of the model, as well as the evaluation statistics, will appear in the right panel when model construction is complete.
Step 6: Note that the classification accuracy of the model is about 91% (10 of 11 instances classified correctly). This indicates that there may still be more work to do, either in preprocessing or in selecting the correct parameters for the classification.
Step 7: For tree-based classifiers, Weka also lets us view a graphical version of the classification tree. This can be done by right-clicking the last entry in the result list and selecting “Visualize tree” from the pop-up menu.
Step 8: We will use our model to classify new instances.

Step 9: In the main panel, under “Test options”, click the “Supplied test set” radio button and then click the “Set” button. This opens a pop-up window that allows you to open the file containing the test instances.
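
To make the 10-fold cross-validation step more concrete, here is a minimal Python sketch of how 11 instances could be divided into 10 folds, with each fold held out once for testing. The function names are hypothetical, and the sketch uses plain contiguous folds. Weka additionally shuffles and stratifies the data by class, which is not reproduced here.

```python
def k_fold_indices(n_instances, k):
    """Split indices 0..n_instances-1 into k folds whose sizes differ by at most one."""
    base, extra = divmod(n_instances, k)
    folds, start = [], 0
    for i in range(k):
        # The first `extra` folds receive one additional instance.
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def train_test_splits(n_instances, k):
    """Yield (train, test) index lists, holding out each fold exactly once."""
    for test in k_fold_indices(n_instances, k):
        held_out = set(test)
        train = [i for i in range(n_instances) if i not in held_out]
        yield train, test


# 11 instances and 10 folds, as in the run reported in this post.
splits = list(train_test_splits(11, 10))
fold_sizes = [len(test) for _, test in splits]
print(fold_sizes)  # one fold of 2 instances, nine folds of 1 instance
```

With so few instances, most test folds contain a single instance, so each fold's accuracy is either 0% or 100%. Only the total over all folds is informative.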


=== Run information ===


Scheme:       weka.classifiers.bayes.NaiveBayes
Relation:     employee
Instances:    11
Attributes:   3
              age
              salary
              performance
Test mode:    10-fold cross-validation


=== Classifier model (full training set) ===
Naive Bayes Classifier

                Class
Attribute        good    avg   poor
               (0.29) (0.36) (0.36)
====================================
age
  25               1.0    1.0    2.0
  27               1.0    1.0    3.0
  28               1.0    1.0    2.0
  29               1.0    3.0    1.0
  30               1.0    3.0    1.0
  35               2.0    1.0    1.0
  48               3.0    1.0    1.0
  [total]         10.0   11.0   11.0

salary
  10k              1.0    1.0    2.0
  15k              1.0    1.0    2.0
  17k              1.0    1.0    3.0
  20k              1.0    3.0    1.0
  25k              1.0    3.0    1.0
  30k              1.0    1.0    1.0
  35k              2.0    1.0    1.0
  32k              3.0    1.0    1.0
  [total]         11.0   12.0   12.0
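
The numbers in the model above appear consistent with Laplace (add-one) smoothing: every count starts at 1.0, and the class priors (0.29, 0.36, 0.36) match (3+1)/14, (4+1)/14 and (4+1)/14 for class sizes of 3, 4 and 4. The Python sketch below reproduces the priors and then scores one instance. The instance (age 48, salary 32k) and the function names are chosen here purely for illustration, and the class sizes are taken from the row totals of the confusion matrix in this output.

```python
# Class sizes, taken from the confusion-matrix row totals (3 + 4 + 4 = 11).
class_counts = {"good": 3, "avg": 4, "poor": 4}


def laplace_priors(counts):
    """Add-one smoothed class priors: (n_c + 1) / (N + number_of_classes)."""
    total = sum(counts.values()) + len(counts)
    return {label: (n + 1) / total for label, n in counts.items()}


priors = laplace_priors(class_counts)

# Smoothed counts copied from the model table for the illustrative instance.
age_48 = {"good": 3.0, "avg": 1.0, "poor": 1.0}
age_total = {"good": 10.0, "avg": 11.0, "poor": 11.0}
salary_32k = {"good": 3.0, "avg": 1.0, "poor": 1.0}
salary_total = {"good": 11.0, "avg": 12.0, "poor": 12.0}


def posterior(priors):
    """Naive Bayes: multiply the prior by each attribute likelihood, then normalise."""
    scores = {
        label: priors[label]
        * (age_48[label] / age_total[label])
        * (salary_32k[label] / salary_total[label])
        for label in priors
    }
    norm = sum(scores.values())
    return {label: score / norm for label, score in scores.items()}


post = posterior(priors)
print({label: round(p, 2) for label, p in priors.items()})
print({label: round(p, 3) for label, p in post.items()})
```

Under these assumptions the instance is assigned to "good" with a posterior of roughly 0.81, since both attribute values were seen only in that class.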

Time taken to build model: 0 seconds

=== Stratified cross-validation ===
=== Summary ===

Correctly Classified Instances          10               90.9091 %
Incorrectly Classified Instances         1                9.0909 %
Kappa statistic                          0.8625
Mean absolute error                      0.2899
Root mean squared error                  0.3171
Relative absolute error                 61.3111 %
Root relative squared error             63.0158 %
Coverage of cases (0.95 level)         100      %
Mean rel. region size (0.95 level)     100      %
Total Number of Instances               11    
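
The accuracy and Kappa statistic in this summary can be checked by hand from the confusion matrix reported at the end of the output. The Python sketch below assumes the standard Cohen's kappa definition (observed agreement corrected for the agreement expected by chance), and the function name is hypothetical.

```python
# Confusion matrix from the Weka output: rows are actual classes,
# columns are predicted classes, in the order good, avg, poor.
confusion = [
    [3, 0, 0],
    [0, 4, 0],
    [0, 1, 3],
]


def accuracy_and_kappa(matrix):
    """Return (accuracy, Cohen's kappa) for a square confusion matrix."""
    n = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(len(matrix))) / n
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Chance agreement: sum over classes of P(actual = c) * P(predicted = c).
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


accuracy, kappa = accuracy_and_kappa(confusion)
print(f"accuracy = {accuracy:.4f}, kappa = {kappa:.4f}")
```

Both values agree with the summary: 10 of 11 correct gives 90.9091 %, and the kappa works out to 0.8625.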




=== Detailed Accuracy By Class ===

                 TP Rate  FP Rate  Precision  Recall   F-Measure  MCC      ROC Area  PRC Area  Class
                 1.000    0.000    1.000      1.000    1.000      1.000    1.000     1.000     good
                 1.000    0.143    0.800      1.000    0.889      0.828    1.000     1.000     avg
                 0.750    0.000    1.000      0.750    0.857      0.810    1.000     1.000     poor
Weighted Avg.    0.909    0.052    0.927      0.909    0.908      0.868    1.000     1.000    

=== Confusion Matrix ===

 a b c   <-- classified as
 3 0 0 | a = good
 0 4 0 | b = avg
 0 1 3 | c = poor
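
The per-class figures in the "Detailed Accuracy By Class" table follow directly from this confusion matrix. As a rough check, the Python sketch below recomputes precision, recall, F-measure and FP rate for each class using the usual one-vs-rest definitions. The function name is hypothetical.

```python
labels = ["good", "avg", "poor"]
# Rows are actual classes, columns are predicted classes.
confusion = [
    [3, 0, 0],
    [0, 4, 0],
    [0, 1, 3],
]


def per_class_metrics(matrix, labels):
    """Compute one-vs-rest precision, recall, F-measure and FP rate per class."""
    n = sum(sum(row) for row in matrix)
    metrics = {}
    for k, label in enumerate(labels):
        tp = matrix[k][k]
        fn = sum(matrix[k]) - tp                  # actual k, predicted otherwise
        fp = sum(row[k] for row in matrix) - tp   # predicted k, actually otherwise
        tn = n - tp - fn - fp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f_measure = (
            2 * precision * recall / (precision + recall)
            if precision + recall
            else 0.0
        )
        fp_rate = fp / (fp + tn) if fp + tn else 0.0
        metrics[label] = {
            "precision": precision,
            "recall": recall,
            "f_measure": f_measure,
            "fp_rate": fp_rate,
        }
    return metrics


metrics = per_class_metrics(confusion, labels)
for label in labels:
    print(label, {name: round(value, 3) for name, value in metrics[label].items()})
```

The single error is one "poor" instance classified as "avg", which is why "avg" loses precision (0.800) and "poor" loses recall (0.750) while "good" stays at 1.000 on both.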

