Procedure:
Step 1: Loading the data. Load the dataset into Weka by clicking the
Open file button in the Preprocess tab and selecting the appropriate file.
Step 2: Once the data is loaded, Weka recognizes the attributes and,
while scanning the data, computes some basic statistics on each attribute.
The left panel in the above figure shows the list of recognized attributes,
while the top panel shows the names of the base relation (table) and the
current working relation (which are initially the same).
Step 3: Clicking an attribute in the left panel shows basic statistics
for that attribute. For categorical attributes, the frequency of each
attribute value is shown; for continuous attributes, we obtain the min, max,
mean, standard deviation, etc.
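The per-attribute summaries described in Step 3 can be sketched in plain Python. This is only an illustration of the idea, not Weka's own code: the function names are hypothetical, the income values are taken from the student data set used in this exercise, and the ages are made-up numbers. The sample standard deviation is used here; Weka's exact estimator is not verified.

```python
from collections import Counter
from statistics import mean, stdev


def summarize_nominal(values):
    """Return the frequency of each distinct value of a categorical attribute."""
    return dict(Counter(values))


def summarize_numeric(values):
    """Return min, max, mean and sample standard deviation of a numeric attribute."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
        "stddev": stdev(values),  # sample (n - 1) estimator, an assumption
    }


# Income column of the 14-row student data set.
income = ["high", "high", "high", "medium", "low", "low", "low",
          "medium", "low", "medium", "medium", "medium", "high", "medium"]

# Hypothetical numeric ages, for illustration only.
ages = [25, 32, 47, 51, 29]

if __name__ == "__main__":
    print(summarize_nominal(income))
    print(summarize_numeric(ages))
```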
Step 4: The visualization in the bottom-right panel takes the form of a
cross-tabulation across two attributes.
Note: we can select another attribute using the dropdown list.
Step 5: Selecting or filtering attributes
Removing an attribute: When we need to remove an attribute, we can do so
using the attribute filters in Weka. In the Filter panel, click the Choose
button. This shows a popup window with a list of available filters.
Scroll down the list and select the
“weka.filters.unsupervised.attribute.Remove” filter.
Step 6: a) Next, click the text box immediately to the right of the
Choose button. In the resulting dialog box, enter the index of the attribute
to be filtered out.
b) Make sure that the invert selection option is set to false. Then click
OK. The filter box will now show “Remove -R 7”.
c) Click the Apply button to apply the filter to the data. This removes
the attribute and creates a new working relation.
d) Save the new working relation as an ARFF file (student.arff) by clicking
the Save button on the top panel.
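What the Remove filter does to the data can be sketched as follows. This is a plain-Python illustration rather than Weka's implementation: `remove_attributes` is a hypothetical helper, the header matches the student data set, and index 2 is chosen only for the example. As in the dialog, indices are 1-based, and `invert=True` keeps the listed attributes instead of dropping them.

```python
def remove_attributes(header, rows, indices, invert=False):
    """Drop the columns at the given 1-based indices.

    With invert=True the listed columns are kept and all others are dropped.
    """
    selected = set(indices)
    # Keep column i when its "selected" status equals the invert flag.
    keep = [i for i in range(len(header)) if ((i + 1) in selected) == invert]
    new_header = [header[i] for i in keep]
    new_rows = [[row[i] for i in keep] for row in rows]
    return new_header, new_rows


header = ["age", "income", "student", "credit-rating", "buyspc"]
rows = [
    ["<30", "high", "no", "fair", "no"],
    ["30-40", "high", "no", "fair", "yes"],
]

if __name__ == "__main__":
    # Remove the second attribute (income); index chosen for illustration.
    print(remove_attributes(header, rows, [2]))
```

Setting `invert=True` with the same index would instead produce a relation containing only the income attribute, which is why the procedure asks for the invert option to be false.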
Discretization
Some association rule mining algorithms can only be applied to categorical
data. This requires discretizing numeric or continuous attributes.
In the following example, we discretize the age attribute by dividing its
values into three bins (intervals):
a) Load the dataset (student.arff) into Weka.
b) Select the age attribute.
c) Open the filter dialog box and select
“weka.filters.unsupervised.attribute.Discretize” from the list.
d) To change the defaults for the filter, click the box immediately to the
right of the Choose button.
e) Enter the index of the attribute to be discretized. In this case the
attribute is age, so we enter ‘1’, the index of the age attribute.
f) Enter ‘3’ as the number of bins. Leave the remaining field values as they
are.
g) Click the OK button.
h) Click Apply in the Filter panel. This results in a new working relation
with the selected attribute partitioned into 3 bins.
i) Save the new working relation in a file called
student-data-discretized.arff
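The three-bin split above can be illustrated with a small equal-width binning sketch. It assumes the filter is left at equal-width binning, which is the default of the unsupervised Discretize filter. The function names and the age values are hypothetical, and the handling of values that fall exactly on a cut point may differ from Weka's.

```python
def cut_points(values, bins=3):
    """Return the bins - 1 boundaries that split [min, max] into equal widths."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    return [lo + width * k for k in range(1, bins)]


def equal_width_bins(values, bins=3):
    """Map each numeric value to a bin index in 0 .. bins - 1."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    if width == 0:
        # All values are identical, so everything falls into one bin.
        return [0] * len(values)
    # The maximum value would land in index `bins`, so clamp it to the last bin.
    return [min(int((v - lo) / width), bins - 1) for v in values]


# Hypothetical numeric ages, for illustration only.
ages = [22, 25, 35, 38, 45, 52]

if __name__ == "__main__":
    print(cut_points(ages))
    print(equal_width_bins(ages))
```

With these ages the range 22 to 52 is split at 32 and 42, so the first two values fall into the first bin, the next two into the second, and the last two into the third.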
Data set:
@relation student
@attribute age {<30,30-40,>40}
@attribute income {low, medium, high}
@attribute student {yes, no}
@attribute credit-rating {fair, excellent}
@attribute buyspc {yes, no}
@data
%
<30, high, no, fair, no
<30, high, no, excellent, no
30-40, high, no, fair, yes
>40, medium, no, fair, yes
>40, low, yes, fair, yes
>40, low, yes, excellent, no
30-40, low, yes, excellent, yes
<30, medium, no, fair, no
<30, low, yes, fair, no
>40, medium, yes, fair, yes
<30, medium, yes, excellent, yes
30-40, medium, no, excellent, yes
30-40, high, yes, fair, yes
>40, medium, no, excellent, no
%