go back to main

5.4. Real data with feature selection

Real data of object images (COIL-100)

 

Real data of face images (ORL faces)

 

With the feature selection parameter set, the MATLAB code is limited to at most 8 components because of the computational cost of the exhaustive search (n_vectors=7 or less).
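The limit comes from the combinatorial cost of exhaustive selection: evaluating every non-empty subset of n extracted components takes 2^n - 1 classifier runs, which roughly doubles with each added component. A quick sketch of this growth (in Python, for illustration only; the experiments themselves run in MATLAB):

```python
# Exhaustive feature selection must evaluate every non-empty subset
# of the n extracted components, i.e. 2**n - 1 candidate subsets.
for n in range(4, 11):
    total = 2 ** n - 1
    print(f"n_vectors={n}: {total} subsets to evaluate")
```

At n_vectors=7 this is 127 subsets per repetition; by n_vectors=10 it is already 1023.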

 

EXPERIMENT 17

First 20 objects from COIL-100 and FASTICA

objects=20;
x=data_coil100(objects);
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
nc=objects; %number of classes
np=10; %number of samples per class
n_vectors=6; %number of vectors used to reduce the data
nr=6; %number of repetitions of the classification process (DIFFERENT EXPERIMENTS)
per_test=0.5; %percentage of test samples from x (0-1)

>> example(3,1,1)
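With these settings, each of the nr repetitions draws a random train/test split: per class, a fraction per_test of the np samples goes to the test set. A quick check of the resulting set sizes (Python sketch; the variable names mirror the MATLAB script, with np renamed to avoid confusion):

```python
nc = 20         # number of classes
np_ = 10        # samples per class (np in the MATLAB script)
per_test = 0.5  # fraction of test samples (0-1)

test_per_class = round(np_ * per_test)
train_per_class = np_ - test_per_class
print("per class:", train_per_class, "train /", test_per_class, "test")
print("totals:", train_per_class * nc, "train /", test_per_class * nc, "test")
```

So each repetition trains on 100 images and tests on the other 100.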

EXPERIMENT 18

First 20 objects from COIL-100 and INFOMAX

objects=20;
x=data_coil100(objects);
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
nc=objects; %number of classes
np=10; %number of samples per class
n_vectors=6; %number of vectors used to reduce the data
nr=6; %number of repetitions of the classification process (DIFFERENT EXPERIMENTS)
per_test=0.5; %percentage of test samples from x (0-1)

>> example(3,2,1)

EXPERIMENT 19

40 people from ORL and FASTICA

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Images from ORL FACE DATABASE %%%%http://people.cs.uchicago.edu/~dinoj/vis/ORL.zip
objects=40;
np=10; %number of samples per class
see=0; %display flag passed to data_orl_faces (assumed: nonzero shows the loaded images)
x=data_orl_faces(objects,np,see);
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
nc=objects; %number of classes 
n_vectors=5; %number of vectors used to reduce the data
nr=20; %number of repetitions of the classification process 

per_test=0.5; %percentage of test samples from x (0-1)

>> example(4,1,1)

EXPERIMENT 20

40 people from ORL and INFOMAX

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Images from ORL FACE DATABASE %%%%http://people.cs.uchicago.edu/~dinoj/vis/ORL.zip
objects=40;
np=10; %number of samples per class
see=0; %display flag passed to data_orl_faces (assumed: nonzero shows the loaded images)
x=data_orl_faces(objects,np,see);
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
nc=objects; %number of classes 
n_vectors=5; %number of vectors used to reduce the data
nr=20; %number of repetitions of the classification process 

per_test=0.5; %percentage of test samples from x (0-1)

>> example(4,2,1)

CONCLUSIONS (Experiments 17-20)

In the above graphs, for each number of components returned by the algorithm, the best possible subset of components is selected by means of an exhaustive feature selection process. The results thus show the recognition rates that could be obtained with each algorithm if a perfect selection of components were carried out.
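The exhaustive selection described above can be sketched as follows: for a given subset size k, try every combination of k components, score each with a classifier, and keep the best. This is an illustrative Python sketch, not the MATLAB code of the experiments; the toy data, the nearest-mean classifier, and the accuracy measure are all assumptions made for the example:

```python
from itertools import combinations

import numpy as np


def nearest_mean_accuracy(Xtr, ytr, Xte, yte):
    """Toy classifier: assign each test sample to the nearest class mean."""
    classes = np.unique(ytr)
    means = np.array([Xtr[ytr == c].mean(axis=0) for c in classes])
    d = ((Xte[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
    pred = classes[d.argmin(axis=1)]
    return float((pred == yte).mean())


def best_subset(Xtr, ytr, Xte, yte, k):
    """Exhaustively evaluate every subset of k components; keep the best."""
    best_acc, best_idx = -1.0, None
    for subset in combinations(range(Xtr.shape[1]), k):
        cols = list(subset)
        acc = nearest_mean_accuracy(Xtr[:, cols], ytr, Xte[:, cols], yte)
        if acc > best_acc:
            best_acc, best_idx = acc, subset
    return best_idx, best_acc


# Toy data: 2 classes, 6 "components", only component 0 is informative.
rng = np.random.default_rng(0)
y = np.tile([0, 1], 50)        # 100 samples, classes interleaved
X = rng.normal(size=(100, 6))  # mostly noise components
X[:, 0] += 3 * y               # class information lives in component 0
idx, acc = best_subset(X[:80], y[:80], X[80:], y[80:], k=2)
print("best subset:", idx, "accuracy:", acc)
```

The search reliably picks a subset containing the informative component, which illustrates why a perfect selection can change the ranking of the decomposition algorithms: what matters is whether the class information is concentrated in a few recoverable components.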

The results clearly show that a feature selection process can have a strong influence on the recognition rates obtained by each algorithm. FastICA and whitened PCA no longer perform equally, and the same applies to Infomax. Even though the results of the three algorithms differ to a great extent, there is no clear winner. As all the algorithms are unsupervised, the results are highly dependent on the class distributions. This fact was previously shown in Section 5.3 with the artificial datasets.

 

go back to main