FA: Choosing Rotation matrix, based on “Simple Structure Criteria”

One of the most important issues in using factor analysis is its interpretation. Factor analysis often uses factor rotation to enhance its interpretation. After a satisfactory rotation, the rotated factor loading matrix L’ will have the same ability to represent the correlation matrix and it can be used as the factor loading matrix, instead of the unrotated matrix L.

The purpose of rotation is to make the rotated factor loading matrix have some desirable properties. One of the methods used is to rotate the factor loading matrix such that the rotated matrix will have a simple structure.

L. L. Thurstone introduced the Principle of Simple Structure, as a general guide for factor rotation:

Simple Structure Criteria:

  1. Each row of the factor matrix should contain at least one zero
  2. If there are m common factors, each column of the factor matrix should have at least m zeros
  3. For every pair of columns in the factor matrix, there should be several variables for which entries approach zero in the one column but not in the other
  4. For every pair of columns in the factor matrix, a large proportion of the variables should have entries approaching zero in both columns when there are four or more factors
  5. For every pair of columns in the factor matrix, there should be only a small number of variables with nonzero entries in both columns

The ideal simple structure is such that:

  1. each item has a high, or meaningful, loading on one factor only and
  2. each factor have high, or meaningful, loadings for only some of the items.

The problem is that, trying several combinations of rotation methods along with the parameters that each one accepts (especially for oblique ones), the number of candidate matrices increases and it is very difficult to see which one better meets the above criteria.

When I first faced that problem I realized that I was unable to select the best match by merely ‘looking’ at them, and that I needed an algorithm to help me decide. Under the stress of project’s deadlines, the most I could do was to write the following code in MATLAB, which accepts one rotation matrix at a time and returns (under some assumptions) whether each criterion is met or not.
A new version (If I would ever tried to upgrade it) would accept a 3d matrix (a set of 2d matrices) as an argument, and the algorithm should return the one that better fits the above criteria.

How would you extract an algorithm out of those criteria? I am just asking for your opinions (I also think that there’s been criticism over the usefulness of the method by itself) and perhaps better approaches to the rotation matrix selection problem.

Also, I would like to know what software do you prefer to perform FA. If it’s R, what package do you use? (I must admit that if I had to do FA, I would turn to SPSS again). If someone wants to provide some code, I would prefer R or MATLAB.

P.S. The above Simple Structure Criteria formulation can be found in the book “Making Sense of Factor Analysis” by PETT, M., LACKEY, N., SULLIVAN, J.

P.S.2 (from the same book): “A test of successful factor analysis is the extent to which it can reproduce the original corr matrix. If you also used oblique solutions, among all select the one that generated the greatest number of highest and lowest factor loadings.”
This sounds like another constraint that the algorithm could use.

P.S.3 This question has also been asked here. However, I think it fits better on this site.

function [] = simple_structure_criteria (my_pattern_table)
%Simple Structure Criteria
%Making Sense of Factor Analysis, page 132

disp(' ');
disp('Simple Structure Criteria (Thurstone):');
disp('1. Each row of the factor matrix should contain at least one zero');
disp( '2. If there are m common factors, each column of the factor matrix should have at least m zeros');
disp( '3. For every pair of columns in the factor matrix, there should be several variables for which entries approach zero in the one column but not in the other');
disp( '4. For every pair of columns in the factor matrix, a large proportion of the variables should have entries approaching zero in both columns when there are four or more factors');
disp( '5. For every pair of columns in the factor matrix, there should be only a small number of variables with nonzero entries in both columns');
disp(' ');
disp( '(additional by Pedhazur and Schmelkin) The ideal simple structure is such that:');
disp( '6. Each item has a high, or meaningful, loading on one factor only and');
disp( '7. Each factor have high, or meaningful, loadings for only some of the items.');

disp('')
disp('Start checking...')

%test matrix
%ct=[76,78,16,7;19,29,10,13;2,6,7,8];
%test it by giving: simple_structure_criteria (ct)

ct=abs(my_pattern_table);

items=size(ct,1);
factors=size(ct,2);
my_zero = 0.1;
approach_zero = 0.2;
several = floor(items / 3);
small_number = ceil(items / 4);
large_proportion = 0.30;
meaningful = 0.4;
some_bottom = 2;
some_top = floor(items / 2);

% CRITERION 1
disp(' ');
disp('CRITERION 1');
for i = 1 : 1 : items
    count = 0;
    for j = 1 : 1 : factors
        if (ct(i,j) < my_zero)
            count = count + 1;
            break
        end
    end
    if (count == 0)
        disp(['Criterion 1 is NOT MET for item ' num2str(i)])
    end
end


% CRITERION 2
disp(' ');
disp('CRITERION 2');
for j = 1 : 1 : factors 
    m=0;
    for i = 1 : 1 : items
        if (ct(i,j) < my_zero)
            m = m + 1;
        end
    end
    if (m < factors)
        disp(['Criterion 2 is NOT MET for factor ' num2str(j) '. m = ' num2str(m)]);
    end
end

% CRITERION 3
disp(' ');
disp('CRITERION 3');
for c1 = 1 : 1 : factors - 1
    for c2 = c1 + 1 : 1 : factors
        test_several = 0;
        for i = 1 : 1 : items
            if ( (ct(i,c1)>my_zero && ct(i,c2)<my_zero) || (ct(i,c1)<my_zero && ct(i,c2)>my_zero) ) % approach zero in one but not in the other
                test_several = test_several + 1;
            end
        end
        disp(['several = ' num2str(test_several) ' for factors ' num2str(c1) ' and ' num2str(c2)]);
        if (test_several < several)
            disp(['Criterion 3 is NOT MET for factors ' num2str(c1) ' and ' num2str(c2)]);
        end
    end
end

% CRITERION 4
disp(' ');
disp('CRITERION 4');
if (factors > 3)
    for c1 = 1 : 1 : factors - 1
        for c2 = c1 + 1 : 1 : factors
            test_several = 0;
            for i = 1 : 1 : items
                if (ct(i,c1)<approach_zero && ct(i,c2)<approach_zero) % approach zero in both
                    test_several = test_several + 1;
                end
            end
            disp(['large proportion = ' num2str((test_several / items)*100) '% for factors ' num2str(c1) ' and ' num2str(c2)]);
            if ((test_several / items) < large_proportion)
                pr = sprintf('%4.2g',  (test_several / items) * 100 );
                disp(['Criterion 4 is NOT MET for factors ' num2str(c1) ' and ' num2str(c2) '. Proportion is ' pr '%']);
            end
        end
    end
end

% CRITERION 5
disp(' ');
disp('CRITERION 5');
for c1 = 1 : 1 : factors - 1
    for c2 = c1 + 1 : 1 : factors
        test_number = 0;
        for i = 1 : 1 : items
            if (ct(i,c1)>approach_zero && ct(i,c2)>approach_zero) % approach zero in both
                test_number = test_number + 1;
            end
        end
        disp(['small number = ' num2str(test_number) ' for factors ' num2str(c1) ' and ' num2str(c2)]);
        if (test_number > small_number)
            disp(['Criterion 5 is NOT MET for factors ' num2str(c1) ' and ' num2str(c2)]);
        end
    end
end

% CRITERION 6
disp(' ');
disp('CRITERION 6');
for i = 1 : 1 : items
    count = 0;
    for j = 1 : 1 : factors
        if (ct(i,j) > meaningful)
            count = count + 1;
        end
    end
    if (count == 0 || count > 1)
        disp(['Criterion 6 is NOT MET for item ' num2str(i)])
    end
end

% CRITERION 7
disp(' ');
disp('CRITERION 7');
for j = 1 : 1 : factors 
    m=0;
    for i = 1 : 1 : items
        if (ct(i,j) > meaningful)
            m = m + 1;
        end
    end
    disp(['some items = ' num2str(m) ' for factor ' num2str(j)]);
    if (m < some_bottom || m > some_top)
        disp(['Criterion 7 is NOT MET for factor ' num2str(j)]);
    end
end
disp('')
disp('Checking completed.')
return

Answer

The R psych package includes various routines to apply Factor Analysis (whether it be PCA-, ML- or FA-based), but see my short review on crantastic. Most of the usual rotation techniques are available, as well as algorithm relying on simple structure criteria; you might want to have a look at W. Revelle’s paper on this topic, Very Simple Structure: An Alternative Procedure For Estimating The Optimal Number Of Interpretable Factors (MBR 1979 (14)) and the VSS() function.

Many authors are using orthogonal rotation (VARIMAX), considering loadings higher than, say 0.3 or 0.4 (which amounts to 9 or 16% of variance explained by the factor), as it provides simpler structures for interpretation and scoring purpose (e.g., in quality of life research); others (e.g. Cattell, 1978; Kline, 1979) would recommend oblique rotations since “in the real world, it is not unreasonable to think that factors, as important determiners of behavior, would be correlated” (I’m quoting Kline, Intelligence. The Psychometric View, 1991, p. 19).

To my knowledge, researchers generally start with FA (or PCA), using a scree-plot together with simulated data (parallel analysis) to help choosing the right number of factors. I often found that item cluster analysis and VSS nicely complement such an approach. When one is interested in second-order factors, or to carry on with SEM-based methods, then obviously you need to use oblique rotation and factor out the resulting correlation matrix.

Other packages/software:

  • lavaan, for latent variable analysis in R;
  • OpenMx based on Mx, a general purpose software including a matrix algebra interpreter and numerical optimizer for structural equation modeling.

References
1. Cattell, R.B. (1978). The scientific use of factor analysis in behavioural and life sciences. New York, Plenum.
2. Kline, P. (1979). Psychometrics and Psychology. London, Academic Press.

Attribution
Source : Link , Question Author : George Dontas , Answer Author : chl

Leave a Comment