How to compute correlation between/within groups of variables?

I have a matrix of 1000 observations and 50 variables each measured on a 5-point scale. These variables are organized into groups, but there aren’t an equal number of variables in each group.

I’d like to calculate two types of correlations:

  1. Correlation within groups of variables (among characteristics): some measure of whether the variables within the group of variables are measuring the same thing.
  2. Correlation between groups of variables: some measure, assuming that each group reflects one overall trait, of how each trait (group) is related to every other trait.

These characteristics have been previously classified into groups. I’m interested in finding the correlation between the groups – i.e. assuming that the characteristics within a group are measuring the same underlying trait (having completed #1 above – Cronbach’s alpha), are the traits themselves related?

Does anybody have suggestions for where to start?


What @rolando suggested looks like a good start, if not the whole response (IMO). Let me continue with the correlational approach, following the Classical Test Theory (CTT) framework. Here, as noted by @Jeromy, a summary measure for your group of characteristics might be the total (or sum) score of all items (characteristics, in your words) belonging to what I will now refer to as a scale. Under CTT, this allows us to formalize an individual's “trait” propensity or liability as their location on a continuous scale reflecting an underlying construct (a latent trait), although here it is merely an ordinal scale (but this is another debate in the psychometrics literature).
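To make the sum-score idea concrete, here is a minimal sketch in plain Python (standard library only). The data, the grouping in `scales`, and the helper names `pearson`, `scale_scores` and `cronbach_alpha` are all made up for illustration; it simply computes one sum score per scale, Cronbach's alpha within each scale, and the Pearson correlation between the two scale scores.

```python
from statistics import variance  # sample variance (n - 1 denominator)


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def scale_scores(rows, items):
    """Sum score per respondent over the item columns listed in `items`."""
    return [sum(row[j] for j in items) for row in rows]


def cronbach_alpha(rows, items):
    """Cronbach's alpha for one scale (requires at least two items)."""
    k = len(items)
    item_vars = [variance([row[j] for row in rows]) for j in items]
    total_var = variance(scale_scores(rows, items))
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical toy data: 6 respondents x 5 items on a 5-point scale.
data = [
    [1, 1, 2, 5, 4],
    [2, 2, 1, 4, 5],
    [3, 3, 3, 1, 2],
    [4, 5, 4, 2, 1],
    [5, 4, 5, 3, 3],
    [2, 1, 2, 3, 3],
]
# Assumed grouping with unequal scale sizes: columns 0-2 and columns 3-4.
scales = {"A": [0, 1, 2], "B": [3, 4]}

# Within-scale consistency (question 1).
for name, items in scales.items():
    print(name, "alpha =", round(cronbach_alpha(data, items), 3))

# Between-scale relation (question 2): correlate the two sum scores.
r_ab = pearson(scale_scores(data, scales["A"]), scale_scores(data, scales["B"]))
print("r(A, B) =", round(r_ab, 3))
```

Sum scores from scales of different lengths are on different ranges, which does not matter for Pearson correlation; with ordinal 5-point items, a rank-based correlation would be a reasonable alternative.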

What you described has to do with what is known as convergent validity (to what extent items belonging to the same scale correlate with each other) and discriminant validity (items belonging to different scales should not correlate to a great extent) in psychometrics. Classical techniques include multitrait-multimethod (MTMM) analysis (Campbell & Fiske, 1959). An illustration of how it works is shown below (three methods or instruments, three constructs or traits):

[Figure: example MTMM matrix, three methods and three traits]

In this MTMM matrix, the diagonal elements might be Cronbach’s alpha or test-retest intraclass correlations; these are indicators of the reliability of each measurement scale. The validity of the hypothesized (shared) constructs is assessed by the correlation of scale scores when different instruments are used to assess the same trait; if these instruments were developed independently, a high correlation (>0.7) would support the idea that the traits are defined in a consistent and objective manner. The remaining cells summarize relations between traits within a method and between traits across methods; they indicate how distinct constructs are measured by different scales and how the traits relate to one another within a given scale. Assuming independent traits, we generally don’t expect these correlations to be high (a recommended threshold is <0.3), but a more formal test of hypothesis (on the correlation point estimates) can be carried out. A subtlety is that we use the so-called "rest correlation": we compute the correlation between an item (or trait) and its scale (or method) after removing the contribution of this item from the sum score of this scale (correction for overlap).
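The correction for overlap is easy to get wrong, so here is a small sketch of it in plain Python. The toy data and the function names (`pearson`, `item_total_correlation`, `item_rest_correlation`) are hypothetical; the only point is that the item is dropped from the sum score before correlating.

```python
def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def item_total_correlation(rows, item, scale_items):
    """Uncorrected: the item is still part of the sum score it is compared to."""
    item_scores = [row[item] for row in rows]
    total = [sum(row[j] for j in scale_items) for row in rows]
    return pearson(item_scores, total)


def item_rest_correlation(rows, item, scale_items):
    """Corrected for overlap: the item is removed from the sum score first.

    If `item` is not in `scale_items`, nothing is removed and this is simply
    the correlation between the item and that other scale's sum score.
    """
    item_scores = [row[item] for row in rows]
    rest = [sum(row[j] for j in scale_items if j != item) for row in rows]
    return pearson(item_scores, rest)


# Hypothetical toy data: 6 respondents x 3 items forming one scale.
data = [
    [1, 1, 2],
    [2, 2, 1],
    [3, 3, 3],
    [4, 5, 4],
    [5, 4, 5],
    [2, 1, 2],
]
scale = [0, 1, 2]

for item in scale:
    print(
        "item", item,
        "item-total =", round(item_total_correlation(data, item, scale), 3),
        "item-rest =", round(item_rest_correlation(data, item, scale), 3),
    )
```

The uncorrected value is inflated because the item correlates perfectly with its own contribution to the total; the shorter the scale, the larger that share.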

Although this method was initially developed to assess convergent and discriminant validity of a certain number of traits as studied by different measurement instruments, it can be applied to a single multi-scale instrument. The traits then become the items, and the methods are just the different scales. A generalization of this method to a single instrument is also known as multitrait scaling. Items correlating as expected (i.e., more strongly with their own scale than with a different scale) are counted as scaling successes. We generally assume, however, that the different scales are not correlated, that is, that they target different hypothetical constructs. Still, averaging the within- and between-scale correlations provides a quick way of summarizing the internal structure of your instrument. Another convenient way of doing so is to apply a cluster analysis to the matrix of pairwise correlations and see how your variables hang together.
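As a rough sketch of multitrait scaling for a single instrument, the plain-Python code below flags a scaling success when an item correlates more strongly with the rest of its own scale than with every other scale's sum score, and also averages the inter-item correlations within and between scales. The data, the `scales` grouping, and all function names are invented for illustration, and each scale is assumed to have at least two items with non-constant responses.

```python
from itertools import combinations, product


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def column(rows, j):
    """Responses of all respondents to item j."""
    return [row[j] for row in rows]


def item_scale_correlation(rows, item, scale_items):
    """Item vs. scale sum score, leaving the item out if it belongs to the scale."""
    rest = [sum(row[j] for j in scale_items if j != item) for row in rows]
    return pearson(column(rows, item), rest)


def multitrait_scaling(rows, scales):
    """Per item: correlation with own scale, with other scales, and a success flag."""
    results = {}
    for own, own_items in scales.items():
        for item in own_items:
            own_r = item_scale_correlation(rows, item, own_items)
            others = {
                name: item_scale_correlation(rows, item, items)
                for name, items in scales.items() if name != own
            }
            success = all(own_r > r for r in others.values())
            results[item] = {"own": own_r, "others": others, "success": success}
    return results


def mean_within(rows, items):
    """Average inter-item correlation inside one scale."""
    rs = [pearson(column(rows, i), column(rows, j)) for i, j in combinations(items, 2)]
    return sum(rs) / len(rs)


def mean_between(rows, items_a, items_b):
    """Average correlation over all item pairs drawn from two different scales."""
    rs = [pearson(column(rows, i), column(rows, j)) for i, j in product(items_a, items_b)]
    return sum(rs) / len(rs)


# Hypothetical toy data: 6 respondents x 5 items on a 5-point scale.
data = [
    [1, 1, 2, 5, 4],
    [2, 2, 1, 4, 5],
    [3, 3, 3, 1, 2],
    [4, 5, 4, 2, 1],
    [5, 4, 5, 3, 3],
    [2, 1, 2, 3, 3],
]
scales = {"A": [0, 1, 2], "B": [3, 4]}  # assumed grouping, unequal sizes

for item, res in multitrait_scaling(data, scales).items():
    print("item", item, "own =", round(res["own"], 3), "success =", res["success"])
print("mean r within A:", round(mean_within(data, scales["A"]), 3))
print("mean r within B:", round(mean_within(data, scales["B"]), 3))
print("mean r between A and B:", round(mean_between(data, scales["A"], scales["B"]), 3))
```

Stricter definitions of scaling success also require the difference between the two correlations to exceed some margin; the strict inequality used here is only the simplest version.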

Of note, in both cases the usual caveats of working with correlation measures apply: you cannot account for measurement error, you need a large sample, and instruments or tests are assumed to be "parallel" (tau-equivalence, uncorrelated errors, equal error variances).

The second part addressed by @rolando is also interesting: if there's no theoretical or substantive indication that the already established grouping of items makes sense, then you'll have to find a way to highlight the structure of your data with, e.g., exploratory factor analysis. But even if you trust those "characteristics within a group", you can check that this is a valid assumption: you might use a confirmatory factor analysis model to check that the pattern of item loadings (the correlation of each item with its own scale) behaves as expected.

Instead of traditional factor analytic methods, you can also take a look at item clustering (Revelle, 1979), which relies on a Cronbach's alpha-based split rule to group items into homogeneous scales.

A final word: If you are using R, there are two very nice packages that will ease the aforementioned steps:

  • psych provides everything you need to get started with psychometric methods, including factor analysis (fa, fa.parallel, principal), item clustering (ICLUST and related methods), and Cronbach's alpha (alpha); there's a nice overview available on William Revelle's website, especially An introduction to psychometric theory with applications in R.
  • psy also includes scree plot visualization (scree.plot, via PCA + simulated datasets) and MTMM (mtmm).


  1. Campbell, D.T. and Fiske, D.W. (1959). Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin, 56: 81–105.
  2. Hays, R.D. and Fayers, P. (2005). Evaluating multi-item scales. In Assessing quality of life in clinical trials, (Fayers, P. and Hays, R., Eds.), pp. 41-53. Oxford.
  3. Revelle, W. (1979). Hierarchical Cluster Analysis and the Internal Structure of Tests. Multivariate Behavioral Research, 14: 57-74.

Source: Link, Question Author: blep, Answer Author: chl
