Factor Analysis
Graham Tall research@grahamtall.com
September 2003
Preview:
The purpose of Factor Analysis is to identify patterns in test/item responses (i.e. data in the COLUMNS/FIELDS of a spreadsheet/database). It was initially used to ensure that the items in psychological tests were all assessing the same psychological construct (e.g. a particular way of thinking {spatial awareness, verbal reasoning, numerical reasoning} or a type of personality {extraversion, neuroticism}), but it has also been very valuable in analysing banks of attitude questions. It allows the researcher to check whether the attitude questions are measuring the same underlying characteristic (e.g. enjoyment of mathematics, belief in Islam etc.) or to discover whether there are any particular patterns amongst the range of responses. The statistical concern that factor analysis requires interval numbers, whereas attitude questions yield only ordinal responses, is commonly rebutted by the reality that the 'factors' obtained are commonly helpful. As Hutcheson & Sofroniou (1999) argue, "The relaxation of the requirement for continuous data can be justified for exploratory factor analysis as the usefulness of the procedure is based purely on the interpretability of the factors." (p.222)
Sample Size:
Whilst the bigger the sample the better, Hutcheson & Sofroniou (1999) accept that viable analyses can be carried out on "much smaller" numbers than the 150 cases mentioned by Tabachnick & Fidell (1996).
The use of Factor Analysis in psychology and education is based on the view that responses to particular questions are caused or affected by underlying belief systems ('psychological constructs' or factors). The assumption is that, if this is the case, such questions will be answered similarly and hence will correlate with each other. In educational research methodologies, such underlying belief systems are the paradigms behind the use of 'quantitative' and 'qualitative' methods. The two quotations below illustrate this:
Factor Analysis is based on the assumption that relationships between variables are due to the effects of underlying factors. It is assumed that factors may represent the causes of relationships in the data and that observed correlations are the result of variables sharing common factors. For example, the existence of certain attitudes can be inferred from answers which are given to a number of questions. Answers of "agree" or "strongly agree" to questions such as "It is important to preserve one's culture", "I am prepared to die for my country" and "It is good to take part in our traditional festivals" may lead one to conclude that the 'patriotism' factor is present. Here patriotism is not a single measurable entity but is a construct which is derived from the measurement of other variables....Hypothesising the existence of something called 'patriotism' explains some of the relationships between the variables, can simplify the description of the data and help in our understanding of the complex relationship between answers to numerous and varied questions. (Hutcheson, G. & Sofroniou, N. (1999) pp.218-9)
What are creativity, love, and altruism? Unlike variables such as weight, blood pressure and temperature, they cannot be measured on a scale, sphygmomanometer, or thermometer, in units of pounds, millimetres of mercury or degrees Fahrenheit. Instead they can be thought of as unifying constructs or labels that characterise responses to related groups of variables. For example, answers of "strongly agree" to items such as "sends me flowers", "listens to my problems", "reads my manuscripts", "laughs at my jokes," and "gazes deeply into my soul" may lead you to conclude that love is present. Thus love is not a single measurable entity but a construct which is derived from other, directly observable variables. Identification of such underlying dimensions - factors - greatly simplifies the description and understanding of complex phenomena like social interaction. (Norusis, 1992, p.47)
For researchers the particular value of factor analysis is that it simplifies the data analysis report and helps to provide deeper insights. Instead of commenting separately on twenty plus attitude statements, factor analysis allows the researcher to group them into a much smaller number of factors/constructs.
Factor analysis is NOT expected for an MEd degree, but it is expected for MPhil, PhD and EdD degrees when the data warrant it.
Factor analysis software can be run on Pentium PCs which have plenty of memory. A time-restricted copy of SPSS which can be loaded on one's own computer is available at low cost from the main library. SPSS is also available in the School of Education IT room and John Shearwood will provide advice on using it, but it is your responsibility to enter the data. It is recommended that you enter the data into an EXCEL file (see Figure 1 below). Once entered, use the SAVE AS command and save the sheet of data as a single worksheet (EXCEL version 4 or earlier), with the first row of the spreadsheet containing short names for each field. Note: most spreadsheet programs and all later versions of EXCEL can create earlier versions of EXCEL files.
Figure 1: An example of a spreadsheet containing attainment test and questionnaire information for 3 individuals.
| | A | B | C | D | E | F | G | H | I | J | K |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | ID* | T1 | T2 | T3 | Sex | Eth | A1 | A2 | A3 | A4 | |
| 2 | M1 | 27 | 59 | 62 | 2 | 4 | 43 | 5 | 4 | 6 | |
| 3 | M2 | 35 | 48 | 47 | 2 | 4 | 41 | 4 | 6 | 3 | |
| 4 | F3 | 22 | 50 | 52 | 1 | 8 | 35 | 6 | 5 | 4 | |
| 5 | etc. | | | | | | | | | | |
* The ID column identifies each individual. T1 to T3 are attainment test scores. Sex, Eth and A1 to A4 are questionnaire responses. Field names must begin with a letter, be a maximum of 8 letters/numbers long, and avoid symbols like %.
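The field-name rules above could be checked before the spreadsheet is saved for import. The following is a minimal sketch in Python, not part of SPSS or EXCEL; the function name `is_valid_field_name` is hypothetical, and restricting names to unaccented letters and digits is an assumption of this sketch.

```python
import re

# Rules from the text: begin with a letter, at most 8 letters/numbers, no symbols.
# Assumption: only the letters A-Z/a-z and the digits 0-9 are allowed.
FIELD_NAME = re.compile(r"[A-Za-z][A-Za-z0-9]{0,7}")


def is_valid_field_name(name: str) -> bool:
    """Return True if `name` satisfies the field-name rules above."""
    return FIELD_NAME.fullmatch(name) is not None


# First row of the spreadsheet in Figure 1.
header = ["ID", "T1", "T2", "T3", "Sex", "Eth", "A1", "A2", "A3", "A4"]

# Collect any names that break the rules so they can be renamed before import.
bad_names = [name for name in header if not is_valid_field_name(name)]
print("Invalid field names:", bad_names)
```

A name such as `%Pass` or `1stTest` would be rejected by this check, as would any name longer than 8 characters.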
Factor analysis searches for patterns in test/item responses in the selected columns of data in a spreadsheet. The columns of data must contain at least interval or ordinal numbers. Whilst the researcher can use a 'battering ram' style of research and include all columns of data, it is more logical to choose those relevant to the item being studied. The underlying assumption is that the higher the correlation between columns of data, the greater the likelihood that they are measuring a common underlying characteristic or factor. Second, third and fourth factors will be found if groups of questions in the analysis correlate highly with each other, but not with the questions in the other groups. The first stage in factor analysis is, therefore, the production of a correlation matrix. Interpretation of factor analysis is based on:
1. Eigenvalue: the analysis begins by standardising the value of EACH attitude question/test as being = 1. Hence, a factor with an eigenvalue less than 1 provides less information than that obtained from a single attitude question/test.
2. The larger the eigenvalue for a particular factor, the greater the amount of information that is explained.
3. Interpret each factor by studying the attitude questions/statements with large +ve or -ve scores (loadings). Statements that you thought were linked may not be included because the original wording of the statement allowed different interpretations. Finally, do remember that computers cannot 'think': if the great majority of respondents all strongly agreed with an item, then that item may well have a low correlation with other attitude statements where there is a range of agreement.
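The eigenvalue rules above can be illustrated with a short sketch in Python using NumPy. The correlation matrix `R` is hypothetical (invented purely for illustration, not taken from any study), and taking eigenvalues directly from the correlation matrix is an assumption of this sketch; the figures reported by SPSS depend on the extraction method chosen.

```python
import numpy as np

# Hypothetical correlation matrix for four attitude items (invented for
# illustration): items 1 and 2 correlate strongly, as do items 3 and 4.
R = np.array([
    [1.0, 0.8, 0.1, 0.1],
    [0.8, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.6],
    [0.1, 0.1, 0.6, 1.0],
])

# Eigenvalues of the symmetric matrix, sorted from largest to smallest.
eigenvalues = np.linalg.eigvalsh(R)[::-1]

# Each standardised item contributes 1, so the eigenvalues sum to the number
# of items; dividing by that sum gives the share of information per factor.
proportion = eigenvalues / eigenvalues.sum()

# Count the factors that carry more information than a single item.
retained = int((eigenvalues > 1.0).sum())

for i, (ev, p) in enumerate(zip(eigenvalues, proportion), start=1):
    print(f"Factor {i}: eigenvalue {ev:.2f}, {p:.0%} of the total")
print("Factors with eigenvalue > 1:", retained)
```

With this made-up matrix, two factors have eigenvalues above 1, matching the two groups of inter-correlated items built into it.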
Consider the series of statements below:
| Statement 1 | Statement 2 | Statement 3 | Statement 4 | Statement 5 | Statement 6 |
|---|---|---|---|---|---|
| 6 | 4 | 5 | 6 | 2 | 1 |
| 6 | 5 | 6 | 5 | 4 | 2 |
| 6 | 6 | 6 | 4 | 6 | 3 |
| 6 | 4 | 5 | 6 | 2 | 1 |
| 6 | 5 | 6 | 5 | 4 | 2 |
| 6 | 6 | 6 | 4 | 6 | 3 |
| 6 | 4 | 5 | 6 | 2 | 1 |
| 6 | 5 | 6 | 5 | 4 | 2 |
| 6 | 6 | 6 | 4 | 6 | 3 |
Statements 1, 2, 3 and 4 contain ONLY positive responses (4, 5 or 6 on a six-point scale), yet they behave very differently. Statement 1 will not correlate with any of the other statements because everyone answering strongly agreed (6), so there is no variation in its responses. Statements 2 & 3 will correlate positively with each other and negatively with statement 4. Statement 5 will correlate positively with 2 & 3 and negatively with statement 4. Even though statement 6 contains ONLY negative responses (1, 2 or 3 on a six-point scale), it will still correlate positively with 2, 3 & 5 and negatively with statement 4.
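The pattern described above can be checked with a few lines of Python. This is a minimal sketch using the responses from the table; the helper `pearson` is written out here for illustration (it is not an SPSS function) and returns `None` when a statement has no variation, since a correlation cannot then be calculated.

```python
from math import sqrt

# The nine rows of responses from the table (three distinct rows, repeated).
rows = [
    (6, 4, 5, 6, 2, 1),
    (6, 5, 6, 5, 4, 2),
    (6, 6, 6, 4, 6, 3),
] * 3

# Transpose so that columns[0] holds Statement 1, columns[1] Statement 2, etc.
columns = list(zip(*rows))


def pearson(x, y):
    """Pearson correlation, or None when either column has no variation."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    ss_x = sum(d * d for d in dx)
    ss_y = sum(d * d for d in dy)
    if ss_x == 0 or ss_y == 0:
        return None  # e.g. Statement 1, where everyone answered 6
    return sum(a * b for a, b in zip(dx, dy)) / sqrt(ss_x * ss_y)


# Print the correlation for every pair of statements.
for i in range(6):
    for j in range(i + 1, 6):
        r = pearson(columns[i], columns[j])
        label = "cannot be calculated (no variation)" if r is None else f"{r:+.2f}"
        print(f"Statement {i + 1} vs Statement {j + 1}: {label}")
```

The sign of each correlation depends only on whether responses rise and fall together, not on whether they sit at the 'agree' or 'disagree' end of the scale, which is why statement 6 can correlate positively with statements 2, 3 and 5.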
Introductions to factor analysis can be found in Burroughs (1975) and Hutcheson & Sofroniou (1999).
QUALITATIVE ASPECT of Factor Analysis: to interpret 'factors' you MUST study the items.
Identifying what a factor means requires a combination of deductive and inductive reasoning. The attitude statements need to be read carefully to discover why items have been answered similarly. The research assumption is that when a group of statements inter-correlate it is likely that there is an underlying relationship between them.
It is essential to recognise that the calculation aspect of factor analysis is both totally dependent on the information given to it and completely ignorant of what the numbers mean. This means that the patterns discovered have been identified without knowledge of the meaning of the various statements. They are, to that extent, more objective than patterns identified solely through the researcher's intuitive interpretation of the data collected. But the researcher's intuition remains crucial for interpreting them: the researcher has to study the original statements to interpret the factors identified. Factor analysis is ultimately a combination of objective evidence and intuitive interpretation.
Note: Particular individuals may present one, some, all or none of the factors identified. For example: several teachers, all of whom are deeply committed to caring for the children in their tutor groups, could have very different attitudes to record-keeping and teaching in tutor time.
References:
Burroughs, G. (1975) Design and Analysis in Educational Research. Educational Monograph No. 8. University of Birmingham. A useful introduction to factor analysis.
Hutcheson, G. & Sofroniou, N. (1999) The Multivariate Social Scientist. Sage: London.
Norusis, M.J. (1992) SPSS for Windows Professional Statistics Release 5. SPSS: Chicago. A very useful book providing detailed instructions on how to use the SPSS package.