Correspondence Analysis is a technique, most commonly found in market research, for displaying relationships between groups of respondents and levels of categories. For example, if you are trying to determine which groups prefer a particular type of snack food, you could use a Correspondence Map to show these preferences on a two-dimensional plane.
To perform a Correspondence Analysis, your data need to take a particular form (for one example, see the table below). In this table, Brand represents the groups you wish to compare (e.g., males vs. females). In the Attribute column, each number represents a different product being rated by the groups in the Brand column. Finally, Top 2 Box (%) represents the proportion of respondents within that group who gave that Attribute a top rating. In the table below, the top row (1, 1, .67) means that 67% of Males gave the first attribute a rating of 4 or 5 (out of 5). The second row (1, 2, .54) indicates that 54% of Males gave the second attribute a rating of 4 or 5.
One thing that makes this a little tricky is that a single row no longer represents one respondent, and data from one respondent can now appear in multiple groups (rows). For example, a 20-year-old female who gave Attribute 3 a top box rating would be counted in both the group for females and the group for young people.

There is more than one way to get the data into this format; the process outlined below is just one way, and it can be replicated in SPSS with only a few adjustments to the code.
Note: I used SPSS 12.0.2 for Windows – you may have to make additional modifications if using a different version of SPSS.
1. Select the groups you want to compare
In this example we will examine how seven groups rated 10 different options of snack foods on a scale of 1 to 5. The seven groups are males (Gender=0), females (Gender=1), young (Age=1), middle aged (Age=2), old (Age=3), dog owners (Cell7=0) and cat owners (Cell7=1). The options for snack food are represented by the variables “option_1” to “option_10”.
Here is what the initial data file might look like:

2. (Optional) Recode the data
Recode your groups to correspond to the format of the final output. In other words, recode the variables in numeric order so that once they are all combined together, there will be no duplicates or overlap with variable codes. While this step is optional, it will save time and prevent confusion in subsequent steps.
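In this example, we will recode the variables for gender, age, and pet ownership into three new variables (Genderr, Ager, and Pet) whose codes run from 1 to 7 without overlap, so that they can later be combined into a single grouping variable.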
RECODE
Gender
(0=1) (1=2) INTO Genderr .
EXECUTE .
RECODE
Age
(1=3) (2=4) (Else=5) INTO Ager .
EXECUTE .
RECODE
cell7
(0=6) (1=7) INTO Pet .
EXECUTE .
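As a quick check of the recoding logic outside SPSS, here is a minimal Python sketch (Python is used only for illustration and is not part of the SPSS workflow). The function name recode_groups is hypothetical; the mappings mirror the three RECODE commands above, including the fact that values not listed in a RECODE ... INTO leave the new variable missing.

```python
# Illustration only: mirrors the three RECODE commands for one respondent.
def recode_groups(gender, age, cell7):
    """Return the recoded (Genderr, Ager, Pet) codes for one respondent."""
    genderr = {0: 1, 1: 2}.get(gender)  # males -> 1, females -> 2, otherwise missing
    ager = {1: 3, 2: 4}.get(age, 5)     # young -> 3, middle aged -> 4, Else -> 5
    pet = {0: 6, 1: 7}.get(cell7)       # dog owners -> 6, cat owners -> 7, otherwise missing
    return genderr, ager, pet

# A young female dog owner belongs to groups 2, 3 and 6.
print(recode_groups(gender=1, age=1, cell7=0))  # (2, 3, 6)
```

Because the seven codes never collide, the three recoded variables can be stacked into one column without losing track of which group a code refers to.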
Here is what your data file might look like at this stage:

3. Restructure the Data Using the Variables to Cases Function
VARSTOCASES (Variables to Cases) is a command that lets you combine information from multiple variables into one. You can also use it to simplify the working data file, keeping only the variables that you will need and dropping extraneous ones.
The newly created ‘Brand’ will contain all the information from the recoded variables Genderr, Ager and Pet (here is where you will be thankful for recoding earlier). ‘Index1’ will contain the name of the source variable, while ‘Brand’ will contain the numerical code. Because we want the proportion of Top 2 Box ratings from the 10 options, we will use the ‘/KEEP’ option to list the variables we want to remain in the data file. All other variables will be dropped from the data file at this stage.
VARSTOCASES /MAKE Brand FROM Genderr Ager Pet
/INDEX = Index1(Brand)
/KEEP = option_1 option_2 option_3 option_4 option_5 option_6
option_7 option_8 option_9 option_10
/NULL = KEEP.
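To see what this restructuring does to a single respondent, here is a minimal Python sketch (illustration only, not SPSS code). The function vars_to_cases, the key names index and group_code, and the sample respondent are all hypothetical; group_code stands for the variable named in /MAKE and index for the one named in /INDEX.

```python
# Illustration only: one respondent becomes one row per grouping variable.
def vars_to_cases(respondent, group_vars, keep_vars):
    """Return one row per grouping variable, carrying the kept ratings along."""
    rows = []
    for name in group_vars:
        row = {"index": name, "group_code": respondent[name]}
        row.update({k: respondent[k] for k in keep_vars})
        rows.append(row)
    return rows

# Hypothetical respondent: a young female dog owner, with two of the ten ratings.
respondent = {"Genderr": 2, "Ager": 3, "Pet": 6, "option_1": 5, "option_2": 2}
long_rows = vars_to_cases(respondent, ["Genderr", "Ager", "Pet"], ["option_1", "option_2"])
for row in long_rows:
    print(row)
```

The same ratings are repeated on all three rows, which is why a single row no longer represents one respondent after this step.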
Here’s what your data would look like at this stage:

4. Count the Number of Top 2 Box Scores
Now that we have each respondent identified to a particular group, we need to count each time they gave a top 2 box rating for each attribute (in this example, the attributes are the variables called “option_x”).
There are multiple ways to do this; one is to create new variables and fill them with a “do repeat” loop. The code below creates one new variable for each attribute and sets it to 1 whenever the respondent gave that attribute a top two box score (a rating of 4 or 5).
compute count_1=0.
compute count_2=0.
compute count_3=0.
compute count_4=0.
compute count_5=0.
compute count_6=0.
compute count_7=0.
compute count_8=0.
compute count_9=0.
compute count_10=0.
do repeat x=option_1 to option_10 / y=count_1 to count_10.
if x=5 or x=4 y=1.
end repeat.
execute.
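The flagging rule can be sketched in a few lines of Python (illustration only; the function name top2_flags is hypothetical). As in the SPSS code, every flag starts at 0 and is set to 1 only when the rating is 4 or 5, so a missing rating stays at 0.

```python
# Illustration only: one 0/1 flag per rating, 1 meaning a top two box score.
def top2_flags(ratings):
    """Return 1 for each rating of 4 or 5, otherwise 0 (missing counts as 0)."""
    return [1 if r in (4, 5) else 0 for r in ratings]

print(top2_flags([5, 3, 4, 1, None]))  # [1, 0, 1, 0, 0]
```

Note that treating missing ratings as 0 keeps those respondents in the denominator when proportions are computed later.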
Now, your data would look like this:

5. Aggregate the data
Once you have flagged each respondent's top two box ratings for each attribute, you need to find the proportion of respondents within each group that gave a top two box rating. This can be easily accomplished with the Aggregate function in SPSS.
Aggregate will create a new data file containing only the information you specify. Start by giving your new file a name. Use the ‘/BREAK’ option to specify how you want to group your data; in this example, ‘Brand’ contains our grouping information. The next few lines of code specify how you want SPSS to summarize the data. Because our count variables are binary (0 or 1), the mean of each one is the proportion of respondents who gave a top two box rating. The last line of the code, ‘/N_BREAK=N’, is optional. It returns the number of respondents within each group, which is an easy way to check that your code is working properly.
AGGREGATE
/OUTFILE="C:\Documents and Settings\yourname\My Documents\data.sav"
/BREAK=Brand
/count_1 = MEAN(count_1) /count_2 = MEAN(Count_2) /count_3 = MEAN(Count_3)
/count_4 = MEAN(Count_4) /count_5 = MEAN(Count_5) /count_6 = MEAN(Count_6)
/count_7 = MEAN(Count_7) /count_8 = MEAN(Count_8) /count_9 = MEAN(Count_9)
/count_10 = MEAN(Count_10)
/N_BREAK=N.
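The reason a mean works here is that the mean of a 0/1 variable equals the share of ones. A minimal Python sketch of the same grouping logic follows (illustration only; the function aggregate_means and the six sample rows are hypothetical, and only two of the ten count variables are shown).

```python
from collections import defaultdict

# Illustration only: per-group means of 0/1 flags, plus the group size.
def aggregate_means(rows, break_key, flag_keys):
    """Group rows by break_key; return per-group flag means and N_BREAK."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[break_key]].append(row)
    result = {}
    for code, members in sorted(groups.items()):
        summary = {k: sum(m[k] for m in members) / len(members) for k in flag_keys}
        summary["N_BREAK"] = len(members)
        result[code] = summary
    return result

# Hypothetical data: four rows in group 1, two rows in group 2.
rows = [
    {"Brand": 1, "count_1": 1, "count_2": 0},
    {"Brand": 1, "count_1": 1, "count_2": 1},
    {"Brand": 1, "count_1": 0, "count_2": 0},
    {"Brand": 1, "count_1": 1, "count_2": 0},
    {"Brand": 2, "count_1": 0, "count_2": 1},
    {"Brand": 2, "count_1": 1, "count_2": 1},
]
table = aggregate_means(rows, "Brand", ["count_1", "count_2"])
print(table[1])  # {'count_1': 0.75, 'count_2': 0.25, 'N_BREAK': 4}
```

Three of the four rows in group 1 have count_1 = 1, so the mean is 0.75, i.e. 75% top two box.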
Your data would now look something like this:

6. Stack the Data
We are almost done! The next step is to stack the count variables on top of each other. Because Aggregate wrote its results to a new data file, open that file first. We can then use the VARSTOCASES function again to make a new variable from multiple variables and to simplify the dataset down to only the variables we need using the ‘/KEEP’ option. Notice that the values for the variable Attribute are not numeric: they are the names of the stacked variables (count_1 to count_10). You will need to convert this string variable into numeric format in order to perform a Correspondence Analysis. String comparisons are case sensitive, so the values listed in the RECODE must match the variable names exactly.
GET FILE="C:\Documents and Settings\yourname\My Documents\data.sav".
VARSTOCASES /MAKE T2B FROM count_1 count_2 count_3 count_4 count_5
count_6 count_7 count_8 count_9 count_10
/INDEX = Attribute(T2B)
/KEEP = Brand N_BREAK
/NULL = KEEP.
RECODE
Attribute (CONVERT)
('count_1'=1) ('count_2'=2) ('count_3'=3) ('count_4'=4) ('count_5'=5)
('count_6'=6) ('count_7'=7) ('count_8'=8) ('count_9'=9) ('count_10'=10)
INTO Attribute1 .
EXECUTE .
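The stacking step turns one row per Brand into one row per Brand and Attribute pair. Here is a minimal Python sketch of that reshaping (illustration only; the function stack_counts is hypothetical, the Brand 1 values are the two figures quoted at the start of this article, and the Brand 2 values are made up).

```python
# Illustration only: wide (one row per Brand) to long (one row per Brand/Attribute).
def stack_counts(aggregated, n_attributes):
    """Return one row per Brand/Attribute pair with its Top 2 Box proportion."""
    long_rows = []
    for row in aggregated:
        for i in range(1, n_attributes + 1):
            long_rows.append(
                {"Brand": row["Brand"], "Attribute1": i, "T2B": row[f"count_{i}"]}
            )
    return long_rows

aggregated = [
    {"Brand": 1, "count_1": 0.67, "count_2": 0.54},  # figures quoted in the introduction
    {"Brand": 2, "count_1": 0.50, "count_2": 0.25},  # hypothetical values
]
stacked = stack_counts(aggregated, n_attributes=2)
for row in stacked:
    print(row)
```

With seven groups and ten attributes, the stacked file would have 70 rows.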
Your data should now look something like this:

The only variables you will need for Correspondence Analysis are Brand, Attribute1 and T2B. Add value labels for your Brand variable and you will be ready to go.
7. Correspondence Analysis
You are now ready to perform a Correspondence Analysis.
Weight by T2B .
CORRESPONDENCE TABLE=Attribute1(1 10) BY Brand (1 7)
/DIMENSIONS=2
/MEASURE=CHISQ
/STANDARDIZE=RCMEAN
/NORMALIZATION=SYMMETRICAL
/PRINT=TABLE RPOINTS CPOINTS
/PLOT=NDIM(1,MAX) NONE .
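Conceptually, weighting by T2B makes each long-format row contribute its T2B value to one cell of the Attribute by Brand table, and the chi-square measure compares each cell with what independence of rows and columns would predict. The Python sketch below (illustration only, not SPSS's internal implementation) rebuilds such a table and computes its total inertia, the chi-square statistic divided by the grand total, which is the quantity the dimensions of a correspondence analysis decompose. The function names and the sample values are hypothetical.

```python
# Illustration only: rebuild the weighted two-way table and compute total inertia.
def build_table(long_rows, n_attributes, n_brands):
    """Sum T2B into an Attribute (rows) by Brand (columns) table."""
    table = [[0.0] * n_brands for _ in range(n_attributes)]
    for row in long_rows:
        table[row["Attribute1"] - 1][row["Brand"] - 1] += row["T2B"]
    return table

def total_inertia(table):
    """Chi-square statistic of the table divided by its grand total."""
    grand = sum(sum(row) for row in table)
    row_mass = [sum(row) / grand for row in table]
    col_mass = [sum(col) / grand for col in zip(*table)]
    inertia = 0.0
    for i, row in enumerate(table):
        for j, cell in enumerate(row):
            expected = row_mass[i] * col_mass[j]  # share expected under independence
            inertia += (cell / grand - expected) ** 2 / expected
    return inertia

long_rows = [
    {"Brand": 1, "Attribute1": 1, "T2B": 0.67},
    {"Brand": 1, "Attribute1": 2, "T2B": 0.54},
    {"Brand": 2, "Attribute1": 1, "T2B": 0.50},  # hypothetical values
    {"Brand": 2, "Attribute1": 2, "T2B": 0.25},
]
table = build_table(long_rows, n_attributes=2, n_brands=2)
print(table)
print(total_inertia(table))
```

A total inertia of zero means every group has the same attribute profile, in which case there is nothing for the map to show.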
Here is all of the code for you:
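* Step 2: recode the grouping variables into non-overlapping codes 1 to 7.
RECODE
Gender
(0=1) (1=2) INTO Genderr .
EXECUTE .
RECODE
Age
(1=3) (2=4) (Else=5) INTO Ager .
EXECUTE .
RECODE
cell7
(0=6) (1=7) INTO Pet .
EXECUTE .
* Step 3: restructure to one row per respondent per group.
VARSTOCASES /MAKE Brand FROM Genderr Ager Pet
/INDEX = Index1(Brand)
/KEEP = option_1 option_2 option_3 option_4 option_5 option_6 option_7
option_8 option_9 option_10
/NULL = KEEP.
SAVE OUTFILE="C:\Documents and Settings\yourname\data.sav"
/COMPRESSED.
* Step 4: flag top two box ratings (4 or 5) for each option.
compute count_1=0.
compute count_2=0.
compute count_3=0.
compute count_4=0.
compute count_5=0.
compute count_6=0.
compute count_7=0.
compute count_8=0.
compute count_9=0.
compute count_10=0.
do repeat x=option_1 to option_10 / y=count_1 to count_10.
if x=5 or x=4 y=1.
end repeat.
execute.
* Step 5: proportion of top two box ratings within each group.
AGGREGATE
/OUTFILE="C:\Documents and Settings\yourname\My Documents\data.sav"
/BREAK=Brand
/count_1 = MEAN(count_1) /count_2 = MEAN(count_2) /count_3 = MEAN(count_3)
/count_4 = MEAN(count_4) /count_5 = MEAN(count_5) /count_6 = MEAN(count_6)
/count_7 = MEAN(count_7) /count_8 = MEAN(count_8) /count_9 = MEAN(count_9)
/count_10 = MEAN(count_10)
/N_BREAK=N.
* Step 6: open the aggregated file and stack the count variables.
GET FILE="C:\Documents and Settings\yourname\My Documents\data.sav".
VARSTOCASES /MAKE T2B FROM count_1 count_2 count_3 count_4 count_5
count_6 count_7 count_8 count_9 count_10
/INDEX = Attribute(T2B)
/KEEP = Brand N_BREAK
/NULL = KEEP.
RECODE
Attribute (CONVERT)
('count_1'=1) ('count_2'=2) ('count_3'=3) ('count_4'=4) ('count_5'=5)
('count_6'=6) ('count_7'=7) ('count_8'=8) ('count_9'=9) ('count_10'=10)
INTO Attribute1 .
EXECUTE .
* Step 7: run the Correspondence Analysis.
Weight by T2B .
CORRESPONDENCE TABLE=Attribute1(1 10) BY Brand (1 7)
/DIMENSIONS=2
/MEASURE=CHISQ
/STANDARDIZE=RCMEAN
/NORMALIZATION=SYMMETRICAL
/PRINT=TABLE RPOINTS CPOINTS
/PLOT=NDIM(1,MAX) NONE .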