How to run a Multiple Correspondence Analysis (MCA) with XLSTAT?





Published on Nov 23, 2010

This tutorial and the data are available on our website: http://www.xlstat.com/en/support/tuto....

Multiple Correspondence Analysis (MCA) is a method for studying the association between two or more qualitative variables. MCA is to qualitative variables what Principal Component Analysis is to quantitative variables. It produces maps on which one can visually inspect the distances between the categories of the qualitative variables and between the observations.

An Excel sheet containing both the data and the results used in this tutorial can be downloaded by clicking here. The data come from a survey conducted by a car dealer, in which 28 customers were asked five questions one week after picking up their car following a mechanical repair. The questions were:
- Are you globally satisfied by the service? (Yes/No)
- Do you consider the problem is solved? (Yes/No/Don't know)
- How good was the welcome? (1 to 5)
- Is the quality/price ratio satisfactory? (Yes/No)
- Will you use our services again? (Yes/No/Don't know)

By running a Multiple Correspondence Analysis (MCA), we want to identify the relationships between the various possible answers to the questions.

After opening XLSTAT, select the XLSTAT/Analyzing data/Multiple Correspondence Analysis command, or click on the corresponding button of the "Analyzing data" toolbar (see below).

Once you've clicked on the button, the Multiple Correspondence Analysis dialog box appears. Here the data format is "Observations/Variables". We select the data on the Excel sheet using the column selection method: just click on the names of the columns you want to select (see the tutorial on how to select data for more information on this topic). The "Observations labels" are selected in the corresponding field, and the "Variable labels" option is left activated because the first row of the table contains the names of the variables.

In the "Options" tab we activate the "Supplementary data" option and then go to the corresponding tab: the "Come back" variable is used as a supplementary variable because we don't want it to influence the computations; however, we want to know how the categories of this variable are positioned on the correspondence map.

The 1/p option is our filtering choice: the detailed results for factors whose eigenvalue is less than 1/p (where p is the number of active qualitative variables) will not be displayed.
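As a rough illustration of the 1/p rule, the sketch below keeps only the factors whose eigenvalue is at least 1/p. The eigenvalues are invented for the example (they are not the tutorial's results), and `filter_factors` is a hypothetical helper, not an XLSTAT function. The value p = 4 assumes four active variables, since "Come back" is supplementary.

```python
def filter_factors(eigenvalues, p):
    """Return (index, eigenvalue) pairs whose eigenvalue is at least 1/p."""
    threshold = 1 / p
    return [(i + 1, lam) for i, lam in enumerate(eigenvalues) if lam >= threshold]


# Hypothetical eigenvalues, NOT taken from the tutorial's output.
example_eigenvalues = [0.60, 0.40, 0.30, 0.22, 0.18, 0.14, 0.10, 0.06]
p_active = 4  # assumed: four active variables, "Come back" being supplementary

kept_factors = filter_factors(example_eigenvalues, p_active)
print(kept_factors)
```

With these made-up values the threshold is 0.25, so only the first three factors would have their detailed results displayed.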

The following "Outputs" and "Charts" options have been activated.

The computations begin once you have clicked on "OK". The results will then be displayed. The first results displayed are the tables used for the computations (full disjunctive table, Burt's table).
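To make these two tables concrete, here is a minimal sketch in plain Python that builds a full disjunctive table (one 0/1 column per category) and the corresponding Burt's table (the cross-product of the disjunctive table with itself). The four toy responses and the variable names are invented for illustration, and the helper functions are hypothetical. This is not how XLSTAT computes them internally.

```python
def disjunctive_table(rows, variables):
    """Build a 0/1 indicator table with one column per (variable, category)."""
    columns = []
    for j, name in enumerate(variables):
        for category in sorted({row[j] for row in rows}):
            columns.append((j, name, category))
    table = [[1 if row[j] == category else 0 for j, _, category in columns]
             for row in rows]
    labels = [(name, category) for _, name, category in columns]
    return labels, table


def burt_table(table):
    """Cross-tabulate every category against every other (Z transposed times Z)."""
    n_cols = len(table[0])
    return [[sum(row[a] * row[b] for row in table) for b in range(n_cols)]
            for a in range(n_cols)]


# Toy data, NOT the survey from the tutorial.
toy_variables = ["Satisfied", "Solved"]
toy_rows = [("Yes", "Yes"), ("No", "Don't know"), ("Yes", "No"), ("Yes", "Yes")]

labels, z_table = disjunctive_table(toy_rows, toy_variables)
burt = burt_table(z_table)

for label, line in zip(labels, burt):
    print(label, line)
```

Each row of the disjunctive table sums to the number of variables, and the diagonal of Burt's table holds the category frequencies.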

The total inertia is equal to 2. It depends only on the number of variables and categories, not on the linkage between the variables. It therefore has no statistical interpretation.
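The value of 2 can be checked by hand. For an MCA on the full disjunctive table, the total inertia is J/p - 1, where J is the total number of categories and p the number of active variables. The sketch below assumes that every listed answer was given at least once and that the four questions other than "Come back" are active. The short variable names are labels chosen for this example.

```python
# Category counts read from the list of questions; names are illustrative labels.
categories_per_variable = {
    "Satisfaction": 2,   # Yes / No
    "Solved": 3,         # Yes / No / Don't know
    "Welcome": 5,        # 1 to 5
    "Quality/price": 2,  # Yes / No
}

p = len(categories_per_variable)           # number of active variables
J = sum(categories_per_variable.values())  # total number of categories

total_inertia = J / p - 1      # depends only on J and p
n_nonnull_eigenvalues = J - p  # maximum number of non-null eigenvalues

print(total_inertia, n_nonnull_eigenvalues)
```

Under these assumptions the count of non-null eigenvalues is J - p = 8, which agrees with the eigenvalue table.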

The next table shows the eight non-null eigenvalues and the corresponding percentages of inertia. However, unlike in CA (correspondence analysis performed on only two variables), the percentages of inertia are here pessimistic estimates of the quality of the representation, that is, of how close the representation is to reality. Greenacre et al (2005) suggested an adjusted inertia which gives a better idea of the quality of the maps. We see here that while the usual computation gives only 46.6% for the first two axes, the method based on the adjusted inertia gives 87.3%.
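The adjustment can be sketched as follows, assuming the commonly cited form of Greenacre's correction: each eigenvalue λ above 1/p is replaced by (p/(p-1))² (λ - 1/p)², and percentages are taken relative to (p/(p-1)) (Σλ² - (J-p)/p²). Whether XLSTAT uses exactly this denominator is not stated in the text, so treat that as an assumption. The eigenvalues below are invented and will not reproduce the 46.6% and 87.3% figures.

```python
def adjusted_inertia(eigenvalues, p, n_categories):
    """Return (adjusted eigenvalues, adjusted total), per the assumed formula."""
    threshold = 1 / p
    scale = p / (p - 1)
    adjusted = [scale ** 2 * (lam - threshold) ** 2
                for lam in eigenvalues if lam > threshold]
    total = scale * (sum(lam ** 2 for lam in eigenvalues)
                     - (n_categories - p) / p ** 2)
    return adjusted, total


# Hypothetical eigenvalues summing to 2, NOT the tutorial's results.
demo_eigenvalues = [0.60, 0.40, 0.30, 0.22, 0.18, 0.14, 0.10, 0.06]
adjusted, adjusted_total = adjusted_inertia(demo_eigenvalues, p=4, n_categories=12)

raw_share = sum(demo_eigenvalues[:2]) / sum(demo_eigenvalues)
adjusted_share = sum(adjusted[:2]) / adjusted_total
print(f"raw: {raw_share:.1%}, adjusted: {adjusted_share:.1%}")
```

Even with made-up numbers, the adjusted share of the first two axes comes out well above the raw share, which is the effect described above.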

The % displayed on the scree plot is based on the adjusted inertia.

Then, a table displays the coordinates of the categories in the factor space. The results for the supplementary variable are displayed in blue. The coordinates of the observations are displayed further down. The contributions, the test values and the squared cosines help in interpreting the results. Before concluding that two categories are close on the map, one should check that their contributions to the axes of the map, or their squared cosines, are high.

