Analysing PIAAC Data with the IDB Analyzer (SPSS and SAS)

Sandoval-Hernandez, Andres; Carrasco, Diego

doi:10.1007/978-3-030-47515-4_6

Open Access
First Online: 28 July 2020

Part of the book series: Methodology of Educational Measurement and Assessment (MEMA)

Abstract

This chapter provides readers with a step-by-step guide to performing both simple and complex analyses with data from the Programme for the International Assessment of Adult Competencies (PIAAC) using the IEA International Database (IDB) Analyzer. The IDB Analyzer is a Windows-based tool that generates SPSS and SAS syntax. Using this syntax, corresponding analyses can be conducted in SPSS and SAS. The chapter presents the data-merging module and the analysis module. Potential analyses with the IDB Analyzer are demonstrated—for example, the calculation of percentages, averages, proficiency levels, linear regression, correlations, and percentiles.

This chapter describes the general use of the International Association for the Evaluation of Educational Achievement’s (IEA) International Database Analyzer (IDB Analyzer) for analysing PIAAC data (IEA 2019). The IDB Analyzer provides a user-friendly interface to easily merge the data files of the different countries participating in PIAAC. Furthermore, it seamlessly takes into account the sampling information and the multiple imputed achievement scores to produce accurate statistical results (see Chap. 2 in this volume for details about PIAAC’s complex sample and assessment design).

This chapter is subdivided into three main sections. In the first section, we will provide a brief overview of the software.^{Footnote 1} Sections 6.2 and 6.3 will be dedicated to the Merge and Analysis modules of the IDB Analyzer, respectively. For each of these two sections, we will provide a description of the functionalities of the respective modules and examples to illustrate some of the capabilities of the IDB Analyzer (Version 4.0) to merge files and to compute a variety of statistics, including the calculation of percentages, averages, benchmarks (proficiency levels), linear regression, logistic regression, correlations, and percentiles.

6.1 The IDB Analyzer

Developed by IEA Hamburg, the IDB Analyzer is an interface that creates syntax for SPSS (IBM 2013) and SAS (SAS 2012). The IDB Analyzer was originally designed to allow users to combine and analyse data from IEA’s large-scale assessments, but it has been adapted to work with data from most major large-scale assessment surveys, including those conducted by the Organisation for Economic Co-operation and Development (OECD), such as the Programme for the International Assessment of Adult Competencies (PIAAC), the Programme for International Student Assessment (PISA), and the Teaching and Learning International Survey (TALIS).

The IDB Analyzer generates SPSS or SAS syntax files that take into account information from the complex sampling design of the study to produce population estimates. In addition, the generated syntax makes appropriate use of plausible values for calculating estimates of achievement scores, combining both sampling variance and imputation variance. Considering PIAAC’s complex sample and complex assessment design, using either SPSS or SAS to analyse PIAAC data without the IDB Analyzer would require the user to have programming knowledge in order to create their own macros. The IDB Analyzer automatically generates these macros (syntax files) in a user-friendly environment that allows their customisation according to the purposes of the intended analysis.
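To make the idea of replicate-based sampling variance more concrete, the following is a minimal Python sketch. It is not the syntax that the IDB Analyzer generates, and the function names, the toy data, and the `factor` argument are assumptions made for illustration only. The sketch computes a weighted mean with the full sample weight, recomputes it once per set of replicate weights, and sums the squared deviations; the multiplier applied to that sum depends on the replication method used in each country and is therefore left as a parameter.

```python
from typing import Sequence


def weighted_mean(values: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted mean of `values` using survey weights `weights`."""
    total_weight = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total_weight


def replicate_variance(
    values: Sequence[float],
    full_weights: Sequence[float],
    replicate_weights: Sequence[Sequence[float]],
    factor: float = 1.0,
) -> float:
    """Sampling variance of the weighted mean from replicate weights.

    `replicate_weights[r][i]` is the r-th replicate weight of respondent i.
    `factor` is a method-specific multiplier (an assumption in this sketch;
    it depends on the replication scheme applied to the data).
    """
    full_estimate = weighted_mean(values, full_weights)
    squared_diffs = [
        (weighted_mean(values, rep) - full_estimate) ** 2
        for rep in replicate_weights
    ]
    return factor * sum(squared_diffs)


# Toy data (hypothetical): four respondents and two sets of replicate weights.
scores = [250.0, 270.0, 290.0, 310.0]
final_weights = [1.0, 1.0, 1.0, 1.0]
replicates = [
    [2.0, 0.0, 1.0, 1.0],
    [1.0, 1.0, 2.0, 0.0],
]

estimate = weighted_mean(scores, final_weights)
sampling_variance = replicate_variance(scores, final_weights, replicates)
print(estimate, sampling_variance)
```

With real data, the same logic would be applied with the full set of replicate weights supplied in the data file, which is the repetitive work that the generated macros take over from the user.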

The IDB Analyzer consists of two modules: the merge module and the analysis module. These two modules are integrated and executed in one common application. When working with PIAAC data, the merge module is used to create analysis datasets by combining data files from different countries and selecting subsets of variables for analysis. The analysis module provides procedures for computing various statistics and their standard errors.

Once the IDB Analyzer application is launched,^{Footnote 2} the main window will appear, as shown in Fig. 6.1. Users then have the option of choosing either SPSS or SAS as their statistical software. For the examples in this chapter, we will use SPSS. The main window also has options to select the ‘Merge Module’, the ‘Analysis Module’, or the ‘Help Manual’, or to exit the application.

There are at least two ways to access guidance on how to use the IDB Analyzer: video tutorials made by IEA and the main ‘Help’ manual that accompanies the software installation. An easy way to get started with the IDB Analyzer is to watch the IEA video tutorials. These have been made available at the following link: https://www.iea.nl/training#IDB_Analyzer_Video_Tutorials.

These videos have been shared via YouTube; they cover step-by-step examples of how to estimate correlations, percentiles, percentages and means, logistic regression, linear regression, and benchmarks.

A second way to get help and guidance is to consult the ‘Help’ manual via the main menu in the IDB Analyzer. This official manual can be accessed by clicking on the third button present in the main menu. Figure 6.1 shows what this main menu looks like.

The IDB Analyzer will work on most IBM-compatible computers using current Microsoft Windows^{Footnote 3} operating systems. The IDB Analyzer is licensed free of charge and may be used only in accordance with the terms of the licensing agreement. While the IDB Analyzer is free, the user must own a valid licence for at least one of the software packages used as the statistical engine (i.e., SPSS Version 18 or later or SAS Version 9 or later). Additionally, the user should have a valid licence for Microsoft Excel 2003 or a later version (as outputs are also produced in this format). The IDB Analyzer licence expires at the end of each calendar year. Every year, users therefore have to download and reinstall the most current version of the software and agree to the terms and conditions of the new licence.

6.2 Merging Files with the IDB Analyzer

PIAAC Public Use Files containing both responses to the background questionnaire and the cognitive assessment are available for downloading for each of the participant countries separately. The Merge Module of the IDB Analyzer allows users to combine datasets from more than one country into a single data file for cross-country analyses. For the purposes of this chapter, we will assume all data files have been copied within a folder named ‘C:\Data\PIAAC\’. PIAAC data files are available in both SPSS and SAS from the PIAAC website.^{Footnote 4} Users should download the data files in the format of their preference.

The Merge Module recognises the data files for PIAAC by reading the file names in the selected directory and matching them to the file-naming convention pre-specified in the IDB Analyzer configuration files. For this reason, in order to ensure that the IDB Analyzer will correctly identify the different files contained in the PIAAC datasets, as well as the user-generated files:

Users should not change the name of the files once downloaded from the PIAAC website.
Users should not save the merged file in the same directory where the source files are located.
Users should keep files from different studies and years in separate directories.

The following steps will create an SPSS or SAS data file with data from multiple countries and/or multiple file types:

1.
Open the IDB Analyzer.
2.
Select the statistical software you want to work with (choose between SAS and SPSS).
3.
Select the Merge Module of the IDB Analyzer.
4.
Click the Merge Module button. The Merge Module interface is divided into two different tabs. In the first one, you can select the countries and edit country labels. In the second tab, you can select the variables you want to include in your analysis and specify the name of the merged file.
5.
Under the ‘Select Data Files and Participants’ tab and in the ‘Select Directory’ field, browse to the folder where all data files are located. For example, in Fig. 6.2, all SPSS data files are located in the folder ‘C:\Data\PIAAC\’. The program will automatically recognise and complete the ‘Select Study’ and ‘Select Cycle’ fields and list all countries available in this folder as possible candidates for merging.
Fig. 6.2
IDB Analyzer merge module: select data files and participants
6.
Click the countries of interest from the ‘Available Participants’ list and click the right arrow button (▹) to move them to the ‘Selected Participants’ panel on the right. Individual countries can also be moved directly to the ‘Selected Participants’ panel by double-clicking on them. To select multiple countries, hold the CTRL-key of the keyboard when clicking on countries. Click the tab-right arrow button (⊵) to move all countries to the ‘Selected Participants’ panel. For this example, we selected all the countries available.
7.
Click the ‘Next >’ button to proceed to the next step. The software will open the ‘Select File Types and Variables’ tab of the merge module (see Fig. 6.3), to select the file types and the variables to be included in the merged data file.
8.
Select the files for merging by checking the appropriate boxes to the left of the window. For example, in Fig. 6.3, the ‘General Response File’ has been selected.^{Footnote 5} Checking this box will automatically populate the ‘Selected Variables’ panel with the three scores available in PIAAC (i.e. Literacy Scale Score, Numeracy Scale Score, and Problem-Solving Scale Score), as well as with all the ID (e.g. Country ID) and sampling variables (e.g. sampling and replicate weights) needed for the corresponding analyses (Fig. 6.4).

9.
Select the variables of interest from the ‘Available Variables’ list in the left panel. In SPSS, you can right-click on the variable names to open a menu with details about each of the available variables (i.e. variable name, label, measurement level, and value labels). Variables are selected by clicking on them and then clicking the right arrow (▹) button. Clicking the tab-right arrow (⊵) button selects all variables (Fig. 6.5).
Fig. 6.5
IDB Analyzer merge module: selecting all variables
10.
When selecting the variables, you can search variables by variable name or by variable label using the filter boxes (blue space between column header and list of variables) in the ‘Available Variables’ list and ‘Selected Variables’ list.
11.
Note that the IDB Analyzer assumes that files have the same structure and the variables have the same properties (e.g. variables, formats, labels) in each of these files. Any deviation from this can cause unexpected results. Should you want to modify the contents of a file for a country, or a set of countries, it is recommended to do this on the resulting merged file, after the merge is completed.
12.
In the ‘Output Files’ field, click on the ‘Define’ button to specify the name for the merged data file and the folder where it will be saved. The IDB Analyzer will also create an SPSS syntax file (*.SPS) (or a SAS syntax file, *.SAS, if you are using this software) of the same name and in the same folder with the code necessary to perform the merge. In the example shown in Fig. 6.3, the merged data file ‘merge_piaac.sav’ and the syntax file ‘merge_piaac.sps’ will both be created and stored in the folder titled ‘C:\Data\’. The merged data file will contain all the variables listed in the ‘Selected Variables’ panel, and if all available variables were selected, the resulting merged file should be about 622 megabytes in size.
13.
Click the ‘Start SPSS’ button to create the SPSS syntax file. An SPSS Syntax Editor window with the created syntax code will be automatically opened. The syntax file can be executed by opening the ‘Run’ menu of SPSS and selecting the ‘All’ menu option. Alternatively, you can also submit the code for processing with the keystrokes Ctrl+A (to select all), followed by Ctrl+R (to run the selection). In SAS, the syntax file can be executed by selecting the ‘Submit’ option from the ‘Run’ menu.

Once SPSS or SAS has completed its execution, it is important to check the SPSS output window or SAS log for possible warnings. If warnings appear, they should be examined carefully because they might indicate that the merge process was not performed properly and that the resulting merged data file might not include all the relevant variables or countries.
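Beyond reading the log, a quick programmatic check of the merged file can confirm that the expected countries and variables are present. The sketch below is a hedged illustration in Python: it assumes that the merged records have already been loaded into a list of dictionaries (reading a ‘.sav’ file into Python requires a third-party reader, which is not shown), that the country identifier is stored in a variable named ‘CNTRYID’, and that the numeric country codes used in the toy data are merely placeholders. The helper name `check_merge` is hypothetical.

```python
from typing import Any, Dict, Iterable, List


def check_merge(
    records: List[Dict[str, Any]],
    expected_countries: Iterable[Any],
    expected_variables: Iterable[str],
    country_key: str = "CNTRYID",  # assumed name of the country ID variable
) -> Dict[str, list]:
    """Report expected countries and variables that are absent from `records`."""
    found_countries = {row[country_key] for row in records if country_key in row}
    found_variables = set()
    for row in records:
        found_variables.update(row.keys())
    return {
        "missing_countries": sorted(set(expected_countries) - found_countries),
        "missing_variables": sorted(set(expected_variables) - found_variables),
    }


# Toy merged data (hypothetical values): two countries, three variables.
merged_rows = [
    {"CNTRYID": 40, "GENDER_R": 1, "PVLIT1": 271.3},
    {"CNTRYID": 40, "GENDER_R": 2, "PVLIT1": 265.8},
    {"CNTRYID": 56, "GENDER_R": 1, "PVLIT1": 280.1},
]

report = check_merge(
    merged_rows,
    expected_countries=[40, 56, 124],
    expected_variables=["CNTRYID", "GENDER_R", "PVLIT1"],
)
print(report)
```

An empty list under both keys would suggest that the merge covered everything that was requested; any entry points to a country or variable worth investigating before starting the analysis.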

6.3 Example Analyses with the IDB Analyzer

In the following section, we will describe step-by-step instructions to produce means, percentiles, percentages, linear regressions, correlations, and benchmarks, using the latest PIAAC public-use data files. In each subsection, a sequence of steps will be included as a numbered list. These steps are reiterated for each analysis routine. In this way, each subsection is self-contained, and the reader does not need to consult any other part of the chapter to complete the steps she or he needs to follow to produce means, percentiles, percentages, linear regressions, correlations, or benchmarks.

6.3.1 Means with Plausible Values

In this section, we illustrate how to estimate the means of literacy scores by country. The first example contains a variable with plausible values. In PIAAC there are three variables with plausible values: the literacy scale score, the numeracy scale score, and the problem-solving scale score. Each of these variables consists of ten different columns of values within the PIAAC dataset. For each test, plausible values are generated as random draws from the posterior distribution of the participant’s proficiency (Wu 2005). To produce population estimates with these scores, the IDB Analyzer computes the results for each plausible value and combines these estimates using Rubin-Shaffer rules (Rutkowski et al. 2010). The following steps produce mean estimates of literacy proficiency by country, for females and males:

1.
Open the IDB Analyzer.
2.
Select the statistical software you want to work with (choose between SAS and SPSS).
3.
Open the Analysis Module of the IDB Analyzer.
4.
For this example, specify the data file ‘merge_piaac.sav’ as the Analysis File (see Sect. 6.2 in this chapter for details on how this file was created).
5.
Select ‘PIAAC (using final full sample weight)’ as the Analysis Type.
6.
Select ‘Percentages and Means’ as the Statistic Type.
7.
Under the ‘Plausible Values Options’, select ‘Use PVs’.
8.
Click on the ‘Separate Tables by’ section at the right-hand side of the software window. This section will become active and highlighted in light yellow.
9.
Go to the ‘Select variables’ section, and click on the ‘GENDER_R’ variable in the fourth row of the name list.
10.
Drag the ‘GENDER_R’ variable to the ‘Separate Tables by’ section.
11.
Click on the ‘Plausible Values’ section at the right-hand side of the software window. This section will become active and highlighted in light yellow.
12.
Go to the ‘Select variables’ section and click on the ‘PVLIT1–10’ variable in the first row of the name list.
13.
Drag the ‘PVLIT1–10’ variable to the ‘Plausible Values’ section.
14.
The Weight Variable is automatically selected by the software. SPFWT0 is selected by default; this variable contains the final full sample weight.
15.
Specify the name and the folder of the output files in the ‘Output Files’ field by clicking the Define/Modify button. For this example, we use the term ‘mean_with_pv’.

After completing these steps, the setup should look similar to Fig. 6.6:

16.
Then, click the ‘Start SPSS’ button. This will create an SPSS syntax file and open it in an SPSS editor window.
17.
To start the computations, press the following key combinations: first CTRL+A, to select all of the generated code in the syntax window, and then CTRL+R, to run these commands. The output of these analyses is depicted in Fig. 6.7.

In the generated output, the first column contains the list of countries. The second column presents the categorical values of the ‘GENDER_R’ variable: ‘Male’ and ‘Female’. In the third column, the nominal sample size is presented for each group, within each country. In the fourth column, the sum of survey weights is included. These latter numbers represent the survey population to which the estimates are projected (Heeringa et al. 2009). Additionally, the IDB Analyzer generates standard errors for the survey population size (fifth column). In the ‘Percent’ column, the estimate of the proportion of each group in the population is presented. These point estimates are accompanied by their standard errors in the ‘Percent (s.e.)’ column. In the column ‘PVLIT (Mean)’, we find the point estimates of the literacy scores. Each country has two values, one for males and one for females. These point estimates are subject to uncertainty due to both measurement error and sampling error. This uncertainty is summarised in the ‘PVLIT (s.e.)’ column. Standard deviations of the literacy scores are included in the ‘Std.Dev’ column. As with the previous estimates, the standard errors of the standard deviations are provided in the adjacent column to the right, ‘Std.Dev. (s.e.)’. Finally, the last column, ‘pctmiss’, contains the percentage of missing cases in the variables involved in the analysis (‘PVLIT1–10’ and ‘GENDER_R’).
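The way in which a single mean and a single standard error arise from ten plausible values can be sketched in a few lines of Python. The example below is an illustration under stated assumptions rather than a reproduction of the IDB Analyzer’s macros: it applies the standard multiple-imputation combination rules, averaging the sampling variances across plausible values and adding the between-plausible-value variance inflated by (1 + 1/M). The function name and the toy numbers are hypothetical, and the generated syntax may differ in detail.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence, Tuple


def combine_plausible_values(
    pv_estimates: Sequence[float],
    pv_sampling_variances: Sequence[float],
) -> Tuple[float, float]:
    """Combine per-plausible-value results into one estimate and standard error.

    `pv_estimates[k]` is the statistic (e.g. a weighted mean) computed with
    the k-th plausible value; `pv_sampling_variances[k]` is its sampling
    variance (e.g. obtained from replicate weights).
    """
    m = len(pv_estimates)
    point_estimate = mean(pv_estimates)
    # Average sampling variance across the plausible values.
    within = mean(pv_sampling_variances)
    # Variance between the estimates (divisor m - 1): imputation variance.
    between = variance(pv_estimates)
    total_variance = within + (1 + 1 / m) * between
    return point_estimate, sqrt(total_variance)


# Toy input (hypothetical): three plausible values instead of ten.
estimates = [270.0, 272.0, 274.0]
sampling_variances = [4.0, 4.0, 4.0]

pv_mean, pv_se = combine_plausible_values(estimates, sampling_variances)
print(pv_mean, pv_se)
```

In terms of the output described above, the first returned value corresponds conceptually to the ‘PVLIT (Mean)’ column and the second to the ‘PVLIT (s.e.)’ column, which reflects both sampling error and measurement error.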

The IDB Analyzer creates six files after an analysis of means with plausible values is complete. Table 6.1 details these files and their content.

Table 6.1 Generated files by an analysis of means
