dETECT: A Model for the Evaluation of Instructional Units for Teaching Computing in Middle School

The objective of this article is to present the development and evaluation of dETECT (Evaluating TEaching CompuTing), a model for the evaluation of the quality of instructional units for teaching computing in middle school based on the students’ perception collected through a measurement instrument. The dETECT model was systematically developed and evaluated based on data collected from 16 case studies in 13 different middle school institutions with responses from 477 students. Our results indicate that the dETECT model is acceptable in terms of reliability (Cronbach’s alpha α=.787) and construct validity, demonstrating an acceptable degree of correlation found between almost all items of the dETECT measurement instrument. These results allow researchers and instructors to rely on the dETECT model in order to evaluate instructional units and, thus, contribute to their improvement and to direct an effective and efficient adoption of teaching computing in middle school.


Introduction
Teaching computing through summer camps, clubs or family workshops is a worldwide trend (Gresse von Wangenheim and Wangenheim, 2014). There are several initiatives to teach computing, such as Code.org (http://www.code.org), Code.club (https://www.codeclubworld.org), and Computing at Schools (http://www.computacaonaescola.ufsc.br/), among others. These initiatives are expected to contribute to the popularization of computing competencies as well as to the awareness and interest of students towards computing (Guzdial et al., 2014; Garneli et al., 2015).
Taking into consideration the growing number of alternative instructional units (IUs) for teaching computing, it is important to obtain evidence on their expected benefits as a basis for their systematic selection, adoption and improvement (Decker et al., 2016). Following Guzdial (2004), a main contribution to this knowledge area is not necessarily the development of new programming environments or instructional units, but finding out how to study the existing ones. A more precise understanding of the results of using these instructional units would make it possible to know whether they in fact contribute positively to the achievement of the learning goals and compensate for the cost involved in their adoption. However, although there is evidence that existing IUs can improve the teaching and learning process in middle school, and although they are being used more widely in schools worldwide, there is little research analyzing the contribution that these IUs can bring to education (Decker et al., 2016).
Currently, the evaluation of the quality of IUs is limited or sometimes even non-existent (Decker et al., 2016; Garneli et al., 2015). In many cases, a decision about the use of IUs is based on assumptions of their effectiveness (Gross and Powers, 2005; Wilson et al., 2010). On the other hand, some studies focus on specific quality factors only, such as learning improvement (Gross and Powers, 2005; Kalelioğlu and Gülbahar, 2014). Other studies focus on the effectiveness of visual block-based programming languages (Weintrop and Wilensky, 2015; Grover et al., 2014; Perdikuri, 2014). However, students' perceptions and intentions are also determining factors for successful learning (Giannakos et al., 2013). Yet, few evaluations take into consideration aspects such as motivation and the students' experience during the instructional unit (Craig and Horton, 2009; Giannakos et al., 2014), or students' attitudes toward technology acceptance (Giannakos et al., 2013). In addition, studies that measure students' attitude toward computing are mostly designed for higher education and seem to be outdated in the current context of teaching computing in schools (Garland and Noyes, 2008).
The measurements used to evaluate the quality of IUs to teach computing vary widely, ranging from generic scales of students' attitudes toward computing to measurement instruments developed in an ad-hoc way. Many measurements are developed without the definition of a model to derive the items of the measurement instrument based on theoretical constructs, which may make the validity of the results questionable. Thus, there is currently a lack of systematically developed and evaluated evaluation models and/or measurement instruments that are widely accepted to evaluate the quality of IUs for teaching computing in schools. However, such evaluation models have to take into consideration the characteristics of such IUs, which are typically performed more informally, for example, as programming workshops for parents and children outside the school environment. In such a context, it may be impracticable to carry out experiments that require pre-tests and the inclusion of control groups, as these cause a major interruption and influence the fun factor of the workshop. A more viable alternative may be to conduct case studies, in which the evaluation of the IU is performed only at the end of the workshop/course (post-test), typically through a questionnaire to obtain the students' perceptions (Wohlin et al., 2012). An advantage of this study type is that the evaluation can be performed with little effort and in a non-intrusive way at the end of the instructional unit. Studies based on the measurement of perceptions using questionnaires are conducted in a variety of research areas, providing reliable, valid and useful information (DeVellis, 2016; Takatalo et al., 2010; Sweetser and Wyeth, 2005; Poels et al., 2007).
Thus, the objective of this article is to present the development and evaluation of dETECT (Evaluating TEaching CompuTing), a model for the evaluation of the quality of instructional units for teaching computing in schools based on the students' perception.

Research Method
In order to develop a model for the evaluation of instructional units for teaching computing, applied research was carried out (Miller and Salkind, 2002), divided into four stages (Fig. 1): a literature review (Stage 1), the development of the dETECT evaluation model (Stage 2), the design of the measurement instrument (Stage 3), and the application and evaluation of the measurement instrument (Stage 4).
Stage 2. Development of the dETECT evaluation model. Based on the results of the literature review, we systematically developed the dETECT evaluation model for measuring the quality of instructional units for teaching computing based on the perceptions of the students and their parents. To this end, we used GQM (Goal Question Metric) (Basili et al., 1994), a popular approach to measure diverse quality attributes. Using GQM, we systematically defined the evaluation objective(s) and decomposed the objective into analysis questions and measures.
Stage 3. Design of the measurement instrument. In order to operationalize the measurement, a questionnaire was developed by a multidisciplinary team, based on methods for scale and questionnaire development (DeVellis, 2016; Krosnick and Presser, 2010; Malhotra, 2008; Kasunic, 2005). For each of the defined measures, questionnaire items were defined, also based on similar studies found in the literature that were considered adherent to the context of this study and to the defined measurement plan. The questionnaire was revised and piloted with a small sample of the target audience.
Stage 4. Application and evaluation of the measurement instrument. A case study (Yin, 2009; Wohlin et al., 2012) was conducted in order to evaluate the measurement instrument in terms of reliability and construct validity. For the definition of the evaluation, we used the GQM approach (Basili et al., 1994). The objective of the study was decomposed into quality factors and analysis questions, also in accordance with methods for scale development (Carmines and Zeller, 1979; DeVellis, 2016; Trochim and Donnelly, 2008). During the case study, the dETECT measurement instrument was applied as part of the evaluation of 16 computing courses/workshops carried out in different educational institutions, collecting the required data. The pooled data were analyzed in order to answer our analysis questions, following the definitions of Trochim and Donnelly (2008) and the scale development guide proposed by DeVellis (2016).
In terms of reliability, internal consistency is typically measured based on the correlations between different items of the same measurement instrument (Carmines and Zeller, 1979; Trochim and Donnelly, 2008). Internal consistency is usually measured through Cronbach's alpha, a popular method to assess the reliability of a measurement instrument (Carmines and Zeller, 1979).
In terms of construct validity, convergent and discriminant validity are the two subtypes of validity that make up construct validity (Trochim and Donnelly, 2008). Convergent validity refers to the degree to which two items of quality factors that theoretically should be related are in fact related. In contrast, discriminant validity tests whether concepts or measurements that are supposed to be unrelated are in fact unrelated (Trochim and Donnelly, 2008). In order to analyze the convergent and discriminant validity of the dETECT measurement instrument, the intercorrelations of the items and the item-total correlations are calculated (DeVellis, 2016). Intercorrelation refers to the degree of correlation between the items of a measurement instrument (Carmines and Zeller, 1979; DeVellis, 2016). The higher the correlations among items that measure the same quality factor, the higher the validity of the individual items and, hence, the validity of the instrument as a whole. Item-total correlation is analyzed in order to check whether any item in the measurement instrument is inconsistent with the averaged correlation of the others and can thus be discarded (Carmines and Zeller, 1979; DeVellis, 2016).
In addition, we used factor analysis to determine how many factors underlie the set of items of the dETECT measurement instrument, following the analysis process proposed by Brown (2006). Each factor is defined by those items that are more highly correlated with each other than with other items. A statistical indication of the extent to which each item is correlated with each factor is given by the factor loading: the higher the factor loading, the more the particular item contributes to the given factor. Factor analysis thus also explicitly takes into consideration the fact that the items measure a factor unequally (Carmines and Zeller, 1979).
This research was approved by the Ethics Committee of the Federal University of Santa Catarina (No. 1021541).

The Evaluation Model dETECT (Evaluating TEaching CompuTing)
The objective of the dETECT model is to analyze instructional units in order to evaluate their quality in terms of the quality of the IU, the computing experience and the perception of learning, from the learners' perspective, in the context of teaching computing in middle school. From this objective, the analysis questions and measures are derived based on the literature (Fig. 2) (Keller, 1987; Sweetser and Wyeth, 2005; Poels et al., 2007; Takatalo et al., 2010; Ericson and McKlin, 2012; Tangney et al., 2010; Wiebe et al., 2003; Papastergiou, 2008; Sanchez-Franco, 2010; Giannakos et al., 2013; Makris et al., 2013; Shih, 2008; Sivilotti and Laugel, 2008; Lai and Lai, 2012; Lee et al., 2009; Savi et al., 2012; Kwon et al., 2012).
In general, following the definition proposed by Wiggins and McTighe (2005), an IU is a set of lessons carefully designed to collectively achieve a selected group of learning objectives for a target audience. The unit consists of a coherent set of materials designed to support student learning in a specific educational context and offers goals, assessment tasks, instruction, implementation procedures, and resources. However, due to the lack of a definition of an IU for teaching computing in schools, and based on the literature review (Keller, 1987; Sweetser and Wyeth, 2005; Poels et al., 2007; Takatalo et al., 2010; Ericson and McKlin, 2012; Tangney et al., 2010; Wiebe et al., 2003; Papastergiou, 2008; Sanchez-Franco, 2010; Giannakos et al., 2013; Makris et al., 2013; Shih, 2008; Sivilotti and Laugel, 2008; Lai and Lai, 2012; Lee et al., 2009; Savi et al., 2012; Kwon et al., 2012), we consider that an instructional unit (workshop, course, etc.) with quality achieves its learning objectives, promotes pleasant activities, facilitates learning, and creates a positive perception of and interest in computing.

Fig. 2. Decomposition of the quality factors. Source: authors.

The measurement is operationalized by the development of a questionnaire to be answered by the students at the end of the instructional unit, in order to obtain their perception of the quality of the instructional unit. The items that compose the questionnaire (Table 1) are defined for each of the measures, derived from similar studies found in the literature that were considered adherent to the context of this research (Keller, 1987; Sweetser and Wyeth, 2005; Poels et al., 2007; Takatalo et al., 2010; Ericson and McKlin, 2012; Tangney et al., 2010; Wiebe et al., 2003; Papastergiou, 2008; Sanchez-Franco, 2010; Giannakos et al., 2013; Makris et al., 2013; Shih, 2008; Sivilotti and Laugel, 2008; Lai and Lai, 2012; Lee et al., 2009; Savi et al., 2012; Kwon et al., 2012).

Table 1
Items of the dETECT measurement instrument (items 4 to 13)

Computing Experience
4 [...]: (1) Yes (2) No
5 I want to learn more about how to make computer programs: (1) Yes (2) No
6 Making a computer program is: (1) Lot of fun (2) Fun (3) Annoying (4) Very Annoying
7 I like to make computer programs: (1) Yes (2) No
8 Computing is useful in everyday life: (1) Yes (2) No
9 I want to learn more about how to make computer programs: (1) Yes (2) No

Perception of Learning
10 The workshop/course was: (1) Very easy (2) Easy (3) Difficult (4) Very Difficult
11 I can write computer programs: (1) Yes (2) No
12 I can explain to a friend how to make a computer program: (1) Yes (2) No
13 Making a computer program is: (1) Very easy (2) Easy (3) Difficult (4) Very Difficult

Definition and Execution of the Evaluation of the dETECT Model
When developing evaluation models and questionnaires, it is fundamental to analyze whether they are measuring what is intended (construct validity) and whether the same measurement process produces the same results (reliability) (Carmines and Zeller, 1979). Therefore, we evaluated the measurement instrument of the dETECT model in terms of reliability and construct validity from the viewpoint of researchers in the context of instructional units for teaching computing in school. The following analysis questions are taken into consideration:

Reliability
AQ1: Is there evidence for internal consistency of the dETECT measurement instrument?

Construct Validity
AQ2: Is there evidence of the convergent and discriminant validity of the dETECT measurement instrument?
AQ3: How do underlying factors influence the responses on the items of the dETECT measurement instrument?
For the evaluation of the dETECT model, 16 case studies were performed applying three different instructional units in 13 different educational institutions between 2015 and 2016, involving a total of 477 students (Table 2). The measurement took place at the end of instructional units teaching computing, either in the form of short 4-hour workshops or as part of interdisciplinary school units lasting 10-12 weeks (with 2 hours weekly). The units were applied at the middle school stage with children aged 10 to 14.
The target audience is middle school students in Brazil, taking part in different types of activities during the regular school schedule as well as in extracurricular workshops. The instructional units aim at teaching computing with a focus on programming and computational thinking (Table 3). More information on the instructional units is available at: http://www.computacaonaescola.ufsc.br.
The three instructional units are described in Table 3 as follows:
Integrating Scratch/Snap! with Arduino and pieces of hardware in a low-cost solution, students learn to program an interactive robot.
In an interdisciplinary way, students learn basic computing concepts by programming games involving different contents (e.g., history, Portuguese language, geography, etc.) using Scratch.
Students learn how to program a mobile app game using App Inventor.

Analysis
In order to obtain greater precision and statistical power through a larger sample size, the data collected in the 16 case studies were pooled to answer the defined analysis questions.
Reliability
AQ1: Is there evidence for internal consistency of the dETECT measurement instrument?
In order to answer this question, we evaluated the internal consistency of the dETECT measurement instrument through Cronbach's alpha coefficient (DeVellis, 2016; Trochim and Donnelly, 2008). Cronbach's alpha coefficient (Cronbach, 1951) indicates indirectly the degree to which a set of items measures a single quality factor. Thus, we want to know whether the items of the dETECT measurement instrument measure the same quality factor, the perception of the quality of the instructional unit. Typically, values of Cronbach's alpha ranging from 0.70 to 0.95 are considered acceptable (DeVellis, 2016), indicating internal consistency of the instrument.
Analyzing the 13 items of the measuring instrument (Table 1), the value of Cronbach's alpha is acceptable (α = .787). We can thus conclude that the answers to the items are consistent and precise, indicating the reliability of the measuring instrument items of the dETECT model.
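As an illustration of how the coefficient reported above can be computed, the following is a minimal sketch in Python using only the standard library. It assumes that the questionnaire answers have already been coded numerically with one row per student; the function name `cronbach_alpha` and the toy data are hypothetical and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    # Sample variance of each item (column) across respondents.
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of the total score of each respondent.
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical toy data: 4 respondents, 3 items (not data from the case studies).
toy_rows = [
    [1, 1, 2],
    [2, 2, 2],
    [3, 3, 4],
    [4, 4, 4],
]
alpha = cronbach_alpha(toy_rows)  # 0.975 for this toy data
```

With the real data, `rows` would hold the 477 responses to the 13 items, and the result would be compared against the 0.70 to 0.95 range mentioned above.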

Construct Validity
AQ2: Is there evidence of the convergent and discriminant validity of the dETECT measurement instrument?
Construct validity of a measurement instrument refers to its ability to actually measure what it purports to measure (Carmines and Zeller, 1979; Trochim and Donnelly, 2008). Convergent and discriminant validity are the two subtypes of validity that make up construct validity (Trochim and Donnelly, 2008). Convergent validity shows that the items that should be related are in reality related. On the other hand, discriminant validity shows that the items that should not be related are in reality not related (Carmines and Zeller, 1979; Trochim and Donnelly, 2008). In order to obtain evidence of the convergent and discriminant validity of the items of the dETECT measurement instrument, the intercorrelations of the items and the item-total correlations are calculated (DeVellis, 2016).
Intercorrelations of the items. In order to analyze the intercorrelations between the items, we used nonparametric Spearman correlation matrices (Table 4). The matrices show the Spearman correlation coefficient, indicating the degree of correlation between two items (item pairs). We used this correlation coefficient because it is the most appropriate correlation analysis for Likert scales (Trochim and Donnelly, 2008). The correlation coefficients between items within the same dimension are colored. In accordance with Cohen (1988), a correlation between items is considered satisfactory if the correlation coefficient is greater than 0.29, indicating that there is a medium or high correlation between the items. Satisfactory correlations are marked in bold. The numbers of the items refer to the specification presented in Table 1.
Analyzing the intercorrelations between the items of the three quality factors (Table 4), we can observe that most of the item pairs within each quality factor have a medium or high correlation. However, some item pairs have a low correlation (e.g., 1-2, 6-9, 10-11). Even so, the results indicate evidence of convergent validity.
On the other hand, some item pairs (e.g., 1-6, 3-6, 5-11) presented a medium or high correlation with items of another quality factor. Thus, there is no evidence of discriminant validity. However, the lack of discriminant validity is acceptable because, although the model is divided into three quality factors, all factors are also related to a single factor, which is the perception of the quality of the IU.
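The pairwise analysis described above can be sketched as follows. This is a minimal standard-library illustration, assuming numerically coded answers per item; the helper names (`average_ranks`, `spearman`, `satisfactory_pairs`) and the toy data are hypothetical. Spearman's coefficient is computed here as the Pearson correlation of average ranks, which handles the many ties that occur in Yes/No and 4-point answers.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if an input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman(x, y):
    return pearson(average_ranks(x), average_ranks(y))


def satisfactory_pairs(items, threshold=0.29):
    """items maps item number -> answers; returns pairs with rho > threshold."""
    keys = sorted(items)
    found = {}
    for index, a in enumerate(keys):
        for b in keys[index + 1:]:
            rho = spearman(items[a], items[b])
            if rho > threshold:
                found[(a, b)] = rho
    return found


# Hypothetical toy answers for three items (not data from the case studies).
toy_items = {1: [1, 1, 2, 2], 2: [1, 1, 2, 2], 3: [1, 2, 1, 2]}
pairs = satisfactory_pairs(toy_items)  # only the pair (1, 2) exceeds 0.29
```

Comparing the returned pairs with the assignment of items to quality factors shows which satisfactory correlations fall within a factor (convergent evidence) and which cross factors (lack of discriminant evidence).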
Item-total correlation. This method is complementary to the previous one and evaluates the correlation of each item with all the other items. Each item of the instrument should have a medium or high correlation with all the other items (DeVellis, 2016), as this indicates that the item is consistent with the other items. On the other hand, an item with a low item-total correlation undermines the validity of the scale and should therefore be eliminated. Table 5 shows the correlation coefficients between a single item and the other items of the measurement instrument.
We used the method of corrected item-total correlation, which correlates each item with the total of all other items of the instrument, excluding the item itself. The reference values for the analysis are the same as presented in the previous section based on Cohen (1988), considering a correlation satisfactory if the correlation coefficient is greater than 0.29. Items with low correlation are marked in bold. In addition, Table 5 also shows the Cronbach's alpha if an item was deleted, with the expectation that no item elimination should cause a substantial decrease in Cronbach's alpha (DeVellis, 2016).

Table 4 (excerpt: rows for items 12 and 13, columns for items 1 to 13)
12 .162 .094 .135 .312 .246 .194 .266 .192 .247 .143 .450 1.000
13 .156 .082 .155 .168 .266 .325 .225 .226 .199 .391 .432 .286 1.000

In general, item-total correlations are medium or high. Most items demonstrate an acceptable item-total correlation and satisfactory values of Cronbach's alpha if the item was deleted, thus indicating the validity of the quality factors. Only items 2 ("The time of the workshop passed:") and 10 ("The workshop was:") presented a low item-total correlation. In addition, item 2 presents a small increase in Cronbach's alpha if the item was deleted. Consequently, the results indicate that these items need to be reviewed.
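The two per-item statistics discussed above can be sketched as follows. This is a minimal standard-library illustration with hypothetical names and toy data; it assumes numerically coded answers and uses the Pearson coefficient for the corrected item-total correlation, which is an assumption, since the text does not state which coefficient was used for Table 5.

```python
from math import sqrt
from statistics import variance


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def cronbach_alpha(rows):
    k = len(rows[0])
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def item_analysis(rows):
    """Corrected item-total correlation and alpha-if-deleted for every item."""
    k = len(rows[0])
    report = []
    for i in range(k):
        item = [row[i] for row in rows]
        # Remaining items only, so the item is not correlated with itself.
        rest_rows = [row[:i] + row[i + 1:] for row in rows]
        rest_totals = [sum(rest) for rest in rest_rows]
        report.append({
            "item": i + 1,
            "corrected_item_total": pearson(item, rest_totals),
            "alpha_if_deleted": cronbach_alpha(rest_rows),
        })
    return report


# Hypothetical toy data: item 4 behaves inconsistently with items 1 to 3.
toy_rows = [
    [1, 1, 2, 4],
    [2, 2, 2, 1],
    [3, 3, 4, 3],
    [4, 4, 4, 2],
]
overall_alpha = cronbach_alpha(toy_rows)
report = item_analysis(toy_rows)
# Candidates for review: items whose corrected item-total correlation is not above 0.29.
flagged = [entry["item"] for entry in report if entry["corrected_item_total"] <= 0.29]
```

In this toy example only item 4 is flagged, and deleting it raises alpha, which mirrors the kind of evidence used above to recommend a review of items 2 and 10.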

AQ3: How do underlying factors influence the responses on the items of the dETECT measurement instrument?
In order to identify the number of factors (quality factors) that represents the responses of the set of the 13 items of the dETECT measurement instrument, we performed a factor analysis.
To analyze whether the items of the dETECT measurement instrument can be submitted to a factor analysis (Brown, 2006), we used the Kaiser-Meyer-Olkin (KMO) index. This index indicates how appropriate a factor analysis is for a specific set of items (Brown, 2006). The KMO index measures the sampling adequacy with values between 0.0 and 1.0. An index value near 1.0 supports a factor analysis, and anything less than 0.5 is probably not amenable to useful factor analysis (Dziuban and Shirkey, 1974). Analyzing the set of items of the dETECT measurement instrument, we obtained a KMO index of .827. Consequently, factor analysis is appropriate in order to analyze the number of factors that represent the responses of the dETECT measurement instrument.
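For readers unfamiliar with the index, the overall KMO value can be sketched as the ratio of squared correlations to the sum of squared correlations and squared partial correlations, taken over all distinct item pairs. The code below is a minimal standard-library sketch under that common definition; the function names and the toy correlation matrix are hypothetical, and the partial correlations are obtained from the inverse of the correlation matrix.

```python
from math import sqrt


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting."""
    n = len(matrix)
    aug = [
        [float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("correlation matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def kmo(corr):
    """Overall Kaiser-Meyer-Olkin index of a correlation matrix."""
    n = len(corr)
    inv = invert(corr)
    squared_r = 0.0
    squared_partial = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                # Partial correlation of items i and j given all other items.
                partial = -inv[i][j] / sqrt(inv[i][i] * inv[j][j])
                squared_r += corr[i][j] ** 2
                squared_partial += partial ** 2
    return squared_r / (squared_r + squared_partial)


# Hypothetical toy matrix: three items with pairwise correlation 0.5.
toy_corr = [
    [1.0, 0.5, 0.5],
    [0.5, 1.0, 0.5],
    [0.5, 0.5, 1.0],
]
index = kmo(toy_corr)  # 9/13 for this toy matrix
```

An index computed this way would be read against the thresholds given above: values near 1.0 support a factor analysis, values below 0.5 do not.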
When running a factor analysis, the number of factors to be retained in the analysis has to be decided (Glorfeld, 1995; Brown, 2006). Here we used the Kaiser-Guttman criterion for this decision, as it is the most commonly used method of determining the number of factors. This method states that the number of factors is equal to the number of eigenvalues greater than 1 (Glorfeld, 1995). The eigenvalue refers to the amount of variance of all the items that is explained by a factor (Glorfeld, 1995). Following the Kaiser-Guttman criterion, our results show that three factors should be retained in the analysis. Regarding the dETECT model, this means that the responses to the measuring instrument represent three underlying factors, indicating a decomposition similar to the original definition of the model. Once the number of underlying factors has been identified, another issue is to determine which items load on which factor. In order to identify the factor loadings of the items, a rotation method is used (Brown, 2006; Tabachnick and Fidel, 2007). Here we used Varimax rotation with Kaiser normalization, the most widely accepted and used rotation method (Tabachnick and Fidel, 2007). Table 6 shows the factor loadings of the items associated with the three retained factors. The highest factor loading of each item, indicating to which factor the item is most related, is marked in bold.
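The Kaiser-Guttman rule described above amounts to counting the eigenvalues of the item correlation matrix that exceed 1. The following minimal sketch illustrates this with the standard library only; the eigenvalues are computed with a simple Jacobi rotation scheme for symmetric matrices, which is a choice made here for self-containment and not the procedure of any particular statistics package. All names and the toy matrix are hypothetical.

```python
from math import copysign, sqrt


def symmetric_eigenvalues(matrix, tol=1e-12, max_sweeps=100):
    """Eigenvalues of a symmetric matrix via cyclic Jacobi rotations."""
    n = len(matrix)
    a = [[float(v) for v in row] for row in matrix]
    for _ in range(max_sweeps):
        off_diagonal = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off_diagonal < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                # Rotation angle that zeroes the (p, q) entry.
                theta = (a[q][q] - a[p][p]) / (2.0 * a[p][q])
                t = copysign(1.0, theta) / (abs(theta) + sqrt(theta * theta + 1.0))
                c = 1.0 / sqrt(t * t + 1.0)
                s = t * c
                for k in range(n):  # update columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):  # update rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)


def kaiser_guttman(corr):
    """Number of factors to retain: eigenvalues of the correlation matrix > 1."""
    eigenvalues = symmetric_eigenvalues(corr)
    return sum(1 for value in eigenvalues if value > 1.0), eigenvalues


# Hypothetical toy matrix: two uncorrelated blocks of two correlated items each.
toy_corr = [
    [1.0, 0.6, 0.0, 0.0],
    [0.6, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.5],
    [0.0, 0.0, 0.5, 1.0],
]
n_factors, eigenvalues = kaiser_guttman(toy_corr)  # 2 factors for this toy matrix
```

For the toy matrix, the eigenvalues are 1.6, 1.5, 0.5 and 0.4, so two factors are retained; applied to the 13 x 13 correlation matrix of the instrument, the same count is what yields the three factors reported above.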
Analyzing the factor loadings of the items (Table 6), we can observe that the first factor (factor 1) includes a set of 7 items (4, 5, 6, 7, 8, 9 and 12). Thus, this factor is directly related to the quality factor of the computing experience provided by the instructional unit (Table 1). With the exception of item 12, all items correspond to this quality factor in the original structure of the dETECT model. Although item 12 has its highest factor loading on factor 1, it also presents a similar factor loading (.410) with respect to factor 3, showing that this item contributes to both quality factors (computing experience and perception of learning). Regarding factor 2, a set of three items (1, 2 and 3) is considered. This result suggests that these items are related to the quality factor concerning the quality of the instructional unit. In fact, these items correspond to the same quality factor (quality of the IU) in the original definition of the dETECT model (Table 1). Factor 3 includes a set of three items (10, 11 and 13), indicating that these items are related to a single quality factor (perception of learning).

Discussion
The obtained results show sufficient evidence to consider the reliability and construct validity of dETECT as an acceptable model for the evaluation of instructional units for teaching computing in middle school.
In terms of reliability (AQ1), the results of the analysis indicate an acceptable Cronbach's alpha for the 13 items of the instrument (α = .787), indicating the internal consistency of the dETECT measurement instrument. Thus, the items of the dETECT measurement instrument are consistent and precise with respect to the evaluation of instructional units for teaching computing.
In terms of construct validity, with regard to convergent validity (AQ2), we identified that most items have a medium or high correlation, mainly between items of the same quality factor (quality of IU, computing experience, and perception of learning). In this way, we can conclude that there is evidence of convergent validity considering the quality factors. This indicates that the items of the measuring instrument seem to be actually measuring what they intend to measure. However, some items have a low correlation, both within a single quality factor and in relation to the other factors (e.g., items 4-9). This may be due to the description of the items, which was derived from the ones found in the literature, and may thus indicate that these items need to be revised.
With respect to discriminant validity, in general, most of the items present a low correlation with items of other quality factors. However, some item pairs (e.g., 1-6, 5-11) have a medium or high correlation with items of another quality factor. Thus, the results do not indicate evidence of discriminant validity. However, in this case, the lack of discriminant validity is acceptable because, although the model is divided into three quality factors, all factors are also related to a single factor, which is the perception of the quality of the instructional unit, as proposed in the original composition of the dETECT model (Fig. 2).
Analyzing the item-total correlation, again, the majority of the items present a satisfactory correlation with the other items of the measuring instrument. This indicates that the items of the measuring instrument of the dETECT model jointly measure what they propose to measure (perception of the quality of an IU).
Based on the results of the factor analysis (AQ3), we identified that the data collected in the case studies are explained by three factors. This confirms the initial structure defined for the dETECT model, grouping the items according to their defined quality factor (quality of IU, computing experience and perception of learning), with the exception of item 12, which loads on both computing experience and perception of learning.

Threats to validity
Due to the characteristics of this type of research, this work is subject to various threats to validity. We, therefore, identified potential threats and applied mitigation strategies in order to minimize their impact on our research. Some threats are related to the design of the study. In order to mitigate these threats, we defined and documented a systematic methodology for our study. The dETECT model was defined based on the GQM approach, systematically decomposing the evaluation objective into analysis questions and measures. The measuring instrument was developed following scale and questionnaire development methods defined in the literature and involving a multidisciplinary team of researchers. In addition, for the evaluation of the dETECT measuring instrument, a case study was systematically defined and documented. Another risk refers to the quality of the data pooled into a single sample, in terms of standardization of data (response format) and adequacy to the dETECT model. As our study is limited exclusively to evaluations that used the dETECT model, this risk is minimized, since the same data collection instrument was used in all studies. Another issue refers to pooling data from different contexts. To mitigate this threat, we selected only case studies of IUs for teaching computing in similar contexts.
In terms of external validity, a threat to the possibility of generalizing the results is related to the sample size and diversity of the data used for the evaluation. With respect to sample size, our evaluation used data collected from 16 case studies evaluating three different instructional units, involving a population of 477 students. In terms of statistical significance, this is a satisfactory sample size allowing the generation of significant results (Clark and Watson, 1995; MacCallum et al., 1999; Kasunic, 2005; DeVellis, 2016).
In terms of reliability, a threat refers to the extent to which the data and the analysis are dependent on the specific researchers. In order to mitigate this threat, we systematically documented the development and evaluation of the dETECT model, clearly defining the study objective, the process of data collection, and the statistical methods used for data analysis. Another issue refers to the correct choice of statistical tests for data analysis. To minimize this threat, we performed a statistical evaluation based on the approach for the construction of measurement scales proposed by DeVellis (2016), which is aligned with procedures for the evaluation of internal consistency and construct validity of a measurement instrument (Trochim and Donnelly, 2008).

Conclusion
Although the evaluation of instructional units for teaching computing is essential for their continuous improvement and their effective and efficient application, few efforts have been made to develop evaluation models. In this context, this article presents a first step in this direction, also taking into consideration practical limitations when running such evaluations in more informal outreach programs. Based on the literature and practical experience, the evaluation model dETECT and its 13-item measurement instrument were developed systematically and applied at the end of 16 instructional units in middle schools in Brazil.
Results from the analysis of the responses of 477 students indicate that the measurement instrument is acceptable in terms of reliability and construct validity. With respect to reliability, a Cronbach's alpha of α = .787 indicates an acceptable internal consistency, which means that the responses to the items are consistent and precise. Our analysis also indicates convergent validity through an acceptable degree of correlation found between almost all items regarding the quality factors. Thus, it suggests that the measurement instrument of the dETECT model can be a reliable and valid instrument for measuring the students' perception of instructional units for teaching computing. The results of the factor analysis indicate that three underlying factors influence the responses to the items of the dETECT measuring instrument, confirming the original structure of the model, which defines three quality factors (quality of IU, computing experience and perception of learning) for the evaluation of instructional units.
Fig. 1. Research stages: Stage 1. Literature review; Stage 2. Developing of the dETECT Evaluation Model; Stage 3. Design of the measurement instrument; Stage 4. Application and evaluation of the measurement instrument.
Stage 1. Literature review. In a first exploratory stage, we conducted a literature review of work related to evaluation models of instructional units for teaching computing in schools.

Table 3
Overview of the instructional units applied

Table 4
Spearman correlation coefficient