Superimposition of serial 3-dimensional facial photographs to assess changes over time: A systematic review

Introduction: Superimpositions of 3-dimensional photographs enable a thorough and risk-free assessment of facial changes over time. However, the available methods and the evidence supporting them have not been assessed systematically. This review summarizes and assesses the current evidence on superimposition methods of serial 3-dimensional facial photographs available in the literature. Methods: The following databases were searched without time restriction (last updated December 2020): MEDLINE via PubMed, EMBASE, Cochrane Library, and Google Scholar. Unpublished literature was searched on Open Grey and Grey Literature Report. Authors were contacted if necessary, and reference lists of relevant papers were screened. All studies with sample size ≥6 that tested the accuracy or precision of a superimposition technique, or agreement between different techniques regarding facial surface changes, were considered. The 2 authors performed data extraction independently using predefined forms. The risk of bias was assessed through the Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2) tool. Results: Eight studies fulfilled the inclusion criteria. The total risk of bias was high for 7 studies and low for 1. Seven studies had high total applicability concerns, and 1 was unclear. There was high heterogeneity among studies, which tested constructed planes through manually selected landmarks, a configuration of 9 landmarks, various surface areas, and the entire facial surface as superimposition references. A small rectangular area on the forehead combined with one on the middle part of the nose and the lower wall of the orbital foramen showed promising results. Conclusions: The limited available evidence suggests that surface-based registration is superior to landmark-based registration. Further research in the field is mandatory. (Am J Orthod Dentofacial Orthop 2021; - : - - - )

In several disciplines of medicine and dentistry, facial photographs of individual patients are required for documentation, diagnosis, treatment planning, and evaluation of treatment progress or outcome. Two-dimensional (2D) photographic analysis is still the most common imaging technique to evaluate facial soft tissues in clinical practice. The drawbacks of this 2D technique, when used to assess 3-dimensional (3D) structures such as the face, include restricted validity and reliability errors because of head positioning, image distortion, and reduction of a complex 3D surface to a 2D image. Furthermore, 2D images do not represent the real size because there is always a magnification factor. 1,2 Because of these limitations and the new possibilities available in recent years, 3D imaging techniques are gaining ground in many disciplines, including plastic surgery, orthodontics, implantology, and maxillofacial surgery. 3 Three-dimensional images include an enormous amount of information compared with conventional 2D images, which is needed for the proper morphologic assessment of complex structures.
Three-dimensional data of the facial surface can be obtained by different methods, including laser scanning, 3D stereophotogrammetry, patterned light techniques, conventional computed tomography, 3D cone-beam computed tomography (CBCT), and 3D magnetic resonance imaging. Compared with other methods, stereophotogrammetry is noninvasive, risk-free, accurate, user-friendly, and fast. 4 To objectively evaluate the effects of treatment, growth, aging, or pathology on the morphology of anatomic structures, superimposition of 2 or more serial images obtained at different time points is necessary. 5 Different techniques have been suggested to superimpose serial 3D facial surface images, usually on various morphologically stable structures represented by areas, landmarks, or planes. In brief, these can be divided into 2 categories: landmark-based 4 and surface-based methods. 6 Landmark-based methods register serial 3D images on ≥3 manually selected corresponding anatomic landmarks. 3 Surface-based registrations use anatomic areas as superimposition references, comparing the triangular representations of corresponding 3D surface geometries. Then, using appropriate software, automated matching of the serial images on the selected reference landmarks or areas is performed through specific mathematical algorithms that aim to minimize the distance between the corresponding landmarks or areas. 3,7,8 The information on 3D superimposition techniques to assess facial surface changes is scarce. So far, there is no widely accepted method for the superimposition of serial facial images. Many factors might affect the superimposition outcome, such as the image quality and the superimposition methods. 5,9-12 However, an assessment of the available methods and the evidence supporting them in the context of a systematic review has not been undertaken.
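The landmark-based registration described above, in which software minimizes the distance between corresponding landmarks, can be illustrated with a least-squares rigid fit. The following is a minimal sketch, assuming the Kabsch (SVD-based) solution; the reviewed studies do not state which algorithm their software uses, and the function name and landmark coordinates are hypothetical.

```python
import numpy as np

def rigid_landmark_fit(source, target):
    """Least-squares rigid transform (R, t) so that target ~ R @ source + t.

    source, target: (n, 3) arrays of corresponding landmarks, n >= 3.
    Assumption: Kabsch/SVD solution, chosen here for illustration only.
    """
    source = np.asarray(source, dtype=float)
    target = np.asarray(target, dtype=float)
    src_centroid = source.mean(axis=0)
    tgt_centroid = target.mean(axis=0)
    # Cross-covariance of the centred landmark sets
    H = (source - src_centroid).T @ (target - tgt_centroid)
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection (determinant -1) in the fitted matrix
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = tgt_centroid - R @ src_centroid
    return R, t

# Hypothetical landmark coordinates (mm) at the first time point
landmarks_t0 = np.array([[0.0, 0.0, 0.0],
                         [30.0, 0.0, 5.0],
                         [-30.0, 0.0, 5.0],
                         [0.0, -40.0, 10.0]])

# Simulate a second acquisition with a different head position only
angle = np.deg2rad(10.0)
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle),  np.cos(angle), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([2.0, -1.0, 0.5])
landmarks_t1 = landmarks_t0 @ R_true.T + t_true

R, t = rigid_landmark_fit(landmarks_t0, landmarks_t1)
aligned_t0 = landmarks_t0 @ R.T + t
residuals = np.linalg.norm(aligned_t0 - landmarks_t1, axis=1)
print(residuals)  # per-landmark distance after registration
```

Because the simulated second acquisition differs only in head position, the residuals are at the level of floating-point error. With real serial photographs, landmark identification error and true morphologic change would both contribute to the residuals.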

OBJECTIVES
The purpose of this study was to summarize and critically assess all the relevant studies on the validity, reliability, and reproducibility of serial 3D facial photograph superimposition techniques through a systematic review of the literature.

Protocol and registration
The study protocol was registered in PROSPERO (registration no. CRD42019134784) before study implementation.

Eligibility criteria
The following criteria were applied in this review: (1) prospective and retrospective study designs were considered eligible; (2) study sample: studies with sample size ≥6; (3) index test: surface-based or landmark/reference plane-based superimposition of serial 3D facial photographs; (4) types of participants: patients who underwent any change in their facial morphology because of growth, aging, pathology, or medical intervention. Natural changes of healthy patients assessed at very short term (eg, through a change in facial expression) were also considered if used to test superimposition outcomes; (5) type of intervention: 3D superimposition to assess any change in facial morphology; (6) primary outcome: accuracy or precision of a superimposition technique, or agreement between different techniques measured in corresponding surface landmarks or areas as soft-tissue change; (7) studies that tested any of the above as secondary outcomes were also considered eligible; (8) comparator/control group: different superimposition techniques, direct measurements, or repeated measurements; (9) unit of analysis: in all cases, the unit of analysis was the measured change in a facial surface area or landmark; (10) follow-up: all observation periods between subsequent photographs were accepted if a morphologic change was expected; and (11) exclusion criteria: nonhuman-derived data.
Study selection. The 2 authors of this review searched for eligible studies independently. They were not blinded to the identity of the authors of the studies, their institutions, or the results of their research. Studies were screened by title and abstract, and when needed, the full text was read. Noneligible papers were excluded at all stages. If there was a disagreement, the eligibility was discussed between the authors until a consensus was reached. A record of all decisions on study identification was kept.

Data items and collection
Data extraction was performed independently and in duplicate by the 2 authors using a custom Excel sheet (Microsoft Excel; Microsoft, Redmond, Wash) formed after pilot testing on 3 relevant studies. Disagreements were resolved by reexamining the source of information until a consensus was reached. The following information was extracted from all eligible studies, if available: (1) methods: author, title, year, objectives, and design of the study; (2) participants: number, age, and gender of patients recruited; (3) materials: type of 3D facial soft-tissue acquisition method and time between serial models; (4) superimposition method: type of superimposition reference areas or points and software used; (5) comparison/control group: type and characteristics; and (6) outcome: type of outcome(s) and method of outcome assessment.
When information was missing from a study, the authors were contacted by e-mail to request clarification. If the authors did not respond or the data could not be obtained, only the available information was considered.

Risk of bias in individual studies
The quality of the selected studies was evaluated through the Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2) tool. 13 This tool is recommended by the Cochrane Collaboration to assess the risk of bias and the applicability concerns of diagnostic accuracy studies in systematic reviews. It assesses each study across 4 key domains: patient selection, index test, reference standard, and flow and timing. All 4 domains are evaluated for the risk of bias, and only the first 3 for applicability concerns. The results of the QUADAS-2 tool are commonly presented in a table using happy (low risk) or sad (high risk) smiley faces. If it is not possible to evaluate a domain, a question mark is set, which indicates an unclear risk. When a domain is judged as high risk, relevant justification is provided.
The risk of bias assessment was performed independently and in duplicate by the 2 authors. In cases of disagreement, the judgment was discussed until a consensus was reached. In the case of a meta-analysis, studies with a high risk of bias would not be included.

Summary measures and approach to synthesis
Assessment of heterogeneity. The studies were assessed on the basis of the similarity between characteristics, participants, methods, and outcomes as reported in the inclusion criteria.
Data synthesis. A meta-analysis was to be conducted if there were >2 studies with unclear or low risk of bias that made similar comparisons and reported the same outcomes at similar follow-up periods.

Assessment of reporting bias
We attempted to minimize potential reporting biases, including publication bias and multiple (duplicate reports) publication bias, by conducting an accurate but sufficiently broad search of multiple sources. We also searched for ongoing studies.
Additional analysis. Potential subgroups to be further assessed whenever possible included patients with vs without growth, short-term (within a week) vs medium-/long-term (>1 week) interval between serial photographs, and surface-based vs landmark/reference plane-based techniques.

Study selection and characteristics of the included studies
Out of 1715 studies initially identified by the search, 8 studies fulfilled the eligibility criteria and were included in this review (Fig). Of these, 6 studies were published articles, 6,14-18 1 was a master's thesis, 19 and 1 was completed and published as a preprint. 20 In all studies, 3D landmark-based or surface-based superimposition techniques were used to assess actual or simulated facial changes. The included studies evaluated the accuracy or precision of a superimposition technique, or agreement between different techniques, as a primary or a secondary outcome.
All included studies were prospective in terms of superimposition data generation. Three studies tested only growing patients, 17,18,20 4 studies nongrowing patients 6,14,15,19 and another study did not report this information. 16 Regarding superimposition references, one study used constructed planes on the basis of manually selected landmarks, 6 one study used the anterior forehead surface, 14 another study used 9 landmarks, 15 2 other studies used the whole cropped image surface, 16,17 2 studies used the nasal bridge and forehead area 18,19 and, finally, 1 study tested 5 different surfaces as references. 20 All relevant details of the included studies are shown in Tables I and II.

Quality assessment
Quality assessment was performed through the QUADAS-2 tool and results are presented in Table III, including the reasons supporting judgments.
Seven included studies had a high total risk of bias, 6,14-19 whereas 1 study 20 had a low total risk of bias. Regarding the individual domains, studies were mainly characterized as high risk on the basis of index tests and reference standards. In these domains, low risk was evident only for 1 study 20 and 2 studies, 6,20 respectively.
Regarding the total applicability concerns, 7 studies had high concerns, 6,14-19 and 1 study had unclear concerns. 20 Concerning individual items, studies were judged as having high concerns primarily because of index tests and reference standards. Only 1 study had low concerns regarding the index test 20 and 2 regarding the reference standard. 6,20

Results of individual studies and qualitative synthesis
The results, conclusions, and limitations of all included studies are provided in Tables IV and V.
Landmark-based registration. There is only 1 study in this category, which assessed the superimposition of 3D facial photographs on 9 corresponding soft-tissue landmarks. 15 The study focused on intraobserver and interobserver variability of superimposition outcomes after facial alteration through facial expressions (mimicry) in 10 adult males. For this, 3D photographs of each participant with a happy, sad, angry, and surprised expression were superimposed on a neutral one. The study does not provide information on the reproducibility of the technique in individual patients. The presented approach seems to differentiate between different facial expressions in most patients. The results of this study should be interpreted with caution because of the absence of individual outcome assessment, no adequate assessment of landmark identification error, and no testing of repeated images with the same expression.
So far, no solid conclusions can be drawn regarding landmark-based registration of 3D facial photographs because there is only 1 study on the topic, which has severe drawbacks.

Wampfler and Gkantidis
Surface-based registration. There are 6 studies that solely tested surface-based registration of serial 3D facial photographs. 14,16-20 Incrapera et al 14 compared linear facial soft-tissue changes (preorthognathic and postorthognathic surgery) of 34 nongrowing patients, measured through superimpositions of lateral cephalometric radiographs and 3D photographs. Distances were measured between 5 corresponding soft-tissue landmarks. The superimposition of lateral cephalometric radiographs and 3D images was done by manual best fit of hand tracings and automatic best fit of surfaces, respectively. The cephalometric superimposition references included the cranial base (SN line), the anterior clinoid process, and the sphenoethmoidale plane. The 3D image superimposition references included the anterior forehead, the soft-tissue glabella, the nasion, and the bridge of the nose area. The study showed that the mean linear changes measured through superimpositions of 2D radiographs or 3D photographs were not statistically different. However, the study had several limitations that call the validity of the findings into question. These include the absence of magnification measurement for the lateral cephalograms and of an assessment of individual differences through an analysis such as the Bland-Altman method. Furthermore, no method errors were reported.
Adriaens et al 16 tested the intraobserver and interobserver reproducibility of masseter muscle volume differences after botulinum toxin type A therapy in 10 patients with unilateral or bilateral masseter hypertrophy. Three-dimensional photographs were obtained with a stereophotogrammetric camera (3dMD, Atlanta, Ga), before and 3 months after injection. The superimposition reference was the entire face after cropping the surrounding structures. This superimposition might show adequate reproducibility when mean changes in a group of patients are considered. Limitations such as small sample size, absence of a Bland-Altman or similar method to assess individual differences, and poor reporting weaken the evidence provided by the study.
Häner et al 20 analyzed the superimposition outcomes of 5 different facial surface superimposition techniques compared with the voxel-based anterior cranial base superimposition. The study assessed outcomes at 7 soft-tissue areas distributed over the entire facial surface. Eighteen serial CBCT-derived facial surface models of growing patients, with a span of 1-3 years, were selected for this study. The superimposition references for the facial surfaces were (1) the whole facial surface excluding the eyes, the mouth, and the tip of the nose; (2) the forehead and base of the nose; (3) the upper half of the face excluding the eyes and the tip of the nose; (4) a small rectangular area on the forehead and an area including the middle part of the nose and the lower wall of the orbital foramen; and (5) the same area as the latter, but without the area on the forehead. The best agreement of facial surface superimposition outcomes compared with the anterior cranial base technique was evident for technique 4. The study had a low total risk of bias, but unclear applicability concerns because of the moderate sample size, with relatively broad baseline characteristics, such as malocclusion type, and the limited age range.
Dindaroğlu et al 17 evaluated the intraobserver variability of a surface-based superimposition to measure facial changes after rapid maxillary expansion. Three-dimensional photographs of 25 growing patients with posterior crossbite malocclusion were taken before and after expansion. Reproducibility was tested on the repeated superimposition of 10 randomly selected pairs of photographs using intraclass correlation. The whole face was used as a superimposition reference. The method showed adequate reproducibility, but the study had significant limitations. These include the absence of a control/comparison method, the small sample size, no assessment of accuracy or reproducibility of individual measurements, and poor reporting.
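The repeated criticism that studies compared only group means, without a Bland-Altman or similar analysis, can be made concrete with a small numerical sketch. The values below are hypothetical and are not taken from any included study; they are chosen so that 2 methods agree perfectly on average while disagreeing by 1 mm in most individual patients.

```python
import numpy as np

def bland_altman(method_a, method_b):
    """Return the bias and 95% limits of agreement between paired measurements."""
    a = np.asarray(method_a, dtype=float)
    b = np.asarray(method_b, dtype=float)
    diff = a - b
    bias = diff.mean()            # mean difference between the methods
    sd = diff.std(ddof=1)         # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical soft-tissue changes (mm) in 5 patients measured by 2 methods
cephalogram_change = [2.0, 3.5, 1.0, 4.0, 2.5]
photo_3d_change = [1.0, 4.5, 0.0, 5.0, 2.5]

bias, lower, upper = bland_altman(cephalogram_change, photo_3d_change)
print(f"bias = {bias:.2f} mm, limits of agreement = [{lower:.2f}, {upper:.2f}] mm")
```

In this constructed example the bias is 0 mm, so a comparison of means would find no difference, yet the limits of agreement span roughly ±2 mm. This is the kind of individual-level disagreement that a group mean comparison cannot reveal.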
Altındiş et al 18 tested the intraobserver reproducibility of a surface-based superimposition on volumetric assessment of upper lip changes after rapid palatal expansion. Forty-two 3D photographs of growing patients with moderate maxillary constriction were taken before and after rapid palatal expansion and were superimposed on the nasal bridge and forehead area. The presented superimposition might show adequate intraobserver reproducibility when mean changes on a group of patients are considered. Limitations of this study are that only mean values were assessed, no color maps were shown, no comparative statistics were applied, and the type of intraclass correlation coefficient was not reported.
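Why an unreported type of intraclass correlation coefficient matters can be shown with a short sketch. It assumes the single-measure Shrout and Fleiss forms ICC(2,1) (absolute agreement) and ICC(3,1) (consistency); the included studies do not state which form they used, and the measurements below are hypothetical.

```python
import numpy as np

def icc_single(data):
    """Return (ICC(2,1) absolute agreement, ICC(3,1) consistency).

    data: (n_subjects, k_raters) array of single measurements.
    """
    x = np.asarray(data, dtype=float)
    n, k = x.shape
    grand_mean = x.mean()
    # Two-way ANOVA sums of squares
    ss_rows = k * ((x.mean(axis=1) - grand_mean) ** 2).sum()
    ss_cols = n * ((x.mean(axis=0) - grand_mean) ** 2).sum()
    ss_total = ((x - grand_mean) ** 2).sum()
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    agreement = (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )
    consistency = (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)
    return agreement, consistency

# Hypothetical volume changes: the second session reads 1 unit higher throughout
session_1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
session_2 = session_1 + 1.0
agreement, consistency = icc_single(np.column_stack([session_1, session_2]))
print(f"ICC(2,1) = {agreement:.3f}, ICC(3,1) = {consistency:.3f}")
```

With a constant offset between sessions, the consistency form is 1 while the absolute agreement form is lower, so the same data can look perfectly or only moderately reproducible depending on the coefficient chosen.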
Fay 19 examined intraobserver and interobserver reproducibility of facial volume measurements between superimposed facial 3D photographs. Photographs of 30 orthognathic surgery patients receiving steroids to control postoperative swelling were taken before hospital discharge and 4 weeks postoperative. The 2 records were superimposed on the nasal bridge and forehead area to illustrate volume differences. The presented method might show adequate intraobserver and interobserver reproducibility when mean changes on a group of patients are analyzed. The study considers only mean values, which is a significant weakness. Furthermore, no color-coded distance maps were shown to assess the spatial distribution of changes.
Overall, 6 studies tested a surface-based superimposition approach for 3D facial photographs. However, 4 of them 16-19 tested only reproducibility, and all had important limitations. A fifth study tested the agreement of facial soft-tissue superimposition with cephalometric superimposition, but it also had significant limitations. 14 Finally, 1 low risk of bias study showed promising findings for a surface-based superimposition on a small rectangular area of the forehead plus an area at the middle part of the nose and the lower wall of the orbital foramen. 20 Nevertheless, this study also needs testing in different samples to generalize the applicability of the findings.
Table. Conclusions and limitations of the included studies

Incrapera et al 14
Conclusions: The mean linear changes measured at 5 soft-tissue landmarks through superimpositions of serial cephalograms or 3D photographs were not statistically different.
Limitations: No evaluation of magnification in the lateral cephalograms (same machine); only mean values were assessed; no Bland-Altman or similar method to assess individual differences; and no method error.

Maal et al 6
Conclusions: Distances between registered images were less in cropped photographs and with surface-based registration (identical patients) than with landmark-based registration; differences between the software packages used for surface-based registration were negligible.
Limitations: Only mean values were assessed; no Bland-Altman or similar method to assess individual differences; no color maps were shown; no comparative statistics were applied; and no intervention or expected morphologic change in between the serial images.

Gibelli et al 15
Conclusions: No valid conclusions can be drawn regarding the reproducibility of the used landmark-based superimposition in individual patients; the presented approach seems to differentiate between different facial expressions in most patients.
Limitations: Inadequate method error assessment (t test); small sample size (questionable power for error assessments); no evaluation of individual outcomes; no assessment of landmark identification error; and no assessment of superimposition outcomes on repeated images with the same expression.

Adriaens et al 16
Conclusions: The presented superimposition might show adequate reproducibility when mean changes on a group of patients are considered.
Limitations: Small sample size (questionable power); only mean values were assessed; no Bland-Altman or similar method to assess individual differences; and poor reporting.

Altındiş et al 18
Conclusions: The presented superimposition might show adequate intraobserver reproducibility when mean changes on a group of patients are considered.
Limitations: Only mean values were assessed; no Bland-Altman or similar method to assess individual differences; no color maps were shown; no comparative statistics were applied; and the type of ICC was not reported.

Fay 19
Conclusions: The presented superimposition might show adequate intraobserver and interobserver reproducibility when mean changes on a group of patients are considered.
Limitations: Only mean values were assessed; no Bland-Altman or similar method to assess individual differences; no color maps were shown; and the type of ICC was not reported.

Häner et al 20
Conclusions: The reference area selection considerably affects the superimposition outcomes of serial facial surface models; area 4, a small rectangular area on the forehead and an area including the middle part of the nose and the lower wall of the orbital foramen, shows comparable outcomes to the standard anterior cranial base superimposition technique, and it also shows adequate reproducibility.
Limitations: Moderate sample size; a specific age group was tested.

Dindaroğlu et al 17
Conclusions: The presented reference for superimposition might show adequate reproducibility.
Limitations: Small sample size; no assessment of method accuracy; individual measurements of reproducibility were not assessed; no detailed data presented; and the ICC type is not specified.

ICC, intraclass correlation coefficient.

Landmark-based vs surface-based registration. Finally, 1 study evaluated the agreement between surface-based and landmark-based registrations 6 using serial 3D facial photographs acquired within 1 minute and 3-week intervals. Furthermore, 2 software packages and the differences between original and cropped photographs were compared regarding surface-based registration outcomes. Zero distance between corresponding serial 3D photographs was defined as the gold standard value. The superimposition reference for surface-based methods was the whole 3D facial photograph (original and cropped). For landmark (constructed planes)-based registration, certain points were used to construct a horizontal, a vertical, and a median plane. The superimposition reference frame was formed by these planes. The study showed that distances between registered images were smaller in cropped than in uncropped photographs. Surface-based registration also provided smaller distances than landmark-based registration. The differences between the software packages used for surface-based registration were negligible. Important limitations of the study included that only mean values and mean distances were assessed. Further drawbacks were the absence of color-coded distance maps to localize differences, of comparative statistics, and of considerable morphologic change between the serial images.
Thus, there is only 1 study comparing landmark-based with surface-based registration, and it favored surface-based techniques and cropped facial photographs. Furthermore, the study did not detect any difference between the 2 software packages applying surface-based registration. However, it showed important limitations.
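The distance-based outcome used in the comparison of landmark-based and surface-based registration, with zero distance between registered surfaces as the gold standard, can be sketched as a mean closest-point distance between 2 point sets. This is a simplified, hypothetical illustration: the software used in the study is not described at this level of detail and may compute point-to-surface rather than point-to-point distances.

```python
import numpy as np

def mean_closest_point_distance(surface_a, surface_b):
    """Mean distance from each vertex of surface_a to its nearest vertex of surface_b.

    Brute-force pairwise search; adequate for small illustrative point sets only.
    """
    a = np.asarray(surface_a, dtype=float)
    b = np.asarray(surface_b, dtype=float)
    # Pairwise Euclidean distances, shape (len(a), len(b))
    pairwise = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    return pairwise.min(axis=1).mean()

# Hypothetical flat patch of 5 x 5 vertices with 1 mm spacing
xs, ys = np.meshgrid(np.arange(5.0), np.arange(5.0))
surface_t0 = np.column_stack([xs.ravel(), ys.ravel(), np.zeros(xs.size)])
# Same patch after an imperfect registration that leaves a 0.3 mm offset
surface_t1 = surface_t0 + np.array([0.0, 0.0, 0.3])

identical = mean_closest_point_distance(surface_t0, surface_t0)
offset = mean_closest_point_distance(surface_t0, surface_t1)
print(identical, offset)
```

A perfectly registered pair of identical surfaces yields a distance of zero, and any residual misregistration shows up as a positive mean distance. As noted above, a single mean value of this kind does not show where on the face the differences are located.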

DISCUSSION
Two-dimensional photographs are still the main imaging technique used to illustrate facial morphology. However, they have major limitations deriving from the reduction of a 3D surface to 2D. 1,2 The same is true for lateral cephalometric radiographs, and therefore, they are, when indicated, replaced by 3D radiographs. 3 Nevertheless, any radiographic technique requires patient exposure to harmful ionizing radiation, which is usually higher for 3D acquisitions. Three-dimensional photographs are risk-free tools that offer convenient, fast, accurate, and detailed 3D facial surface imaging. The accuracy of 3D photographs is well documented. 21-23 In addition, the high quality and considerable amount of information offered by 3D facial photographs is clinically relevant because it represents the anatomic area that is assessed by patients and is perceived by people during human interactions. This is the primary area people consider in the perception of facial appearance and related treatment outcomes when looking at others or themselves in the mirror. It has been shown that the primary factor affecting patient satisfaction from a given intervention is associated with the perceived esthetic outcomes. 24,25 Thus, the proper assessment of facial surface changes because of treatment, growth, or pathology is a fundamental part of evaluating treatment effects.
Furthermore, it can determine how objective changes relate to perceived changes, which is a key factor affecting patient satisfaction. However, to objectively assess changes over time, superimposition techniques should be applied to serial images, and improper application or interpretation of such techniques might confound outcomes. 7,8,10-12 Therefore, we have gathered evidence from the available studies on facial soft-tissue superimposition, aiming to provide useful guidelines for applying these techniques.
The systematic literature search identified a limited number of relevant studies, and most provided low-quality evidence and had high applicability concerns. Furthermore, the heterogeneity among studies was high, and thus, solid conclusions cannot be drawn at the moment. There was only 1 low risk of bias study that compared different surface-based techniques with a standard technique. 20 This became feasible through the use of facial surface models extracted from CBCT images. The study concluded that a small rectangular area on the forehead and an area including the middle part of the nose and the lower wall of the orbital foramen provided comparable outcomes to an anterior cranial base superimposition when applied in growing patients. Thus, there is an urgent need for further work in the field so that both researchers and clinicians gain full advantage of this very powerful, risk-free imaging technique.
Though we did not limit our search regarding the time of publication, all eligible studies were published after 2010. This might be related to the fact that 3D facial photograph acquisition methods are recently developed technologies. Furthermore, several studies reached the final stages of the search strategy but were excluded primarily because only 1 point was evaluated (usually to assess facial asymmetry) or there was no method error assessment. All these excluded studies, with the corresponding reasons for exclusion, are listed in Supplementary Table II. Our review assessed only facial surface models derived from 3D facial photographs or models of equivalent accuracy. Indeed, 7 of the 8 included studies tested 3D facial photographs, whereas 1 study 20 used 3D facial models extracted from CBCT images. The latter have equivalent accuracy to that of the 3D photographs. 21,23,26-28 Thus, studies that tested laser scans, Moiré topography, or other such methods were excluded. These methods are not easily applicable in a clinical environment, especially in children, because of the increased acquisition time, which generates more motion artifacts. 29-32 Another drawback concerns the difficulty of recording soft-tissue surface texture, which impedes the actual representation of a patient's image and might also affect landmark identification. 30,31 Therefore, we focused our review on the most clinically relevant method. Our findings should apply to any facial surface model with comparable accuracy to that tested here.
Different studies in the literature introduced various techniques as appropriate to superimpose serial 3D facial images. In 8 included studies, 6 superimposition references were deemed appropriate for superimposition of 3D facial photographs. However, 1 study tested only identical photographs or differences between smiling and rest photographs, 6 whereas 7 studies did not assess individual differences between techniques, but performed only group mean comparisons. Thus, the applicability of the tested techniques in every single patient remains unexplored. Moreover, there was significant heterogeneity among studies in samples, methods, and outcomes, making direct comparisons impossible. Therefore, no studies could be combined in a meta-analysis. Furthermore, a recent study that assessed various superimposition reference areas identified significant differences in outcomes, though all areas were primarily located at the upper half of the face. 20 A similar finding was previously evident for the superimposition of serial maxillary models on palatal structures 7 and for the superimposition of tooth crowns to assess tooth wear. 11,12 The most widely used techniques to superimpose serial facial images apply surface-based registration. According to Incrapera et al, 14 Altındiş et al, 18 and Fay, 19 the anterior forehead is a promising superimposition reference area because of the morphologic stability that this area shows over time. However, the low amount and quality of evidence supporting this approach emphasize the need for further studies. A recent low risk of bias study also supported the use of the forehead area but combined with an area consisting of the middle part of the nose and the lower wall of the orbital foramen. 20 The primary advantage of this study is that it used a standard reference method to compare the newly introduced facial surface-based methods.
The study did not favor the nasal bridge area in growing patients, stating that it often presents a different growth pattern than the forehead, 33,34 and it might also show increased artifacts in certain patients because of facial hair. The above could confound the superimposition outcomes. 8 Adriaens et al 16 and Dindaroğlu et al 17 suggested that the whole 3D facial (cropped) photographs can be used as a reference for superimposition. However, both studies had a high risk of bias and applicability concerns.
Furthermore, in most patients, the clinical applicability of the latter approach is questionable. When the entire facial image is used as a superimposition reference, the changes in a specific area cannot be assessed accurately because the superimposition results are skewed by changes in other facial areas. This is inevitable because the application of the best-fit algorithm aimed to minimize the overall distance of the superimposed surfaces. For example, when a face has been primarily changed at the lower third, which is usually the case in preadolescent growing patients, it would make sense to use relatively stable areas at the middle or the upper facial third as references. Then, by superimposing 2 serial images on these relatively stable areas, the actual changes in the lower third can be revealed. If a whole face approach is adopted, the assessed changes in the lower part of the face will be mediated (appear reduced) by the other relatively stable parts of the face, which will also appear falsely as being changed.
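The effect described above can be reproduced with a deliberately simplified numerical sketch. It assumes known point correspondences and a translation-only best fit, which is far simpler than the surface-matching algorithms used by registration software; the point counts and the 4 mm change are hypothetical.

```python
import numpy as np

def translation_fit(source, target, reference_idx):
    """Least-squares translation aligning source to target on the reference points only."""
    return (target[reference_idx] - source[reference_idx]).mean(axis=0)

# Hypothetical face of 10 corresponding points: 6 upper/middle, 4 lower
upper = np.arange(0, 6)
lower = np.arange(6, 10)
face_t0 = np.zeros((10, 3))
face_t0[:, 0] = np.arange(10.0)       # arbitrary positions along x

# Between T0 and T1 only the lower third moves forward by 4 mm (z direction)
face_t1 = face_t0.copy()
face_t1[lower, 2] += 4.0

def measured_change(reference_idx):
    """Apparent forward change at each point after registering T0 on T1."""
    t = translation_fit(face_t0, face_t1, reference_idx)
    return (face_t1 - (face_t0 + t))[:, 2]

whole_face = measured_change(np.arange(10))   # entire face as reference
stable_area = measured_change(upper)          # stable upper area as reference

print("whole face :", whole_face)
print("stable area:", stable_area)
```

With the whole face as reference, the 4 mm change in the lower third is reported as 2.4 mm and the unchanged upper points appear displaced by 1.6 mm in the opposite direction. With the stable area as reference, the full 4 mm change is recovered and the upper points show no change.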
In general, larger reference areas are preferable to smaller areas because they are less prone to operator- or artifact-related errors. 8,11,12 However, with larger areas, the chance of including regions that have changed over time increases. These changes might skew outcomes in other regions, 10 as explained above. Thus, relatively large areas are advantageous as long as they do not compromise the overall stability of the reference structures between serial images. Stable superimposition reference structures are usually a prerequisite for clinically relevant outcomes when assessing changes in other areas because of treatment, growth, or pathology. 5,9,10 So far, only 1 study 20 has compared different superimposition reference areas with a well-documented superimposition technique on the anterior cranial base. 3,35,36 The study showed a low risk of bias, but it had unclear applicability concerns because only young adolescents were assessed. 20 The findings of this study look promising, but further studies on different samples would be beneficial to confirm and generalize the results.
As an alternative to the more commonly applied surface-based registration, a landmark-based registration can be performed. The latter is simple to use and may show adequate reproducibility, especially if a considerable number of landmarks is used. 15,37 However, when few landmarks are used as a superimposition reference, small errors in landmark identification considerably affect the outcomes. 3 Conversely, as the number of landmarks increases, the technique becomes more complex and time-consuming. So far, only 1 study has tested 9 landmarks around the eyes, nose, and mouth as a reference for the superimposition of 3D facial photographs, 15 and it has important limitations in outcome assessment. Furthermore, the study did not test faces subjected to changes over time, but only faces changed because of facial expressions.
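The sensitivity of landmark-based registration to identification error can be sketched with a small simulation. It is an illustration under stated assumptions, not the protocol of the cited study: landmarks are scattered at random in a hypothetical mid-face box, the identification error is modeled as isotropic Gaussian noise with a 0.5 mm standard deviation, the face itself does not change, and the error is read at a hypothetical chin point. All names are illustrative.

```python
import numpy as np

def kabsch(source, target):
    """Least-squares rigid transform (R, t) mapping source onto target."""
    sc, tc = source.mean(axis=0), target.mean(axis=0)
    H = (source - sc).T @ (target - tc)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, tc - R @ sc

def chin_error(n_landmarks, sigma_mm, rng, trials=300):
    """Mean spurious displacement (mm) at the chin caused only by landmark noise."""
    chin = np.array([0.0, -60.0, 0.0])  # hypothetical target point
    errors = []
    for _ in range(trials):
        # True landmark positions in a hypothetical mid-face box (mm).
        true_lm = rng.uniform([-40, -30, -10], [40, 30, 10], size=(n_landmarks, 3))
        # Same landmarks as identified on the second image, with operator error.
        noisy_lm = true_lm + rng.normal(0.0, sigma_mm, size=true_lm.shape)
        R, t = kabsch(noisy_lm, true_lm)
        # The face did not move, so any shift of the chin is registration error.
        errors.append(np.linalg.norm(R @ chin + t - chin))
    return float(np.mean(errors))

rng = np.random.default_rng(0)
errors = {n: chin_error(n, 0.5, rng) for n in (3, 9, 30)}
for n, e in errors.items():
    print(f"{n:2d} landmarks: mean chin error {e:.2f} mm")
```

Under these assumptions, the spurious displacement at the chin shrinks as the number of landmarks grows, consistent with the trade-off between robustness and operator effort described above.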
Currently, there is only 1 study in the literature that tested a landmark-based technique compared with a surface-based technique. 6 The surface-based registration seemed to offer better accuracy than a reference frame created from identified landmarks. These findings are in agreement with previous evidence 3 supporting surface-based techniques, which are automated, and thus, less operator-dependent. In addition, the postprocessing capabilities of surface models are also higher because the congruence of a whole area, consisting of thousands of points, is used and assessed instead of a few arbitrarily selected landmarks.

Limitations
According to the inclusion criteria, only studies on 3D photographs or surface models of equivalent accuracy were considered. However, some studies tested other types of facial imaging, such as laser scans or Moiré fringe techniques. We decided to exclude these studies because these techniques are not realistic for daily clinical practice or for large numbers of patients, given the time-consuming acquisition and the reduced image quality. Furthermore, the relatively small number of included studies, the high heterogeneity, and the high risk of bias limited the possibilities for synthesis and precluded a meta-analysis of the results. Thus, no solid conclusions with broad applicability can be drawn at the moment.

CONCLUSIONS
The present study on the superimposition of serial 3D facial photographs revealed the limited amount and the low quality of evidence supporting these newly introduced but quite powerful techniques. Few relevant studies were identified, and most had a high risk of bias and applicability concerns. The heterogeneity in methods and outcomes was also high.
The limited available evidence suggests that surface-based registration is superior to landmark-based registration. A small rectangular area on the forehead and an area including the middle part of the nose and the lower wall of the orbital foramen showed promising results, comparable to the anterior cranial base superimposition. However, there is an urgent need for further studies in the field to confirm and generalize these findings.

AUTHOR CREDIT STATEMENT
Jonathan Johannes Wampfler contributed to data curation, formal analysis, original draft preparation, visualization, investigation, and manuscript review and editing; Nikolaos Gkantidis contributed to conceptualization, methodology, software, data curation, formal analysis, original draft preparation, visualization, investigation, supervision, validation, and manuscript review and editing.