lavaanExtra: Convenience Functions for Package lavaan

Statement of need
lavaan (Rosseel, 2012) is a very popular R package for structural equation modeling (SEM). The package relies on specific operators to define latent variables, regressions, covariances, indirect effects, and so on. However, some users (e.g., beginners to R and lavaan, and in some cases power users) may prefer not to specify the operators themselves, or would like to see some steps automated, such as generating the lavaan model layout or defining indirect effects. Furthermore, researchers can find it relatively difficult to extract relevant statistical output as tables and figures that are suitable for scientific publication.
lavaanExtra does mainly two things to address these issues. First, it offers an alternative, code-efficient, flexible, and modular syntax that automates certain steps, such as defining indirect effects in certain scenarios or the desired layout of an SEM model to be plotted (note, however, that lavaan itself is also compatible with a modular approach). Second, it facilitates the analysis-to-publication workflow by providing publication-ready tables and figures that follow the style requirements of the American Psychological Association (APA).

Usage
There is a single function at the center of the proposed alternative syntax, write_lavaan(). The idea behind write_lavaan() is to define individual components (regressions, covariances, latent variables, etc.), provide them to the function, and have it write the lavaan model, so that the user does not have to worry about making typos in the specific symbols required for each aspect of the model.
There are several benefits to this approach. Some lavaan models can become very large. By defining the entire model every time, as lavaan users typically do, not only do we break the DRY (Don't Repeat Yourself) principle, but our scripts can also become long and unwieldy. This problem gets worse when we want to compare several variations of the same general model. write_lavaan() allows the user to reuse code components, say, only the latent variables, in later models. This also gives the user better control over the code. If the user makes a mistake in a component that is repeated across, say, five SEM model definitions, it has to be corrected in all five places within the script. With write_lavaan(), users define the reusable component once and edit it in a single place whenever it needs to change.
The vector-based approach also allows the use of functions to define components. For example, if all scale items are named consistently, say x1 to x50, one can use paste0("x", 1:50) instead of typing all the items by hand and risking mistakes. Note, however, that building reusable components through functions is also possible with plain lavaan syntax.
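To illustrate the last point, here is a minimal base R sketch that generates item names with paste0() and assembles a plain lavaan measurement line from them. The helper make_latent() and the latent variable name "scale" are hypothetical and only serve the illustration; they are not part of lavaan or lavaanExtra.

```r
# Generate the 50 item names instead of typing them by hand
items <- paste0("x", 1:50)

# Hypothetical helper (not part of lavaan or lavaanExtra): builds one
# lavaan measurement-model line from a name and a vector of indicators
make_latent <- function(name, indicators) {
  paste0(name, " =~ ", paste(indicators, collapse = " + "))
}

# Use the first five items as indicators of a hypothetical latent variable
scale_def <- make_latent("scale", items[1:5])
cat(scale_def, "\n")
```

The resulting character string is ordinary lavaan syntax, so it can be reused across several model definitions.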
Another issue with lavaan models is the readability of the code defining the model. One can go to great lengths to make it pretty, but not everyone does, and people use different strategies to organize the information in the model definition. With write_lavaan(), not only is the model information standardized, but it is also neatly divided into clear and useful categories.
Finally, for beginners, it can be difficult to remember the correct lavaan symbols for each specific operation. write_lavaan() uses familiar names to convert the information to the correct symbols. Even for people familiar with lavaan syntax, this approach can save time. The function can also define the named paths automatically, with clear and intuitive names.
As a simple Confirmatory Factor Analysis (CFA) example, consider the HolzingerSwineford1939 dataset (Holzinger & Swineford, 1939), which contains the mental ability test scores of children. In this example, we want to define the latent variables visual (visual perception ability), textual (reading and writing ability), and speed (processing speed ability), which are defined by items x1 to x3, x4 to x6, and x7 to x9, respectively. We can then use the cat() function on the resulting object (of type character) to read it in the traditional way and make sure we have not made any mistakes.
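The CFA described above can be sketched as follows. This assumes that the lavaan and lavaanExtra packages are installed; the object names latent, cfa.model, and fit.cfa are chosen here for illustration.

```r
library(lavaanExtra)

# Latent variables and their indicators (x1 to x9 are columns of the
# HolzingerSwineford1939 dataset shipped with lavaan)
latent <- list(
  visual = paste0("x", 1:3),
  textual = paste0("x", 4:6),
  speed = paste0("x", 7:9)
)

# Write the lavaan syntax and read it in the traditional way
cfa.model <- write_lavaan(latent = latent)
cat(cfa.model)

# Fit the model with lavaan
fit.cfa <- lavaan::cfa(cfa.model, data = lavaan::HolzingerSwineford1939)
```

Printing the model with cat() before fitting makes it easy to confirm that each latent variable received the intended items.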
Should we want to use these latent variables in a full SEM model, we do not need to define the latent variables again, only the new components. In this second example, I add regressions, covariances, and indirect effects to the model. Two of our latent variables (textual and speed) are now predicted by our mediating variable, visual. In turn, visual is now predicted by our independent variables, grade (the students' grade) and ageyr (the students' age, in years).
With the lavaanExtra syntax, when defining our lists of components, we can think of the = sign as "predicted by", a bit like ~ for regression. The indirect object is an exception: it also allows us to specify our independent, mediating, and dependent variables directly. In that case, write_lavaan() defines all indirect paths automatically.
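A sketch of this SEM model is given below. It assumes that lavaanExtra is installed and reuses the latent variable list from the CFA example. The specific direct effects and covariances are assumptions made for illustration, since the text only states that such components are added.

```r
library(lavaanExtra)

# Reusable component: the same latent variables as in the CFA example
latent <- list(
  visual = paste0("x", 1:3),
  textual = paste0("x", 4:6),
  speed = paste0("x", 7:9)
)

# Roles of the variables in the mediation model
M <- "visual"
IV <- c("ageyr", "grade")
DV <- c("speed", "textual")

# Read "=" as "predicted by"
mediation <- list(speed = M, textual = M, visual = IV)

# Assumed for illustration: direct effects and covariances
regression <- list(speed = IV, textual = IV)
covariance <- list(speed = "textual", ageyr = "grade")

# Exception: variables are specified directly by role, and
# write_lavaan() generates the indirect paths itself
indirect <- list(IV = IV, M = M, DV = DV)

model <- write_lavaan(
  mediation = mediation,
  regression = regression,
  covariance = covariance,
  indirect = indirect,
  latent = latent,
  label = TRUE # name the paths so that indirect effects can be defined
)
cat(model)
```

Only the new components are defined here; the latent list is passed along unchanged.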

Tables
The nice_fit() function extracts some of the most popular fit indices and organizes them so that models are easy to compare. With the option nice_table = TRUE, the table is formatted as an APA flextable (Gohel & Skintzos, 2023) through the rempsyc package (Thériault, 2023). This flextable object can then be easily exported to Microsoft Word. Our two earlier models can be fitted and fed to nice_fit() as a named list. The table can then be saved to Word using flextable::save_as_docx() on the resulting flextable object.
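The workflow above can be sketched as follows, assuming that lavaan and lavaanExtra are installed. The list names CFA and SEM, the simplified SEM model, and the output file name are assumptions made for illustration. The flextable step is guarded because it also requires the rempsyc and flextable packages.

```r
library(lavaanExtra)

latent <- list(
  visual = paste0("x", 1:3),
  textual = paste0("x", 4:6),
  speed = paste0("x", 7:9)
)
IV <- c("ageyr", "grade")

# Two models sharing the same latent variables (SEM simplified here)
cfa.model <- write_lavaan(latent = latent)
sem.model <- write_lavaan(
  regression = list(speed = "visual", textual = "visual", visual = IV),
  latent = latent
)

dat <- lavaan::HolzingerSwineford1939
fit.cfa <- lavaan::cfa(cfa.model, data = dat)
fit.sem <- lavaan::sem(sem.model, data = dat)

# Named list: the names are assumed to serve as model labels
models <- list(CFA = fit.cfa, SEM = fit.sem)

# Fit indices as a regular data frame
fit_table <- nice_fit(models)
print(fit_table)

# APA-style flextable, exported to Word
if (requireNamespace("rempsyc", quietly = TRUE) &&
    requireNamespace("flextable", quietly = TRUE)) {
  fit_flex <- nice_fit(models, nice_table = TRUE)
  flextable::save_as_docx(
    fit_flex,
    path = file.path(tempdir(), "fit_table.docx") # hypothetical file name
  )
}
```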

Figures
There are several packages designed to plot SEM models, but few produce output that people consider satisfying or good enough for publication by default. Two packages stand out, however: lavaanPlot (Lishinski, 2021) and tidySEM (van Lissa, 2023b). Yet, even for those excellent packages, most people do not view the default output as publication-ready, or at least not as optimized as it could be.
This is what nice_lavaanPlot() and nice_tidySEM() aim to correct. Let's compare the default lavaanPlot() and nice_lavaanPlot() outputs side by side for demonstration purposes. Even so, nice_lavaanPlot() is not perfectly optimal for publication, for example because of its use of curved lines, which many researchers dislike. Nonetheless, it still yields excellent and satisfying results for a quick and easy check.
Finally, the base function, tidySEM::graph_sem(), is difficult to customize in depth. For the aesthetics of nice_tidySEM(), for example, we need to rely instead on tidySEM's prepare_graph(), edit_graph(), and numerous conditional formatting functions. In contrast to nice_tidySEM(), these tidySEM functions act more like a grammar of SEM plotting, akin to the popular grammar of graphics, ggplot2 (Wickham, 2016). This provides great flexibility but, for the occasional user, also comes with an additional burden: users may, for example, need to skim through almost 400 undocumented functions should they want to conditionally edit the resulting tidy_sem object.
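The two plots can be produced with a sketch like the following, which assumes that lavaan, lavaanExtra, and lavaanPlot are installed. The plotting calls are guarded because lavaanPlot is an optional dependency, and the object names p_default and p_nice are chosen for illustration.

```r
library(lavaanExtra)

# Refit the CFA model from the earlier example
latent <- list(
  visual = paste0("x", 1:3),
  textual = paste0("x", 4:6),
  speed = paste0("x", 7:9)
)
cfa.model <- write_lavaan(latent = latent)
fit.cfa <- lavaan::cfa(cfa.model, data = lavaan::HolzingerSwineford1939)

if (requireNamespace("lavaanPlot", quietly = TRUE)) {
  # Default lavaanPlot output
  p_default <- lavaanPlot::lavaanPlot(model = fit.cfa)

  # lavaanExtra wrapper with its own defaults
  p_nice <- nice_lavaanPlot(fit.cfa)

  # In an interactive session, print p_default and p_nice to view them
}
```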
by researchers (especially in psychology): (a) a horizontal, rather than vertical, layout; (b) the coefficients appear by default (but only significant ones); (c) significance stars; and (d) the use of a sans serif font (as required by APA style for figures).
Below, I provide the code necessary to reproduce this figure using the tidySEM package only.