ImageNomer: Description of a functional connectivity and omics analysis tool and case study identifying a race confound

Most packages for the analysis of fMRI-based functional connectivity (FC) and genomic data are used with a programming language interface, lacking an easy-to-navigate GUI frontend. This exacerbates two problems found in these types of data: demographic confounds and quality control in the face of high dimensionality of features. The reason is that it is too slow and cumbersome to use a programming interface to create all the necessary visualizations required to identify all correlations, confounding effects, or quality control problems in a dataset. FC in particular usually contains tens of thousands of features per subject, and can only be summarized and efficiently explored using visualizations. To remedy this situation, we have developed ImageNomer, a data visualization and analysis tool that allows inspection of both subject-level and cohort-level demographic, genomic, and imaging features. The software is Python-based, runs in a self-contained Docker image, and contains a browser-based GUI frontend. We demonstrate the usefulness of ImageNomer by identifying an unexpected race confound when predicting achievement scores in the Philadelphia Neurodevelopmental Cohort (PNC) dataset, which contains multitask fMRI and single nucleotide polymorphism (SNP) data of healthy adolescents. In the past, many studies have attempted to use FC to identify achievement-related features in fMRI. Using ImageNomer to visualize trends in achievement scores between races, we find a clear potential for confounding effects if race can be predicted using FC. Using correlation analysis in the ImageNomer software, we show that FCs correlated with Wide Range Achievement Test (WRAT) score are in fact more highly correlated with race. Investigating further, we find that whereas both FC and SNP (genomic) features can account for 10–15% of WRAT score variation, this predictive ability disappears when controlling for race. We also use ImageNomer to investigate race-FC correlation in the Bipolar and Schizophrenia Network for Intermediate Phenotypes (BSNIP) dataset. In this work, we demonstrate the advantage of our ImageNomer GUI tool in data exploration and confound detection. Additionally, this work identifies race as a strong confound in FC data and casts doubt on the possibility of finding unbiased achievement-related features in fMRI and SNP data of healthy adolescents.


INTRODUCTION
Functional magnetic resonance imaging (fMRI) uses the blood oxygen level-dependent (BOLD) signal to identify regions of increased neural activity. 1 Functional connectivity (FC) is an fMRI-derived measure that quantifies the synchronization between BOLD signal in different regions of the brain. 2It and similar measures have been used to predict age, 3 sex, 4 intelligence, 5 and disease status. 67FC-like measures derived from magnetoencephalography (MEG) have also been used for predicting age 8 and sex. 9Genomics such as single nucleotide polymorphisms (SNPs) can be used to make predictions that are much more accurate than those based on fMRI. 1011Use of genomics and fMRI together may give superior results. 12isting software packages for analysis of fMRI, FC, and FC-like measures such as partial correlation connectivity are either mostly text-based (programmatic interface) or have incomplete feature sets for identifying correlations in phenotypes (see Figure 1).For example, numpy, 13 PyTorch, 14 scikit-learn, 15 nilearn, 16 and nipype 17 are all powerful and popular Python-based toolkits that can be used to conduct neuroimaging research.
In fact, we use some of these packages as components in our ImageNomer software, but they all lack a graphical user interface that can speed up exploration of new datasets.Classic packages such as the Matlab-based BrainNet viewer 18 or GIFT toolbox, 19 although they have a GUI frontend, do not allow for analysis of correlations between phenotypes as well as between phenotypes and imaging features/SNPs.Additionally, a Matlab-based toolchain ties one's product to a proprietary and non-free dependency.Even more modern tools like COINSTAC 20 fall short because of a complicated user interface, lack of support for extremely high dimensional features, and a focus on federated learning which most neuroscientists do not need in their research.In contrast, ImageNomer focuses on data exploration by allowing correlation analysis of imaging, demographic, and genomic features and the creation of demographic-based subgroups.An overview of the ImageNomer architecture is shown in Figure 2.
Two problems with creating good, easy-to-use tools for analysis of fMRI-derived FC data are the high dimensionality of imaging features and the small effect sizes being measured.For example, Bennet et al. found that many effects found as marginally significant by standard analysis techniques are simply due to noise. 2122For many recent fMRI studies, high dimensionality of the data and small effect size is exacerbated by small cohort sizes, 23 with the average reproducible cohort size for an fMRI result being 36 subjects. 24Our ImageNomer software addresses these points by treating visualized FC matrices (see Figure 3) and FC/phenotype correlation maps (see, e.g., Figure 8) as the primary outcome of the analysis, allowing quick visual inspection of what would take a long time through a programming interface.Cognizant of the high dimensionality of imaging features, we also perform Bonferroni-type multiple comparison correction in all FC-phenotype and SNP-phenotype correlation analysis.This does a lot to avoid the dead-salmon effect found by Bennet et al. 22 To demonstrate the utility of our developed ImageNomer tool, we use its data visualization and correlation abilities to quickly and easily identify a race confound in FC data.Specifically, we find that the high correlation between FC and race and the unequal distribution of achievement scores among races makes it appear that FC can predict achievement score, when our work shows it is more likely due to a confound.Many studies have used FC features to predict scholastic achievement, as measured by, e.g., WRAT score, 25 explaining 10% of the variance in a population 3 or achieving a small correlation with ground truth of ρ ≈ 0.3. 26We show, however, that the FC feature to WRAT score correlation is probably due to a confounding effect of race on FC.Indeed, previous studies have shown that AI models can sometimes trivially detect and be confounded by race, 27 and recent work has suggested that race can confound FC-based prediction of behavior. 28In this work, we use ImageNomer to identify a confound in FC, and find that this race confound is primarily responsible for any ability to predict WRAT score from FC.The utility of a tool like ImageNomer is validated by speed with which we find the race confound using a GUI toolkit, while numerous groups continue to search for achievement-based features, presumably using programmatic interfaces. 326 summary, correlation analysis can give a quick overview of the data, and subject-level or cohort-level views can be instrumental for quality control.This is the reason we have developed ImageNomer, a visualization and analysis tool for connectivity-based fMRI and omics studies.The tool enables rapid correlation analysis as well as the comparison of features from outside models in a convenient browser-based user interface.Additionally, we include the ability to analyze distribution of phenotypes.Indeed, we find that correlation analysis is sufficient to quickly and clearly identify the confounding effect of race on WRAT score found in our study.Our code, as well as a Docker image and a live on-line demo, has been released and are available via links on our GitHub page.*

Architecture
ImageNomer is made up of three components (see Figure 2): • a Python backend which integrates with available libraries such matplotlib, scikit-learn, and nilearn • a Flask server that handles requests from the browser-based UI to the backend • a Vue frontend which provides an interactive user experience from within the browser A web-based user interface allows quick navigation around a cohort as well as the creation of summary graphs and correlation analyses.The main FC view is shown in Figure 3.The data being explored is stored locally in the server component, while the Python backend allows integration with standard libraries such as nilearn, scipy, numpy, and matplotlib.The matplotlib backend is used to generate all graphs on the backend, which are sent to the frontend as images.The Vue frontend allows for modularity of UI components, provides a library of pre-built widgets via Vuetify, and enables easy-to-code interactivity.

Software features
ImageNomer has the following capabilities: • Examine individual subject FC and partial correlation-based (PC) connectivity • Display distributions of phenotypes • Correlate phenotypes with phenotypes, FC/PC features with phenotypes, and SNPs with phenotypes • Display p-value maps for correlations • Perform math on images • Display components for FC decompositions (such as PCA) • Correlate decomposition components with phenotypes or SNPs • Display weights from machine learning models • Summarize and average weights from multiple models Future work with fMRI will likely require summarizing connectivity patterns into discrete network contributions. 29In the future, we plan to expand ImageNomer's capabilities for summary measures and dictionary learning.

Live web demo, Docker images, and tutorial
We have created a live on-line demo † and a Docker image containing an example open-access dataset of Fibromyalgia patients.This dataset is available as accession number ds004144 from OpenNeuro. 3031Instructions for using the Docker images, as well as a tutorial based around the Fibromyalgia dataset, can be found online.‡ The tutorial goes through step-by-step each of the major functions of ImageNomer, with instructions and screenshots of the expected output.Unfortunately, NIH data access policy precludes us from making the PNC or BSNIP data available publicly.If you are an approved researcher, we would be happy to work with you regarding functions, e.g., SNPs, which are not found in the Fibromyalgia dataset.
The easiest way to use ImageNomer is by mapping a directory on your local machine containing a "demographics.pkl"file and an "fc" subdirectory into the Docker image when starting the container.§ We provide a second preprocessed OpenNeuro dataset ds004775 32 dealing with Vicarious Punishment in our GitHub along with a tutorial on how to map it to the Docker container, similar to what one would do for one's own dataset.
Our GitHub repository also contains a notebooks folder with Jupyter notebooks that shows step-by-step how the Fibromyalgia and Punishment data were preprocessed in manner suitable for ImageNomer, starting from a CSV/TSV file and BOLD timeseries.
Alternatively, ImageNomer can be used by cloning our GitHub project and installing the Python requirements via pip.However, the use of Docker images, via instructions found on our GitHub (https://github.com/TulaneMBB/ImageNomer) is the easiest method.Docker images have been built for the amd64 and arm64 architectures; check the documentation for how to use the right version for you.If one is interested in editing the code, it is split into the "backend" and "frontend" directories.The "backend" directory contains Python modules and does the heavy lifting with respect to data loading, image generation, and correlation analysis.The "frontend" directory contains a Vue javascript project that handles the browser-based interface and keeps track of most session state.Individual parts of the web-interface are built as Vue components.
2.4 Case study on ethnicity confound in FC and its impact on achievement score prediction As a demonstration of the power of ImageNomer's GUI in quickly identifying trends, confounds, and correlations in data, we give a case study of using ImageNomer to identify an under-reported race confound present in FC.
As discussed in the Introduction, many groups have tried to predict achievement score or similar metrics from FC.Using ImageNomer, we find that there is a large difference between achievement scores among races.This can potentially lead to a confounding effect if race can be predicted from FC.We also find a high correlation between race and certain FC regions, making us suspect that race-from-FC prediction is possible.We then verify the presence of this confound by performing regression on the whole cohort vs. within-ethnicity subsets.We learn that quick and dirty data exploration may save a lot of time trying to look for FC correlates of cognition that may or may not be there.

PNC dataset
We tested ImageNomer by using it to examine the large Philadelphia Neurodevelopmental Cohort (PNC) dataset. 3334The PNC dataset contains fMRI scans, SNP information, cognitive batteries, questionnaires, and phenotype data from healthy adolescents between 8-23 years old.The dataset is enriched for European Ancestry (EA) and African Ancestry (AA) races.It contains fMRI scans for 1,445 healthy adolescents and SPN data for more than 9,267.We chose an 830-subject subset of the data which included subjects with SNP information as well as resting state (rest), working memory (nback), and emotion identification (emoid) scanner task fMRI scans.
Scholastic achievement and problem-solving ability was measured by Wide Range Achievement Test (WRAT) score 25 with the effects of age regressed out.A total of three fMRI tasks were acquired: resting state, working memory, and emotion identification.An example of parcellation along with mean FC in the PNC dataset is shown in Figure 4.The acquisition parameters for fMRI and FC preprocessing have been described elsewhere. 3Ps were collected using one of eight different platforms, with the largest set containing 1,185,051 SNPs.33 We selected a subset of 10,433 SNPs that were found in at least 100 subjects in the cohort for our analysis.SNPs were categorized by haplotype as homozygous minor variant, heterozygous, and homozygous major variant.
Missing values for subjects were set to zero for all haplotypes.

BSNIP dataset
Robustness of race prediction from FC was tested by using an independent dataset to validate models trained on PNC.The dataset used was the Bipolar and Schizophrenia Network for Intermediate Phenotypes (BSNIP) dataset of 933 patients, 1059 relatives, and 459 healthy controls. 36fMRI scans were acquired over 6 different sites, and acquisition and preprocessing are described elsewhere. 37For validation of race prediction we chose a subset of 387 African Americans (AA) and 778 Caucasians (CA), both patients and healthy controls, for whom we had fMRI scans.
using 264 region Power template mean FC of subjects in dataset

RESULTS AND DISCUSSION
We first present our exploration of the potential race confound in achievement score prediction from FC using ImageNomer.Based on analysis with ImageNomer, we hypothesize that due to the high correlation of FC with race and the obvious difference in achievement scores between races, prediction of achievement score from FC is solely due to a race confound.We then corroborate our hypothesis by using whole cohort and within-ethnic group regression models.Note that all Figures presented in this section are screenshots from the ImageNomer program.

Data exploration of possible race confound with ImageNomer
We first confirm that age and sex are not possible confounding factors with respect to achievement score prediction.This is illustrated in Figures 5 and 6, where we see equal distributions of WRAT score among males and females and no correlation with age (raw WRAT score has been corrected for age).
Next, we use the group creation capabilities of ImageNomer to create two groups: European Ancestry (EA) and African Ancestry (AA) groups.We then compare the WRAT score distribution between the two groups, illustrated in Figure 7.We find that here there is a clear difference in achievement score distribution, leading to the possibility of a confounding effect if there is a race signal present in FC.
Using the FC-to-phenotype correlation feature of ImageNomer, we explore whether there is correlation between race and FC.In Figure 8, we find that there is a large and significant correlation between race and FC.We perform the same analysis for SNPs, with the caveat that sex and race can be perfectly predicted using SNPs, and that SNP data does not contain any age-related signal.Nevertheless, we still attempted to find whether there was a suggestive overlap between SNPs correlated with race and SNPs correlated with high or low WRAT score.Our results are shown in Figure 9. From a total of 10,433 SNPs found in 100 or more subjects, we identified the top 20 SNPs correlated with race and achievement score.Of these top 20, six appeared in both the highly WRAT-correlated and highly race-correlated batches.
In summary, we use ImageNomer to form the hypothesis that race may bias FC-based prediction of achievement score if there is a race signal present in FC.We then find, using ImageNomer, that the race signal present in FC is in fact stronger than the signal for achievement score, and that WRAT score to FC correlation is a subset of race-FC correlation.Finally, we draw the same conclusion in SNP to race and SNP to achievement score correlation.In the next section, we describe the use of regression models to validate our hypothesis.

Validation with regression models
We validate the qualitative results from ImageNomer's data visualization and exploration capabilities with train/test regression models.We use regularized Ridge and Logistic Regression models with an 80/20 traintest split and 20 bootstrapping repetitions to predict age, sex, race, and WRAT score.Additionally, we predict WRAT score in whole cohort as well as within intra-ethnicity groups.The results are shown in Table 1.
The results are as follows: age, sex, and race can all be modestly well predicted using FC.WRAT score can be predicted, although at a barely significant level, using the whole cohort, with both FC and/or SNPs as input.However, any ability to predict WRAT score disappears in race-controlled (within ethnicity) groups.This validates our hypothesis, formulated with ImageNomer via data exploration, that FC features used to predict achievement score are actually predicting ethnicity instead.
Next, we confirm the stability of race signal in FC by using both ImageNomer and transfer learning of regression models to find that race signal is at least somewhat conserved between the PNC and BSNIP datasets.
Finally, we consider the effect of socioeconomic status (SES) as another potential confound besides race in predicting achievement score from FC.

Transfer of race prediction models between PNC and BSNIP datasets
We show screenshots of ImageNomer-based data exploration for FC correlation with race in the PNC and BSNIP datasets in Figure 10.This figure highlights the fact that both datasets have similar correlations between specific FCs and race.We confirm the ImageNomer-based hypothesis with results for transfer of race prediction models between the PNC and BSNIP datasets, shown in Table 2.A Logistic Regression model trained on the PNC dataset was able to predict race in the BSNIP dataset with an average accuracy of 68%.When trained on BSNIP and evaluated on PNC, the average prediction accuracy was 66%.We find that the prediction is less   good than within-dataset prediction, although still better than chance.It should be taken into account that the PNC dataset is made up of healthy adolescents, while the BSNIP dataset contains schizophrenia and bipolar patients, relatives of patients, and healthy controls.Figure 10 shows a comparison between race correlation and FC in the PNC and BSNIP datasets, created using ImageNomer.

Effect of socioeconomic status (SES) explored with ImageNomer and regression models
We consider socioeconomic status (SES) as another confounding factor when predicting scholastic achievement based on a standardized test, with the majority of analysis again carried out using ImageNomer.Predictive models were only used to validate the conclusions made using data exploration in ImageNomer.
A problem is that SES was not directly measured in the PNC study, in that the income of family groups  was not known.However, previous studies have used parental education levels as a proxy for SES, 38 and this information was included in the PNC dataset.Indeed, as seen in Figure 11, we find that SES, race, and WRAT score are all inter-related.We see that non-EA ethnicity tend to have lower SES as measured by mother education level.The correlation of father education level with FC was similar to mother education, although less significant.
It should be noted that many children had missing values for father education level.
We see in Figure 11 that SES, ethnicity, and WRAT score correlate to similar regions on the FC map.The p-values associated with FC-SES correlation (as measured by mother education) are somewhat lower than those associated with FC-WRAT or FC-ethnicity correlation, but still significant.Additionally, performing regression analysis for WRAT score based on FC in low SES (mother education ≤ 12 years) and high SES (mother education ≥ 14 years) groups, we find a barely significant ability to predict achievement in the low SES group but not a significant ability in the high SES group.The results are shown in Table 3.The WRAT prediction accuracy in the low SES group is worse than in the cohort as a whole (compare 13.6 RMSE vs 14.1 RMSE).We conclude that SES is a confounding factor in WRAT score prediction, though not as severe as ethnicity in this dataset.
Finally, we reject the idea that race is a causal factor in achievement or WRAT score, but only point out the potential confounding effects if race or SES is not taken into account in studies seeking to find markers of high or low achievement, and the ease with which such confounds were found using ImageNomer.

CONCLUSION
We present ImageNomer, a new fMRI and omics visualization and analysis tool.We note that most of the figures shown in this manuscript were created as screenshots of the working tool.We use this tool to examine the large PNC dataset and discover features important for phenotype prediction.As validation for ImageNomer-based correlation analysis, we find that age, sex, and race can be moderately well predicted by FC features, with 10 FC features giving up to 72% race prediction accuracy, compared with 85% for the full model.
We find both FC features and SNPs can somewhat predict scholastic knowledge and problem-solving ability, as measured by WRAT score, but that this is probably due to a race confound.When controlling for race, FCbased achievement score prediction drops to the same accuracy as the null model and the SNP-based prediction becomes statistically insignificant.We conclude that, on average, the effect of either SNPs or FC features on scholastic achievement in normal children is very small, if one exists at all.Additionally, we find that race prediction from FC is at least somewhat robust between different datasets.Using ImageNomer, this work quickly and easily identifies race as an important confounding factor in FC and casts doubt on the ability to predict achievement-related features from both FC and SNP data.
Finally, we note that it is very easy to add additional datasets to explore into the ImageNomer program.To do so, follow the links given in the footnotes in Section 2.3 and read the corresponding instructions.Doing so requires following a Jupyter notebook, but once data is loaded into ImageNomer, it can be explored without writing any additional code.We find the ability to quickly visualize trends, correlations, and potential confounding effects provided by the ImageNomer software is invaluable to the ability to perform good and careful research.This is demonstrated by the rapid identification of a race confound on FC-based prediction of achievement using ImageNomer, despite the fact that many studies have attempted to predict achievement using FC with little or no mention of this effect. 25326

Figure 1 .
Figure 1.Comparison of existing toolkits for analysis of fMRI-based FC data with our ImageNomer software.A more comprehensive list may be found at https://en.wikipedia.org/wiki/List_of_functional_connectivity_software.

Figure 3 .
Figure 3. Main view of the ImageNomer program showing resting state FC for all subjects along with demographic data.

Figure 4 .
Figure 4. Mean FC in the PNC dataset along with the Power264 template 35 regions used to sample BOLD signal from brain regions.

Figure 5 .Figure 6 .Figure 7 .
Figure 5. Demographics of our subset of the PNC dataset.Plots of age vs sex, WRAT vs sex, and WRAT vs age are shown.All plots created using the GUI of ImageNomer, without programming input.

FCFigure 8 .Figure 9 .
Figure 8. Correlation between race and FC is much higher than the correlation between race and WRAT score.Additionally, in almost all regions, achievement score-correlated FC is a subset of race-correlated FC.SNPs Highly Correlated with WRAT Score SNPs Highly Correlated with Ancestry

Furthermore, in
the same Figure, we show that the smaller correlation between WRAT score and FC is actually a subset of the race-FC correlation.

Figure 10 .
Figure 10.Correlation of FC with race in the PNC and BSNIP datasets.Although BSNIP p-values are lower, we see the same pattern of correlation.Additionally, the overall FC in the Default Mode Network (DMN), highlighted in pink, seems to be highly predictive of race.

Figure 11 .
Figure 11.Top: distribution of mother and father education level by race category and distribution of WRAT score with mother education level.Bottom: correlation of FC with mother education level and WRAT score.Correlation consists of a correlation image and the negative log base 10 of the Bonferroni-corrected p-value, clipped at -5.

Table 1 .
Summary of prediction results for full models (34,716 features for FC, 10,433 features for SNPs) and top 10 feature models in the PNC dataset.Top 10 features selected on the training set.Statistically significant results are shown in bold.

Table 2 .
Accuracy of transfer learning between the PNC and BSNIP datasets.All predictions are better than the null model, except for identification of the AA group in the PNC dataset by a model trained on BSNIP.

Table 3 .
Prediction of WRAT score in the low SES (mother with no college education) versus high SES (mother with some college education).Predictive ability was barely significant in the low SES group, and not significant in the high SES group.