EFAshiny is a user-friendly application for exploratory factor
analysis (EFA; Bartholomew, Knott, & Moustaki, 2011). Its graphical user
interface, built with shiny (Chang, Cheng, Allaire, Xie, & McPherson, 2017),
frees users from scripting in R by wrapping together various
packages for data management, factor analysis, and graphics.
It provides an easy-to-follow analysis flow and reasonable default
settings that avoid common errors (Henson & Roberts, 2006). Results are
presented in the application as tables and graphs and can be exported.
Key features include:
- An easy-to-use GUI to free users from scripting in R
- A step by step analysis flow to perform EFA
- Quick ways to summarize data by tables or graphs
- Several ways to explore factor retention numerically or graphically
- Several ways to explore factor extraction and rotation numerically or graphically
- A display of confidence intervals for factor loadings
- Several ways to link visualization of correlation matrix with factor structure
- Default options are chosen according to recommendations in the literature
- A demonstration using a real psychological scale dataset
The EFAshiny application is primarily aimed at behavioral researchers
who want to perform EFA on a set of associated variables (e.g., an
item-level scale dataset). It can also be used to explore
FA-based connectivity analyses (McLaughlin et al., 1992) in instrument
data, such as event-related potentials (ERPs) and functional
near-infrared spectroscopy (fNIRS) data. Although the major focus of
EFAshiny is EFA, confirmatory factor analysis (CFA) is a useful future
direction for the shiny app.
To run EFAshiny in R, the devtools and shiny packages are required.
install.packages("devtools")
install.packages("shiny")
Install and launch EFAshiny:
devtools::install_github("PsyChiLin/EFAshiny")
EFAshiny::EFAshiny()
If you want to use the standard version of EFAshiny, installation is
not required. The application is deployed on the shinyapps.io server.
This standard version has all the functions except the Editor tab
(which is only useful for users who want to code online). Users can
easily explore and analyze their data with this online app without
worrying about installation.
Have fun with EFAshiny:
https://psychilin.shinyapps.io/EFAshiny/
EFAshiny adopts exploratory factor analysis (EFA; Bartholomew, Knott,
& Moustaki, 2011) as its major procedure. EFA is a widely used method for
investigating the underlying factor structure that explains the
correlations in a set of observed indicators. It is useful in many
situations: for example, it can be used to conceptualize new constructs,
to develop instruments, to select items for a short-form scale, or to
organize observed variables into meaningful subgroups. The major
procedures of EFA include calculating correlation coefficients,
determining the number of factors, extracting the factors, and rotating
the factors. In addition to these steps, the data should be explored
before EFA is run, and interpreting the results afterwards is also an
important step. Because EFA helps account for the relationships among
numerous variables, its use has permeated fields from psychology to
business, education, and clinical domains.
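The four procedures named above can be sketched in a few lines of base R. This is only an illustration under stated assumptions, not EFAshiny's own code: the data are simulated, the object names (`items`, `R`, `fit`) are made up for the example, and `factanal()` uses maximum likelihood extraction rather than the application's default.

```r
# Simulated stand-in data: 300 respondents, 6 items, 2 latent factors.
# Nothing here comes from EFAshiny itself; all names are illustrative.
set.seed(1)
n  <- 300
f1 <- rnorm(n)
f2 <- rnorm(n)
items <- data.frame(
  x1 = 0.8 * f1 + rnorm(n, sd = 0.6),
  x2 = 0.7 * f1 + rnorm(n, sd = 0.6),
  x3 = 0.6 * f1 + rnorm(n, sd = 0.6),
  y1 = 0.8 * f2 + rnorm(n, sd = 0.6),
  y2 = 0.7 * f2 + rnorm(n, sd = 0.6),
  y3 = 0.6 * f2 + rnorm(n, sd = 0.6)
)

# Step 1: correlation coefficients
R <- cor(items)

# Step 2: a first look at the number of factors via the eigenvalues of R
eigenvalues <- eigen(R)$values
print(round(eigenvalues, 2))

# Steps 3 and 4: extraction (maximum likelihood) and oblique rotation
fit <- factanal(items, factors = 2, rotation = "promax")
print(fit$loadings, cutoff = 0.3)
```

The eigenvalues only give a rough first impression; the retention criteria described later are the more careful way to choose the number of factors.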
When you open EFAshiny, the interface will be shown.
- Upper Panel: The upper panel shows 7 main tabs for the EFA procedure. The order of the tabs from left to right is the suggested flow. Users can switch between steps of the EFA by simply clicking the tabs.
- Left Panel: The left panel is used to control the analysis setting or change the arguments.
- Right Panel: The right panel displays the results, tables and figures.
In the Introduction tab, you can see the main features of EFAshiny,
a demo figure, and some key references.
The data sets that call for EFA are typically in
a wide format, i.e., one observation per row.
They consist of responses to one or more psychometric tests on a
Likert scale.
In the Data Input tab, users can upload the data.
- Upload data-file: Users can upload their data by browsing their computer.
- Data Format: Two kinds of data can be uploaded, including csv and txt.
- Header of variable: Users can choose whether their data have variable names or not.
- Type of Data: Two data types for EFA are available: the typical subject-by-variable raw data and a correlation matrix.
- Variables to include: Users can choose the variables they want to include in the further steps. Simply delete a variable name from the console to exclude it.
If no data is uploaded, EFAshiny will use the Rosenberg Self-Esteem
Scale dataset to perform the default demonstrations.
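For readers who want to reproduce the upload options in a script, the following is a minimal sketch in base R. The CSV content and the column names `Q1` to `Q4` are hypothetical, and this is not how EFAshiny reads files internally.

```r
# Hypothetical CSV content standing in for an uploaded file.
csv_text <- "Q1,Q2,Q3,Q4
3,4,2,1
4,4,1,2
2,3,3,3
3,3,2,2"

# header = TRUE corresponds to the "Header of variable" option;
# read.csv() assumes comma-separated values.
dat <- read.csv(text = csv_text, header = TRUE)

# "Variables to include": keep only the columns wanted for later steps.
keep <- c("Q1", "Q2", "Q3")
dat  <- dat[, keep, drop = FALSE]
str(dat)
```

For a whitespace- or tab-delimited txt file, `read.table()` with a suitable `sep` argument plays the same role.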
After uploading the data, exploratory data analysis should be
conducted.
In the Data Summary tab, three types of explorations are
provided.
- Numeric Statistic: The first to fourth moments of each variable are calculated and printed automatically, with no arguments to set. The median and MAD are provided as well.
- Histogram: Histograms show the number of observations at each point of the Likert scale (e.g., 1 to 4 points), giving the distribution of each variable.
- Density Plot: Density plots are provided. Users can visualize the distribution of each item using the histograms and density plots. Note that the histograms and density plots are generated using the plotly package, so they are interactive. Try it with some clicks!
- Correlation Matrix: A bird's-eye view of the pairwise correlations between variables is illustrated.
- Type of correlation: Tetrachoric correlations can be used for dichotomous variables, and polychoric correlations can be used for ordinal variables. The default is Pearson's correlation coefficient.
- ggcorrplot: In addition to the Correlation Matrix tab, which uses the corrplot package, we also provide a ggcorrplot version. Have fun with those plots and build some intuition.
Note that the correlation matrix is the basis of EFA, a procedure that aims to uncover the underlying structure from the correlations between variables, so both calculating and visualizing the correlation matrix are important.
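The numeric summaries and the default correlation matrix described above can be approximated in base R as follows. This is a sketch on simulated Likert data: the helper name `describe_item` is made up for the example, and the moment definitions used here (population moments for skewness and kurtosis) are an assumption that may differ in detail from what the application reports.

```r
set.seed(2)
# Simulated 4-point Likert responses (hypothetical: 200 respondents, 3 items)
dat <- data.frame(
  Q1 = sample(1:4, 200, replace = TRUE, prob = c(0.1, 0.2, 0.4, 0.3)),
  Q2 = sample(1:4, 200, replace = TRUE),
  Q3 = sample(1:4, 200, replace = TRUE, prob = c(0.4, 0.3, 0.2, 0.1))
)

# First to fourth moments, plus median and MAD, for one numeric vector
describe_item <- function(x) {
  m <- mean(x)
  s <- sqrt(mean((x - m)^2))           # population SD, used to standardize
  c(mean     = m,
    variance = var(x),
    skewness = mean((x - m)^3) / s^3,
    kurtosis = mean((x - m)^4) / s^4,  # equals 3 for a normal distribution
    median   = median(x),
    mad      = mad(x))
}

# One row per item, one column per statistic
item_summary <- t(sapply(dat, describe_item))
print(round(item_summary, 2))

# Pearson correlation matrix, the default input to EFA
R <- cor(dat, method = "pearson")
print(round(R, 2))
```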
One of the central ideas of EFA is to represent a set of observed
variables by a smaller number of factors. Thus, selecting how many
factors to retain is a critical decision.
In the Factor Retention tab, a set of indices for determining the
number of factors is provided.
- Scree Plot and Parallel Analysis: The scree plot (Cattell, 1966) and parallel analysis (Horn, 1965) are two popular methods for determining the number of factors.
- Quantile of Parallel analysis: Mean, 95th-, and 99th-percentile eigenvalues of random data can be used as criteria.
- Number of simulated analyses to perform: Users can run more simulations to obtain more reliable results. The default of 200 is usually sufficient.
- Numeric Rules: Very simple structure complexity (VSS), Velicer's minimum average partial (MAP; Velicer, 1976) test, RMSEA, BIC, and SRMR are also provided as objective numeric rules.
- Max Number of Factor For Estimation: Users should define the maximum number of factors to estimate. It should be larger than the hypothesized number.
- Exploratory Graph Analysis (EGA): EGA is a new approach for retaining factors, based on the graphical lasso with the regularization parameter specified using EBIC (Golino & Epskamp, 2017).
- Number of simulated analyses to perform: Users can run more simulations to obtain more reliable results. Note that too many simulated analyses will slow down the EGA.
- Summary: We provide a simple summary of all these methods. Users can easily decide on the number of factors according to the summary.
In addition, Sample Size is another option that lets users validate the
factor retention results by randomly adjusting the sample size.
Although users still have to decide on the number of factors
themselves, EFAshiny provides several indices without requiring users to
worry about implementing the methods.
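To make the parallel analysis criterion concrete, here is a minimal base R sketch of the component-based variant: eigenvalues of the observed correlation matrix are compared with a chosen quantile of eigenvalues from random normal data of the same size. The function name `parallel_analysis` and its arguments are hypothetical, the data are simulated, and the application's own implementation may differ (for example, in how the random data are generated or in using factor rather than component eigenvalues).

```r
set.seed(3)
# Simulated data with two underlying factors (hypothetical stand-in data)
n  <- 300
f1 <- rnorm(n)
f2 <- rnorm(n)
items <- cbind(
  0.8 * f1 + rnorm(n, sd = 0.6), 0.7 * f1 + rnorm(n, sd = 0.6),
  0.6 * f1 + rnorm(n, sd = 0.6), 0.8 * f2 + rnorm(n, sd = 0.6),
  0.7 * f2 + rnorm(n, sd = 0.6), 0.6 * f2 + rnorm(n, sd = 0.6)
)

parallel_analysis <- function(x, n_sim = 200, quantile_level = 0.95) {
  observed <- eigen(cor(x))$values
  # Eigenvalues from random normal data with the same dimensions as x;
  # the result has one row per eigenvalue position, one column per simulation
  simulated <- replicate(n_sim, {
    random <- matrix(rnorm(nrow(x) * ncol(x)), nrow = nrow(x))
    eigen(cor(random))$values
  })
  threshold <- apply(simulated, 1, quantile, probs = quantile_level)
  # Retain factors up to the first eigenvalue that fails to beat its threshold
  exceeds  <- observed > threshold
  n_retain <- if (all(exceeds)) length(observed) else which(!exceeds)[1] - 1
  list(observed = observed, threshold = threshold, n_retain = n_retain)
}

pa <- parallel_analysis(items)
print(round(cbind(observed = pa$observed, threshold = pa$threshold), 2))
print(pa$n_retain)
```

Here `n_sim` mirrors the "Number of simulated analyses to perform" option and `quantile_level` mirrors the 95th-percentile choice; using `mean` in place of `quantile` would give the mean criterion.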
The major step of EFA is to extract and rotate the factor structure
and then estimate the factor loadings.
In the Extraction and Rotation tab, several factor extraction and
rotation methods are available, and bootstrapping for estimating
confidence intervals of factor loadings is also provided to aid
interpretation.
- Factor Extraction Methods: Available methods include the principal axes method (PA), maximum likelihood method (ML), minimum residual method (minres), weighted least squares (WLS), generalized weighted least squares (GLS), and so on. The default option is PA, which has a long history and performs well in psychological studies.
- Rotation Methods: The objective of factor rotation is to obtain a simple structure for better interpretation. Both orthogonal (e.g., the varimax method) and oblique rotations (e.g., the promax method) are available. Using oblique rotations is recommended.
- Number of Bootstraps: With bootstrap resampling, users can obtain interval estimates rather than point estimates. The number of bootstrap samples can be changed based on users' needs.
By providing plenty of factor extraction methods, rotation methods, and
useful interval estimates of factor loadings, EFAshiny is not only
helpful for EFA newcomers but also flexible enough for experienced EFA
users.
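The idea behind bootstrapped confidence intervals for loadings can be sketched in base R. To keep the example short it makes several simplifying assumptions that are not taken from EFAshiny: the data are simulated, a one-factor model is fitted with `factanal()` (maximum likelihood, not the PA default described above), and percentile intervals are used. The helper `one_factor_loadings` is a hypothetical name.

```r
set.seed(4)
# Simulated one-factor data (hypothetical): 250 respondents, 5 items
n <- 250
f <- rnorm(n)
items <- sapply(c(0.8, 0.7, 0.6, 0.5, 0.4),
                function(l) l * f + rnorm(n, sd = 0.7))
colnames(items) <- paste0("item", 1:5)

# Loadings of a one-factor ML solution, sign-aligned so that the first
# item loads positively (the sign of a factor is otherwise arbitrary)
one_factor_loadings <- function(x) {
  l <- unclass(factanal(x, factors = 1)$loadings)[, 1]
  if (l[1] < 0) -l else l
}

point_estimate <- one_factor_loadings(items)

# Refit the model on resampled rows; one column per bootstrap sample
n_boot <- 200
boot_loadings <- replicate(n_boot, {
  resample <- items[sample(n, replace = TRUE), , drop = FALSE]
  one_factor_loadings(resample)
})

# Percentile 95% confidence interval for each loading
ci <- t(apply(boot_loadings, 1, quantile, probs = c(0.025, 0.975)))
result <- cbind(loading = point_estimate, ci)
rownames(result) <- colnames(items)
print(round(result, 2))
```

With more than one factor, bootstrap solutions also need to be aligned across factor order and rotation before intervals are computed, which this sketch deliberately avoids.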
For EFA results, the fundamental visualization is a plot of the
relationships between factors and indicators.
In the Diagram tab, a path diagram is provided using the psych R package
(Revelle, 2017).
In this diagram, all factors and indicators are represented as bigger
or smaller nodes, and all loadings with absolute values greater than a
threshold (e.g., 0.3) are represented as lines.
Through these graphical representations with flexible plotting options,
users can easily understand the factor structure.
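The thresholding rule that decides which lines appear in such a diagram is simple to express in R. The loading matrix below is hypothetical and the code only lists the paths that would be drawn; it does not reproduce the psych package's plotting.

```r
# Hypothetical loading matrix for four items on two factors
loadings <- matrix(
  c( 0.72,  0.10,
     0.65, -0.05,
     0.20,  0.58,
    -0.35,  0.61),
  ncol = 2, byrow = TRUE,
  dimnames = list(paste0("item", 1:4), c("F1", "F2"))
)

threshold <- 0.3

# Row and column positions of loadings whose absolute value exceeds the threshold
idx <- which(abs(loadings) > threshold, arr.ind = TRUE)

# One row per path that would be drawn: factor, item, and loading
paths <- data.frame(
  factor  = colnames(loadings)[idx[, "col"]],
  item    = rownames(loadings)[idx[, "row"]],
  loading = loadings[idx]
)
print(paths)
```

Note that item4 keeps two paths here because its cross-loading on F1 is larger than 0.3 in absolute value; raising the threshold would hide it.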
In the Factor Loadings tab, EFAshiny provides useful visualizations of
factor loadings to facilitate proper interpretation of the extracted
factors.
- Bootstrapping Factor Loadings: A table of EFA loadings is presented graphically. Loadings are represented as bars and conditioned on one or more factors. To enhance interpretability at a glance, positive and negative loadings are shown in different colors. The greater the loading, the deeper the color. Confidence intervals of the factor loadings are visualized to provide a quick and useful understanding.
- Factor Loadings and Correlation Matrix: The plot combines the original correlation matrix of the dataset with a stacked bar graph of the factor loadings so that users can easily compare them.
- SE and Factor Loadings: This plot presents a comparison figure for the issue reported by Zhang and Preacher (2015): oblique CF-varimax and oblique CF-quartimax rotations produced similar point estimates but different standard error estimates. Users can observe whether the phenomenon exists in their empirical dataset.
In addition to a table of loadings for the EFA results, these visualizations give users the whole picture of the EFA results.
We summarize, in six concrete steps, the flow that EFAshiny provides for
performing EFA.
1. Read the data and review it on the main console. Select which variables should be included in further analysis.
2. Explore the data. For each item, users can examine its numeric statistics, distribution, and correlation patterns.
3. Use multiple criteria to determine the number of factors.
4. Perform EFA. Input the number of factors decided in step 3. The table of EFA results will be presented, including loadings, confidence intervals, and correlations between factors.
5. Visualize the results. Three kinds of plots are shown by EFAshiny. Get a general idea of the results from these visualizations.
6. Download and use the results, including figures and tables, from every step for any purpose.
To see the tutorial in vignettes:
browseVignettes("EFAshiny")
By following this analysis flow in EFAshiny, users without any
knowledge of programming are able to perform EFA and gain a solid
understanding of the results for their own studies.
In addition to the GUI, we also provide an Editor tab with several
code demonstrations in the GitHub version of EFAshiny. In this
Editor mode (see figure below), we already present some quick examples
that allow users to perform analyses similar to those in the EFAshiny
GUI. Users can also write their own R code here. With this feature,
users may be able to use EFAshiny within a script pipeline. In general,
this feature allows users to learn R, understand the code underlying the
analyses in EFAshiny, or automate the analyses in the future.
Note that this feature also allows the use of the lavaan R package to
perform confirmatory factor analysis (CFA), which is also a widely used
method but not the main focus of EFAshiny. Simply entering
require(lavaan) should work (see the lavaan package for details).
Another useful tool is the showcase mode of shiny when running the app
(of course, you can also directly see the code in server.R and ui.R).
In summary, users who want to further understand EFAshiny or learn R
can (1) see the code in the Editor tab of the GitHub version of the
EFAshiny GUI (as shown in figure), (2) download the R markdown file
similar to the code in editor mode here, (3) see the same R markdown
file in this public link, (4) use the showcase function in shiny, and
(5) directly see the code in server.R and ui.R.
The dataset for demonstration is the 10-item Rosenberg Self-Esteem
Scale (RSE; Rosenberg, 1965), collected via an online platform for
psychological research. The RSE was
recorded on a 1 to 4 Likert scale, where higher scores indicate stronger
agreement with the items (1=strongly disagree, 2=disagree, 3=agree, and
4=strongly agree). Previous studies suggested that the RSE could be
treated as a one-factor unidimensional scale, which simply assesses a
positive self-evaluation construct, or a two-factor bidimensional
scale, where one factor assesses positive self-esteem
(e.g., I feel that I have a number of good qualities) and the other
measures negative self-esteem (e.g., At times I think I am no good at
all). EFAshiny already includes RSE data from 256 participants as a
built-in dataset, but RSE.csv with its codebook can also be downloaded
directly.
EFAshiny builds on the following R packages: bootnet (Epskamp, 2017),
corrplot (Wei & Simko, 2017), EFAutilities (see Zhang, 2014 for detail),
reshape2 (Wickham, 2014), EGA (Golino & Epskamp, 2017), ggplot2
(Wickham, 2016), ggcorrplot (Kassambara, 2016), gridExtra (Auguie, 2017),
igraph (Csardi & Nepusz, 2006), moments (Komsta & Novomestky, 2013),
plotly (Sievert et al., 2017), psych (Revelle, 2017), psycho
(Makowski, 2018), qgraph (Epskamp et al., 2012), shiny (Chang, Cheng,
Allaire, Xie, & McPherson, 2017), and shinythemes (Chang, 2016).
- Auguie, B. (2017). gridExtra: Miscellaneous Functions for "Grid" Graphics. R package version 2.3.
- Bartholomew, D. J., Knott, M., & Moustaki, I. (2011). Latent Variable Models and Factor Analysis: A Unified Approach. Wiley.
- Cattell, R. B. (1966). The scree test for the number of factors. Multivar Behav Res, 1(2), 245-276.
- Chang, W. (2016). shinythemes: Themes for Shiny. R package version 1.1.1.
- Chang, W., Cheng, J., Allaire, J. J., Xie, Y., & McPherson, J. (2017). shiny: Web application framework for R. R package version 1.0.0.
- Csardi, G., & Nepusz, T. (2006). The igraph software package for complex network research. InterJournal, Complex Systems, 1695(5), 1-9.
- Epskamp, S., Cramer, A. O. J., Waldorp, L.J., Schmittmann, V.D., & Borsboom, D. (2012). qgraph: Network Visualizations of Relationships in Psychometric Data. Journal of Statistical Software, 48(4), 1-18.
- Epskamp, S. (2017). bootnet: Bootstrap methods for various network estimation routines. R package version 1.0.1
- Golino, H. F., & Epskamp, S. (2017). Exploratory graph analysis: A new approach for estimating the number of dimensions in psychological research. PloS one, 12(6), e0174035.
- Henson, R. K., & Roberts, J. K. (2006). Use of exploratory factor analysis in published research: Common errors and some comment on improved practice. Educational and Psychological measurement, 66(3), 393-416.
- Horn, J. L. (1965). A rationale and test for the number of factors in factor analysis. Psychometrika, 30(2), 179-185.
- Komsta, L., & Novomestky, F. (2013). moments: moments, cumulants, skewness, kurtosis and related tests. R package version 0.13.
- Kassambara, A. (2016). ggcorrplot: Visualization of a Correlation Matrix using 'ggplot2'. R package version 0.1.1.
- Rosseel, Y. (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48(2), 1-36.
- Makowski, D. (2018). The psycho Package: an Efficient and Publishing-Oriented Workflow for Psychological Science. Journal of Open Source Software, 3(22), 470.
- McLaughlin, T., Steinberg, B., Christensen, B., Law, I., Parving, A., & Friberg, L. (1992). Potential language and attentional networks revealed through factor analysis of rCBF data measured with SPECT. Journal of Cerebral Blood Flow & Metabolism, 12(4), 535-545.
- Revelle, W. (2017). psych: Procedures for Personality and Psychological Research. Northwestern University, Evanston, Illinois, USA. R package version 1.7.8.
- Rosenberg, M. (1965). Rosenberg self-esteem scale (RSE). Acceptance and commitment therapy. Measures package, 61, 52.
- Sievert, C., Parmer, C., Hocking, T., Chamberlain, S., Ram, K., Corvellec, M., & Despouy, P. (2016). plotly: Create Interactive Web Graphics via 'plotly.js'. R package version 4.7.1.
- Wei, T., & Simko, V. (2017). R package "corrplot": Visualization of a Correlation Matrix. R package version 0.84.
- Velicer, W. F. (1976). Determining the number of components from the matrix of partial correlations. Psychometrika, 41(3), 321-327.
- Wickham, H. (2016). reshape2: Flexibly Reshape Data: A Reboot of the Reshape Package. R package version 1.4.2.
- Wickham, H. (2016). ggplot2: elegant graphics for data analysis. Springer.
- Zhang, G., & Preacher, K. J. (2015). Factor rotation and standard errors in exploratory factor analysis. Journal of Educational and Behavioral Statistics, 40(6), 579-603.
- Zhang, G. (2014). Estimating standard errors in exploratory factor analysis. Multivariate Behavioral Research, 49, 339-353.
Chi-Lin Yu: Department of Psychology, National Taiwan University, Taiwan
Ching-Fan Sheu: Institute of Education, National Cheng Kung University, Taiwan
If you have a question, comment, concern, or code contribution about
EFAshiny, please send us an email at psychilinyu@gmail.com.
Please cite as:
- Yu, C.-L., & Sheu, C.-F. (2018). EFAshiny: An User-Friendly Shiny Application for Exploratory Factor Analysis. Journal of Open Source Software, 3(22), 567-568, https://doi.org/10.21105/joss.00567.