Software
In this blog post I present some statistical programs that I use or have used in the past. They range from clickable GUIs such as JASP and Jamovi, through full programming languages such as R and Python, to specialized programs such as \(\Omega\)nyx. Every program comes with tradeoffs, and eventually I might provide an (incomplete) list of advantages and disadvantages for all of them, but for now I only describe the features of each program that seem central to me. Because R is the most central to me, and because of its variety of applications, its description is longer and more detailed.
JASP
https://jasp-stats.org/ JASP is a free GUI-based program for performing all common statistical tests in both a frequentist and a Bayesian manner. It is developed at the University of Amsterdam and includes a variety of methods, from common statistics such as regression, ANOVA or reliability analyses to more complicated ones such as factor analyses, mediation, Structural Equation Modeling and machine learning methods. JASP supports APA-formatted tables and figures and is thus especially interesting for scientists and students from psychological fields. It also offers plenty of material for teachers of statistics. Behind the scenes, it makes heavy use of R, but it hides the code behind a clear and easy interface instead of frightening the user with it. You do not even need to download it, but can test it directly in the browser (if you sign in to rollApp). For a small (or large), clearly defined project such as a seminar project or a Bachelor thesis, this program is very well suited. It is also a very good starting point for learning Bayesian methods.
JAMOVI
https://www.jamovi.org/ Jamovi is also a free GUI-based statistical program that can perform common statistical tests from the frequentist approach. While the base version looks small, it has a bunch of modules that one can add to perform more complex analyses such as mediation, Bayesian analyses or equivalence testing. With these modules, Jamovi is highly similar to JASP, with the difference that it does not put Bayesian statistics so much in focus. Another difference to JASP is that Jamovi provides R code for every analysis, which one can copy into R (with another module, it also allows writing R commands directly in Jamovi), thus providing a nice interface to the programming language. Furthermore, Jamovi provides a spreadsheet for the data view, enabling basic data pre-processing such as adding new columns for the mean or standardized values.
R
https://cran.r-project.org/ and https://rstudio.com/products/rstudio/download/ R is an open-source programming language with a large community of developers. While the console of base R is not particularly impressive or functional, there are various integrated development environments (IDEs), the most famous being RStudio. Even though RStudio is also offered as licensed software, the free version suffices for most users who do not use it commercially.
Base R & RStudio
Unlike the aforementioned programs, R is syntax-based, meaning that no point-and-click interface is provided initially. The R Commander does try to offer a point-and-click interface for users who want to try R without having to write much code. Nevertheless, if you want to use the whole power of R, writing syntax is inevitable. Only then can you make use of the whole range of possibilities:
Packages
As the R community is very active, there are a lot of developers who design new packages that enable almost every commonly known statistical and data analysis method, ranging from effect sizes, Bayesian multilevel regressions, Structural Equation Modeling, network analysis and power analysis to machine learning applications such as random forests or convolutional neural networks (CNNs). Moreover, there are plenty of packages for data preparation (e.g. the famous tidyverse package collection) and for plotting, with ggplot2 being the most famous package.
The world of R markdown
Besides plain syntax files, R code can be combined with markdown. R Markdown makes it possible to write documents such as PDF documents, HTML pages or scientific articles entirely within R, combining nicely formatted text, syntax, tables and plots. The range of output formats is stunning: you can write files that can be exported to Word and back again, PowerPoint or HTML presentations, or PDF documents. There are also packages for appropriate formatting, e.g. in APA style. Thus, R enables writing scientific articles, books and dissertations. Moreover, you can build HTML pages or blog entries (such as this blog, which is based on the blogdown package and written entirely in R).
R Shiny
R Shiny enables programmers to build interactive web applications. Shiny apps require a slightly different setup than usual R scripts. The range of applications using Shiny is astonishing, from dataset visualizations such as the PISA data to medical simulations of physiological processes.
Learning R
Learning R may seem like a lifelong task, as the language keeps evolving and even I discover new functions and packages almost every other day. Yet, thanks to the engaged community, very nice tutorials on learning R exist. I can suggest the books The Pirate’s Guide to R by Nathaniel Phillips and R for Data Science by Hadley Wickham. If you simply want to look up some code, the cheatsheets are a helpful collection of commands. You can find a more extensive collection of resources on learning R here.
Python
https://www.python.org/ https://www.spyder-ide.org/ Python is another programming language that requires you to write code. In contrast to R, Python was not developed primarily as a program for statistical analysis. Thus, Python is used in many more applications and is more flexible. Again, there are various IDEs for Python, with Spyder being one of the most popular. There are also various interfaces for running Python code in R and vice versa. Sometimes there seems to be a dispute between R users and Python programmers about which is the better programming language. But, as so often, both languages have advantages and disadvantages, and people often simply stick with what they learned or like more. See this post for a nice overview of advantages and disadvantages of R and Python.
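To give a rough impression of what "writing code" for a statistical test looks like in Python, here is a minimal sketch of Welch's t-test using only the standard library. The function name `welch_t` and the two small samples are made up for illustration; in practice one would more likely rely on a dedicated statistics package, which would also report a p-value.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom.

    Hypothetical helper for illustration only; it does not compute a p-value.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    # Sample variances (n - 1 in the denominator), each divided by its group size
    v_a = statistics.variance(sample_a) / n_a
    v_b = statistics.variance(sample_b) / n_b
    t = (mean_a - mean_b) / math.sqrt(v_a + v_b)
    df = (v_a + v_b) ** 2 / (v_a ** 2 / (n_a - 1) + v_b ** 2 / (n_b - 1))
    return t, df


# Made-up example data for two independent groups
group_a = [5.1, 4.9, 5.6, 5.8, 6.0]
group_b = [4.2, 4.4, 4.1, 4.8, 4.5]

t_value, df = welch_t(group_a, group_b)
print(f"t = {t_value:.2f}, df = {df:.1f}")
```

The sketch deliberately avoids third-party packages so that it runs on a plain Python installation.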
SPSS
No comments! 1
PSPP
https://www.gnu.org/software/pspp/ PSPP is similar to SPSS, not only in its name but also in its content, with one important difference: it is free! It offers the basic range of statistics such as *t*-tests, ANOVAs, linear and logistic regression models, cluster analysis, reliability analysis, factor analysis as well as non-parametric tests. It also offers a syntax mode and basic plots.
\(\Omega\)nyx
https://onyx.brandmaier.de/
\(\Omega\)nyx is a free graphical software environment for Structural Equation Modeling. It lets you draw and estimate models directly in the graphical interface, with various powerful optimization algorithms running behind the scenes. The user can define the structure of the model as well as its appearance in order to export publication-ready graphs. The models can also be exported as syntax for lavaan, OpenMx and Mplus. \(\Omega\)nyx also comes with an R package, but this requires the software to be installed as well.
ENA WebKit
http://www.epistemicnetwork.org/ ENA is a free online tool for epistemic network analysis that I only recently learned about from Elisabeth Bauer and am now eager to use as well. It offers a web application where you can register, upload your data, and construct and analyse co-occurrence networks. The program also comes with an R package that offers the same functionality within R.
If you want some arguments for why many R users dislike SPSS, you should compare it to all the advantages that R has over SPSS. You should also be aware that SPSS sometimes uses outdated formulas and contains errors that are only corrected when a new version is released. For example, SPSS did not differentiate between \(\eta^2\) and \(\eta^2_{partial}\) until version 10, which caused a lot of scientific articles to report faulty estimates of the explained variance. SPSS also does not come with such a helpful community as R, nor with quick updates. Yes, there is also an SPSS syntax editor, but it is very outdated in terms of coding. For example, you have to tell the program that it should EXECUTE. something when you want it to do something. Such syntax commands stem from times when coding was done by punching cards (hopefully in the correct order) and when computers filled whole rooms in university labs. Furthermore, SPSS puts strange limits on the number of characters in variable names as well as on the number of variables in regression models. Therefore, instead of paying $1200 for the yearly license just to keep the software running, I highly suggest that you invest this money in one or two R courses every year. That will pay off tremendously and very fast. And no, I won’t give you the link to this proprietary software.↩︎