Fortifying Statistical Analyses: Software Tools for Robust Methods
-
United Nations Industrial Development Organization (retired), Vienna, Austria [valentin@todorov.at]
Keywords: Robustness – Software – R – MATLAB – Python – Julia
Abstract
The practical deployment and success of robust methods are inconceivable without reliable and user-friendly software. This necessity was recognized early on, leading to the development of the first robust statistical software on platforms such as SAS, S-Plus, and MATLAB. This talk provides an overview of key software ecosystems, highlighting their features, use cases, and suitability for various audiences. Currently, two MATLAB toolboxes for robust statistics are popular: LIBRA, developed by the robust statistics research groups of the Katholieke Universiteit Leuven (Department of Mathematics) and the University of Antwerp (Department of Mathematics and Computer Science), and FSDA, a joint effort by the University of Parma and the Joint Research Centre (JRC) of the European Commission. However, the R programming environment, a free software platform for statistical computing and graphics, has emerged as a viable alternative, offering developers and users extensive capabilities for creating and applying robust methods.
Many researchers have contributed significantly to making robust statistical methods accessible. On CRAN alone, over 700 R packages include the terms "robust" or "outlier" in their names, titles, or descriptions. This abundance of options can be overwhelming for both beginners and experienced users. To address this, we review the 25 most significant R packages for various tasks, briefly describing their functionalities. We also explore several key topics in robust statistics, presenting methodologies, implementations in R, and applications to real-world data. Particular attention is given to robust methods and algorithms suited for high-dimensional data.
While robust methods have long been available in R and MATLAB, Python users have only recently gained access to a comprehensive package, RobPy, that offers such methods within a cohesive framework. Comparable development in Julia remains limited, although Julia promises to solve the two-language problem, i.e. the need to prototype in a high-level language such as R or Python and then rewrite performance-critical parts in another, natively compiled language to achieve the desired computational speed. Of course, there remains the problem of convincing the scientific community to use Julia.
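As a small illustration of why robust estimators matter in practice, the following is a minimal sketch in plain NumPy. It does not use RobPy or reproduce its API; the helper `robust_outlier_flags`, the cutoff of 3, and the toy data are assumptions made only for this example. It compares the classical z-score (mean and standard deviation) with a robust z-score based on the median and the MAD, assuming the MAD is non-zero.

```python
import numpy as np


def robust_outlier_flags(x, cutoff=3.0):
    """Flag points whose robust z-score exceeds `cutoff` (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    center = np.median(x)
    # The factor 1.4826 makes the MAD consistent with the standard
    # deviation when the clean data are normally distributed.
    scale = 1.4826 * np.median(np.abs(x - center))
    return np.abs(x - center) / scale > cutoff


# Toy sample (made up for illustration): six regular values and one gross outlier.
data = np.array([9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 55.0])

# Classical z-scores: the outlier inflates both the mean and the standard
# deviation, so its own z-score stays below the usual cutoff of 3.
classical_z = np.abs(data - data.mean()) / data.std(ddof=1)
classical_flags = classical_z > 3.0

# Robust z-scores: the median and MAD are barely affected by the outlier.
robust_flags = robust_outlier_flags(data)

print("classical flags:", classical_flags)
print("robust flags:   ", robust_flags)
```

On this sample the classical rule flags nothing, while the robust rule flags only the last observation. Dedicated packages in R, MATLAB, and Python extend the same idea to multivariate location and scatter, regression, and high-dimensional settings.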
Despite the progress in robust statistical software, challenges persist, including computational efficiency, ease of use, integration with big data frameworks, and compatibility with machine learning systems. The future undoubtedly holds exciting advancements for R, MATLAB, Python, and Julia, promising to enrich the statistical community with even more powerful and versatile tools.
References
- Atkinson et al. [2025] A.C. Atkinson, M. Riani, A. Corbellini, D. Perrotta, and V. Todorov. Robust Statistics through the Monitoring Approach: Applications in Regression. Springer-Verlag, Heidelberg, 2025. In press.
- Leyder et al. [2024] S. Leyder, J. Raymaekers, P.J. Rousseeuw, T. Servotte, and T. Verdonck. RobPy: A Python package for robust statistical methods, 2024.
- Riani et al. [2012] M. Riani, D. Perrotta, and F. Torti. FSDA: A MATLAB toolbox for robust analysis and interactive data exploration. Chemometrics and Intelligent Laboratory Systems, 116:17–32, 2012.
- Todorov [2024] V. Todorov. The R package ecosystem for robust statistics. Wiley Interdisciplinary Reviews: Computational Statistics, 16(6):e70007, 2024.