Robust PCA in Java using R

If you want to avoid the curse of dimensionality when working with large datasets, you need to reduce the number of attributes (features) in your data. One common technique is Principal Component Analysis (PCA), but it does not work well on datasets with outliers, because the classical mean and covariance estimates it relies on can be dominated by a few extreme points. Robust PCA techniques were developed to overcome this issue.
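To make the outlier problem concrete, here is a minimal sketch in plain Java. It computes the direction of the first principal component of 2-D data from the sample covariance matrix, using the closed form for a 2x2 symmetric matrix. The class name, method name and toy data are made up for illustration, and this is classical PCA only, not a robust method.

```java
class OutlierDemo {

    /** Angle in degrees of the first principal component of 2-D data (classical PCA). */
    static double firstComponentAngle(double[] x, double[] y) {
        int n = x.length;

        // Sample means
        double mx = 0, my = 0;
        for (int i = 0; i < n; i++) {
            mx += x[i];
            my += y[i];
        }
        mx /= n;
        my /= n;

        // Entries of the 2x2 sample covariance matrix
        double sxx = 0, syy = 0, sxy = 0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - mx;
            double dy = y[i] - my;
            sxx += dx * dx;
            syy += dy * dy;
            sxy += dx * dy;
        }
        sxx /= (n - 1);
        syy /= (n - 1);
        sxy /= (n - 1);

        // Orientation of the leading eigenvector of [[sxx, sxy], [sxy, syy]]
        return Math.toDegrees(0.5 * Math.atan2(2 * sxy, sxx - syy));
    }

    public static void main(String[] args) {
        // Five points lying exactly on the line y = x
        double[] x = {-2, -1, 0, 1, 2};
        double[] y = {-2, -1, 0, 1, 2};
        System.out.println("clean data:   " + firstComponentAngle(x, y));

        // The same points plus one extreme point at (10, -10)
        double[] xo = {-2, -1, 0, 1, 2, 10};
        double[] yo = {-2, -1, 0, 1, 2, -10};
        System.out.println("with outlier: " + firstComponentAngle(xo, yo));
    }
}
```

On the clean data the first component points along the line y = x (about 45 degrees). Adding the single point (10, -10) swings it to about -45 degrees, so one observation decides the result.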

I looked for a Java implementation of Robust PCA, but unfortunately I couldn’t find one. Luckily, I came across an R implementation in the rrcov package. My initial plan was to port it to Java, but that would have taken a lot of time. Instead, I wrote a Java wrapper that calls R through JRI, the Java/R Interface library that ships with the rJava package (a low-level R to Java interface). You can get the code here: RobustPCA.java (mirror).
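For readers who want a feel for what such a wrapper involves, here is a rough sketch. It is not the code of RobustPCA.java. It assumes the PcaHubert method from rrcov (the actual wrapper may use a different rrcov method), and the class name, helper names and the R variable names x, m and pca are hypothetical. The Java part only prepares the data and the R commands, so it runs without R installed. The JRI calls are shown as comments, since they need the JRI jar and native library to be set up.

```java
class RpcaCommands {

    /** Flattens a row-major Java matrix into the column-major vector that R's matrix() expects. */
    static double[] toColumnMajor(double[][] data) {
        int rows = data.length;
        int cols = data[0].length;
        double[] flat = new double[rows * cols];
        for (int c = 0; c < cols; c++) {
            for (int r = 0; r < rows; r++) {
                flat[c * rows + r] = data[r][c];
            }
        }
        return flat;
    }

    /** Builds the R commands for a robust PCA with k components (assumes rrcov's PcaHubert). */
    static String[] buildCommands(int rows, int cols, int k) {
        return new String[] {
            "library(rrcov)",
            // x is the flat vector assigned from Java; matrix() fills column by column
            "m <- matrix(x, nrow=" + rows + ", ncol=" + cols + ")",
            "pca <- PcaHubert(m, k=" + k + ")"
        };
    }

    public static void main(String[] args) {
        double[][] data = {{1, 2, 3}, {4, 5, 6}};
        double[] flat = toColumnMajor(data);
        String[] cmds = buildCommands(data.length, data[0].length, 2);
        System.out.println(String.join("\n", cmds));

        // With JRI on the classpath, the calls would look roughly like this:
        //
        //   Rengine re = new Rengine(new String[] {"--vanilla"}, false, null);
        //   if (!re.waitForR()) throw new IllegalStateException("Cannot load R");
        //   re.assign("x", flat);
        //   for (String cmd : cmds) re.eval(cmd);
        //   double[][] loadings = re.eval("getLoadings(pca)").asDoubleMatrix();
        //   double[][] scores = re.eval("getScores(pca)").asDoubleMatrix();
        //   re.end();
    }
}
```

Passing the data as one flat vector and reshaping it on the R side keeps the Java side down to a single assign call. The toy 2x3 matrix is only there to show the data layout; it is far too small for a meaningful robust fit.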

If you are looking for a Matlab implementation of Robust PCA, you can get the code from the LIBRA project, a Matlab library for robust analysis, but I encourage you to use R instead for a number of reasons.

Image: Copyright by Matthias Scholz