I tried it today and it worked flawlessly! However, it requires Java 7, which may not be available on most corporate PCs still running Windows XP. That said, my IT guys installed it for me on Windows XP with no issues.
Here are the simple steps required to use the sas7bdat.parso package:
- Make sure you have Java 7 or above installed on your computer (https://www.java.com/en/download/help/download_options.xml)
- Install the packages rJava, devtools, and sas7bdat.parso
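The install step can be sketched roughly as follows. This assumes rJava and devtools come from CRAN and that sas7bdat.parso is installed from GitHub with devtools; the repository path below is my assumption, so check it against the package's own page before running.

```r
# Install the CRAN dependencies only if they are missing.
for (pkg in c("rJava", "devtools")) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
}

# ASSUMPTION: the GitHub repository path is not stated in this post;
# adjust it if the package lives elsewhere.
devtools::install_github("BioStatMatt/sas7bdat.parso")
```

If rJava fails to load afterwards, a missing or mismatched Java installation (for example 32-bit Java with 64-bit R) is a common cause.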
Once the package has been installed, you can read in SAS datasets (which all have the extension .sas7bdat) with the read.sas7bdat.parso function.
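A minimal call might look like the sketch below. The file name is a placeholder, and passing the path as the first unnamed argument is my assumption about the function's signature.

```r
library(sas7bdat.parso)

# "mydata.sas7bdat" is a placeholder; point this at your own file.
dat <- read.sas7bdat.parso("mydata.sas7bdat")

# Inspect the column names and types of what came back.
str(dat)
```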
The code behind the function read.sas7bdat.parso is simple: it converts the SAS dataset to a CSV file and then reads that file into R using read.csv. There are obvious steps you can take to improve the code. I use the data.table package, so the simplest improvement I can think of is to replace read.csv with data.table's fread, which should read in the data much faster and return a data.table instead of a data.frame.
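Here is a rough sketch of that idea, not the package's actual source. The function read_sas_via_fread and its to_csv argument are names I made up for illustration; to_csv stands in for whatever routine writes the SAS dataset out as a CSV, and is passed in so the sketch does not depend on the package's internals.

```r
library(data.table)

# Hypothetical helper: convert a SAS file to a temporary CSV, then read
# that CSV with fread instead of read.csv.
# `to_csv` is any function(s7b_file, csv_file) that writes the CSV.
read_sas_via_fread <- function(s7b_file, to_csv) {
  csv_file <- tempfile(fileext = ".csv")
  # Remove the temporary CSV once the function exits, even on error.
  on.exit(unlink(csv_file), add = TRUE)

  to_csv(s7b_file, csv_file)

  # fread returns a data.table rather than a data.frame.
  data.table::fread(csv_file)
}
```

The only change from the read.csv approach is the final line; the conversion step is untouched, so any speed-up comes purely from faster CSV parsing.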
As of the latest version of the sas7bdat.parso package, the function read.sas7bdat.parso has a READ_FUNC parameter. You can specify READ_FUNC = data.table::fread and it will return a data.table.
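In practice that would look something like this. The file name is a placeholder, and I am assuming the file path goes in as the first unnamed argument.

```r
library(sas7bdat.parso)

# "mydata.sas7bdat" is a placeholder; READ_FUNC swaps the CSV reader for fread.
dt <- read.sas7bdat.parso("mydata.sas7bdat", READ_FUNC = data.table::fread)

# Should report a data.table if READ_FUNC was picked up.
class(dt)
```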
There are other opportunities to improve the package. Currently, read.sas7bdat.parso converts the SAS dataset into a CSV first. If this conversion step could be skipped so that the data is read in more directly, the function would be faster still. The ability to read the data as a stream or connection, so that it can be processed in chunks, would be highly desirable too!