I tried it today and it worked flawlessly! However, it requires Java 7, which may not be available on many corporate PCs still running Windows XP. That said, my IT team installed it for me on Windows XP with no issues.
Here are the simple steps required to use the sas7bdat.parso package:
- Make sure you have Java 7 or above installed on your computer (https://www.java.com/en/download/help/download_options.xml)
- Install the packages rJava, devtools, and sas7bdat.parso
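The installation step above might look roughly like the sketch below. It assumes that rJava and devtools come from CRAN and that sas7bdat.parso is installed from the GitHub repository BioStatMatt/sas7bdat.parso; the helper name setup_sas7bdat_parso is made up for illustration.

```r
# Hypothetical helper: installs the three packages listed above.
# Assumes rJava and devtools are on CRAN and that sas7bdat.parso lives in
# the GitHub repository "BioStatMatt/sas7bdat.parso".
setup_sas7bdat_parso <- function(repo = "BioStatMatt/sas7bdat.parso") {
  cran_pkgs <- c("rJava", "devtools")
  # Install only the CRAN packages that are not already available.
  for (pkg in cran_pkgs) {
    if (!requireNamespace(pkg, quietly = TRUE)) {
      install.packages(pkg)
    }
  }
  # rJava needs a working Java installation at this point.
  devtools::install_github(repo)
}
```

Calling setup_sas7bdat_parso() once per R installation should be enough.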
Once the package is installed, you can read in SAS datasets (which all have the extension .sas7bdat) with the read.sas7bdat.parso function.
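A minimal sketch of reading a dataset is shown here. The file path is a placeholder, and passing the path as the first argument is an assumption about the function's signature; the guard only keeps the snippet from failing when the package or file is missing.

```r
# Placeholder path: replace with the location of a real .sas7bdat file.
sas_path <- "path/to/your_data.sas7bdat"
library_available <- requireNamespace("sas7bdat.parso", quietly = TRUE)

if (library_available && file.exists(sas_path)) {
  # Assumes the file path is the first argument of read.sas7bdat.parso.
  my_data <- sas7bdat.parso::read.sas7bdat.parso(sas_path)
  str(my_data)
} else {
  message("Install sas7bdat.parso and point sas_path at a real file first.")
}
```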
The code behind read.sas7bdat.parso is simple: it converts the SAS dataset to a CSV and then reads it into R using read.csv. There are obvious ways to improve it. I use the data.table package, so the simplest change I can think of is to replace read.csv with data.table's fread, which should read the data much faster and return a data.table instead of a data.frame.
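The idea can be sketched as follows. This is not the package's actual source: read_sas_via_csv and its convert argument are hypothetical, with convert standing in for whatever the package uses to turn a SAS file into a CSV. The demonstration uses a fake converter and read.csv so it runs without Java, SAS files, or data.table.

```r
# Sketch of the "convert to CSV, then read" pattern with a pluggable reader.
# `convert` is a hypothetical stand-in for the package's SAS-to-CSV step: any
# function taking (sas_path, csv_path) and writing a CSV to csv_path.
read_sas_via_csv <- function(sas_path, convert, reader = data.table::fread) {
  csv_path <- tempfile(fileext = ".csv")
  on.exit(unlink(csv_path), add = TRUE)  # remove the temporary CSV afterwards
  convert(sas_path, csv_path)
  reader(csv_path)
}

# Demonstration with a fake converter, so no SAS file or Java is needed here.
fake_convert <- function(sas_path, csv_path) {
  write.csv(data.frame(id = 1:3, value = c(2.5, 3.5, 4.5)), csv_path,
            row.names = FALSE)
}
demo <- read_sas_via_csv("ignored.sas7bdat", fake_convert, reader = read.csv)
print(demo)
```

Leaving reader at its default swaps in fread without touching the conversion step, which is the whole change being suggested.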
As of the latest version of the sas7bdat.parso package, the function read.sas7bdat.parso has a READ_FUNC parameter. You can specify READ_FUNC = data.table::fread and it will return a data.table.
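A call using that parameter might look like this. The path is a placeholder, and the snippet assumes both sas7bdat.parso and data.table are installed; it does nothing if they are not.

```r
# Placeholder path: replace with the location of a real .sas7bdat file.
sas_path <- "path/to/your_data.sas7bdat"
have_pkgs <- requireNamespace("sas7bdat.parso", quietly = TRUE) &&
  requireNamespace("data.table", quietly = TRUE)

if (have_pkgs && file.exists(sas_path)) {
  # READ_FUNC replaces the default reader used on the intermediate CSV.
  dt <- sas7bdat.parso::read.sas7bdat.parso(sas_path,
                                            READ_FUNC = data.table::fread)
  print(class(dt))
}
```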
There are other opportunities for improving the package. Currently, read.sas7bdat.parso converts the SAS dataset into a CSV first. If this conversion step could be skipped so that the data is read in more directly, it would be faster still. The ability to read the data as a stream or connection, so it can be processed in chunks, would also be highly desirable!
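To illustrate what chunked processing could look like, here is a rough base R sketch that works on a CSV file through an open connection. It is not something the package provides; process_csv_in_chunks is a hypothetical helper, and it assumes no quoted field contains an embedded newline.

```r
# Process a CSV file chunk by chunk through an open connection.
# Assumes no quoted field contains an embedded newline.
process_csv_in_chunks <- function(csv_path, chunk_size, handle_chunk) {
  con <- file(csv_path, open = "r")
  on.exit(close(con), add = TRUE)
  header_line <- readLines(con, n = 1)
  repeat {
    lines <- readLines(con, n = chunk_size)
    if (length(lines) == 0) break
    # Re-attach the header so every chunk parses with the same column names.
    chunk <- read.csv(text = c(header_line, lines), stringsAsFactors = FALSE)
    handle_chunk(chunk)
  }
  invisible(NULL)
}

# Demonstration: sum a column of a 5-row file, two rows at a time.
csv_path <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:5, value = c(10, 20, 30, 40, 50)), csv_path,
          row.names = FALSE)
total <- 0
chunk_rows <- integer(0)
process_csv_in_chunks(csv_path, chunk_size = 2, function(chunk) {
  total <<- total + sum(chunk$value)
  chunk_rows <<- c(chunk_rows, nrow(chunk))
})
unlink(csv_path)
```

Only one chunk is held in memory at a time, which is the benefit a streaming reader for SAS files would bring.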
I successfully installed devtools and rJava, but I get the following error message when trying to install sas7bdat.parso:
Error : .onLoad failed in loadNamespace() for 'rJava', details:
call: fun(libname, pkgname)
error: JAVA_HOME cannot be determined from the Registry
Error : package 'rJava' could not be loaded
Error: loading failed
Execution halted
*** arch - x64
ERROR: loading failed for 'i386'
* removing 'C:/Users/austin.lasseter/Documents/R/win-library/3.3/sas7bdat.parso'
Error: Command failed (1)
Hi Austin,
I had a similar problem. You have to check your JAVA_HOME path by running this command in R: Sys.getenv("JAVA_HOME") and comparing the output with the actual location of your Java installation. Also, keep in mind which version of R (32-bit or 64-bit) you are using.
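The check described above could be run as a small script like this one. It only reports what R sees; the path in the final comment is a made-up placeholder, not a real location.

```r
java_home <- Sys.getenv("JAVA_HOME")  # "" when the variable is not set
java_home_ok <- nzchar(java_home) && dir.exists(java_home)

cat("JAVA_HOME:       ", if (nzchar(java_home)) java_home else "<not set>", "\n")
cat("Directory exists:", java_home_ok, "\n")
cat("R architecture:  ", R.version$arch, "\n")  # e.g. "x86_64" or "i386"

# If the path is wrong, it can be set for the current session before loading
# rJava. The path below is a hypothetical example, not a real location:
# Sys.setenv(JAVA_HOME = "C:/path/to/your/java")
```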
Hi, I am getting the following error when installing sas7bdat.parso:
devtools::install_github("BioStatMatt/sas7bdat.parso",force = "TRUE")
Downloading GitHub repo BioStatMatt/sas7bdat.parso@master
from URL https://api.github.com/repos/BioStatMatt/sas7bdat.parso/zipball/master
Installing sas7bdat.parso
"C:/PROGRA~1/R/R-32~1.4RE/bin/x64/R" --no-site-file --no-environ --no-save --no-restore --quiet CMD INSTALL \
"C:/Users/nveeramachaneni/AppData/Local/Temp/Rtmp82ZMKe/devtools18ac15eef7d/BioStatMatt-sas7bdat.parso-867f26a" \
--library="C:/Users/nveeramachaneni/Documents/R/win-library/3.2" --install-tests
Error in setwd(dir = new) : cannot change working directory