AnalytixWare
The travails of developing a Credit Scoring Solution
Thursday, 5 February 2015
r-bloggers.com a daily read for useRs
I have gotten into a daily habit of reading r-bloggers.com. I often find interesting articles on it and it motivates me to write useful articles on R as well.
Monday, 2 February 2015
Reading SAS into R
The sas7bdat package has been around for a while. It allows some SAS datasets to be read into R directly. However, it didn't deal with compressed SAS datasets at all! Recently I discovered the sas7bdat.parso package, which is by the same author and uses the parso Java library for reading SAS datasets.
I tried it today and it worked flawlessly! However it requires Java 7 which may not be available on most corporate PCs still running Windows XP. Having said that my IT guys installed it for me with no issues on Windows XP.
Here are the simple steps required to use the sas7bdat.parso package:
- Make sure you have Java 7 or above installed on your computer (https://www.java.com/en/download/help/download_options.xml)
- Install the packages rJava, devtools, and sas7bdat.parso
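The install step can be sketched roughly as below. It assumes sas7bdat.parso is installed from GitHub via devtools and that the repository path shown is the right one; check both against the package's own instructions. The wrapper name install_sas_reader is made up for illustration.

```r
# Sketch: install rJava and devtools from CRAN, then sas7bdat.parso.
# Assumption: sas7bdat.parso is fetched from GitHub and the repository
# path below is correct; adjust it if the package lives elsewhere.
install_sas_reader <- function(repo = "biostatmatt/sas7bdat.parso") {
  install.packages(c("rJava", "devtools"))
  devtools::install_github(repo)
}

# Run once from an R session with internet access:
# install_sas_reader()
```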
Once the package has been installed you can read in SAS datasets (which all have the extension .sas7bdat) using this code:
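A minimal sketch of that call, assuming a hypothetical file path. The guard is only there so the snippet fails gracefully if the package or the file is missing.

```r
# Hypothetical path; point this at one of your own .sas7bdat files.
sas_file <- "path/to/yourdata.sas7bdat"

if (requireNamespace("sas7bdat.parso", quietly = TRUE) && file.exists(sas_file)) {
  # Read the SAS dataset into a data.frame and inspect its structure
  dat <- sas7bdat.parso::read.sas7bdat.parso(sas_file)
  str(dat)
} else {
  message("Install sas7bdat.parso and point sas_file at a real dataset first.")
}
```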
The code behind the function read.sas7bdat.parso is simplistic. It simply converts the SAS dataset to a CSV before reading it into R using read.csv. There are very obvious steps that you can take to improve the code. I use the data.table package so the simplest I can think of is to replace the read.csv function with data.table's fread, which should read in the data much faster and return a data.table instead of data.frame. For example:
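The idea can be sketched as follows. This is not the package's actual source: converter is a hypothetical argument standing in for whatever step turns the SAS file into a CSV, and read_sas_via_csv is a made-up name. The only change that matters is that the CSV is read with data.table::fread by default instead of read.csv.

```r
# Sketch: convert to a temporary CSV, then read it with a pluggable reader.
# `converter` is a hypothetical stand-in: a function(s7b_file, csv_file)
# that writes the CSV. `reader` defaults to data.table::fread.
read_sas_via_csv <- function(s7b_file, converter,
                             reader = data.table::fread, ...) {
  csv_file <- tempfile(fileext = ".csv")
  on.exit(unlink(csv_file), add = TRUE)  # clean up the temporary CSV
  converter(s7b_file, csv_file)          # SAS dataset -> CSV
  reader(csv_file, ...)                  # CSV -> data.table (or data.frame)
}
```

Because the reader is an argument, passing reader = utils::read.csv gives back the original data.frame behaviour.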
As of the latest version of the sas7bdat.parso package, the function read.sas7bdat.parso has a READ_FUNC parameter. You can specify READ_FUNC = data.table::fread and it will return a data.table.
There are other opportunities for improving the package. Currently read.sas7bdat.parso converts the SAS dataset into CSV first. If this conversion step could be skipped and the data read in more directly, it would be faster still. The ability to read the data in as a stream or connection, so it can be processed in chunks, would be highly desirable too!
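To illustrate what chunked processing could look like, here is a base R sketch that works on the intermediate CSV rather than on the SAS file itself. It is not something the package offers; process_in_chunks is a made-up name, and the line-based reading assumes no field contains an embedded line break.

```r
# Sketch: apply `fun` to a CSV file chunk by chunk through a connection,
# so the whole file never has to sit in memory at once.
process_in_chunks <- function(csv_file, chunk_size, fun) {
  con <- file(csv_file, open = "r")
  on.exit(close(con), add = TRUE)

  # First line holds the column names (scan strips surrounding quotes)
  header <- scan(text = readLines(con, n = 1), what = "", sep = ",",
                 quiet = TRUE)

  results <- list()
  repeat {
    lines <- readLines(con, n = chunk_size)
    if (length(lines) == 0) break  # end of file
    chunk <- read.csv(text = lines, header = FALSE, col.names = header,
                      stringsAsFactors = FALSE)
    results[[length(results) + 1]] <- fun(chunk)
  }
  results
}
```

For example, process_in_chunks("big.csv", 100000, nrow) would return the row count of each chunk, where big.csv is a hypothetical file name.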
Tuesday, 18 March 2014
Packaging your Shiny App as a Windows desktop app
Introduction
In developing SkyScorer I've sought ways to package it as a standalone Windows app. The advantages of doing this are quite clear:
- The end-user doesn't need any R knowledge to run the app
- The app can be distributed as a simple download
In this tutorial I will go through the process that is needed to create a standalone Windows Shiny app.
Portable R & Chrome
Firstly download Portable R and Portable Chrome. These will serve as the backbone of our app. Essentially, we are packing along with our app self-contained copies of R and Chrome:
- Portable R: http://sourceforge.net/projects/rportable/
- Portable Chrome: http://portableapps.com/apps/internet/google_chrome_portable
Once you have downloaded the installation files for Portable R and Portable Chrome just install them and note down their installation path. We will use this later.
For simplicity I will assume you can make a folder on your C drive called YourApp, but the folder could be situated anywhere. Now copy the Portable R installation and the Portable Chrome installation into the C:\YourApp folder. Also copy your Shiny folder containing server.R and ui.R (or a sole app.R in the case of a single-file Shiny app) into the folder. You should now have three folders in your C:\YourApp directory:
- C:\YourApp\GoogleChromePortable
- C:\YourApp\R-Portable
- C:\YourApp\Shiny
Setting Up Portable R
Just run R-Portable.exe and install all the libraries that are needed by your Shiny app. They will be stored in the Portable R folder.
Setting Up Portable Chrome
Make sure you have the following line in the GoogleChromePortable.ini file, which is in the GoogleChromePortable folder:
AdditionalParameters= --app="http://localhost:8888"
The additional parameter there makes Chrome start up in app mode (i.e. without the address bar, bookmarks etc.), which makes it look more like a native app.
The .vbs and .R files
In the C:\YourApp folder create two files:
- run.vbs
- runShinyApp.R
The contents
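A minimal sketch of what runShinyApp.R might contain. The library and executable paths are assumptions about how Portable R and Portable Chrome lay out their folders, and the script assumes C:\YourApp is the working directory when it runs. The port matches the one in GoogleChromePortable.ini.

```r
# runShinyApp.R (sketch). All paths are relative to C:\YourApp.
lib_path     <- "R-Portable/App/R-Portable/library"              # assumed layout
browser_path <- "GoogleChromePortable/GoogleChromePortable.exe"  # assumed name
app_dir      <- "Shiny"
port         <- 8888  # must match the port in GoogleChromePortable.ini

# Prefer the packages installed inside Portable R
if (dir.exists(lib_path)) .libPaths(c(lib_path, .libPaths()))

if (dir.exists(app_dir) && file.exists(browser_path) &&
    requireNamespace("shiny", quietly = TRUE)) {
  # Make R open URLs with Portable Chrome, then start the app
  options(browser = normalizePath(browser_path))
  shiny::runApp(app_dir, port = port, launch.browser = TRUE)
} else {
  message("Expected Shiny/ and GoogleChromePortable/ next to this script.")
}
```

run.vbs then only has to start the portable copy of R on this script.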
Now WAIT! Don't click on run.vbs just yet! I advise you to add the following to the server.R inside the shinyServer(function(input, output, session) { ... }). Please make sure you pass session as the third argument! The code you need to add is
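A sketch of that addition, shown here as a plain server function; the session$onSessionEnded call is the part that goes inside your own shinyServer(function(input, output, session) { ... }). It assumes that stopping the app is enough for the R script to finish and the R process to exit.

```r
server <- function(input, output, session) {
  # ... your existing outputs and reactives go here ...

  # When the browser window is closed the session ends, so stop the app
  session$onSessionEnded(function() {
    shiny::stopApp()
  })
}
```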
This will close the R session when you close the browser (in this case Portable Chrome). Now clicking on run.vbs should start your app!
Simply zip up your folder C:\YourApp and distribute! Your users need only double click on run.vbs to run the app.
If you want to appear more professional you can follow the instructions in the next section to create a setup.exe file for your app.
InnoSetup
Download InnoSetup from http://www.jrsoftware.org/isdl.php, install it, and run the Wizard for creating a new setup file. Everything should be pretty self-explanatory; just make sure that InnoSetup knows to use the C:\YourApp directory. Basically InnoSetup will package everything in the directory into an .exe file which acts as a setup wizard.
I chose my default install path to be somewhere other than C:\Program Files as I have found a few issues with it on Windows 8 (no problem on Windows XP).
Once you are done with the InnoSetup Wizard you should end up with a .iss file looking like this
The code in the .iss file should be self-explanatory. I only added a few lines which I thought were helpful. Under [Setup] add PrivilegesRequired=none; this stops the installer from requesting admin privileges, so most users should be able to install the app. I also created a desktop shortcut using the lines below:
[Icons]
Name: "{commondesktop}\YourApp"; Filename: "{app}\Apps\run.vbs"; IconFilename: {app}\Your.ico
Of course I have my own custom .ico file to make the shortcut look unique and professional.
Hope this helps!
Sunday, 15 December 2013
SkyScorer website launches today!
The website for SkyScorer is finally up!
www.skyscorer.com
The software is ready enough to begin gathering emails from interested users!! Please sign up to be an alpha tester for SkyScorer - the innovative, easy-to-use, yet powerful credit scoring tool.
Thursday, 22 August 2013
RExcel 64Bit
Looks like statconn has released a 64bit capable RExcel!
This is really good news and I've been asking for it since 2011!
Sunday, 18 August 2013
Kaggle - Comp
Kaggle just reopened their "Give Me Some Credit" competition, so we can submit a solution after the deadline and compare its performance against the original submissions.
I just built a model using AnalytixWare SkyScorer, our product in development. We only used the automated algorithm and did no manual tweaking. Here's the result
As you can see, SkyScorer only ranked a lowly 599th. The measure they use is AUC. Notice that SkyScorer's AUC is 0.86 when rounded.
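For readers new to the measure: AUC is the probability that a randomly chosen bad account gets a higher score than a randomly chosen good one. A small sketch of the rank-based calculation, using made-up scores and labels rather than anything from the competition:

```r
# Sketch: AUC from ranks (equivalent to the Mann-Whitney U statistic).
# `label` is 1 for the event of interest (e.g. default) and 0 otherwise.
auc <- function(score, label) {
  r     <- rank(score)  # ties get average ranks
  n_pos <- sum(label == 1)
  n_neg <- sum(label == 0)
  (sum(r[label == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Made-up example: three of the four (bad, good) pairs are ordered correctly
auc(c(0.1, 0.4, 0.35, 0.8), c(0, 0, 1, 1))  # 0.75
```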
The winner, as you can see, scored 0.87, so I think SkyScorer is not doing badly at all! AnalytixWare's product may not win you the competition, but it is automated and builds an easy-to-understand model in the format of a scorecard! In practice the 0.01 difference in AUC is a non-concern. Who knows what techniques the top few guys used to build their models; maybe the result is too hard to implement in existing banking infrastructure. AnalytixWare's model can be implemented with mere if/else statements! When it comes to risk models, the easier to understand and implement the better. I am really happy with SkyScorer!
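To show what "mere if/else statements" means, here is a toy scorecard. The characteristics, cut-offs and points are invented for illustration and are not SkyScorer's output.

```r
# Toy scorecard: each characteristic falls into a bin and each bin adds a
# fixed number of points. All bins and points here are hypothetical.
score_applicant <- function(age, utilisation) {
  points <- 0

  if (age < 30) {
    points <- points + 10
  } else if (age < 50) {
    points <- points + 25
  } else {
    points <- points + 40
  }

  if (utilisation > 0.8) {
    points <- points + 5
  } else if (utilisation > 0.3) {
    points <- points + 20
  } else {
    points <- points + 35
  }

  points
}

score_applicant(age = 25, utilisation = 0.9)  # 15
score_applicant(age = 55, utilisation = 0.1)  # 75
```

Logic like this ports to almost any system a bank already runs, which is the point about ease of implementation.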