January 11, 2012
Rook 1.0-3 updated to run on rApache

animal.crakcers

So if you’re in need of deploying that nifty Rook app you made, download the latest release, 1.0-3, from my github account or wait for it to appear on CRAN.

That is all.

4:32pm  |   URL: http://tmblr.co/Zf5rDyEd8UsX
Filed under: Rstats R 
January 11, 2012
Another Year, Another rApache Release

I’m feeling chancy, just like those two angels who took a chance on me so many years ago…

So I decided to release rApache 1.1.15. You can grab the source from rapache.net. The notable change is support for deploying Rook applications, which you’ll be able to do once I release the next version of Rook, due out shortly.

Cheers!

12:09pm  |   URL: http://tmblr.co/Zf5rDyEcS4F4
Filed under: Rstats R 
July 12, 2011
I wish I knew everything about R. I wish I could vectorise in my sleep. I wish there were perfect R packages out there to solve all my data transformation problems. I wish there were perfect data.

If I were Paul Graham, would I ever write code like the above? Would I hire someone who wrote that, if I were Joel Spolsky?

My code smells, but I’ve spoken with a few experts in our department whom I trust, and they agree that the approach I’m taking is sound. I’m transforming data to be fed into a Cox model. Each data row contains a start and end date, event boolean, outcome boolean, number of prior events, and number of prior outcomes. There’s also an array of rules by which to construct the data, including those that involve season start and end dates, event start and end dates, events spanning multiple data rows, etc. Oh, and I’m using a big loop rather than vectorization. 

This project has made me question my ability to solve problems in software, which is humbling, but I soldier on.

9:31am  |   URL: http://tmblr.co/Zf5rDy716Sb5
Filed under: rstats 
June 7, 2011
R books are now showing up in the dollar bin. That’s a good sign!

10:02am  |   URL: http://tmblr.co/Zf5rDy5sjU8b
Filed under: rstats 
April 25, 2011
Let’s try this again: Introducing Rook for #rstats!

Well I totally botched my launch last week. Thankfully, my wife straightened me out: I decided to change the name of Rack to Rook so there won’t be any future confusion. And instead of rewriting all my posts last week, just read them again and every time you read “Rack”, think “Rook” (and then take a swig of your favorite beverage).

Here’s the email I just sent to the R-packages mailing list:

Dear useRs,

Rook is a new package that does three things:

 - It provides a way to run R web applications on your desktop with the
 new internal R web server named Rhttpd. Please see the Rhttpd help page.

 - It provides a set of reference classes you can use to write your R
 web applications. The following help pages provide more information:
 Brewery, Builder, File, Middleware, Redirect, Request, Response, Static,
 URLMap, and Utils. Also see the example web applications located in
 'system.file("exampleApps",package="Rook")'.

 - It provides a specification for writing your R web applications to
 work on any web server that supports the Rook specification. Currently,
 only Rhttpd implements it, but rApache is close behind. See the Rook
 help page for more information.

You may not see the need for web applications written in R, but consider
using Rook to build a statistical engine that complements a front-end
web system, or consider creating elegant ggplot2 graphics on-demand from
a fresh data stream. Also, consider creating dynamic instructional content
for the classroom.

If you have other examples or ideas, please let me know.

--
Jeffrey Horner  (author of rApache and brew)

@eddelbuettel is going to love that I put the #rstats twitter hashtag in my title ;)

9:01am  |   URL: http://tmblr.co/Zf5rDy4bjeau
Filed under: rstats 
April 20, 2011
Whither rApache and Rook (for R)

The above picture shows what an apache child process will look like once I add Rook support to rApache. An explanation of the above:

1) The light-orange colored box describes the apache process space.

2) Everything in blue, whether light-blue or cyan, is part of the R process space.

3) mod_R is the rApache portion: it glues Apache to R.

4) The smaller rectangles (containing the text “brew::brew” and “/path/to/app.R::handler”) represent the current way rApache runs your web applications. Note that these applications will ONLY run via rApache and not any other web server. 

5) The cyan ovals represent the new way in which rApache will run Rook applications. ANY Rook (for R) application, even one you developed on your desktop.

So, any R web application developed using rApache will only work on rApache, while any R web application using the Rook specification/interface/package will work on rApache or any other Rook-enabled web server, like Rhttpd.

2:10pm  |   URL: http://tmblr.co/Zf5rDy4T9NOu
Filed under: rstats 
April 18, 2011
Introducing Rook

Rook is a web server interface and software package for R. It is very much like Ruby’s Rack. In fact, it is so much like Ruby’s Rack that I decided to use the same name and basic class hierarchy. You could say I “borrowed heavily” from Ruby’s Rack, and you wouldn’t be far from the truth. In fact, you could say “I stole their idea” and re-purposed it for R, and then you’d be telling the truth.

Regardless, I think it works very well for R. You can download Rook 1.0 from my github, or you can wait until it hits CRAN (it’s on its way. It’s there!).

Why would you want to use it? Well, it turns out that R 2.13 comes with a built-in web server fully capable of running user-created applications. And it turns out that there’s another project named rApache that is also fully capable of running user-created applications. Unfortunately, you can’t run an application written for one on the other because their APIs differ. But if you rewrite them using the Rook specification, then you can theoretically run them on either web server, or on any future web server that supports R.

What does it mean that Rook is a specification and a package? You can read more about the specification here, but basically the specification defines a simple interface between a web server and your application. Put another way, Rook defines a calling convention so that a web server knows how to execute your web application. Basically, a web server packages up everything about a web request that your application needs to know and then hands it off to your application, for each and every request.
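As a rough illustration of that calling convention, here is a minimal sketch in base R. It assumes the application is a plain function that receives an environment of request variables and returns a list with `status`, `headers`, and `body`. The names `hello_app` and `fake_env` are hypothetical, the request variables shown are only a small illustrative subset, and the Rook specification remains the authority on the exact fields.

```r
# A hypothetical Rook-style application: one function, one argument.
hello_app <- function(env) {
  # Read a request variable that the web server is assumed to have set.
  path <- get("PATH_INFO", envir = env)
  list(
    status  = 200L,
    headers = list("Content-Type" = "text/plain"),
    body    = paste0("Hello from ", path, "\n")
  )
}

# Stand-in for the environment a web server would build for each request.
fake_env <- new.env()
assign("REQUEST_METHOD", "GET", envir = fake_env)
assign("PATH_INFO", "/hello", envir = fake_env)

# The "web server" hands the request to the application.
response <- hello_app(fake_env)
cat(response$body)
```

Because the application only sees the environment it is handed, the same function could in principle be called by any server that builds that environment the same way.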

The Rook package also contains convenience software that allows you to work with a web server in a more meaningful way. For instance, the Request class provides simple methods for doing things like converting CGI and POST data to R lists. Rook also contains a wrapper class named Rhttpd for working with R’s internal web server in a very simple way. Let’s look at a picture of how Rhttpd looks in the R interpreter below:

Rack within Rhttpd

On the left-hand side of the big blue box you see two inputs to the R interpreter:

1) The Keyboard interfaces to the R Read-Eval-Print Loop (REPL). This isn’t part of Rook, but it is a way to learn a bit about interfacing. The command prompt, that darker blue circle containing the “>” character, is printed by the REPL, which is now patiently waiting for you to type valid R code and press the “Enter” key. It will then “R”ead what you have written, parse and “E”valuate it, “P”rint out the results, and then show you the command prompt again. So the interface is quite simple; you just need to know how to write R expressions and hit the “Enter” key ;)

2) The Web browser interfaces to an object of the Rhttpd class. We’ll presume it’s called Rhttpd for simplicity. When a web request is sent to R, Rhttpd intercepts it and creates a new R environment named “env”. It then finds the application it needs to run based on the URL. If the application is a simple R closure (i.e. a function), then it calls that closure and passes the “env” argument. If the application is a reference class with a “call” method, then the “call” method is invoked with the “env” argument. The results of each call are then transformed by Rhttpd into valid web responses and returned to the web browser. These are the two situations depicted by the lighter colored ellipses in the picture above, and you can read more about the details in the Rook specification.
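The two cases described above, a closure and a reference class with a “call” method, can be sketched in base R as follows. This is only an illustration of the idea, not Rhttpd’s actual code: `run_app`, `closure_app`, and `ClassApp` are hypothetical names, and the real server also maps URLs to applications and turns the result into a web response.

```r
library(methods)  # provides setRefClass; loaded explicitly for Rscript

# Case 1: the application is a plain closure taking env.
closure_app <- function(env) {
  list(status = 200L,
       headers = list("Content-Type" = "text/plain"),
       body = "closure")
}

# Case 2: the application is a reference class with a call() method.
ClassApp <- setRefClass("ClassApp",
  methods = list(
    call = function(env) {
      list(status = 200L,
           headers = list("Content-Type" = "text/plain"),
           body = "reference class")
    }
  )
)

# Hypothetical dispatcher: call the closure directly, otherwise
# invoke the object's call() method with the same env argument.
run_app <- function(app, env) {
  if (is.function(app)) app(env) else app$call(env)
}

env <- new.env()
closure_result <- run_app(closure_app, env)
class_result <- run_app(ClassApp$new(), env)
```

Either way the application receives the same “env” and returns the same kind of list, which is what lets the server treat both forms uniformly.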

So I hope that gives you a good overview of the why and the what of Rook. And another very important point is this:

You can use it today, right now, on your own desktop! Whether you’re running Windows, Mac, or some derivative of UNIX! Really! You don’t need any other software! Just install it and read the Rhttpd help page to learn how to run example applications.

Cheers!

12:24pm  |   URL: http://tmblr.co/Zf5rDy4PXXvq
Filed under: rstats 
April 15, 2011
Rook shipped to CRAN

Burning the midnight oil.

I sometimes feel like this picture: alone, left with just my thoughts and a little light to illuminate my work. For my daughter pictured above on the stool, her work is a pen and ink drawing. For me, it’s a finished piece of software to share with the world. There’s always a bit of melancholy felt at the end of a long project, and I guess it fits with today’s 100% chance of rain here in Nashville.

I just made the final spelling corrections for Rook’s documentation, executed the last “R CMD check”, which passed with 100%, and sent it to CRAN. On Monday I’ll make a formal announcement. If you can’t wait, then you can get it here:

https://github.com/jeffreyhorner/rRack/blob/master/Rack_1.0.tar.gz?raw=true

9:06am  |   URL: http://tmblr.co/Zf5rDy4KDgbk
Filed under: rstats 
March 24, 2011
R Still On Top

WARNING!!! YOU ARE IN THE DATA SCIENTIST DANGER ZONE!

According to the Google Ngram corpus, R is still the top-rated statistical software package.

Ok, I’m just kidding. That plot is worthless. All the data are from books published between the years 1890 and 2008, and none of those software packages would show up in books until possibly the mid to late 1990s.

For a serious comparison of statistical software packages, scope out Robert A. Muenchen’s “The Popularity of Data Analysis Software”.

If you want to explore more with the Google Ngram data, check out my Rook application. It’s a total hack. I’m using Apache to proxy to nginx that proxies to a bunch of R instances running my Rook package. The application is running an unreleased version of R, using an unreleased memcache client in R.

The good news is that I found and fixed a bug in the internal R web server. 

9:05am  |   URL: http://tmblr.co/Zf5rDy3oFofk
Filed under: rstats