The WinVector LLC data science blog is starting a new series on using Spark and R to handle big data.
Our goal
What we want to do with the “R and big data” series is:

Give a taste of some of the power of the R/Spark combination.
Share a “capabilities and readiness” checklist you should apply when evaluating infrastructure.

Start to publicly document R/Spark best practices.
Describe some of the warts and how to work around them.

Share fun tricks and techniques that make working with R/Spark much easier and more effective.