This talk was presented at the Budapest Users of R Network on Aug 26, 2015, in two parts:
• In the first part I will reveal why I switched to R for most of my data munging, analysis, visualization and machine learning needs in 2006. While R has long been dominant in academia, by (accidentally) founding the R meetup in Los Angeles in 2009 I had the privilege of closely witnessing the rise of R in industry and how it became the most widely used tool in the field now called data science. I’m going to share a few use cases from past LA R meetup talks and from informal discussions with seasoned R users in LA working for companies such as Google, Netflix, Activision etc.
• In the second, more technical part of the talk, I’m going to present a benchmark of tools for interactive data munging and show how, using R on a laptop, you can get results faster than with a Hadoop/Spark cluster on datasets with hundreds of millions of rows. I will also argue for the importance of data visualization and touch briefly on machine learning with R.
Szilard Pafka studied Physics in the 90s in Budapest and obtained a PhD using statistical methods to analyze the risk of financial portfolios. He then worked at a bank, quantifying and managing market risk. About a decade ago he moved to California to become the Chief Scientist of a credit card processing company, doing everything data (ETL, analysis, visualization, machine learning etc). He is also the founder/organizer of several data science-related meetups in Santa Monica, the epicenter of startups and tech companies in the Los Angeles area.
Slides: https://speakerdeck.com/szilard/r-sto...
More details at the event homepage: http://www.meetup.com/Budapest-Users-...