Confidence and prediction intervals explained... (with a Shiny app!)

This semester I started teaching an introduction to statistics and data analysis with R at Tel Aviv University. I put a lot of effort into bringing practical challenges, real-life examples, and many demonstrations of statistical theory with R. This post is an example of how I’ve been using R code (and specifically Shiny apps) to demonstrate statistical theory and concepts and to provide intuition. What’s the difference between confidence and prediction intervals?

Retrieving Google Drive item shares and permissions (in R)

Google Drive is a great tool, and we’ve been using “G Suite” (the equivalent of Google Drive for businesses) for a long time. Lately I noticed it’s missing an important feature: monitoring file shares and permissions of Google Drive items across the organization is non-trivial (at least in the G Suite Basic subscription). I wanted to get a better sense of how my files and folders are shared with users within and outside the organization.

Securing Shiny apps with AWS Cognito authentication

Background: Shiny apps are a great way to share information and empower your users. Sometimes you want to make sure that only authenticated and authorized users can view your Shiny apps. There are a number of ways to restrict access to your apps. For example, you can subscribe to the professional plan on shinyapps.io, which has this option built in. You can program the authentication flow yourself, or you can use a third-party service such as Google Firebase, AWS Cognito, Auth0, or others.

What NOT to do when building a Shiny app (lessons learned the hard way)

I’ve been building R Shiny apps for a while now, and ever since I started working with Shiny, it has significantly expanded the set of services I offer my clients. Here I document some of the many lessons I learned in previous projects. Hopefully, others can avoid the same mistakes in the future. Background: Shiny is a really great tool that allows data scientists to communicate their analysis in an appealing and effective way.

Test your tidyness - a short quiz to check your tidyverse capabilities

Over the last month I gave a tidyverse + intro to data science corporate training at a startup in Tel Aviv. We had two groups (beginners and intermediates), and for the last assignment of the course I was aiming for a short quiz covering various topics from the course, in a form that can also be graded automatically (i.e., multiple-choice questions). I came up with the following quiz, which I thought would be nice to share here.

The teachR's::cheat sheet

A few months ago I attended the 2019 rstudio::conf, including the shiny train-the-trainer workshop. It was a two-day workshop and it inspired me in many ways. The first day of the workshop focused on the very basics of teaching (R or anything else), and for me it put the spotlight on things I had never considered before. One of the important takeaways from the workshop was how to approach educating others: preparing for a course, things you can do during the lessons, and how to self-learn and improve your own teaching methods afterwards.

Settling class action lawsuits with conjoint analysis and R (+a conjoint Shiny app)

A few days ago I presented at the 9th Israeli class action lawsuit conference. You’re probably asking yourself what a data scientist would do in a room full of lawyers. Apparently, there is a lot to do… Here’s the story: being in market research, we get a lot of lawyers who are faced with class action lawsuits (either suing or being sued), and they hire us to conduct research and estimate things like the size of the group for the class action, or the total damages incurred by the group.

Purrring progress bars (adding a progress bar to `purrr::map`)

With all the functional programming going on (i.e., purrr::map and the like), there is at least one thing that I found missing: progress bars. The plyr::do function had a nice-looking progress bar that opened by default if the operation took more than 2 seconds and had at least two more to go (as per Hadley’s description in Issue#149 in tidyverse/purrr). The issue is still open at the time of writing these lines, and will probably be solved sometime in the near future as a feature of purrr::map.

Short note about tidyeval

Following Jenny Bryan’s talk on tidyeval at the last rstudio::conf 2019, I decided to write this short note (mainly as a reminder to myself). What is tidyeval? Tidy evaluation, or non-standard evaluation, allows us to pass column names between functions. This is the “classic” behaviour of most tidyverse functions. For example, we use: library(tidyverse) mtcars %>% select(mpg, cyl) ## mpg cyl ## Mazda RX4 21.0 6 ## Mazda RX4 Wag 21.

Recap: what I learned in rstudio::conf2019

First, let me start by saying wow! What a wonderful experience. When I booked the trip from Israel to Austin, TX, I thought that I’d see some good content and learn at the conference (as I in fact did). It was much more enjoyable than I could’ve imagined. In part, I guess this can be attributed to the awesome R community. The ease with which you can start a conversation with just about anyone at the conference, about R, professional life (or even personal life), is great.