Location: Ballroom A
Keynote: Uncovering Principles of Statistical Visualization
Andrew Gelman, Department of Statistics and Department of Political Science, Columbia University
Abstract: Visualizations are central to good statistical workflow, but it has been difficult to establish general principles governing their use. We will try to back out some principles of visualization by considering examples of effective and ineffective uses of graphics in our own applied research. We consider connections between three goals of visualization: (a) vividly displaying results, (b) exploration of unexpected patterns in data, and (c) understanding fitted models.
Bio: Andrew Gelman is a professor of statistics and political science and director of the Applied Statistics Center at Columbia University. He has received the Outstanding Statistical Application award from the American Statistical Association, the award for best article published in the American Political Science Review, and the Council of Presidents of Statistical Societies award for outstanding contributions by a person under the age of 40. His books include Bayesian Data Analysis (with John Carlin, Hal Stern, David Dunson, Aki Vehtari, and Don Rubin), Teaching Statistics: A Bag of Tricks (with Deb Nolan), Data Analysis Using Regression and Multilevel/Hierarchical Models (with Jennifer Hill), Red State, Blue State, Rich State, Poor State: Why Americans Vote the Way They Do (with David Park, Boris Shor, and Jeronimo Cortina), and A Quantitative Tour of the Social Sciences (co-edited with Jeronimo Cortina). Andrew has done research on a wide range of topics, including: why it is rational to vote; why campaign polls are so variable when elections are so predictable; why redistricting is good for democracy; reversals of death sentences; police stops in New York City; the statistical challenges of estimating small effects; the probability that your vote will be decisive; seats and votes in Congress; social network structure; arsenic in Bangladesh; radon in your basement; toxicology; medical imaging; and methods in surveys, experimental design, statistical inference, computation, and graphics.
(Posters will be on display during the coffee breaks)
Keynote: Interpretability — Now What?
Been Kim, Google
Abstract: In this talk, I hope to reflect on some of the progress made in the field of interpretable machine learning. We will reflect on where we are going as a field, and on what we need to be aware of and careful about as we make progress. With that perspective, I will then discuss some of my recent work on 1) sanity-checking popular methods and 2) developing more layperson-friendly interpretability methods.
Bio: Been Kim is a senior research scientist at Google Brain. Her research focuses on building interpretable machine learning — making ML understandable by humans for more responsible AI. The vision of her research is to make humans empowered by machine learning, not overwhelmed by it. She gave tutorials on the topic at ICML in 2017 and at CVPR and MLSS (University of Toronto) in 2018. She was a workshop co-chair at ICLR 2019, and has been an area chair at NIPS, ICML, AISTATS, and FAT* conferences. In 2018, she gave a talk at the G20 digital economy summit in Argentina. In 2019, her work TCAV received the UNESCO Netexplo award for "breakthrough digital innovations with the potential of profound and lasting impact on the digital society". This work was also part of the CEO's keynote at Google I/O '19. She received her PhD from MIT.
Paper Session: Application I
Paper Session: Application II
Paper Session: Encoding
Paper Session: Perception
Keynote: Behind every great vis ... there's a great deal of wrangling
Jenny Bryan, RStudio
Abstract: If you are struggling to make a plot, tear yourself away from Stack Overflow for a moment and ... take a hard look at your data. Is it really in the most favorable form for the task at hand? I must confess that I am no visualization pro. But even as a data analyst, I've made a great number of plots. Time and time again I have found that my visualization struggles are really a symptom of unfinished data wrangling. I will give an overview of the data wrangling landscape, with a special emphasis on R (my language of choice), but including high-level principles that are applicable to those working in other languages.
Bio: Jenny Bryan (Twitter, GitHub) is a Software Engineer at RStudio, working on the team led by Hadley Wickham. This team brings you the popular set of packages known as the Tidyverse, as well as a set of low-level infrastructure packages. She is on leave from being an Associate Professor of Statistics at the University of British Columbia. Jenny has an undergraduate degree in Economics and German Literature from Yale University and earned her PhD in Biostatistics at UC Berkeley. Jenny has been using and teaching R (or S!) for 20 years, most recently in STAT 545 and Software Carpentry. Other aspects of her R life include ordinary membership in the R Foundation, work with rOpenSci, development and maintenance of R packages (such as readxl), and leading the curriculum development for UBC's Master of Data Science.