Banner image courtesy of Kirk Goldsberry.

VDS @ IEEE VIS Program

Conference registration at http://ieeevis.org/year/2021/info/registration/conference-registration.

Sun. Oct 24, 2021, 8am - 12pm (US CENTRAL Time)
Sun. 8:00am-9:10am (US CENTRAL Time)
Opening & Keynote
Sun. 8:00am-8:10am (US CENTRAL Time)
Opening
TBD
Sun. 9:10am-9:25am (US CENTRAL Time)
Break
Sun. 9:25am-10:25am (US CENTRAL Time)
Papers
Sun. 9:25am-9:40am (US CENTRAL Time)
Subhajit Das, Alex Endert
Abstract: Machine learning (ML) models are constructed by expert ML practitioners using various coding languages, in which they tune and select model hyperparameters and learning algorithms for a given problem domain. In multi-objective optimization, conflicting objectives and constraints are a major concern: such problems involve several competing objectives for which no single optimal solution satisfies all desired objectives simultaneously. In the past, visual analytics (VA) systems have allowed users to interactively construct objective functions for a classifier. In this paper, we extend this line of work by prototyping a technique to visualize multi-objective objective functions, defined either in a Jupyter notebook or through an interactive visual interface, to help users detect and resolve conflicting objectives. Visualizing the objective function highlights potentially conflicting objectives that obstruct the selection of correct solution(s) for the desired ML task or goal. We also present an enumeration of potential conflicts in objective specification in multi-objective objective functions for classifier selection. Furthermore, we demonstrate our approach in a VA system that helps users specify meaningful objective functions for a classifier by detecting and resolving conflicting objectives.
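As a rough illustration of the conflict the abstract describes, the sketch below scores hypothetical candidate classifiers on two objectives and reports a conflict when no single candidate wins on both, i.e., when the Pareto front contains more than one model. The candidates, objectives, and scores are invented for illustration; this is not the paper's system.

    # Hypothetical candidate classifiers scored on two objectives
    # (higher is better for both). All values are made up.
    candidates = {
        "deep_tree":    {"accuracy": 0.92, "simplicity": 0.20},
        "shallow_tree": {"accuracy": 0.81, "simplicity": 0.90},
        "logreg":       {"accuracy": 0.85, "simplicity": 0.70},
    }

    def dominates(a, b):
        # a dominates b if it is at least as good on every objective
        # and strictly better on at least one.
        return (all(a[k] >= b[k] for k in a) and
                any(a[k] > b[k] for k in a))

    # Pareto front: candidates that no other candidate dominates.
    front = [n for n, s in candidates.items()
             if not any(dominates(t, s)
                        for m, t in candidates.items() if m != n)]

    # More than one Pareto-optimal model means the objectives conflict:
    # no single classifier satisfies all of them simultaneously.
    if len(front) > 1:
        print("Conflicting objectives; Pareto front:", front)

Here all three hypothetical models land on the front, which is exactly the situation where a visualization of the objective function would help an analyst decide which trade-off to accept.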
Sun. 9:40am-9:55am (US CENTRAL Time)
Joseph Cottam, Maria Glenski, Zhuanyi Huang, Ryan Rabello, Austin Golding, Svitlana Volkova, Dustin L Arendt
Abstract: Reasoning about cause and effect is one of the frontiers for modern machine learning. Many causality techniques reason over a "causal graph" provided as input to the problem. When a causal graph cannot be produced from human expertise, "causal discovery" algorithms can be used to generate one from data. Unfortunately, causal discovery algorithms vary wildly in their results due to unrealistic data and modeling assumptions, so the results still need to be manually validated and adjusted. This paper presents a graph comparison tool designed to help analysts curate causal discovery results. This tool facilitates feedback loops whereby an analyst compares proposed graphs from multiple algorithms (or ensembles) and then uses insights from the comparison to refine parameters and inputs to the algorithms. We illustrate different types of comparisons and show how the interplay of causal discovery and graph comparison improves causal discovery.
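To make the comparison concrete, here is a minimal sketch of one comparison such a tool might support: diffing the edge sets of causal graphs proposed by two discovery algorithms. The graphs and variable names are hypothetical, and networkx stands in for whatever graph representation the tool actually uses.

    import networkx as nx

    # Hypothetical outputs of two causal discovery algorithms.
    g1 = nx.DiGraph([("rain", "wet_grass"), ("sprinkler", "wet_grass")])
    g2 = nx.DiGraph([("rain", "wet_grass"), ("rain", "sprinkler")])

    agreed  = set(g1.edges) & set(g2.edges)  # edges both algorithms propose
    only_g1 = set(g1.edges) - set(g2.edges)  # unique to algorithm 1
    only_g2 = set(g2.edges) - set(g1.edges)  # unique to algorithm 2

    print("agreed:", agreed)                # higher-confidence edges
    print("disputed:", only_g1 | only_g2)   # candidates for manual review

The disputed set is where the feedback loop the abstract describes would kick in: the analyst inspects the disagreement, adjusts algorithm parameters or inputs, and re-runs the comparison.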
Sun. 9:55am-10:10am (US CENTRAL Time)
Deepthi Raghunandan, Zhe Cui, Kartik Krishnan, Segen Tirfe, Shenzhi Shi, Tejaswi Darshan Shrestha, Leilani Battle, Niklas Elmqvist
Abstract: Keeping abreast of current trends, technologies, and best practices in visualization and data analysis is becoming increasingly difficult, especially for fledgling data scientists. In this paper, we propose Lodestar, an interactive computational notebook that allows users to quickly explore and construct new data science workflows by selecting from a list of automated analysis recommendations. We derive our recommendations from directed graphs of known analysis states, with two input sources: one manually curated from online data science tutorials, and another extracted through semi-automatic analysis of a corpus of over 6,000 Jupyter notebooks. We evaluate Lodestar in a formative study guiding our next set of improvements to the tool. The evaluation suggests that users find Lodestar useful for rapidly creating data science workflows.
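The abstract's "directed graphs of known analysis states" suggest a simple recommendation scheme: rank the next states most frequently observed after the current one. The sketch below illustrates that idea with invented states and transition counts; Lodestar's actual recommender is surely richer than this.

    from collections import Counter

    # graph[state] counts transitions observed from that analysis state;
    # the states and counts here are hypothetical stand-ins for counts
    # mined from tutorials and a notebook corpus.
    graph = {
        "load_data":  Counter({"inspect_schema": 40, "drop_nulls": 25}),
        "drop_nulls": Counter({"plot_histogram": 30, "fit_model": 12}),
    }

    def recommend(state, k=2):
        # Return the k next states seen most often after `state`.
        return [nxt for nxt, _ in graph.get(state, Counter()).most_common(k)]

    print(recommend("load_data"))  # ['inspect_schema', 'drop_nulls']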
Sun. 10:10am-10:25am (US CENTRAL Time)
Anamaria Crisan, Vidya Setlur
Abstract: Data analysts routinely need to transform data into a form conducive to deeper investigation. While myriad tools support this task on tabular data, few exist to support analysts with more complex data types. In this study, we investigate how analysts process and transform large sets of XML data to create an analytic data model useful for furthering their analysis. We conduct a set of formative interviews with four experts who have diverse yet specialized knowledge of a common dataset. From these interviews, we derive a set of goals, tasks, and design requirements for transforming XML data into an analytic data model. We implement Natto as a proof-of-concept prototype that actualizes these design requirements into a set of visual and interaction design choices. We demonstrate the utility of the system through the presentation of analysis scenarios using real-world data. Our research contributes novel insights into the unique challenges of transforming data that is both hierarchical and internally linked. Further, it extends the knowledge of the visualization community in the areas of data preparation and wrangling.
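For a flavor of the transformation being studied, the sketch below flattens a tiny hierarchical, internally linked XML document into tabular rows by resolving an order's reference to its customer. The schema is invented, and Natto itself is an interactive visual tool, not a script like this.

    import xml.etree.ElementTree as ET

    # Hypothetical XML with an internal link: order/@ref -> customer/@id.
    doc = ET.fromstring("""
    <db>
      <customer id="c1"><name>Ada</name></customer>
      <order ref="c1"><total>42.0</total></order>
    </db>""")

    # Build a lookup for the link target, then emit flat analytic rows.
    customers = {c.get("id"): c.findtext("name")
                 for c in doc.iter("customer")}
    rows = [{"customer": customers[o.get("ref")],
             "total": float(o.findtext("total"))}
            for o in doc.iter("order")]
    print(rows)  # [{'customer': 'Ada', 'total': 42.0}]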
Sun. 10:25am-10:40am (US CENTRAL Time)
Break
Sun. 10:40am-11:40am (US CENTRAL Time)
Closing Keynote
TBD

KDD Program (Past)

Conference registration at https://kdd.org/kdd2021/attending.

Sun. August 15, 2021, 8am - 12pm (Singapore) / Sat. August 14, 2021, 5pm - 9pm (US West)
Sun. 8:00am-9:10am (Singapore)/Sat. 5:00pm-6:10pm (US West)
Opening & Keynote
Sun. 8:00am-8:10am (Singapore)/Sat. 5:00pm-5:10pm (US West)
Opening
Hanspeter Pfister
Sun. 8:10am-9:10am (Singapore)/Sat. 5:10pm-6:10pm (US West)
Keynote: Towards Visually Interactive Neural Probabilistic Models
Hanspeter Pfister, Harvard University
Abstract: Deep learning methods have been a tremendously effective approach to problems in computer vision and natural language processing. However, these black-box models can be difficult to deploy in practice as they are known to make unpredictable mistakes that can be hard to analyze and correct. In this talk, I will present collaborative research to develop visually interactive interfaces for probabilistic deep learning models, with the goal of allowing users to examine and correct black-box models through visualizations and interactive inputs. Through co-design of models and visual interfaces, we will take the necessary next steps for model interpretability. Achieving this aim requires active investigation into developing new deep learning models and analysis techniques, and integrating them within interactive visualization frameworks.

Bio: Hanspeter Pfister is the An Wang Professor of Computer Science at the Harvard John A. Paulson School of Engineering and Applied Sciences and an affiliate faculty member of the Center for Brain Science. His research in visual computing lies at the intersection of visualization, computer graphics, and computer vision and spans a wide range of topics, including biomedical image analysis and visualization, image and video analysis, interpretable machine learning, and visual analytics in data science. Pfister has a PhD in computer science from the State University of New York at Stony Brook and an MS in electrical engineering from ETH Zurich, Switzerland. From 2013 to 2017 he was director of the Institute for Applied Computational Science. Before joining Harvard, he worked for over a decade at Mitsubishi Electric Research Laboratories, where he was associate director and senior research scientist. He was the chief architect of VolumePro, Mitsubishi Electric’s award-winning real-time volume rendering graphics card, for which he received the Mitsubishi Electric President’s Award in 2000. Pfister was elected as an ACM Fellow in 2019. He is the recipient of the 2010 IEEE Visualization Technical Achievement Award, the 2009 IEEE Meritorious Service Award, and the 2009 Petra T. Shattuck Excellence in Teaching Award. Pfister is a member of the ACM SIGGRAPH Academy and the IEEE Visualization Academy, and a director of the ACM SIGGRAPH Executive Committee and the IEEE Visualization and Graphics Technical Committee.
Sun. 9:10am-9:25am (Singapore)/Sat. 6:10pm-6:25pm (US West)
Break
Sun. 9:25am-10:25am (Singapore)/Sat. 6:25pm-7:25pm (US West)
Papers
Sun. 9:25am-9:40am (Singapore)/Sat. 6:25pm-6:40pm (US West)
Subhajit Das, Alex Endert
Abstract: Machine learning (ML) models are constructed by expert ML practitioners using various coding languages, in which they tune and select model hyperparameters and learning algorithms for a given problem domain. In multi-objective optimization, conflicting objectives and constraints are a major concern: such problems involve several competing objectives for which no single optimal solution satisfies all desired objectives simultaneously. In the past, visual analytics (VA) systems have allowed users to interactively construct objective functions for a classifier. In this paper, we extend this line of work by prototyping a technique to visualize multi-objective objective functions, defined either in a Jupyter notebook or through an interactive visual interface, to help users detect and resolve conflicting objectives. Visualizing the objective function highlights potentially conflicting objectives that obstruct the selection of correct solution(s) for the desired ML task or goal. We also present an enumeration of potential conflicts in objective specification in multi-objective objective functions for classifier selection. Furthermore, we demonstrate our approach in a VA system that helps users specify meaningful objective functions for a classifier by detecting and resolving conflicting objectives.
Sun. 9:40am-9:55am (Singapore)/Sat. 6:40pm-6:55pm (US West)
Joseph Cottam, Maria Glenski, Zhuanyi Huang, Ryan Rabello, Austin Golding, Svitlana Volkova, Dustin L Arendt
Abstract: Reasoning about cause and effect is one of the frontiers for modern machine learning. Many causality techniques reason over a "causal graph" provided as input to the problem. When a causal graph cannot be produced from human expertise, "causal discovery" algorithms can be used to generate one from data. Unfortunately, causal discovery algorithms vary wildly in their results due to unrealistic data and modeling assumptions, so the results still need to be manually validated and adjusted. This paper presents a graph comparison tool designed to help analysts curate causal discovery results. This tool facilitates feedback loops whereby an analyst compares proposed graphs from multiple algorithms (or ensembles) and then uses insights from the comparison to refine parameters and inputs to the algorithms. We illustrate different types of comparisons and show how the interplay of causal discovery and graph comparison improves causal discovery.
Sun. 9:55am-10:10am (Singapore)/Sat. 6:55pm-7:10pm (US West)
Deepthi Raghunandan, Zhe Cui, Kartik Krishnan, Segen Tirfe, Shenzhi Shi, Tejaswi Darshan Shrestha, Leilani Battle, Niklas Elmqvist
Abstract: Keeping abreast of current trends, technologies, and best practices in visualization and data analysis is becoming increasingly difficult, especially for fledgling data scientists. In this paper, we propose Lodestar, an interactive computational notebook that allows users to quickly explore and construct new data science workflows by selecting from a list of automated analysis recommendations. We derive our recommendations from directed graphs of known analysis states, with two input sources: one manually curated from online data science tutorials, and another extracted through semi-automatic analysis of a corpus of over 6,000 Jupyter notebooks. We evaluate Lodestar in a formative study guiding our next set of improvements to the tool. The evaluation suggests that users find Lodestar useful for rapidly creating data science workflows.
Sun. 10:10am-10:25am (Singapore)/Sat. 7:10pm-7:25pm (US West)
Anamaria Crisan, Vidya Setlur
Abstract: Data analysts routinely need to transform data into a form conducive to deeper investigation. While myriad tools support this task on tabular data, few exist to support analysts with more complex data types. In this study, we investigate how analysts process and transform large sets of XML data to create an analytic data model useful for furthering their analysis. We conduct a set of formative interviews with four experts who have diverse yet specialized knowledge of a common dataset. From these interviews, we derive a set of goals, tasks, and design requirements for transforming XML data into an analytic data model. We implement Natto as a proof-of-concept prototype that actualizes these design requirements into a set of visual and interaction design choices. We demonstrate the utility of the system through the presentation of analysis scenarios using real-world data. Our research contributes novel insights into the unique challenges of transforming data that is both hierarchical and internally linked. Further, it extends the knowledge of the visualization community in the areas of data preparation and wrangling.
Sun. 10:25am-10:40am (Singapore)/Sat. 7:25pm-7:40pm (US West)
Break
Sun. 10:40am-11:40am (Singapore)/Sat. 7:40pm-8:40pm (US West)
Closing Keynote
Arvind Satyanarayan
Sun. 10:40am-11:40am (Singapore)/Sat. 7:40pm-8:40pm (US West)
Keynote: From Tools to Toolkits - Towards More Reusable, Composable, and Reliable Machine Learning Interpretability
Arvind Satyanarayan, MIT
Abstract: As machine learning models are increasingly deployed into real-world contexts, the need for interpretability grows more urgent. In order to hold models accountable for the outcomes they produce, we cannot rely on quantitative measures of accuracy alone; rather, we also need to be able to qualitatively inspect how they operate. To meet this challenge, recent years have seen an explosion of research developing techniques and systems to interpret model behavior. But, are we making meaningful progress on this issue? In this talk, I will give us a language for answering this question by drawing on frameworks in human-computer interaction (HCI) and by analogizing to the progress of research in data visualization. I will use this language to characterize existing work (including work my research group is currently conducting) and sketch out directions for future work.

Bio: Arvind Satyanarayan is the NBX Assistant Professor of Computer Science in the MIT EECS department and a member of the Computer Science and Artificial Intelligence Lab (CSAIL). He leads the MIT Visualization Group, which uses data visualization as a petri dish to study intelligence augmentation (IA), or how software systems can help amplify our cognition and creativity while respecting our agency. His work has been recognized with an NSF CAREER award and a Google Research Scholar award, best paper awards at premier academic venues (e.g., ACM CHI and IEEE VIS), and by practitioners (e.g., with an Information is Beautiful Award nomination). Visualization toolkits and systems he has developed with collaborators are widely used in industry (including at Apple, Google, and Microsoft), on Wikipedia, and in the Jupyter/Observable data science communities. From 2018 to 2020, he served as a co-editor of Distill, an academic journal devoted to clarity in machine learning research.