Data Augmentation using Counterfactuals: Proximity vs Diversity

Authors

  • Md Golam Moula Mehedi Hasan, Tennessee Technological University
  • Douglas Talbert, Tennessee Technological University

DOI:

https://doi.org/10.32473/flairs.v35i.130705

Abstract

Counterfactual explanations are gaining popularity as a way of explaining machine learning models. Counterfactual examples are generally created to help interpret a model's decision: if a model makes a certain decision for an instance, the counterfactual examples of that instance reverse that decision. They can be created by carefully changing particular feature values of the instance. Although counterfactual examples are generated to explain the decisions of machine learning models, we have already explored how they can be used for effective data augmentation. In this work, we explore what kind of counterfactual examples work best for data augmentation. In particular, we generate counterfactual examples from two perspectives, proximity and diversity, and observe which perspective works best. We demonstrate the efficacy of these approaches on the widely used “Adult-Income” dataset. We consider several scenarios in which we do not have enough data and use each approach to augment the dataset. We then compare the two approaches and discuss the implications of the results.
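
As a rough illustration of the ideas in the abstract, the sketch below builds counterfactual examples for a toy classifier and measures them in terms of proximity and diversity. It is not the authors' method: the linear model, its weights, the one-feature-at-a-time search, the `margin` parameter, and the use of L1 distance are all assumptions made only for this example.

```python
from itertools import combinations

# Hypothetical toy model: a fixed linear classifier over two numeric features.
WEIGHTS = [1.0, 2.0]
BIAS = -4.0

def score(x):
    """Signed distance-like score; positive means class 1."""
    return sum(w * v for w, v in zip(WEIGHTS, x)) + BIAS

def predict(x):
    return 1 if score(x) > 0 else 0

def l1(a, b):
    """L1 distance, used here as an assumed proximity measure."""
    return sum(abs(p - q) for p, q in zip(a, b))

def single_feature_counterfactuals(x, margin=0.5):
    """Change one feature at a time just enough to flip the prediction."""
    s = score(x)
    target = -margin if s > 0 else margin  # score we want after the change
    results = []
    for i, w in enumerate(WEIGHTS):
        if w == 0:
            continue  # this feature cannot move the score
        cf = list(x)
        cf[i] += (target - s) / w
        results.append(cf)
    return results

instance = [1.0, 1.0]                      # predicted class 0
cfs = single_feature_counterfactuals(instance)

# Proximity view: prefer the counterfactual closest to the original instance.
closest = min(cfs, key=lambda c: l1(instance, c))

# Diversity view: mean pairwise distance among the counterfactuals.
pairs = list(combinations(cfs, 2))
diversity = sum(l1(a, b) for a, b in pairs) / len(pairs)

# Augmentation: add the counterfactuals, labeled with the flipped class.
augmented = [(instance, predict(instance))] + [(c, predict(c)) for c in cfs]

print(closest, diversity, augmented)
```

A proximity-oriented augmentation would keep only counterfactuals like `closest`, while a diversity-oriented one would keep a set whose members differ from one another, as `diversity` measures here.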

Published

04-05-2022

How to Cite

Mehedi Hasan, M. G. M., & Talbert, D. (2022). Data Augmentation using Counterfactuals: Proximity vs Diversity. The International FLAIRS Conference Proceedings, 35. https://doi.org/10.32473/flairs.v35i.130705

Section

Main Track Proceedings