Data Augmentation using Counterfactuals: Proximity vs Diversity

Md Golam Moula Mehedi Hasan; Douglas Talbert

doi:10.32473/flairs.v35i.130705

Authors

Md Golam Moula Mehedi Hasan, Tennessee Technological University
Douglas Talbert, Tennessee Technological University

DOI:

https://doi.org/10.32473/flairs.v35i.130705

Abstract

Counterfactual explanations are gaining popularity as a way of explaining machine learning models. Counterfactual examples are generally created to help interpret a model's decision: if a model makes a certain decision for an instance, the counterfactual examples of that instance reverse that decision. They can be created by carefully changing particular feature values of the instance. Although counterfactual examples are generated to explain the decisions of machine learning models, we have previously explored using them for effective data augmentation. In this work, we investigate which kind of counterfactual examples works best for data augmentation. In particular, we generate counterfactual examples from two perspectives, proximity and diversity, and observe which perspective works best. We demonstrate the efficacy of these approaches on the widely used “Adult-Income” dataset. We consider several scenarios in which we do not have enough data and use each approach to augment the dataset. We compare the two approaches and discuss the implications of the results.
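
To make the two perspectives concrete, the following is a minimal sketch and not the authors' method. It assumes a toy two-feature instance (age, weekly hours), a hand-written linear rule standing in for a trained classifier, and a brute-force search over a small grid of feature changes. All names (`predict`, `candidate_counterfactuals`, `select_by_proximity`, `select_by_diversity`) are hypothetical. Proximity is read here as "closest to the original instance" and diversity as "far apart from one another", implemented with a simple greedy farthest-point rule.

```python
import itertools
import math


def predict(x):
    """Toy stand-in for a trained model (an assumption, not from the paper)."""
    age, hours = x
    return 1 if age + 2 * hours - 120 > 0 else 0


def candidate_counterfactuals(x, deltas_per_feature):
    """Try every combination of feature changes; keep those that flip the prediction."""
    original = predict(x)
    found = []
    for deltas in itertools.product(*deltas_per_feature):
        cand = tuple(v + d for v, d in zip(x, deltas))
        if predict(cand) != original:
            found.append(cand)
    return found


def distance(a, b):
    """Euclidean distance between two feature tuples."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def select_by_proximity(x, candidates, k):
    """Keep the k counterfactuals closest to the original instance."""
    return sorted(candidates, key=lambda c: distance(x, c))[:k]


def select_by_diversity(x, candidates, k):
    """Greedy farthest-point selection: start from the closest counterfactual,
    then repeatedly add the one farthest from everything chosen so far."""
    if not candidates:
        return []
    chosen = [min(candidates, key=lambda c: distance(x, c))]
    remaining = [c for c in candidates if c != chosen[0]]
    while remaining and len(chosen) < k:
        nxt = max(remaining, key=lambda c: min(distance(c, s) for s in chosen))
        chosen.append(nxt)
        remaining.remove(nxt)
    return chosen


def min_pairwise_distance(points):
    """Smallest distance between any two selected points (a crude spread measure)."""
    return min(distance(a, b) for a, b in itertools.combinations(points, 2))


x = (30, 40)  # hypothetical instance, predicted as class 0 by the toy rule
candidates = candidate_counterfactuals(x, [(0, 10, 20), (0, 10, 20)])
near = select_by_proximity(x, candidates, k=3)
spread = select_by_diversity(x, candidates, k=3)

# Each selected counterfactual would be appended to the training set
# with the flipped label predict(c); that step is omitted here.
print("proximity:", near)
print("diversity:", spread)
```

In this sketch the proximity set stays close to the original instance, while the diversity set covers more of the region on the other side of the decision boundary. Which of the two helps more as augmented training data is the question the paper examines.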
