Vivek Gupta, Shuo Zhang, Alakananda Vempala, Yujie He, Temma Choji and Vivek Srikumar
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL), 2022.

Abstract

When pre-trained contextualized embedding-based models developed for unstructured data are adapted to structured tabular data, they perform admirably. However, recent probing studies show that these models exploit spurious correlations, and often predict inference labels by focusing on false evidence or ignoring it altogether. To study this issue, we introduce the task of Trustworthy Tabular Reasoning, where a model must extract the evidence used for reasoning, in addition to predicting the label. As a case study, we propose a two-stage sequential prediction approach, which includes an evidence extraction stage and an inference stage. First, we crowdsource evidence row labels and develop several unsupervised and supervised evidence extraction strategies for InfoTabS, a tabular NLI benchmark. Our evidence extraction strategy outperforms earlier baselines. On the downstream tabular inference task, using only the automatically extracted evidence as the premise, our approach outperforms prior benchmarks.
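The two-stage "extract evidence, then infer" pipeline described in the abstract can be illustrated with a minimal sketch. This is a toy stand-in, not the authors' implementation: the word-overlap heuristics below replace the paper's learned extraction and inference models, and all function names and thresholds are invented for illustration.

```python
# Illustrative sketch of a two-stage tabular-NLI pipeline:
# Stage 1 selects evidence rows from the table; Stage 2 predicts
# the inference label using ONLY the extracted evidence as premise.
# Heuristics here are placeholders for trained models.
from typing import List


def _toks(s: str) -> set:
    """Lowercase, punctuation-stripped word set (toy tokenizer)."""
    return {w.strip(".,:;").lower() for w in s.split()}


def extract_evidence(rows: List[str], hypothesis: str) -> List[str]:
    """Stage 1 (stand-in for a learned extractor): keep rows that
    share at least one word with the hypothesis."""
    hyp = _toks(hypothesis)
    return [r for r in rows if _toks(r) & hyp]


def infer(evidence: List[str], hypothesis: str) -> str:
    """Stage 2 (stand-in for an NLI model): label based on how much
    of the hypothesis is covered by the extracted evidence alone."""
    if not evidence:
        return "neutral"
    hyp = _toks(hypothesis)
    covered = hyp & _toks(" ".join(evidence))
    # Invented threshold purely for the sketch.
    return "entailment" if len(covered) / len(hyp) > 0.5 else "neutral"


if __name__ == "__main__":
    table_rows = ["Born: 1967 in London", "Occupation: Actor", "Height: 180 cm"]
    hypothesis = "The actor was born in London"
    evidence = extract_evidence(table_rows, hypothesis)
    print(evidence)                      # rows mentioning hypothesis words
    print(infer(evidence, hypothesis))   # label from evidence-only premise
```

The key design point mirrored here is that Stage 2 never sees the full table, only Stage 1's output, so the predicted label is forced to rest on the extracted evidence.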

Bib Entry

@inproceedings{gupta2022right-for,
  author = {Gupta, Vivek and Zhang, Shuo and Vempala, Alakananda and He, Yujie and Choji, Temma and Srikumar, Vivek},
  title = {{Right for the Right Reason: Evidence Extraction for Trustworthy Tabular Reasoning}},
  booktitle = {Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL)},
  year = {2022}
}