Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL), 2022.
Abstract
When pre-trained contextualized embedding-based models developed for unstructured data are adapted for structured tabular data, they perform admirably. However, recent probing studies show that these models use spurious correlations, and often predict inference labels by focusing on false evidence or ignoring it altogether. To study this issue, we introduce the task of Trustworthy Tabular Reasoning, where a model needs to extract evidence to be used for reasoning, in addition to predicting the label. As a case study, we propose a two-stage sequential prediction approach, which includes an evidence extraction and an inference stage. First, we crowdsource evidence row labels and develop several unsupervised and supervised evidence extraction strategies for InfoTabS, a tabular NLI benchmark. Our evidence extraction strategy outperforms earlier baselines. On the downstream tabular inference task, using only the automatically extracted evidence as the premise, our approach outperforms prior benchmarks.
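Below is a minimal sketch of the two-stage pipeline the abstract describes: a first stage that extracts evidence rows from the table, and a second stage that predicts the inference label using only that evidence as the premise. It is a hypothetical illustration, not the authors' implementation: the helper names (`score_row`, `extract_evidence`, `predict`, `nli_model`) are made up for this sketch, and the token-overlap scorer stands in for the paper's unsupervised and supervised extraction strategies.

```python
# Hypothetical sketch of the two-stage "extract evidence, then infer" pipeline.
# Stage 1 scores each (key, value) row of the table against the hypothesis and
# keeps the top-k rows as evidence; Stage 2 runs any premise-hypothesis NLI
# model on a premise built from the evidence rows only.
from typing import Callable, Dict, List, Tuple


def score_row(row: Tuple[str, str], hypothesis: str) -> float:
    """Unsupervised relevance score: token overlap between a table row and the
    hypothesis (a simple stand-in for the paper's extraction models)."""
    row_tokens = set(f"{row[0]} {row[1]}".lower().split())
    hyp_tokens = set(hypothesis.lower().split())
    return len(row_tokens & hyp_tokens) / max(len(hyp_tokens), 1)


def extract_evidence(table: Dict[str, str], hypothesis: str, k: int = 3) -> List[Tuple[str, str]]:
    """Stage 1: rank rows by relevance to the hypothesis and keep the top-k."""
    ranked = sorted(table.items(), key=lambda row: score_row(row, hypothesis), reverse=True)
    return ranked[:k]


def predict(table: Dict[str, str], hypothesis: str,
            nli_model: Callable[[str, str], str], k: int = 3) -> str:
    """Stage 2: build a premise from the extracted rows and run an NLI model."""
    evidence = extract_evidence(table, hypothesis, k)
    premise = ". ".join(f"The {key} is {value}" for key, value in evidence)
    return nli_model(premise, hypothesis)
```

In the paper's setting, the scoring function would be replaced by an extractor trained or evaluated against the crowdsourced evidence row labels for InfoTabS, and `nli_model` by a pre-trained contextualized model fine-tuned for tabular NLI.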
Links
- Link to paper
- Link to code
- Website for the project with data and other information
- See on Google Scholar
Bib Entry
@inproceedings{gupta2022right-for,
  author    = {Gupta, Vivek and Zhang, Shuo and Vempala, Alakananda and He, Yujie and Choji, Temma and Srikumar, Vivek},
  title     = {{Right for the Right Reason: Evidence Extraction for Trustworthy Tabular Reasoning}},
  booktitle = {Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL)},
  year      = {2022}
}