The Role of Transformer Architecture in the Logic-as-Loss Framework
Mattia Medina Grespan and Vivek Srikumar
Machine Learning and Knowledge Discovery in Databases. Research Track (ECML PKDD), 2025.
Abstract
The logic-as-loss framework has enabled transformer models to incorporate domain knowledge by encoding logical constraints as differentiable objectives, allowing neural networks to learn from explicit rules. Despite its effectiveness across diverse tasks, the relationship between neural architecture and rule internalization remains poorly understood. This study systematically investigates how transformer encoder configurations influence the ingestion of logical rules, beyond simply scaling up model capacity. We aim to identify the architectural factors that enable successful rule internalization and the inherent limitations of this process. Empirical analysis on controlled reasoning tasks reveals a capacity threshold: transformers perform poorly at rule adherence below a critical parameter count, while performance plateaus above it. A key finding is that embedding dimensionality drives rule ingestion efficacy, while increased network depth mitigates spurious solutions that satisfy rules without improving task performance. Our work highlights the role of architectural design choices for effective neuro-symbolic learning.
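To make the framework concrete, below is a minimal sketch of a logic-as-loss style objective in PyTorch. It relaxes a rule of the form premise → conclusion with the product (Goguen) implication and adds its negative log-satisfaction to the usual cross-entropy term. The function names, the rule_weight coefficient, and the choice of relaxation are illustrative assumptions, not the paper's exact formulation.

import torch
import torch.nn.functional as F

def implication_loss(p_premise, p_conclusion, eps=1e-7):
    # Soft penalty for violating the rule (premise -> conclusion).
    # Product (Goguen) relaxation: the implication holds to degree
    # min(1, p_conclusion / p_premise); its negative log is the penalty.
    # This is one common relaxation, not necessarily the paper's choice.
    degree = torch.clamp(p_conclusion / (p_premise + eps), max=1.0)
    return -torch.log(degree + eps).mean()

def total_loss(logits, labels, p_premise, p_conclusion, rule_weight=0.5):
    # Standard supervised objective plus the differentiable rule penalty.
    task_loss = F.cross_entropy(logits, labels)
    rule_loss = implication_loss(p_premise, p_conclusion)
    return task_loss + rule_weight * rule_loss

In this sketch, p_premise and p_conclusion would be model-predicted probabilities for the atoms of the rule; the single rule_weight hyperparameter trades off rule adherence against task accuracy.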
Links
- Link to paper
- See on Google Scholar
Bib Entry
@inproceedings{medinagrespan2025role-of,
  author = {Medina Grespan, Mattia and Srikumar, Vivek},
  title = {{The Role of Transformer Architecture in the Logic-as-Loss Framework}},
  booktitle = {Machine Learning and Knowledge Discovery in Databases. Research Track (ECML PKDD)},
  year = {2025}
}