Guineng Zheng, Robert Ricci and Vivek Srikumar
ACM Conference on Reproducibility and Replicability, 2024.

Abstract

Logging mechanisms are a cornerstone of the development and maintenance of computer systems, gaining even greater prominence in the era of large-scale cloud-based applications. Their critical role in real-time system monitoring and behavior analysis makes them invaluable tools for system administrators and developers alike. Automated parsing of log messages, which turns the text of log messages into structured data, is a significant research area. Despite the number of log parsers that have emerged over the years, there has been a noticeable gap in the evaluation of these tools, reproduction of results, and direct comparisons between them on a level playing field. Recognizing this, we re-implemented twelve of the most popular log parsers from scratch, enabling them to be used in replication studies. This paper presents our open-source project LogFlux, a suite for evaluating automated log parsers so that studies involving them can be replicated. Through LogFlux, we aim to bridge the gap between theoretical log parsing methods and their practical application, offering a robust, easy-to-use solution that is accessible and effective for a range of users. Our experience in attempting to obtain results from many published log parser algorithms has shed light on important aspects of replication, such as the value of independent implementation for uncovering bugs and the need for careful software engineering to facilitate maintenance.

Bib Entry

@inproceedings{zheng2024logflux,
  author = {Zheng, Guineng and Ricci, Robert and Srikumar, Vivek},
  title = {{{LogFlux}}: {{A Software Suite For Replicating Results In Automated Log Parsing}}},
  booktitle = {ACM Conference on Reproducibility and Replicability},
  year = {2024}
}