Don't Panic! Better, Fewer, Syntax Errors for LR Parsers (ECOOP 2020 - Artifacts)

Sun 15 - Tue 17 November 2020 Online Conference

co-located with SPLASH 2020

Who

Lukas Diekmann, Laurence Tratt

Track

ECOOP 2020 Artifacts

Abstract

Syntax errors are generally easy for humans to fix, but not for parsers in general, and LR parsers in particular. Traditional ‘panic mode’ error recovery, though easy to implement and applicable to any grammar, often leads to a cascading chain of errors that drowns out the original error. More advanced error recovery techniques suffer less from this problem but have seen little practical use because their typical performance was seen as poor, their worst case unbounded, and the repairs they reported arbitrary. In this paper we introduce an algorithm and implementation that address these issues. First, we report the complete set of minimum cost repair sequences for a given location, allowing programmers to select the one that best fits their intention. Second, on a corpus of 200,000 real-world syntactically invalid Java programs, we are able to repair 98.38%±0.018% of files within a cut-off of 0.5s. Finally, we use the existence of the complete set of minimum cost repair sequences to reduce one of the most frustrating consequences of error reporting: the cascading error problem. Across our corpus, we report 435,824±480 error locations to the user, while the panic mode algorithm reports 981,628±0 error locations: in other words, we reduce the cascading error problem by well over half.
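The "well over half" claim at the end of the abstract can be checked with a little arithmetic. The sketch below is only an illustration, not part of the artifact: it uses the two mean error-location counts quoted above and ignores their ± intervals, and the names `relative_reduction`, `repaired_locations`, and `panic_locations` are hypothetical.

```python
# Mean error-location counts quoted in the abstract (the ± intervals are ignored here).
repaired_locations = 435_824  # reported by the paper's error recovery
panic_locations = 981_628     # reported by panic mode


def relative_reduction(new: int, baseline: int) -> float:
    """Return the fraction by which `new` is smaller than `baseline`."""
    return 1 - new / baseline


reduction = relative_reduction(repaired_locations, panic_locations)
print(f"{reduction:.1%}")  # roughly 55.6%
```

On these figures the number of reported error locations drops by roughly 55.6% relative to panic mode, which is consistent with the abstract's wording.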

Lukas Diekmann, King's College London, United Kingdom

Laurence Tratt, King's College London, United Kingdom
