ECOOP 2020
Sun 15 - Tue 17 November 2020 Online Conference
co-located with SPLASH 2020

Machine learning has been widely adopted in diverse science and engineering domains, aided by reusable libraries and quick development patterns. The TensorFlow library is probably the best-known representative of this trend and most users employ the Python API to its powerful back-end. TensorFlow programs are susceptible to several systematic errors, especially in the dynamic typing setting of Python. We present a static analysis that tracks the shapes of tensors across Python library calls and warns of several possible mismatches. The key technical aspects are a close modeling of library semantics with respect to tensor shape, and an identification of violations and error-prone patterns. Our analysis is powerful enough to statically detect (with 100% precision) 11 of the 14 shape-related TensorFlow bugs in the recent Zhang et al. empirical study—an independent slice of real-world bugs that also includes semantic violations not statically detectable.