
Funded DFG Project on Proofs as Big Data

November 11, 2025

Proofs of unsatisfiability have become a core part of the theory, practice, and research surrounding SAT solving. They are essential artifacts for several reasons. Since proofs can be checked by independent checker tools that are formally verified down to the machine-code level, they provide a correctness guarantee and are thus indispensable wherever SAT solving touches mission-critical systems. In mathematical theorem proving, proofs are necessary to reach the high level of confidence and trust required in a result and to allow other parties to confirm it independently. Beyond that, proofs can serve as certificates for the safety or correctness of a system, can aid in debugging faulty solver behavior, and can be used to identify unsatisfiable subsets of a formula.
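
To make this concrete, here is a textbook-style miniature (our own illustration, not an artifact of the project). The formula below, given in DIMACS CNF format, forces x1 and x2 through all four sign combinations and is therefore unsatisfiable. The accompanying DRAT proof first derives the unit clause x2 and then the empty clause; each step is justified by reverse unit propagation (RUP), meaning that assuming the negation of the clause being added leads to a conflict by unit propagation alone.

formula.cnf:

    p cnf 2 4
    1 2 0
    -1 2 0
    1 -2 0
    -1 -2 0

proof.drat:

    2 0
    0

A checker such as drat-trim validates this with the invocation drat-trim formula.cnf proof.drat; formally verified checkers such as cake_lpr play the same role for the annotated LRAT format.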

To date, proof support for state-of-the-art distributed SAT solving still exhibits an undesirable gap between scalable real-time proof checking, which does not produce a persistent and transferable artifact, and explicit “monolithic” proof production, which ultimately does not scale because the potentially huge proof must be written and checked sequentially. Achieving the best of both worlds may seem infeasible given this intrinsically sequential writing and checking. However, we argue that the underlying issue is that distributed solvers are forced to abide by the standards of sequential proof technology, which was never designed for this purpose and is therefore insufficient for full scalability.

The central objective of our DFG-funded project “Propositional Proofs as Big Data” (DFG-Einzelantrag) is to bridge this research gap by devising formal, algorithmic, and practical frameworks for producing and validating propositional proofs that are designed specifically for (massively) parallel computation and are, as such, fully scalable. We propose to treat producing and checking a proof of unsatisfiability as Big Data processing tasks: just like distributed operations in Big Data frameworks (e.g., MapReduce, Apache Spark, Thrill), proof data will need to be produced, stored, and checked in parallel and across many nodes. Our main hypothesis is that careful algorithm engineering along this direction can overcome the scalability issues of prior, monolithic proofs to the point where the overhead incurred during solving becomes negligible (≤ 10%) and proofs can be checked significantly faster than they can be derived. By enabling this technology for parallel SAT solvers, we expect the proposed project to significantly increase the accessibility and dependability of scalable automated reasoning tools for many mission-critical use cases.
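
As a purely illustrative sketch of this Big Data analogy (our own assumptions, not the project's actual design): if a clausal proof is split into chunks, each chunk can in principle be validated by a separate worker, provided the worker also receives the lemmas derived in earlier chunks as trusted premises; those lemmas are in turn validated by the chunks that produced them. The Python sketch below does this with a naive RUP check; the names (check_chunk, has_rup, and so on) are hypothetical, and deletion steps and RAT lemmas, which real DRAT proofs contain, are ignored.

    # Illustrative only: parallel validation of a chunked clausal proof via
    # naive reverse unit propagation (RUP). Real proofs (DRAT/LRAT) also
    # contain deletion and RAT steps, which this sketch does not handle.
    from multiprocessing import Pool

    def propagate(clauses, assignment):
        """Unit-propagate `clauses` under `assignment` (a set of true
        literals). Returns None on conflict, else the extended assignment."""
        changed = True
        while changed:
            changed = False
            for clause in clauses:
                if any(lit in assignment for lit in clause):
                    continue  # clause already satisfied
                unassigned = [lit for lit in clause if -lit not in assignment]
                if not unassigned:
                    return None  # every literal false: conflict
                if len(unassigned) == 1:
                    assignment.add(unassigned[0])
                    changed = True
        return assignment

    def has_rup(premises, lemma):
        """`lemma` has RUP w.r.t. `premises` if assuming its negation
        yields a conflict by unit propagation alone."""
        return propagate(premises, {-lit for lit in lemma}) is None

    def check_chunk(args):
        """Validate one proof chunk. Lemmas from earlier chunks are taken
        as trusted premises here; the chunks that derived them check them."""
        formula, prior_lemmas, chunk = args
        premises = list(formula) + list(prior_lemmas)
        for lemma in chunk:
            if not has_rup(premises, lemma):
                return False
            premises.append(lemma)  # later lemmas may depend on this one
        return True

    def check_proof_in_parallel(formula, proof, num_chunks):
        size = max(1, -(-len(proof) // num_chunks))  # ceiling division
        chunks = [proof[i:i + size] for i in range(0, len(proof), size)]
        jobs, prior = [], []
        for chunk in chunks:
            jobs.append((formula, list(prior), chunk))
            prior.extend(chunk)
        with Pool() as pool:
            return all(pool.map(check_chunk, jobs))

    if __name__ == "__main__":
        # The miniature example from above: the proof derives (x2) and
        # then the empty clause, certifying unsatisfiability.
        formula = [[1, 2], [-1, 2], [1, -2], [-1, -2]]
        proof = [[2], []]
        print(check_proof_in_parallel(formula, proof, num_chunks=2))  # True

A real implementation would of course avoid re-propagating from scratch for every lemma and would have to determine which earlier lemmas each chunk actually needs; working this out at scale is precisely the kind of question the project addresses.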

We are excited to officially kick off this research project in early 2026, with an additional DFG-funded researcher joining our group. Our cooperation partners include Marijn Heule (CMU), Armin Biere (University of Freiburg), and Yong Kiam Tan (NTU and A*STAR, Singapore).