ScratchLens: Lens-Parametric Behavioral Equivalence for Scratch Programs

2026-06-14 · Source: cs.SE updates on arXiv.org · Field: Technology & Digital — Software Development & Engineering, Artificial Intelligence & Machine Learning · Depth: Expert, extended

Summary

ScratchLens is an equivalence checker designed for Scratch programs, addressing the challenge of determining behavioral similarity despite syntactic variations or subtle one-block edits. It operates on the principle that program equivalence is parametric in an "observation lens," allowing it to categorize behavioral divergences by causal phenomena and observing lens. The system compiles Scratch projects into a causal intermediate representation (CSIR), applying canonicalization for alpha-renamings and procedure bodies, and utilizing Mazurkiewicz trace normal forms to manage concurrency and races. ScratchLens employs SMT obligations and counterexample-guided execution on an instrumented Scratch VM for unresolved cases. Evaluated on a corpus of 444 validated pairs, ScratchLens achieved 1.000 accuracy, made 0 false-equivalence claims on 158 witnessed-different pairs, and recovered root causes with 0.987 accuracy, significantly outperforming structural, dynamic-only, and large language model baselines with a median decision time of 0.22s.

Key takeaway

For research scientists or software engineers developing automated analysis tools for block-based programming environments like Scratch, you should adopt a lens-parametric approach to behavioral equivalence. This method, which explicitly defines observation lenses, is critical for accurately comparing programs that may have diverse syntactic forms but identical behavior, or subtle one-block changes with significant behavioral divergence. Implementing such a system will prevent false equivalence claims, ensuring your feedback, grading, or repair validation systems correctly identify defects and accept valid refactorings.

Key insights

Behavioral equivalence in Scratch programs is defined by an explicit observation lens, not just syntax.

Principles

Program equivalence must account for diverse syntactic realizations of identical behavior.
Sound equivalence checkers require explicit observation lenses to avoid false positives or negatives.
Static analysis provides equivalence proofs, while VM execution offers difference witnesses.

Method

ScratchLens compiles projects to CSIR, canonicalizes elements, applies Mazurkiewicz trace normal forms for concurrency, and resolves ambiguities via SMT and counterexample-guided VM execution.

In practice

Automate feedback for student Scratch projects.
Validate program repairs across diverse coding styles.
Support grading by accepting correct refactorings.

Topics

Scratch Programming
Behavioral Equivalence
Program Analysis
Concurrency Verification
Intermediate Representations
Automated Grading

Code references

scratchfoundation/scratch-vm

Best for: AI Scientist, Software Engineer, Research Scientist

Related on AIssential

Open in AIssential →

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.SE updates on arXiv.org.