Losses that Cook: Topological Optimal Transport for Structured Recipe Generation
Summary
Researchers from Trustpilot and Politecnico di Torino introduce a topological loss function for structured cooking recipe generation, addressing the limits of standard cross-entropy (CE) in enforcing procedural and compositional coherence. Building on the RECIPE-NLG corpus, the method represents ingredient lists as point clouds in embedding space and minimizes the Sinkhorn divergence between predicted and gold ingredients. In experiments with a fine-tuned Qwen3-4B Small Language Model (SLM), the topological loss significantly improved ingredient recall, quantity precision, and procedural accuracy. Combined with Dice loss in a mixed objective, it yielded synergistic gains in quantity and time precision, outperforming both CE alone and either custom loss used on its own. Human evaluation supported these findings: the proposed model was preferred in 62% of cases and showed a 67.5% reduction in generation errors compared to CE.
Key takeaway
AI engineers building structured text generation systems, particularly in complex domains such as recipes, should consider augmenting standard cross-entropy with specialized loss functions. A topological loss can significantly improve the factual correctness and procedural coherence of generated outputs, reducing critical errors and raising human preference, which makes models more robust in real-world applications.
Key insights
Topological loss significantly improves structured recipe generation by encoding ingredient-level semantic and structural coherence.
Principles
- Cross-entropy alone is insufficient for structured text generation.
- Loss functions can encode geometric structure in embedding space.
- Domain adaptation is crucial for constraint-satisfying generation.
Method
The method fine-tunes SLMs with a composite objective that includes a topological loss: ingredient lists are represented as point clouds in embedding space, and the loss minimizes the Sinkhorn divergence between the predicted and gold clouds.
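To make the point-cloud idea concrete, here is a minimal sketch of a Sinkhorn divergence between two small sets of ingredient embeddings. It is an illustration, not the authors' implementation: the function names, the squared Euclidean cost, the uniform weights on ingredients, the values of `eps` and `iters`, and the toy 2-D "embeddings" are all assumptions. It also runs in plain Python for readability, so it is not differentiable; a training loss would need the same computation on framework tensors.

```python
import math

def _sq_dist(x, y):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))

def _logsumexp(values):
    """Numerically stable log(sum(exp(v)))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def entropic_ot(xs, ys, eps=0.5, iters=200):
    """Entropy-regularized OT cost between two uniformly weighted point clouds.

    Uses log-domain Sinkhorn updates on the dual potentials f and g for a
    fixed number of iterations (no convergence check in this sketch).
    """
    n, m = len(xs), len(ys)
    log_a, log_b = -math.log(n), -math.log(m)  # uniform weights (assumption)
    cost = [[_sq_dist(x, y) for y in ys] for x in xs]
    f, g = [0.0] * n, [0.0] * m
    for _ in range(iters):
        f = [-eps * _logsumexp([log_b + (g[j] - cost[i][j]) / eps for j in range(m)])
             for i in range(n)]
        g = [-eps * _logsumexp([log_a + (f[i] - cost[i][j]) / eps for i in range(n)])
             for j in range(m)]
    # Dual objective value <a, f> + <b, g>
    return sum(f) / n + sum(g) / m

def sinkhorn_divergence(pred, gold, eps=0.5, iters=200):
    """Debiased divergence: OT(pred, gold) - OT(pred, pred)/2 - OT(gold, gold)/2."""
    return (entropic_ot(pred, gold, eps, iters)
            - 0.5 * entropic_ot(pred, pred, eps, iters)
            - 0.5 * entropic_ot(gold, gold, eps, iters))

# Toy 2-D "embeddings" (hypothetical values, one point per ingredient)
gold = [(1.0, 0.0), (0.0, 1.0)]
pred_reordered = [(0.0, 1.0), (1.0, 0.0)]   # same ingredients, different order
pred_wrong = [(1.0, 0.0), (-1.0, 0.0)]      # one ingredient far from gold

print(sinkhorn_divergence(pred_reordered, gold))  # close to zero
print(sinkhorn_divergence(pred_wrong, gold))      # clearly positive
```

Because the divergence compares sets of points rather than token positions, reordering the ingredient list leaves it essentially unchanged, while a semantically distant ingredient raises it. That order-invariance is one plausible reason a point-cloud loss complements token-level CE.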
In practice
- Augment CE with topological loss for structured text tasks.
- Use recipe-specific metrics for evaluating generation quality.
- Combine Dice and topological losses for balanced accuracy.
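The recommendations above can be sketched as a weighted sum of CE, a Dice term, and a topological term. This is a hedged illustration only: the summary does not give the paper's weights or its exact Dice formulation, so `w_dice`, `w_topo`, the smoothing constant, and the soft Dice variant below are assumptions, and `topo_loss` is a stand-in for a precomputed topological (Sinkhorn-based) value.

```python
import math

def cross_entropy(probs, targets):
    """Mean negative log-probability of the gold token at each position."""
    total = sum(math.log(max(p[t], 1e-12)) for p, t in zip(probs, targets))
    return -total / len(targets)

def soft_dice_loss(probs, targets, smooth=1.0):
    """One common soft Dice variant over one-hot targets (assumed form).

    intersection = probability mass placed on gold tokens;
    the denominator adds total predicted mass and total gold mass.
    """
    intersection = sum(p[t] for p, t in zip(probs, targets))
    pred_mass = sum(sum(p) for p in probs)
    gold_mass = float(len(targets))
    return 1.0 - (2.0 * intersection + smooth) / (pred_mass + gold_mass + smooth)

def mixed_objective(probs, targets, topo_loss, w_dice=0.5, w_topo=0.5):
    """CE plus weighted Dice and topological terms (weights are hypothetical)."""
    return (cross_entropy(probs, targets)
            + w_dice * soft_dice_loss(probs, targets)
            + w_topo * topo_loss)

# Two positions over a two-token vocabulary (toy values)
targets = [0, 1]
confident = [[1.0, 0.0], [0.0, 1.0]]
uncertain = [[0.5, 0.5], [0.5, 0.5]]

print(mixed_objective(confident, targets, topo_loss=0.0))
print(mixed_objective(uncertain, targets, topo_loss=0.2))
```

Keeping the three terms as separate functions makes it straightforward to ablate each one, which mirrors the comparison between CE, single custom losses, and the mixed objective described in the summary.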
Topics
- Structured Recipe Generation
- Topological Loss
- Optimal Transport
- Sinkhorn Divergence
- Dice Loss
Best for: AI Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, NLP Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.