Losses that Cook: Topological Optimal Transport for Structured Recipe Generation

2026-04-21 · Source: cs.AI updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Software Development & Engineering · Depth: Expert, extended

Summary

Researchers from Trustpilot and Politecnico di Torino introduced a topological loss function for structured cooking recipe generation, targeting the limitations of standard cross-entropy (CE) in capturing procedural and compositional coherence. Building on the RECIPE-NLG corpus, their method represents ingredient lists as point clouds in embedding space and minimizes the Sinkhorn divergence between the predicted and gold ingredient point clouds. Experiments with a fine-tuned Qwen3-4B Small Language Model (SLM) showed that this topological loss improves ingredient recall, quantity precision, and procedural accuracy. Combined with Dice loss in a mixed objective, the approach yielded further gains in quantity and time precision, outperforming both CE and each custom loss used alone. Human evaluation supported these findings: the proposed model was preferred in 62% of cases and produced 67.5% fewer generation errors than the CE baseline.

Key takeaway

AI engineers building structured text generation systems, particularly in complex domains like recipes, should consider augmenting standard cross-entropy with specialized loss functions. In the reported experiments, adding a topological loss improved the factual correctness and procedural coherence of generated recipes, reduced generation errors, and raised human preference. These results suggest the approach can make models more reliable for real-world use.

Key insights

Topological loss significantly improves structured recipe generation by encoding ingredient-level semantic and structural coherence.

Principles

Cross-entropy alone is insufficient for structured text generation.
Loss functions can encode geometric structure in embedding space.
Domain adaptation is crucial for constraint-satisfying generation.
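
For reference, the geometric quantity behind the second principle can be written in its standard textbook form. The summary does not give the paper's exact formulation, so the cost function c, the regularization strength ε, and the uniform weighting of the points are assumptions here, not details confirmed by the source.

```latex
% Predicted and gold ingredient lists as point clouds (uniform weights assumed)
\alpha = \frac{1}{n}\sum_{i=1}^{n}\delta_{x_i}, \qquad
\beta  = \frac{1}{m}\sum_{j=1}^{m}\delta_{y_j}

% Entropy-regularized optimal transport cost
\mathrm{OT}_\varepsilon(\alpha,\beta)
  = \min_{\pi \in \Pi(\alpha,\beta)}
    \sum_{i,j} \pi_{ij}\, c(x_i, y_j)
    + \varepsilon\, \mathrm{KL}\!\left(\pi \,\|\, \alpha \otimes \beta\right)

% Sinkhorn divergence: the entropic cost with its self-transport bias removed
S_\varepsilon(\alpha,\beta)
  = \mathrm{OT}_\varepsilon(\alpha,\beta)
    - \tfrac{1}{2}\,\mathrm{OT}_\varepsilon(\alpha,\alpha)
    - \tfrac{1}{2}\,\mathrm{OT}_\varepsilon(\beta,\beta)
```

The two subtracted terms make the divergence vanish when the predicted and gold point clouds coincide, which is what lets it serve as a loss.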

Method

The method fine-tunes SLMs with a composite objective that includes a topological loss: ingredient lists are represented as point clouds in embedding space, and training minimizes the Sinkhorn divergence between the predicted and gold point clouds.
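
To make the point-cloud comparison concrete, here is a minimal standard-library sketch of a Sinkhorn divergence between two small sets of vectors. It is not the authors' implementation: the squared Euclidean cost, the uniform point weights, the values of `eps` and `n_iters`, and the toy 2-D "embeddings" are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import math


def _logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of floats."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))


def _sq_dist(x, y):
    """Squared Euclidean distance (assumed ground cost)."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def entropic_ot(xs, ys, eps=0.1, n_iters=200):
    """Entropy-regularized OT cost between two uniformly weighted point clouds.

    Runs log-domain Sinkhorn updates on the dual potentials f and g and
    returns the dual value <a, f> + <b, g>.
    """
    n, m = len(xs), len(ys)
    log_a, log_b = -math.log(n), -math.log(m)  # uniform weights (assumption)
    cost = [[_sq_dist(x, y) for y in ys] for x in xs]
    f, g = [0.0] * n, [0.0] * m
    for _ in range(n_iters):
        f = [-eps * _logsumexp([log_b + (g[j] - cost[i][j]) / eps
                                for j in range(m)]) for i in range(n)]
        g = [-eps * _logsumexp([log_a + (f[i] - cost[i][j]) / eps
                                for i in range(n)]) for j in range(m)]
    return sum(f) / n + sum(g) / m


def sinkhorn_divergence(xs, ys, eps=0.1, n_iters=200):
    """Debiased divergence: OT(x, y) - OT(x, x) / 2 - OT(y, y) / 2."""
    return (entropic_ot(xs, ys, eps, n_iters)
            - 0.5 * entropic_ot(xs, xs, eps, n_iters)
            - 0.5 * entropic_ot(ys, ys, eps, n_iters))


# Toy 2-D stand-ins for ingredient embeddings (hypothetical values).
gold = [(0.0, 0.0), (1.0, 0.0)]
pred_near = [(0.0, 1.0), (1.0, 1.0)]
pred_far = [(0.0, 2.0), (1.0, 2.0)]

print(sinkhorn_divergence(gold, gold))       # identical clouds
print(sinkhorn_divergence(gold, pred_near))  # small shift
print(sinkhorn_divergence(gold, pred_far))   # larger shift
```

The divergence is zero for identical clouds and grows as the predicted cloud moves away from the gold one. A training setup would need a differentiable tensor implementation so gradients can flow back into the model; this plain-Python version only illustrates the quantity being minimized.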

In practice

Augment CE with topological loss for structured text tasks.
Use recipe-specific metrics for evaluating generation quality.
Combine Dice and topological losses for balanced accuracy.
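
The combination suggested above can be sketched as a weighted sum of loss terms. The summary does not say how the Dice term is computed or how the terms are weighted, so this sketch assumes a common soft Dice formulation over per-item probabilities and binary targets, and the weights `w_dice` and `w_topo` are hypothetical hyperparameters, not values from the paper.

```python
def soft_dice_loss(probs, targets, smooth=1.0):
    """Soft Dice loss: 1 - (2 * overlap + smooth) / (sum(p) + sum(t) + smooth).

    probs   -- predicted probabilities in [0, 1]
    targets -- binary gold labels (0 or 1), same length as probs
    smooth  -- smoothing constant that avoids division by zero (assumed value)
    """
    overlap = sum(p * t for p, t in zip(probs, targets))
    denom = sum(probs) + sum(targets) + smooth
    return 1.0 - (2.0 * overlap + smooth) / denom


def mixed_objective(ce_loss, dice_loss, topo_loss, w_dice=0.5, w_topo=0.5):
    """Cross-entropy augmented with weighted Dice and topological terms.

    The weights are illustrative placeholders; in practice they would be tuned.
    """
    return ce_loss + w_dice * dice_loss + w_topo * topo_loss


# Perfect predictions give zero Dice loss; fully wrong ones give a large loss.
perfect = soft_dice_loss([1.0, 0.0, 1.0], [1, 0, 1])
wrong = soft_dice_loss([0.0, 1.0], [1, 0])

# Hypothetical per-batch loss values combined into one training objective.
total = mixed_objective(ce_loss=1.0, dice_loss=0.5, topo_loss=0.25,
                        w_dice=0.5, w_topo=2.0)
print(perfect, wrong, total)
```

Keeping CE as the base term and adding the others with explicit weights makes it easy to ablate each component, which mirrors the comparison against CE and single custom losses described in the summary.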

Topics

Structured Recipe Generation
Topological Loss
Optimal Transport
Sinkhorn Divergence
Dice Loss

Code references

DarthReca/losses-cook

Best for: AI Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, NLP Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.