Making AI systems more transparent and trustworthy: an interview with Ximing Wen

Source: AIhub · Field: Technology & Digital / Artificial Intelligence & Machine Learning · Depth: Expert, medium

Summary

Ximing Wen, a PhD candidate at Drexel University, researches methods for making AI systems, particularly large language models, more transparent and trustworthy. Her work targets models that give confident but unverified answers: she develops approaches that show the reasoning and evidence behind each output, which is crucial for applications such as healthcare and legal document review. Her prototype-based approach achieves interpretability without sacrificing accuracy and outperforms previous interpretable models. She has extended the concept to generative models and to medical AI, creating interpretable diagnostic tools for settings with limited data. A key finding came from redesigning a loss function for spatial coordinates, which raised accuracy on document understanding tasks from 65% to over 85% and underscores that how a model is taught matters for its performance.

Key takeaway

AI Scientists and Research Scientists developing language models for sensitive domains such as healthcare or legal review should build in transparency mechanisms from the outset. Models must not only give accurate answers but also clearly show their reasoning and supporting evidence. Consider adopting prototype-based interpretability and spatial grounding techniques to build trust and enable verification, especially when working with limited training data or when aiming to explain reward model preferences for safer AI systems.

Key insights

Interpretable AI models can achieve high accuracy while showing the reasoning and evidence behind their outputs, which is crucial for trust.

Principles

Method

Use prototype-based approaches for classification, and spatial grounding for document Q&A, so that the model shows its reasoning and evidence even when data is limited.
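
To make the idea of prototype-based classification concrete, here is a minimal sketch in Python. It is a generic nearest-prototype illustration, not Wen's actual method: the prototype texts, the three-dimensional embeddings, and the function names `cosine` and `classify_with_evidence` are all hypothetical, and a real system would learn its prototypes and embeddings from data. The point it shows is that the prediction comes back together with the stored examples that drove it, so a reader can check the evidence.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical prototypes: (label, human-readable example, toy embedding).
# In practice these would be learned; the values here are made up.
PROTOTYPES = [
    ("positive", "the treatment worked well", [0.9, 0.1, 0.2]),
    ("positive", "symptoms improved quickly", [0.8, 0.2, 0.1]),
    ("negative", "the patient got worse", [0.1, 0.9, 0.3]),
    ("negative", "no response to treatment", [0.2, 0.8, 0.4]),
]

def classify_with_evidence(embedding, prototypes=PROTOTYPES, top_k=2):
    """Return (predicted_label, evidence).

    evidence is a list of (similarity, label, example_text) for the
    top_k most similar prototypes, most similar first.
    """
    scored = sorted(
        ((cosine(embedding, vec), label, text) for label, text, vec in prototypes),
        reverse=True,
    )
    evidence = scored[:top_k]
    # Sum similarity per label among the retrieved prototypes.
    votes = {}
    for similarity, label, _ in evidence:
        votes[label] = votes.get(label, 0.0) + similarity
    predicted = max(votes, key=votes.get)
    return predicted, evidence

predicted, evidence = classify_with_evidence([0.85, 0.15, 0.2])
print(predicted)
for similarity, label, text in evidence:
    print(f"{similarity:.3f}  {label}  {text}")
```

The design choice worth noting is that the evidence is a by-product of the prediction itself rather than a separate explanation generated afterwards, which is what makes this family of models inspectable.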

Best for: AI Scientist, Research Scientist, AI Ethicist

Editorial summary, takeaway, and curation by AIssential. Original article published by AIhub.