OVT-MLCS: An Online Visual Tool for MLCS Mining from Long or Big Sequences

Source: cs.AI updates on arXiv.org · Field: Science & Research — Life Sciences & Biology, Mathematics & Computational Sciences · Depth: Advanced, long

Summary

OVT-MLCS is an online visual tool for mining multiple longest common subsequences (MLCS) from long (length ≥ 1,000) or big (length ≥ 10,000) sequences. Existing exact MLCS algorithms struggle at these lengths because of their memory and time costs. The tool combines a novel key-point-based MLCS algorithm, KP-MLCS, with a method for compactly representing and visualizing all mined MLCSs. It is built as a lightweight web application from open-source Java components and supports online mining, storage, and download of MLCSs for sequences ranging from 3 to 5000 in scale. Its interactive functions include real-time graphic visualization, exact or top-k MLCS mining, and views of the common patterns shared by the input sequences. The intended applications are in fields such as bioinformatics, for example cancer gene pattern detection and COVID-19 virus evolution research.
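To make the mining task concrete, here is a minimal Python sketch of exact MLCS mining on tiny inputs. It is a textbook-style baseline that walks "match points" with memoization, not the KP-MLCS algorithm, and the function name `mine_mlcs` is hypothetical. Its cost grows quickly with sequence length and count, which is the limitation the summary describes.

```python
from functools import lru_cache


def mine_mlcs(seqs):
    """Return the set of all longest common subsequences of `seqs`.

    Baseline for small inputs only; not the KP-MLCS algorithm.
    """
    # Only characters present in every sequence can appear in a result.
    alphabet = sorted(set(seqs[0]).intersection(*map(set, seqs[1:])))

    def successors(point):
        # point[i] is the index in seqs[i] from which matching may continue.
        for ch in alphabet:
            nxt = []
            for s, p in zip(seqs, point):
                i = s.find(ch, p)
                if i < 0:
                    break  # ch no longer occurs in this sequence
                nxt.append(i + 1)
            else:
                yield ch, tuple(nxt)

    @lru_cache(maxsize=None)
    def best(point):
        # Longest common subsequences obtainable from `point` onward.
        best_len, results = 0, {""}
        for ch, nxt in successors(point):
            length, tails = best(nxt)
            if length + 1 > best_len:
                best_len, results = length + 1, {ch + t for t in tails}
            elif length + 1 == best_len:
                results |= {ch + t for t in tails}
        return best_len, frozenset(results)

    return set(best(tuple(0 for _ in seqs))[1])


print(mine_mlcs(["ABC", "ACB", "BAC"]))  # {'AC'}
```

The memo table holds one entry per reachable match point, so memory use rises steeply as sequences get longer or more numerous.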

Key takeaway

For AI scientists working with large biological or character sequence datasets, OVT-MLCS offers a practical way to mine MLCSs. It identifies common patterns and similarities in sequences up to 5000 in scale, a task that was previously challenging because of computational constraints. Its online visualization and top-k mining features can speed up research in areas such as genomics and virology, for example when studying evolutionary relationships or disease markers.

Key insights

OVT-MLCS enables efficient, visual MLCS mining from large sequences, overcoming prior computational and visualization limitations.

Principles

Method

OVT-MLCS employs the KP-MLCS algorithm with a novel $DAG_{KP}$ graph model, multi-threaded mining, and dynamic memory management via serialization/de-serialization to handle long/big sequences and provide exact or top-k MLCS results.
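The idea of managing memory through serialization and de-serialization can be illustrated with a small Python sketch. It builds a plain match-point graph level by level, writes each finished level to disk with `pickle`, and reloads levels one at a time in a backward pass. This is an assumption-laden illustration, not the $DAG_{KP}$ model or the tool's actual memory manager. The function name `mine_mlcs_spilled` and the `level_<n>.pkl` file layout are hypothetical, and the sketch is single-threaded.

```python
import pickle
import tempfile
from pathlib import Path


def mine_mlcs_spilled(seqs, workdir):
    """Exact MLCS on small inputs, keeping one graph level in memory at a time.

    Hypothetical illustration of spill-to-disk; not the DAG_KP model.
    """
    workdir = Path(workdir)
    alphabet = sorted(set(seqs[0]).intersection(*map(set, seqs[1:])))
    source = tuple(0 for _ in seqs)
    frontier, depth = {source}, 0

    # Forward pass: expand level by level, serializing each finished level.
    while True:
        level = {}  # node -> set of (matched char, predecessor node)
        for point in frontier:
            for ch in alphabet:
                nxt = []
                for s, p in zip(seqs, point):
                    i = s.find(ch, p)
                    if i < 0:
                        break
                    nxt.append(i + 1)
                else:
                    level.setdefault(tuple(nxt), set()).add((ch, point))
        if not level:
            break
        depth += 1
        with open(workdir / f"level_{depth}.pkl", "wb") as fh:
            pickle.dump(level, fh)
        frontier = set(level)

    # Backward pass: de-serialize levels from deepest to shallowest and
    # grow the suffixes of every path that reaches the deepest level.
    tails = None
    for d in range(depth, 0, -1):
        with open(workdir / f"level_{d}.pkl", "rb") as fh:
            level = pickle.load(fh)
        if tails is None:
            tails = {node: {""} for node in level}
        prev = {}
        for node, suffixes in tails.items():
            for ch, pred in level[node]:
                prev.setdefault(pred, set()).update(ch + t for t in suffixes)
        tails = prev
    return tails[source] if tails else {""}


with tempfile.TemporaryDirectory() as tmp:
    print(mine_mlcs_spilled(["ABC", "ACB", "BAC"], tmp))  # {'AC'}
```

The trade-off is extra disk I/O in exchange for a smaller resident graph. A top-k variant could cap the size of each suffix set during the backward pass, though how OVT-MLCS selects its top-k results is not described here.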

Best for: AI Scientist, Research Scientist, Domain Expert, AI Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.