HTML table extractor

· Source: Simon Willison's Weblog · Field: Technology & Digital — Data Science & Analytics, Software Development & Engineering · Depth: Fundamental Awareness, quick

Summary

The HTML table extractor is a new web-based tool, released on June 29, 2026, that detects tables in pasted content and converts them to other formats. Users can paste HTML, rich text, or plain text containing tables, preview what was detected, and export the result as HTML, Markdown, CSV, TSV, or JSON. It joins an existing collection of paste-conversion tools. A later update added integration with Wikipedia's open CORS API, so users can search for a Wikipedia page and import and display its tables directly. The tool follows a similar rebuild of the "Rich text to markdown" tool, which also gained an improved UI and table support.

Key takeaway

If you are a data analyst or developer who often needs structured data from web content, this HTML table extractor removes the manual parsing step: paste rich text or HTML and export the detected tables as CSV, JSON, or another supported format. Consider adding it to your data acquisition workflow, especially for Wikipedia tables, to save time and reduce copy-and-paste errors.

Key insights

The tool simplifies table data extraction and format conversion from diverse sources, including direct Wikipedia integration.
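
To make the Wikipedia integration concrete, here is a minimal sketch of the kind of request a browser-based tool can send to the public MediaWiki API. The summary does not say which endpoints or parameters the tool actually uses, so the function names `search_url` and `parse_url` and the parameter choices below are assumptions. What is standard MediaWiki behavior is that `origin=*` asks the API to allow anonymous cross-origin requests, `action=query&list=search` runs a page search, and `action=parse` returns a page's rendered HTML, from which tables can then be extracted.

```python
from urllib.parse import urlencode

# English Wikipedia's MediaWiki API endpoint.
API = "https://en.wikipedia.org/w/api.php"


def search_url(query, limit=5):
    """Build a URL that searches Wikipedia page titles and text (hypothetical helper)."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": query,
        "srlimit": str(limit),
        "format": "json",
        "origin": "*",  # request anonymous CORS access
    }
    return API + "?" + urlencode(params)


def parse_url(page_title):
    """Build a URL that returns a page's rendered HTML (hypothetical helper)."""
    params = {
        "action": "parse",
        "page": page_title,
        "prop": "text",
        "format": "json",
        "origin": "*",  # request anonymous CORS access
    }
    return API + "?" + urlencode(params)
```

In a browser the resulting URL would be passed to `fetch()`. The sketch only builds the URLs so that it runs without network access.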

Method

Paste content (HTML, rich text, or plain text). The tool detects any tables and previews them, and you can then export them as HTML, Markdown, CSV, TSV, or JSON. Optionally, search Wikipedia to import a page's tables directly.
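
The detect-then-export flow can be sketched in a few lines of Python using only the standard library. This is an illustration of the general technique, not the tool's own implementation (which runs in the browser). The names `TableExtractor`, `extract_tables`, `to_csv`, `to_json`, and `to_markdown` are hypothetical, and the sketch assumes well-formed HTML with explicit closing tags, no nested tables, and a header in the first row.

```python
import csv
import io
import json
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect each <table> as a list of rows, each row a list of cell strings."""

    def __init__(self):
        super().__init__()
        self.tables = []    # finished tables
        self._table = None  # rows of the table currently being read
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self._table = []
        elif tag == "tr" and self._table is not None:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None and self._row is not None:
            # Join the fragments and collapse runs of whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self._table.append(self._row)
            self._row = self._cell = None
        elif tag == "table" and self._table is not None:
            self.tables.append(self._table)
            self._table = self._row = self._cell = None


def extract_tables(html):
    parser = TableExtractor()
    parser.feed(html)
    return parser.tables


def to_csv(rows, delimiter=","):
    """Serialize rows as CSV; pass delimiter='\t' for TSV."""
    buf = io.StringIO()
    csv.writer(buf, delimiter=delimiter, lineterminator="\n").writerows(rows)
    return buf.getvalue()


def to_json(rows):
    """Serialize rows as a list of objects keyed by the first (header) row."""
    header, *body = rows
    return json.dumps([dict(zip(header, row)) for row in body], indent=2)


def to_markdown(rows):
    """Serialize rows as a pipe table (cells containing '|' are not escaped)."""
    header, *body = rows
    lines = ["| " + " | ".join(header) + " |",
             "| " + " | ".join("---" for _ in header) + " |"]
    lines += ["| " + " | ".join(row) + " |" for row in body]
    return "\n".join(lines)
```

For example, `to_csv(extract_tables(pasted_html)[0])` would give the first detected table as CSV, where `pasted_html` stands for whatever HTML was pasted. Real-world pastes often omit closing tags or use `rowspan` and `colspan`, which this sketch does not handle.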

Best for: Data Analyst, Software Engineer, Data Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by Simon Willison's Weblog.