You Don’t Need GQL in BigQuery. Unless…

Source: Data Engineering on Medium · Field: Technology & Digital (Data Science & Analytics, Cloud Computing & IT Infrastructure) · Depth: Intermediate, medium

Summary

Google has introduced BigQuery Graph, which brings Graph Query Language (GQL) to BigQuery for graph-based analysis. Users layer property graph definitions over existing BigQuery tables without moving data, so current storage, access controls, and SQL queries keep working unchanged. Most BigQuery workloads do not need GQL; it pays off for queries that are hard to express in standard SQL, particularly multi-hop traversals, relationships of unknown depth, and intricate "diamond" patterns. The article demonstrates this with BigQuery's INFORMATION_SCHEMA.JOBS_BY_PROJECT data, defining nodes (User, Job, Table) and edges (RAN, READ, WROTE) to analyze query history. GQL queries run within a GRAPH_TABLE function and, for graph-shaped problems, are easier to read and maintain than the equivalent recursive CTEs in SQL.
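To make the comparison concrete, here is a minimal sketch of the recursive-CTE approach that the summary contrasts with GQL. It is not taken from the article: it uses Python's built-in sqlite3 module as a local stand-in for BigQuery, and the tables job_reads and job_writes, their columns, and the depth cap of 10 are all assumptions made for illustration.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with hypothetical read/write edges."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE job_reads  (job_id TEXT, table_id TEXT);
        CREATE TABLE job_writes (job_id TEXT, table_id TEXT);
        -- job1: raw -> staging, job2: staging -> mart, job3: raw -> report
        INSERT INTO job_reads  VALUES ('job1', 'raw'), ('job2', 'staging'), ('job3', 'raw');
        INSERT INTO job_writes VALUES ('job1', 'staging'), ('job2', 'mart'), ('job3', 'report');
        """
    )
    return conn


def downstream_tables(conn: sqlite3.Connection, source: str) -> list:
    """Return (table_id, hops) for every table derived from `source`."""
    sql = """
    WITH RECURSIVE lineage(table_id, depth) AS (
        SELECT ?, 0                      -- anchor: the starting table
        UNION                            -- UNION drops duplicate rows
        SELECT w.table_id, l.depth + 1   -- one hop: table -> job -> table
        FROM lineage AS l
        JOIN job_reads  AS r ON r.table_id = l.table_id
        JOIN job_writes AS w ON w.job_id = r.job_id
        WHERE l.depth < 10               -- assumed cap so cycles terminate
    )
    SELECT table_id, MIN(depth) AS hops
    FROM lineage
    WHERE depth > 0
    GROUP BY table_id
    ORDER BY hops, table_id
    """
    return conn.execute(sql, (source,)).fetchall()


if __name__ == "__main__":
    print(downstream_tables(build_demo_db(), "raw"))
    # prints [('report', 1), ('staging', 1), ('mart', 2)]
```

Even for this small lineage question, the query has to spell out the anchor row, the join that represents a single hop, and its own guard against cycles. That bookkeeping is the part a graph pattern is meant to absorb.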

Key takeaway

For data engineers and analytics engineers who struggle to express data lineage, dependency mapping, or anomaly detection with recursive CTEs in BigQuery, GQL offers a more intuitive and maintainable alternative. Evaluate BigQuery Graph for workloads that involve multi-hop traversals or intricate relationship patterns, where it can simplify query logic and improve readability. Before fully adopting GQL, consider running your existing complex SQL queries through the GQL ↔ SQL converter to assess the benefit.

Key insights

BigQuery Graph and GQL simplify complex graph traversals and pattern matching that are cumbersome in SQL.

Principles

Method

Define a property graph using CREATE PROPERTY GRAPH over existing BigQuery tables, specifying node keys and edge relationships. Then, use GQL within the GRAPH_TABLE function for graph pattern matching and traversal.
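The two steps above might look roughly like the following sketch. It only builds the statement text in Python and sends nothing to BigQuery. The node labels (User, Job, Table) and edge labels (RAN, READ, WROTE) come from the article; the dataset name, graph name, backing tables, and column names are hypothetical. The statement shape is an assumption based on the general GoogleSQL property graph form, so check it against the current BigQuery documentation before running it.

```python
# Hypothetical DDL template; {dataset} is filled in by graph_ddl().
GRAPH_DDL = """\
CREATE OR REPLACE PROPERTY GRAPH {dataset}.query_history
  NODE TABLES (
    {dataset}.users  AS users  KEY (user_email) LABEL User,
    {dataset}.jobs   AS jobs   KEY (job_id)     LABEL Job,
    {dataset}.tables AS tables KEY (table_id)   LABEL Table
  )
  EDGE TABLES (
    {dataset}.job_runs AS job_runs
      KEY (job_id)
      SOURCE KEY (user_email) REFERENCES users (user_email)
      DESTINATION KEY (job_id) REFERENCES jobs (job_id)
      LABEL RAN,
    {dataset}.job_reads AS job_reads
      KEY (job_id, table_id)
      SOURCE KEY (job_id) REFERENCES jobs (job_id)
      DESTINATION KEY (table_id) REFERENCES tables (table_id)
      LABEL READ,
    {dataset}.job_writes AS job_writes
      KEY (job_id, table_id)
      SOURCE KEY (job_id) REFERENCES jobs (job_id)
      DESTINATION KEY (table_id) REFERENCES tables (table_id)
      LABEL WROTE
  );
"""

# Hypothetical query template: which user ran a job that wrote which table.
WRITERS_QUERY = """\
SELECT user_email, job_id, table_id
FROM GRAPH_TABLE(
  {dataset}.query_history
  MATCH (u:User)-[:RAN]->(j:Job)-[:WROTE]->(t:Table)
  RETURN u.user_email AS user_email, j.job_id AS job_id, t.table_id AS table_id
);
"""


def graph_ddl(dataset: str) -> str:
    """Return the CREATE PROPERTY GRAPH text for an assumed dataset name."""
    return GRAPH_DDL.format(dataset=dataset)


def writers_query(dataset: str) -> str:
    """Return the GRAPH_TABLE query text for an assumed dataset name."""
    return WRITERS_QUERY.format(dataset=dataset)


if __name__ == "__main__":
    print(graph_ddl("my_dataset"))
    print(writers_query("my_dataset"))
```

The graph definition only names keys and references on tables that already exist, which is why no data has to move. The MATCH line then reads as the path itself instead of a chain of joins.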

Best for: Data Scientist, Data Engineer, Analytics Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Data Engineering on Medium.