The hottest topic in data and AI today is arguably talking to your own data. Writing SQL queries is far from intuitive when exploring data, so the ability to simply ask questions in natural language and receive AI-powered answers backed by your business data sounds like science fiction come to life. We recently explored MotherDuck's MCP-powered talk-to-your-data capabilities in our blog series, and now we're shifting our focus to one of the leading enterprise solutions: Databricks.
In this first article, we'll survey the Databricks AI landscape and create an overview of the core pillars that enable users to interact with their data naturally. Future articles will dive deeper into these concepts and highlight best practices for implementation.
Databricks AI/BI
Let's examine what Databricks offers in this space. The company bundles features tailored toward enabling business users to access their data more easily under the umbrella of Databricks AI/BI. This business intelligence solution harnesses compound AI to enhance data analysis with self-service insights, robust governance, and exceptional performance.
Compound AI systems combine multiple AI technologies or models to solve complex problems. Rather than relying on a single model or algorithm, they integrate several interacting components, for example retrieval, generation, and validation steps, so that each component can compensate for the weaknesses of the others. The aim is to deliver more accurate and insightful results than an individual model working in isolation.
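To make the idea concrete, here is a minimal sketch of a compound system with three cooperating components. Everything in it is an assumption made for illustration: the `CATALOG` contents, the function names, and the keyword-matching logic are hypothetical, and the "generator" is a stub standing in for a language model. It does not describe how Databricks implements AI/BI.

```python
from dataclasses import dataclass


@dataclass
class Answer:
    sql: str
    valid: bool


# Hypothetical catalog: table name -> description.
CATALOG = {
    "sales": "daily revenue per region",
    "customers": "customer master data",
}


def retrieve(question: str) -> list[str]:
    """Component 1: pick tables whose description shares a word with the question."""
    words = set(question.lower().split())
    return [t for t, desc in CATALOG.items() if words & set(desc.split())]


def generate(question: str, tables: list[str]) -> str:
    """Component 2: stub standing in for a language model that writes SQL."""
    return f"SELECT * FROM {tables[0]}" if tables else ""


def validate(sql: str) -> bool:
    """Component 3: accept only queries that read from a known table."""
    return any(sql == f"SELECT * FROM {t}" for t in CATALOG)


def compound_answer(question: str) -> Answer:
    """Chain the components; no single step is trusted on its own."""
    tables = retrieve(question)
    sql = generate(question, tables)
    return Answer(sql=sql, valid=validate(sql))
```

The point of the split is that a failure in one component (for example, no matching table) is caught by another instead of surfacing as a confident but wrong answer.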
Databricks AI/BI is built on top of and tightly integrated with the Databricks Data Intelligence Platform. It seamlessly connects with Unity Catalog, aligning with its governance framework and adhering to any global policies set by administrators. Through Unity Catalog's lineage visualization, data producers and administrators can track how their data assets are being used within AI/BI. This traceability back to the dataset's ingestion point instills confidence in analysis results.
AI/BI is also integrated with Databricks Identity and Access Management, which connects directly to many identity providers so users can share their analysis with anyone in their organization. Databricks AI/BI has no seat-based restrictions, meaning anyone in the organization can be added without purchasing additional licenses. Because it is tightly integrated with SQL warehouses and the Photon engine, AI/BI benefits from optimizations that deliver high-performance interactions. Since AI/BI works directly on the existing data ecosystem, there is no need to extract datasets to a separate BI engine. This improves data freshness and simplifies data governance, leading to a more streamlined data analysis process.
Traditional BI tools have relied on reports and dashboards, often requiring extensive involvement from data professionals to create new visualizations. While AI assistants have been integrated into BI tools to address this bottleneck, they frequently struggle with real-world data complexities, providing impressive demos but failing in practice. The semantic model for an organization typically relies on the knowledge of those who work with the data daily. Databricks AI/BI captures this understanding from interactions across Databricks, augmenting the existing context in the Data Intelligence Platform and leveraging this knowledge to provide practical, real-world answers. Databricks AI/BI is powered by two complementary product experiences: AI/BI Dashboards and Genie Spaces.
Genie Spaces
Genie is a Databricks feature that allows business teams to interact with their data using natural language. It employs generative AI tailored to the organization's terminology and data, with the ability to monitor and refine its performance through user feedback. Domain experts configure Genie spaces with datasets, sample queries, and text guidelines to help Genie translate business questions into analytical queries. After setup, business users can ask questions and generate visualizations to understand operational data, while Genie's semantic knowledge is continuously updated as data changes and users pose new questions.
Genie automatically selects relevant names and descriptions from annotated tables and columns to convert natural language questions into equivalent SQL queries. It then responds with the generated query and results table. If Genie cannot generate an answer, it can ask follow-up questions to clarify before providing a response.
When a user submits a question, Genie parses the request, identifies relevant data sources, and determines how to generate an appropriate response. Details provided by authors, combined with relevant Unity Catalog comments, metadata, and sample values from selected columns, allow Genie to infer both business and technical logic. Genie intelligently filters example SQL queries, table and column metadata, and chat history to select the most relevant context for answering the request.
Genie generates responses with multiple components working in concert. Unity Catalog table metadata is used as Genie parses the request and converts the natural language prompt to SQL. Genie intelligently filters relevant column names and descriptions to include. Authors can also locally edit asset metadata and choose columns that provide relevant values to Genie, which helps generate more accurate responses without altering existing Unity Catalog metadata. Genie also intelligently selects relevant SQL examples or SQL functions that have been added to the space. Authors can provide plain-text instructions as well. Finally, prompts and responses from the current chat are included as context.
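The context selection described above can be sketched roughly as follows. This is a toy illustration under stated assumptions, not Genie's actual logic: `SpaceConfig`, `overlap`, `top_matches`, and `build_context` are hypothetical names, and plain word overlap stands in for whatever relevance ranking Genie uses internally.

```python
from dataclasses import dataclass, field


@dataclass
class SpaceConfig:
    """Hypothetical stand-in for what an author configures in a space."""
    column_descriptions: dict[str, str]  # column name -> description
    example_queries: dict[str, str]      # example question -> SQL
    instructions: list[str] = field(default_factory=list)


def overlap(question: str, text: str) -> int:
    """Count the words a question shares with a piece of text."""
    return len(set(question.lower().split()) & set(text.lower().split()))


def top_matches(question: str, texts: dict[str, str], k: int) -> list[str]:
    """Return up to k keys whose text overlaps the question, best first."""
    scored = [(overlap(question, text), key) for key, text in texts.items()]
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: -s[0])
    return [key for _, key in ranked[:k]]


def build_context(question: str, space: SpaceConfig,
                  chat_history: list[str], k: int = 2) -> dict:
    """Assemble the pieces that would accompany the question in a prompt."""
    # Example queries are matched on their question text, not on the SQL.
    examples = top_matches(question, {q: q for q in space.example_queries}, k)
    return {
        "columns": top_matches(question, space.column_descriptions, k),
        "example_sql": [space.example_queries[q] for q in examples],
        "instructions": space.instructions,
        "history": chat_history,
    }


space = SpaceConfig(
    column_descriptions={
        "sales.revenue": "net revenue in EUR",
        "sales.region": "sales region name",
        "hr.salary": "employee salary",
    },
    example_queries={
        "total revenue by region":
            "SELECT region, SUM(revenue) FROM sales GROUP BY region",
        "headcount by team":
            "SELECT team, COUNT(*) FROM hr GROUP BY team",
    },
    instructions=["Fiscal year starts in April."],
)
ctx = build_context("show revenue by region", space, ["earlier question"])
```

In this example the unrelated `hr.salary` column is left out of the context, while the instructions and chat history are always passed through. It also hints at why good column descriptions matter: a column without a matching description would never be selected.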
While Genie can be used in languages other than English, the underlying agent framework wraps prompts in English, and Databricks recommends that space creators add as much metadata as possible in their language of choice. However, Genie might sometimes respond in English due to the underlying system prompts.
In most cases, Genie generates a SQL query that runs on the space's SQL warehouse. Generated queries are always read-only. Retries are handled automatically and the SQL warehouse manages concurrency and scale. The result set is presented as part of the response.
AI/BI Dashboards
Dashboards enable users to build data visualizations and share reports with others. AI/BI dashboards feature AI-assisted authoring, an enhanced visualization library, and a streamlined configuration experience so that data can be quickly transformed into shareable insights. When published, dashboards can be shared with anyone registered to the Databricks account, even if they do not have explicit access to the workspace.
Published dashboards include a Genie space by default, allowing business users to explore data using natural language. Genie enables viewers to chat with the data instead of relying solely on predefined visualizations. When publishing, Databricks automatically generates a Genie space based on the dashboard datasets and visualizations.
These companion Genie spaces use Agent mode, which extends Genie's capabilities to answer both straightforward data questions and complex business questions. It employs multi-step reasoning and hypothesis testing to uncover deeper insights. When asked a question, Agent mode creates and refines a research plan, running multiple SQL queries, learning from each result, and iterating until it has enough evidence to provide a comprehensive answer.
Unlike standard Genie queries, Agent mode develops a structured approach and hypotheses for answering complex questions, then executes several SQL queries to gather evidence from different angles. It continuously adjusts its approach based on what it discovers, refining its reasoning until it's confident in the answer. Finally, it provides detailed summaries with citations, visualizations, and supporting tables.
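The plan, query, learn, and iterate cycle can be illustrated with a small loop. The sketch below is an assumption-laden toy, not Agent mode itself: it uses Python's built-in `sqlite3` module with made-up demo data, and the hypothetical `follow_up` function hard-codes a single refinement rule where a real agent would reason with a language model.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory table with made-up sales figures."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, product TEXT, revenue REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [("north", "a", 100.0), ("north", "b", 80.0),
         ("south", "a", 30.0), ("south", "b", 10.0)],
    )
    return conn


def follow_up(step: str, rows: list[tuple]) -> list[tuple[str, str, tuple]]:
    """Refine the plan: after the regional overview, drill into the weakest region."""
    if step == "revenue by region" and rows:
        weakest = min(rows, key=lambda r: r[1])[0]
        return [(
            f"products in {weakest}",
            "SELECT product, revenue FROM sales WHERE region = ? ORDER BY revenue",
            (weakest,),
        )]
    return []


def run_agent(conn: sqlite3.Connection, max_steps: int = 5) -> list[tuple]:
    """Run queries until the plan is empty or the step budget is used up."""
    plan = [("revenue by region",
             "SELECT region, SUM(revenue) FROM sales GROUP BY region", ())]
    evidence = []
    while plan and len(evidence) < max_steps:
        step, sql, params = plan.pop(0)
        rows = conn.execute(sql, params).fetchall()
        evidence.append((step, rows))          # learn from the result
        plan.extend(follow_up(step, rows))     # adjust the plan accordingly
    return evidence


evidence = run_agent(make_demo_db())
```

Here the first query reveals that "south" has the lowest revenue, which triggers a second query that breaks that region down by product. The `max_steps` budget is one simple way to guarantee that such a loop terminates.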
Databricks One
Databricks One is a product designed specifically for business users. It gives these users a single, intuitive entry point to interact with data and AI without needing to understand clusters, queries, models, or notebooks. Through Databricks One, users can view and interact with AI/BI Dashboards, ask data questions in natural language using Genie, use custom-built Databricks Apps, and browse content by domain, organized around business areas.
Databricks One gives business users one place to access insights. The interface is designed to remove clutter and guide users toward the dashboards, apps, and Genie spaces most relevant to their role. To give users and groups access to this simplified workspace experience, they can be added to the workspace and given the new "consumer access" entitlement.
Underpinning Databricks One is Unity Catalog, so data teams can expand access confidently without changing their governance strategy. This means admins can manage data access for individual users or groups and implement row- and column-level security, all through the same governance layer.
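To show what row- and column-level security means for a business user, the following pure-Python sketch imitates the effect on a tiny in-memory table. It is only an analogy: in Unity Catalog these rules are defined on the governed tables rather than in application code, and the group names, `ROWS` data, and function names below are all hypothetical.

```python
# Made-up data standing in for a governed table.
ROWS = [
    {"region": "EU", "customer": "Ada", "email": "ada@example.com"},
    {"region": "US", "customer": "Bob", "email": "bob@example.com"},
]


def row_visible(row: dict, groups: set[str]) -> bool:
    """Row-level rule: admins see everything, others only their own region."""
    return "admins" in groups or f"region_{row['region'].lower()}" in groups


def mask_email(value: str, groups: set[str]) -> str:
    """Column-level rule: only members of 'pii_readers' see real addresses."""
    return value if "pii_readers" in groups else "***"


def query(groups: set[str]) -> list[dict]:
    """Return the rows a user with the given group memberships would see."""
    return [
        {**row, "email": mask_email(row["email"], groups)}
        for row in ROWS
        if row_visible(row, groups)
    ]


eu_analyst_view = query({"region_eu"})
admin_view = query({"admins", "pii_readers"})
```

The same question therefore yields different results depending on who asks it: the EU analyst sees one row with a masked email, while the admin sees both rows unmasked.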
Genie Code
Beyond this BI-focused set of tools, Databricks recently released a comparable product tailored to developers: Genie Code. Genie Code is an autonomous AI partner purpose-built for data work in Databricks. Unlike general-purpose AI assistants, Genie Code is deeply integrated with Unity Catalog, allowing it to understand the complete data landscape, including tables, columns, and lineage. This contextual awareness makes Genie Code effective for developers and data practitioners: it accelerates complex, multi-step data tasks and adapts autonomously to the specific data and governance model. It is designed for data teams to use every day, from experimentation and model development to production pipelines and BI dashboards.
Genie Code offers specialized agentic experiences that can accelerate complex data work and handle multi-step tasks autonomously. In Agent mode, Genie Code adapts its capabilities based on the product surface currently being used in Databricks. In the Lakeflow Pipelines Editor, Genie Code focuses on pipeline editing and data engineering tasks. In notebooks and the SQL Editor, Genie Code supports data exploration and analysis. In dashboards, it supports data analysis and dashboard creation.
Genie Code is built into the Databricks platform to help with everyday work and unblock users when they encounter issues with code. It uses Unity Catalog metadata to understand the tables, columns, descriptions, and popular data assets across the company to provide personalized responses. Users can chat with Genie Code, receive quick fixes or inline suggestions, and use it to filter and explore sample data or diagnose errors.
Conclusion
Databricks has constructed a comprehensive ecosystem for natural language data interaction that bridges the gap between technical complexity and business accessibility. By combining Genie Spaces for business users, AI/BI Dashboards for collaborative insights, Databricks One for simplified access, and Genie Code for developers, the platform addresses the needs of diverse user groups while maintaining centralized governance through Unity Catalog.
What distinguishes Databricks from other solutions is its compound AI approach, which doesn't rely on a single model but instead orchestrates multiple AI components to handle real-world data complexities. The tight integration with existing data infrastructure means organizations don't need to extract data into separate systems, preserving both data freshness and governance controls. The elimination of seat-based licensing for AI/BI also removes a common barrier to democratizing data access across organizations.
As we continue this series, we'll explore how to effectively implement these tools, share best practices for configuring Genie spaces, and examine real-world use cases that demonstrate the power of conversational data interaction. The question is no longer whether natural language data interaction is possible, but rather how organizations can best leverage these capabilities to accelerate insights and empower decision-makers at every level.
Blog author
Niklas Niggemann
Working Student Data & AI