Garbage In, Garbage Out. This computing truism has never been more critical than in the age of AI. Large Language Models don't just amplify poor data quality, they wrap it in confident-sounding prose that can mislead even experienced users. As organizations adopt conversational analytics tools like Databricks Genie, the stakes are higher. The old adage evolves: Garbage In, Confident Garbage Out.
In our last article, we introduced Databricks' AI-powered "Ask Your Data" capabilities, including Genie, a feature that lets business teams query data using natural language. Genie uses generative AI tailored to an organization's terminology, learns from user feedback, and helps non-technical users generate visualizations and insights from operational data.
Building Quality into Your Genie Space
Start with a Solid Foundation: Clean Data Before AI
Data quality for Genie begins well before users start asking questions. It starts at the platform level, where Databricks provides comprehensive tools to ensure high-quality data reaches your AI analytics layer.
Data quality management must span the entire data estate, covering both operational systems (OLTP) and analytical platforms (OLAP). The industry-standard framework defines six core dimensions: consistency (data values don't conflict across systems), completeness (no missing information), accuracy (error-free data), validity (conformance to required formats), uniqueness (no duplicates), and timeliness (up-to-date information). Databricks provides Lakehouse Monitoring to track all six dimensions. Table monitors create metric tables and auto-generated dashboards that visualize quality metrics over time.
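To make the dimensions concrete, here is a minimal plain-Python sketch that scores four of them (completeness, validity, uniqueness, timeliness) on a handful of rows. It is not the Lakehouse Monitoring API; the column names, the email pattern, and the 30-day freshness threshold are assumptions chosen for illustration. Consistency and accuracy are left out because they need a second system or a trusted reference to compare against.

```python
from datetime import date
import re

# Hypothetical sample rows; column names are illustrative only.
rows = [
    {"order_id": 1, "email": "a@example.com", "updated": date(2024, 5, 1)},
    {"order_id": 2, "email": None, "updated": date(2024, 5, 2)},
    {"order_id": 2, "email": "not-an-email", "updated": date(2023, 1, 1)},
]

# Deliberately simple pattern; real email validation is stricter.
EMAIL = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def quality_report(rows, today, max_age_days=30):
    """Return a 0..1 score per dimension for a non-empty list of rows."""
    total = len(rows)
    present = [r["email"] for r in rows if r["email"] is not None]
    valid = sum(bool(EMAIL.match(e)) for e in present)
    fresh = sum((today - r["updated"]).days <= max_age_days for r in rows)
    return {
        # Share of rows where the email is filled in.
        "completeness": len(present) / total,
        # Share of filled-in emails that match the expected format.
        "validity": valid / len(present) if present else 1.0,
        # Share of distinct order ids; below 1.0 means duplicates exist.
        "uniqueness": len({r["order_id"] for r in rows}) / total,
        # Share of rows updated within the allowed age.
        "timeliness": fresh / total,
    }


report = quality_report(rows, today=date(2024, 5, 10))
```

Tracking scores like these over time, rather than checking them once, is what makes a drop in quality visible before it reaches the AI layer.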
Data entering the analytics platform rarely satisfies all six dimensions initially, especially when sourced from multiple systems. During ingestion, Databricks can block invalid data using constraints, quarantine problematic records for review, or flag violations for downstream handling. Auto Loader features provide intelligent schema handling where schemas can be enforced, evolved, overwritten, or explicitly updated based on governance needs.
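The quarantine option can be sketched in plain Python as follows. This only illustrates the pattern, not how Databricks implements it; the rule names and record fields are hypothetical.

```python
def split_valid_and_quarantined(records, rules):
    """Route each record to `valid` or `quarantined` based on named rules.

    Quarantined records keep their data and gain a `_violations` list,
    so they can be reviewed and repaired later instead of being dropped.
    """
    valid, quarantined = [], []
    for record in records:
        failed = [name for name, check in rules.items() if not check(record)]
        if failed:
            quarantined.append({**record, "_violations": failed})
        else:
            valid.append(record)
    return valid, quarantined


# Hypothetical rules: each maps a rule name to a predicate on one record.
rules = {
    "amount_non_negative": lambda r: r.get("amount") is not None and r["amount"] >= 0,
    "region_present": lambda r: bool(r.get("region")),
}

records = [
    {"id": 1, "amount": 10.0, "region": "NA"},
    {"id": 2, "amount": -5.0, "region": "EU"},
    {"id": 3, "amount": 7.5, "region": ""},
]

valid, quarantined = split_valid_and_quarantined(records, rules)
```

Recording which rule failed alongside the record is the design choice that matters here: it turns a rejected row into something a data owner can act on.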
As data flows through transformation pipelines, quality improves through the medallion architecture. The Bronze layer holds raw data from sources, augmented with metadata for discovery. The Silver layer is where most cleaning happens: deduplication, schema enforcement, creating a single source of truth. Databricks offers multiple deduplication techniques through MERGE operations for upserts and ranking window functions to identify and remove duplicates. The Gold layer delivers refined, aggregated data ready for reporting and AI tools.
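The ranking approach to deduplication amounts to "number the rows within each key from newest to oldest, then keep only rank 1". A minimal pure-Python sketch of that idea, with hypothetical field names, might look like this:

```python
def keep_latest(rows, key, order_by):
    """Keep one row per `key`: the one with the highest `order_by` value.

    This mirrors the effect of ranking rows within each key from newest
    to oldest and keeping only the top-ranked row.
    """
    latest = {}
    for row in rows:
        current = latest.get(row[key])
        if current is None or row[order_by] > current[order_by]:
            latest[row[key]] = row
    return list(latest.values())


# Hypothetical Bronze rows: customer 1 arrives twice with different versions.
bronze = [
    {"customer_id": 1, "name": "Ada", "updated_at": 1},
    {"customer_id": 2, "name": "Grace", "updated_at": 1},
    {"customer_id": 1, "name": "Ada L.", "updated_at": 2},
]

silver = keep_latest(bronze, key="customer_id", order_by="updated_at")
```

In a real pipeline the same logic would normally run as SQL or DataFrame code in the Silver layer; the sketch only shows which row survives.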
Genie spaces should connect to Gold layer tables whenever possible. This ensures users query data that's already been cleaned and validated, which goes a long way toward preventing AI from confidently presenting garbage results.
Teaching Genie Your Business Language
Think of Genie as a new data analyst joining your company. Like any new hire, Genie needs clear context to be effective. It relies on quality metadata to understand what data represents, example queries to learn how the business solves problems, and structured definitions of business terminology. The better the onboarding, the better the results.
Databricks allows adding metadata at multiple levels: databases, tables, columns, even individual commits. Built-in SQL commands capture metadata like ingestion timestamps and source file lineage during data loading. This helps track issues and assists with debugging transformation bugs. Unity Catalog integrates with enterprise cataloging tools, enabling comprehensive metadata export. Well-annotated datasets are searchable, auditable, and far easier for AI to interpret. Since every Genie space is built on Unity Catalog-registered data, Genie uses the metadata attached to those objects.
With Databricks, we can generate descriptions of our datasets and also add comments to our columns.
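As a sketch of what adding such comments could look like, the helper below builds `COMMENT ON TABLE` and `ALTER TABLE ... ALTER COLUMN ... COMMENT` statements from a dictionary of descriptions. The table name `main.gold.game_sales`, the column name, and the descriptions are hypothetical, and the backslash escaping assumes default string-literal handling; in a Databricks notebook the resulting strings would typically be passed to `spark.sql`.

```python
def sql_literal(text):
    """Quote a string for SQL, escaping backslashes and single quotes.

    Assumption: backslash escaping is enabled (the default behaviour).
    """
    escaped = text.replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"


def comment_statements(table, table_comment, column_comments):
    """Build one table-level and N column-level comment statements."""
    statements = [f"COMMENT ON TABLE {table} IS {sql_literal(table_comment)}"]
    for column, comment in column_comments.items():
        statements.append(
            f"ALTER TABLE {table} ALTER COLUMN {column} "
            f"COMMENT {sql_literal(comment)}"
        )
    return statements


# Hypothetical table and column; not the names used in the original space.
statements = comment_statements(
    "main.gold.game_sales",
    "Sales per publisher and region",
    {"na_sales": "Units sold in North America, in millions"},
)

for statement in statements:
    print(statement)  # in a notebook: spark.sql(statement)
```

Keeping the descriptions in one dictionary makes them easy to review with business users before they are applied.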
Every Genie space has a space-level Knowledge Store, a collection of curated semantic definitions that enhance Genie's understanding of your business data. The Knowledge Store allows space authors to customize table and column descriptions specific to the space without altering Unity Catalog metadata, define business terms and synonyms, and hide irrelevant columns from the space.
Genie also learns from interaction. Users can provide feedback on Genie's responses through a simple thumbs up and thumbs down system. When authors approve responses or download results, Genie analyzes the SQL and suggests new expressions or join relationships that could improve future accuracy.
Business users often don't know the exact column names or value formats. Genie bridges this gap through multiple features that learn the language users naturally speak. Every column can be enriched with synonyms that capture the specific terminology users will likely use when conversing with Genie.
Prompt matching consists of two components that help Genie interpret natural language. Format assistance provides representative values for all eligible columns, helping Genie understand data types and formatting patterns. Entity matching curates lists of distinct values (up to 120 columns, 1,024 values per column) for fields users commonly reference, like states, product categories, or customer segments. Together, these features allow Genie to match conversational phrasing to actual column names, correct spelling errors in user prompts, and map user terminology to database values. Prompt matching is enabled by default and customizable per column.
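To illustrate the idea behind mapping user terminology to stored values, here is a small sketch using Python's standard `difflib`. It is not how Genie implements entity matching; the value list, the synonym table, and the similarity cutoff are assumptions.

```python
from difflib import get_close_matches


def match_entity(user_term, known_values, synonyms=None, cutoff=0.6):
    """Map a user's wording to a stored column value, or None if unsure.

    Synonyms are checked first; otherwise the closest known value above
    the similarity cutoff wins, which tolerates small spelling errors.
    """
    synonyms = synonyms or {}
    term = user_term.strip().lower()
    if term in synonyms:
        return synonyms[term]
    lowered = {value.lower(): value for value in known_values}
    hits = get_close_matches(term, list(lowered), n=1, cutoff=cutoff)
    return lowered[hits[0]] if hits else None


# Hypothetical distinct values of a "region" column and user synonyms.
regions = ["North America", "Europe", "Japan"]
region_synonyms = {"na": "North America", "us and canada": "North America"}

typo_match = match_entity("north amercia", regions, region_synonyms)
synonym_match = match_entity("NA", regions, region_synonyms)
```

Returning `None` for terms that match nothing is deliberate: a missing match can prompt a clarifying question, while a wrong match silently filters on the wrong value.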
Encoding Business Logic for Trustworthy Results
Accuracy is the most critical dimension for Genie. AI confidence can mislead users, especially non-technical ones. Without proper grounding in business logic, Genie might generate syntactically correct but semantically wrong queries. Databricks provides multiple mechanisms to build trust and ensure accurate results.
When data assets are added to a Genie space, Genie automatically searches for popular workspace queries associated with those assets. These can be reviewed and added as example SQL queries that help Genie generate correct SQL for common questions. Example queries can be static or parameterized. It's worth using the most typical phrasing of the user's question as the title, since this improves Genie's ability to match prompts. Genie can use the example directly or learn patterns from it. Responses using parameterized queries are marked as Trusted.
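A parameterized query keeps the SQL text fixed and lets only the values vary. The sketch below shows the principle with Python's built-in `sqlite3` as a stand-in for a SQL warehouse; the `sales` table, its columns, and the sample rows are hypothetical, and the parameter syntax used in a Genie space may differ.

```python
import sqlite3

# In-memory database with a hypothetical sales table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (publisher TEXT, region TEXT, units REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("Acme", "North America", 10.0),
        ("Acme", "Europe", 4.0),
        ("Globex", "North America", 6.0),
        ("Acme", "North America", 2.0),
    ],
)

# The query text never changes; only the :region value does.
TOP_PUBLISHERS = """
SELECT publisher, SUM(units) AS total_units
FROM sales
WHERE region = :region
GROUP BY publisher
ORDER BY total_units DESC
"""


def top_publishers(connection, region):
    """Run the fixed query for one region and return (publisher, units) rows."""
    return connection.execute(TOP_PUBLISHERS, {"region": region}).fetchall()


north_america = top_publishers(conn, "North America")
```

Because the logic is fixed and reviewed once, only the filter value depends on the user's prompt.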
For complex logic that shouldn't be exposed or modified, custom functions can be registered in Unity Catalog. Genie can call these functions to answer specific questions without seeing the underlying SQL. Responses using SQL functions are also marked Trusted. These markers give users confidence that results follow the organization's established logic rather than Genie's best guess.
Each Genie space should have one clear focus. To help Genie stay on target, implicit business knowledge must be made explicit through join relationships and SQL expressions. Join relationships define how tables connect. Once these relationships are clearly defined, Genie doesn't have to guess how tables relate, which reduces hallucination risk when combining data from multiple sources.
SQL expressions provide structured definitions for KPIs and metrics (how to calculate important business values), business attributes (additional dimensions for analysis), and conditions and filters (business rules for data subsets). SQL expressions complement example queries: expressions are for reusable business concepts, while example queries teach Genie how to handle common prompt formats. In our space, we wanted Genie to understand terms like "Market Share," "Performance," and "Sales," so we added them as measures in the Knowledge Store.
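A measure such as "Market Share" only becomes unambiguous once its formula is written down. The sketch below assumes one plausible definition, a publisher's sales divided by total sales within a region; the definition used in the actual space is not shown here, and the field names and figures are hypothetical.

```python
from collections import defaultdict


def market_share(rows, region):
    """Return each publisher's share of total sales in one region.

    Assumed definition: publisher sales / total regional sales.
    Returns an empty dict when the region has no sales.
    """
    totals = defaultdict(float)
    for row in rows:
        if row["region"] == region:
            totals[row["publisher"]] += row["sales"]
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {publisher: sales / grand_total for publisher, sales in totals.items()}


# Hypothetical Gold-layer rows.
rows = [
    {"publisher": "Acme", "region": "North America", "sales": 30.0},
    {"publisher": "Globex", "region": "North America", "sales": 10.0},
    {"publisher": "Acme", "region": "Europe", "sales": 5.0},
]

shares = market_share(rows, "North America")
```

Whether the denominator is regional or global sales is exactly the kind of implicit business knowledge that has to be made explicit.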
Validation and Continuous Improvement
Even with careful setup, Genie needs ongoing validation to ensure it produces accurate results. Databricks provides benchmarking tools to measure and improve Genie's performance over time.
Benchmarks are test question sets (up to 500 per space) that assess Genie's response accuracy. Each benchmark question can optionally include a gold standard SQL query whose results represent the correct answer, or a Unity Catalog SQL function as the reference. When adding new benchmark queries, you can even let Genie generate the initial SQL, though this requires caution since the very purpose of benchmarks is to ensure high-quality AI results. It's worth verifying the SQL is correct before using it as ground truth.
During benchmark runs, Genie compares its generated results against these gold standards. After setting up a benchmarking suite, every query can be executed with one button. The system displays results side-by-side: Genie's generated SQL and results alongside the ground truth. An automated comparison tags each result, making it easy to identify problems at a glance.
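The core of such a comparison can be sketched in a few lines. This is an illustration of the principle rather than the comparison Genie actually performs: it treats two result sets as equal when they contain the same rows regardless of order, and it ignores details such as floating-point tolerance. The questions and rows are hypothetical.

```python
from collections import Counter


def results_match(generated, expected):
    """True when both result sets hold the same rows, in any order.

    Counter keeps duplicates, so a row that appears twice must appear
    twice on both sides.
    """
    return Counter(map(tuple, generated)) == Counter(map(tuple, expected))


def run_benchmark(cases):
    """Tag each question as 'match' or 'mismatch'.

    `cases` maps a question to (generated_rows, gold_standard_rows).
    """
    return {
        question: "match" if results_match(generated, expected) else "mismatch"
        for question, (generated, expected) in cases.items()
    }


# Hypothetical benchmark cases.
cases = {
    "Top publishers in North America": (
        [("Globex", 6.0), ("Acme", 12.0)],
        [("Acme", 12.0), ("Globex", 6.0)],
    ),
    "Total sales in Europe": (
        [(5.0,)],
        [(4.0,)],
    ),
}

tags = run_benchmark(cases)
```

A mismatch is a starting point for investigation, not a verdict: the gold standard query itself may be the one that needs fixing.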
The most useful benchmarks cover the questions users ask most often, include multiple phrasings of the same question to test robustness, use realistic question formats that mirror how users actually talk, and run regularly as you refine the space. Benchmarks provide objective quality metrics and help identify where the Knowledge Store needs improvement.
Putting It to the Test
Now that we've prepared our Genie space, we can challenge it with a deliberately sloppy prompt asking for the publishers with the biggest market share in the North American region.
Genie handles it well. It's not fazed by our typo, the omission of the word "publisher," or the casual way we reference North America. It delivers a concise overview supported by a visualization.
Conclusion
Data quality isn't a one-time checklist. It's an ongoing practice that determines whether AI tools deliver value or confusion. For Databricks Genie, quality starts at the platform level with Delta Lake's ACID guarantees, Unity Catalog's metadata management, and the medallion architecture's structured refinement. But it doesn't end there.
Genie extends these foundations by making data quality actionable for AI. Through the Knowledge Store, SQL provenance, prompt matching, and benchmarks, Genie transforms raw metadata into business context that generates accurate, trustworthy insights.
The investment in data quality pays returns: better AI responses, increased user trust, faster time to insight for business teams, and reduced support burden. If you're building a Genie space, the essentials are connecting to Gold layer tables that have already been cleaned and validated, adding metadata to all tables and columns, capturing proven queries as example SQL to ground Genie in your business logic, building a benchmark suite covering your most common questions, and iterating based on feedback from actual users.
As the principle goes: garbage in, garbage out. But with deliberate attention to quality across all six dimensions, the opposite is achievable: quality in, confident quality out.
Blog author
Niklas Niggemann
Working Student Data & AI
Do you still have questions? Just send me a message.