Core Concepts

Data Discovery is the act of finding something. It’s the moment a user searches, browses, or stumbles onto a dataset they didn’t know existed. It’s transactional and ephemeral — you discover once, then you move on.
Data Catalog is the system of record. It’s the persistent, maintained repository of metadata, definitions, lineage, ownership, and quality signals for every data asset. The catalog is what makes discovery possible, but it’s much more than a search index — it’s the ongoing governance layer that keeps information accurate, trusted, and actionable over time.
Data Observability asks: Is the data behaving normally? It is the continuous monitoring of data health over time, catching anomalies before users notice. It’s a signal.
  • Focuses on change and drift — volume dropped, schema changed, freshness delayed, distribution shifted
  • Produces alerts, incidents, SLA breaches
  • Forward-looking and continuous — “something changed, investigate now”
  • Example: “Row count dropped 40% compared to yesterday’s average — anomaly detected”
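As a rough illustration of the row-count example above, the following Python sketch compares today's row count against the average of recent counts. The function name, the 25% tolerance, and the sample counts are assumptions made for illustration; this is not Prizm's actual detection logic.

```python
from statistics import mean

def row_count_drop(history, today, max_drop=0.25):
    """Return (drop_fraction, is_anomaly) for today's count vs. the historical mean.

    history:  recent row counts (hypothetical sample data).
    max_drop: hypothetical tolerance; a real monitor would likely learn this.
    """
    baseline = mean(history)
    if baseline == 0:
        return 0.0, False  # nothing to compare against
    drop = (baseline - today) / baseline
    return drop, drop > max_drop

drop, is_anomaly = row_count_drop([1000, 1020, 980], today=600)
print(f"drop={drop:.0%} anomaly={is_anomaly}")
```

A fixed tolerance keeps the sketch short; a production monitor would more plausibly derive the tolerance from historical variation.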
Data Quality asks: Is the data correct? It is a measurement of whether data meets defined standards at a point in time. It’s a verdict.
  • Focuses on the data itself — completeness, accuracy, consistency, uniqueness, timeliness, validity
  • Produces a score, a pass/fail, a rule result
  • Backward-looking — “was this data good when it was loaded?”
  • Example: “12.7% of customer_id values are null — that fails our threshold”
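The null-rate example above can be sketched as a simple point-in-time check. The helper names, the 5% threshold, and the sample values are hypothetical; the sketch only shows the shape of a pass/fail rule, not how Prizm implements one.

```python
def null_rate(values):
    """Fraction of entries that are None (standing in for SQL NULL)."""
    if not values:
        return 0.0
    return sum(v is None for v in values) / len(values)

def passes_null_threshold(values, max_null_rate=0.05):
    # max_null_rate is a hypothetical threshold chosen for illustration.
    return null_rate(values) <= max_null_rate

# Hypothetical sample: 127 nulls out of 1,000 customer_id values.
customer_ids = [None] * 127 + list(range(873))
print(f"{null_rate(customer_ids):.1%}", passes_null_threshold(customer_ids))
```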
Semantics is about the inherent meaning and definition of a data element — what it represents in business terms, independent of how it’s being used. Example: A column called cust_id in a database table. 
  • Semantics tells you: “This is a Customer Identifier — a unique reference to a person or organization that has a business relationship with the company” 
  • It gets tagged with business terms like: Customer, PII, Primary Key, CRM Entity 
  • This meaning is stable — cust_id means the same thing whether it appears in a sales table, a support ticket table, or a billing table 
Context is about the circumstances surrounding data — who uses it, where it flows, what depends on it, and what impact it has on the business. Example: That same cust_id column — now let’s add context: 
  • It feeds into the daily revenue dashboard used by the CFO 
  • It’s joined to a pipeline that triggers customer invoices 
  • It was flagged with 3% null values last Tuesday 
  • It’s downstream of a Salesforce sync that ran late 
Context tells you: 
  • This particular instance of cust_id is business-critical 
  • A data issue here affects revenue reporting and invoicing 
  • This needs to be prioritized over a cust_id sitting in an archive table nobody uses 
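|                    | Semantics                         | Context                                              |
|--------------------|-----------------------------------|------------------------------------------------------|
| Question           | What does this data mean?         | Why does this data matter right now?                 |
| Nature             | Static definition                 | Dynamic and situational                              |
| Example            | cust_id = Customer Identifier     | cust_id feeds the CFO dashboard and invoice pipeline |
| Set by             | Business glossary, classification | Lineage, usage patterns, downstream dependencies     |
| Changes over time? | Rarely                            | Constantly                                           |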
Yes, Prizm provides both Data Discovery and Data Catalog as part of the platform. A simple way to think about it:
|                     | Discovery                 | Catalog                                                |
|---------------------|---------------------------|--------------------------------------------------------|
| Question it answers | “Does this data exist?”   | “What is this data, who owns it, can I trust it?”      |
| Primary user        | Anyone looking for data   | Data owners, stewards, engineers, and analysts         |
| Frequency           | One-time or occasional    | Ongoing, daily governance work                         |
| Core feature        | Search bar                | Asset pages, lineage, DQ rules, glossary               |
| Value               | Speed to find             | Depth of trust                                         |
In your Prizm UI, the global search bar is the discovery surface. Everything else (the asset detail page, DQ rules, lineage, glossary, tags, usage signals) is the catalog doing its job of making the discovered asset trustworthy and actionable. Think of discovery as the front door and the catalog as the building behind it.
Yes, Prizm provides both Data Observability and Data Quality as part of the platform.
Yes, Prizm provides both. This is where Prizm’s power comes in: semantics without context is just a label; context without semantics is just noise.
  1. Semantics tells Prizm: “This is customer data, it’s PII, it’s a key business entity” 
  2. Context tells Prizm: “This specific instance flows into 12 downstream reports, was touched by 3 pipelines today, and is used by the finance team daily” 
  3. Together, Prizm can say: “There’s a data quality issue here, and it’s high priority because of what this data means AND how critical it is to the business right now”
That combined intelligence is what Prizm provides.
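To make the combination concrete, here is a minimal Python sketch that ranks a quality issue using both a semantic tag set and a context signal. The class, the field names, the tag set, and the cut-off of 10 downstream reports are all hypothetical; Prizm's real prioritization logic is not described here.

```python
from dataclasses import dataclass, field

@dataclass
class AssetSignal:
    name: str
    semantic_tags: set = field(default_factory=set)  # meaning, e.g. {"PII", "Customer"}
    downstream_reports: int = 0                      # context: lineage fan-out
    has_quality_issue: bool = False

def priority(asset):
    """Hypothetical rule: an issue ranks higher when meaning and usage both say so."""
    if not asset.has_quality_issue:
        return "none"
    sensitive = bool(asset.semantic_tags & {"PII", "Primary Key"})
    widely_used = asset.downstream_reports >= 10
    if sensitive and widely_used:
        return "high"
    if sensitive or widely_used:
        return "medium"
    return "low"

live = AssetSignal("crm.cust_id", {"PII", "Customer"}, 12, True)
archived = AssetSignal("archive.cust_id", {"PII", "Customer"}, 0, True)
print(priority(live), priority(archived))
```

The two cust_id instances share the same semantics but receive different priorities because their context differs.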
PRIZM supports six tag types that you can apply to any asset:
  • Status — Current health state, for example Healthy.
  • Domain — Business domain, for example Customer, Sales, or Finance.
  • System — Source system, for example CRM.
  • Data — Data category, for example Transactions.
  • Priority — Operational priority, for example High Priority.
  • Security — Data sensitivity classification, for example PII.

Getting started

DQLabs Prizm is a multi-agentic, AI-native platform for data quality, observability, and cataloging, built for modern enterprises. It transforms how organizations manage, trust, and govern their data. By deploying specialized AI agents that collaborate autonomously, Prizm provides end-to-end visibility, an intelligent data catalog, data observability, data quality management, and proactive governance across your entire data ecosystem.
PRIZM connects to a range of popular databases and cloud data warehouses. Based on supported integrations, the platform works with:
  • Snowflake (e.g., snowflake/production_db)
  • PostgreSQL (e.g., postgres/analytics_db)
  • MySQL (e.g., mysql/ecommerce_db)
  • Amazon Redshift (e.g., redshift/warehouse_db)
Contact your PRIZM administrator to add or configure a new data source connection.
On the PRIZM login screen, click Create new account below the sign-in form. Depending on your organization’s settings, account creation may require administrator approval before you can sign in. If you don’t receive access promptly, check with your PRIZM administrator.

Assets

An asset is any data object that PRIZM monitors in your connected databases. Assets include:
  • Tables & Views — relational tables and database views
  • Queries — saved or scheduled SQL queries
  • Pipelines — data pipeline jobs moving or transforming data
  • Reports — reporting objects or materialized outputs
  • Semantic models — logical data models and definitions
  • Attributes — columns related to tables, queries, views.
You can browse and search all assets from the Assets page, and filter by type using the tab navigation (All, Tables & Views, Attribute, Query, Pipeline, Reports, Semantics).
A fingerprint column is the column PRIZM uses to detect changes in an asset’s data over time. It is typically a timestamp column such as updated_at, transaction_date, or last_login. PRIZM reads this column to determine when the data was last modified, which directly feeds the Freshness metric. Your PRIZM administrator configures the fingerprint column for each asset during setup.
The Score is an overall data quality rating for an asset, expressed as a percentage from 0 to 100%. It aggregates multiple quality dimensions — such as completeness, freshness, schema compliance, and custom measure results — into a single number. A higher score means the asset is meeting more of its configured quality expectations. Use the Score to quickly identify assets that need attention.
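The exact aggregation formula is not stated here. As one plausible reading, the sketch below treats the Score as a weighted average of per-dimension scores on a 0 to 100 scale. The function name, the dimension names, the equal default weights, and the sample values are assumptions for illustration only.

```python
def overall_score(dimension_scores, weights=None):
    """Weighted average of per-dimension scores, each on a 0-100 scale."""
    if not dimension_scores:
        raise ValueError("at least one dimension score is required")
    if weights is None:
        # Assumption: every dimension counts equally.
        weights = {name: 1.0 for name in dimension_scores}
    total_weight = sum(weights[name] for name in dimension_scores)
    weighted = sum(score * weights[name] for name, score in dimension_scores.items())
    return round(weighted / total_weight, 1)

# Hypothetical per-dimension results for one asset.
scores = {
    "completeness": 90.0,
    "freshness": 70.0,
    "schema_compliance": 100.0,
    "custom_measures": 80.0,
}
print(overall_score(scores))
```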
Freshness tells you how recently an asset’s data was last updated. PRIZM calculates it using the asset’s fingerprint column. A freshness value of 2d means the most recent record is two days old. If your asset is expected to update daily but shows a freshness of 5d, that is a signal worth investigating. Freshness thresholds can be configured as measures so PRIZM fires an alert when data goes stale.
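A minimal Python sketch of the idea, assuming freshness is the age of the newest value in the fingerprint column. The function names, the sample timestamps, and the one-day expectation are hypothetical and do not reflect PRIZM's internal implementation.

```python
from datetime import datetime, timedelta, timezone

def freshness(fingerprint_values, now):
    """Age of the newest fingerprint timestamp, e.g. the max of updated_at."""
    return now - max(fingerprint_values)

def format_days(age):
    # Whole days only, mirroring the "2d" style used above.
    return f"{age.days}d"

now = datetime(2024, 1, 10, 12, 0, tzinfo=timezone.utc)
# Hypothetical updated_at values read from the asset.
updated_at = [now - timedelta(days=9), now - timedelta(days=5, hours=3)]

age = freshness(updated_at, now)
is_stale = age > timedelta(days=1)  # hypothetical expectation: daily updates
print(format_days(age), is_stale)
```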

Alerts & Issues

Alerts are created automatically when a measure’s defined threshold is breached. A measure is a configured rule or check on an asset, for example “the null rate on email must stay below 5%” or “the row count must not drop by more than 10% day over day.” When PRIZM evaluates a measure and finds the threshold has been breached, it generates an alert. Alerts appear on the Alerts page, categorized by level: Critical, Warning, Info, or High.
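The relationship between a measure and an alert can be sketched as follows. The Measure class, the evaluate function, and the alert fields are hypothetical names used for illustration; they are not PRIZM's API.

```python
from dataclasses import dataclass

@dataclass
class Measure:
    name: str
    threshold: float  # observed value must stay below this
    level: str        # alert level to raise, e.g. "Warning"

def evaluate(measure, observed):
    """Return an alert record when the threshold is breached, else None."""
    # "Must stay below" means reaching the threshold already counts as a breach.
    if observed >= measure.threshold:
        return {
            "measure": measure.name,
            "level": measure.level,
            "observed": observed,
            "threshold": measure.threshold,
        }
    return None

email_nulls = Measure("email null rate", threshold=0.05, level="Warning")
print(evaluate(email_nulls, 0.08))  # breached: an alert record
print(evaluate(email_nulls, 0.02))  # within threshold: None
```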
Drift status indicates how much an asset’s data has deviated from its established baseline. PRIZM calculates drift by comparing current values against historical patterns for a given column or measure. The possible statuses are:
  • Low — minor deviation, within normal variation
  • Medium — notable deviation that warrants review
  • High — significant deviation that likely requires action
You can see the drift status for each alert in the Alerts table alongside the specific percent change that triggered it.
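The cut-offs that separate Low, Medium, and High are not given here. The sketch below shows one way a percent change could be mapped to a drift status; the 10% and 30% boundaries and the function names are assumptions for illustration only.

```python
def percent_change(baseline, current):
    """Signed percent change of current relative to a non-zero baseline."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (current - baseline) * 100 / baseline

def drift_status(change_pct, medium_at=10.0, high_at=30.0):
    # medium_at and high_at are hypothetical cut-offs, not PRIZM defaults.
    magnitude = abs(change_pct)
    if magnitude >= high_at:
        return "High"
    if magnitude >= medium_at:
        return "Medium"
    return "Low"

change = percent_change(baseline=200, current=120)
print(change, drift_status(change))
```

Using the absolute value treats a sharp increase and a sharp decrease as equally noteworthy, which is a design choice of this sketch.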
Alerts and issues serve different purposes in PRIZM’s quality workflow:
  • An alert is an automated notification generated when a measure threshold is breached. Alerts are read-only records of what PRIZM detected.
  • An issue is a tracked work item you create and manage through to resolution. Issues have a status (New, In Progress, Resolved), a priority, and are linked to a specific asset and database.
When you want to act on an alert, navigate to the Issues page and create or update an issue to track the investigation and resolution.
PRIZM does not currently offer a one-click silence or acknowledge action directly on an alert. To track your response to an alert, open the Issues page and create a new issue linked to the affected asset. Set the status to In Progress while you investigate, and mark it Resolved once the underlying data problem is fixed. This gives your team full visibility into which alerts are being handled.
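The issue lifecycle described above can be modeled as a small state machine. The three statuses come from the text; the class names, the field names, and the rule that an issue only moves forward are assumptions made for this sketch.

```python
from dataclasses import dataclass
from enum import Enum

class Status(Enum):
    NEW = "New"
    IN_PROGRESS = "In Progress"
    RESOLVED = "Resolved"

# Hypothetical transition rules; the product may allow other moves.
ALLOWED = {
    Status.NEW: {Status.IN_PROGRESS},
    Status.IN_PROGRESS: {Status.RESOLVED},
    Status.RESOLVED: set(),
}

@dataclass
class Issue:
    title: str
    asset: str
    database: str
    priority: str
    status: Status = Status.NEW

    def move_to(self, new_status):
        if new_status not in ALLOWED[self.status]:
            raise ValueError(
                f"cannot move from {self.status.value} to {new_status.value}"
            )
        self.status = new_status

issue = Issue("Null spike in email", asset="customers",
              database="postgres/analytics_db", priority="High")
issue.move_to(Status.IN_PROGRESS)  # investigation started
issue.move_to(Status.RESOLVED)     # underlying data problem fixed
print(issue.status.value)
```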

Metrics

The Data Quality percentage is an aggregated score that reflects the overall health of your monitored data assets. It combines results across multiple quality dimensions including completeness, freshness, schema compliance, and any custom measure checks you have configured. The score is calculated across all connected assets, so a drop in the percentage typically means one or more assets have started failing their quality checks. Drill into the Assets page to find which specific assets are contributing to a lower score.
Active Pipelines shows the number of data pipelines that PRIZM is currently monitoring. This count is broken down by running pipelines (actively executing) and paused pipelines (configured but not currently running). If a pipeline you expect to be running appears paused, contact your data administrator to investigate the pipeline status in the source system.