Back to packs

Insurance Claims Data Quality

free

Validate claim records — numbers, statuses, dates, reserves, financial reconciliation, and policy linkage.

15 rules 2362 downloads4.4 avg (105)
insuranceclaimspolicyactuarialreservenaic
4.4(105 ratings)

Sign in to rate this pack

Test this pack with your data

Download the template, fill in your data, and see quality results instantly.

Test This Pack

Download & Install

Choose your tool — get a ready-to-run file

Run this on your data? Upload your CSV — we'll auto-map the columns, validate, and report the bad rows.Test my data
Or use the CLI
$ npx dqhub install insurance-claims --format soda --table YOUR_TABLE

About this pack

Comprehensive data quality checks for insurance claims processing. Covers: - Claim number and policy number format validation - Claim lifecycle: status transitions, date sequencing (loss <= report <= current) - Financial integrity: paid + reserve = incurred, reserve consistency with status - NAIC company code and CUSIP security identifier validation - Loss ratio reasonableness checks - Actuarial data quality for reserve development

Sources & References

Insurance companies are assigned a unique 5-digit NAIC code used in all statutory filings and regulatory reporting

Securities identifiers must conform to the 9-character CUSIP format for accurate trade settlement and regulatory reporting

What's included

7consistency rules
4format rules
2completeness rules
1uniqueness rules
1referential integrity rules

Checks included (15)

Claim Status Allowed Values(claim_status)

Validates that claim status values are restricted to a standard set of insurance claim lifecycle states. Ensures data consistency across claims systems and prevents invalid or misspelled status values from entering the pipeline.

Claim Date Sequence Validation

Validates the temporal ordering of claim dates: the loss date must be on or before the report date, and the report date must be on or before the current date. Catches backdated claims, data entry errors, and ETL issues that break the expected chronological sequence of insurance claim events.

Claim Reserve Consistency

Validates that claim reserve amounts are consistent with claim status. When a claim is Closed, the reserve must be zero (fully settled). When a claim is Open, the reserve must be greater than zero (outstanding liability exists). This prevents financial reporting errors and ensures accurate loss reserve calculations.

Claim Financial Reconciliation

Validates the fundamental insurance accounting identity: paid amount plus outstanding reserve must equal the incurred amount. This triangle reconciliation is essential for accurate loss development reporting, actuarial analysis, and regulatory filings.

Policy Date Validity

Validates that the policy effective date is strictly before the policy expiration date. Ensures that no policy has a zero-duration or negative-duration term, which would indicate data entry errors or ETL issues in the policy administration system.

Loss Ratio Range Check(loss_ratio)

Validates that loss ratio values fall within the expected range of 0% to 200%. The loss ratio (incurred losses divided by earned premium, expressed as a percentage) is a key actuarial metric. Values outside this range typically indicate data errors, misallocated premiums, or catastrophic event data that should be flagged for review.

Enum Value Valid

Asserts that all values in a column belong to a predefined set of allowed values. Catches typos, unexpected category values, or upstream system changes that introduce new enum variants without coordination.

Claim Number Format(claim_number)

Validates that claim numbers conform to a standard alphanumeric format between 8 and 20 characters. Accepts uppercase letters, digits, and hyphens. This ensures claim identifiers are consistent across systems and suitable for cross-referencing between carriers, TPAs, and reinsurers.

Policy Number Format(policy_number)

Validates that policy numbers conform to a standard alphanumeric format between 10 and 20 characters. Accepts uppercase letters, digits, and hyphens. Ensures policy identifiers are consistent for binding, endorsement tracking, and claims linkage.

NAIC Company Code Format(naic_code)

Validates that NAIC (National Association of Insurance Commissioners) company codes are exactly 5 digits. The NAIC code uniquely identifies insurance companies in regulatory filings, statutory reports, and industry databases across all US states and territories.

CUSIP Security Identifier Format(cusip)

Validates that CUSIP (Committee on Uniform Securities Identification Procedures) identifiers conform to the standard 9-character alphanumeric format. CUSIPs uniquely identify North American securities and are used extensively in insurance investment portfolio reporting and regulatory filings.

Column Not Null

Asserts that a specified column contains no null values. This is the most fundamental completeness check — every row must have a value present in the target column.

Column Completeness Threshold

Asserts that a column meets a minimum completeness threshold, measured as the percentage of non-null values. Useful when some nulls are acceptable but the overall population rate must stay above a defined level (e.g., 95%).

Column Unique

Validates that all non-null values in a specified column are unique. Useful for natural keys, email addresses, identifiers, and any column where duplicates indicate a data quality issue.

Foreign Key Valid

Validates that all non-null values in a foreign key column exist in the referenced parent table's primary key column. Detects orphaned references that break referential integrity.