Back to packs

VIN & Vehicle Data Quality

free

Validate Vehicle Identification Numbers against NHTSA and ISO 3779 standards — format, check digit, year code, and country of origin.

12 rules 1625 downloads4.3 avg (71)
vinvehicleautomotivenhtsaiso-3779fleetinsurancedmv
4.3(71 ratings)

Sign in to rate this pack

Test this pack with your data

Download the template, fill in your data, and see quality results instantly.

Test This Pack

Download & Install

Choose your tool — get a ready-to-run file

Run this on your data? Upload your CSV — we'll auto-map the columns, validate, and report the bad rows.Test my data
Or use the CLI
$ npx dqhub install automotive-vin --format soda --table YOUR_TABLE

About this pack

Comprehensive VIN (Vehicle Identification Number) validation rules based on federal NHTSA regulations and ISO 3779/3780 standards. Covers: - VIN format validation (17-character alphanumeric, excluding I, O, Q) - Check digit verification using the NHTSA mod-11 algorithm (position 9) - Model year code validation (position 10, 30-year cycle) - Country of manufacture identification (position 1, ISO 3780) - Basic completeness and uniqueness checks for vehicle records Essential for automotive dealerships, fleet management, insurance companies, DMVs, and any organization that processes vehicle data.

Sources & References

NHTSA — 49 CFR Part 565 - Vehicle Identification Number (VIN) Requirements

All motor vehicles manufactured for sale in the US must have a 17-character VIN conforming to this format

ISO — ISO 3779:2009 - Road vehicles - Vehicle identification number (VIN)

International standard defining VIN content and structure

NHTSA — 49 CFR Part 565.15(c) - Check digit

Position 9 of the VIN must contain a check digit computed using the prescribed transliteration and weighting algorithm

ISO — ISO 3779:2009 Section 4.3

Check digit calculation for VIN verification

NHTSA — 49 CFR Part 565.15(b) - Year of manufacture

Position 10 of the VIN must indicate the model year using the prescribed character codes

ISO — ISO 3779:2009 Annex B

Year of manufacture designation in the VIN

ISO — ISO 3780:2009 - Road vehicles - World manufacturer identifier (WMI) code

The first character of the VIN identifies the geographic region of the vehicle manufacturer

NHTSA — 49 CFR Part 565.15(a) - World manufacturer identifier

Positions 1-3 of the VIN represent the World Manufacturer Identifier assigned by SAE International

What's included

5format rules
3completeness rules
2uniqueness rules
1range rules
1consistency rules

Checks included (12)

VIN Format Validation(vin)

Validates that values conform to the standard Vehicle Identification Number (VIN) format as defined by NHTSA and ISO 3779. A VIN must be exactly 17 characters composed of digits 0-9 and uppercase letters A-Z, excluding I, O, and Q (which are visually ambiguous with 1, 0, and 9).

VIN Check Digit Validation (Position 9)(vin)

Validates the VIN check digit at position 9 using the NHTSA mod-11 algorithm. Each character is transliterated to a numeric value, multiplied by a positional weight, summed, and divided by 11. The remainder must equal the check digit (0-9 or X for remainder 10). This catches transcription errors and detects counterfeit VINs.

VIN Model Year Code Validation (Position 10)(vin)

Validates that position 10 of the VIN contains a valid model year character. The year code follows a 30-year cycle defined by NHTSA: digits 1-9 and letters A-H, J-N, P, R-T, V-Y (excluding I, O, Q, U, and Z). For example, 'R' = 2024, 'S' = 2025, 'T' = 2026, '1' = 2027 in the current cycle.

VIN Country of Manufacture Code (Position 1)(vin)

Validates that position 1 of the VIN contains a recognized World Manufacturer Identifier (WMI) country code per ISO 3780. Position 1 indicates the country or region of manufacture: 1-5 = North America, 6-7 = Oceania, 8-9 = South America, A-H = Africa, J-R = Asia, S-Z = Europe.

Valid Date String Format(event_date)

Validates that date string values match the expected format. Supports configurable formats including YYYY-MM-DD (ISO 8601), MM/DD/YYYY, DD/MM/YYYY, YYYY/MM/DD, and DD-Mon-YYYY. Validates month (01-12), day (01-31), and reasonable year ranges.

Column Not Null

Asserts that a specified column contains no null values. This is the most fundamental completeness check — every row must have a value present in the target column.

Column Completeness Threshold

Asserts that a column meets a minimum completeness threshold, measured as the percentage of non-null values. Useful when some nulls are acceptable but the overall population rate must stay above a defined level (e.g., 95%).

String Not Empty

Asserts that a string column contains no empty strings. This is distinct from a null check — a value can be non-null but still empty ('') or whitespace-only. Catches cases where upstream systems insert blank strings instead of proper nulls.

Column Unique

Validates that all non-null values in a specified column are unique. Useful for natural keys, email addresses, identifiers, and any column where duplicates indicate a data quality issue.

Duplicate Detection

Detects and counts duplicate rows based on specified columns. Returns the number of duplicates found and identifies the offending rows. Supports threshold-based alerting for acceptable duplicate rates.

Date Not In Future

Validates that a date or timestamp column contains no values in the future. Catches data entry errors, timezone issues, and ETL bugs that produce future-dated records for columns like birth_date, transaction_date, or created_at.

Enum Value Valid

Asserts that all values in a column belong to a predefined set of allowed values. Catches typos, unexpected category values, or upstream system changes that introduce new enum variants without coordination.