FDA CDISC SDTM Submission
FDA_CDISC (free): data-quality rules for clinical-trial data submitted to the FDA as CDISC SDTM datasets within an eCTD submission. Covers required identifier variables, USUBJID cross-domain referential integrity, CDISC Controlled Terminology conformance, ISO 8601 dates, DOMAIN consistency, sequence uniqueness, and visit referential checks. Use it to catch SDTM conformance issues before a define.xml / Pinnacle 21 validation run.
Checks included (20)
USUBJID Exists in DM
Every subject referenced in a domain (e.g. AE, EX, LB) must exist in the Demographics (DM) domain.
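As a rough illustration of this referential check, the sketch below treats each domain as a list of dictionaries keyed by variable name. The function name `find_orphan_subjects` and the sample subject IDs are hypothetical, and USUBJID is assumed to be populated on every row.

```python
def find_orphan_subjects(dm_rows, domain_rows):
    """Return the USUBJID values used in a domain that are absent from DM."""
    known = {row["USUBJID"] for row in dm_rows}
    used = {row["USUBJID"] for row in domain_rows}
    return sorted(used - known)


# Hypothetical sample data: one AE record points at a subject missing from DM.
dm = [{"USUBJID": "STUDY1-001"}, {"USUBJID": "STUDY1-002"}]
ae = [
    {"USUBJID": "STUDY1-001", "AETERM": "HEADACHE"},
    {"USUBJID": "STUDY1-003", "AETERM": "NAUSEA"},
]
orphans = find_orphan_subjects(dm, ae)
```

The same set-difference pattern could be applied to the other lookup checks in this list, with a different key variable and reference dataset.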
Coded Values Conform to CDISC Controlled Terminology
Variables governed by CDISC Controlled Terminology (e.g. AESEV, AEOUT, units) must contain values from the applicable CT codelist.
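A codelist check of this kind might look like the sketch below. The `CODELISTS` mapping is only an excerpt built from values quoted elsewhere in this list; a real run would load the published CDISC CT package. The function name `ct_violations` is hypothetical, and treating null or empty values as acceptable is an assumption.

```python
# Illustrative excerpt only; real codelists come from the published CDISC CT package.
CODELISTS = {
    "AESEV": {"MILD", "MODERATE", "SEVERE"},
    "SEX": {"M", "F", "U", "UNDIFFERENTIATED"},
}


def ct_violations(rows, variable, codelist):
    """Return (row index, value) pairs whose non-null value is outside the codelist."""
    return [
        (i, row[variable])
        for i, row in enumerate(rows)
        # Assumption: null/empty values are left to a separate "required" check.
        if row.get(variable) not in (None, "") and row[variable] not in codelist
    ]


ae = [{"AESEV": "MILD"}, {"AESEV": "Moderate"}, {"AESEV": ""}]
violations = ct_violations(ae, "AESEV", CODELISTS["AESEV"])
```

The comparison is case-sensitive on purpose, so a mixed-case value such as `Moderate` is reported.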
VISITNUM Exists in Trial Visits (TV)
Each VISITNUM used in a domain must be defined in the Trial Visits (TV) dataset.
AE Coded in MedDRA
Adverse-event dictionary-derived terms (AE.AEDECOD / AEBODSYS) must be valid MedDRA terms for the study's MedDRA version.
ARMCD Exists in Trial Arms (TA)
DM.ARMCD must be defined in the Trial Arms (TA) dataset.
Result Units Conform to CT
Standardized result units (--STRESU) must be values from the CDISC Unit codelist.
SUPP-- Records Link to a Parent
Each Supplemental Qualifier (SUPP--) record's RDOMAIN + USUBJID + IDVAR/IDVARVAL must reference an existing parent-domain record.
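One way to picture the parent lookup is the sketch below. It assumes each dataset is a list of dictionaries, that IDVARVAL is stored as text, and that a null IDVAR means the record links at subject level. The function name `unlinked_supp_records` and the sample values are hypothetical.

```python
def unlinked_supp_records(supp_rows, parents_by_domain):
    """Return SUPP-- rows that have no matching record in their parent domain."""
    unlinked = []
    for supp in supp_rows:
        parents = parents_by_domain.get(supp["RDOMAIN"], [])
        idvar = supp.get("IDVAR") or None
        matched = any(
            parent["USUBJID"] == supp["USUBJID"]
            # IDVARVAL is text, so the parent value is compared as a string.
            # Assumption: sequence numbers are stored as integers, not floats.
            and (idvar is None or str(parent.get(idvar)) == str(supp["IDVARVAL"]))
            for parent in parents
        )
        if not matched:
            unlinked.append(supp)
    return unlinked


ae = [{"USUBJID": "S1-001", "AESEQ": 1}, {"USUBJID": "S1-001", "AESEQ": 2}]
suppae = [
    {"RDOMAIN": "AE", "USUBJID": "S1-001", "IDVAR": "AESEQ", "IDVARVAL": "2"},
    {"RDOMAIN": "AE", "USUBJID": "S1-001", "IDVAR": "AESEQ", "IDVARVAL": "9"},
]
unlinked = unlinked_supp_records(suppae, {"AE": ae})
```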
--DTC Dates Are ISO 8601 (aestdtc)
SDTM date/time variables (--DTC) must be in ISO 8601 format, e.g. YYYY-MM-DD, or YYYY-MM-DDThh:mm:ss when a time component is present.
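A format check along these lines could be sketched with a regular expression. The pattern below is an assumption, not the rule's actual implementation: it accepts right-truncated values (such as a year and month only), which SDTM permits for partial dates, and it does not handle time-zone offsets, intervals, or calendar validity (for example, it would accept 2023-02-30). The names `ISO_DTC` and `is_iso8601_dtc` are hypothetical.

```python
import re

# YYYY, optionally followed by -MM, -DD, Thh, :mm, :ss in that order.
ISO_DTC = re.compile(
    r"\d{4}"
    r"(-(0[1-9]|1[0-2])"
    r"(-(0[1-9]|[12]\d|3[01])"
    r"(T([01]\d|2[0-3])"
    r"(:[0-5]\d"
    r"(:[0-5]\d)?)?)?)?)?"
)


def is_iso8601_dtc(value):
    """Return True if value matches the simplified ISO 8601 pattern above."""
    return bool(ISO_DTC.fullmatch(value))


checks = {
    v: is_iso8601_dtc(v)
    for v in ["2023-05-14", "2023-05-14T09:30", "2023-05", "14/05/2023", "2023-13-01"]
}
```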
DM.SEX Conforms to Sex Codelist (sex)
DM.SEX must be a value from the CDISC Sex codelist (M, F, U, UNDIFFERENTIATED).
AE Severity Conforms to CT (aesev)
AE.AESEV must be MILD, MODERATE, or SEVERE per CDISC Controlled Terminology.
AE Outcome Conforms to CT (aeout)
AE.AEOUT must be a value from the outcome CT (RECOVERED/RESOLVED, RECOVERING/RESOLVING, NOT RECOVERED/NOT RESOLVED, RECOVERED/RESOLVED WITH SEQUELAE, FATAL, UNKNOWN).
Y/N Variables Are Valid (aepresp)
Variables governed by the (N)Y codelist (e.g. --PRESP, --OCCUR) must be Y or N.
Baseline Flag Is Valid (blfl)
The baseline flag (--BLFL) must be 'Y' or null (never 'N').
DOMAIN Value Matches the Dataset
The DOMAIN variable must equal the two-character domain code of the dataset it appears in.
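The comparison can be sketched as below, assuming the expected code is simply the dataset file name without its extension, upper-cased. Split datasets and SUPP-- datasets would need their own handling, which this sketch leaves out. The function name `domain_mismatches` and the file name `ae.xpt` are hypothetical.

```python
from pathlib import PurePath


def domain_mismatches(dataset_name, rows):
    """Return (row index, DOMAIN value) pairs that differ from the dataset's code."""
    # Assumption: "ae.xpt" implies the expected DOMAIN value "AE".
    expected = PurePath(dataset_name).stem.upper()
    return [
        (i, row.get("DOMAIN"))
        for i, row in enumerate(rows)
        if row.get("DOMAIN") != expected
    ]


rows = [{"DOMAIN": "AE"}, {"DOMAIN": "CM"}, {"DOMAIN": "ae"}]
mismatches = domain_mismatches("ae.xpt", rows)
```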
STUDYID Is Consistent
All records in a submission should carry the same STUDYID.
Start Date Not After End Date
For interval events/interventions, --STDTC must not be after --ENDTC.
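Because ISO 8601 extended-format strings sort chronologically, a simple version of this check can compare the strings directly. The sketch below assumes both values are already valid ISO 8601, carry no time-zone offset, and may differ in precision; it compares only the precision the two values share, so a partial date is never flagged on its own. The function name `start_after_end` is hypothetical.

```python
def start_after_end(start, end):
    """Return True when start is later than end at their shared precision."""
    if not start or not end:
        return False  # Nothing to compare when either value is missing.
    shared = min(len(start), len(end))
    # Lexicographic comparison is chronological for same-length ISO 8601 strings.
    return start[:shared] > end[:shared]


results = [
    start_after_end("2023-05-14", "2023-05-10"),        # clearly after
    start_after_end("2023-05", "2023-05-10"),           # same month, undecidable
    start_after_end("2023-05-14T10:00", "2023-05-14"),  # same day, undecidable
    start_after_end("", "2023-05-10"),                  # missing start
]
```

Treating undecidable cases as passing is a design choice; a stricter rule could report them for manual review instead.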
Required Identifier Variables Present
Every SDTM record must carry STUDYID, DOMAIN, and USUBJID, the core identifier variables required by the SDTM IG.
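A presence check for these three variables might look like the following sketch, which treats a missing key, a null, and a blank string the same way. The function name `missing_identifiers` and the sample values are hypothetical.

```python
REQUIRED = ("STUDYID", "DOMAIN", "USUBJID")


def missing_identifiers(rows, required=REQUIRED):
    """Return (row index, [missing variable names]) for incomplete rows."""
    problems = []
    for i, row in enumerate(rows):
        # A variable counts as missing if absent, null, or whitespace only.
        missing = [v for v in required if not str(row.get(v) or "").strip()]
        if missing:
            problems.append((i, missing))
    return problems


rows = [
    {"STUDYID": "STUDY1", "DOMAIN": "AE", "USUBJID": "STUDY1-001"},
    {"STUDYID": "STUDY1", "DOMAIN": "AE", "USUBJID": " "},
    {"DOMAIN": "AE", "USUBJID": "STUDY1-002"},
]
problems = missing_identifiers(rows)
```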
USUBJID Present (usubjid)
USUBJID (unique subject identifier) must be populated on every record.
--SEQ Is Unique Within Subject (usubjid_seq)
The sequence number (--SEQ) must be unique within each subject in a domain.
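Counting (USUBJID, --SEQ) pairs is one straightforward way to express this rule. In the sketch below, the function name `duplicate_seq` and the sample data are hypothetical, and both variables are assumed to be populated on every row.

```python
from collections import Counter


def duplicate_seq(rows, seq_var):
    """Return the (USUBJID, sequence) pairs that occur more than once."""
    counts = Counter((row["USUBJID"], row[seq_var]) for row in rows)
    return sorted(pair for pair, n in counts.items() if n > 1)


ae = [
    {"USUBJID": "S1-001", "AESEQ": 1},
    {"USUBJID": "S1-001", "AESEQ": 2},
    {"USUBJID": "S1-001", "AESEQ": 2},  # repeated within the same subject
    {"USUBJID": "S1-002", "AESEQ": 1},  # same number, different subject: fine
]
duplicates = duplicate_seq(ae, "AESEQ")
```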
No Duplicate Records (Natural Key) (natural_key)
Each domain record must be unique on its natural key (STUDYID + USUBJID + domain-specific identifying variables).
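Since the identifying variables differ by domain, a sketch of this check can take the key as a parameter and group row positions by key value. The AE key used below (AETERM plus AESTDTC) is only an assumed example, not a definition taken from this rule set, and `duplicate_groups` is a hypothetical name.

```python
from collections import defaultdict


def duplicate_groups(rows, key_vars):
    """Return lists of row indexes that share the same natural-key values."""
    groups = defaultdict(list)
    for i, row in enumerate(rows):
        groups[tuple(row.get(v) for v in key_vars)].append(i)
    return [indexes for indexes in groups.values() if len(indexes) > 1]


# Assumed natural key for AE, for illustration only.
AE_KEY = ("STUDYID", "USUBJID", "AETERM", "AESTDTC")
ae = [
    {"STUDYID": "STUDY1", "USUBJID": "S1-001", "AETERM": "HEADACHE", "AESTDTC": "2023-05-14"},
    {"STUDYID": "STUDY1", "USUBJID": "S1-001", "AETERM": "NAUSEA", "AESTDTC": "2023-05-14"},
    {"STUDYID": "STUDY1", "USUBJID": "S1-001", "AETERM": "HEADACHE", "AESTDTC": "2023-05-14"},
]
groups = duplicate_groups(ae, AE_KEY)
```

Returning row positions rather than key values makes it easier to show a reviewer exactly which records collide.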