
CDISC

CDISC (Clinical Data Interchange Standards Consortium) has spent the
past decade developing and establishing standards that facilitate the
collection, exchange, reporting and submission of clinical data, along
with the underlying standard terminology.

Why CDISC?
The Submissions Data Standards team of CDISC prepared the Submission
Data Standards (SDS). SDS is intended to guide the organization,
structure and format of standard clinical trial tabulation datasets
submitted to a regulatory authority such as the US Food and Drug
Administration (FDA). The main use of CDISC is portability of data.

This can be done by establishing data standards and applying them to
all the projects.

CONTENT OF DATASETS:

While conducting a clinical trial, clinicians collect all kinds of
patient data. Section K of the FDA's Guidance for NDAs describes
criteria for CRTs and, in particular, outlines the contents of datasets
submitted as part of the CRTs. In general, the following data (grouped
by what were formerly known as CRF domains) should be provided as
individual datasets.

Demographics
Inclusion criteria
Exclusion criteria
Concomitant medication
Medical history
Drug exposure
Disposition
Efficacy results
Human pharmacology and bioavailability/bioequivalence data
Microbiology data
Adverse Events
Lab chemistry
Lab hematology
Lab urinalysis
ECG
Vital signs
Physical examination

BENEFITS OF CDISC:

CDISC standards can significantly improve processes, thus saving time
and cost
Increase data quality
Enable data integration, enhancing re-usability in knowledge
warehouses to improve science, marketing and safety surveillance
Streamline data interchange among partners
Facilitate review of regulatory submissions
Enable integration of data from disparate tools/technologies

CDISC CAPABILITIES:
The SAS CDISC procedure performs the following checks on the domain
content of the source:
1. It verifies that all required variables are present in the
dataset.
2. It reports an error for any variables in the dataset that are not
defined in the domain.
3. It reports a warning for any expected domain variables that
are not in the dataset.
4. It notes any permitted domain variables that are not in the
dataset.
5. It verifies that all domain variables are of the expected data
type and proper length.
6. It detects any domain variables that are assigned a controlled
terminology specification by the domain and do not have a
format assigned to them.
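
The six structure checks above can be sketched in a few lines of
Python. This is only an illustration of the logic, not how the SAS
procedure is implemented. The VarSpec class, the trimmed DM_SPEC table,
the lengths in it and the severity chosen for checks 1, 5 and 6 are all
assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VarSpec:
    core: str               # "Req", "Exp" or "Perm"
    dtype: str              # "Char" or "Num"
    max_length: int
    codelist: bool = False  # True if controlled terminology applies


# Hypothetical, heavily trimmed specification for a demographics-like domain.
DM_SPEC = {
    "STUDYID": VarSpec("Req", "Char", 20),
    "USUBJID": VarSpec("Req", "Char", 40),
    "SEX": VarSpec("Req", "Char", 2, codelist=True),
    "AGE": VarSpec("Exp", "Num", 8),
    "ARM": VarSpec("Perm", "Char", 40),
}

# Severity used when a variable of a given core category is absent.
MISSING_LEVEL = {"Req": "ERROR", "Exp": "WARNING", "Perm": "NOTE"}


def check_structure(spec, dataset_vars):
    """Compare dataset variable attributes against a domain specification.

    dataset_vars maps a variable name to a dict with the keys
    "dtype", "length" and "format".
    """
    findings = []
    for name, var in spec.items():
        if name not in dataset_vars:
            # Checks 1, 3 and 4: absent required/expected/permitted variables.
            findings.append((MISSING_LEVEL[var.core], name, "not in dataset"))
            continue
        attrs = dataset_vars[name]
        # Check 5: data type and length.
        if attrs["dtype"] != var.dtype:
            findings.append(("ERROR", name, "unexpected data type"))
        if attrs["length"] > var.max_length:
            findings.append(("ERROR", name, "length exceeds specification"))
        # Check 6: controlled terminology without an assigned format.
        if var.codelist and not attrs.get("format"):
            findings.append(
                ("WARNING", name, "controlled terminology without a format")
            )
    # Check 2: variables that the domain does not define.
    for name in dataset_vars:
        if name not in spec:
            findings.append(("ERROR", name, "not defined in the domain"))
    return findings


example = {
    "STUDYID": {"dtype": "Char", "length": 12, "format": None},
    "USUBJID": {"dtype": "Char", "length": 30, "format": None},
    "SEX": {"dtype": "Char", "length": 1, "format": None},
    "HEIGHT": {"dtype": "Num", "length": 8, "format": None},
}
findings = check_structure(DM_SPEC, example)
for level, name, message in findings:
    print(f"{level}: {name} {message}")
```

With this example input the sketch reports a warning for SEX (no
format), a warning for the absent expected variable AGE, a note for the
absent permitted variable ARM and an error for the undefined variable
HEIGHT.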
CDISC DATA MODELS:
CDISC is developing standard data models to support the electronic
acquisition, exchange, submission and archiving of clinical trial data
and metadata (data about data).
The main CDISC models are as follows:
Operational Data Model (ODM) - Developed to support the
acquisition, interchange and archiving of operational data.
Study Data Tabulation Model (SDTM) - Standard metadata models
being developed to support the data flow from the operational
database to regulatory submission.
LAB Model - Covers the exchange of laboratory information.
ADaM Model - Guidelines for the creation of analysis datasets to ease
the FDA statistical review process.
SEND Model - Standard for exchange of non-clinical data.
CDASH Model - Clinical Data Acquisition Standards Harmonization.

1. ODM (OPERATIONAL DATA MODEL):


The model represents study metadata, data, and administrative
data associated with a clinical trial. The implications of data
transmission using this standard will be discussed, including how to
display data in ODM, how to validate imported and exported data, and
how to otherwise assess conformance of the data with the model.
The technical focus in the development of ODM was the definition of
structures to represent the three major information components
relating to a clinical trial:
clinical study metadata (item definitions and protocol)
clinical study administrative data (users and access privileges)
clinical study data (complete record of patient data and audit trail)
This included representation of metadata capable of supporting either
direct electronic or paper-based data collection, and transfer of
clinical data from one system to another. The new version of ODM adds
support that will greatly increase the likelihood of adoption by vendors
of clinical data management systems and other sponsors, including:
Ability to address changes to key data values
Expanded transaction support for partial or incremental transfers
Expanded metadata descriptions of more complex event
structures
Support for including multiple studies and reusable metadata in
one file
Support for depicting non-patient reference data
Support for vendor extensibility
Increased compatibility with the CDISC Submissions Data Model
Increased support for archiving of clinical data and metadata
Elimination of support for the flat representation of clinical data
included in latest version
Changes to the order of elements
Changes to element names and attributes, including the ItemData
element
Corrections to numerous bugs, including those related to locales,
time zones and signatures
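
To make the three information components more concrete, the following
Python sketch reads a tiny ODM-style XML document and pulls out item
definitions (metadata), users (administrative data) and item values
(clinical data). The sample document is hypothetical and heavily
trimmed, so it would not pass schema validation, and the v1.3 namespace
is an assumption about which ODM version is in use.

```python
import xml.etree.ElementTree as ET

# Assumed namespace (ODM v1.3); other ODM versions use a different URI.
ODM_NS = {"odm": "http://www.cdisc.org/ns/odm/v1.3"}

# Hypothetical, trimmed sample. A real ODM file carries more attributes.
SAMPLE = """<ODM xmlns="http://www.cdisc.org/ns/odm/v1.3" FileOID="F.1">
  <Study OID="ST.1">
    <MetaDataVersion OID="MDV.1" Name="v1">
      <ItemDef OID="IT.SEX" Name="SEX" DataType="text"/>
      <ItemDef OID="IT.AGE" Name="AGE" DataType="integer"/>
    </MetaDataVersion>
  </Study>
  <AdminData>
    <User OID="USR.1"/>
  </AdminData>
  <ClinicalData StudyOID="ST.1" MetaDataVersionOID="MDV.1">
    <SubjectData SubjectKey="001">
      <StudyEventData StudyEventOID="SE.1">
        <FormData FormOID="FM.1">
          <ItemGroupData ItemGroupOID="IG.1">
            <ItemData ItemOID="IT.SEX" Value="F"/>
            <ItemData ItemOID="IT.AGE" Value="34"/>
          </ItemGroupData>
        </FormData>
      </StudyEventData>
    </SubjectData>
  </ClinicalData>
</ODM>"""


def summarize(odm_xml):
    """Return (item names by OID, user OIDs, clinical records)."""
    root = ET.fromstring(odm_xml)
    # Metadata: map each item OID to its name.
    items = {
        d.get("OID"): d.get("Name")
        for d in root.iterfind(".//odm:ItemDef", ODM_NS)
    }
    # Administrative data: list the user OIDs.
    users = [u.get("OID") for u in root.iterfind("odm:AdminData/odm:User", ODM_NS)]
    # Clinical data: one (subject, item name, value) tuple per ItemData.
    records = []
    for subject in root.iterfind("odm:ClinicalData/odm:SubjectData", ODM_NS):
        for item in subject.iterfind(".//odm:ItemData", ODM_NS):
            name = items.get(item.get("ItemOID"))
            records.append((subject.get("SubjectKey"), name, item.get("Value")))
    return items, users, records


items, users, records = summarize(SAMPLE)
print(records)
```

The clinical data only carries OIDs, so the item names have to be
looked up in the metadata. That separation is what allows metadata to
be reused across files and studies.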

2. SDTM (STUDY DATA TABULATION MODEL):
The CDISC Study Data Tabulation Model (SDTM) defines the standard
format for tabulation data. Generally, there are three approaches for
implementing SDTM within the pharmaceutical industry: pure SDTM,
submission-only, and database-only. The pure SDTM approach means
all study data will be SDTM-compliant, starting from data capture and
ending with data analysis and submission. The submission-only
approach leaves the current practice alone and creates the
SDTM-compliant datasets for submissions. The database-only approach feeds
the study data into a clinical database, which is SDTM-compliant.

Each SDTM domain variable is classified as required, expected or
permissible:
REQUIRED: These variables are necessary for the proper functioning
of standard software tools used by reviewers. They must be included in
the data set structure and should not have a missing value for any
observation.
EXPECTED: These variables form the fundamental core of information
within a domain. They must be included in the data set structure;
however, it is permissible to have missing values.
PERMISSIBLE: These variables are not a required part of the domain,
and they should not be included in the data set structure if the
information they were designed to contain was not collected.

USES OF SAS IN CDISC:


A key reason for the power of pairing SAS with Excel is the flexibility
that SAS provides for exchanging data with Excel. The SAS Excel libname
engine allows you to read and write Excel worksheets as though
they were SAS data sets. The IMPORT and EXPORT procedures allow
you to exchange data for an entire data set as a stand-alone process or
from within a SAS program. Dynamic data exchange (DDE) allows you
to define a DDE triplet that defines a range of cells in Excel to be
treated as a flat file in SAS. The SAS Add-In for Microsoft Office allows
you to use SAS as a powerful data access, manipulation and analysis
back end for Excel applications. SAS also provides the XML libname
engine to facilitate reading and writing XML files. In version 9, SAS
added ODM native mode support (xmltype = CDISCODM) to the XML
engine. The SAS CDISC procedure currently provides read and write
capability for ODM, and content and structure validation for SDTM.
ADVANTAGES OF SDTM:
Your raw database is equivalent to your SDTM, which provides
the most elegant solution.
The SDTM model defines data structure and is not dependent on an
individual vendor's system.
Your clinical data management staff will be able to converse
with end-users/sponsors about the data easily, since your clinical
data manager and the end-user/sponsor will both be looking at SDTM
datasets.
As soon as the CDMS database is built, the SDTM datasets are
available.
DISADVANTAGES:
This approach may be cost prohibitive. Forcing the CDMS to
create the SDTM structures may simply be too cumbersome to do
efficiently.
Forcing the CDMS to adapt to the SDTM may cause problems
with the operation of the CDMS which could reduce data quality.
CRT (CASE REPORT TABULATIONS):
CRTs are made up of datasets and the accompanying
documentation for the datasets. This paper details the development of
CRTs for electronic submission to the FDA.

SAS code to generate data definition tables using the Output
Delivery System (ODS) in SAS version 7 or higher
Conversion of the data definition tables to Adobe Portable
Document Format (PDF)

COMPONENTS OF CRT:
DEFINE DOCUMENT: The Define document is the central, top-level
document that allows easy navigation to the various components.
The document contains metadata on datasets and variables. It
contains bookmarks and links to each dataset. Currently, the expected
format is a PDF document, but eventually the requirement will be XML.
SAS TRANSPORT FILES: The CRT contains data for the study
stored in SAS transport files. Currently, the FDA requires data to be in
SAS XPORT Transport Format. This is an open format, meaning that data
can be translated between the XPORT transport format and other commonly
used formats without the use of programs from any specific vendor.
ADDENDUM REPORTS: The addendum document contains descriptions of
complex algorithms and detailed explanations that would be too long
for the metadata table.
BLANK CRF: The blankcrf is a PDF file containing the full set of
CRFs, annotated with the variable names for each CRF item that is
included in the tabulation datasets. CRF variables in the
variable-level metadata of the Define document link to the
appropriate page in this file.
PROGRAMS: The programs are the actual code that was written
to generate the analysis datasets.
DATA DEFINITION MODEL :
The data definition file describes the format and content of the
submitted datasets.

SPECIFICATIONS :
The specification for the data definitions for datasets provided
using the CDISC SDTM is included in the Case Report Tabulation Data
Definition Specification (define.xml) developed by the CDISC
define.xml Team.

3. ADaM (ANALYSIS DATA MODEL):


The CDISC Analysis Data Model (ADaM) defines a standard for analysis
datasets (ADs) to be submitted to the regulatory agency. Analysis datasets are
datasets created to support results presented in study reports, ISS, ISE
and other analyses that enable a thorough regulatory review. Analysis
datasets contain both raw and derived data. The underlying principle of
these models is to provide clear and unambiguous communication of
the content, source, and quality of the datasets submitted in support of
the statistical analysis performed by the sponsor.

In ADaM, the descriptions of the ADs build on the nomenclature
of the SDTM with the addition of attributes, variables and data
structures needed for statistical analyses. Achieving the principle of
clear and unambiguous communication relies on clear AD
documentation. This documentation provides the link between the
general description of the analysis found in the protocol or statistical
analysis plan and the source data. Of high importance is the clear
description of the source(s) of data used as input to the ADs. These
descriptions allow the reviewer to trace the derived data items back to
their source. Documentation detailing the AD metadata, analysis
variable metadata, and the analysis value-level metadata is
recognized for its importance. ADaM also defines analysis-level
metadata, which describes the major attributes of each important
analysis result that is presented in the study report. The purpose of this
analysis-level metadata is to allow the reviewer to link from the
statistical results to the metadata describing the analysis, the reason
for the analysis, and the datasets and programs used to generate the
results.

4. SEND (STANDARD FOR EXCHANGE OF NON-CLINICAL DATA):

The Standard for Exchange of Nonclinical Data (SEND) Models
have been prepared by the SEND Team to guide the organization,
structure, and format for non-clinical data from animal toxicology
studies submitted to the FDA. The FDA is developing a repository for all
sponsor data from a product submission, and a suite of standard
review tools to access, manipulate, and view the toxicology data. SEND
is intended to facilitate transfer of nonclinical data from sponsor to the
FDA and subsequent loading into the FDA repository.

5. LAB (LABORATORY DATA ANALYSIS):


The CDISC laboratory data Content Model has a main core
designed to handle simple laboratory data with the classic one test,
one result data structure. Extensions will be added to the core model
to handle more complex laboratory data such as Microbiology and
Pharmacogenomics.
The core model is separated into 10 logical levels as follows:
Good Transmission Practice
Study
Site
Subject
Visit
Accession
Container
Panel
Test
Result
These levels were chosen because they follow the recognizable
hierarchy of clinical laboratory data. The hierarchical nature of the
model levels will significantly reduce data redundancies when
hierarchical implementations of the model are finalized.
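
The redundancy argument can be illustrated with a small Python sketch
that folds flat, one-row-per-result records into a nested structure
following the levels listed above. The lower-case field names and the
sample values are hypothetical, and the Good Transmission Practice
level is left out on the assumption that it describes the transmission
as a whole rather than individual rows.

```python
# Levels from Study down to Test, in hierarchical order (assumed field names).
LEVELS = ["study", "site", "subject", "visit",
          "accession", "container", "panel", "test"]


def nest(rows):
    """Fold flat result rows into nested dicts keyed level by level."""
    tree = {}
    for row in rows:
        node = tree
        for level in LEVELS:
            # Each higher-level value is stored once and shared by all
            # results beneath it.
            node = node.setdefault(row[level], {})
        node.setdefault("results", []).append(row["result"])
    return tree


common = {"study": "S1", "site": "01", "subject": "001", "visit": "V1",
          "accession": "A100", "container": "C1", "panel": "CHEM"}
rows = [
    {**common, "test": "ALT", "result": "32"},
    {**common, "test": "AST", "result": "28"},
]
tree = nest(rows)
print(tree["S1"]["01"]["001"]["V1"]["A100"]["C1"]["CHEM"])
```

In the flat form every row repeats the study, site, subject and visit
values. In the nested form they appear once, with only the test and
result varying underneath.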
6. CDASH (CLINICAL DATA ACQUISITION STANDARDS HARMONIZATION):

This document is intended to be used by those functions
involved in the planning, collection, management and analysis of
clinical trials and clinical data. The CDASH project identifies the basic
data collection fields needed from a clinical, scientific and regulatory
perspective to enable more efficient data collection at the investigative
sites. CDASH moves upstream in the data-flow and identifies a basic
set of highly recommended and recommended/conditional data
collection fields that are expected to be present on the majority of
CRFs. The CDASH data collection fields (or variables) can be mapped to
the SDTM structure.
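
As a rough illustration of what mapping collection fields to the SDTM
structure can look like, the Python sketch below renames two collected
fields and converts a collected date to ISO 8601. The FIELD_MAP table,
the day-month-year collection format and the sample record are
assumptions made for the example, not part of either standard's text
as quoted here.

```python
from datetime import datetime

# Illustrative mapping from collected field name to tabulation variable.
FIELD_MAP = {"BRTHDAT": "BRTHDTC", "SEX": "SEX"}
# Collected fields that hold dates and need conversion to ISO 8601.
DATE_FIELDS = {"BRTHDAT"}


def to_sdtm(crf_record, collected_date_format="%d-%m-%Y"):
    """Map one collected CRF record to tabulation variable names."""
    out = {}
    for field, value in crf_record.items():
        target = FIELD_MAP.get(field)
        if target is None:
            continue  # field has no mapping in this sketch, so it is skipped
        if field in DATE_FIELDS:
            parsed = datetime.strptime(value, collected_date_format)
            value = parsed.date().isoformat()
        out[target] = value
    return out


mapped = to_sdtm({"BRTHDAT": "05-03-1980", "SEX": "F", "INITIALS": "AB"})
print(mapped)
```

Keeping the mapping in a table rather than in code makes it easier to
review against the CRF annotations.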

Why is the adoption of the CDISC standards important for the life
sciences industry, and what is SAS' relationship with CDISC?
The Clinical Data Interchange Standards Consortium (CDISC) is
fundamentally changing the way information is being managed in the
life sciences research industries, and is beginning to change the way
the US Food and Drug Administration (FDA) reviews new drug
applications.
Historically, most pharmaceutical organizations developed their
own internal data standards, and associated processing, for managing
research information. Each organization's unique approach to this
information management process has done little to expedite the
overall new drug development process, and in many ways has slowed it
down by requiring each organization to spend unnecessary effort
acquiring, developing and implementing distinct processes.
Furthermore, the submission of research data in nonstandardized
formats makes it difficult for the FDA to develop standard, efficient
processes for new drug application reviews. In the case of urgent
safety reviews when questions arise about a class of drugs (Vioxx,
Celebrex, etc.), the integration of data from multiple research
programs has proven to be a huge obstacle in analyzing any
class-related effects.
These life sciences research companies have since recognized
that they will distinguish themselves by the new science their
organizations provide, and not by their proprietary information
handling methodologies. So began CDISC. CDISC is a collection of
research companies, services organizations and technology providers
that have banded together to develop, distribute and implement a
series of data standards to promote efficiencies and process
improvements in the life sciences industries.
CDISC received major validation in July 2004 when the FDA
announced the adoption of its Study Data Tabulation Model (SDTM) as
the standard for submitting clinical trials data. The acceptance of
CDISC's SDTM-based model means that now the FDA has a common data
structure from which to review new drug applications. Common standards
lead to common tools. Common tools lead to common processes, which
ultimately result in faster and more accurate decisions regarding the
approvals of new drug therapies. Although the FDA does not yet
require all data to be submitted using the SDTM standard, its adoption
of the standard is a key action that life sciences organizations will not
ignore. In fact, Pfizer submitted the first SDTM data to the FDA in late
2004.
SAS has been the standard for data transformation and analysis
in the life sciences research industries for many years, and the
required FDA data submission standard since 1999. That year, the FDA
adopted the SAS transport format as the standard for delivering
electronic data for FDA submissions.
While the move to a defined industry standard has been a slow
process, CDISC is making significant strides in meeting this objective.
As more organizations adopt the CDISC standards, processes will
become more efficient and time to market (and cure) will be reduced.

PROC CDISC:
This procedure also performs the following checks on the domain data
content of the source on a per-observation basis:
1. It verifies that all required variable fields do not contain
missing values.
2. It detects occurrences of expected variable fields that contain
missing values.
3. It checks the conformance of all values assigned an ISO 8601
specification, including date, time, datetime, duration and
interval types.
4. It notes the correctness of YES, NO or NULL responses in the
domain datasets.
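
The per-observation checks can be sketched in Python as follows. This
is an illustration of the logic only, not the procedure's
implementation. The regular expression is a simplified assumption that
accepts ISO 8601 dates and datetimes (including partial dates) but not
durations or intervals, and does not range-check month or day. The
choice of "Y" and "N" as the coded responses, the severity levels, and
the variable classification in the example call are also assumptions.

```python
import re

# Simplified pattern: YYYY, YYYY-MM, YYYY-MM-DD, optionally with Thh:mm[:ss].
ISO_DATE = re.compile(r"^\d{4}(-\d{2}(-\d{2}(T\d{2}:\d{2}(:\d{2})?)?)?)?$")
MISSING = (None, "")


def check_observations(rows, required, expected, iso_fields, yes_no_fields):
    """Return (row number, level, variable, message) for each finding."""
    findings = []
    for number, row in enumerate(rows, start=1):
        for var in required:      # check 1: required values present
            if row.get(var) in MISSING:
                findings.append((number, "ERROR", var, "required value is missing"))
        for var in expected:      # check 2: expected values that are missing
            if row.get(var) in MISSING:
                findings.append((number, "WARNING", var, "expected value is missing"))
        for var in iso_fields:    # check 3: ISO 8601 conformance
            value = row.get(var)
            if value and not ISO_DATE.match(value):
                findings.append((number, "ERROR", var, "not ISO 8601"))
        for var in yes_no_fields:  # check 4: Y, N or null responses
            if row.get(var) not in ("Y", "N") + MISSING:
                findings.append((number, "ERROR", var, "not Y, N or null"))
    return findings


rows = [
    {"USUBJID": "S1-001", "AETERM": "HEADACHE", "AESTDTC": "2004-07-21",
     "AESER": "N", "AESEV": "MILD"},
    {"USUBJID": "", "AETERM": "NAUSEA", "AESTDTC": "21JUL2004",
     "AESER": "MAYBE", "AESEV": ""},
]
# Hypothetical classification of the variables, for illustration only.
report = check_observations(
    rows,
    required=["USUBJID", "AETERM"],
    expected=["AESEV"],
    iso_fields=["AESTDTC"],
    yes_no_fields=["AESER"],
)
for finding in report:
    print(finding)
```

The first row passes every check. The second row produces four
findings: a missing required USUBJID, a missing expected AESEV, a
non-ISO start date and an invalid AESER response.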
