What are data standards and why do we need them?
Data standards, as the name suggests, are agreed-upon guidelines for recording and storing data elements or datasets. Adhering to data standards makes sharing and exchanging data easy and streamlined. Let's look at an example to understand their value. Here is a hypothetical case where three hospital systems follow three different "standards" for storing the 'Gender' of a patient.
- Hospital A: stores gender as an integer binary value {0,1} where 0 is for 'Male' and 1 is for 'Female'
- Hospital B: stores gender as an integer binary value {0,1} however in this case 0 is for 'Female' and 1 is for 'Male'
- Hospital C: stores gender as a single character {M,F,U} ('U' being for Unknown)

Now, imagine you are an IT company that has been tasked with developing a claim adjudication engine that is compatible with all three hospital systems above. It is not impossible to build such a system, but the lack of data standards makes the task challenging. It is also a potential source of errors when combining data across multiple systems. For example, if the developer assumes that the 0/1 representation used by Hospital B is the same as that of Hospital A, every gender value coming from Hospital B will be silently inverted, and the error will propagate to everything built on the combined data.
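To make the problem concrete, here is a minimal sketch of the translation layer such an engine would need in the absence of a shared standard. The hospital labels and code values come from the hypothetical example above; the `Gender` enum, the `SOURCE_MAPPINGS` table, and the `normalize_gender` function are names invented for this illustration, not part of any real system.

```python
from enum import Enum


class Gender(Enum):
    """Shared internal representation (an assumption for this sketch)."""
    MALE = "M"
    FEMALE = "F"
    UNKNOWN = "U"


# One lookup table per source system, mirroring the three hospitals above.
SOURCE_MAPPINGS = {
    "A": {0: Gender.MALE, 1: Gender.FEMALE},
    "B": {0: Gender.FEMALE, 1: Gender.MALE},  # same codes, opposite meaning
    "C": {"M": Gender.MALE, "F": Gender.FEMALE, "U": Gender.UNKNOWN},
}


def normalize_gender(hospital, raw_value):
    """Translate a hospital-specific gender code into the shared Gender enum."""
    if hospital not in SOURCE_MAPPINGS:
        raise ValueError(f"No gender mapping defined for hospital {hospital!r}")
    # Unrecognized codes become UNKNOWN instead of being guessed.
    return SOURCE_MAPPINGS[hospital].get(raw_value, Gender.UNKNOWN)


# The same raw value 0 means different things depending on the source.
records = [("A", 0), ("B", 0), ("C", "F"), ("C", "U")]
normalized = [normalize_gender(hospital, value) for hospital, value in records]
```

Every new source system needs its own table, and a single wrong entry corrupts the data without raising any error. A shared standard removes the need for these per-source tables.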
Key data standards in health informatics
There are six key data standards in healthcare:
1. Systematized Nomenclature of Medicine - Clinical Terms (SNOMED CT)
2. International Classification of Diseases (ICD-10)
3. Current Procedural Terminology (CPT)
4. Logical Observation Identifiers Names and Codes (LOINC)
5. National Drug Code (NDC)
6. RxNorm

Briefly, SNOMED CT is the most comprehensive, multilingual clinical healthcare terminology in the world. ICD and CPT are widely used for medical billing. LOINC provides identifiers and names for laboratory tests and other clinical observations. NDC is a US-specific standard for medications maintained by the US Food and Drug Administration (FDA). RxNorm is a US-specific terminology in medicine that contains all medications available on the US market.

Deep-dive into SNOMED CT [Systematized Nomenclature of Medicine - Clinical Terms]
The development of SNOMED Clinical Terms traces its roots to a project begun in the 1960s at the National Institutes of Health (NIH) to use natural language processing (NLP) to machine-code pathologists' free-text dictated notes. SNOMED CT is the most comprehensive, multilingual clinical healthcare terminology in the world. It is essentially an ontology: a set of clinical concepts together with the relationships among them. SNOMED CT has 350,000+ concepts with over 1.3 million relationships (a relationship is an association between a source concept and a destination concept). Given its scope, it is heavily referenced in FHIR Resources. Let's look at SNOMED CT through an example:
Link to SNOMED CT browser:
If we search for the term ‘hypertension’ in the SNOMED CT browser, we get 379 matches. The first match is ‘hypertension’, with a Fully Specified Name (FSN) of ‘Hypertensive disorder, systemic arterial (disorder)’.

Now, if we click on ‘Hypertension’, we see that the concept has a SNOMED CT identifier (SCTID) of 38341003. Below it, we see that SNOMED CT recognizes 14 synonyms for this disorder. These synonyms are useful when processing text notes that might contain one or more of these alternative terms. Further, we see the related parent and child concepts for ‘hypertension’ in the right panel, where the hierarchical nature of SNOMED CT is quite visible. Hypertension is a child of ‘Disorder of cardiovascular system (disorder)’. Hypertension, in turn, has 25 more specific sub-disorders, called children. Such a parent-child hierarchy can be useful for grouping patients for analysis.
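As a rough illustration of how the parent-child hierarchy can group patients, the sketch below collects every concept beneath hypertension and selects the patients coded with any of them. Only the SCTID 38341003 comes from the text above. The other identifiers, the toy `IS_A` list, the `descendants` helper, and the patient records are hypothetical placeholders rather than real SNOMED CT content.

```python
from collections import defaultdict

HYPERTENSION = "38341003"  # SCTID quoted in the text above

# (child, parent) "is a" pairs. Every identifier other than 38341003 is a
# hypothetical placeholder, not a real SCTID.
IS_A = [
    (HYPERTENSION, "PARENT-CARDIO"),  # stands in for the cardiovascular parent
    ("CHILD-1", HYPERTENSION),
    ("CHILD-2", HYPERTENSION),
    ("GRANDCHILD-1", "CHILD-1"),
]

# Index the pairs so each parent maps to its direct children.
children = defaultdict(set)
for child, parent in IS_A:
    children[parent].add(child)


def descendants(concept):
    """Return every concept below `concept` in the is-a hierarchy."""
    found = set()
    stack = [concept]
    while stack:
        current = stack.pop()
        for child in children.get(current, ()):
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


# Hypothetical problem list: patient id -> coded diagnosis.
diagnoses = {
    "p1": "CHILD-1",
    "p2": "GRANDCHILD-1",
    "p3": "PARENT-CARDIO",
    "p4": HYPERTENSION,
}

# A patient belongs to the cohort if coded with hypertension or anything below it.
cohort_codes = descendants(HYPERTENSION) | {HYPERTENSION}
hypertensive_patients = sorted(
    patient for patient, code in diagnoses.items() if code in cohort_codes
)
```

Here `p3` is excluded because the broader cardiovascular parent does not imply hypertension, while `p2` is included even though its code sits two levels below the concept used to define the cohort.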

The browser also provides a diagrammatic hierarchical view (as shown below). It shows how SNOMED CT makes important clinical relationships explicit and computable. For example, using this hierarchy we can build a computational rule that hypertension is associated with increased blood pressure.
