New Data Element: Social Determinants of Health

KellieMarie edited this page Feb 16, 2022 · 6 revisions

In service of our goal to provide the richest possible dataset to support COVID-19 and long-COVID research, we are seeking sites willing to add new data elements to their common data models (and therefore N3C payloads). It is unlikely that you are supporting any of the below data elements already (though let us know if you are!); for this reason, we are providing guidance for extracting and mapping data elements and templates that you can use to store these data in your local CDMs, which will then flow through to your N3C payloads.

So that we can process everyone’s payloads consistently, we ask that you use these design templates if you are going to provide these data elements.

Have questions? Need to discuss how to apply this guidance at your site? Please reach out via the PASC Slack in the #data-enhancements channel.

Overview

What are Social Determinants of Health?

Social determinants of health (SDOH) encompass a variety of non-medical factors that influence health outcomes.1

Why is this important?

Social determinants of health (SDOH), such as housing security, food security, and access to transportation, can impact an individual's susceptibility to disease and overall health outcomes. Patient-level SDOH data will allow us to explore these connections in the context of COVID-19 and long COVID.

What sites should implement this?

Sites that routinely collect SDOH information from patients. Often this will be collected in an “SDOH module” and it may be EHR vendor-based or homegrown.

Scope and Description

  • SDOH-related questions and responses from patients collected in a standardized manner (often in an SDOH module).
  • Though there are many SDOH, we have focused on six for inclusion: financial resource strain, food insecurity, social connectedness, stress, transportation, and housing. Sites should submit data in all areas for which they have source data available.
  • Please retain raw/source values for SDOH questions and answers in the appropriate fields in your CDM. There may be a future need to include these data in the extract. Additional guidance will be provided if this happens.
  • This SDOH data enhancement does not include area-level variables, social history variables (e.g., alcohol use, drug use), or demographic variables collected elsewhere (e.g., race, ethnicity).

Sourcing SDOH Data

Adding SDOH data to your N3C payload will likely require modifying your ETL. Data should be sourced from your site’s version of an SDOH module. In Epic, for example, these are stored in flowsheets. You should not attempt to extrapolate from diagnosis or CPT codes to generate SDOH information.

Mapping SDOH Data

To ensure usability in the N3C Enclave, SDOH variables must be mapped to standardized codes. Many SDOH collected in EHRs mirror standardized, well-established tools, which, in turn, often have LOINC codes assigned. Therefore, LOINC will be our initial standardized codeset for SDOH. Detailed guidance is below.

Shared Community Mappings

We have created a shared mappings spreadsheet for sites to reference and contribute to. At the time of publication, this spreadsheet is primarily driven by the Epic SDOH module. As you identify additional mappings for other EHRs or homegrown SDOH modules, please add your mappings to this list. Your additions will be important for downstream data harmonization.

We also ask that sites add to our list of Unmappable SDOH in this spreadsheet. This will allow us to identify areas that may require custom mappings. As we better understand the variation in SDOH across sites and limitations of mapping to LOINC, we will work with the community to develop custom, but consistent, codes if needed.

If you have questions about mapping SDOH or are struggling to find appropriate matches for many of your site SDOH, please reach out via Slack.

Mapping SDOH to LOINC

Source SDOH questions should be mapped to an exact or near exact match LOINC code.

Source SDOH responses should be mapped to LOINC Answer codes. The selected LOINC Answer must be a valid answer for the LOINC Question code. Values such as “Patient Refused”, “Not Asked”, and similar values may be mapped to null or the flavor of null appropriate for the CDM you are using, if there is no valid LOINC answer.

The most straightforward way to find valid LOINCs is to simply search for your local question on the LOINC website or Athena tool. When checking for valid LOINC answers, be sure you are looking at the full list of LOINC answers.

Example LOINC Question and Answer Mappings

First, find a LOINC code match for the question in your source data.

| Source Question Text | LOINC Code | LOINC Question Text | Acceptable Mapping? |
|---|---|---|---|
| How hard is it for you to pay for the very basics like food, housing, medical care, and heating | 76513-1 | How hard is it for you to pay for the very basics like food, housing, medical care, and heating | YES |
| Are you worried about losing your housing | 93033-9 | Are you worried about losing your housing | YES |
| In the last 12 months, was there a time when you did not have a steady place to sleep or slept in a shelter (including now) | 93669-0 | Are you homeless or worried that you might be in the future | NO |

In the last example, both texts focus on housing, but each asks the question in a different way. Therefore, they cannot be mapped to one another.
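As a rough illustration of the question-matching step, the sketch below looks up a source question in a small table of reviewed mappings after normalizing case, punctuation, and whitespace. The names `QUESTION_TO_LOINC`, `normalize`, and `lookup_question` are hypothetical, and the lookup contains only the two acceptable examples from the table above. It handles exact matches only; judging a "near exact" match still requires human review.

```python
import re
from typing import Optional

# Illustrative lookup built only from the acceptable examples in the table above.
# A real site would load its own reviewed mappings instead (assumption).
QUESTION_TO_LOINC = {
    "how hard is it for you to pay for the very basics like food housing medical care and heating": "76513-1",
    "are you worried about losing your housing": "93033-9",
}


def normalize(text: str) -> str:
    """Lowercase the text, drop punctuation, and collapse whitespace."""
    text = re.sub(r"[^a-z0-9\s]", "", text.lower())
    return " ".join(text.split())


def lookup_question(source_text: str) -> Optional[str]:
    """Return the LOINC code for a reviewed question, or None if it is unmapped."""
    return QUESTION_TO_LOINC.get(normalize(source_text))
```

A question that returns `None` here is a candidate for manual review and, if no suitable LOINC code exists, for the list of Unmappable SDOH in the shared spreadsheet.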

Now, let’s look at how you might map possible answers for “How hard is it for you to pay for the very basics like food, housing, medical care, and heating?”, which has a LOINC code of 76513-1.

| LOINC Code | Source Value | Mapped Value | Acceptable Mapping? |
|---|---|---|---|
| 76513-1 | Hard | LA14745-6 | YES |
| 76513-1 | Not Hard at All | LA31980-8 | YES |
| 76513-1 | Patient Refused | OT | YES |
| 76513-1 | Moderate | LA6751-7 | NO |

The fourth example is not an acceptable mapping, because LA6751-7 is not a valid LOINC answer for LOINC Code 76513-1.
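The rule that an answer code must be valid for its question can be checked mechanically in an ETL. The following is a minimal sketch under stated assumptions: `VALID_ANSWERS`, `NULL_FLAVORS`, and `map_answer` are hypothetical names, the answer set holds only the two codes the table above marks as acceptable (the full list lives in LOINC), and "OT" is the null flavor shown in the table, which may differ for your CDM.

```python
from typing import Optional

# Illustrative and deliberately incomplete: only the answer codes that the
# table above marks as acceptable for 76513-1.
VALID_ANSWERS = {
    "76513-1": {"LA14745-6", "LA31980-8"},
}

# Source values with no valid LOINC answer. "OT" follows the table above;
# substitute the null flavor appropriate for your CDM (assumption).
NULL_FLAVORS = {"patient refused": "OT"}


def map_answer(question_code: str, source_value: str, candidate: Optional[str]) -> str:
    """Return the value to store, rejecting answers that are invalid for the question."""
    key = source_value.strip().lower()
    if key in NULL_FLAVORS:
        return NULL_FLAVORS[key]
    if candidate in VALID_ANSWERS.get(question_code, set()):
        return candidate
    raise ValueError(
        f"{candidate!r} is not a known valid answer for LOINC question {question_code}"
    )
```

Raising an error instead of silently storing the candidate keeps a mapping like "Moderate" to LA6751-7 from reaching the payload unnoticed.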

Difficulty Finding Mappings?

It’s unlikely you will find appropriate LOINC codes for all of your SDOH data. That is okay and expected. Please add any unmappable SDOH you wish to include to the Unmappable tab of the shared mappings spreadsheet.

If the lack of an appropriate LOINC mapping prevents you from providing SDOH data in one of the six focus areas (financial resource strain, food insecurity, social connectedness, stress, transportation, and housing), please let us know via Slack and we will work with you to try to find a solution.

How should we structure the data?

We have separate instructions for each data model below to describe how to structure these data.

OMOP

OMOP sites will map records to LOINC and utilize the CONCEPT table to dictate the DOMAIN_ID for where to store this information. To provide this data element, adjust your ETL to include SDOH data as described above, and use those data to fill in an OBSERVATION or MEASUREMENT row. You can consult the OMOP CDM Wiki for specifications for each domain (https://ohdsi.github.io/CommonDataModel/cdm53.html#OBSERVATION and https://ohdsi.github.io/CommonDataModel/cdm53.html#MEASUREMENT). In the wiki, you will find information on conventions and requirements for each field.
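The routing described above, where the concept's DOMAIN_ID decides the destination table, might be sketched as follows. `CONCEPT_ROWS` is a hypothetical in-memory stand-in for rows queried from your CONCEPT table, and the concept codes in it are placeholders, not real vocabulary content; `target_table` is likewise an illustrative name.

```python
from typing import Optional

# Hypothetical stand-in for rows selected from the OMOP CONCEPT table.
# The concept codes are placeholders, not real LOINC codes.
CONCEPT_ROWS = [
    {"vocabulary_id": "LOINC", "concept_code": "EXAMPLE-1", "domain_id": "Observation"},
    {"vocabulary_id": "LOINC", "concept_code": "EXAMPLE-2", "domain_id": "Measurement"},
]

# Only the two domains named in the guidance above are routed.
DOMAIN_TO_TABLE = {"Observation": "OBSERVATION", "Measurement": "MEASUREMENT"}


def target_table(concept_code: str, concept_rows=CONCEPT_ROWS) -> Optional[str]:
    """Return the CDM table for a LOINC code, or None if it cannot be routed."""
    for row in concept_rows:
        if row["vocabulary_id"] == "LOINC" and row["concept_code"] == concept_code:
            return DOMAIN_TO_TABLE.get(row["domain_id"])
    return None
```

In practice the lookup would be a join against CONCEPT in your ETL; the dictionary here only shows the decision being made.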

PCORnet

PCORnet sites will use the OBS_GEN table to store this information. To provide this data element, adjust your ETL to include SDOH data as described above, and use those data to fill in an OBS_GEN row as shown below. (Fields not shown in the example are not required for N3C and may be null.) Note that these visits should also appear in your ENCOUNTER table.

The obsgen_result_text field is used instead of obsgen_result_qual so as not to interfere with PCORnet’s established value set for _qual fields. If using obsgen_result_text presents a problem for your site, please let us know.

OBS_GEN Template for SDOH

| obsgenid | patid | encounterid | obsgen_start_date | obsgen_type | obsgen_code | obsgen_result_text | obsgen_result_modifier | obsgen_source |
|---|---|---|---|---|---|---|---|---|
| {generated by your ETL} | {patient attached to the visit} | {visit id of the clinic visit} | {date of entry} | {vocabulary used, most often will be LOINC} | {standardized code for SDOH question, usually will be LOINC code} | {standardized answer, usually will be LOINC answer ID} | TX | {whichever option is appropriate for your site} |

OBS_GEN Example for SDOH

| obsgenid | patid | encounterid | obsgen_start_date | obsgen_type | obsgen_code | obsgen_result_text | obsgen_result_modifier | obsgen_source |
|---|---|---|---|---|---|---|---|---|
| 1 | P333 | E99 | 2020-06-03 | LC | 93033-9 | LA32-8 | TX | HC |
| 2 | P333 | E99 | 2020-06-03 | LC | 93030-5 | LA30133-5 | TX | HC |
| 3 | P333 | E99 | 2020-06-03 | LC | 88122-7 | OT | TX | HC |
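For sites scripting this step, here is a minimal sketch of assembling one OBS_GEN row from a mapped question and answer. The function name `build_obs_gen_row` is hypothetical, and the defaults "LC" and "HC" are simply copied from the example rows above; choose the obsgen_source value that is appropriate for your site.

```python
from datetime import date


def build_obs_gen_row(obsgenid, patid, encounterid, entry_date: date,
                      question_code: str, answer_code: str,
                      obsgen_type: str = "LC", obsgen_source: str = "HC") -> dict:
    """Assemble the OBS_GEN fields used for SDOH; other fields may stay null."""
    return {
        "obsgenid": obsgenid,                       # generated by your ETL
        "patid": patid,                             # patient attached to the visit
        "encounterid": encounterid,                 # must also exist in ENCOUNTER
        "obsgen_start_date": entry_date.isoformat(),
        "obsgen_type": obsgen_type,                 # vocabulary, most often LOINC ("LC")
        "obsgen_code": question_code,               # standardized question code
        "obsgen_result_text": answer_code,          # standardized answer or null flavor
        "obsgen_result_modifier": "TX",             # fixed value per the template
        "obsgen_source": obsgen_source,             # site-specific choice
    }


# Reproduces the first example row above.
row = build_obs_gen_row(1, "P333", "E99", date(2020, 6, 3), "93033-9", "LA32-8")
```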

i2b2/ACT

ACT sites will use the OBSERVATION_FACT table to record SDOH facts. The N3C SDOH facts will use standardized coding for both the question and answers (typically LOINC as described above). The code representing the question will be stored in the concept_cd column, and the answer will be stored as a standardized code, usually representing an enumerated type, in the valueflag_cd column. The valtype_cd will be 'T', indicating that the fact is an enumerated type or text. Remember that the modifier_cd must be set to '@' to enable query by value in i2b2. All other i2b2 ETL fact rules apply, including corresponding visit_dimension and patient_dimension row entries. (Fields not shown in the example are not required for N3C and may be null.)

OBSERVATION_FACT Template for SDOH Facts

| encounter_num | start_date | end_date | patient_num | concept_cd | modifier_cd | valtype_cd | valueflag_cd |
|---|---|---|---|---|---|---|---|
| {visit num} | {date of survey} | | {patient attached to the visit} | {question LOINC code} | @ | T | {answer LOINC code} |

OBSERVATION_FACT Example for SDOH Facts

| encounter_num | start_date | end_date | patient_num | concept_cd | modifier_cd | valtype_cd | valueflag_cd |
|---|---|---|---|---|---|---|---|
| 12345678 | 12-NOV-21 | | 3456789 | LOINC:93033-9 | @ | T | LOINC:LA32-8 |
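The same fact can be assembled in code. This is a sketch only: `build_observation_fact` is a hypothetical helper, the "LOINC:" prefix is taken from the example above, and leaving end_date null is an assumption based on the example appearing to carry a single date.

```python
def build_observation_fact(encounter_num, patient_num, start_date: str,
                           question_code: str, answer_code: str,
                           prefix: str = "LOINC:") -> dict:
    """Assemble the OBSERVATION_FACT columns used for an SDOH fact."""
    return {
        "encounter_num": encounter_num,
        "start_date": start_date,              # date of survey, in your site's format
        "end_date": None,                      # assumption: left null for SDOH facts
        "patient_num": patient_num,
        "concept_cd": prefix + question_code,  # question code
        "modifier_cd": "@",                    # required for query by value in i2b2
        "valtype_cd": "T",                     # enumerated type or text
        "valueflag_cd": prefix + answer_code,  # answer code
    }


# Reproduces the example fact above.
fact = build_observation_fact(12345678, 3456789, "12-NOV-21", "93033-9", "LA32-8")
```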

TriNetX

If you are populating your TriNetX data from one of the data models above, the best approach would be to use one of the above approaches in your upstream data, and then allow those data to flow through to TriNetX.

If you are populating TriNetX directly from your EHR or a custom data warehouse, please load SDOH data into the Lab Results table. The code representing the question should be stored in the Lab Observation Code field. The Result Type field should be set to “T” (which stands for “text”), and the answer code should be stored in the Lab Results Text Value field.
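A minimal sketch of the three SDOH-specific fields for a direct TriNetX load follows. The dictionary keys are the field labels from the paragraph above and may not match the column names in your TriNetX file specification; the helper name `build_trinetx_lab_result` is hypothetical, and patient, encounter, and date fields are omitted because this guidance does not specify them.

```python
def build_trinetx_lab_result(question_code: str, answer_code: str) -> dict:
    """Return the SDOH-specific Lab Results fields; keys are labels, not confirmed column names."""
    return {
        "Lab Observation Code": question_code,  # code representing the question
        "Result Type": "T",                     # "T" stands for text
        "Lab Results Text Value": answer_code,  # answer code
    }


record = build_trinetx_lab_result("93033-9", "LA32-8")
```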

If you have any questions, please reach out to {n3c at trinetx dot com}. It is also helpful if you let the N3C team know via Slack if and when you plan to add this enriched data, so that we know to look out for it.