Fix matching GCS with previous GCS #1248

Merged (1 commit) on Jul 7, 2022

@prockenschaub (Contributor) commented Feb 10, 2022

Problem

I believe there is a minor error in the GCS concept. When looking for previous GCS measurements, the query currently joins to the previous row (b.rn = b2.rn+1) but then requires that previous row to have been recorded more than 6 hours after the current row. Since rows are ordered by time, this is impossible, so no previous measurements are ever found (this can be verified by running the adapted reproducible example below).
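The contradiction can also be illustrated outside the database. The following is a minimal Python sketch using made-up chart times for a single stay; the function name `count_matches` and the data are illustrative assumptions and are not part of the repository. It pairs each row with its immediate predecessor, as b.rn = b2.rn+1 does, and then applies either time condition.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(hours=6)

def count_matches(charttimes, use_sub):
    """Count rows whose immediate predecessor passes the time condition.

    charttimes: chart times for one stay (hypothetical data).
    use_sub=False mimics  prev > DATETIME_ADD(cur, 6 hours)  (current code);
    use_sub=True  mimics  prev > DATETIME_SUB(cur, 6 hours)  (proposed fix).
    """
    times = sorted(charttimes)  # same order as ROW_NUMBER ... ORDER BY charttime
    matched = 0
    for prev, cur in zip(times, times[1:]):  # consecutive rows: rn = prev_rn + 1
        bound = cur - WINDOW if use_sub else cur + WINDOW
        if prev > bound:
            matched += 1
    return matched

start = datetime(2022, 2, 10, 8, 0)
# Illustrative chart times: 08:00, 10:00, 13:00 and 22:00
times = [start + timedelta(hours=h) for h in (0, 2, 5, 14)]

buggy = count_matches(times, use_sub=False)
fixed = count_matches(times, use_sub=True)
print(buggy, fixed)  # 0 2
```

With the addition, an earlier row can never be later than the current row plus 6 hours, so the count is always zero. With the subtraction, the two predecessors recorded within 6 hours are matched and the 9-hour gap is not.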

Solution

Change DATETIME_ADD to DATETIME_SUB in order to obtain the intended behaviour.

Additional question

While looking at GCS, I was also wondering why GCS is replaced with 15 when the patient is intubated and verbal is missing. This does not seem an appropriate imputation and also, contrary to the assertion in the notes, does not seem in line with how the data is meant to be collected, e.g., according to the SAPS II publication.

Maybe the code's creator could comment on this choice?

Reprex

Current behaviour

with base as
(
  select
    subject_id
  , ce.stay_id, ce.charttime
  -- pivot each value into its own column
  , max(case when ce.ITEMID = 223901 then ce.valuenum else null end) as GCSMotor
  , max(case
      when ce.ITEMID = 223900 and ce.VALUE = 'No Response-ETT' then 0
      when ce.ITEMID = 223900 then ce.valuenum
      else null
    end) as GCSVerbal
  , max(case when ce.ITEMID = 220739 then ce.valuenum else null end) as GCSEyes
  -- convert the data into a number, reserving a value of 0 for ET/Trach
  , max(case
      -- endotrach/vent is assigned a value of 0
      -- flag it here to later parse specially
      when ce.ITEMID = 223900 and ce.VALUE = 'No Response-ETT' then 1 -- metavision
    else 0 end)
    as endotrachflag
  , ROW_NUMBER ()
          OVER (PARTITION BY ce.stay_id ORDER BY ce.charttime ASC) as rn
  from mimic_icu.chartevents ce
  -- Isolate the desired GCS variables
  where ce.ITEMID in
  (
    -- GCS components, Metavision
    223900, 223901, 220739
  )
  group by ce.subject_id, ce.stay_id, ce.charttime
)
, gcs as (
  select b.*
  , b2.GCSVerbal as GCSVerbalPrev
  , b2.GCSMotor as GCSMotorPrev
  , b2.GCSEyes as GCSEyesPrev
  , b2.charttime as charttimePrev
  -- Calculate GCS, factoring in special case when they are intubated and prev vals
  -- note that the coalesce are used to implement the following if:
  --  if current value exists, use it
  --  if previous value exists, use it
  --  otherwise, default to normal
  , case
      -- replace GCS during sedation with 15
      when b.GCSVerbal = 0
        then 15
      when b.GCSVerbal is null and b2.GCSVerbal = 0
        then 15
      -- if previously they were intub, but they aren't now, do not use previous GCS values
      when b2.GCSVerbal = 0
        then
            coalesce(b.GCSMotor,6)
          + coalesce(b.GCSVerbal,5)
          + coalesce(b.GCSEyes,4)
      -- otherwise, add up score normally, imputing previous value if none available at current time
      else
            coalesce(b.GCSMotor,coalesce(b2.GCSMotor,6))
          + coalesce(b.GCSVerbal,coalesce(b2.GCSVerbal,5))
          + coalesce(b.GCSEyes,coalesce(b2.GCSEyes,4))
      end as GCS

  from base b
  -- join to itself within 6 hours to get previous value
  left join base b2
    on b.stay_id = b2.stay_id
    and b.rn = b2.rn+1
    and b2.charttime > DATETIME_ADD(b.charttime, INTERVAL '6' HOUR)
)
select count(*) as n_matched
from gcs
where charttimePrev is not null;

-- n_matched
--         0

New behaviour

with base as
(
  select
    subject_id
  , ce.stay_id, ce.charttime
  -- pivot each value into its own column
  , max(case when ce.ITEMID = 223901 then ce.valuenum else null end) as GCSMotor
  , max(case
      when ce.ITEMID = 223900 and ce.VALUE = 'No Response-ETT' then 0
      when ce.ITEMID = 223900 then ce.valuenum
      else null
    end) as GCSVerbal
  , max(case when ce.ITEMID = 220739 then ce.valuenum else null end) as GCSEyes
  -- convert the data into a number, reserving a value of 0 for ET/Trach
  , max(case
      -- endotrach/vent is assigned a value of 0
      -- flag it here to later parse specially
      when ce.ITEMID = 223900 and ce.VALUE = 'No Response-ETT' then 1 -- metavision
    else 0 end)
    as endotrachflag
  , ROW_NUMBER ()
          OVER (PARTITION BY ce.stay_id ORDER BY ce.charttime ASC) as rn
  from mimic_icu.chartevents ce
  -- Isolate the desired GCS variables
  where ce.ITEMID in
  (
    -- GCS components, Metavision
    223900, 223901, 220739
  )
  group by ce.subject_id, ce.stay_id, ce.charttime
)
, gcs as (
  select b.*
  , b2.GCSVerbal as GCSVerbalPrev
  , b2.GCSMotor as GCSMotorPrev
  , b2.GCSEyes as GCSEyesPrev
  , b2.charttime as charttimePrev
  -- Calculate GCS, factoring in special case when they are intubated and prev vals
  -- note that the coalesce are used to implement the following if:
  --  if current value exists, use it
  --  if previous value exists, use it
  --  otherwise, default to normal
  , case
      -- replace GCS during sedation with 15
      when b.GCSVerbal = 0
        then 15
      when b.GCSVerbal is null and b2.GCSVerbal = 0
        then 15
      -- if previously they were intub, but they aren't now, do not use previous GCS values
      when b2.GCSVerbal = 0
        then
            coalesce(b.GCSMotor,6)
          + coalesce(b.GCSVerbal,5)
          + coalesce(b.GCSEyes,4)
      -- otherwise, add up score normally, imputing previous value if none available at current time
      else
            coalesce(b.GCSMotor,coalesce(b2.GCSMotor,6))
          + coalesce(b.GCSVerbal,coalesce(b2.GCSVerbal,5))
          + coalesce(b.GCSEyes,coalesce(b2.GCSEyes,4))
      end as GCS

  from base b
  -- join to itself within 6 hours to get previous value
  left join base b2
    on b.stay_id = b2.stay_id
    and b.rn = b2.rn+1
    and b2.charttime > DATETIME_SUB(b.charttime, INTERVAL '6' HOUR)
)
select count(*) as n_matched
from gcs
where charttimePrev is not null;

-- n_matched
-- 1,508,296

Edits: typos

@prockenschaub
Contributor Author

As an additional comment, sub-table gcs_priority currently does not serve any purpose as no more than 1 row can exist per stay_id / charttime combination. This can be verified by changing rn = 1 to rn > 1 in the final select-statement. This sub-table could therefore be removed without changing the overall query.
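The reasoning can be sketched in Python with made-up chart events. This assumes, as the comment implies, that gcs_priority numbers rows within each stay_id / charttime combination; the data and variable names below are hypothetical. Because the preceding step groups by stay_id and charttime, each combination appears exactly once, so a row number within that partition can never exceed 1.

```python
from collections import Counter

# Hypothetical raw chart events: (stay_id, charttime, itemid, value)
events = [
    (1, "08:00", 223901, 6),
    (1, "08:00", 223900, 5),
    (1, "08:00", 220739, 4),
    (1, "10:00", 223901, 5),
    (2, "09:30", 223900, 0),
]

# Equivalent of GROUP BY stay_id, charttime: one output row per key
grouped = {}
for stay_id, charttime, itemid, value in events:
    grouped.setdefault((stay_id, charttime), {})[itemid] = value

# Number the rows within each (stay_id, charttime) partition
seen = Counter()
row_numbers = []
for key in grouped:
    seen[key] += 1
    row_numbers.append(seen[key])

# Analogue of filtering on rn > 1 in the final select
rows_with_rn_gt_1 = [rn for rn in row_numbers if rn > 1]
print(len(grouped), rows_with_rn_gt_1)  # 3 []
```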

@alistairewj
Member

> Problem
>
> I believe there is a minor error in the GCS concept. When looking for previous GCS measurements, the query currently joins to the previous row (b.rn = b2.rn+1) but then requires that previous row to have been recorded more than 6 hours after the current row. Since rows are ordered by time, this is impossible, so no previous measurements are ever found (this can be verified by running the adapted reproducible example below).
>
> Solution
>
> Change DATETIME_ADD to DATETIME_SUB in order to obtain the intended behaviour.

Thanks for the find! Weirdly, it's correct in the MIMIC-III code (i.e. coded as DATETIME_SUB). I'm not sure how I ended up changing it to DATETIME_ADD when porting the code to MIMIC-IV.

> Additional question
>
> While looking at GCS, I was also wondering why GCS is replaced with 15 when the patient is intubated and verbal is missing. This does not seem an appropriate imputation and also, contrary to the assertion in the notes, does not seem in line with how the data is meant to be collected, e.g., according to the SAPS II publication.
>
> Maybe the code's creator could comment on this choice?

Well it sort of follows. The SAPS-II article reads:

> For sedated patients, the Glasgow Coma Score before sedation was used. This was ascertained either from interviewing the physician who ordered the sedation, or by reviewing the patient's medical record.

They don't mention what they did when no pre-sedation value was available, but they likely assumed normal (or worse, they just guessed, introducing a random/unsystematic error). The logic is thus: (1) use the verbal score if it is 1 or greater, (2) if it is 0, look for a previous one (using 6 hours as a reasonable window), and (3) if no previous value exists, assume normal. The sticking point is probably (3). I agree it is not the best assumption (since whether the data are missing depends on the underlying value), but it is at least consistent. Either you systematically underestimate mortality by imputing normal, or systematically overestimate it by imputing abnormal. I chose the former, and there is reasonable evidence for this; see e.g. Livingston et al., who find that only ~30% of scores for ventilated patients differ if you switch from pre-sedation to assuming normal (granted, this varied considerably across units).

Definitely a modeler's choice though - there are reasonable arguments for imputing an abnormal value.
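For readers who find the nested coalesce calls hard to follow, the CASE expression from the reprex can be mirrored in plain Python. This is only a sketch of the rules as written in the SQL in this pull request; the helper names `coalesce` and `gcs_total` are illustrative and do not exist in the repository. A previous value of None stands for "no row found within the 6-hour window".

```python
def coalesce(*values):
    """Return the first value that is not None, like SQL COALESCE."""
    for value in values:
        if value is not None:
            return value
    return None

def gcs_total(motor, verbal, eyes,
              prev_motor=None, prev_verbal=None, prev_eyes=None):
    """Mirror of the CASE expression; verbal == 0 encodes 'No Response-ETT'."""
    # Currently intubated: score is replaced with 15
    if verbal == 0:
        return 15
    # No current verbal score and the previous row was intubated
    if verbal is None and prev_verbal == 0:
        return 15
    # Previously intubated but not now: ignore all previous components
    if prev_verbal == 0:
        return coalesce(motor, 6) + coalesce(verbal, 5) + coalesce(eyes, 4)
    # Otherwise: current value, else previous value, else normal
    return (coalesce(motor, prev_motor, 6)
            + coalesce(verbal, prev_verbal, 5)
            + coalesce(eyes, prev_eyes, 4))

print(gcs_total(6, 0, 4))           # 15: intubated now
print(gcs_total(5, 4, 3))           # 12: all components charted
print(gcs_total(None, 3, None, prev_motor=5, prev_verbal=4, prev_eyes=2))  # 10
print(gcs_total(None, None, None))  # 15: nothing charted, assume normal
```

The last call shows the assumption discussed above: with no current or previous components, every term falls through to its normal value.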

> As an additional comment, sub-table gcs_priority currently does not serve any purpose as no more than 1 row can exist per stay_id / charttime combination. This can be verified by changing rn = 1 to rn > 1 in the final select-statement. This sub-table could therefore be removed without changing the overall query.

This likely became obsolete once we dropped CareVue. I'll remove it after the merge, thanks!

@alistairewj merged commit 274d78d into MIT-LCP:main on Jul 7, 2022