Hospitals may record data according to bed stays
,
ward stays
and in-patient stays
(or episode)
as it is used to record how people move around the hospital.
A patient will attend
something like Accident and
Emergency - A&E (also known as Emergency Department or as the
acronym ED) and if they stay in hospital that becomes an
admission
. Leaving a hospital from its service (because
this can also occur with outpatients where a person isn’t admitted to
stay for any period) is called discharge
.
Patients can be moved from beds
in a ward
,
moved between wards
in a hospital
and have an
overall in-patient stay
which is how long they were in a
hospital.
Some services, like those for mental health, may record a move to
another hospital and in the case of mental health will record this as
out of area
. Even with all these “stays” the actual time a
patient is in the service, or commissioned as part of the service, could
all count as one and, if records are from different systems, may either
overlap or not join together.
For example, if a mental health patient is in a local service, then
moves, a gap of a day or two is most likely to be an administrative
error. An example of this delay could be because a patient is moved from
one hospital on the Sunday but the recording takes place on the Monday
or even the Tuesday if the Monday is a public holiday. In order to
record the move the patient will be discharged
although, in
actuality they are still in healthcare services.
It may be that an analyst has access to all the data related to a
stay in hospital, so beds, ward, overall stay in something like a SQL
warehouse and in that case it’s very likely that the dates are
reasonably accurate. However, if there is any overlapping in any of the
data this will affect overall counts because the one stay could look
like two or more entries. It could also affect counts by days to see
bed occupancy
as one person could appear twice and so be
double counted.
There will be many ways around these issues of double counting but
the package NHSRepisodes has quick and efficient functions in R that can
change or add a column of information on the episodes
to
your data.
library(dplyr)
library(NHSRepisodes)
services <- tibble::tribble(
~code, ~service, ~type,
"t100", "service A", "inpatient",
"t200", "service B", "inpatient",
"t500", "service C", "inpatient",
"t600", "service D", "out of area"
)
# Create a dummy data set give the first and last dates of an episode
# using withr package to
dat <- tribble(
~patient_id, ~admission, ~discharge, ~code, ~notes,
1L, "2020-01-01", "2020-01-10", "t100", NA,
1L, "2020-01-12", "2020-01-22", "t600", "this has a gap",
1L, "2020-05-01", "2020-10-01", "t100", NA,
1L, "2020-10-01", "2020-11-01", "t600", "same day overlap"
) |>
# columns must be date format to work with NHSRepisodes functions
mutate(across(admission:discharge, as.Date)) |>
left_join(services) |>
arrange(patient_id, admission)
Find the episodes and add a column to the data:
dat |>
# Rename the columns so that they are recognised by the `add_parent_interval()` function
select(
id = patient_id,
start = admission,
end = discharge,
everything()
) |>
NHSRepisodes::add_parent_interval() |>
select(id,
start,
end,
.parent_start,
.parent_end,
.interval_number)
#> # A tibble: 4 × 6
#> id start end .parent_start .parent_end .interval_number
#> <int> <date> <date> <date> <date> <int>
#> 1 1 2020-01-01 2020-01-10 2020-01-01 2020-01-10 1
#> 2 1 2020-01-12 2020-01-22 2020-01-12 2020-01-22 2
#> 3 1 2020-05-01 2020-10-01 2020-05-01 2020-11-01 3
#> 4 1 2020-10-01 2020-11-01 2020-05-01 2020-11-01 3
Because this patient had a gap of a few days between and inpatient and out of area stay this appears, currently, as two intervals (episodes).
To adjust this add days to the discharge (end) column according to what is appropriate for the data.
dat |>
# Rename the columns so that they are recognised by the `add_parent_interval()` function
select(
id = patient_id,
start = admission,
end_actual = discharge,
everything()
) |>
mutate(end = end_actual + 2) |>
NHSRepisodes::add_parent_interval() |>
select(id,
start,
end,
.parent_start,
.parent_end,
.interval_number)
#> # A tibble: 4 × 6
#> id start end .parent_start .parent_end .interval_number
#> <int> <date> <date> <date> <date> <int>
#> 1 1 2020-01-01 2020-01-12 2020-01-01 2020-01-24 1
#> 2 1 2020-01-12 2020-01-24 2020-01-01 2020-01-24 1
#> 3 1 2020-05-01 2020-10-03 2020-05-01 2020-11-03 2
#> 4 1 2020-10-01 2020-11-03 2020-05-01 2020-11-03 2
Rather than adding a column to the data, it is possible to use the
function merge_episodes()
to return one row for each
episode. Using the last example where days were added to the end
(discharge) to close an out of area gap:
dat |>
# Rename the columns so that they are recognised by the `add_parent_interval()` function
select(
id = patient_id,
start = admission,
end_actual = discharge,
everything()
) |>
mutate(end = end_actual + 2) |>
NHSRepisodes::merge_episodes()
#> # A tibble: 2 × 4
#> id .interval_number .episode_start .episode_end
#> <int> <int> <date> <date>
#> 1 1 1 2020-01-01 2020-01-24
#> 2 1 2 2020-05-01 2020-11-03