# 03: Pre-process population data
*Ingest and transform population data from the [Global Human Settlement Layer (GHSL)](https://ghsl.jrc.ec.europa.eu/datasets.php) dataset. The resulting population grids serve as weights when spatially averaging climate data in some subsequent analyses.*
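
As an illustration of the weighted spatial averaging mentioned above, here is a minimal sketch in plain NumPy. The grids, their values, and the helper name `population_weighted_mean` are hypothetical and are not taken from the later notebooks; the point is only that cells with more people pull the average toward their value.

```python
import numpy as np

# Hypothetical 2x3 grids: a climate variable and population counts per cell.
temperature = np.array([[30.0, 32.0, 28.0],
                        [25.0, 35.0, 31.0]])
population = np.array([[0.0, 100.0, 50.0],
                       [0.0, 800.0, 50.0]])


def population_weighted_mean(values, weights):
    """Mean of `values` weighted by `weights` (hypothetical helper)."""
    total = weights.sum()
    if total == 0:
        # No population anywhere in the region: the weighted mean is undefined.
        return float("nan")
    return float((values * weights).sum() / total)


unweighted = float(temperature.mean())
weighted = population_weighted_mean(temperature, population)
print(f"unweighted: {unweighted:.2f}, population-weighted: {weighted:.2f}")
```

Cells with zero population contribute nothing to the weighted result, which is why the population grid has to sit on exactly the same grid as the climate data.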

In [None]:
import numpy as np
import rioxarray as rio
import xarray as xr
from rasterio.enums import Resampling

Open a sample UHE-Daily GeoTIFF to use as a template for the target resolution (~5 km). This sample file was accessed from `http://data.chc.ucsb.edu/people/cascade/UHE-daily/wbgtmax/2006/wbgtmax.2006.01.10.tif` with help from Cascade Tuholske (Montana State University) and Pete Peterson (University of California, Santa Barbara).

In [None]:
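uhe = xr.open_dataset(
    "s3://carbonplan-climate-impacts/extreme-heat/v1.0/inputs/UHE-daily.wbgtmax.2006.01.10.tif",
    engine="rasterio",
)
# Keep the single data band as a 2D DataArray and drop the leftover scalar coordinates.
uhe = uhe.sel(band=1).band_data.drop_vars(["band", "spatial_ref"])
uhe = uhe.rio.write_crs("epsg:4326")
# Reverse the order of the y (latitude) coordinate.
uhe = uhe.reindex(y=list(reversed(uhe.y)))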

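Align the population data with the UHE-Daily grid (~5 km). As its file name indicates, the source raster is the GHS-POP 2030 epoch at 30 arc-second resolution in EPSG:4326.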

In [None]:
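pop_data = rio.open_rasterio(
    "s3://carbonplan-climate-impacts/extreme-heat/v1.0/inputs/GHS_POP_E2030_GLOBE_R2023A_4326_30ss_V1_0.tif"
).load()
# Keep latitudes from 90 down to -60 (y is stored in descending order), select the
# single band, and drop the leftover scalar coordinates.
pop_data = pop_data.sel(y=slice(90, -60)).sel(band=1).drop_vars(["spatial_ref", "band"])
pop_data = pop_data.rio.write_crs("epsg:4326")
# Reverse the order of the y (latitude) coordinate, as was done for the template.
pop_data = pop_data.reindex(y=list(reversed(pop_data.y)))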

In [None]:
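# Sum, rather than average, the source cells that fall in each target cell,
# because the values are population counts.
fine_pop = pop_data.rio.reproject_match(uhe, resampling=Resampling.sum)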
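
The choice of `Resampling.sum` matters because the raster holds counts of people, not densities. The sketch below uses plain NumPy on a toy grid whose fine cells nest exactly inside the coarse cells. That is an idealization, since the actual reprojection also has to handle cells that only partly overlap. The helper `block_sum` is hypothetical.

```python
import numpy as np


def block_sum(grid, factor):
    """Aggregate a 2D grid by summing non-overlapping factor x factor blocks."""
    rows, cols = grid.shape
    if rows % factor or cols % factor:
        raise ValueError("grid shape must be divisible by factor")
    # Split each axis into (coarse index, offset within block), then sum the offsets.
    blocks = grid.reshape(rows // factor, factor, cols // factor, factor)
    return blocks.sum(axis=(1, 3))


# Toy "population" grid on a fine 4x4 mesh.
fine = np.arange(16, dtype=float).reshape(4, 4)

coarse_sum = block_sum(fine, 2)   # counts are conserved
coarse_mean = coarse_sum / 4      # what an averaging resampler would give

print("fine total:       ", fine.sum())
print("sum-resampled:    ", coarse_sum.sum())
print("average-resampled:", coarse_mean.sum())
```

Summing keeps the total population unchanged across resolutions, whereas averaging shrinks it by the number of fine cells per coarse cell.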

In [None]:
fine_pop.to_dataset(name="population").to_zarr(
    "s3://carbonplan-climate-impacts/extreme-heat/v1.0/inputs/GHS_POP_E2030_GLOBE_R2023A_4326_30ss_V1_0_resampled_to_UHE_daily.zarr",
    mode="w",
)

Repeat the process above, this time using the coarser (~25 km) dataset from `02_generate.ipynb` as the target grid. Open a single result file to use as a template.

In [None]:
wbgt_cp = xr.open_zarr(
    "s3://carbonplan-scratch/extreme-heat/wbgt-shade-gridded/years/ACCESS-CM2/ACCESS-CM2-historical-2008.zarr"
)
# Only the grid is needed, so keep a single time step of the WBGT variable.
wbgt_cp = wbgt_cp.isel(time=0).WBGT
wbgt_cp = wbgt_cp.rio.write_crs("epsg:4326")

In [None]:
wbgt_cp = wbgt_cp.rename({"lat": "y", "lon": "x"})

In [None]:
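# Convert longitudes from the 0 to 360 convention to -180 to 180, then sort them
# into ascending order.
new_lons = wbgt_cp["x"].where(wbgt_cp["x"] < 180, wbgt_cp["x"] - 360)
wbgt_cp = wbgt_cp.assign_coords(x=new_lons)
wbgt_cp = wbgt_cp.reindex({"x": np.sort(wbgt_cp.x.values)})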
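
The longitude handling in the cell above can be sketched with plain NumPy. The coordinate values here are made up for illustration. Values below 180 are kept, everything else is shifted down by 360, and the data are then reordered so that longitude increases monotonically.

```python
import numpy as np

# Hypothetical longitudes in the 0..360 convention, with one data value per column.
lons_0_360 = np.array([0.0, 90.0, 179.75, 180.0, 270.0, 359.75])
values = np.arange(lons_0_360.size)

# Keep longitudes below 180; shift the rest into the negative range.
wrapped = np.where(lons_0_360 < 180, lons_0_360, lons_0_360 - 360)

# Reorder coordinates and data together so that longitude is ascending.
order = np.argsort(wrapped)
lons_sorted = wrapped[order]
values_sorted = values[order]

print(lons_sorted)
print(values_sorted)
```

Without the final sort, the coordinate would jump from positive to negative partway along the axis, and it would no longer be monotonic.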

Resample the population data to the coarser (~25 km) grid and write out the result.

In [None]:
coarse_pop = pop_data.rio.reproject_match(wbgt_cp, resampling=Resampling.sum)
# Replace fill-value cells with 0 so that they count as unpopulated.
coarse_pop = coarse_pop.where(coarse_pop != coarse_pop.attrs["_FillValue"], 0)
coarse_pop = coarse_pop.to_dataset(name="population")
coarse_pop.to_zarr(
    "s3://carbonplan-climate-impacts/extreme-heat/v1.0/inputs/GHS_POP_E2030_GLOBE_R2023A_4326_30ss_V1_0_resampled_to_CP.zarr",
    mode="w",
)
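
The `where` call in the cell above swaps fill-value cells for zeros before writing. Here is a minimal NumPy sketch of why that matters when the grid is later used for totals or as weights. The fill value and the grid are hypothetical; the notebook reads the actual value from the raster's `_FillValue` attribute.

```python
import numpy as np

FILL = -200.0  # hypothetical nodata marker, for illustration only

pop = np.array([[FILL, 10.0],
                [5.0, FILL]])

# Keep valid cells and set nodata cells to zero population.
cleaned = np.where(pop != FILL, pop, 0.0)

print("total with fill values:", pop.sum())
print("total after cleaning:  ", cleaned.sum())
```

Left in place, a fill value would be treated as a (possibly negative) number of people and would distort any sum or weighted average computed from the grid.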