{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "75174eaa-da62-430b-aef7-2cdf8fdd4256",
   "metadata": {},
   "source": [
    "# 03: Pre-process population data\n",
    "*Ingest and transform population data from the [Global Human Settlement Layer (GHSL)](https://ghsl.jrc.ec.europa.eu/datasets.php). The resulting population grids are used as weights when spatially averaging climate data in some subsequent analyses.*"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c89e46b0-b8dc-4216-b1d4-4fd1c980a046",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import rioxarray as rio\n",
    "import xarray as xr\n",
    "from rasterio.enums import Resampling"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "425989a1-5d22-4d62-af3b-1a7d7ea66c60",
   "metadata": {
    "tags": []
   },
   "source": [
    "Open a sample UHE-Daily tif to use as a template for the target resolution (~5 km). This sample file was accessed from `http://data.chc.ucsb.edu/people/cascade/UHE-daily/wbgtmax/2006/wbgtmax.2006.01.10.tif` with help from Cascade Tuholske (Montana State University) and Pete Peterson (University of California, Santa Barbara)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "fd225d43-6c1e-4517-a91a-c8e908c826df",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "# Open the sample GeoTIFF and keep its single band as a 2D DataArray.\n",
    "uhe = xr.open_dataset(\n",
    "    \"s3://carbonplan-climate-impacts/extreme-heat/v1.0/inputs/UHE-daily.wbgtmax.2006.01.10.tif\",\n",
    "    engine=\"rasterio\",\n",
    ")\n",
    "uhe = uhe.sel(band=1).band_data.drop_vars([\"band\", \"spatial_ref\"])\n",
    "uhe = uhe.rio.write_crs(\"epsg:4326\")\n",
    "# Reverse the order of the y (latitude) axis.\n",
    "uhe = uhe.reindex(y=uhe.y.values[::-1])"
   ]
  },
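  {
   "cell_type": "markdown",
   "id": "152b6210-86f8-4525-bb9d-bc60f088e886",
   "metadata": {},
   "source": [
    "Align the population data with the UHE-Daily grid (~5 km). The GHS-POP grid (epoch 2030, 30 arc-second resolution) is loaded, clipped to latitudes between 90 and -60, and aggregated onto the template with `Resampling.sum`, so that each target cell holds the summed population of the source cells contributing to it."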
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "29d6d8f8-f8bc-4139-bf45-04749590a299",
   "metadata": {
    "tags": []
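   },
   "outputs": [],
   "source": [
    "# Load the full 30 arc-second population raster into memory.\n",
    "pop_data = rio.open_rasterio(\n",
    "    \"s3://carbonplan-climate-impacts/extreme-heat/v1.0/inputs/GHS_POP_E2030_GLOBE_R2023A_4326_30ss_V1_0.tif\"\n",
    ").load()\n",
    "# Keep latitudes from 90 to -60 and reduce to a 2D array (single band).\n",
    "pop_data = pop_data.sel(y=slice(90, -60)).sel(band=1).drop_vars([\"spatial_ref\", \"band\"])\n",
    "pop_data = pop_data.rio.write_crs(\"epsg:4326\")\n",
    "# Reverse the order of the y (latitude) axis, as done for the template.\n",
    "pop_data = pop_data.reindex(y=pop_data.y.values[::-1])"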
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a9e16899-d55c-45d8-8b9e-ef120e5a508d",
   "metadata": {
    "tags": []
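   },
   "outputs": [],
   "source": [
    "# Sum (rather than average) source cells so that population counts are aggregated onto the target grid.\n",
    "fine_pop = pop_data.rio.reproject_match(uhe, resampling=Resampling.sum)"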
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "77ada379-bf04-4821-8db7-3e34d1048ad8",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "fine_pop.to_dataset(name=\"population\").to_zarr(\n",
    "    \"s3://carbonplan-climate-impacts/extreme-heat/v1.0/inputs/GHS_POP_E2030_GLOBE_R2023A_4326_30ss_V1_0_resampled_to_UHE_daily.zarr\",\n",
    "    mode=\"w\",\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c8b1ad52-fcea-4ebf-8bc3-919eb1e56fa9",
   "metadata": {},
   "source": [
    "Repeat the process above, this time targeting the coarser (~25 km) grid produced by `02_generate.ipynb`. Open a single result file as a template, rename its `lat`/`lon` dimensions to `y`/`x`, and shift its longitudes from the 0 to 360 convention to -180 to 180."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1d2a22f7-f6d2-4d7d-802a-23192d5e4601",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "wbgt_cp = xr.open_zarr(\n",
    "    \"s3://carbonplan-scratch/extreme-heat/wbgt-shade-gridded/years/ACCESS-CM2/ACCESS-CM2-historical-2008.zarr\"\n",
    ")\n",
    "wbgt_cp = wbgt_cp.isel(time=0).WBGT\n",
    "wbgt_cp = wbgt_cp.rio.write_crs(\"epsg:4326\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4cc138ba-cf36-4303-a372-e1b75b6544b7",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "wbgt_cp = wbgt_cp.rename({\"lat\": \"y\", \"lon\": \"x\"})"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7e0138b6-3831-4233-9a7c-f9f14d8c4a8c",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "# Shift longitudes of 180 and above down by 360 (0 to 360 becomes -180 to 180), then sort x ascending.\n",
    "new_lons = wbgt_cp[\"x\"].where(wbgt_cp[\"x\"] < 180, wbgt_cp[\"x\"] - 360)\n",
    "wbgt_cp = wbgt_cp.assign_coords(x=new_lons)\n",
    "wbgt_cp = wbgt_cp.reindex({\"x\": np.sort(wbgt_cp.x.values)})"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fae94e1f-d7bf-471e-8bf2-a893c962c051",
   "metadata": {
    "tags": []
   },
   "source": [
    "Aggregate the population data onto the coarser grid, replace fill values with zero, and write out the result."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a8b08670-bc1f-4f39-85b9-14df2ca9980f",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "# Sum source cells into each coarse target cell.\n",
    "coarse_pop = pop_data.rio.reproject_match(wbgt_cp, resampling=Resampling.sum)\n",
    "# Replace fill-value cells with zero so they carry no population weight.\n",
    "coarse_pop = coarse_pop.where(coarse_pop != coarse_pop.attrs[\"_FillValue\"], 0)\n",
    "coarse_pop = coarse_pop.to_dataset(name=\"population\")\n",
    "coarse_pop.to_zarr(\n",
    "    \"s3://carbonplan-climate-impacts/extreme-heat/v1.0/inputs/GHS_POP_E2030_GLOBE_R2023A_4326_30ss_V1_0_resampled_to_CP.zarr\",\n",
    "    mode=\"w\",\n",
    ")"
   ]
  }
 ],
 "metadata": {
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}