Adjusting extract_time to make year optional #2304

Open
malininae opened this issue Jan 20, 2024 · 11 comments

@malininae
Contributor

I am almost ready to submit an extreme event diagnostic. To make the diagnostic applicable to a wider range of applications, I suggest creating a preprocessor extract_day_range(start_day, end_day) that would extract the days from start_day to end_day in each year (similar to extract_month). I tested it in the diagnostic by extracting with constraints in iris, and it works. I will try to make it happen in the next few weeks.
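
For illustration, the per-year day-range selection proposed here can be sketched with the standard library alone. This is only a sketch of the selection logic, not the iris constraint used in the diagnostic and not ESMValCore code; the function name `in_day_window` and the `(month, day)` tuple arguments are hypothetical.

```python
from datetime import date, timedelta


def in_day_window(d, start, end):
    """Return True if date d falls inside a (month, day) window in any year.

    start and end are (month, day) tuples and both ends are inclusive.
    A window with start > end is treated as wrapping over New Year.
    Hypothetical helper, not part of iris or ESMValCore.
    """
    month_day = (d.month, d.day)
    if start <= end:
        return start <= month_day <= end
    return month_day >= start or month_day <= end


# Two years of daily dates as stand-in data (2011-01-01 to 2012-12-30).
days = [date(2011, 1, 1) + timedelta(days=i) for i in range(730)]

# Keep 15 January to 15 February in every year.
window = [d for d in days if in_day_window(d, (1, 15), (2, 15))]
print(len(window), window[0], window[-1])
```

Comparing `(month, day)` tuples keeps the year out of the test entirely, which is the behaviour the request asks for; the wrap-around branch covers windows such as 20 December to 10 January.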

@malininae malininae added enhancement New feature or request preprocessor Related to the preprocessor labels Jan 20, 2024
@malininae malininae self-assigned this Jan 20, 2024
@bouweandela
Member

It would probably be easiest to modify the extract_time preprocessor function so the month and year arguments are optional.

@valeriupredoi
Contributor

I agree with @bouweandela - next thing you know, when NGMS comes about, we'll want an extract_millisecond_range preprocessor too 😁

@malininae malininae changed the title Preprocessor extract_day_range Adjusting extract_time to make year optional Jan 22, 2024
@malininae
Contributor Author

@bouweandela @valeriupredoi sounds good. I had thought about adjusting extract_month, but your suggestion is way better 😃 The only issue is that my recipe currently looks like that, and I already use extract_time; however, I might be able to switch to clip_timerange so as not to repeat the preprocessor.

@bouweandela
Member

If you specify a timerange or start_year/end_year for a variable or dataset, clip_timerange will be used automatically.

@malininae
Contributor Author

> If you specify a timerange or start_year/end_year for a variable or dataset, clip_timerange will be used automatically.

True. The problem is that I'm interested in the data from, e.g., 2011-2030, but since I need anomalies relative to the 1940-1969 base period, I have to load the time series with start_year: 1940, end_year: 2030 and then use extract_time from 2011 to 2030; otherwise I get an error. This is an issue with the anomalies preprocessor that I noticed a while ago but never opened an issue about (I will do so shortly). Please let me know if you have a better solution in mind.

@malininae
Contributor Author

malininae commented Jan 22, 2024

OK, I tried the following:

# ESMValTool
# recipe_bc_extremes_txx.yml
---
documentation:

  title: GEVs 2021 BC heatwave

  description: This is a recipe for analysing 2021 BC heat wave.

  authors:
    - malinina_elizaveta

preprocessors:
  preproc_txx: &prep_txx
    custom_order: True
    regrid:
      target_grid: 0.5x0.5
      lon_offset: 0.25
      lat_offset: 0.25
      scheme: linear    
    extract_shape:
      shapefile: british_columbia.shp
      method: contains
      crop: True
    area_statistics:
      operator: mean
    annual_statistics:
      operator: max
    anomalies:
      period: full
      reference:
        start_year: 1940
        start_month: 1
        start_day: 1
        end_year: 1959
        end_month: 12
        end_day: 31    

  preproc_txx_all: 
    <<: *prep_txx
    clip_timerange:
      timerange: 2011-01-01/2020-12-31

datasets_tasmax: &datasets_tasmax
  - {dataset: TaiESM1, ensemble: r1i1p1f1, grid: gn}

diagnostics:
  xcbox32:
    description: Figure for BC extremes
    variables:
      all:
        short_name: tasmax
        exp: [historical, ssp245]
        start_year: 1940
        end_year: 2030
        project: CMIP6
        mip: day
        preprocessor: preproc_txx_all
        additional_datasets: *datasets_tasmax 
    scripts: null

By the looks of it, the second clip_timerange didn't work: the cube has 91 years, although it was supposed to have 20 (at least, that worked with extract_time). Suggestions are welcome.

P.S. I will try to explain what I am trying to do and why. Maybe I am overcomplicating things and there is a better way: perhaps there is a way to pass objects through the YAML that I missed, or a way to apply a preprocessor to already preprocessed data that is driven by the recipe rather than the diagnostic (?).

I have an extreme event attribution recipe, similar to the one above, but for multiple climates (1850-1900, 2011-2030, 2081-2100). The above recipe works well for yearly values; extract_time is not an issue there. Now we are trying to operationalize this recipe and run it not for yearly values, but to look at extremes (maxima in the above case) in 31-day periods (say, 15 January to 15 February) centered on a certain day (say, 31 January). For that request, where we don't use annual_statistics, the combination extract_time(start_month=1, start_day=15, end_month=2, end_day=15) / anomalies / extract_time(start_year=2011, start_month=1, start_day=1, end_year=2030, end_month=12, end_day=31) would not work. I see that the problem is somewhat custom, and one could argue that one of the extract_time calls could live in a diagnostic, but I think that looking at a partial time range when the anomaly base period is not part of the period of interest is not that crazy of a request.

So far I see several solutions to this problem:

  • change anomalies so the start/end period can be outside cube range and one would need only one updated partial extract_time
  • create a separate function extract_date_range (I guess a mish-mash of extract_time/extract_month/resample_time, which basically parses a partial iris time constraint)
  • not change anything in the preprocessor and extract partial time in the diagnostic (not amazing, because several steps which are preprocessor functions would be done in the diagnostic making it less flexible)
  • change extract_time making the year/month optional and subtract anomalies in the diagnostic.

Currently, I've done a combination of the last two options: I extract the partial datetime range and subtract the anomalies in the diagnostic, but I think it could be done better. Again, any suggestions are welcome.
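
As a rough sketch of the workaround described above (partial date extraction plus anomaly subtraction done in the diagnostic), the order of operations might look like the following. It uses plain Python on synthetic data, takes a single mean over the base period as the climatology, and every name in it (`in_window`, `window_anomalies`, the synthetic `series`) is hypothetical; the actual diagnostic works on iris cubes and may compute the reference differently.

```python
from datetime import date
from statistics import mean


def in_window(d, start=(1, 15), end=(2, 15)):
    """True if d lies in the inclusive (month, day) window, in any year."""
    return start <= (d.month, d.day) <= end


def window_anomalies(series, base_years, target_years):
    """Anomalies for target_years relative to the base_years mean.

    series is a list of (date, value) pairs covering both periods.
    The day window is applied first, the base-period mean is taken
    from the windowed data, and only then is the target period cut out.
    """
    windowed = [(d, v) for d, v in series if in_window(d)]
    base = mean(
        v for d, v in windowed if base_years[0] <= d.year <= base_years[1]
    )
    return [
        (d, v - base)
        for d, v in windowed
        if target_years[0] <= d.year <= target_years[1]
    ]


# Synthetic data: one in-window and one out-of-window value per year.
series = []
for year in range(1940, 2031):
    series.append((date(year, 1, 31), float(year - 1940)))
    series.append((date(year, 7, 1), 100.0))

result = window_anomalies(series, (1940, 1959), (2011, 2030))
print(len(result), result[0], result[-1])
```

The point of the ordering is that the base period (1940-1959 here) has to stay in the data until the reference mean is computed, which is why the series is loaded from 1940 to 2030 and only narrowed to 2011-2030 at the end.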

@bouweandela
Member

> By the looks of it, the second clip_timerange didn't work

Indeed, that does not work: clip_timerange cannot be used directly from the recipe, but it is always called automatically when the start_year and end_year or timerange facets are present in the variable definition.

@bouweandela
Member

See here for an idea of how we could enable using the anomalies preprocessor in this case.

@malininae
Contributor Author

Thanks, the consensus is to make the year and month arguments of extract_time optional and, for my own case, to move the anomaly calculation into the diagnostic.

@malininae
Contributor Author

@bouweandela @valeriupredoi you might know: why is this part needed in _extract_datetime? What I don't get is why the if statement is needed. The only logic I could find was to check whether time is the first coordinate, but since cube.coord_dims(time_coord) returns a tuple, it looks like the if will always be False and we'd always be doing the else part. I tested what happens when time isn't the first dimension, and extracting with PartialDateTime worked like a charm.

@bouweandela
Member

It is needed in case time isn't associated with a dimension at all.
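
The answer above can be illustrated without iris, assuming the check in question tests the truthiness of the tuple returned by cube.coord_dims(time_coord) (the thread does not quote the code). For a scalar time coordinate that tuple is empty, and an empty tuple is the only tuple that evaluates as false. The helper name `time_spans_a_dimension` is hypothetical.

```python
def time_spans_a_dimension(coord_dims):
    """True if the tuple of dimension indices is non-empty.

    coord_dims stands in for the tuple that cube.coord_dims(time_coord)
    returns: () for a scalar time coordinate, (0,) or (1,) when time
    is the first or second data dimension.
    """
    return len(coord_dims) > 0


for dims in [(), (0,), (1,)]:
    # bool(dims) is what an `if dims:` / `if not dims:` test evaluates.
    print(dims, bool(dims), time_spans_a_dimension(dims))
```

Note that `(0,)` is truthy even though it contains a zero, so a check of this kind does not distinguish between time being the first dimension or a later one; it only separates "time has a dimension" from "time has none".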
