Commit

Change categories mapping logic (#3946)
* change pre-filtering logic

* Update src/otx/core/data/pre_filtering.py

Co-authored-by: Eunwoo Shin <eunwoo.shin@intel.com>

kprokofi and eunwoosh authored Sep 13, 2024
1 parent c7efcbc commit ecef545
Showing 1 changed file with 9 additions and 1 deletion.
10 changes: 9 additions & 1 deletion src/otx/core/data/pre_filtering.py
@@ -88,7 +88,15 @@ def remove_unused_labels(dataset: DmDataset, data_format: str, ignore_index: int
         raise ValueError(msg)
     if len(used_labels) == len(original_categories):
         return dataset
+    if data_format == "arrow" and max(used_labels) != len(original_categories) - 1:
+        # we assume that empty label is always the last one. If it is not explicitly added to the dataset,
+        # (not in the used labels) it will be filtered out.
+        mapping = {cat: cat for cat in original_categories[:-1]}
+    elif data_format == "arrow":
+        # this mean that some other class wasn't annotated, we don't need to filter the object classes
+        return dataset
+    else:
+        mapping = {original_categories[idx]: original_categories[idx] for idx in used_labels}
     msg = "There are unused labels in dataset, they will be filtered out before training."
     warnings.warn(msg, stacklevel=2)
-    mapping = {original_categories[idx]: original_categories[idx] for idx in used_labels}
     return dataset.transform("remap_labels", mapping=mapping, default="delete")
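
To make the new branching easier to follow, here is a minimal standalone sketch of the mapping decision alone. It is not the OTX implementation. The helper name build_label_mapping is hypothetical, and the sketch assumes original_categories is a plain list of label names and used_labels is a collection of integer label indices. It returns None where the real function returns the dataset unchanged.

```python
from __future__ import annotations

from typing import Iterable


def build_label_mapping(
    original_categories: list[str],
    used_labels: Iterable[int],
    data_format: str,
) -> dict[str, str] | None:
    """Return the label remapping, or None when nothing should be filtered.

    Hypothetical helper: it mirrors only the branching shown in the diff and
    assumes categories are label names and used labels are integer indices.
    """
    used = set(used_labels)

    # Every category is annotated at least once: nothing to remove.
    if len(used) == len(original_categories):
        return None

    if data_format == "arrow":
        if max(used) != len(original_categories) - 1:
            # The last category (assumed to be the empty label) is unused:
            # drop only that one and keep every other category, used or not.
            return {cat: cat for cat in original_categories[:-1]}
        # The last category is used, so some other class is unannotated:
        # keep the categories as they are.
        return None

    # Any other format: keep only the categories that actually occur.
    return {original_categories[idx]: original_categories[idx] for idx in used}


categories = ["cat", "dog", "bird", "empty"]  # illustrative names only
print(build_label_mapping(categories, {0, 2}, "arrow"))
print(build_label_mapping(categories, {0, 3}, "arrow"))
print(build_label_mapping(categories, {0, 2}, "coco"))
```

In the first call the arrow branch keeps "dog" even though it is never used, because only the trailing category is dropped. In the third call the non-arrow branch keeps just the used categories, which matches the behavior before this commit.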
