Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

feat: UPS tracking numbers #228

Merged
merged 3 commits into from
Nov 7, 2021
Merged

Conversation

P403n1x87
Copy link
Contributor

@P403n1x87 P403n1x87 commented Nov 4, 2021

⚠ Pull Requests not made with this template will be automatically closed 🔥

Prerequisites

Why do we need this pull request?

  • This PR adds the regex for UPS tracking numbers

What GitHub issues does this fix?

N. A.

Copy / paste of output

pywhat 1Z123CAB9912345678    
Matched on: 1Z123CAB9912345678
Name: UPS Tracking Number
Link:  https://www.ups.com/track?tracknum=1Z123CAB9912345678

"Regex": "^(1Z[0-9A-Z]{6}[0-9]{2}[0-9]{8})$",
"plural_name": false,
"Description": null,
"Rarity": 1,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say that rarity should be lowered.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any suggestions? 🙂

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something around 0.3

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why 0.3? I'd say higher like 0.5 or 0.6 because:

  1. The string has to start with 1Z
  2. It needs 7 chars 0-9A-Z
  3. It has exactly 2 numbers
  4. It has 8 numbers

Also, can we make it:

- ^(1Z[0-9A-Z]{6}[0-9]{2}[0-9]{8})$
- + ^(1Z[0-9A-Z]{6}[0-9]{10})$

?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think 0.4 or 0.5. And yes, regex should be changed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea of the 2+8 split is because the first 2 digits in this group represent a service indicator code and perhaps it could be captured and handled in the future.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aside: I wonder if the "rarity" could be estimated more reliably through some entropy-based measure 🤔

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

service indicator code

We have precedence for this called sub-categories. See the Mastercard / Phone Numbers regex. I am not sure it'll work on data in the middle of the regex, we may need to change the code for that :)

Aside: I wonder if the "rarity" could be estimated more reliably through some entropy-based measure 🤔

Probably! Currently I am estimating it based on what I see when people post this:
Screenshot 2021-11-04 at 16 09 24

And also whether we have any false positives.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@P403n1x87 You can use subcategories with regex method for that.

@P403n1x87 P403n1x87 force-pushed the feat/ups-tracking branch 3 times, most recently from 14b9fcc to 45d37fe Compare November 7, 2021 11:02
@P403n1x87 P403n1x87 requested review from bee-san and a user November 7, 2021 11:03
@codecov-commenter
Copy link

codecov-commenter commented Nov 7, 2021

Codecov Report

Merging #228 (62b3df8) into main (a5a4a3b) will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##             main     #228   +/-   ##
=======================================
  Coverage   92.60%   92.60%           
=======================================
  Files          15       15           
  Lines        1217     1217           
=======================================
  Hits         1127     1127           
  Misses         90       90           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a5a4a3b...62b3df8. Read the comment docs.

Co-authored-by: piatrashkakanstantinass <74979584+piatrashkakanstantinass@users.noreply.github.com>
@P403n1x87 P403n1x87 requested a review from a user November 7, 2021 11:27
@bee-san bee-san merged commit 559a89c into bee-san:main Nov 7, 2021
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants