DESCRIPTION

Address line parser

DESCRIPTION
SOLUTION
DEV-DEPENDENCIES
RUN
OPTIONS
EXAMPLES
RUN-TESTS

DESCRIPTION

An address provider returns addresses in which the street name and house number are concatenated into a single string. Our own system, on the other hand, has separate fields for the street name and the street number.

Input: string of address

Output: string of street and string of street-number as JSON object

Write a simple program that does the task for the simplest cases, e.g.
1. "Winterallee 3" -> {"street": "Winterallee", "housenumber": "3"}
2. "Musterstrasse 45" -> {"street": "Musterstrasse", "housenumber": "45"}
3. "Blaufeldweg 123B" -> {"street": "Blaufeldweg", "housenumber": "123B"}
Consider more complicated cases
1. "Am Bächle 23" -> {"street": "Am Bächle", "housenumber": "23"}
2. "Auf der Vogelwiese 23 b" -> {"street": "Auf der Vogelwiese", "housenumber": "23 b"}
Consider other countries (complex cases)
1. "4, rue de la revolution" -> {"street": "rue de la revolution", "housenumber": "4"}
2. "200 Broadway Av" -> {"street": "Broadway Av", "housenumber": "200"}
3. "Calle Aduana, 29" -> {"street": "Calle Aduana", "housenumber": "29"}
4. "Calle 39 No 1540" -> {"street": "Calle 39", "housenumber": "No 1540"}
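
The target output shape described above can be illustrated with the standard `json` module. This is only a sketch: the helper name `to_json` is hypothetical and is not taken from the project. `ensure_ascii=False` is used so that characters such as "ä" are written as-is rather than as escape sequences.

```python
import json


def to_json(street, housenumber):
    """Serialize the two address components as a JSON object string."""
    result = {"street": street, "housenumber": housenumber}
    # ensure_ascii=False keeps umlauts readable in the output.
    return json.dumps(result, ensure_ascii=False)


print(to_json("Am Bächle", "23"))
# {"street": "Am Bächle", "housenumber": "23"}
```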

SOLUTION

The solution requires a Python interpreter and has been tested with version 3.6.9. It is not platform specific and should work on Unix, Windows, and macOS.

The solution tries to extract the street name and house number using regular expressions.

Three different regular expressions are used to extract the address components.
They are evaluated one at a time; as soon as one regex matches the input address line, the remaining ones are not evaluated.

Regex I

^(?P<street>[a-zA-ZäöüÄÖÜẞß]+(?:(?:\s|-)[a-zA-ZäöüÄÖÜẞß]+)*\.?)\,?\s(?P<number>(?:(?:\d)+\s?[a-zA-Z]{0,1})|(?:\d+\s?-\s?\d+))$

It matches addresses in formats such as the following, among others:

Winterallee 3
Musterstrasse 45
Blaufeldweg 123B
Am Bächle 23
Auf der Vogelwiese 23 b
Lange Str. 8
Königsbrücker Str. 21 - 29
Karl-Weysser-Str. 9
Calle Aduana, 29
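
As a quick check of Regex I, the pattern can be compiled with Python's `re` module and applied to a few of the lines above. The pattern is copied from this README and only split across string literals for readability; the helper name `match_regex_i` is hypothetical. Note that the `number` group has two alternatives: a number with an optional letter suffix, or a range such as "21 - 29".

```python
import re

# Regex I from the README, split into street / separator / number parts.
REGEX_I = re.compile(
    r"^(?P<street>[a-zA-ZäöüÄÖÜẞß]+(?:(?:\s|-)[a-zA-ZäöüÄÖÜẞß]+)*\.?)"
    r"\,?\s"
    r"(?P<number>(?:(?:\d)+\s?[a-zA-Z]{0,1})|(?:\d+\s?-\s?\d+))$"
)


def match_regex_i(address_line):
    """Return (street, number) if Regex I matches, otherwise None."""
    match = REGEX_I.match(address_line)
    if match is None:
        return None
    return match.group("street"), match.group("number")


for line in ("Blaufeldweg 123B", "Königsbrücker Str. 21 - 29",
             "Calle Aduana, 29", "200 Broadway Av"):
    print(line, "->", match_regex_i(line))
```

The last line starts with the number, so Regex I does not match it and the helper returns `None`.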

Regex II

^(?P<number>[\d]+),?\s(?P<street>[a-zA-Z]+(?:\s[a-zA-Z]+)*)$

It matches addresses in formats such as the following, among others:

4, rue de la revolution
200 Broadway Av
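
Regex II handles the number-first layout. The sketch below applies it with Python's `re` module; the helper name `match_regex_ii` is hypothetical. It also illustrates one consequence of the pattern as written: the street part only allows the ASCII letters `a-z` and `A-Z`, so a street name containing an accented character is not matched by this expression.

```python
import re

# Regex II from the README: house number first, then the street name.
REGEX_II = re.compile(
    r"^(?P<number>[\d]+),?\s(?P<street>[a-zA-Z]+(?:\s[a-zA-Z]+)*)$"
)


def match_regex_ii(address_line):
    """Return the named groups as a dict if Regex II matches, else None."""
    match = REGEX_II.match(address_line)
    return match.groupdict() if match else None


print(match_regex_ii("4, rue de la revolution"))
print(match_regex_ii("200 Broadway Av"))
# The accented "é" is outside [a-zA-Z], so this line is not matched.
print(match_regex_ii("4, rue de la révolution"))
```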

Regex III

^(?P<street>.+\d+)\s(?P<number>(?:No|no|nr|Nr)\.?\s\d+\s?[a-zA-Z]?)$

It matches addresses in formats such as the following, among others:

Calle 39 No 1540
Calle 39 Nr. 1540
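
The "try each regex in order and stop at the first match" strategy described in this section could be sketched as follows. The three patterns are taken from this README; the function name `parse_address_line`, the `PATTERNS` list, and the choice to return `None` when nothing matches are assumptions made for illustration and may differ from the project's actual code.

```python
import re

# The three expressions from the README, in evaluation order.
PATTERNS = [
    re.compile(
        r"^(?P<street>[a-zA-ZäöüÄÖÜẞß]+(?:(?:\s|-)[a-zA-ZäöüÄÖÜẞß]+)*\.?)"
        r"\,?\s"
        r"(?P<number>(?:(?:\d)+\s?[a-zA-Z]{0,1})|(?:\d+\s?-\s?\d+))$"
    ),
    re.compile(r"^(?P<number>[\d]+),?\s(?P<street>[a-zA-Z]+(?:\s[a-zA-Z]+)*)$"),
    re.compile(
        r"^(?P<street>.+\d+)\s(?P<number>(?:No|no|nr|Nr)\.?\s\d+\s?[a-zA-Z]?)$"
    ),
]


def parse_address_line(address_line):
    """Return the address components from the first matching regex."""
    for pattern in PATTERNS:
        match = pattern.match(address_line)
        if match:
            # Stop here: later patterns are not evaluated.
            return {
                "street": match.group("street"),
                "housenumber": match.group("number"),
            }
    return None  # Assumption: no pattern matched.


for line in ("Auf der Vogelwiese 23 b", "4, rue de la revolution",
             "Calle 39 No 1540", "no digits here"):
    print(line, "->", parse_address_line(line))
```

Because every pattern uses the same group names, `street` and `number`, the loop body does not need to know which expression matched.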

DEV-DEPENDENCIES

For development purposes (source code quality analysis, error detection, and running the tests), the following dependencies are required:

tox
flake8
pylint
mypy
pytest

RUN

Clone the project by executing the following command in a terminal:
git clone https://github.com/reynierg/addressline.git

Move to the project directory using:
cd addressline

Execute the following:
python3 bin/addressline.py [OPTIONS] ADDRESS_LINE

OPTIONS

-h, --help                  Print this help text and exit
-v, --verbose               Display verbose information about the program execution
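
A command line with these options could be declared with the standard `argparse` module, as in the sketch below. This README does not show how the project actually parses its arguments, so the function name `build_parser` and the help strings are hypothetical. `argparse` adds `-h, --help` automatically.

```python
import argparse


def build_parser():
    """Build a parser for: addressline.py [OPTIONS] ADDRESS_LINE."""
    parser = argparse.ArgumentParser(
        prog="addressline.py",
        description="Split an address line into street and house number.",
    )
    # Positional argument holding the raw address line.
    parser.add_argument("address_line", metavar="ADDRESS_LINE",
                        help="address line to parse")
    # Optional flag; False unless -v/--verbose is given.
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="display verbose information")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["-v", "Winterallee 3"])
print(args.verbose, args.address_line)
```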

EXAMPLES

I- Parse the address "Königsbrücker Str. 21 - 29":
python3 bin/addressline.py 'Königsbrücker Str. 21 - 29'

Output:

A regular expression matched the specified address line
Output: {'street': 'Königsbrücker Str.', 'housenumber': '21 - 29'}

II- Parse the address "4, rue de la revolution":
python3 bin/addressline.py '4, rue de la revolution'

Output:

A regular expression matched the specified address line
Output: {'street': 'rue de la revolution', 'housenumber': '4'}

RUN-TESTS

To run the tests, first install the development requirements:

Create and activate a virtual environment

From the project directory, execute the following commands:

python3 -m venv venv
. venv/bin/activate

Install the development dependencies

pip install -r dev-requirements.txt

Run the tests

Once the dependencies are installed, you only need to execute "tox"; it takes care of running "flake8", "pylint", "mypy", and the tests using "pytest":
tox

If the tests run successfully, HTML files with the test coverage report are created in a directory named "htmlcov", and a summary of the run appears in the terminal.
