ScraPoi

ScraPoi is a simple web crawler built on Scrapy that extracts point-of-interest (POI) information for any country in the world. The results are presented as a web page with a map and a list of POIs with their descriptions.

The POI data is extracted from the MiNube UK site (http://www.minube.co.uk/), a travel website with POIs and descriptions for each country in the world.

Installation

To install this project, clone it and set up a Python 2.7 virtual environment with the libraries listed in requirements.txt. To install them, run:

pip install -r requirements.txt

Usage

After installing the requirements, you can run the scrapy command. The country to crawl is passed as a parameter, like this:

scrapy crawl minube -a country="ireland"

The result is a simple HTML file called index.html that presents the crawled data: a title, a map and a list of POIs.
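To give a rough idea of how crawled data could end up in an HTML page, here is a minimal sketch. It is not the project's actual rendering code: the `PAGE` template, the `render_index` function and the `name`/`description` fields are all hypothetical, the map is left out, and the sketch uses the Python 3 standard library rather than the Python 2.7 environment described above.

```python
from html import escape
from string import Template

# Hypothetical page template; the project's real one lives in index_template.html.
PAGE = Template(
    "<html><head><title>$title</title></head>"
    "<body><h1>$title</h1><ul>$items</ul></body></html>"
)


def render_index(country, pois):
    """Return an HTML page listing each POI's name and description.

    `pois` is assumed to be a list of dicts with "name" and "description" keys.
    """
    items = "".join(
        "<li><strong>{}</strong>: {}</li>".format(
            escape(poi["name"]), escape(poi["description"])
        )
        for poi in pois
    )
    title = "Points of interest in {}".format(escape(country.title()))
    return PAGE.substitute(title=title, items=items)
```

Escaping each field before it is placed in the page keeps text scraped from a third-party site from being interpreted as markup.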

Notes

For the country parameter, it is important to use the same country name as the minube website; otherwise a 404 error may be raised during crawling. Generally these names match the English names of the countries: spain, ireland, uruguay, germany, cuba, italy, turkey, greece, bulgaria...
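If you launch the crawl from a script, a small wrapper can tidy the country name before it reaches scrapy. The sketch below is only an illustration: `build_crawl_command` and `run_crawl` are hypothetical helpers, and lowercasing the name is an assumption based on the lowercase examples above, not a documented rule of the minube site.

```python
import subprocess


def build_crawl_command(country):
    """Build the scrapy command line for one country.

    The name is trimmed and lowercased (an assumption, see the note above).
    """
    name = country.strip().lower()
    if not name:
        raise ValueError("country must not be empty")
    return ["scrapy", "crawl", "minube", "-a", "country={}".format(name)]


def run_crawl(country):
    """Run the crawl from the project directory and return scrapy's exit code."""
    return subprocess.call(build_crawl_command(country))
```

Calling `run_crawl("Ireland")` from the project directory would then run the same command shown in the Usage section. The wrapper cannot tell whether the site knows the name, so a mistyped country can still produce a 404.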
