JUCESP Robot Process Automation

This repository holds the code needed to run an automation robot that extracts company-related information from JUCESP (Junta Comercial do Estado de São Paulo).

Package Guidelines

Installation

Install the required dependencies using:

pip install -r requirements.txt

Configuration File

Please copy config.ini.example to config.ini and fill in your 2Captcha API key.
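As a rough sketch of how a script might read the key once config.ini is in place, the snippet below uses Python's standard configparser module. The section name `2captcha` and the option name `api_key` are assumptions made for illustration; check config.ini.example for the names actually used.

```python
import configparser


def load_api_key(path="config.ini", section="2captcha", option="api_key"):
    """Return the API key stored in an INI file.

    The default section and option names are hypothetical and may not
    match the ones used in config.ini.example.
    """
    parser = configparser.ConfigParser()
    # ConfigParser.read() silently skips missing files and returns the
    # list of files it managed to parse, so check the result explicitly.
    if not parser.read(path):
        raise FileNotFoundError(f"Could not read configuration file: {path}")
    key = parser.get(section, option).strip()
    if not key:
        raise ValueError(f"Empty '{option}' in section [{section}]")
    return key
```

Failing early on a missing file or an empty key avoids discovering the problem only when the first captcha request is rejected.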

Usage

Advanced Search

The first step is to perform the advanced search at JUCESP and extract its HTML content. To accomplish this step, one needs to use the following script:

python advanced_search.py -h

Note that -h prints the script's help message, which lists the available parameters and how to use them.

Parse Advanced Search

After conducting the search, one needs to parse the HTML into a CSV file holding each company's identifier and city. Please use the following script to do so:

python parse_advanced_search.py -h
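To illustrate the general idea of this step, the sketch below turns table rows from an HTML page into CSV text using only the Python standard library. It is not the code in parse_advanced_search.py: the assumption that results sit in `<tr>`/`<td>` table rows, the column positions `id_col` and `city_col`, and the names `TableRowParser` and `html_to_csv` are all hypothetical.

```python
import csv
import io
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the current cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None and self._row is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:  # rows without <td> cells (e.g. headers) are skipped
                self.rows.append(self._row)
            self._row = None
            self._cell = None


def html_to_csv(html, id_col=0, city_col=1):
    """Return CSV text with one (identifier, city) line per table row.

    The column positions are assumptions, not the real JUCESP layout.
    """
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["identifier", "city"])
    for row in parser.rows:
        if len(row) > max(id_col, city_col):
            writer.writerow([row[id_col], row[city_col]])
    return out.getvalue()
```

Rows with too few cells are dropped rather than raising an error, which keeps a single malformed row from aborting the whole conversion.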

Parse Company Information

Finally, the HTML of every company will be dumped to the companies/ folder. One can use the following script to parse this information into a readable CSV file:

python parse_company_info.py -h

Bash Script

Instead of invoking each script separately to conduct the automation, it is also possible to use the provided shell script, as follows:

./pipeline.sh

This script runs every step of the automation process. Furthermore, one can change any input argument that is defined in the script.
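Where a POSIX shell is not available, the same idea of running the steps in order can be sketched in Python. This is not a translation of pipeline.sh: the helper name `run_pipeline` is hypothetical, and the arguments each script needs are deliberately left to the caller (see each script's -h output).

```python
import subprocess
import sys


def run_pipeline(steps, runner=subprocess.run):
    """Run each command in order and stop at the first failure.

    `steps` is a list of argument lists, one per script, such as
    [sys.executable, "advanced_search.py", <arguments shown by -h>].
    `runner` defaults to subprocess.run and can be swapped for testing.
    Returns the number of steps that completed successfully.
    """
    done = 0
    for cmd in steps:
        result = runner(cmd)
        if result.returncode != 0:
            # Later steps depend on earlier output, so do not continue.
            print(f"Step failed: {' '.join(cmd)}", file=sys.stderr)
            break
        done += 1
    return done
```

Stopping at the first non-zero exit code matters here because each parsing step consumes the files produced by the step before it.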

Support

We do our best, but mistakes are inevitable. If you ever need to report a bug or a problem, or simply want to talk to us, please do so! We will do our best to respond through this repository.
