This repository has been archived by the owner on Sep 23, 2019. It is now read-only.

Re-write to use beautiful soup and selenium #3

Open
taikedz opened this issue Mar 5, 2019 · 0 comments

Comments

@taikedz
Owner

taikedz commented Mar 5, 2019

A lot of sites now build their pages with JavaScript in the browser, so naive DOM scraping of the base HTML is no longer viable.

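The engine should be able to provide a post-render DOM object for querying. Hopefully Selenium (to render the page in a real browser) together with BeautifulSoup (to parse and query the rendered markup) is the solution to this problem.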
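A minimal sketch of what "post-render DOM for querying" could look like, assuming Selenium drives a headless Firefox and BeautifulSoup parses the result. The function name `rendered_soup` and the optional `driver` parameter are hypothetical, not part of this project, and pages that keep loading content after the initial load would probably need an explicit wait before reading `page_source`.

```python
from bs4 import BeautifulSoup


def rendered_soup(url, driver=None):
    """Load `url` in a browser and return the rendered DOM as a BeautifulSoup tree.

    `driver` is any object with Selenium's WebDriver interface (get, page_source,
    quit). If none is given, a headless Firefox is started and shut down here.
    """
    own_driver = driver is None
    if own_driver:
        # Imported lazily so a caller-supplied driver does not need Selenium here.
        from selenium import webdriver

        options = webdriver.FirefoxOptions()
        options.add_argument("-headless")  # assumption: Firefox and geckodriver are available
        driver = webdriver.Firefox(options=options)
    try:
        driver.get(url)
        # page_source is the browser's serialisation of the current DOM,
        # i.e. the markup after the page's scripts have run.
        return BeautifulSoup(driver.page_source, "html.parser")
    finally:
        if own_driver:
            driver.quit()
```

The returned object can then be queried as usual, for example `rendered_soup(url).find_all("a")`. Accepting an existing driver lets one browser session be reused across several pages.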

This also means that the solution will need to run inside a Python virtualenv when deployed.
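One possible way to set up that environment at deployment time, sketched with the standard-library `venv` module rather than the separate `virtualenv` tool. The helper name `create_env` and its parameters are hypothetical; `selenium` and `beautifulsoup4` are the PyPI package names, and a browser with a matching driver is assumed to be installed separately.

```python
import subprocess
import sys
import venv
from pathlib import Path


def create_env(env_dir, packages=("selenium", "beautifulsoup4"), with_pip=True):
    """Create a virtual environment at `env_dir` and return the path to its interpreter."""
    env_dir = Path(env_dir)
    venv.create(env_dir, with_pip=with_pip)

    # The interpreter lives in Scripts\ on Windows and bin/ elsewhere.
    if sys.platform == "win32":
        python = env_dir / "Scripts" / "python.exe"
    else:
        python = env_dir / "bin" / "python"

    # Install the scraping dependencies into the new environment, not the system Python.
    if with_pip and packages:
        subprocess.run([str(python), "-m", "pip", "install", *packages], check=True)
    return python
```

A deployment script could call `create_env(".venv")` once and then launch the scraper with the returned interpreter.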

https://medium.freecodecamp.org/better-web-scraping-in-python-with-selenium-beautiful-soup-and-pandas-d6390592e251
