This is a trial of using a Chrome browser extension to find CSS selectors for extracting information elements from web pages.
The CSS selectors are generated using css-selector-generator.min.js from here.
Drag the chrome-ext-selector directory onto the browser's Extensions page (which you can open from the previous link or with menu > More tools > Extensions).
Intended usage:
- open web page of interest
- click on the extension icon to show its popup
- in the popup select the name of the information element you wish to extract (e.g. Price)
- in the page of interest, click on an example of the information element (e.g. a price). While the popup is active, links in the page are disabled, so a link can be selected without opening it.
- the popup displays a CSS selector that uniquely selects the chosen element, and the element matching the selector is shown with a red border
- manually edit the selector in the popup, for example to simplify it or to generalise it so that it matches multiple prices, then move focus away from the edited field (e.g. select another editable item in the popup)
- elements matching the edited selector are shown with a red border
- once you are happy with the selector, it can be used in a crawler/parser to extract information automatically in bulk
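The highlighting and bulk-extraction steps above can be sketched roughly as follows. This is a minimal illustration, not the extension's actual code: the function names `highlightMatches` and `extractText` and the `.price` selector are hypothetical, and an outline is used in place of a border on the assumption that it avoids shifting the page layout.

```javascript
// Hypothetical helpers; the names are illustrative and not taken from the extension.

// Outline every element under `root` that matches `selector`.
// Returns the number of matches, or -1 if the selector is not valid CSS.
function highlightMatches(root, selector) {
  let matches;
  try {
    matches = root.querySelectorAll(selector);
  } catch (err) {
    // querySelectorAll throws a SyntaxError on an invalid selector,
    // which is easy to trigger while hand-editing one.
    return -1;
  }
  matches.forEach((el) => {
    // Assumption: an outline stands in for the red border described above.
    el.style.outline = '2px solid red';
  });
  return matches.length;
}

// Collect the trimmed text of every matching element, as a crawler/parser might.
function extractText(root, selector) {
  return Array.from(root.querySelectorAll(selector), (el) =>
    el.textContent.trim()
  );
}
```

In a page this might be called as `highlightMatches(document, '.price')` to check a hand-edited selector, followed by `extractText(document, '.price')` once the matches look right.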
At the moment it only works while the extension's popup is open in Developer Tools:
- click on the extension icon to show its popup
- right-click on the extension icon > Inspect popup
The popup's JavaScript may need to be moved to a background page to fix this.
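One plausible explanation is that Chrome closes a popup, and unloads its scripts, when it loses focus, whereas inspecting the popup keeps it open. Under that assumption, a background script could hold the selection state instead. The sketch below is only an illustration: the file name `background.js`, the message types `selector-picked` and `get-selection`, and the function `createSelectionStore` are all hypothetical.

```javascript
// background.js (hypothetical file name)

// Build a message handler that remembers the most recent selection,
// so the state outlives the popup window.
function createSelectionStore() {
  let latest = null;
  return function handleMessage(message, sender, sendResponse) {
    if (message.type === 'selector-picked') {
      // Sent by the page script when the user clicks an element (assumed message shape).
      latest = { field: message.field, selector: message.selector };
      sendResponse({ ok: true });
    } else if (message.type === 'get-selection') {
      // Sent by the popup when it opens, to restore its display.
      sendResponse(latest);
    }
  };
}

// Register the handler only when running inside an extension,
// so the file can also be loaded outside the browser.
if (typeof chrome !== 'undefined' && chrome.runtime && chrome.runtime.onMessage) {
  chrome.runtime.onMessage.addListener(createSelectionStore());
}
```

The popup would then request the state with `chrome.runtime.sendMessage({ type: 'get-selection' }, callback)` each time it opens. This in-memory approach assumes a persistent background page; if the background runs as a Manifest V3 service worker, which can be shut down when idle, the state would need to go into `chrome.storage` instead.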