TAGsieve is a simple HTML/XML stripper GUI application, written in Python and based on the HTML sanitizer bleach. It strips or batch strips tags from .html, .htm, or .xml files, ignoring specified tags and attributes by means of whitelisting.
It can process a single file or a whole directory of files.
Like bleach, it works with tag, attribute, and style whitelists: whitelisted tags, attributes, and styles are kept, and everything else is stripped. The "Clean" button cleans the file or directory of files.
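To illustrate the whitelist idea, here is a minimal sketch that uses only the Python standard library. It is not TAGsieve's or bleach's actual implementation. The names `WhitelistStripper`, `strip_tags`, and `clean_directory` are hypothetical, and writing the results to a separate output directory is an assumption made for the example. Style whitelisting is left out.

```python
import html
from html.parser import HTMLParser
from pathlib import Path


class WhitelistStripper(HTMLParser):
    """Illustrative only: keeps whitelisted tags and attributes, drops other markup."""

    def __init__(self, allowed_tags, allowed_attrs):
        # Keep entities such as &amp; untouched instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.allowed_tags = set(allowed_tags)
        self.allowed_attrs = set(allowed_attrs)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag not in self.allowed_tags:
            return  # tag is stripped, its text content is still kept
        kept = "".join(
            f' {name}="{html.escape(value, quote=True)}"'
            for name, value in attrs
            if name in self.allowed_attrs and value is not None
        )
        self.parts.append(f"<{tag}{kept}>")

    def handle_endtag(self, tag):
        if tag in self.allowed_tags:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        self.parts.append(data)

    def handle_entityref(self, name):
        self.parts.append(f"&{name};")

    def handle_charref(self, name):
        self.parts.append(f"&#{name};")


def strip_tags(markup, allowed_tags=(), allowed_attrs=()):
    """Return markup with every non-whitelisted tag and attribute removed."""
    parser = WhitelistStripper(allowed_tags, allowed_attrs)
    parser.feed(markup)
    parser.close()
    return "".join(parser.parts)


def clean_directory(src_dir, out_dir, allowed_tags=(), allowed_attrs=()):
    """Batch version: clean each .html, .htm, or .xml file in src_dir into out_dir."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    cleaned = []
    for path in sorted(Path(src_dir).iterdir()):
        if path.is_file() and path.suffix.lower() in {".html", ".htm", ".xml"}:
            text = path.read_text(encoding="utf-8")
            result = strip_tags(text, allowed_tags, allowed_attrs)
            (out / path.name).write_text(result, encoding="utf-8")
            cleaned.append(path.name)
    return cleaned
```

For example, `strip_tags('<div class="x"><p id="a">Hi <b>there</b></p></div>', {"p"}, {"id"})` keeps only the `p` tag and its `id` attribute and returns `<p id="a">Hi there</p>`.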
Install Python on your machine using the Anaconda distribution, which includes the PyQt package needed for the GUI.
After installing the distribution, make sure PyQt is present. The following command installs it if it is missing:
$ conda install pyqt
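To confirm from within Python that the package can be imported, a small check like the following may help. This is a sketch: the helper name `is_installed` is hypothetical, and the module name `PyQt5` is an assumption, since the import name depends on which PyQt version your conda environment provides.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "PyQt5" is an assumed module name; adjust it to your PyQt version.
    print("PyQt5 importable:", is_installed("PyQt5"))
```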
Then install TAGsieve:
$ pip install TAGsieve
This should also install TAGsieve's requirements. Start the program from a terminal by running:
$ python -m TAGsieve