Process a list of links
For the collection and filtering of links see this tutorial and this blog post.
Two command-line options are required here: -i for the input list and -o for the output directory.
The input list is read sequentially. Only lines beginning with a valid URL are processed; anything else in the file is ignored.
The output directory is created on demand if it does not exist, but its location must be writable.
$ trafilatura -i list.txt -o txtfiles # output as raw text
$ trafilatura --xml -i list.txt -o xmlfiles # output in XML format
The second command creates a collection of XML files, which can be edited with a basic text editor, a full-fledged text-editing package, or an IDE such as Atom.
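To check a list before running the commands above, the two requirements (lines that begin with a valid URL, and a writable output directory) can be approximated in a few lines of Python. This is a minimal sketch using only the standard library, not trafilatura's own code: the helper names `looks_like_url`, `read_url_lines` and `ensure_output_dir` are hypothetical, and the notion of a "valid URL" used here (an http or https scheme plus a host) is an assumption that may be stricter or looser than what the tool actually accepts.

```python
import os
from pathlib import Path
from urllib.parse import urlparse


def looks_like_url(token):
    """Rough validity check (assumption): http(s) scheme and a non-empty host."""
    try:
        parsed = urlparse(token)
    except ValueError:
        # urlparse rejects some malformed inputs, e.g. unbalanced brackets.
        return False
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def read_url_lines(list_path):
    """Return the leading URL of every line that begins with one."""
    urls = []
    with open(list_path, encoding="utf-8") as handle:
        for line in handle:
            parts = line.split()
            # Skip blank lines and lines whose first token is not a URL.
            if parts and looks_like_url(parts[0]):
                urls.append(parts[0])
    return urls


def ensure_output_dir(dir_path):
    """Create the directory if it is missing and confirm it is writable."""
    path = Path(dir_path)
    path.mkdir(parents=True, exist_ok=True)
    if not os.access(path, os.W_OK):
        raise PermissionError(f"{path} is not writable")
    return path
```

Calling `read_url_lines("list.txt")` shows which entries would survive this filter, and `ensure_output_dir("txtfiles")` fails early if the target location cannot be written to.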