strip-tags 0.6. It's been a while since I updated this tool, but in investigating a tricky mistake in my tutorial for LLM schemas I discovered a bug that I needed to fix.
Those release notes in full:
- Fixed a bug where strip-tags -t meta still removed <meta> tags from the <head> because the entire <head> element was removed first. #32
- Kept <meta> tags now default to keeping their content and property attributes.
- The CLI -m/--minify option now also removes any remaining blank lines. #33
- A new strip_tags(remove_blank_lines=True) option can be used to achieve the same thing with the Python library function.
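To make the behaviour in those notes concrete, here is a minimal standard-library sketch of the idea: keep <meta> tags (with only their content and property attributes) even though the rest of the <head> is discarded, then drop blank lines. This is not how strip-tags is implemented. The names MetaKeepingStripper and strip_keep_meta are hypothetical and exist only for this illustration.

```python
import html
from html.parser import HTMLParser

# Attributes retained on kept <meta> tags, per the release notes above.
KEPT_META_ATTRS = ("content", "property")
# Elements whose text is dropped in this toy version (an assumption for the sketch).
SKIP_TEXT_IN = {"head", "script", "style"}


class MetaKeepingStripper(HTMLParser):
    """Hypothetical toy stripper: removes every tag except <meta>."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0  # > 0 while inside head/script/style

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            # Emit <meta> even inside <head>, keeping only the allowed attributes.
            kept = "".join(
                f' {name}="{html.escape(value, quote=True)}"'
                for name, value in attrs
                if name in KEPT_META_ATTRS and value is not None
            )
            self.parts.append(f"<meta{kept}>\n")
        elif tag in SKIP_TEXT_IN:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIP_TEXT_IN and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        # Text is kept only when we are outside head/script/style.
        if not self.skip_depth:
            self.parts.append(data)


def strip_keep_meta(html_text, remove_blank_lines=True):
    parser = MetaKeepingStripper()
    parser.feed(html_text)
    parser.close()
    text = "".join(parser.parts)
    if remove_blank_lines:
        # Drop lines that are empty or contain only whitespace.
        text = "\n".join(line for line in text.splitlines() if line.strip())
    return text


SAMPLE = """<html>
<head>
<title>Example</title>
<meta property="og:title" content="Example article" class="x">
</head>
<body>
<h1>Heading</h1>

<p>Body text.</p>
</body>
</html>"""

print(strip_keep_meta(SAMPLE))
```

The point of the sketch is the ordering: the <meta> tag is emitted as it is encountered, so discarding the rest of the <head> cannot take it along, which is the failure described in #32.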
Now I can do this and persist the <meta> tags for the article along with the stripped text content:
curl -s 'https://apnews.com/article/trump-federal-employees-firings-a85d1aaf1088e050d39dcf7e3664bb9f' | \
strip-tags -t meta --minify
Here's the output from that command.