pup is a command-line tool for processing HTML. It reads from stdin, prints to stdout, and lets the user filter parts of the page using CSS selectors. Inspired by jq, pup aims to be a fast and flexible way of exploring HTML from the terminal.

If you have Go installed, you can install pup with go get. On OS X, you can install it with Homebrew instead (no Go required).

By default, pup fills in missing tags and properly indents the page. CSS selectors include a group of specifiers called "pseudo classes", and pup implements a majority of the relevant ones. When selectors are combined, the HTML nodes selected by one selector are passed to the next. Non-HTML selectors that affect the output type are implemented as functions, which can be provided as the final argument. One such function prints the values of all attributes with a given key from all selected nodes.

Features

  • Clean and indent
  • Filter by tag
  • Filter by id
  • Filter by attribute
  • Pseudo classes
  • Print HTML as JSON
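
The features above can be tried with a short session like the following. This is a minimal sketch, not an authoritative reference: the sample page, its ids, classes, and URLs are made up for illustration, and the queries only run if pup is already installed and on the PATH.

```shell
#!/bin/sh
# Hypothetical sample page; its contents are invented for this illustration.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<html><head><title>Sample</title></head>
<body>
<div id="main">
<a class="ext" href="https://example.com/a">First</a>
<a href="https://example.com/b">Second</a>
</div>
</body></html>
EOF

# Run the queries only when pup is available.
if command -v pup >/dev/null 2>&1; then
  pup < "$sample"                    # clean and indent the whole page
  pup 'title' < "$sample"            # filter by tag
  pup '#main' < "$sample"            # filter by id
  pup 'a[class="ext"]' < "$sample"   # filter by attribute
  pup 'a:first-child' < "$sample"    # pseudo class
  pup 'div#main a' < "$sample"       # combined selectors: links inside div#main
  pup 'a attr{href}' < "$sample"     # function: print the href value of each link
  pup 'a json{}' < "$sample"         # function: print the selected nodes as JSON
else
  echo "pup not found; skipping queries" >&2
fi
```

Each query reads the page from stdin, so the same selectors work on the output of any command that prints HTML.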

License

MIT License

Additional Project Details

Operating Systems

Mac, Windows

Registered

2021-09-03