SOL-001

CAPCAT TUI/CLI ETHICAL WEB ARCHIVING UTILITY

Product Design, UX Research, Context Engineering, MVP, Brand Design, Illustration

Role

Product Designer and MVP Developer

Timeline

2.5 months, end of 2025 - 1.5 months, start of 2026

Team

Solo project

Summary

Capcat is a dual-mode command-line tool designed for ethical web scraping and content preservation. It allows users to archive articles from various online sources into a local, searchable library, ensuring content remains accessible even if the original websites go offline.

Core Functionality:

Capcat operates in two distinct modes to suit different workflows:

CLI Mode: A fast, scriptable interface optimized for power users, automation, cron jobs, and integration into existing technical workflows.

TUI Mode: A visual, guided exploration mode that allows users to discover sources and test workflows without needing to memorize commands.

The tool fetches content from over 12 pre-configured sources (such as Hacker News, BBC, The Guardian, and IEEE Spectrum) or custom RSS feeds.

Capcat is a working open-source tool, live at capcat.org.

Outside the main product goal, this project gave me the ideal opportunity to learn LLM-based generative coding with context engineering and spec-driven development. (Technically: context engineering is the practice of structuring what an LLM receives before it generates. The architecture of the input determines the quality of the output.)

I used clig.dev to set the standard for the CLI experience. The TUI follows strict heuristic rules. Two interfaces run on one shared backend. The CLI takes flags, pipes, and scripts. The TUI walks users through a visual menu. Both produce the same output. Ethical scraping is integrated into the system architecture as a constraint.

Tools: Claude Code, Gemini, Figma, Drawio, Affinity Designer, Procreate.

Process: UX research, JTBD, heuristic analysis, product design, illustration, hand drawing, PRD, TDD, spec-driven implementation, and context engineering in iterative cycles.

Download a short version of the Case Study below:

Capcat Case Study Short 2.6 MB

Update: User testing surfaced two unmet needs: portable installation and Obsidian integration. I shipped a solution for both. I replaced the bash wrapper with pipx install capcat, added a full YAML configuration abstraction layer that gives users control over themes and sources, and added Obsidian frontmatter and back-linking for specific sources.
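To make the Obsidian part concrete, here is a minimal sketch of a note with YAML frontmatter and a back-link. It is not taken from Capcat's code: the function name obsidian_note, the property names (title, source, url, captured), and the idea of linking back to a per-source index note are all assumptions made for illustration. Only the frontmatter delimiters (---) and the [[wikilink]] syntax are Obsidian conventions.

```python
import json
from datetime import date


def obsidian_note(title: str, url: str, source: str, captured: date, body: str) -> str:
    """Build a Markdown note with YAML frontmatter and a back-link (hypothetical layout)."""
    frontmatter = [
        "---",
        # json.dumps yields a double-quoted string, which is also valid YAML.
        f"title: {json.dumps(title)}",
        f"source: {json.dumps(source)}",
        f"url: {json.dumps(url)}",
        f"captured: {captured.isoformat()}",  # ISO date is an assumption
        "---",
    ]
    # A wikilink to a note named after the source gives Obsidian a back-link target.
    backlink = f"Source index: [[{source}]]"
    return "\n".join(frontmatter + ["", body, "", backlink]) + "\n"


if __name__ == "__main__":
    print(obsidian_note(
        title="Why Pages Disappear",
        url="https://example.com/article",
        source="Hacker News",
        captured=date(2026, 1, 5),
        body="Article text goes here.",
    ))
```

Quoting every string value keeps titles with colons or quotes from breaking the frontmatter block.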

Download the Flow maps for the change below:

Capcat V2 Flows 2.7 MB

Capcat Full Demo

Challenge and Context

I read across several sources a week and keep what matters in all the popular tools. Over the years, this built several collections with no connection between them.

I searched online for things I had already read. Pages disappeared. Browser bookmarks took one click to save, but told me nothing about the broader context and time.

When I needed something from my own archive, I could not find it without serious effort. Time that should have gone to reading went to searching.

Every tool I tried solved one of the UX problems. None solved capturing and categorizing.

The task consisted of four things that had to work together:

archive content before it disappears
catalogue it so search works
find it on demand
share it without depending on an external service.

User need:

Permanent, searchable, knowledge archive

Current solution space:

Search online
Browser bookmarks
Pocket/Instapaper
Manual copying and pasting
Evernote and Notion

Defined pain points:

Data locked in
Unsorted links
Manual curation
Disappearing content

JTBD Problem Space 375 KB

Discovery and User Insights

I acted as the primary research subject. I mapped my own workflow before the design process started. A heuristic analysis using Nielsen's 10 principles and Laws of UX followed.

Before building, I ran informal conversations with three people in my network: a developer, a researcher, and an ML engineer. I asked each of them to walk me through how they save and retrieve things they read online. I did not describe the problem or the solution. Two of them independently mentioned losing content when the original page went down. That confirmed the core job-to-be-done before building a design solution and writing a PRD.

That process transformed how I saw the problem. Saving articles became only one of the goals. The broader product goal was owning a permanent archive. The actual work: organizing, searching, and getting content back out in a format that works.

The same problem belongs to four groups of people:

Design engineers building information pipelines want to automate content collection so they can focus on systems, not manual gathering.
Developers maintaining technical archives want searchable local copies so they can find reference material without depending on a website staying up.
ML engineers tracking research across sources want structured, processable output so they can feed articles into tools and agents.
CLI-familiar users who want speed and control also want flags and scripts so they can skip the GUI entirely.

All four share the same gap: no reliable path from intent to content.

JTBD to Design Decisions 312 KB

Information Architecture and Strategy

The first version of Capcat shipped as a single scraping command, ./scangrab fetch hn, which pulled articles from Hacker News and saved them as Markdown. It worked, but it had no error handling and no way to add new sources without editing code.

That first version shipped under a different name: ScanGrab.

I stopped, ran a full JTBD and heuristic analysis, wrote a minimal PRD, and restarted. The name changed because the thinking changed. The new name, Cap(ture)Cat(alogue), encoded the two core jobs the tool actually performed.

The PRD locked three non-negotiable constraints:

1. The archive belongs to the user, not a service. Technically: all output is delivered as self-contained files with no server dependency. I can open the Capcat archive on a different machine years from now, and every article and document will remain accessible.

2. The tool engages users at their current level, offering guided or direct paths. Technically: a visual step-by-step TUI and a flag-based CLI on one shared backend.

3. Ethical scraping - built in, not opted in. Technically: the EthicalScrapingManager sits at the architecture level. Every source passes through it before touching the network.

I evaluated every feature request during development against these three. I did not build a GUI because a quality native implementation required Swift, which is outside the current stack, and the TUI already covered the guided-experience job.

The CLI came first because the target users live in the terminal. DX requirements covered:

documentation plugin architecture
automation support
HTML generation with easy theming
self-contained output for sharing
Markdown structured for Obsidian and LLM agents

An expert user who learns the commands can type ./capcat bundle tech --count 10 --html and get things done in three steps.

But not everyone who needs this tool thinks in flags and arguments. Once the CLI tested as stable, I had to decide how the TUI interactive mode should relate to it.

A wrapper sitting on top of the CLI would inherit its vocabulary. That works for someone who already knows the commands. A novice user does not know the commands. They need to see their options, follow guided steps, and have error prevention built into the flow.

One interface trying to serve both would compromise both. I built the TUI as a separate surface on the same processing core. The novice path walks through eight steps: launch, see options, select, choose a bundle, decide on HTML, review, confirm, execute. The expert path takes three. Both produce identical files in the same folder structure.
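The "two surfaces, one core" idea can be sketched in a few lines. This is an illustration under stated assumptions, not Capcat's actual code: FetchRequest, run_fetch, request_from_cli, and request_from_menu are hypothetical names, and the default count of 10 is assumed. The point is only that both paths build the same request object before any work happens.

```python
import argparse
from dataclasses import dataclass
from typing import Mapping, Sequence


@dataclass(frozen=True)
class FetchRequest:
    """One unit of work, independent of which interface produced it."""
    bundle: str
    count: int
    html: bool


def run_fetch(request: FetchRequest) -> str:
    """Stand-in for the shared processing core; here it only describes the job."""
    suffix = " + HTML" if request.html else ""
    return f"bundle={request.bundle} count={request.count}{suffix}"


def request_from_cli(argv: Sequence[str]) -> FetchRequest:
    """Expert path: flags are parsed straight into a request."""
    parser = argparse.ArgumentParser(prog="capcat")
    commands = parser.add_subparsers(dest="command", required=True)
    bundle = commands.add_parser("bundle")
    bundle.add_argument("name")
    bundle.add_argument("--count", type=int, default=10)  # default is an assumption
    bundle.add_argument("--html", action="store_true")
    args = parser.parse_args(argv)
    return FetchRequest(bundle=args.name, count=args.count, html=args.html)


def request_from_menu(answers: Mapping[str, object]) -> FetchRequest:
    """Novice path: answers collected step by step by the menu."""
    return FetchRequest(
        bundle=str(answers["bundle"]),
        count=int(answers.get("count", 10)),
        html=bool(answers["html"]),
    )


if __name__ == "__main__":
    cli = request_from_cli(["bundle", "tech", "--count", "10", "--html"])
    tui = request_from_menu({"bundle": "tech", "html": True})
    print(run_fetch(cli))
    print(cli == tui)
```

Because the core only ever sees a FetchRequest, neither surface can drift into producing different output for the same choices.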

Archive structure as information architecture.

The folder hierarchy matches the way a user thinks about their own content. Batch fetches land in News/news_DD-MM-YYYY/, organized by date at the top level and by source below it. Single article captures land in Capcats/DD-MM-YYYY-Article-Title/, with the date prefix recording the capture day regardless of source. Within each article folder, the structure is consistent: article.md and article.html for the content, comments.md and comments.html for the discussion, an images/ sub-folder for downloaded media, and a pdfs/ sub-folder when applicable.

The naming conventions carry meaning intentionally. The date format is DD-MM-YYYY rather than YYYY-MM-DD because the archive serves humans every day. Article titles in folder names are slugified but kept readable, so a user browsing the archive in Finder or a file manager can identify content without opening anything.
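The naming scheme described above can be sketched as three small helpers. The exact slug rules Capcat uses are not documented here, so slugify below is an assumption (ASCII letters and digits joined with hyphens, case preserved), and the helper names are hypothetical. captured_on reads the date prefix back, which is useful for a script that wants to order folders by capture date.

```python
import re
from datetime import date, datetime
from pathlib import Path


def slugify(title: str) -> str:
    """Keep ASCII letters and digits, join words with hyphens (assumed rule)."""
    return "-".join(re.findall(r"[A-Za-z0-9]+", title))


def batch_dir(root: Path, day: date) -> Path:
    """Folder for a batch fetch: News/news_DD-MM-YYYY."""
    return root / "News" / f"news_{day:%d-%m-%Y}"


def article_dir(root: Path, day: date, title: str) -> Path:
    """Folder for a single capture: Capcats/DD-MM-YYYY-Article-Title."""
    return root / "Capcats" / f"{day:%d-%m-%Y}-{slugify(title)}"


def captured_on(folder_name: str) -> date:
    """Parse the DD-MM-YYYY prefix of a capture folder name back into a date."""
    return datetime.strptime(folder_name[:10], "%d-%m-%Y").date()


if __name__ == "__main__":
    root = Path("archive")
    print(batch_dir(root, date(2026, 1, 5)).as_posix())
    print(article_dir(root, date(2026, 1, 5), "Why Pages Disappear!").as_posix())
    names = ["01-02-2026-Later-Article", "15-01-2026-Earlier-Article"]
    # Sorting on the parsed date gives capture order; plain string sorting
    # would compare the day digits first.
    print(sorted(names, key=captured_on))
```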

This matters because tools other than Capcat can access the archive. A user might browse it in their file manager, search it with a script, open it in Obsidian, or index it with a local search tool. The folder structure has to be legible in all of these contexts without Capcat acting as an intermediary. The IA goal was a self-documenting archive, in which the structure itself tells you what is in it and when it arrived.

Dual Interface Architecture - Mental Models 256 KB

Dual Interface Architecture 277 KB

Iterative Design and Testing

Before writing and generating any code, I designed the CLI vocabulary. Every command follows a verb-first structure: fetch, bundle, single, list, catch.

I wanted someone who has used git or docker to open Capcat and already have a rough sense of how it works. The reference standard is clig.dev, which documents what human-centered command-line design looks like in practice.

The harder problem is what happens when someone opens the tool and does not know what to type. In a GUI, there is a menu or a home screen. In a terminal, there is a blinking cursor and nothing else. An uncertain user has nowhere obvious to go. I needed to solve that before building the TUI.

From the Application folder, ./capcat catch launches the full interactive menu. Six options appear on the screen. Arrow keys navigate. Every sub-menu carries an explicit "Back to Main Menu" option. During any text input, Ctrl+C returns to the menu instead of crashing the program. I did not want a single moment where someone feels stuck.
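The "Ctrl+C returns to the menu" behavior maps onto a familiar Python pattern: catching KeyboardInterrupt around the prompt instead of letting it end the process. The sketch below assumes that pattern; ask_url and single_article_flow are hypothetical names, and the returned strings only stand in for "go back to the main menu" and "start the fetch".

```python
from typing import Callable, Optional


def ask_url(read_line: Callable[[str], str] = input) -> Optional[str]:
    """Return the typed URL, or None when the user presses Ctrl+C."""
    try:
        return read_line("Please enter the article URL: ").strip()
    except KeyboardInterrupt:
        # Ctrl+C during text input is treated as "take me back", not as a crash.
        return None


def single_article_flow(read_line: Callable[[str], str] = input) -> str:
    """Decide the next screen after the URL prompt."""
    url = ask_url(read_line)
    if url is None:
        return "main-menu"
    return f"fetch:{url}"


if __name__ == "__main__":
    print(single_article_flow())
```

Passing the reader in as a parameter keeps the exit path testable without a real terminal.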

H3, User Control and Freedom: Users often perform actions by mistake. They need a marked "emergency exit" to leave the unwanted action without having to go through an extended process.

The same principle shaped the relationship between the two interfaces: neither locks the user in, both are independently complete, and they are separate tools that share a backend rather than modes you switch between mid-task.

The TUI follows H1 and H6. Every fetch operation reports progress as it runs, so the user always knows what the system is doing. The main menu shows all six paths on launch, so nothing requires memorization. Arrow keys move between options. Space toggles check boxes. Enter confirms. The interaction vocabulary is small enough to learn in one session.

H1, Visibility of System Status: The design should always keep users informed about what is going on, through appropriate feedback within a reasonable amount of time.

H6, Recognition Rather Than Recall: Minimize the user's memory load by making elements, actions, and options visible. The user should not have to remember information from one part of the interface to another.

On the CLI side I utilized H7. An expert user who types ./capcat fetch hn,bbc --count 20 --html gets the same output as someone who walked through every TUI step. The CLI is the accelerator layer that the novice never needs to see, but the power user depends on.

H7, Flexibility and Efficiency of Use: Shortcuts - hidden from novice users - may speed up the interaction for the expert user so that the design can cater to both inexperienced and experienced users. Allow users to tailor frequent actions.

The --html flag generates self-contained HTML files with embedded CSS and JavaScript. Each article page includes syntax highlighting for code blocks, dark and light theme support, responsive layout, breadcrumb navigation, and back/forward links between articles. The templates work with a minimal design system using CSS variables, so changing the look of every generated page takes one file edit. The HTML is self-contained, meaning you can send an article folder to someone, and they can open it in a browser with no dependencies, no server, no internet connection.
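A minimal sketch of what "self-contained" means in practice: the stylesheet is written into the page inside a style element rather than linked, so the file has nothing to fetch. The template, the function name render_article, and the variable names and color values in the CSS are illustrative assumptions, not Capcat's real template or palette.

```python
from html import escape

# Hypothetical design tokens; changing a variable here restyles every generated page.
DESIGN_SYSTEM_CSS = """
:root { --ink: #1a1a1a; --paper: #fdf8f0; --accent: #e8590c; }
body { color: var(--ink); background: var(--paper); max-width: 70ch; margin: 0 auto; }
a { color: var(--accent); }
"""


def render_article(title: str, body_html: str, css: str = DESIGN_SYSTEM_CSS) -> str:
    """Return one HTML document with the stylesheet embedded, no external requests."""
    return (
        "<!DOCTYPE html>\n"
        '<html lang="en">\n<head>\n<meta charset="utf-8">\n'
        f"<title>{escape(title)}</title>\n"
        f"<style>{css}</style>\n"
        "</head>\n"
        f"<body>\n<h1>{escape(title)}</h1>\n{body_html}\n</body>\n</html>\n"
    )


if __name__ == "__main__":
    print(render_article("Ethics & Archives", "<p>Article text.</p>"))
```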

CLI Command Decision Tree 382 KB

Interactive Menu Decision Tree 340 KB

Heuristic-Driven Design Process 431 KB

The Final Solution

Bundles compact many sources into one choice: predefined source groups organized by category (tech, techpro, news, science, AI, sports). I pick a bundle, choose whether to generate HTML, review a summary, confirm, and execute. Output lands in ../News/news_DD-MM-YYYY/, organized by source and date. The list flow works the same way but lets me hand-pick individual sources with check boxes instead of committing to a full bundle.

The single article path handles any URL. I paste it and the system figures out what to do. If the URL matches a known source, Capcat uses that source's config. If it matches a platform like Medium or Substack, a dedicated handler checks for paywall content and falls back when needed. If the URL is unknown, a generic scraper auto-detects selectors. The user does not see any of this. Output always lands in ../Capcats/DD-MM-YYYY-Article-Title/ as clean Markdown with downloaded images, self-contained and ready to share.
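The three-way decision described above (known source, then platform, then generic fallback) can be sketched as a small dispatcher. The host table, the platform list, and the returned labels are assumptions for illustration; the real source configs and handlers are not shown here.

```python
from urllib.parse import urlparse

# Illustrative tables only; not Capcat's real source registry.
KNOWN_SOURCES = {"news.ycombinator.com": "hn", "www.bbc.com": "bbc"}
PLATFORM_DOMAINS = ("medium.com", "substack.com")


def choose_handler(url: str) -> str:
    """Pick a handling strategy for a pasted URL, most specific match first."""
    host = (urlparse(url).hostname or "").lower()
    if host in KNOWN_SOURCES:
        return f"source:{KNOWN_SOURCES[host]}"
    for domain in PLATFORM_DOMAINS:
        # Match the platform itself and any publication hosted on a subdomain.
        if host == domain or host.endswith("." + domain):
            return f"platform:{domain.split('.')[0]}"
    return "generic"


if __name__ == "__main__":
    for example in (
        "https://news.ycombinator.com/item?id=1",
        "https://someone.substack.com/p/post",
        "https://example.com/article",
    ):
        print(example, "->", choose_handler(example))
```

Checking the specific cases before the generic one is what lets the user paste any URL without choosing a mode.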

Source management handles the third case. Add a new RSS source by pasting a feed URL. The system validates it, suggests an ID, asks for a category, optionally assigns it to a bundle, and optionally runs a test fetch.
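Two of those steps, validating the feed and suggesting an ID, can be sketched with the standard library alone. This is an assumption about how such checks might work, not Capcat's implementation: looks_like_feed only inspects the root element (RSS 2.0 or Atom, not RSS 1.0/RDF), suggest_id is a hypothetical heuristic based on the host name, and production code reading untrusted XML would want a hardened parser.

```python
import re
import xml.etree.ElementTree as ET
from urllib.parse import urlparse


def looks_like_feed(xml_text: str) -> bool:
    """True when the document parses and its root is an RSS 2.0 or Atom element."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    tag = root.tag.rsplit("}", 1)[-1]  # drop an XML namespace prefix if present
    return tag in ("rss", "feed")


def suggest_id(feed_url: str) -> str:
    """Propose a short source ID from the feed's host name (assumed heuristic)."""
    host = (urlparse(feed_url).hostname or "").lower()
    labels = [part for part in host.split(".") if part not in ("www", "feeds", "rss")]
    name = labels[0] if labels else "source"
    return re.sub(r"[^a-z0-9]", "", name) or "source"


if __name__ == "__main__":
    sample = '<rss version="2.0"><channel><title>Example</title></channel></rss>'
    print(looks_like_feed(sample), suggest_id("https://www.example.com/rss.xml"))
```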

For complex sites, an interactive wizard generates a full YAML config with selectors, rate limiting, and skip patterns.

Bundle Selection Flow 505 KB

HTML Output

The HTML output is produced by a TUI fetch with the HTML option selected, or by a CLI call with the --html argument. It is also the most visible surface Capcat produces, the one a reader actually opens. The design style is inspired by the minimal Bauhaus aesthetic and Dieter Rams's accent color practice.

The archive as publication. The index page presents each day's fetch as an organized collection: sources listed as sections, each article as a titled entry with its comment count and metadata. The reader can scan a day's content the way they would scan a newspaper front page, then go deeper into individual pieces. This is a different mental model from a bookmarks list or a folder of files. The archive has an editorial structure and feeling built in.

Ownership and portability as the primary constraint. I evaluated every design decision in the HTML output against one question: does this work when the folder is closed and reopened on a different machine with no internet connection? This ruled out external fonts, hosted scripts, and CDN dependencies of any kind. It also shaped the image strategy: images are saved into the article folder at fetch time, so the article is visually complete offline. The result is an archive that belongs to the user in a literal sense. It does not require a service to remain running.

The HTML output is a folder. The user can move it, copy it, zip it and email it, or open it on any device. Nothing about how it is generated imposes constraints.

Reading surface design. I designed the article page for sustained reading. Content width is constrained to a comfortable line length. Headings use a serif typeface at a light weight, which creates a visual distinction between article content and interface chrome without requiring the reader to consciously register it. A reading progress bar at the top of the sticky header answers the implicit question any reader asks on a long article: how far through am I? Navigation links appear at both the top and bottom of the article, so the reader never has to scroll back to find their way out. A dedicated Go to top button covers the same need.

Theme as user preference, not a feature. The dark and light theme toggle is a persistent choice, not a per-session setting. The user sets it once and the entire archive respects it, across every article and every navigation.

The design system behind both themes is fully open. Every color, spacing decision, and typographic choice in the output traces back to a single design-system.css file that ships with Capcat. A user who wants to change how the archive looks edits that file. The change propagates to every article on the next fetch. This is user control at the output level. The minimal design also serves as a good starting point for customization.

Comment hierarchy. Comments are one of the harder design problems in the output. A Hacker News discussion can run hundreds of replies deep, with threads that branch and re-branch. The challenge was making depth legible without making the page feel cluttered. The solution uses a left border on each comment thread that shifts slightly in tone as depth increases. The hierarchy is visible at a glance. The reader can see at once which replies are top-level and which are nested responses, without needing to count indent levels or read author names to orient themselves.
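One way to express a border tone that shifts with depth is a small function from nesting level to a CSS color. The sketch below is an assumption about the approach, not the shipped stylesheet: the lightness values, the step size, and the cap are invented for illustration, and border_color is a hypothetical name.

```python
def border_color(depth: int, base_lightness: int = 30, step: int = 8,
                 max_lightness: int = 70) -> str:
    """Map comment nesting depth to a grey border tone, capped so deep threads stay visible."""
    if depth < 0:
        raise ValueError("depth must be zero or greater")
    lightness = min(base_lightness + depth * step, max_lightness)
    return f"hsl(0, 0%, {lightness}%)"


if __name__ == "__main__":
    for level in range(4):
        # Each nested level gets the same border width and a slightly lighter tone.
        print(f"depth {level}: border-left: 2px solid {border_color(level)}")
```

Capping the lightness keeps very deep replies from fading into the page background.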

Author anonymization is built into the output, not offered as a setting. Names are replaced at write time. The visual design reflects this: the comment header shows a neutral label rather than drawing attention to identity.
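Replacing names at write time can be sketched as a single pass over the comments before anything reaches disk. Whether Capcat uses one shared label or a distinct label per author is not stated, so the per-author numbering below is an assumption, as are the function name anonymize and the dictionary shape of a comment.

```python
from typing import Dict, List


def anonymize(comments: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Return copies of the comments with author names replaced by neutral labels.

    The same author keeps the same label within one discussion, so a reader
    can still follow who is replying to whom without seeing any identity.
    """
    labels: Dict[str, str] = {}
    cleaned = []
    for comment in comments:
        label = labels.setdefault(comment["author"], f"Anonymous {len(labels) + 1}")
        cleaned.append({**comment, "author": label})
    return cleaned


if __name__ == "__main__":
    thread = [
        {"author": "alice", "text": "First point."},
        {"author": "bob", "text": "A reply."},
        {"author": "alice", "text": "A follow-up."},
    ]
    for item in anonymize(thread):
        print(item["author"], "-", item["text"])
```

Because the originals are never written, the archive contains no names to leak later.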

Code syntax themes. Technical articles frequently contain code blocks, and I designed the syntax highlighting as two distinct themes rather than one inverted palette. The dark theme follows the One Dark convention: keywords in purple, strings in green, numbers and attributes in orange, functions and built-ins in blue, variables in red, classes in yellow. The palette is warm, with high contrast against the near-black code background. The light theme follows the GitHub convention: keywords in red with added weight, functions and classes in purple, strings in deep navy, numbers and constants in blue.

PDF integration. For sources that publish PDFs alongside articles, a linked reference bar appears at the top of the article page. The PDFs save locally in the article folder and link by filename. The reader can open them without leaving the archive or depending on the original source remaining available. This closes the gap between reading an article that references a paper and actually having that paper.

H2, Match Between System and the World: The design should speak the users' language. Use words, phrases, and concepts familiar to the user. Follow real-world conventions, making information appear in a natural and logical order.

Capcat HTML Output - Index view (dark theme)

Capcat HTML Output - Index view (light theme)

Capcat HTML Output - Article view (dark theme)

Capcat HTML Output - Article view (light theme)

Capcat HTML Output - PDF and code syntax

Validation and Impact

I counted every choice in the TUI. Hick's Law says decision time increases with the number of options. I kept the main menu at six:

What would you like me to do?

 > Catch articles from a bundle of sources
 Catch articles from a list of sources
 Catch from a single source
 Catch a single article by URL
 Manage Sources (add/remove/configure)
 Exit

Each sub-level adds two to three more, never a wall of choices. The bundle selection:

Select a news bundle and hit Enter for activation.

 > tech - Technology News
 (IEEE, Mashable)
 news - General News
 (BBC News, The Guardian)
 science - Science News
 (Nature News, Scientific American)
 ai - AI & Machine Learning
 (MIT News)
 sports - Sports News
 (BBC Sport)

The source list uses check boxes instead of single selection, so the user builds a custom set without navigating back and forth:

Select sources with <space> and press Enter to continue:

 [ ] hn Hacker News
 [x] lb Lobsters
 [x] iq InfoQ
 [ ] bbc BBC News
 [ ] guardian The Guardian

 (Use <space> to select, <enter> to confirm)

The single article flow strips the interface down to two questions:

(Use Ctrl+C to go to the Main Menu)

 Please enter the article URL: https://example.com/article

 Generate HTML for web browsing?
 > Yes
 No

And source management keeps its full capability behind one sub-menu with a clear exit:

Source Management - Select an option:

 > Add New Source from RSS Feed
 Generate Custom Source Config
 Remove Existing Sources
 List All Sources
 Test a Source

Back to Main Menu

The user never sees more than they need at any given step. Complexity is there, but it reveals itself as you go deeper.

Hick's Law, Progressive Disclosure: The time it takes to make a decision increases with the number and complexity of choices.

Miller's Law puts working memory capacity at around seven items, plus or minus two. Recent cognitive science suggests four plus or minus one for complex information. The CLI has nine primary commands: single, fetch, bundle, list, config, add-source, remove-source, generate-config, catch. That sits slightly above the threshold, but Capcat targets technically capable users who routinely hold more in working memory. Every command is a verb. Every verb maps to one action.

Miller's Law: The average person can hold about 7 (±2) items in working memory at once. Recent cognitive science suggests 4 (±1) is more accurate for complex information. The key principle: reduce cognitive burden by organizing information into meaningful, manageable chunks.

Jakob's Law says users bring mental models from every other tool they have used. Fighting those models creates friction.

./capcat fetch hn,bbc --count 20 maps to git fetch: the verb, the comma-separated targets, the flag structure. ./capcat bundle tech --count 10 maps to git bundle. ./capcat list sources maps to docker ps. ./capcat remove-source maps to how every package manager handles removal. A CLI-familiar user opens Capcat and already has a rough sense of how it works before reading any documentation.

Jakob's Law, Pattern Transfer: Users spend most of their time on other sites. This means that users prefer your site to work the same way as all the other sites they already know.

Hicks Law + Progressive Disclosure 310 KB

Millers Law applied to Capcat CLI 151 KB

Jakobs Law + Pattern Transfer 163 KB

Retrospective and Learnings

The System Architecture diagram has four layers: User Interface, Source System, Processing Pipeline, Output. The placement that matters most is Ethical Scraping. It sits inside the Source System at the same level as Source Factory and Source Registry.

Every source goes through the EthicalScrapingManager before it touches the network. Robots.txt is cached, rate limiting is enforced per domain, and the user agent identifies itself to the network. Ethical behavior is a constraint the architecture enforces, not a feature the user toggles.
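The three behaviors named above (cached robots.txt, per-domain rate limiting, a self-identifying user agent) can be sketched with the standard library. This is not the EthicalScrapingManager itself: EthicalGate, download_robots, and seconds_to_wait are hypothetical names, and the 10-second default mirrors the "1 request per 10 seconds" rule. The robots loader and the clock are parameters so the gate can be exercised without touching the network.

```python
import time
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser


def download_robots(origin: str) -> RobotFileParser:
    """Fetch and parse robots.txt for an origin such as https://example.com."""
    parser = RobotFileParser(origin + "/robots.txt")
    parser.read()
    return parser


class EthicalGate:
    """Hypothetical gate that every request passes before it reaches the network."""

    def __init__(self, user_agent, min_interval=10.0,
                 load_robots=download_robots, clock=time.monotonic):
        self.user_agent = user_agent      # sent with requests so the tool identifies itself
        self.min_interval = min_interval  # seconds between requests to one origin
        self._load_robots = load_robots
        self._clock = clock
        self._robots = {}        # origin -> cached RobotFileParser
        self._last_request = {}  # origin -> time of the last allowed request

    def seconds_to_wait(self, url: str) -> float:
        """Return 0.0 if the request may go now, else the remaining delay.

        Raises PermissionError when robots.txt disallows the URL.
        """
        parts = urlparse(url)
        origin = f"{parts.scheme}://{parts.netloc}"
        if origin not in self._robots:  # robots.txt is loaded once per origin
            self._robots[origin] = self._load_robots(origin)
        if not self._robots[origin].can_fetch(self.user_agent, url):
            raise PermissionError(f"robots.txt disallows {url}")

        now = self._clock()
        last = self._last_request.get(origin)
        if last is not None and now - last < self.min_interval:
            return self.min_interval - (now - last)
        self._last_request[origin] = now  # record only requests that may proceed
        return 0.0
```

A caller would sleep for the returned number of seconds and ask again, so no fetch path can skip the check by accident.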

This is an ethical constraint with a UX consequence. The user never has to wonder whether Capcat is behaving lawfully. That uncertainty is removed by design. And the tool identifies itself on the network. The product has integrity in both senses of the word.

In summary, ethical scraping:

Respects robots.txt
Rate Limiting (1 request per 10 seconds)
Prefers RSS/APIs over HTML Scraping
No Paywall Circumvention
Proper Source Attribution

A CLI and a TUI are two separate design problems that share a backend. Ethical constraints belong in the architecture.

Product Website

I built the website with the same process as the tool: context engineering, a PRD, and iterative development with an LLM. Most of the CSS I wrote by hand, including the design system, color decisions, spacing, and typography scale.

capcat.org serves three audiences. A user evaluating the tool sees the feature set, the interface options, the preconfigured sources, and a three-step getting started flow. A developer reads the architecture docs, the source development guide, and the API reference. A contributor finds the GitHub link, the issue tracker, and the ethical scraping guidelines. Primary navigation covers Features, How It Works, Tutorials, Case Study, Get Started, and Ethical Scraping. The footer carries a parallel layer grouped by function: Documentation, Resources, About.

The design system runs on CSS variables. Typography uses a system font stack with no external dependencies, a performance and accessibility choice, not a default. I designed the color palette around a nine-step orange scale. The primary accent color is the same orange used in the terminal output coloring, so the brand and the interface share one identity. Cream base background, dark ink for text, hover states, tints, and semantic aliases all derive from the same scale. The same tokens drive the Mermaid diagrams across all documentation pages.

Eight custom SVG icons illustrate the feature section, one per capability: Command-Line Mode, Interactive Menu, Bulk RSS Fetching, Local Markdown Storage, HTML Generation, Offline Accessibility, Add Your Own Sources, and the Capcat mascot.

The footer carries information for the project and documentation.

Branding and Illustration

The name is a compression. Cap from Capture. Cat from Catalogue. Those are the two core actions the tool performs. Hidden inside the same four letters are two more layers: Cat, the mascot that gives the product a face (a FOSS tradition), and cat, the Unix command that reads and outputs file contents. The name works on first encounter and holds more the longer you look at it.

The logotype is a custom serif, drawn by hand. The capital C opens with a pronounced curved terminal that no standard typeface carries, giving the word an immediate identity at the first letter.

Capcat Logotype - Brand Color

The white space inside the aperture curves in a way that reads as a cat's tail, so the mascot association is present in the first letter. The logotype carries the brand identity and the character reference in the same stroke.

The lowercase descender on the p is deep, anchoring the word to the baseline with a tail that echoes the mascot a second time, without depicting it. Ink traps at the stroke joints keep the forms clean at small sizes. The whole word sits on a precise construction grid, visible in the presentation versions, which shows the spacing and optical alignment are intentional rather than assumed.

Four versions cover the practical range of contexts where the logotype appears.

Capcat mascot - a cat dressed as a baseball player catching a loading ball

The mascot is a cat dressed as a baseball player, catching a loading ball from a progress bar. Behind it, a crowd of computers cheers from the stands. Every FOSS project has a mascot. The inspiration is the style of the Top Cat cartoon, an American animated sitcom produced by Hanna-Barbera Productions.

I wanted one that carried the product's name, referenced its function, and had enough personality to anchor a brand.

I created the illustration by hand on paper, refined it in Procreate, and vectorized it in Affinity Designer.

The slogan ties the most important functionality together in one line:

"Archive Articles with Confidence. Share without Limits."