You select an article, hit Cmd+C, paste it into your notes app — and the result
is a wall of unstyled text. Headings collapsed into normal paragraphs, lists turned into
comma-separated runs, links stripped to bare anchor text, sidebars and cookie banners
sprinkled between paragraphs. Plain copy-paste was designed for prose snippets, not for
capturing a structured document.
Markdown solves this. It is a lightweight format that keeps headings, lists, links, code blocks, and tables intact while staying readable as plain text. Once a webpage is in Markdown, you can drop it into a prompt for ChatGPT or Claude, archive it in Obsidian, publish it to Notion, or commit it to a documentation repository — without spending five minutes cleaning up formatting by hand.
This guide walks through why standard copy-paste fails for structured content, what Markdown gives you in return, and how a one-click browser extension — Copy Content — turns the operation into a single keystroke.
Why Standard Copy-Paste Loses the Structure
When you copy text from a browser, the source data is HTML. The destination decides what to do with it. A rich-text editor (Word, Google Docs, Notion's main canvas) tries to preserve the visual styles and ends up importing the page's fonts, colors, inline backgrounds, and sometimes its CSS classes. A plain-text editor (a code editor, the terminal, an LLM prompt box) drops the HTML entirely and keeps only the visible characters.
Either path loses information you actually want to keep:
- Headings disappear. An <h2> becomes a normal paragraph indistinguishable from the body. The document outline is gone.
- Lists become text blobs. <ul> and <ol> elements often paste as a single line of comma-separated items, with their bullets and numbering stripped.
- Links lose URLs. An anchor renders as its inner text; readers cannot tell which words were clickable, let alone where they pointed.
- Code blocks get reflowed. A pre-formatted code sample loses its indentation and line breaks; semicolons and brackets end up on the wrong side of the wrap.
- Tables collapse. Rows merge into one line; columns become unaligned text separated by random spaces.
On top of that, modern article pages carry a lot of cargo: cookie banners, newsletter prompts, author bios, related-articles widgets, share buttons, advertising slots. A naive Select-All grabs all of it. Cleaning it up manually is exactly the busywork you wanted to avoid.
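The flattening described above is easy to reproduce. The following is a minimal sketch, assuming a destination that keeps only visible characters; it imitates that behavior with Python's standard `html.parser` and is not how any particular editor implements paste. The sample HTML and the `paste_as_plain_text` helper are made up for illustration.

```python
from html.parser import HTMLParser


class VisibleTextOnly(HTMLParser):
    """Imitates a plain-text paste: keep character data, ignore every tag."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


def paste_as_plain_text(html: str) -> str:
    parser = VisibleTextOnly()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical page fragment: a heading, a list, and a link.
SAMPLE_HTML = """
<h2>Cache headers</h2>
<ul><li>max-age</li><li>no-store</li></ul>
<p>See <a href="https://example.com/spec">the spec</a>.</p>
"""

flattened = paste_as_plain_text(SAMPLE_HTML)
print(flattened)
```

The printed result is one run of words: the heading level, the list markers, and the link target are all gone, which is the information loss the list above describes.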
What Markdown Gives You
Markdown is plain text with a small set of conventions for structure. A line that starts with ## is a heading. A line that starts with - is a list item. [text](url) is a link. A pair of ``` lines wraps a code block. The same file opens cleanly in a text editor, renders correctly on GitHub, imports into Obsidian and Notion, and is a format LLMs have seen extensively in training.
Concretely, converting a page to Markdown preserves:
- Document outline. H1, H2, H3 levels are encoded as #, ##, ###, so the heading hierarchy survives the round trip.
- Linked references. Both anchor text and URL are kept side by side, so you can later cite or follow the source.
- Code samples. Indented blocks and language hints stay intact, which is critical for technical content.
- Tables. Pipe-and-dash syntax keeps row and column structure readable and re-renderable.
- Emphasis. Bold and italic markers are preserved, so the author's highlighting does not get flattened into uniform prose.
The same Markdown file you saved for your notes can be pasted unchanged into a Claude or ChatGPT prompt. The model reads the structure as you intended it — headings define sections, lists define enumerations, code blocks define literal content the model should not paraphrase. A wall of unstructured text gives the model far less to work with.
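To make the mapping concrete, here is a deliberately small sketch of an HTML-to-Markdown conversion using only Python's standard library. It is an illustration, not the converter any real tool uses: the `TinyMarkdownConverter` class and `html_to_markdown` function are hypothetical, and the sketch handles only headings (H1 to H3), unordered list items, links, and paragraphs.

```python
from html.parser import HTMLParser


class TinyMarkdownConverter(HTMLParser):
    """Hypothetical minimal converter: headings, list items, links, paragraphs."""

    HEADINGS = {"h1": "# ", "h2": "## ", "h3": "### "}

    def __init__(self):
        super().__init__()
        self.lines = []       # finished Markdown lines
        self.current = []     # fragments of the line being built
        self.href = None      # URL of the link currently open, if any
        self.link_text = []   # text collected inside the open link

    def flush(self):
        line = "".join(self.current).strip()
        if line:
            self.lines.append(line)
        self.current = []

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADINGS:
            self.flush()
            self.current.append(self.HEADINGS[tag])
        elif tag == "li":
            self.flush()
            self.current.append("- ")
        elif tag == "p":
            self.flush()
        elif tag == "a":
            self.href = dict(attrs).get("href")
            self.link_text = []

    def handle_endtag(self, tag):
        if tag == "a" and self.href is not None:
            # Keep anchor text and URL side by side.
            self.current.append(f"[{''.join(self.link_text)}]({self.href})")
            self.href = None
        elif tag in self.HEADINGS or tag in ("li", "p"):
            self.flush()

    def handle_data(self, data):
        if self.href is not None:
            self.link_text.append(data)
        else:
            self.current.append(data)


def html_to_markdown(html: str) -> str:
    parser = TinyMarkdownConverter()
    parser.feed(html)
    parser.close()
    parser.flush()
    out = []
    for line in parser.lines:
        # Blank line between blocks, but keep consecutive list items together.
        if out and not (line.startswith("- ") and out[-1].startswith("- ")):
            out.append("")
        out.append(line)
    return "\n".join(out)


SAMPLE_HTML = """
<h2>Cache headers</h2>
<ul><li>max-age</li><li>no-store</li></ul>
<p>See <a href="https://example.com/spec">the spec</a>.</p>
"""

markdown = html_to_markdown(SAMPLE_HTML)
print(markdown)
```

Unlike a plain-text paste, the output keeps the heading level, the list markers, and the link URL. Code blocks, tables, emphasis, and nested lists are left out of the sketch to keep it short.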
Why Markdown Is the Right Format for LLM Prompts
LLMs were trained on huge volumes of Markdown: GitHub READMEs, Stack Overflow answers, most documentation sites. The format is familiar to them. When you feed a prompt that contains Markdown structure, the model has clearer signals about what each section means.
Practical effects when researching with Claude or ChatGPT:
- Better summaries. The model can target a specific heading ("summarize section 3") instead of the whole blob.
- Fewer hallucinated quotes. A code block makes it obvious what is literal source material; the model is less likely to "improve" it.
- Cleaner follow-ups. "Compare the bullet list at the top to the table at the bottom" is a meaningful instruction only if both survived the paste.
- Token efficiency. Markdown is dense. The same article in Markdown typically uses fewer tokens than the same article pasted as raw HTML, where tags and attributes consume tokens without adding meaning, or as a screenshot converted by OCR.
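The "summarize section 3" idea works because headings make a document addressable. As a rough sketch of that, the hypothetical `split_sections` helper below cuts a Markdown string into (heading, body) pairs, so a single section can be pasted or referenced on its own. It assumes ATX-style headings (lines starting with #) and skips heading-like lines inside fenced code blocks.

```python
import re

FENCE = "`" * 3  # built this way so the example can itself sit inside a fenced block
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def split_sections(markdown: str) -> list[tuple[str, str]]:
    """Return (heading, body) pairs; text before the first heading gets heading ''."""
    sections = []
    title, body = "", []
    in_fence = False

    def close_section():
        if title or any(line.strip() for line in body):
            sections.append((title, "\n".join(body).strip()))

    for line in markdown.splitlines():
        if line.startswith(FENCE):
            in_fence = not in_fence
        match = None if in_fence else HEADING.match(line)
        if match:
            close_section()
            title, body = match.group(2).strip(), []
        else:
            body.append(line)
    close_section()
    return sections


# Hypothetical article content used only for the demonstration.
ARTICLE = "\n".join([
    "# HTTP Caching",
    "Intro text.",
    "## Cache-Control",
    "- max-age",
    "- no-store",
    "## ETag",
    FENCE + "sh",
    "# not a heading",
    FENCE,
])

sections = split_sections(ARTICLE)
for heading, text in sections:
    print(repr(heading), "->", len(text), "characters")
```

With the sections separated, a prompt can include only the part under discussion, which also keeps the token count down.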
Real Workflows Where This Matters
Research and AI Prompts
You are comparing three tutorials to ask Claude which approach is correct. Copy each one to Markdown, paste all three under labelled headings into a single prompt, and ask the model to pick out contradictions. The headings give the model anchors; the lists and code blocks keep technical detail intact.
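Assembling such a prompt can be sketched in a few lines. The `build_comparison_prompt` function and the sample titles below are hypothetical; the only assumption is that each source is already a Markdown string on hand.

```python
def build_comparison_prompt(sources: dict[str, str], question: str) -> str:
    """Place each Markdown source under a numbered heading, then append the task."""
    parts = []
    for index, (title, markdown) in enumerate(sources.items(), start=1):
        parts.append(f"## Source {index}: {title}\n\n{markdown.strip()}")
    parts.append(f"## Task\n\n{question.strip()}")
    return "\n\n".join(parts)


# Placeholder sources; in practice these would be the copied articles.
prompt = build_comparison_prompt(
    {
        "Tutorial A": "- Use max-age for static assets.",
        "Tutorial B": "- Use no-store for everything.",
    },
    "Which recommendations contradict each other?",
)
print(prompt)
```

One caveat of this sketch: if a source contains its own ## headings, they sit at the same level as the labels. Demoting the source headings by one level before pasting would keep the outline unambiguous.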
Knowledge Bases — Obsidian and Notion
Obsidian stores notes as Markdown files on disk. A Markdown clip from a webpage can be saved
directly into your vault with no conversion step. Notion's /import accepts Markdown and reproduces headings, callouts, and toggles. Compared to clipping a rendered
article via a Web Clipper, Markdown is editable, diffable, and future-proof.
Technical Documentation Migrations
Moving a help center from a hosted CMS to a static-site generator usually means converting published HTML pages back to Markdown. A one-click extraction is far faster than running each URL through a server-side converter, especially when you only need a few articles.
Drafting Newsletters and Posts
Editors who curate links into newsletters need each excerpt with its title, source link, and a short quote. A Markdown extraction gives all three at once and pastes into platforms like Substack or Beehiiv with formatting preserved.
How Copy Content Cleans the Page Before Conversion
Naively converting full HTML to Markdown still produces noise — the cookie banner becomes a Markdown blockquote, the share-buttons row becomes a list of icon names. The Copy Content extension applies a content filter before the conversion runs:
- Boilerplate removal. Navigation bars, footers, sticky banners, and inline ad slots are dropped before the Markdown is generated.
- Main-content detection. The extension targets the main, article, and equivalent semantic regions of the page. Sidebars and comment sections are excluded by default.
- Element picker. When the auto-detection grabs too much or too little, you can click a single element on the page and copy only its subtree, which is useful for one documentation section out of a long page.
- Structure preservation. Headings, ordered and unordered lists, links with their URLs, inline code, fenced code blocks, blockquotes, and tables all map to their Markdown equivalents.
- One keystroke. The default shortcut (Alt+C, or Option+C on macOS) performs the extraction and writes Markdown to the clipboard. No popup, no settings dialog.
Everything happens locally in the browser. There is no remote service, no upload, no account. The page DOM is read by the extension, transformed in memory, and the resulting Markdown is placed on the system clipboard.
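The general idea of filtering before converting can be illustrated with a short sketch. This is not Copy Content's actual algorithm, which the article does not spell out; it is a simplified stand-in that keeps text only inside main or article elements and skips a hand-picked set of boilerplate tags. The tag sets, class name, and sample page are assumptions made for the example.

```python
from html.parser import HTMLParser

CONTENT_TAGS = {"main", "article"}
BOILERPLATE_TAGS = {"nav", "footer", "aside", "script", "style"}


class MainContentFilter(HTMLParser):
    """Keeps text that is inside a content region and outside boilerplate tags."""

    def __init__(self):
        super().__init__()
        self.content_depth = 0  # how many <main>/<article> elements are open
        self.skip_depth = 0     # how many boilerplate elements are open
        self.kept = []

    def handle_starttag(self, tag, attrs):
        if tag in BOILERPLATE_TAGS:
            self.skip_depth += 1
        elif tag in CONTENT_TAGS:
            self.content_depth += 1

    def handle_endtag(self, tag):
        if tag in BOILERPLATE_TAGS and self.skip_depth:
            self.skip_depth -= 1
        elif tag in CONTENT_TAGS and self.content_depth:
            self.content_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self.content_depth and not self.skip_depth:
            self.kept.append(text)


def extract_main_text(html: str) -> list[str]:
    parser = MainContentFilter()
    parser.feed(html)
    parser.close()
    return parser.kept


# Hypothetical page with navigation, a cookie banner, a sidebar, and a footer.
PAGE = """
<nav>Home | Pricing</nav>
<div class="cookie-banner">We use cookies</div>
<main>
  <article>
    <h1>HTTP Caching</h1>
    <p>Cache responses with max-age.</p>
    <aside>Subscribe to our newsletter</aside>
  </article>
</main>
<footer>Copyright notice</footer>
"""

kept = extract_main_text(PAGE)
print(kept)
```

In this sketch the cookie banner is dropped simply because it sits outside the main region, and the newsletter prompt is dropped because it is an aside. Real pages are messier, which is why a manual element picker is a useful fallback.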
How It Compares to Common Alternatives
Browser "Reader Mode"
Reader mode strips ads and renders the article in a clean view. It is great for reading, but the output is still HTML. Copying from reader mode produces the same flat text problem as copying from the original page. There is no Markdown export.
Server-Side URL-to-Markdown Services
Tools that take a URL and return Markdown work, but require a network round-trip and break on pages behind authentication, paywalls, or single-page-app routing. They cannot extract a section of a page — only the whole URL. Copy Content runs on the rendered DOM, so logged-in pages and SPAs work the same as plain articles.
MarkDownload and Other Browser Extensions
MarkDownload is a popular alternative with download-to-file features. Copy Content is lighter: one click to clipboard, no download, no settings panel to configure templates. It also ships natively for Firefox, where MarkDownload's experience is less polished, and includes the per-element picker for grabbing one section instead of the whole document.
Manual Cmd+Shift+V "Paste Without Formatting"
Pasting without formatting solves the cosmetic style problem (no inherited fonts) but deletes all structure as well. Headings, lists, and links all flatten into one stream. Markdown is the opposite trade — drop the visual styling, keep the structure.
A Concrete End-to-End Example
You read a tutorial about HTTP caching, want Claude to compare it to two others, and store the result in your Obsidian vault. The workflow with Copy Content:
- Open the article in your browser. Press Alt+C. The clean Markdown is on your clipboard: no banners, no related-article widgets, structure intact.
- Paste it into a Claude prompt under a heading like ## Source 1: HTTP Caching Tutorial. Repeat for the other two articles.
- Ask Claude: "Compare the three sources above. Which recommendations conflict, which agree, and which are unique to one source?"
- Save Claude's answer plus the original three Markdown blocks into a single .md file in your Obsidian vault. The headings become navigable in the file's outline view; the links stay clickable.
The whole sequence takes a couple of minutes. Without a Markdown extractor in the loop, each article would need manual cleanup — fixing headings, restoring lists, deleting cookie banners — and the LLM step would receive lower-quality input.
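The final save step can also be scripted. The sketch below is one possible way to do it, assuming the vault is an ordinary folder of .md files; the `save_research_note` function, the note layout, and the sample text are all hypothetical, and a temporary directory stands in for a real vault.

```python
import re
import tempfile
from pathlib import Path


def save_research_note(vault_dir, title, answer, sources):
    """Write one Markdown note: title, the model's answer, then each source.

    `sources` maps a heading label to the Markdown copied from that page.
    """
    parts = [f"# {title}", "## Answer", answer.strip()]
    for label, markdown in sources.items():
        parts += [f"## {label}", markdown.strip()]

    # Derive a filesystem-friendly file name from the title.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    note_path = Path(vault_dir) / f"{slug}.md"
    note_path.write_text("\n\n".join(parts) + "\n", encoding="utf-8")
    return note_path


# A temporary directory stands in for a real vault folder here.
with tempfile.TemporaryDirectory() as vault:
    path = save_research_note(
        vault,
        "HTTP caching comparison",
        "Sources 1 and 2 agree on max-age; source 3 differs.",
        {"Source 1: HTTP Caching Tutorial": "- Use max-age for static assets."},
    )
    saved_name = path.name
    saved_text = path.read_text(encoding="utf-8")

print(saved_name)
```

Because the note is plain Markdown, the answer and its sources stay together in one file that can be searched, diffed, and edited later.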
Get the Extension
Copy Content is free, works in Chrome, Edge, and Firefox, and runs entirely in your browser. Install it from the linked page, keep the default keyboard shortcut or pick your own, and the next time you need to capture a webpage's content as clean Markdown, it is one keystroke away.