Open Specification · Draft

The web, spoken fluently to AI agents.

content-md is an open specification for high-fidelity textual representation for AI agents.

Built by a team that creates documents every day.

article.md
---
title: Introducing Content-md
description: >-
  AI agents should be first-class visitors.
  Let's give them a tailored experience.
date: 2026-04-29
author: Alessio
license: CC-BY-4.0
---

# Introducing Content-md

AI agents increasingly browse the web on behalf
of humans. The web was built for humans, who expect
quality and pleasant interaction. Agents go
straight to the point and prefer a more structured approach.

## The Problem

Converting complex HTML pages with navigation, ads,
and JavaScript into LLM-friendly plain text is both
difficult and imprecise.

YAML

Frontmatter

Serves as an introductory summary of roughly 100 tokens (about 540 characters). AI agents read it first to decide whether the full document is relevant before fetching it, so it functions as a lightweight preflight index.
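As an illustration of the "read the frontmatter first" step, here is a minimal Python sketch that separates the frontmatter from the body and checks its size. It assumes the frontmatter is delimited by `---` lines, as in the sample above. The function names and the 540-character soft limit are hypothetical and not defined by the specification.

```python
def split_frontmatter(text: str) -> tuple[str, str]:
    """Split a document into (frontmatter, body).

    Assumes the frontmatter is delimited by lines containing only '---',
    with the opening delimiter on the first line.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return "", text  # no frontmatter present
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            frontmatter = "\n".join(lines[1:i])
            body = "\n".join(lines[i + 1:]).lstrip("\n")
            return frontmatter, body
    return "", text  # unterminated block: treat everything as body


def within_budget(frontmatter: str, max_chars: int = 540) -> bool:
    # 540 characters is the approximate figure quoted above, used here
    # as a hypothetical soft limit rather than a rule from the spec.
    return len(frontmatter) <= max_chars


doc = """---
title: Introducing Content-md
date: 2026-04-29
author: Alessio
license: CC-BY-4.0
---

# Introducing Content-md

AI agents are increasingly browsing the web.
"""

frontmatter, body = split_frontmatter(doc)
print(within_budget(frontmatter))
print(body.splitlines()[0])
```

An agent could run a check like this on the first bytes of a response and stop reading if the title and description are not relevant to its task.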

MD

Markdown body

CommonMark or GitHub Flavored Markdown. Must open with a first-level heading. Prefer text over images: link images and include alternate text. Preserve the document hierarchy beneath that heading, starting from level two.
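The body rules above could be checked with a small linter. The Python sketch below is one possible reading, not the specification's own validator: it assumes ATX-style (`#`) headings, treats "preserve hierarchy" as "do not skip heading levels", and expects a single first-level heading. `check_body` is a hypothetical name.

```python
import re

# ATX heading: one to six '#' characters followed by whitespace and text.
HEADING = re.compile(r"^(#{1,6})\s+\S")


def check_body(body: str) -> list[str]:
    """Return a list of problems found in a Markdown body (empty if none)."""
    problems = []
    levels = []
    first = None
    in_fence = False
    for line in body.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if first is None:
            first = stripped  # first non-blank line of the body
        if stripped.startswith("```"):
            in_fence = not in_fence  # ignore '#' lines inside code fences
            continue
        if in_fence:
            continue
        match = HEADING.match(line)
        if match:
            levels.append(len(match.group(1)))

    if first is None or not first.startswith("# "):
        problems.append("body must open with a first-level heading")
    # Assumption: exactly one first-level heading, sections start at level two.
    if levels.count(1) > 1:
        problems.append("only one first-level heading expected")
    # Assumption: a heading may go at most one level deeper than the previous one.
    for prev, cur in zip(levels, levels[1:]):
        if cur > prev + 1:
            problems.append(f"heading jumps from level {prev} to {cur}")
    return problems


print(check_body("# Title\n\n## The Problem\n\nSome text.\n"))
```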

Why content-md

AI agents can read HTML and other complex formats, but time and tokens set the rules.

Use tokens wisely

Every token sent to an LLM is billed. A typical web page with navigation and layout markup can run to tens of thousands of tokens. This page weighs around 40 KB as HTML; as content-md, the same content fits in under 3 KB.

The creator wins

No scraper knows your content better than you do. Automatically converted HTML loses context, collapses structure, and makes wrong guesses. content-md is authored by the people who wrote the page.

Ecosystem

Tools & plugins.

These tools help you start serving content-md without building from scratch.

Caddy Content Negotiation

Serve pre-existing markdown files via the Caddy web server with proper content negotiation headers built in.
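To make the content negotiation step concrete, here is a rough Python sketch of the decision a server might make from the `Accept` request header. It assumes `text/markdown` is the media type being negotiated, which this page does not state. It is not the Caddy plugin's logic, and `wants_markdown` is a hypothetical helper that ignores wildcards such as `*/*`.

```python
def wants_markdown(accept: str) -> bool:
    """Return True if an Accept header prefers text/markdown over text/html."""

    def quality(media_type: str) -> float:
        # Highest q-value listed for this exact media type (0.0 if absent).
        best = 0.0
        for item in accept.split(","):
            parts = [p.strip() for p in item.split(";")]
            if parts[0].lower() != media_type:
                continue
            q = 1.0  # q defaults to 1 when not given
            for param in parts[1:]:
                if param.lower().startswith("q="):
                    try:
                        q = float(param[2:])
                    except ValueError:
                        q = 0.0  # malformed q-value: treat as not acceptable
            best = max(best, q)
        return best

    markdown_q = quality("text/markdown")
    return markdown_q > 0 and markdown_q >= quality("text/html")


print(wants_markdown("text/markdown, text/html;q=0.8"))
print(wants_markdown("text/html,application/xhtml+xml"))
```

A server using a check like this would return the Markdown file when it is preferred and fall back to the regular HTML page otherwise, so browsers keep working unchanged.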

WordPress Post to Markdown

Serve post content as Markdown directly from WordPress.

Comparison

vs. LLMs.txt, Agents.md and Skills.

Content-md vs. LLMs.txt

llmstxt.org ↗

Covers the whole website: one URL listing everything available. Think of it as a sitemap.

→ Predictable URL at website root

→ Bird's-eye view of all content

They coexist like sitemaps and pages. content-md describes individual resources.

Content-md vs. Agents.md

agents.md ↗

Targets coding agents with README context for code repositories: build steps, tests, conventions.

→ Instruct coding agents

→ Repository-scoped

content-md does not target coding agents. The two serve entirely different contexts.

Content-md vs. Skills

agentskills.io ↗

Provides additional knowledge and a bird's-eye view of available content to agents, packaged as folders.

→ Not discovered via direct URL

content-md responses are nearly compatible with Skills; the frontmatter fields map closely.
