Advancing Web Semantics: The Promise of the Block Protocol

From Documents to Data: The Web's Unfinished Journey

Since the mid-1990s, the World Wide Web has primarily served as a platform for publishing human-readable documents. Web pages are built with HTML, which provides basic formatting directives—like identifying paragraphs or emphasizing words. CSS adds visual flair, such as styling text in tiny gray sans-serif fonts. While this approach works well for human readers, it leaves computers largely in the dark about the actual meaning of the content.

Advancing Web Semantics: The Promise of the Block Protocol
Source: www.joelonsoftware.com

Consider a typical mention of a book on a web page: the title might be bolded, but a program reading the page cannot reliably tell that the text refers to a book, let alone extract details like the author, illustrator, publisher, or year. The page carries almost no underlying structure.

The Semantic Web Vision

As early as 1999, Tim Berners-Lee articulated a dream for a more intelligent web—one where computers could analyze content, links, and transactions automatically. He wrote: “I have a dream for the Web [in which computers] become capable of analyzing all the data on the Web – the content, links, and transactions between people and computers. A ‘Semantic Web’, which makes this possible, has yet to emerge, but when it does, the day-to-day mechanisms of trade, bureaucracy and our daily lives will be handled by machines talking to machines.”

To realize this vision, web authors would need to add structured metadata to their pages. Vocabularies such as schema.org define terms for describing things (books, events, people, and so on), and formats such as RDFa, Microdata, and JSON-LD allow that data to be embedded in HTML. In theory, this would make web content both human- and machine-readable.
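
To make the idea concrete, here is a minimal sketch of what such embedded metadata could look like for the book example above. It uses the schema.org Book type and its name, author, isbn, publisher, and datePublished properties; the helper names book_jsonld and to_script_tag and the placeholder book details are assumptions made up for illustration, not part of any standard or library.

```python
import json


def book_jsonld(title, author, isbn=None, publisher=None, year=None):
    """Describe a book with schema.org terms as a JSON-LD dictionary.

    Hypothetical helper for illustration only.
    """
    data = {
        "@context": "https://schema.org",
        "@type": "Book",
        "name": title,
        "author": {"@type": "Person", "name": author},
    }
    # Optional details are included only when the author supplies them.
    if isbn:
        data["isbn"] = isbn
    if publisher:
        data["publisher"] = {"@type": "Organization", "name": publisher}
    if year:
        data["datePublished"] = str(year)
    return data


def to_script_tag(data):
    """Wrap JSON-LD in the script element used to embed it in an HTML page."""
    payload = json.dumps(data, indent=2)
    # Escape "</" so a value cannot close the script element early.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Placeholder values, not a real book.
print(to_script_tag(book_jsonld("Example Book Title", "Jane Author", year=1999)))
```

Even in this small form, the author has to know which type and property names to use, which is exactly the research step most people skip.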

Why Adoption Stalled

Despite its promise, adding semantic markup has remained a tedious, homework-like task. After crafting a blog post, few authors have the motivation to research schema types and manually insert JSON-LD blocks. Without immediate reward or widespread tooling, most give up. As a result, semantic markup is rare on the web even two decades after the Semantic Web was first proposed.

Enter the Block Protocol

We believe the solution lies in lowering the barrier to entry. The Block Protocol is a new approach that enables content authors to add structured data as easily as they insert an image or a video. It works by defining reusable “blocks”—self-contained components that carry their own semantic meaning. For example, a book block would automatically include all relevant metadata (title, author, ISBN) in a machine-readable format, without requiring the author to write any special code.

How It Works

Blocks are built on existing web standards and can be plugged into any application that supports the protocol, whether that is a WordPress-style CMS, a Notion-style editor, or a custom site. Each block contains a piece of content (text, media, or an interactive widget) and an attached “schema” that describes its meaning in a structured way. When a user adds a block, the embedding application handles the structured data behind the scenes.
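
The relationship between a block, its schema, and the embedding application can be sketched roughly as follows. This is a toy model under stated assumptions, not the Block Protocol's actual API: the names BlockType, BlockInstance, Page, add_block, and structured_data are all hypothetical, and the schema here is a plain mapping from property names to Python types rather than whatever schema format the specification defines.

```python
from dataclasses import dataclass, field


@dataclass
class BlockType:
    """A reusable block definition: a name plus a schema (hypothetical model)."""
    name: str
    required: dict                                 # property name -> expected type
    optional: dict = field(default_factory=dict)   # property name -> expected type

    def validate(self, properties):
        """Return a list of problems; an empty list means the data fits the schema."""
        errors = []
        for key in self.required:
            if key not in properties:
                errors.append(f"missing required property: {key}")
        for key, value in properties.items():
            expected = self.required.get(key) or self.optional.get(key)
            if expected is None:
                errors.append(f"unknown property: {key}")
            elif not isinstance(value, expected):
                errors.append(f"{key} should be {expected.__name__}")
        return errors


@dataclass
class BlockInstance:
    """One block placed on a page, holding the data the author entered."""
    block_type: BlockType
    properties: dict


class Page:
    """Stands in for the embedding application (hypothetical)."""

    def __init__(self):
        self.blocks = []

    def add_block(self, block_type, **properties):
        # The author only fills in fields; the schema check happens here.
        errors = block_type.validate(properties)
        if errors:
            raise ValueError("; ".join(errors))
        self.blocks.append(BlockInstance(block_type, properties))

    def structured_data(self):
        # Machine-readable data falls out of the blocks already on the page.
        return [{"type": b.block_type.name, **b.properties} for b in self.blocks]


# A book block defined once by a developer, then reused by any author.
BOOK = BlockType("Book", required={"title": str, "author": str}, optional={"isbn": str})

page = Page()
page.add_block(BOOK, title="Example Book Title", author="Jane Author")
print(page.structured_data())
```

The design point is that the schema lives with the block type, so the author never writes markup by hand and the application can still export structured data for every block on the page.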

The Path Forward

By making structured data a byproduct of normal content creation, the Block Protocol aims to finally realize the Semantic Web’s original promise. It shifts the effort from individual web authors to the developers who build these blocks, accelerating adoption. Human progress depends on making information more accessible—not just to people, but to the intelligent programs that can process it at scale. With the Block Protocol, that future is within reach.

To learn more about implementing blocks, see our introduction or the Block Protocol specification.
