Best practices for LLMs

Optimising your site for LLMs, and not just for search engine crawlers, is now an essential part of site optimisation.

LLMs work best with clean content, preferably markdown. With Pullnote, all of your content is already stored in markdown format.

Best practices for adding markdown support for LLMs

1. Create a ".md" route that mirrors each content page, with ".md" appended to the URL, e.g. this page is also available at /docs/llms.md

In SvelteKit you can do this with a folder named [...path].md containing a +server.js endpoint, e.g.:

// src/routes/[...path].md/+server.js
import { PULLNOTE_KEY } from '$env/static/private';
import { PUBLIC_PULLNOTE_API_URL } from '$env/static/public';
import { PullnoteClient } from '@pullnote/client';
import { error } from '@sveltejs/kit';

export async function GET({ params, url }) {
  const pn = new PullnoteClient(PULLNOTE_KEY, PUBLIC_PULLNOTE_API_URL);
  await pn.clear();
  const { path } = params;
  const note = await pn.get(path);
  if (!note) {
    throw error(404, "Note for " + path + " not found");
  }

  const title = await pn.getTitle(path);
  const content = await pn.getMd(path);
  const head = await pn.getHead(path);
  let imgUrl = note?.data?.imgUrl ?? (head?.imgUrl ?? null);
  if (imgUrl && !imgUrl.startsWith("http")) {
    imgUrl = PUBLIC_PULLNOTE_API_URL + imgUrl;
  }
  const cdnCacheControl = "public, max-age=2592000, stale-while-revalidate=86400";

  let md = `---
title: '${title}'
date: '${new Date(note.modified).toISOString()}'
author: ${note?.author}
description: '${head?.description}'
image: ${imgUrl}
published: ${new Date(note.created).toISOString()}
type: 'article'
url: ${url.href.replace(/\.md$/, '')}
id: ${note?._id}
---

${imgUrl ? `[image ${imgUrl} priority=true schema=false]\n\n` : ''}` + content;

  return new Response(md, {
    headers: {
      "Content-Type": "text/markdown; charset=utf-8",
      "Cache-Control": "public, max-age=3600",
      "CDN-Cache-Control": cdnCacheControl,
      "Vercel-CDN-Cache-Control": cdnCacheControl
    }
  });
}
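The template literal above interpolates values directly, so a title containing an apostrophe would produce invalid YAML, and a missing field would be printed as `undefined` or `null`. One way to guard against that is a small front matter builder. The sketch below is an assumption for illustration only: `yamlQuote` and `buildFrontMatter` are hypothetical helpers, not part of the Pullnote client, and the sample values are made up.

```javascript
// Hypothetical helpers (not part of the Pullnote client).

// Wrap a value in a YAML single-quoted scalar.
// Inside single quotes, YAML escapes a quote by doubling it.
// Newlines are collapsed to spaces so each value stays on one line.
function yamlQuote(value) {
  return "'" + String(value).replace(/\s*\n\s*/g, ' ').replace(/'/g, "''") + "'";
}

// Build a front matter block from a flat object of fields.
function buildFrontMatter(fields) {
  const lines = Object.entries(fields)
    // Skip undefined, null, and empty strings instead of emitting "undefined" or "null".
    .filter(([, value]) => value !== undefined && value !== null && value !== '')
    .map(([key, value]) => `${key}: ${yamlQuote(value)}`);
  return ['---', ...lines, '---', ''].join('\n');
}

// Example with made-up values: author and image are dropped, the apostrophe is escaped.
const frontMatter = buildFrontMatter({
  title: "Pullnote's guide to LLMs",
  author: undefined,
  image: null,
  type: 'article'
});
console.log(frontMatter);
```

In the endpoint, the result of `buildFrontMatter({ title, description: head?.description, ... })` could then be prepended to `content` in place of the hand-written template.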

2. Let crawlers know that a markdown version of your content is available, much as an RSS reader discovers an RSS feed, by adding this to your page's <head>:

    <link rel="alternate" type="text/markdown" href={"/docs/llms.md"} />
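Rather than hard-coding the href on every page, it can be derived from the current pathname. This is a minimal sketch under assumptions: `markdownAlternateHref` is a hypothetical helper name, and it treats the home page and any path whose last segment has a file extension as having no markdown twin.

```javascript
// Hypothetical helper: maps a page pathname to the pathname of its markdown version.
// Returns null when no markdown version is expected.
function markdownAlternateHref(pathname) {
  // Strip trailing slashes so "/docs/llms/" and "/docs/llms" give the same result.
  const trimmed = pathname.replace(/\/+$/, '');

  // The home page ("/") has no ".md" twin in this sketch.
  if (trimmed === '') return null;

  // Skip paths that already point at a file, e.g. "/favicon.ico" or "/docs/llms.md".
  const lastSegment = trimmed.split('/').pop();
  if (lastSegment.includes('.')) return null;

  return `${trimmed}.md`;
}

console.log(markdownAlternateHref('/docs/llms'));
```

In a SvelteKit layout, the returned value could be used as the `href` of the `<link rel="alternate">` tag inside `<svelte:head>`, and the tag omitted when the helper returns null.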

3. (Optional) Serve markdown from the normal page URL when the request includes an "Accept: text/markdown" header

// src/hooks.server.js
// Rewrite markdown-accepting requests to the .md endpoint so layouts/pages don't render.
export async function handle({ event, resolve }) {
  const method = event.request.method;
  const accept = event.request.headers.get('accept') || '';

  if ((method === 'GET' || method === 'HEAD') && accept.includes('text/markdown')) {
    const { pathname } = event.url;
    const lastSegment = pathname.split('/').pop() || '';
    const hasExtension = lastSegment.includes('.');

    if (pathname !== '/' && !pathname.endsWith('.md') && !hasExtension && !pathname.startsWith('/_app/')) {
      event.url.pathname = `${pathname}.md`;
    }
  }

  return resolve(event);
}
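The `accept.includes('text/markdown')` check above also matches a header such as `text/markdown;q=0`, which in HTTP content negotiation means the client does not accept markdown. A stricter check could parse the header entries and their quality values. The sketch below is an assumption for illustration; `acceptsMarkdown` is a hypothetical helper, not a SvelteKit API.

```javascript
// Hypothetical helper: returns true only when the Accept header lists
// text/markdown with a quality value greater than zero.
function acceptsMarkdown(acceptHeader) {
  if (!acceptHeader) return false;

  return acceptHeader.split(',').some((entry) => {
    // Each entry looks like "type/subtype" optionally followed by ";param=value" pairs.
    const [type, ...params] = entry.split(';').map((part) => part.trim().toLowerCase());
    if (type !== 'text/markdown') return false;

    // A missing q parameter means q=1; q=0 means "not acceptable".
    const q = params.find((param) => param.startsWith('q='));
    const quality = q === undefined ? 1 : Number.parseFloat(q.slice(2));
    return Number.isFinite(quality) && quality > 0;
  });
}

console.log(acceptsMarkdown('text/html, text/markdown;q=0.9'));
```

In the hook, `acceptsMarkdown(event.request.headers.get('accept'))` could replace the `includes` test. When the same URL can return either HTML or markdown, sending a `Vary: Accept` response header helps shared caches keep the two variants apart.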

Using llms.txt

Add an llms.txt file to your website's root folder (or extend the one you already have) to tell models where your content is and how to read it.
For example, here is our llms.txt:

## Clean Content Access

This site exposes a clean Markdown version of content for LLMs and other
machine readers. Use any of the methods below to retrieve the raw content
without HTML, CSS, or layout cruft.

### 1) Use the `.md` endpoint (recommended)

Append `.md` to any content page, e.g. `/docs/llms.md`

This returns raw Markdown with metadata and embedded image references.

### 2) Send `Accept: text/markdown`

If you request a normal page URL but include an `Accept: text/markdown`
header, the server rewrites the request internally to the `.md` endpoint.

Example:

curl -H "Accept: text/markdown" https://www.pullnote.com/docs/llms

### Notes

* The `.md` endpoint returns `Content-Type: text/markdown; charset=utf-8`.
* Redirects for renamed content preserve `.md` extensions.
* For structured metadata, read the YAML-style front matter at the top
    of the markdown response.
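
On the consuming side, a client that fetches a `.md` URL may want to separate the front matter from the body. The following is a minimal sketch, assuming the flat `key: value` front matter shown in the endpoint above; `splitFrontMatter` is a hypothetical helper, and nested YAML would need a real YAML parser.

```javascript
// Hypothetical helper: splits a markdown response into front matter fields and body.
// Handles only flat "key: value" lines between the opening and closing "---".
function splitFrontMatter(markdown) {
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?/.exec(markdown);
  if (!match) return { meta: {}, body: markdown };

  const meta = {};
  for (const line of match[1].split(/\r?\n/)) {
    // Split on the first colon only, so values such as URLs stay intact.
    const separator = line.indexOf(':');
    if (separator === -1) continue;
    const key = line.slice(0, separator).trim();
    let value = line.slice(separator + 1).trim();
    // Remove one pair of surrounding single quotes and undo YAML's '' escape.
    if (value.length >= 2 && value.startsWith("'") && value.endsWith("'")) {
      value = value.slice(1, -1).replace(/''/g, "'");
    }
    meta[key] = value;
  }

  // The body is everything after the closing "---", without leading blank lines.
  return { meta, body: markdown.slice(match[0].length).replace(/^\s+/, '') };
}

// Example with a made-up response:
const sample = "---\ntitle: 'Best practices'\nurl: https://www.pullnote.com/docs/llms\n---\n\n# Heading\n";
const parsed = splitFrontMatter(sample);
console.log(parsed.meta.title, parsed.meta.url);
```

The input string would typically come from `await (await fetch(url)).text()` against a `.md` URL.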