Every few months a new acronym lands in the trade and someone asks whether they need to act on it. In 2026 it’s llms.txt and “AI crawler controls.” The honest answer for tree surgeons is: one is a small, sensible extra, and the other is a setting you should check today so you’re not accidentally invisible to ChatGPT and Google’s AI. This guide explains both in plain English, and tells you exactly what’s worth doing.

This is a practical, hands-on corner of our wider GEO and AI SEO playbook for tree surgeons — the discipline of getting recommended by AI engines, not just ranked by Google.

What is llms.txt?

llms.txt is a small plain-text file you place at the root of your website (yourdomain.co.uk/llms.txt) that points AI engines at your most important content. It uses simple Markdown: a heading with your business name, a short summary of what you do, and lists of links to your key pages — services, service areas, contact.

The idea, proposed by Jeremy Howard of Answer.AI in 2024, is that large language models work best when handed clean, curated content instead of being left to crawl and parse a messy website. llms.txt is essentially a tidy table of contents written for machines.

Two things to be clear-eyed about:

  • It’s a proposed convention, not an official standard. No AI engine is obliged to read it, and adoption in 2026 is still early and uneven.
  • It doesn’t change your visible site. Like robots.txt, it’s a behind-the-scenes file most of your customers will never see.

So why mention it at all? Because it’s cheap, harmless, and signals you’re early — and in a trade where almost no competitor has touched it, early matters.

What are AI crawlers, and how are they different?

An AI crawler is a bot that visits websites to gather content for an AI product — to train a model, or to fetch live information to answer a user’s question. They’re the AI-era cousins of Googlebot. You control them through your long-standing robots.txt file, the same place you’ve always told search engines which pages they may access.

Here are the AI crawlers most relevant to a UK tree surgeon in 2026:

| Crawler (robots.txt user-agent) | Who runs it | What it feeds |
| --- | --- | --- |
| GPTBot | OpenAI | Training data for OpenAI's models, including those behind ChatGPT |
| OAI-SearchBot | OpenAI | ChatGPT’s search results |
| Google-Extended | Google | Gemini and AI training (separate from Googlebot) |
| PerplexityBot | Perplexity | Perplexity’s answer engine |
| ClaudeBot | Anthropic | Claude training and retrieval |

A key point that trips people up: Google-Extended is separate from Googlebot. It is a control token you name in robots.txt rather than a crawler with its own user-agent string. Blocking Google-Extended affects Gemini and Google’s AI training, but it does not remove you from normal Google Search or AI Overviews, which are served via Googlebot. Google documents this distinction in its crawler help pages, so if in doubt, check the official “Google crawlers” documentation rather than guessing.
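To see why the two names are independent, here is a minimal sketch using Python's standard-library robots.txt parser. The robots.txt text is a made-up example, and the result shows how this parser reads the rules, not a guarantee of how Google itself behaves.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: opt out of Google-Extended, leave everything else open.
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /

User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Each name is matched against its own rule group, so the answers differ.
google_extended_allowed = parser.can_fetch("Google-Extended", "/")
googlebot_allowed = parser.can_fetch("Googlebot", "/")

print("Google-Extended allowed:", google_extended_allowed)
print("Googlebot allowed:", googlebot_allowed)
```

The Google-Extended group only applies to that name, so Googlebot falls through to the open `*` group.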

llms.txt vs robots.txt: what’s the difference?

They’re easy to confuse because both are plain-text files at your site root. They do opposite jobs:

  • robots.txt restricts. It’s a gatekeeper — it tells crawlers what they may and may not access. This is where you allow or block AI bots by name.
  • llms.txt promotes. It’s a guide — it highlights your best content and links to it so AI engines can find and understand you faster.

You can, and usually should, have both: an open robots.txt that welcomes the AI crawlers you care about, and an llms.txt that hands them your greatest hits.

Should a tree surgeon block AI crawlers?

For the overwhelming majority of tree surgeons: no. Here’s the reasoning.

The main argument for blocking is to keep your content out of AI training data. That makes sense for a publisher whose words are the product — a news site, a paywalled course. But your “product” is removing trees, reducing crowns and grinding stumps in specific towns. Your website exists to win that work. When a homeowner types “who can take down a dangerous oak near me?” into ChatGPT or Perplexity, you want to be in the answer — and you can’t be if you’ve blocked the crawler that builds it.

Worse, blocking often happens by accident. Some website builders, security plugins and CDN settings block AI bots by default, so a tree surgeon can be excluded from AI answers without ever choosing to be. That’s why checking your robots.txt is step one.
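As a rough way to run that check, the sketch below feeds a robots.txt into Python's standard-library parser and lists which of the crawlers named in this guide are barred from the homepage. The sample file and the function name `blocked_ai_bots` are illustrative assumptions. It only reads robots.txt, so it will not detect blocks applied by a CDN or security plugin at the network level.

```python
from urllib.robotparser import RobotFileParser

# Crawler names taken from the table earlier in this guide.
AI_BOTS = ["GPTBot", "OAI-SearchBot", "Google-Extended", "PerplexityBot", "ClaudeBot"]


def blocked_ai_bots(robots_txt: str, path: str = "/") -> list[str]:
    """Return the AI crawlers that this robots.txt text bars from `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in AI_BOTS if not parser.can_fetch(bot, path)]


# Made-up example of the kind of default a plugin might write.
SAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: *
Disallow: /wp-admin/
"""

print(blocked_ai_bots(SAMPLE_ROBOTS_TXT))  # ['GPTBot', 'ClaudeBot']
```

To test a real site, paste the contents of yourdomain.co.uk/robots.txt in place of the sample text. An empty list means none of the named crawlers is blocked by robots.txt.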

Block AI crawlers only if you have a concrete reason — for example, genuinely original written content you sell or licence. For a local trade chasing local jobs, an open door beats a closed one almost every time.

Will llms.txt get me cited by AI engines?

This is the honest part most articles skip. No single file makes AI engines quote you. An open robots.txt lets them read you; llms.txt helps them find your best pages; but the decision to cite you still comes down to relevance, clarity, structure and trust.

In other words, the files are plumbing. The water is your content. AI engines favour pages that answer a question directly, are cleanly structured, and come from a business they can verify as real and reputable. That’s why the heavy lifting lives elsewhere in this cluster: in the content, schema and trust work that actually decides citations.

Treat llms.txt as a tidy bonus on top of that work, not a replacement for it.

What should go in a tree surgeon’s llms.txt?

Write it the way you’d want to be quoted. Lead with facts, name your towns, list your services, and skip the marketing adjectives. A simple structure works:

# Smith's Tree Surgery

> NPTC-qualified, fully insured tree surgeons covering Guildford,
> Woking and Farnham. 24-hour emergency storm call-outs.

## Services
- [Tree removal](https://example.co.uk/tree-removal): safe felling and sectional dismantling.
- [Crown reduction & pruning](https://example.co.uk/crown-reduction): reshaping and thinning.
- [Stump grinding](https://example.co.uk/stump-grinding): below-grade stump removal.
- [Emergency tree work](https://example.co.uk/emergency): 24/7 storm-damage call-outs.

## Areas we cover
- [Guildford](https://example.co.uk/areas/guildford)
- [Woking](https://example.co.uk/areas/woking)

## Contact
- [Get a quote](https://example.co.uk/contact)

Use a quick checklist to keep it sharp:

  • H1 with your exact trading name
  • One-line summary naming your services and main towns
  • Links to each core service page with a plain-English description
  • Links to your top service-area pages
  • A contact or quote link
  • No fluff — every line should read well if an AI quoted it alone
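The structural items on that checklist can be checked mechanically. The sketch below is one possible way to do it in Python. The function name `check_llms_txt` and its rules are assumptions drawn from the checklist above, not an official llms.txt validator, and it cannot judge whether the wording is any good.

```python
import re

# A list item in the shape used above: - [Label](https://url): optional description
LINK_RE = re.compile(r"^- \[([^\]]+)\]\((https?://[^)\s]+)\)(?::\s*(.+))?$")


def check_llms_txt(text: str) -> list[str]:
    """Return checklist problems found in an llms.txt draft (empty list = none found)."""
    lines = [line.rstrip() for line in text.splitlines() if line.strip()]
    problems = []

    if not lines or not lines[0].startswith("# "):
        problems.append("first line should be an H1 with the trading name")
    if not any(line.startswith("> ") for line in lines):
        problems.append("no '> ' summary block")
    if not any(line.startswith("## ") for line in lines):
        problems.append("no '## ' sections")

    items = [line for line in lines if line.startswith("- ")]
    matches = [LINK_RE.match(line) for line in items]
    for line, match in zip(items, matches):
        if match is None:
            problems.append(f"list item is not a Markdown link: {line}")

    # Look for "contact" or "quote" in either the link label or the URL.
    link_text = [(m.group(1) + " " + m.group(2)).lower() for m in matches if m]
    if not any("contact" in t or "quote" in t for t in link_text):
        problems.append("no contact or quote link")
    return problems


GOOD_DRAFT = """\
# Smith's Tree Surgery

> NPTC-qualified, fully insured tree surgeons covering Guildford,
> Woking and Farnham. 24-hour emergency storm call-outs.

## Services
- [Tree removal](https://example.co.uk/tree-removal): safe felling and sectional dismantling.

## Areas we cover
- [Guildford](https://example.co.uk/areas/guildford)

## Contact
- [Get a quote](https://example.co.uk/contact)
"""

BAD_DRAFT = "Smith's Tree Surgery\n\n## Services\n- Tree removal\n"

print(check_llms_txt(GOOD_DRAFT))  # []
for problem in check_llms_txt(BAD_DRAFT):
    print("Problem:", problem)
```

The bad draft fails on four counts: no H1, no summary, a bare list item with no link, and no contact or quote link.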

How do you check which AI crawlers are visiting?

You don’t have to guess whether the bots are reaching you. Your hosting or server logs record the user-agent of every visitor, including bots. Search those logs for GPTBot, OAI-SearchBot, PerplexityBot and ClaudeBot to see which engines are already reading your site, and whether any are being turned away. Google-Extended is the exception: it is a robots.txt control token without its own user-agent string, so it will not show up in logs. Google’s requests arrive under its existing crawler names, such as Googlebot.

If you’re on a managed platform (Wix, Squarespace, a hosted WordPress) you may find this in a security or analytics dashboard instead of raw logs. Either way, it tells you two useful things: which AI engines have discovered you, and whether a default setting is silently blocking one you’d rather allow. Pair this with proper measurement of AI search traffic and referrals and you start to see the AI channel clearly rather than treating it as a black box.
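If you do have raw access logs, the search can be scripted. The sketch below assumes the common "combined" log format, where the user-agent is the last quoted field, and treats 401 and 403 responses as "turned away". The log lines, IP addresses and user-agent strings are invented for illustration. Google-Extended is left out because it is a robots.txt token, and Google's requests arrive under its existing user-agent strings.

```python
import re
from collections import Counter

# Crawler names that appear in user-agent strings.
AI_USER_AGENTS = ["GPTBot", "OAI-SearchBot", "PerplexityBot", "ClaudeBot"]

QUOTED_RE = re.compile(r'"([^"]*)"')   # every "quoted" field on the line
STATUS_RE = re.compile(r'" (\d{3}) ')  # status code right after the request field


def summarise_ai_hits(log_lines):
    """Count requests per AI crawler, split into 'served' and 'blocked'."""
    hits = Counter()
    for line in log_lines:
        quoted = QUOTED_RE.findall(line)
        status = STATUS_RE.search(line)
        if not quoted or status is None:
            continue  # not a line in the assumed format
        user_agent = quoted[-1].lower()
        for bot in AI_USER_AGENTS:
            if bot.lower() in user_agent:
                # Assumption: 401/403 means the request was refused.
                outcome = "blocked" if status.group(1) in {"401", "403"} else "served"
                hits[(bot, outcome)] += 1
                break
    return hits


# Invented sample lines in combined log format.
SAMPLE_LOG = [
    '192.0.2.10 - - [10/Mar/2026:10:00:00 +0000] "GET / HTTP/1.1" 200 5123 "-" "Mozilla/5.0 (compatible; GPTBot/1.0; +https://openai.com/gptbot)"',
    '192.0.2.11 - - [10/Mar/2026:10:01:00 +0000] "GET /tree-removal HTTP/1.1" 403 312 "-" "Mozilla/5.0 (compatible; ClaudeBot/1.0)"',
    '198.51.100.7 - - [10/Mar/2026:10:02:00 +0000] "GET /contact HTTP/1.1" 200 2048 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"',
    '192.0.2.10 - - [10/Mar/2026:10:03:00 +0000] "GET /areas/guildford HTTP/1.1" 200 4096 "-" "Mozilla/5.0 (compatible; GPTBot/1.0; +https://openai.com/gptbot)"',
]

for (bot, outcome), count in sorted(summarise_ai_hits(SAMPLE_LOG).items()):
    print(f"{bot}: {count} {outcome}")
```

A crawler that appears only as "blocked" is the signal to look for a default setting that is turning it away.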

So — should tree surgeons bother?

Here’s the verdict, scaled to effort:

| Task | Worth it? | Why |
| --- | --- | --- |
| Check robots.txt isn’t blocking AI bots | Yes, do it today | Prevents accidental invisibility in AI answers |
| Keep AI crawlers allowed | Yes | You want to be quoted to local searchers |
| Add an llms.txt file | Yes, low effort | Cheap, forward-looking, first-mover signal |
| Obsess over llms.txt as your AI strategy | No | Content, schema and trust decide citations |
| Block AI crawlers to “protect” content | Rarely | Costs you visibility for little gain as a local trade |

The whole job — reviewing robots.txt and adding a first llms.txt — takes under an hour. It won’t single-handedly land you in AI Overviews, but it removes a silent failure mode and plants an early flag in a space your competitors haven’t noticed yet.

Getting it done properly

The risk with files like these isn’t doing too little — it’s doing it wrong and not realising. A misplaced Disallow: / can hide your entire site; a default plugin setting can block ChatGPT’s crawler without a word. Because we come from a data and analytics background, our approach to SEO for tree surgeons is to check these controls, watch the crawler logs, and tie the whole AI channel back to tracked leads — the same way we rebuilt Jax Tree Removal’s site and wired up lead tracking so every enquiry could be traced to its source.

If you’re not sure whether your site is welcoming AI crawlers or quietly shutting them out, book a free audit and we’ll show you what your robots.txt is doing, whether you have llms.txt, and what’s actually worth fixing first.