angeo / module-llms-txt
Magento 2 module for AI Engine Optimization (AEO). Generates spec-compliant llms.txt and llms-full.txt per llmstxt.org standard, plus streaming JSONL for vector indexing. Multi-store, multi-website, CLI, cron, async admin UI, Page Builder-aware sanitization, customer-group pricing, atomic writes, ETag/Cache-Control, .md mirrors.
Angeo LLMs.txt — Magento 2 Module
AI Engine Optimization (AEO) for Magento 2 / Adobe Commerce. Generates
spec-compliant llms.txt, llms-full.txt, and JSONL files so ChatGPT,
Claude, Gemini, Perplexity, and other LLM-powered crawlers can ingest your
catalog efficiently.
Current version: 3.2.0 — performance release. Opt-in single-pass
generation pipeline: one catalog pass per store renders all enabled
formats (default stays legacy; the legacy pipeline and the old
ProviderInterface SPI are deprecated and will be removed in 4.0.0).
Builds on 3.1.1 (SQL-level stock filtering, price-index pricing, dedicated
cron group) and 3.1.0 (security & hardening). See
CHANGELOG.md for details and upgrade notes.
What this module does
After install, your storefront serves:
| URL | What it is |
|---|---|
| https://shop/llms.txt | Spec-compliant llmstxt.org file (compact markdown) |
| https://shop/llms-full.txt | Same structure, full sanitized descriptions inline |
| https://shop/llms.jsonl | One JSON record per line, for vector indexing |
| https://shop/{url-key}.md | On-the-fly Markdown mirror of any product/category/CMS page |
Generation happens via cron (daily by default), CLI, or the admin "Generate
Now" button. The output is streamed to disk with bounded memory, atomically
renamed on completion, and served with proper ETag / Cache-Control headers.
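The write-then-rename step described above can be sketched in plain PHP. This is a minimal illustration, not the module's actual writer: `writeAtomically()` and the temp-file naming are hypothetical, and the sketch assumes the temporary file and the target live on the same filesystem.

```php
<?php
declare(strict_types=1);

/**
 * Stream chunks to "<target>.tmp" and rename over the target when done.
 * Hypothetical helper for illustration; not the module's actual writer.
 *
 * @param iterable<string> $chunks
 * @return int bytes written
 */
function writeAtomically(string $target, iterable $chunks): int
{
    $tmp = $target . '.tmp';
    $handle = fopen($tmp, 'wb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open {$tmp} for writing");
    }

    $bytes = 0;
    try {
        foreach ($chunks as $chunk) {
            $written = fwrite($handle, $chunk);
            if ($written === false) {
                throw new RuntimeException("Write to {$tmp} failed");
            }
            $bytes += $written;
        }
    } catch (Throwable $e) {
        fclose($handle);
        unlink($tmp);
        throw $e;
    }
    fclose($handle);

    // rename() within one filesystem swaps the target in a single step,
    // so readers see either the old file or the complete new one.
    if (!rename($tmp, $target)) {
        unlink($tmp);
        throw new RuntimeException("Cannot rename {$tmp} to {$target}");
    }
    return $bytes;
}

// Usage: a generator keeps only one chunk in memory at a time.
$lines = (function (): Generator {
    yield "# Example Store\n\n";
    yield "> One-line summary.\n";
})();
$target = sys_get_temp_dir() . '/llms-example-' . uniqid() . '.txt';
$bytesWritten = writeAtomically($target, $lines);
```

Because the chunks come from a generator, peak memory depends on the largest chunk rather than on the size of the finished file.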
Why this module exists
LLM crawlers can ingest a typical Magento storefront — full theme, JS, image
sprites, navigation chrome — but that's wasteful for everyone. The
llmstxt.org standard defines a clean text format
optimized for AI ingestion: stable links, structured headings, descriptions
in their natural prose form rather than buried in product cards.
This module produces that format for Magento, with care taken for the things
Magento makes hard: multi-store layout, Page Builder content, CMS directive
resolution, customer-group pricing, and very large catalogs.
Installation
    composer require angeo/module-llms-txt:^3.0
    bin/magento module:enable Angeo_LlmsTxt
    bin/magento setup:upgrade
    bin/magento setup:di:compile                          # only in production mode
    bin/magento setup:static-content:deploy adminhtml     # only in production mode
    bin/magento cache:flush
Then generate your first batch:
    bin/magento angeo:llms:generate
Visit https://your-store.tld/llms.txt.
Configuration reference
All settings live at Stores → Configuration → Angeo → LLMs.txt.
General
| Field | Default | Notes |
|---|---|---|
| Enable | Yes | Master switch. |
| Exclude This Scope | No | Available at website + store scope. Skips generation for this scope. |
| Store Summary | — | One-line summary used as the spec-compliant blockquote. If empty, falls back to Design → HTML Head → Default Description. |
Content
| Field | Default | Notes |
|---|---|---|
| Include Categories | Yes | |
| Include CMS Pages | Yes | |
| Include Products | Yes | |
| Products under ## Optional | Yes | Recommended. Lets context-budget-constrained AI clients drop products without losing categories / pages. |
| Product Limit | 5000 | 0 = unlimited. |
| Exclude Out-of-Stock Products | No | |
| CMS Identifiers to Exclude | no-route, enable-cookies, privacy-policy-cookie-restriction-mode | Comma- or newline-separated. |
| Customer Group for Pricing | NOT LOGGED IN | Which group's final price (with special / group prices) is exposed. |
Output formats
| Field | Default | Notes |
|---|---|---|
| Generate llms.txt | Yes | |
| Generate llms-full.txt | No | 5–50× larger; enable only if you actually want it. |
| Generate JSONL | Yes | One record per line; embeds-ready. |
| Serve /url-key.md Mirrors | No | Per-entity Markdown rendering; on-the-fly, no disk. |
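For downstream indexing, the JSONL output can be consumed one line at a time. The sketch below is only an illustration: `readJsonl()` is a hypothetical helper, and the `type` and `name` fields in the demo file are made up (the real record layout is described by etc/jsonl-schema.json).

```php
<?php
declare(strict_types=1);

/**
 * Lazily decode a JSONL file: one JSON document per non-empty line.
 *
 * @return Generator<int, array<string, mixed>>
 */
function readJsonl(string $path): Generator
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open {$path}");
    }
    try {
        while (($line = fgets($handle)) !== false) {
            $line = trim($line);
            if ($line === '') {
                continue;
            }
            // JSON_THROW_ON_ERROR surfaces a malformed line as a JsonException.
            yield json_decode($line, true, 512, JSON_THROW_ON_ERROR);
        }
    } finally {
        fclose($handle);
    }
}

// Demo with a made-up two-record file (field names are assumptions).
$path = sys_get_temp_dir() . '/llms-example-' . uniqid() . '.jsonl';
file_put_contents(
    $path,
    "{\"type\":\"product\",\"name\":\"Mug\"}\n{\"type\":\"category\",\"name\":\"Kitchen\"}\n"
);

$names = [];
foreach (readJsonl($path) as $record) {
    $names[] = $record['name'];
}
```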
Content sanitization
| Field | Default | Notes |
|---|---|---|
| Resolve CMS Directives | Yes | Renders {{widget}}, {{block}}, {{var}} via Magento's frontend filter. |
| Page Builder Strategy | Exclude | See below. |
| Excluded Content-Types | products, banner, slider, slide, video, map, buttons, button-item, block, dynamic-block, divider, spacer | Used under Exclude strategy. |
| Allowed Content-Types | text, heading, html, tabs, tab-item, row, column, column-group | Used under Allow strategy. |
Page Builder strategies
| Strategy | Effect |
|---|---|
| Preserve | Keep all Page Builder content; only strip wrapper attributes. |
| Exclude | Drop elements whose data-content-type is in the excluded list. Default. |
| Allow | Drop everything EXCEPT data-content-type in the allowed list. |
| Strip | Drop ALL elements that carry a data-content-type attribute. |
The filter parses content with DOMDocument (not regex), so nested Page
Builder containers are handled correctly. Known content-types include:
row, column-group, column, tabs, tab-item, text, heading,
html, image, video, map, divider, spacer, buttons, button-item,
banner, slider, slide, products, block, dynamic-block.
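The Exclude strategy can be sketched with DOMDocument and DOMXPath as follows. This is an assumption-laden illustration rather than the module's PageBuilderFilter: `dropExcludedContentTypes()` and the `pb-root` wrapper id are hypothetical names.

```php
<?php
declare(strict_types=1);

/**
 * Remove every element whose data-content-type is in $excluded.
 * Illustrative only; the module's PageBuilderFilter may differ.
 *
 * @param string[] $excluded
 */
function dropExcludedContentTypes(string $html, array $excluded): string
{
    $doc = new DOMDocument();
    // Wrap the fragment in a known root so it can be serialized back without
    // the html/body scaffolding libxml adds; the XML prolog forces UTF-8.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML('<?xml encoding="UTF-8"><div id="pb-root">' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Work on a snapshot of the matches so removals cannot disturb iteration.
    foreach (iterator_to_array($xpath->query('//*[@data-content-type]')) as $node) {
        if ($node->parentNode !== null
            && in_array($node->getAttribute('data-content-type'), $excluded, true)
        ) {
            $node->parentNode->removeChild($node);
        }
    }

    $root = $xpath->query('//*[@id="pb-root"]')->item(0);
    $out = '';
    foreach ($root->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

// A nested banner is removed while its sibling text block survives.
$html = '<div data-content-type="row">'
    . '<div data-content-type="banner">Buy now</div>'
    . '<div data-content-type="text"><p>Keep me</p></div>'
    . '</div>';
$clean = dropExcludedContentTypes($html, ['banner', 'slider']);
```

Walking a parsed tree is what makes the nested case safe: the banner is dropped as a whole subtree, which a regex over the raw markup cannot guarantee.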
Performance
| Field | Default | Notes |
|---|---|---|
| Collection Page Size | 500 | Lowered from 1000 in 3.1.1. Lower further if hitting memory limits on shared hosting. |
HTTP caching
| Field | Default | Notes |
|---|---|---|
| Cache-Control TTL (s) | 3600 | Sent as public, max-age=… on the served files. |
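How a conditional GET might be answered for a generated file is sketched below. The changelog only states that the ETag is a stable sha256-based value; the exact recipe here (path, mtime, and size) and the `conditionalResponse()` helper are assumptions for illustration.

```php
<?php
declare(strict_types=1);

/**
 * Decide status and headers for a conditional GET on a generated file.
 * The ETag recipe (sha256 over path, mtime, size) is an assumption.
 *
 * @return array{0:int,1:array<string,string>}
 */
function conditionalResponse(string $file, ?string $ifNoneMatch, int $ttl = 3600): array
{
    clearstatcache(true, $file);
    $mtime = (int) filemtime($file);
    $etag = '"' . hash('sha256', $file . '|' . $mtime . '|' . filesize($file)) . '"';
    $headers = [
        'ETag' => $etag,
        'Last-Modified' => gmdate('D, d M Y H:i:s', $mtime) . ' GMT',
        'Cache-Control' => 'public, max-age=' . $ttl,
    ];
    // A matching validator means the client copy is current: answer 304 without a body.
    $status = ($ifNoneMatch !== null && trim($ifNoneMatch) === $etag) ? 304 : 200;
    return [$status, $headers];
}

// First request has no validator; the second replays the ETag it received.
$file = tempnam(sys_get_temp_dir(), 'llms');
file_put_contents($file, "# Store\n");
[$firstStatus, $headers] = conditionalResponse($file, null);
[$secondStatus] = conditionalResponse($file, $headers['ETag']);
```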
Cron
| Field | Default | Notes |
|---|---|---|
| Cron Expression | 0 2 * * * | Daily at 02:00 server time. |
CLI commands
    # Generate everything for all eligible stores
    bin/magento angeo:llms:generate

    # Single store, skip JSONL
    bin/magento angeo:llms:generate --store=default --no-jsonl

    # Per-store/per-format last-run status
    bin/magento angeo:llms:status

    # Lint generated files for spec compliance
    bin/magento angeo:llms:validate
Extending — custom providers
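Drop a new section into llms.txt (e.g. a "Brands" list or a "Recent Posts"
section) by implementing Angeo\LlmsTxt\Api\ProviderInterface and
registering it via di.xml. As of 3.2.0 this interface is deprecated in favor
of Api\EntityProviderInterface and will be removed in 4.0.0; existing
providers keep working in both generation modes until then.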
    <?php
    namespace Vendor\Module\Provider\Llms;

    use Angeo\LlmsTxt\Api\OutputContextInterface;
    use Angeo\LlmsTxt\Model\Provider\AbstractProvider;

    class BrandsProvider extends AbstractProvider
    {
        // $this->brandRepo is assumed to be a brand repository injected via
        // the constructor (omitted here).
        public function provide(OutputContextInterface $context): iterable
        {
            // Yield the section chunk by chunk so the generator can stream it to disk.
            yield "## Brands\n\n";
            foreach ($this->brandRepo->getList($context->getStore()->getId()) as $brand) {
                $label = $this->escapeMarkdown($brand->getName());
                yield "- [{$label}]({$brand->getUrl()})\n";
            }
            yield "\n";
        }
    }
    <!-- etc/di.xml -->
    <type name="Angeo\LlmsTxt\Model\Generator\LlmsTxtGenerator">
        <arguments>
            <argument name="providers" xsi:type="array">
                <item name="brands" xsi:type="object">Vendor\Module\Provider\Llms\BrandsProvider</item>
            </argument>
        </arguments>
    </type>
The base class gives you escapeMarkdown(), encodeJsonl(), isJsonl(), and
isFullTxt(), plus an overridable isApplicable() for opting out per format.
Extending — custom sanitizer filters
Insert your own filter between Page Builder and HTML stripping (e.g. to
remove <script> data attributes, redact phone numbers, etc.) by implementing
Angeo\LlmsTxt\Api\SanitizerFilterInterface and re-declaring the pipeline
in di.xml.
    <type name="Angeo\LlmsTxt\Model\Sanitizer\Sanitizer">
        <arguments>
            <argument name="filters" xsi:type="array">
                <item name="cms_directive" xsi:type="object">Angeo\LlmsTxt\Model\Sanitizer\Filter\CmsDirectiveFilter</item>
                <item name="page_builder" xsi:type="object">Angeo\LlmsTxt\Model\Sanitizer\Filter\PageBuilderFilter</item>
                <item name="redact_pii" xsi:type="object">Vendor\Module\Sanitizer\Filter\PiiRedactionFilter</item>
                <item name="html" xsi:type="object">Angeo\LlmsTxt\Model\Sanitizer\Filter\HtmlFilter</item>
                <item name="whitespace" xsi:type="object">Angeo\LlmsTxt\Model\Sanitizer\Filter\WhitespaceFilter</item>
            </argument>
        </arguments>
    </type>
Events
Hook in via observers — three events are dispatched per store/format pass:
| Event | Data |
|---|---|
| angeo_llms_generation_before | store, format, context |
| angeo_llms_generation_after | store, format, file, bytes, items, duration |
| angeo_llms_generation_failed | store, format, error |
Migrating from 2.x
- Old files in media/llms/ can be deleted (output now lives in media/angeo/llms/).
- Any custom ProviderInterface implementations must change from returning a string to yielding iterable<string>. See Extending — custom providers.
- Drop any reverse-proxy / Nginx rewrites pointing at the old paths.
- Re-run Stores → Configuration → Angeo → LLMs.txt to set the new fields (Page Builder strategy, customer group, etc.).
- External tooling that called the GET /admin/angeo_llms/generate/index URL must switch to the CLI command (the admin endpoint is now POST + CSRF).
License
MIT — see LICENSE.
Support
- GitHub Issues: https://github.com/angeo-dev/module-llms-txt/issues
- Email: [email protected]
Changelog
All notable changes to Angeo_LlmsTxt are documented in this file.
The format follows Keep a Changelog,
and this project adheres to Semantic Versioning.
[3.2.0] — 2026-06-10
Single-pass generation pipeline (opt-in). Fully backward compatible: the
default mode remains legacy, all pre-3.2 behavior, file paths, events, and
extension points keep working unchanged. Everything superseded is marked
@deprecated and will be removed in 4.0.0.
Added
- Single-pass pipeline (Model/Pipeline/SinglePassGenerator). With
  Stores → Configuration → Angeo → LLMs.txt → Performance → Generation Pipeline = Single pass,
  each store's catalog is iterated once and every enabled format (llms.txt,
  llms-full.txt, llms.jsonl) is rendered from that one pass:
  - one frontend emulation per store (legacy: one per format),
  - one url_rewrite warm-up per store (legacy: one per format),
  - each entity loaded and sanitized exactly once (legacy: 2–3× per
    product description),
  - all format files written in parallel streams with atomic rename, under one
    per-store lock (media/angeo/llms/store_{code}.lock).

  This gives roughly 3× faster generation on top of the 3.1.1 gains, with
  identical output files.
- New @api extension points (implement these going forward):
  - Api\EntityProviderInterface: yields format-agnostic entity records once
    per entity (successor of the format-specific ProviderInterface);
  - Api\Data\EntityRecordInterface + Model\Data\EntityRecord: immutable
    record DTO carrying already-sanitized content;
  - Api\FormatRendererInterface: serializes records into one output format;
  - Model\Output\FilePathResolver: the single source of truth for generated
    file paths (used by both pipelines and the frontend controller);
  - Model\Text\Truncator: shared word-boundary truncation (the Sanitizer
    now delegates to it; behavior is byte-identical).
- Bundled single-pass providers/renderers registered via di.xml
  (SinglePassGenerator → entityProviders, renderers). Third parties add
  their own items the same way.
- Model/Config/Source/GenerationMode + new system.xml field
  angeo_llms/performance/generation_mode (global scope, default legacy).
- Unit tests: TruncatorTest, including the down-truncation invariant that
  guarantees single-pass renderers reproduce legacy truncation byte-for-byte.
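The core idea of the single-pass pipeline, iterating the records once and feeding each one to every enabled renderer, can be sketched as follows. The record shape and the renderer closures are hypothetical stand-ins for EntityRecordInterface and FormatRendererInterface, and the sketch collects output in strings where the module writes to file streams.

```php
<?php
declare(strict_types=1);

/**
 * Iterate records once and feed each one to every renderer.
 * Record shape and renderer closures are hypothetical stand-ins.
 *
 * @param iterable<array{name:string,url:string,description:string}> $records
 * @param array<string, callable(array):string> $renderers keyed by format
 * @return array<string, string> rendered output per format
 */
function renderSinglePass(iterable $records, array $renderers): array
{
    $output = array_fill_keys(array_keys($renderers), '');
    foreach ($records as $record) {          // the only pass over the data
        foreach ($renderers as $format => $render) {
            $output[$format] .= $render($record);
        }
    }
    return $output;
}

// $passes counts how often the record source is started.
$passes = 0;
$records = (function () use (&$passes): Generator {
    $passes++;
    yield ['name' => 'Mug', 'url' => 'https://shop/mug.html', 'description' => 'A sturdy mug.'];
    yield ['name' => 'Bowl', 'url' => 'https://shop/bowl.html', 'description' => 'A deep bowl.'];
})();

$files = renderSinglePass($records, [
    'llms.txt' => fn (array $r): string => "- [{$r['name']}]({$r['url']})\n",
    'llms-full.txt' => fn (array $r): string => "### {$r['name']}\n\n{$r['description']}\n\n",
    'llms.jsonl' => fn (array $r): string => json_encode($r, JSON_UNESCAPED_SLASHES) . "\n",
]);
```

Loading and sanitizing happen in the record source, so adding a format only adds a cheap render call per record instead of another catalog pass.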
Backward compatibility
- generation_mode defaults to legacy: upgrading changes nothing until
  you opt in.
- In single-pass mode the output files, on-disk paths, served URLs, generation
  status records, and the angeo_llms_generation_before/after/failed events
  (dispatched per format) are identical to legacy.
- Custom providers built on the legacy ProviderInterface keep working in
  both modes. In single-pass mode they are detected automatically (anything
  registered on the legacy generators beyond the bundled providers) and
  executed through a compatibility pass that appends their output to the
  corresponding format stream.
- The only semantic difference: the items counter in generation status now
  counts rendered records rather than raw stream chunks.
Deprecated (removal in 4.0.0)
- Api\ProviderInterface and Model\Provider\AbstractProvider: implement
  Api\EntityProviderInterface instead.
- All eight bundled legacy providers under Model\Provider\Llms\* and
  Model\Provider\Jsonl\*: superseded by Model\Pipeline\Provider\* +
  format renderers.
- Model\Generator\AbstractGenerator, LlmsTxtGenerator,
  LlmsFullTxtGenerator, JsonlGenerator: superseded by
  SinglePassGenerator; file-path resolution moved to FilePathResolver.
- The legacy generation mode itself: 4.0.0 ships single-pass as the only
  pipeline and removes everything listed above.
Changed (internal, not @api)
- Service\GenerationService routes by generation mode; new constructor
  dependency (SinglePassGenerator).
- Controller\Index\Index resolves file paths via FilePathResolver instead
  of the deprecated generators (constructor change).
- Model\Sanitizer\Sanitizer accepts an optional Truncator (defaults
  internally, so existing instantiations and tests are unaffected).
- AbstractGenerator::getProviders() added so the single-pass pipeline can
  discover third-party legacy providers.
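Word-boundary truncation of the kind the Truncator provides can be sketched as below. The exact rules and the `...` marker are assumptions, and `truncateAtWord()` is a hypothetical name; the module's own implementation may differ.

```php
<?php
declare(strict_types=1);

/**
 * Cut $text to at most $limit characters, backing up to the previous space
 * so no word is split, then append a marker.
 * The marker and exact rules are assumptions; the module's Truncator may differ.
 */
function truncateAtWord(string $text, int $limit, string $marker = '...'): string
{
    if (mb_strlen($text) <= $limit) {
        return $text;
    }
    $cut = mb_substr($text, 0, $limit);
    // If the cut landed mid-word, drop the partial word.
    if (mb_substr($text, $limit, 1) !== ' ') {
        $lastSpace = mb_strrpos($cut, ' ');
        if ($lastSpace !== false) {
            $cut = mb_substr($cut, 0, $lastSpace);
        }
    }
    return rtrim($cut) . $marker;
}

$midWord = truncateAtWord('The quick brown fox', 12);   // cut falls inside "brown"
$onBoundary = truncateAtWord('The quick brown fox', 9); // cut falls right after "quick"
$untouched = truncateAtWord('short', 10);
```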
Upgrade notes
- bin/magento setup:upgrade && bin/magento setup:di:compile
- Optional but recommended: switch Performance → Generation Pipeline to
  Single pass, run bin/magento angeo:llms:generate, and diff the
  generated files against the legacy output for your data.
- If you maintain custom providers, plan their migration to
  EntityProviderInterface before 4.0.0.
[3.1.1] — 2026-06-10
Performance release. No public-API changes; drop-in upgrade from 3.1.0.
Performance
- Out-of-stock filtering moved into SQL. Both ProductProviders now use
  StockHelper::addIsInStockFilterToCollection() (a JOIN on
  cataloginventory_stock_status) instead of one StockRegistry round-trip
  per product. On a 100k-SKU catalog with Exclude Out-of-Stock enabled this
  removes ~100,000 queries per format per store.
- Prices come from the price index. Product collections call
  addPriceData($customerGroupId, $websiteId); the final price (group-aware,
  special-/tier-price-aware) is read from the joined
  catalog_product_index_price column instead of invoking the PHP price
  calculation chain per product, which for configurable/bundle products
  lazy-loads child products (another hidden N+1). A per-product fallback to the
  legacy calculation remains for rows missing from the index (e.g. reindex
  pending).
- Dedicated cron group angeo_llms with use_separate_process=1
  (new etc/cron_groups.xml). Long generation runs no longer block
  default-group jobs (transactional emails, scheduled indexers, etc.).
- Default collection_page_size lowered 1000 → 500. Each page holds full
  HTML descriptions of every product in memory; 500 halves the peak without a
  measurable throughput cost. Explicitly configured values are unaffected.
- Duplicate-description sanitization skipped in llms-full.txt: when
  description is byte-identical to short_description (a common merchant
  pattern), the content is sanitized once instead of twice.
Behavior notes
- Exclude Out-of-Stock is now strict: products whose stock status cannot be
  resolved are excluded by the SQL filter, whereas 3.1.0 included them on
  lookup failure ("default in stock"). With a healthy stock index the output
  is identical.
- Prices require the price index to be up to date
  (bin/magento indexer:reindex catalog_product_price), which is standard for
  any production store; stale index rows fall back to the slow per-product
  calculation rather than emitting a wrong price.
- The cron job moved from group default to group angeo_llms. If your
  crontab invokes bin/magento cron:run with explicit --group filters, add
  the new group.
- Internal constructor change (not @api): both ProductProviders now take
  Magento\CatalogInventory\Helper\Stock instead of
  StockRegistryInterface. Recompile DI (setup:di:compile); if you extended
  these concrete classes, update your constructors.
MSI note
Stock filtering still reads the legacy cataloginventory_stock_status table,
which MSI keeps in sync for the default stock. Multi-source/multi-stock setups
that need salable-quantity semantics per stock should override the providers —
now a single JOIN swap instead of a per-product call.
[3.1.0] — 2026-06-10
Security & hardening release following an external security code review.
Upgrading is strongly recommended for all installations, especially those
with the .md mirror feature enabled.
Security
- [HIGH] .md mirror no longer serves disabled or hidden entities
  (information disclosure). Controller/Index/MdMirror now verifies entity
  state before rendering: products must be Enabled, catalog-visible, and
  assigned to the current website; categories must be active; CMS pages must
  be active. Previously a stale url_rewrite row could expose embargoed,
  recalled, or intentionally unpublished content (including price and full
  description) at /{url_key}.md. Hidden entities now return the same 404
  as unknown paths, so their existence is not confirmed.
- [HIGH] .md mirror DoS mitigation. Rendered markdown is now cached in
  the Magento cache (tag ANGEO_LLMS_MD, TTL = configured HTTP Cache-Control
  TTL), so crawls no longer re-trigger entity loads, CMS directive resolution,
  and DOM-based sanitization on every request. Unknown paths are
  negative-cached for 5 minutes to blunt enumeration sweeps; request paths
  longer than 1024 bytes are rejected outright. The cache is flushed
  automatically after every generation run, so mirrors never serve a stale
  catalog state for a full TTL.
- [HIGH] Frontend router no longer hijacks the *.md URL space
  (route hijacking / availability). The router sortOrder moved from 10 to
  70, after the urlrewrite (20), standard (30), and CMS (60) routers, so any
  real merchant content whose URL ends in .md always wins; this module only
  claims paths that would otherwise 404. The .md branch is additionally
  gated on the md-mirror feature being enabled for the resolved store: when
  the feature is off, the router declines the match instead of swallowing the
  request with a 404.
- [MEDIUM] Template-directive injection surface reduced for product content.
  {{block}} / {{widget}} / {{var}} resolution inside product attribute
  content (descriptions frequently imported from supplier/PIM feeds) is now
  controlled by a separate flag, angeo_llms/sanitizer/resolve_directives_products,
  default OFF. When off, directives found in product content are stripped:
  never resolved and never leaked as source. CMS pages and categories keep the
  existing resolve_directives behavior. On any directive-resolution failure
  the filter now strips directive source instead of returning it raw.
- [MEDIUM] HtmlFilter output-encoding fixes (stored-XSS defense for
  downstream consumers; secret-leak prevention):
  - HTML entities are decoded before the final tag-strip pass, then the
    result is stripped again, so an entity-encoded <script>…</script> can no
    longer materialize as live markup in the generated output.
  - Unterminated <script> / <style> blocks (and unterminated HTML
    comments) are removed to end-of-input, so inline JS, which can carry
    analytics tokens or API keys, can never leak into llms.txt,
    llms-full.txt, or .md mirrors.
- [MEDIUM] Wholesale-price disclosure warning. The Customer Group for
  Pricing admin field now carries an explicit warning that the generated
  files are public and CDN-cacheable, and that selecting a logged-in / B2B
  group publishes that group's negotiated pricing to the internet.
- [LOW] Admin error messages no longer expose exception internals. The
  "Generate Now" and "Schedule" actions log full exceptions to
  var/log/system.log and show a generic message in the admin UI.
- [LOW] X-Content-Type-Options: nosniff is now sent on all .md mirror
  responses and all 404 responses (previously only on the file endpoint's
  200 responses).
- [LOW] Admin status panel embeds its polling URL via json_encode()
  instead of raw string interpolation inside a <script> block, per Magento
  secure-rendering guidelines.
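The two HtmlFilter fixes (strip, decode, strip again; cut unterminated blocks to end-of-input) can be illustrated with a small standalone function. This is a sketch of the idea under stated assumptions, not the module's HtmlFilter: `stripHtmlSafely()` and its regular expressions are hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Strip markup so that entity-encoded tags cannot come back to life.
 * A sketch of the decode-then-strip-again idea, not the module's HtmlFilter.
 */
function stripHtmlSafely(string $html): string
{
    // Removes <script>/<style> blocks and comments; an unterminated block is
    // cut to end-of-input (\z) so its contents can never leak.
    $dropBlocks = static function (string $s): string {
        $s = preg_replace('~<(script|style)\b[^>]*>.*?(?:</\1\s*>|\z)~is', '', $s) ?? '';
        return preg_replace('~<!--.*?(?:-->|\z)~s', '', $s) ?? '';
    };

    $text = strip_tags($dropBlocks($html));
    // Decoding may turn &lt;script&gt; into a real tag, so strip a second time.
    $text = html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');
    $text = strip_tags($dropBlocks($text));

    // Collapse runs of whitespace left behind by removed markup.
    return trim(preg_replace('~\s+~u', ' ', $text) ?? '');
}

$plain = stripHtmlSafely('<p>Hello &amp; welcome</p>');
$encoded = stripHtmlSafely('&lt;script&gt;alert(1)&lt;/script&gt;Safe');
$unterminated = stripHtmlSafely('Text <script>var key = "secret"');
```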
Fixed
- Large-file serving no longer loads the whole file into PHP memory.
  Controller/Index/Index streams files above 4 MB to the client in 256 KB
  chunks; concurrent requests for a multi-hundred-MB llms-full.txt can no
  longer exhaust the PHP memory limit. Content-Length is now always sent.
- All file serving goes through Magento's Filesystem abstraction
  (no native is_file / filemtime / file_get_contents on raw paths),
  making the endpoint compatible with Adobe Commerce Cloud remote storage
  (AWS S3) drivers.
- Generation status writes are now concurrency-safe.
  GenerationStatusRepository performs a locked read-modify-write (flock on a
  sidecar lock file) followed by an atomic tmp-rename, so parallel
  generators / cron / CLI runs can no longer lose each other's updates or
  leave a truncated status.json.
- "Schedule (Async)" no longer piles up duplicate cron jobs. A new run is
  only queued when no angeo_llms_generate row is already pending or running;
  the admin is informed otherwise.
- Corrected a misleading comment in MdMirror: the rewrite-lookup fallback
  appends the configured .html URL suffix (it never tried a trailing slash).
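Chunked serving of a large file can be sketched with plain PHP streams. The module itself goes through Magento's Filesystem abstraction, so `streamFile()` below is only a hypothetical stand-in; the 256 KB default mirrors the chunk size named above.

```php
<?php
declare(strict_types=1);

/**
 * Copy a file to an output stream in fixed-size chunks so memory use stays
 * at one chunk regardless of file size. Plain PHP streams are used here;
 * the module itself goes through Magento's Filesystem abstraction.
 *
 * @param resource $out
 * @return int bytes sent
 */
function streamFile(string $path, $out, int $chunkSize = 262144): int
{
    $in = fopen($path, 'rb');
    if ($in === false) {
        throw new RuntimeException("Cannot open {$path}");
    }
    $sent = 0;
    try {
        while (!feof($in)) {
            $chunk = fread($in, $chunkSize);
            if ($chunk === false || $chunk === '') {
                break;
            }
            $sent += (int) fwrite($out, $chunk);
        }
    } finally {
        fclose($in);
    }
    return $sent;
}

// Demo: stream a 600,000-byte file into an in-memory sink.
$source = sys_get_temp_dir() . '/llms-full-example-' . uniqid() . '.txt';
file_put_contents($source, str_repeat('x', 600000));
$sink = fopen('php://memory', 'w+b');
$sent = streamFile($source, $sink);
```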
Changed
- UrlResolver::warmUp() streams url_rewrite rows from the DB cursor
  instead of fetchAll(), roughly halving peak memory on very large rewrite
  tables.
- New public API: AbstractGenerator::getRelativePath() (media-relative path
  of the generated file; preferred over getFilePath() for
  Filesystem-abstraction readers). getFilePath() is retained for backward
  compatibility.
- New well-known shared-context key
  OutputContextInterface::SHARED_ENTITY_TYPE; all bundled providers and the
  .md mirror publish it before sanitizing so filters can apply
  entity-specific policies. Third-party providers are encouraged to do the
  same.
- Admin field comments updated (md-mirror caching behavior, directive
  resolution semantics).
Added
- Config: angeo_llms/sanitizer/resolve_directives_products (default 0).
- Cache tag ANGEO_LLMS_MD for rendered .md mirrors (flush with
  bin/magento cache:clean or automatically on each generation run).
- Unit tests: HtmlFilter security regressions (unterminated script blocks,
  entity-encoded markup resurrection, legitimate < text preservation) and
  CmsDirectiveFilter product-content gating.
Upgrade notes
- Run bin/magento setup:upgrade && bin/magento cache:flush after deploying.
- If you relied on {{widget}} / {{block}} directives inside product
  descriptions being rendered into the generated files, re-enable this
  explicitly at Stores → Configuration → Angeo → LLMs.txt → Content
  Sanitization → Resolve Directives in Product Content after reviewing the
  security note on that field.
- If a customization called AbstractGenerator::getFilePath() to read
  generated files, consider migrating to getRelativePath() plus a
  Filesystem media read-directory for remote-storage compatibility.
- Behavior change: URLs ending in .md that collide with real merchant
  content are now served by that content (the mirror no longer takes
  precedence). URLs of hidden or disabled entities now return 404.
[3.0.5] — 2026-06-04
Admin-config bugfix. Safe drop-in upgrade from 3.0.x.
Fixed
- System Config "Save Config" no longer throws
  Cannot read properties of undefined (reading 'settings'). The Generate
  button frontend_model template (generate_button.phtml) rendered two
  <form> elements inside the admin system-config form (#config-edit-form).
  Nested forms are invalid HTML: the browser re-parents the inner
  inputs/buttons onto the outer form, so on Save the jQuery validator
  (jquery.validate.js metadataRules) iterated an orphaned submit button that
  has no rule metadata and crashed, aborting the whole submit. The buttons
  are now plain type="button" elements that POST via a JS-built form appended
  to <body> (outside the config form). CSRF protection is unchanged: the
  form key is still submitted.
Install-blocking bugfix plus PHP 8.5 support. Safe drop-in upgrade from 3.0.x.
Fixed
- setup:upgrade no longer fails XSD validation on etc/adminhtml/system.xml.
  Two <comment> elements (cache_ttl_seconds and schedule) contained raw
  <code> HTML without a CDATA wrapper. system_file.xsd only allows a model
  child inside <comment>, so the literal markup tripped
  Element 'code': This element is not expected. Expected is ( model ) and
  aborted module loading. Both comments are now wrapped in <![CDATA[ … ]]>,
  matching every other HTML-bearing comment in the file.
Changed
- Added PHP 8.5 to the supported range (…||~8.5.0). Intended for Magento
  2.4.9+, which is the first line to support PHP 8.5; on 2.4.8 and earlier,
  PHP 8.4 remains the recommended runtime.
Admin-config bugfix. No functional or API changes — safe drop-in upgrade
from 3.0.x.
Fixed
- System Config "Save Config" no longer throws a JS TypeError. Three
  numeric fields in etc/adminhtml/system.xml declared validation classes
  that are not registered in Magento's mage/validation ruleset
  (validate-greater-than-zero and integer). On 2.4.8-p4 the admin form
  validator (jquery.validate.js metadataRules) looks up
  settings on each rule object; the missing rules resolved to undefined,
  producing Cannot read properties of undefined (reading 'settings') and
  aborting the entire form submit. Replaced with registered rules:
  - collection_page_size → validate-digits validate-digits-range digits-range-0-1000000
  - product_limit → validate-digits
  - cache_ttl_seconds → validate-digits
[3.0.4] — 2026-06-03
Compatibility patch. No functional or API changes — safe drop-in upgrade
from 3.0.x.
Changed
- Lowered the minimum PHP to 8.1 (~8.1.0||~8.2.0||~8.3.0||~8.4.0).
  The module uses no PHP 8.2+ only syntax, so it runs on 2.4.5 / 2.4.6 stores
  that are still on PHP 8.1 as well as on 2.4.7 / 2.4.8 (PHP 8.3 / 8.4).
- Broadened dependency constraints to cover 2.4.5 through 2.4.8. Every
  Magento dependency in require now uses an open lower-bound (>=) pinned to
  the major line that shipped with 2.4.5, e.g. magento/framework: >=102.0
  and magento/module-url-rewrite: >=102.0. Because these major lines do not
  change between 2.4.5 and 2.4.8, the module installs cleanly across all of
  those minors. This replaces the earlier exact carets (such as the ^101.2
  on module-url-rewrite) that failed on 2.4.8, where that module ships as
  102.x.
[3.0.2] — 2026-06-03
Marketplace-readiness patch. No functional or API changes — safe drop-in
upgrade from 3.0.0.
Fixed
- Replaced md5() with hash('sha256', …) for ETag generation in the
  file-serving controller. The Magento Coding Standard forbids md5(); the
  ETag only needs to be stable and unique, so the switch is behaviour-neutral.
- Removed error-silencing @ operators from filesystem calls
  (fopen / flock / fclose) in the atomic-write lock helper and in the
  validate command. Return values were already checked explicitly, so
  dropping @ changes no behaviour while clearing the coding-standard errors.
Changed
- Dependency constraints pinned to real 2.4.x major lines. require now
  uses caret ranges matching the actual published modules, notably
  magento/module-url-rewrite: ^102.0 (the 101.2 line never existed). This
  resolves a composer require failure on clean 2.4.8 installs.
- Added an explicit version field (3.0.1) to composer.json so the
  package version matches the Marketplace submission form.
[3.0.0] — 2026-05-23
A full rebuild against the architectural review of 2.1.4. This release is
not drop-in compatible — see the Breaking Changes section below for
migration steps.
Breaking changes
- ProviderInterface::provide() signature changed from string to
  iterable<string>. Custom providers contributed by third-party modules
  must now yield chunks rather than return one concatenated string. This is
  the change that lets the generator stream to disk with bounded memory.
- /llms-full.txt now serves a genuinely different file (full sanitized
  descriptions inline). Previously, this URL silently aliased to /llms.txt,
  which was misleading.
- llms.txt header is now spec-compliant. A single blockquote summary line,
  with currency / locale / base-URL moved to a plain markdown paragraph below.
  The 2.x output used four blockquote lines, which broke llmstxt.org-spec
  parsers.
- Status tracking moved out of core_config_data and into
  var/angeo_llms/status.json. Old status rows under angeo_llms/status/*
  are no longer read. Drop them via
  bin/magento config:set --lock-env angeo_llms/status/... "" if you want a
  clean state, but it's harmless to leave them.
- media/llms/ is no longer used as the file output directory; output now
  lives under media/angeo/llms/. Old files can be deleted; remove any
  reverse-proxy rewrites pointing at the old path.
- Admin "Generate" action moved to POST + CSRF. If you have any external
  tooling that hit the old GET URL, switch to the CLI command instead.
- Module namespace unchanged: still Angeo\LlmsTxt. Composer package
  name unchanged.
Added
- Page Builder element filter with four strategies (preserve, exclude,
  allow, strip) driven by the element's data-content-type attribute.
  Default list of excluded types drops common visual-only elements
  (products carousel, banner, slider, video, map, buttons, block,
  dynamic-block, divider, spacer) so the output focuses on semantic text.
  Configurable per-store at Stores → Configuration → Angeo → LLMs.txt →
  Content Sanitization.
- Streaming generation via PHP generators. Memory stays bounded at one
  collection page (default 1000 products) regardless of catalog size.
- Atomic writes: each file is written to .tmp, then renamed. Readers
  never see a half-written file. Generation locks via a separate .lock file
  with flock(LOCK_EX | LOCK_NB), so concurrent runs cannot corrupt output.
- Cursor pagination (entity_id > $lastId, ordered by entity_id ASC) instead
  of skip/limit, so products inserted mid-run can neither be duplicated nor
  skipped.
- Batch URL resolver loads every URL rewrite for a store in one query
  (vs. the per-product getProductUrl() query that 2.x triggered N times).
- Real llms-full.txt with full sanitized descriptions inline.
- /{url_key}.md mirrors: every product, category, and CMS page exposes
  a clean Markdown rendering at its URL with .md appended. Generated on the
  fly; no extra disk storage.
- CMS directive resolution: {{widget}}, {{block}}, {{var}}, and
  {{store}} directives are now rendered via Magento's standard frontend
  filter before being stripped, instead of leaking as literal text.
- Customer-group-aware pricing: admin can choose which customer group's
  final price (with special-price and group-price applied) gets exposed.
- HTTP caching: ETag, Last-Modified, Cache-Control: public, max-age=,
  X-Robots-Tag: noindex, follow, and 304 responses on conditional GETs.
- Async admin action: Schedule (Async) inserts a cron_schedule row for
  the next tick so admins don't have to wait through a synchronous generation.
- Live admin status panel polling /angeo_llms/status/index every 60s.
- Three CLI commands:
  - bin/magento angeo:llms:generate [--store=…] [--no-jsonl] [--no-llms] [--no-full]
  - bin/magento angeo:llms:status
  - bin/magento angeo:llms:validate [--store=…]
- JSONL JSON-Schema at etc/jsonl-schema.json for downstream pipelines.
- Events: angeo_llms_generation_before, angeo_llms_generation_after,
  angeo_llms_generation_failed, for custom hooks.
- PHPUnit test suite under Test/Unit/.
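Cursor pagination as listed above can be sketched independently of Magento. `$fetchPage` is a hypothetical callback standing in for a query of the form `WHERE entity_id > :lastId ORDER BY entity_id ASC LIMIT :size`; the in-memory `$table` replaces the database for the demo.

```php
<?php
declare(strict_types=1);

/**
 * Keyset ("cursor") pagination: each page asks for ids greater than the last
 * id already seen, instead of using an offset.
 *
 * @param callable(int, int): array<int, array{entity_id:int}> $fetchPage
 */
function iterateByCursor(callable $fetchPage, int $pageSize): Generator
{
    $lastId = 0;
    do {
        $rows = $fetchPage($lastId, $pageSize);
        foreach ($rows as $row) {
            $lastId = $row['entity_id'];
            yield $row;
        }
    } while (count($rows) === $pageSize); // a short page means the end was reached
}

// In-memory stand-in for the product table.
$table = [1, 2, 3, 5, 8];
$fetchPage = function (int $lastId, int $size) use (&$table): array {
    $ids = array_values(array_filter($table, fn (int $id): bool => $id > $lastId));
    sort($ids);
    return array_map(fn (int $id): array => ['entity_id' => $id], array_slice($ids, 0, $size));
};

$seen = [];
foreach (iterateByCursor($fetchPage, 2) as $row) {
    $seen[] = $row['entity_id'];
    if ($row['entity_id'] === 3) {
        $table[] = 9; // a product created mid-run gets a higher id
    }
}
```

With offset-based paging, a row inserted or deleted during the run shifts every later page; keying on the last seen id makes each page independent of such shifts.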
Changed
- frontend_default_meta_description is now the fallback for the store
  summary, before falling back to the generic stub.
- Multi-store store-code routing handles the last URL path segment, so
  /de/llms.txt works on path-based stores.
- Spec compliance: products go under ## Optional by default (admin
  toggleable) so context-budget-constrained clients can drop them.
- Out-of-stock products excluded by an explicit StockRegistry lookup
  (configurable).
- Logger context is now structured: every log line is prefixed
  [Angeo LlmsTxt] and includes store/format keys.
Fixed
- Pseudo-locking in 2.x: a 'w' open truncates the file before the
  flock() call, so two concurrent generations both saw an empty file and
  the last writer won unpredictably. 3.0 uses a separate .lock file.
- CSRF-exposed admin generate: 2.x used a GET URL; 3.0 requires POST with
  the form key.
- Synchronous admin "Generate" timing out on large catalogs (now async option).
- N+1 URL rewrite queries: now batched.
- Literal {{widget}} text appearing in 2.x output: now resolved.
- Stale files for stores that became inactive or excluded: now cleaned up
  on every generation run.
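The sidecar-lock fix for the pseudo-locking bug can be sketched as follows. `withLock()` is a hypothetical helper, not the module's lock class; the key points are that the lock file is opened in `c` mode (create, never truncate) and that the data file is never opened just to take the lock.

```php
<?php
declare(strict_types=1);

/**
 * Run $job while holding an exclusive, non-blocking lock on a sidecar file.
 * Returns false without running the job if the lock is already held.
 * Mode 'c' creates the lock file if needed but never truncates anything.
 */
function withLock(string $lockFile, callable $job): bool
{
    $handle = fopen($lockFile, 'c');
    if ($handle === false) {
        throw new RuntimeException("Cannot open lock file {$lockFile}");
    }
    try {
        if (!flock($handle, LOCK_EX | LOCK_NB)) {
            return false; // another run holds the lock
        }
        try {
            $job();
        } finally {
            flock($handle, LOCK_UN);
        }
        return true;
    } finally {
        fclose($handle);
    }
}

// Usage: the job only runs once the lock is held.
$lockFile = sys_get_temp_dir() . '/store_default-' . uniqid() . '.lock';
$ran = false;
$acquired = withLock($lockFile, function () use (&$ran): void {
    $ran = true;
});
```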
Removed
- media/llms/ legacy directory (see breaking-changes notes).
- GET endpoint for admin generation.
- Documented-but-non-existent config fields from 2.x README.
[2.1.4] — Pre-rebuild baseline
Last release in the 2.x line. See the architectural review document for
the issues that motivated 3.0.0.
| Version | Stability | QA Status | Compatibility | Released |
|---|---|---|---|---|
| 3.2.0 | stable | Fail | Magento 2.4.7-2.4.9 | 2026-06-14 18:59:28 |
| 3.0.5 | stable | Fail | Magento 2.4.7-2.4.9 | 2026-06-04 19:39:51 |
| 3.0.4 | stable | Not tested | Not yet tested | 2026-06-03 18:23:19 |
| 3.0.3 | stable | Not tested | Not yet tested | 2026-06-03 18:04:49 |
| 3.0.2 | stable | Not tested | Not yet tested | 2026-06-03 17:46:25 |
| 3.0.1 | stable | Not tested | Not yet tested | 2026-06-03 16:17:59 |
| 3.0.0 | stable | Not tested | Not yet tested | 2026-05-29 20:31:58 |
| 2.1.4 | stable | Not tested | Not yet tested | 2026-05-06 04:36:21 |
| 2.1.3 | stable | Not tested | Not yet tested | 2026-04-30 07:41:35 |
| 2.1.2 | stable | Not tested | Not yet tested | 2026-04-30 05:05:25 |
| 2.1.1 | stable | Not tested | Not yet tested | 2026-04-29 20:38:27 |
| 2.1.0 | stable | Not tested | Not yet tested | 2026-04-29 20:07:40 |
| 2.0.0 | stable | Not tested | Not yet tested | 2026-04-16 18:52:27 |
| 1.1.2 | stable | Not tested | Not yet tested | 2026-03-20 18:38:35 |
| 1.1.1 | stable | Not tested | Not yet tested | 2026-03-18 18:33:44 |
Requires 12
| Package | Constraint |
|---|---|
| ext-json | * |
| ext-mbstring | * |
| magento/framework | >=102.0 |
| magento/module-backend | >=102.0 |
| magento/module-catalog | >=104.0 |
| magento/module-catalog-inventory | >=100.4 |
| magento/module-catalog-url-rewrite | >=100.4 |
| magento/module-cms | >=104.0 |
| magento/module-config | >=101.2 |
| magento/module-store | >=101.0 |
| magento/module-url-rewrite | >=102.0 |
| php | ~8.1.0||~8.2.0||~8.3.0||~8.4.0||~8.5.0 |
Requires-dev 3
| Package | Constraint |
|---|---|
| magento/magento-coding-standard | ^32.0 |
| phpstan/phpstan | ^1.10 |
| phpunit/phpunit | ^10.5 |
Suggests 2
| Package | Reason |
|---|---|
| magento/module-page-builder | Enable to opt-in or opt-out of Page Builder content elements per content-type during sanitization |
| magento/module-shared-catalog | Adobe Commerce: integrate B2B shared catalogs so llms.txt only exposes the allowed catalog |
Compatibility
Each Magento release line is installed on its supported PHP versions, then the module is built (DI compilation + static-content deploy) and its unit and integration suites are run. The matrix shows the lines and PHP versions the module is confirmed to install and run on. Code-quality results further down (phpstan, phpcs, …) are reported separately and never affect compatibility.
Code Quality
Advisory checks against the module's source. Static analysis runs once across the whole module; PHPStan re-runs per Magento + PHP version because resolvable symbols differ between releases. These NEVER affect the Compatibility badge. A phpcs finding can't make a module incompatible.
Static analysis
Coding standards (phpcs), mess detection (phpmd), copy-pasted code (cpd), PHP cross-version compatibility, composer.json validity. Each runs once for the whole module.
| Tool | Status | Findings | Summary |
|---|---|---|---|
| PHPCS | Fail | 125 | 1 error, 124 warnings (ruleset: Magento2) — 48 auto-fixable with phpcbf |
| PHPMD | Warning | 33 | 33 rule violations (CyclomaticComplexity:8, MissingImport:8, NPathComplexity:6, UnusedFormalParameter:4, ExcessiveMethodLength:3) |
| Cpd | Pass | 0 | |
| Composer validate | Info | 10 | valid; 10 advisory notes (composer validate --strict) |
PHPStan
Type-checks the module's PHP against a real Magento install at the configured gate level. Re-runs per Magento and PHP version because resolvable symbols differ between releases.
Tests
Unit and integration suites, run for each applicable Magento and PHP version. A test failure speaks to the module's behaviour, not its compatibility with a Magento line, so it is reported here separately and never reddens the compatibility matrix.
Unit tests
Integration tests
| Magento | PHP 8.2 | PHP 8.3 | PHP 8.4 | PHP 8.5 |
|---|---|---|---|---|
| 2.4.7 | N/A | N/A | | |
| 2.4.8 | N/A | N/A | | |
| 2.4.9 | N/A | N/A | | |
Security
Security checks run directly against the module: an audit of its declared dependencies for known vulnerabilities (composer audit) and a scan of its source for malware and web-shell signatures. Each runs once. A malware detection fails the version outright.