Well-known URIs
| Check | Phase | Weight | Description |
|---|---|---|---|
| /.well-known/agent.json (A2A AgentCard) well_known.agent_json | passive | 10 | Looks for an A2A AgentCard at `/.well-known/agent.json` ([a2aprotocol.ai](https://a2aprotocol.ai)) and checks for the four load-bearing top-level keys: `name`, … |
| /.well-known/ai-plugin.json manifest well_known.ai_plugin_json | passive | 8 | Looks for the ChatGPT-plugin manifest at `/.well-known/ai-plugin.json` and counts how many of the load-bearing OpenAI-schema keys are present (`name_for_human`,… |
| /.well-known/mcp.json (Model Context Protocol) well_known.mcp_json | passive | 8 | Looks for an MCP (Model Context Protocol) manifest at `/.well-known/mcp.json`. Records the top-level keys but only insists on the presence of at least one MCP i… |
| OpenID Connect configuration + dynamic registration well_known.oidc_configuration | passive | 5 | Looks for `/.well-known/openid-configuration` (OIDC Discovery 1.0 / RFC 8414). Grants full credit when the document includes a `registration_endpoint` (RFC 7591… |
| /.well-known/openapi.{json,yaml} well_known.openapi | passive | 6 | Probes the two RFC 8615 well-known OpenAPI paths (`/.well-known/openapi.json`, `/.well-known/openapi.yaml`) and confirms the body is OpenAPI 3.x. This is narrow… |
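
The OpenID Connect row above grants full credit when the discovery document carries a `registration_endpoint`. A minimal sketch of that test follows, assuming the document has already been fetched and parsed into a dict. The function name `supports_dynamic_registration` is hypothetical and is not taken from the scanner itself.

```python
def supports_dynamic_registration(oidc_config: dict) -> bool:
    """Return True when the parsed discovery document advertises a
    non-empty `registration_endpoint` string (RFC 7591)."""
    endpoint = oidc_config.get("registration_endpoint")
    # Assumption: any non-empty string counts; the URL is not validated here.
    return isinstance(endpoint, str) and bool(endpoint.strip())
```

A document with only `issuer` and `authorization_endpoint` returns `False`; adding a `registration_endpoint` URL flips the result to `True`.
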

Root-level files
| Check | Phase | Weight | Description |
|---|---|---|---|
| /ai.txt AI-crawler directives root_level.ai_txt | passive | 4 | Looks for `/ai.txt` — a secondary, still-draft root-level declaration of AI-crawler directives. Multiple drafts compete, so this check only records presence + a… |
| /llms-full.txt long-form index root_level.llms_full_txt | passive | 8 | Looks for `/llms-full.txt` — the long-form companion to `/llms.txt` ([llmstxt.org](https://llmstxt.org)). Passes on a substantive (≥ 1 KB) text body; warns on s… |
| /llms.txt index for LLMs root_level.llms_txt | passive | 8 | Looks for `/llms.txt` at the site root — a Markdown index of important URLs + summaries for LLM consumers ([llmstxt.org](https://llmstxt.org)). Presence is the … |

Crawl & indexing
| Check | Phase | Weight | Description |
|---|---|---|---|
| RSS/Atom feed crawl.feed | passive | 4 | Looks for an RSS/Atom feed via the conventional paths (`/feed`, `/feed.xml`, `/rss`, `/atom.xml`) and the homepage `<link rel="alternate">` tag. The link-alternate declaration is the… |
| robots.txt AI-agent rules crawl.robots_txt | passive | 13 | Parses `/robots.txt` and checks whether the major AI crawlers (GPTBot, ClaudeBot, PerplexityBot, CCBot, Google-Extended, and others) are allowed to crawl the si… |
| XML sitemap discovery crawl.sitemap | passive | 10 | Looks for a sitemap via both the conventional paths (`/sitemap.xml`, `/sitemap_index.xml`) and any `Sitemap:` directives in `/robots.txt`. Accepts `<urlset>` and `<sitemapindex>`. Ab… |
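
The `robots.txt` row above asks whether the major AI crawlers may crawl the site. A minimal sketch of that test using the standard library's `urllib.robotparser` follows. The helper name `ai_crawler_access` is hypothetical, the crawler list covers only the five agents named in the table, and the real check may weigh rules differently.

```python
from urllib.robotparser import RobotFileParser

# Only the crawlers named in the table; the actual list is longer.
AI_CRAWLERS = ["GPTBot", "ClaudeBot", "PerplexityBot", "CCBot", "Google-Extended"]


def ai_crawler_access(robots_txt: str, path: str = "/") -> dict:
    """Map each AI crawler to True/False for whether `path` is allowed."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, path) for agent in AI_CRAWLERS}
```

Given a file that disallows `/` for `GPTBot` and allows everything for `*`, the result marks `GPTBot` as blocked and the other four as allowed.
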

HTML & meta
| Check | Phase | Weight | Description |
|---|---|---|---|
| meta description html_meta.description | passive | Info | Informational (not graded): reports whether the homepage has a `<meta name="description">` and its length. The meta description is a human search-snippet signal; an LLM reads the full … |
| JSON-LD structured data html_meta.json_ld | passive | 8 | Parses every `<script type="application/ld+json">` block on the homepage and categorises by `@type`. `WebAPI` or `SoftwareApplication` is the primary agent signal: those entries explicitly decla… |
| Open Graph tags html_meta.open_graph | passive | Info | Informational (not graded): reports whether the homepage carries the core Open Graph tags (`og:title`, `og:description`, `og:type`). Open Graph drives human soc… |
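
The JSON-LD row above describes extracting every `application/ld+json` script block and grouping entries by `@type`. One way to sketch this with only the standard library is shown below. The names `JsonLdExtractor` and `count_jsonld_types` are hypothetical, and the sketch ignores nested `@graph` containers, which a real implementation would probably walk as well.

```python
import json
from collections import Counter
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the parsed contents of <script type="application/ld+json"> blocks."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # Malformed blocks are skipped rather than failing the scan.


def count_jsonld_types(html: str) -> Counter:
    """Count top-level JSON-LD entries by their @type value(s)."""
    extractor = JsonLdExtractor()
    extractor.feed(html)
    extractor.close()
    counts = Counter()
    for block in extractor.blocks:
        for item in block if isinstance(block, list) else [block]:
            if not isinstance(item, dict):
                continue
            types = item.get("@type", [])
            for name in [types] if isinstance(types, str) else types:
                if isinstance(name, str):
                    counts[name] += 1
    return counts
```

A caller can then test `counts["WebAPI"] or counts["SoftwareApplication"]` to see whether the primary agent signal is present.
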

API discoverability
| Check | Phase | Weight | Description |
|---|---|---|---|
| GraphQL introspection api.graphql_introspection | passive | 8 | POSTs a minimal GraphQL introspection query (`{ __schema { queryType { name } } }`) to `/graphql`, `/api/graphql`, `/query`. Passes when the server returns the … |
| JSON error bodies for API callers api.json_error_body | passive | 5 | Requests a random non-existent path with `Accept: application/json`. Passes when the server returns a JSON error body (`application/json` or `application/proble… |
| OpenAPI specification discovery api.openapi_discovery | passive | 10 | Probes nine conventional paths for an OpenAPI spec (`/openapi.json`, `/api/openapi.yaml`, `/swagger.json`, etc.) and confirms the top-level `openapi` key declar… |
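
Both OpenAPI checks in this catalogue confirm that a discovered document is OpenAPI 3.x by looking at its top-level `openapi` key. A minimal sketch of that version test follows, assuming the body has already been parsed from JSON or YAML into a dict. The function name `is_openapi_3` is hypothetical.

```python
def is_openapi_3(spec: dict) -> bool:
    """Return True when the top-level `openapi` key declares a 3.x version."""
    version = spec.get("openapi")
    # Swagger 2.0 documents use a `swagger` key instead, so they fail this test.
    return isinstance(version, str) and version.startswith("3.")
```

`{"openapi": "3.1.0"}` passes, while a Swagger 2.0 document such as `{"swagger": "2.0"}` does not.
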

Protocols
| Check | Phase | Weight | Description |
|---|---|---|---|
| A2A AgentCard conformance protocols.a2a_agent_card | passive | 8 | Deep conformance check on the A2A AgentCard. Requires `version`, a non-empty `skills` array (each with `name` + `description`) and a non-empty `endpoints` array… |
| Public MCP registry listing protocols.mcp_registry_presence | passive | 10 | Searches the public MCP registries (the official registry.modelcontextprotocol.io and Glama) for the target. A hit requires a verifiable tie — a registry entry … |

Registries
| Check | Phase | Weight | Description |
|---|---|---|---|
| GitHub public repository registries.github_repo | passive | 5 | Finds GitHub repositories attributable to the target via verifiable signals: a homepage link to the repo or its owner, the repo `homepage` field pointing at the… |
| npm SDK package registries.npm_package | passive | 6 | Finds npm packages attributable to the target via verifiable signals: a homepage link to the package or its org on npmjs.com, or package metadata (homepage / re… |
| PyPI SDK package registries.pypi_package | passive | 6 | Direct-probes PyPI for `<name>` and `<name>-sdk` plus any package the homepage links on pypi.org. A hit only counts when the package metadata (home_page / project_urls) poi… |

Documentation
| Check | Phase | Weight | Description |
|---|---|---|---|
| Docs platform discoverability docs.platform | passive | 6 | Probes the conventional docs paths (`/docs`, `/documentation`, `/api`, `/api/docs`, `/reference`, `/developers`) and fingerprints the first hit. Recognises Mint… |
| SDK availability across languages docs.sdk_availability | passive | 8 | Counts language-level SDKs by combining the npm + PyPI registry findings (verified-attributable packages only) with install commands scraped from the docs homep… |

LLM training data
| Check | Phase | Weight | Description |
|---|---|---|---|
| Common Crawl index presence llm_training.common_crawl | passive | 8 | Queries the Common Crawl CDX endpoint for the most recent monthly snapshot and counts the target's pages. Common Crawl's corpus underpins most open-source LLM t… |
| Hacker News mentions llm_training.hn_mentions | passive | 5 | Queries the Algolia-hosted HN Search API for the target host and verifies each hit before counting it — the linked URL must resolve to the host, or the text mus… |
| Wikipedia article llm_training.wikipedia | passive | 8 | Searches Wikipedia for the service (top results for the org part of the host) and credits the first article whose external links resolve to the target domain. T… |

Anti-bot posture
| Check | Phase | Weight | Description |
|---|---|---|---|
| Anti-bot interstitial anti_bot.cloudflare_interstitial | passive | 10 | Fetches the homepage and flags Cloudflare / PerimeterX / Akamai / HUMAN interstitials by header (`cf-mitigated`, `cf-chl-*`, `x-px-captcha`, `x-akamai-session-i… |
| User-agent sniffing anti_bot.user_agent_sniffing | passive | 5 | Fetches the homepage as `AgentDisco/1.0` and again as `curl/7.88.0`, then compares status, content-type, and body length. A large divergence indicates user-agen… |
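
The user-agent sniffing row above compares status, content type, and body length between two fetches of the homepage. The comparison step can be sketched as below. The `Snapshot` type, the `diverges` function, and the 50% body-length threshold are all assumptions made for illustration; the table does not say what counts as a large divergence.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    """The three response properties compared between the two fetches."""
    status: int
    content_type: str
    body_length: int


def diverges(a: Snapshot, b: Snapshot, max_length_ratio: float = 0.5) -> bool:
    """Return True when two responses differ enough to suggest UA sniffing.

    `max_length_ratio` is a hypothetical threshold: the relative difference
    in body length above which the responses are treated as divergent.
    """
    if a.status != b.status:
        return True
    # Compare media types only, ignoring parameters such as charset.
    media_a = a.content_type.split(";")[0].strip().lower()
    media_b = b.content_type.split(";")[0].strip().lower()
    if media_a != media_b:
        return True
    longer = max(a.body_length, b.body_length)
    if longer == 0:
        return False
    shorter = min(a.body_length, b.body_length)
    return (longer - shorter) / longer > max_length_ratio
```

In use, one snapshot would come from the `AgentDisco/1.0` request and the other from the `curl/7.88.0` request.
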

Identity & verification
| Check | Phase | Weight | Description |
|---|---|---|---|
| Email auth (SPF, DMARC, DKIM) identity.email_auth | passive | Info | Informational (not graded): reports whether the domain has SPF (`v=spf1`), DMARC (`_dmarc.<domain>`), and DKIM (probing the selectors `default`, `google`, `s1`, `select… |
| security.txt responsible-disclosure declaration identity.security_txt | passive | Info | Informational (not graded): reports whether RFC 9116 `security.txt` is present (at `/.well-known/security.txt` or `/security.txt`) and declares a `Contact:`. Re… |
| TLS + HSTS + HTTPS redirect identity.tls | passive | 10 | Trust check: TLS certificate validity (chain + dates) and an `http://` → `https://` redirect, both needed for an agent to reach the service. Cert problems are a… |

Agent onboarding
| Check | Phase | Weight | Description |
|---|---|---|---|
| API-key / signup path discoverability onboarding.api_key_path | passive | 6 | Looks for API-key signup discoverability: probes conventional paths (/signup, /register, /developers, /api-keys, /account/api) plus anchors on the homepage and … |