Known Agents SDK

This library provides convenient access to Known Agents from server-side TypeScript or JavaScript.

Install the Package

Install the package from npm:

npm install @knownagents/sdk

Initialize the Client

Sign up for Known Agents, create a project, and copy your access token from the project's settings page. Then, create a new instance of KnownAgents.

import { KnownAgents } from "@knownagents/sdk"

const knownAgents = new KnownAgents("YOUR_ACCESS_TOKEN")
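To keep the access token out of source control, one option is to read it from the environment at startup. The following is a minimal sketch, not part of the SDK: the helper name readAccessToken and the variable name KNOWN_AGENTS_ACCESS_TOKEN are both assumptions chosen for illustration.

```typescript
// Hypothetical helper (not part of the SDK): reads the access token from an
// environment-like object and fails fast if it is missing.
// The variable name KNOWN_AGENTS_ACCESS_TOKEN is an assumption.
function readAccessToken(
    env: Record<string, string | undefined>,
    name: string = "KNOWN_AGENTS_ACCESS_TOKEN"
): string {
    const token = env[name]?.trim()
    if (!token) {
        throw new Error(`Missing environment variable ${name}`)
    }
    return token
}

// Usage in Node.js:
// const knownAgents = new KnownAgents(readAccessToken(process.env))
```

Failing at startup makes a missing token visible immediately instead of surfacing later as rejected requests.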

How To Set Up Agent & LLM Analytics (Full Docs)

Get realtime insight into the hidden ecosystem of crawlers, scrapers, AI agents, and other bots browsing your website. Measure human traffic coming from AI chat and search platforms like ChatGPT, Perplexity, and Gemini.

To collect this data, call trackVisit for each incoming request in the endpoints where you serve your pages.

knownAgents.trackVisit(request)

For richer analytics, include the response and duration:

knownAgents.trackVisit(request, response, responseDurationInMilliseconds)

Use Middleware if Possible

If you can, add this in middleware to track incoming requests to all pages from a single place.

Here's an example with Express, but you can apply this same technique with other frameworks:

import express from "express"
import { KnownAgents } from "@knownagents/sdk"

const app = express()
const knownAgents = new KnownAgents("YOUR_ACCESS_TOKEN")

app.use((req, res, next) => {
    const start = Date.now()
    // Track the visit once the response has finished, when the status code and duration are known
    res.on("finish", () => {
        const duration = Date.now() - start
        knownAgents.trackVisit(req, res, duration)
    })

    next()
})

app.get("/", (req, res) => {
    res.send("Hello, world!")
})

app.listen(3000, () => console.log("Server running on port 3000"))
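Middleware sees every request, including static assets, while the guidance above is to track the endpoints where you serve pages. If you want to narrow tracking accordingly, a small path filter is one way to do it. This is a sketch under assumptions: the function name isPageRequest and the extension list are hypothetical and should be adapted to your site.

```typescript
// Hypothetical filter (not part of the SDK): true when the URL looks like a page
// rather than a static asset. The extension list is an assumption.
const ASSET_EXTENSIONS = new Set([
    ".js", ".css", ".map", ".png", ".jpg", ".jpeg", ".gif", ".svg", ".ico", ".woff", ".woff2"
])

function isPageRequest(url: string): boolean {
    // Drop the query string and fragment before inspecting the path
    const path = url.split(/[?#]/, 1)[0] ?? ""
    const lastSegment = path.slice(path.lastIndexOf("/") + 1)
    const dotIndex = lastSegment.lastIndexOf(".")
    if (dotIndex === -1) {
        // No file extension, so treat it as a page
        return true
    }
    return !ASSET_EXTENSIONS.has(lastSegment.slice(dotIndex).toLowerCase())
}
```

Inside the middleware, the trackVisit call could then be wrapped in `if (isPageRequest(req.url)) { ... }`.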

Batch Requests If Possible

For high-traffic websites, batch multiple visits together and send them periodically (e.g. every 30 seconds) using trackVisits:

import { VisitRequest } from "@knownagents/sdk"

const visits: VisitRequest[] = []

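// Collect visits (e.g. in the middleware's "finish" handler, where req, res, and duration are in scope)
visits.push({
    request_path: req.url,
    request_method: req.method,
    request_headers: req.headers,
    response_status_code: res.statusCode,
    response_duration_in_milliseconds: duration,
    created: new Date().toISOString()
})

// Every 30 seconds, send the buffered visits and empty the buffer
setInterval(() => {
    if (visits.length > 0) {
        // splice(0) removes every buffered visit and returns them as the batch
        knownAgents.trackVisits(visits.splice(0))
    }
}, 30000)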
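The example above buffers visits in a plain array. On a busy site you may also want to cap memory use and keep a batch if sending fails. The following is a minimal sketch of that idea, not part of the SDK: the VisitBuffer class and the local BufferedVisit shape are hypothetical, and it assumes the send function returns a promise that rejects on failure.

```typescript
// Local shape mirroring the fields used in the example above. In real code,
// prefer the SDK's own VisitRequest type.
interface BufferedVisit {
    request_path: string
    request_method: string
    request_headers: Record<string, string | string[] | undefined>
    response_status_code?: number
    response_duration_in_milliseconds?: number
    created?: string
}

// Hypothetical buffer (not part of the SDK)
class VisitBuffer {
    private visits: BufferedVisit[] = []
    private readonly send: (batch: BufferedVisit[]) => Promise<unknown>
    private readonly maxSize: number

    constructor(send: (batch: BufferedVisit[]) => Promise<unknown>, maxSize: number = 1000) {
        this.send = send
        this.maxSize = maxSize
    }

    get size(): number {
        return this.visits.length
    }

    add(visit: BufferedVisit): void {
        if (this.visits.length >= this.maxSize) {
            // Drop the oldest visit rather than grow without bound
            this.visits.shift()
        }
        this.visits.push(visit)
    }

    async flush(): Promise<void> {
        if (this.visits.length === 0) {
            return
        }
        // Take everything currently buffered; visits added during the send start a new batch
        const batch = this.visits.splice(0)
        try {
            await this.send(batch)
        } catch {
            // Put the failed batch back ahead of newer visits, keeping the newest maxSize entries
            this.visits = batch.concat(this.visits).slice(-this.maxSize)
        }
    }
}
```

With the client from earlier, this could be wired up as `new VisitBuffer(batch => knownAgents.trackVisits(batch))`, calling `flush()` from the 30-second interval and once more on shutdown.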

Test Your Integration

  • Open your project's settings page
  • Click Send a Test Visit
  • Click Realtime

If your website is correctly connected, you should see visits from the Known Agent in the realtime timeline within a few seconds.

How To Set Up Automatic Robots.txt (Full Docs)

Protect sensitive content from unwanted access and scraping. Generate a robots.txt that automatically stays up to date with all current and future bots in the categories you specify.

Call the generateRobotsTXT function with the AgentTypes you want to block and a string specifying which URL paths are disallowed (e.g. "/" to disallow all paths).

import { AgentType } from "@knownagents/sdk"

const robotsTxt = await knownAgents.generateRobotsTXT([
    AgentType.AIDataScraper,
    AgentType.Scraper,
    AgentType.IntelligenceGatherer,
    AgentType.SEOCrawler
    // ...
], "/")

The return value is a plain text robots.txt string. Regenerate it periodically (e.g. once per day), then cache it and serve it from your website's /robots.txt endpoint.
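The caching step can be sketched as a small in-memory holder that always answers immediately and keeps serving the last good copy if a refresh fails. This is an illustration under assumptions, not part of the SDK: the RobotsTxtCache class is hypothetical, and the fallback body (which allows everything) is only a placeholder until the first successful refresh.

```typescript
// Hypothetical cache (not part of the SDK)
class RobotsTxtCache {
    private body: string
    private fetchedAt: number | null = null
    private readonly generate: () => Promise<string>
    private readonly maxAgeMs: number

    constructor(
        generate: () => Promise<string>,
        fallback: string,
        maxAgeMs: number = 24 * 60 * 60 * 1000 // once per day
    ) {
        this.generate = generate
        this.body = fallback
        this.maxAgeMs = maxAgeMs
    }

    // True when there has been no successful refresh within maxAgeMs
    isStale(now: number = Date.now()): boolean {
        return this.fetchedAt === null || now - this.fetchedAt >= this.maxAgeMs
    }

    // Returns the cached body immediately and never waits on the network
    get(): string {
        return this.body
    }

    // Regenerates the file. On failure, the previous body keeps being served
    async refresh(now: number = Date.now()): Promise<boolean> {
        try {
            this.body = await this.generate()
            this.fetchedAt = now
            return true
        } catch {
            return false
        }
    }
}
```

Here generate would wrap the generateRobotsTXT call shown above. In the Express example, the /robots.txt route could respond with `res.type("text/plain").send(cache.get())`, with `refresh()` called at startup and again whenever `isStale()` returns true.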

How To Use Agent Identification (Full Docs)

Use the identifyAgent and identifyAgents functions to identify and verify agents from network requests using Web Bot Auth (HTTP message signatures), IP matching, or other available methods. This can be useful for implementing access policies based on verified agent identity or enriching your own datasets.

Identify a Single Request

Call identifyAgent with the incoming request.

const identification = await knownAgents.identifyAgent(request)

if (identification.result === "verified") {
    // Agent is identified and verified
} else if (identification.result === "verification_failed") {
    // Agent was identified but could not be verified
}

Identify Multiple Requests

Use identifyAgents to identify multiple requests at once:

const identifications = await knownAgents.identifyAgents([
    {
        id: "request-1",
        request_headers: request1.headers
    },
    {
        id: "request-2",
        request_headers: request2.headers
    }
])

identifyAgent returns an object, and identifyAgents returns an array of objects, each with the following fields:

  • id: The identifier from the request (if provided)
  • result: The identification result:
    • "verified": The agent is identified and verified
    • "verification_failed": The agent was identified but could not be verified
    • "unknown_agent": The agent is not in our database
    • "not_verifiable": The agent cannot be verified (no verification method available)
  • agent_id: The unique ID of the agent (if identified)
  • agent_token: The name of the agent (e.g. "Googlebot") (if identified)
  • agent_url: The documentation URL of the agent (if identified)
  • agent_type_name: The type of agent (e.g. "AI Agent") (if identified)
  • operator_name: The company behind the agent (e.g. "Google") (if identified)
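The result values listed above can be mapped to an access policy. The following is a minimal sketch, not part of the SDK: the Identification shape is a local mirror of the fields above, and the decide and indexById helpers, along with the specific mapping from result to decision, are assumptions to adapt to your own rules.

```typescript
type IdentificationResult = "verified" | "verification_failed" | "unknown_agent" | "not_verifiable"

// Local shape mirroring a subset of the fields listed above
interface Identification {
    id?: string
    result: IdentificationResult
    agent_token?: string
    agent_type_name?: string
    operator_name?: string
}

type Decision = "allow" | "block" | "default"

// Hypothetical policy: the mapping below is an assumption, not SDK behavior
function decide(identification: Identification): Decision {
    switch (identification.result) {
        case "verified":
            return "allow"
        case "verification_failed":
            // Identified as a known agent but could not be verified
            return "block"
        case "unknown_agent":
        case "not_verifiable":
        default:
            // No proof either way, so fall through to your normal handling
            return "default"
    }
}

// Index batch results by the id supplied with each request
function indexById(identifications: Identification[]): Map<string, Identification> {
    const byId = new Map<string, Identification>()
    for (const identification of identifications) {
        if (identification.id !== undefined) {
            byId.set(identification.id, identification)
        }
    }
    return byId
}
```

For a batch, `indexById(identifications).get("request-1")` looks up the result for the request that was submitted with that id.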

Requirements

TypeScript >= 4.7 is supported.

The following runtimes are supported:

  • Node.js 18 LTS or later (non-EOL versions only).

Support

Please open an issue with questions, bugs, or suggestions.
