Output format (v2.0) | Documentation | Website Categorization API | WhoisXML API

Output format

{
    "categories": [
        {
            "tier1": {
                "confidence": 0.6479678859489982,
                "id": "IAB-379",
                "name": "News and Politics"
            },
            "tier2": {
                "confidence": 0.9644738361093003,
                "id": "IAB-390",
                "name": "Weather"
            }
        }
    ],
    "domainName": "cnn.com",
    "websiteResponded": true
}

Output parameters

domainName
The website's URL or domain name (cnn.com in the sample above).
websiteResponded

Indicates whether the website was active during crawling. The website is considered active if it:

  • Responds within a 20-second timeout (connection timeout: 10 s, response read timeout: 10 s)
  • Responds with HTTP status code 200
  • Sends a Content-Type header of either text/html or text/plain
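
The three conditions above can be written as a small predicate. This is only a sketch and not the API's crawler: the function name `looks_active` and its parameters are hypothetical, it checks values you pass in rather than making a request, and two details are assumptions (a timing exactly at the limit still passes, and parameters such as `charset` in the Content-Type header are ignored).

```python
from typing import Optional

# Limits taken from the conditions listed above.
CONNECT_TIMEOUT_S = 10.0
READ_TIMEOUT_S = 10.0
ACCEPTED_TYPES = {"text/html", "text/plain"}


def looks_active(connect_s: float, read_s: float,
                 status_code: int, content_type: Optional[str]) -> bool:
    """Return True if a response meets all three documented conditions."""
    # Condition 1: both phases finish within their timeouts
    # (assumption: a value equal to the limit still counts).
    if connect_s > CONNECT_TIMEOUT_S or read_s > READ_TIMEOUT_S:
        return False
    # Condition 2: the HTTP status code is exactly 200.
    if status_code != 200:
        return False
    # Condition 3: Content-Type is text/html or text/plain.
    if not content_type:
        return False
    # Assumption: drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type in ACCEPTED_TYPES
```

For instance, `looks_active(0.4, 1.2, 200, "text/html; charset=utf-8")` passes all three checks, while a 404 response or an `application/json` body does not.
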
categories

The list of categories that may apply to the website. Get all possible categories here.

categories[0].tier1

The top-level category object.

The Website Categorization API performs classification based on the IAB Content Taxonomy 2.2.

Originally, the IAB taxonomy used up to four tiers for some categories. Because the lower-level categories are too narrow, we merged the lower tiers and reduced the taxonomy to two levels for better accuracy and readability. All the original IDs are kept unchanged, so you can restore the full path from the IAB files if necessary.

  • Tier 1: reflects the top-level category of the content. Usually, such categories are too broad and point only to a general content topic.
  • Tier 2: reflects IAB's tiers 2, 3, and 4. Such categories are narrower and describe content more specifically.

The Website Categorization API usually returns multiple categories, sorted by relevance in descending order. Without the "minConfidence" parameter, the API returns all categories with a relevance greater than 0.5. The relevance of a category is the maximum of its two tiers' probabilities. For example, if Tier 1's probability is 0.90 and Tier 2's is 0.99, the overall relevance is 0.99. Conversely, if Tier 1's probability is 0.98 and Tier 2's is 0.8, the relevance is 0.98.
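
The relevance rule can be reproduced on the client side, for example to re-rank or re-filter a stored response. The sketch below assumes entries shaped like the sample output; the helper names `relevance` and `rank` are hypothetical, and applying a strict "greater than" comparison to a custom threshold is an assumption carried over from the default 0.5 rule.

```python
from typing import Any, Dict, List

DEFAULT_THRESHOLD = 0.5  # default cut-off when "minConfidence" is not set


def relevance(category: Dict[str, Any]) -> float:
    """Relevance of one entry: the maximum confidence across its tiers."""
    confidences = [
        category[tier]["confidence"]
        for tier in ("tier1", "tier2")
        if category.get(tier)  # tier2 may be absent
    ]
    return max(confidences, default=0.0)


def rank(categories: List[Dict[str, Any]],
         min_confidence: float = DEFAULT_THRESHOLD) -> List[Dict[str, Any]]:
    """Keep entries above the threshold, most relevant first."""
    kept = [c for c in categories if relevance(c) > min_confidence]
    return sorted(kept, key=relevance, reverse=True)


# The two worked examples from the text, plus one entry below the cut-off.
examples = [
    {"tier1": {"confidence": 0.98}, "tier2": {"confidence": 0.8}},
    {"tier1": {"confidence": 0.90}, "tier2": {"confidence": 0.99}},
    {"tier1": {"confidence": 0.4}},
]
ranked = rank(examples)
print([relevance(c) for c in ranked])  # [0.99, 0.98]
```
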

categories[0].tier2
The second-level category object (if present).
categories[0].tier1.id
The unique category identifier.
categories[0].tier1.confidence
The probability that the category is relevant to the website.
categories[0].tier1.name
The readable name of the category.
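
As an illustration of how the fields above fit together, the following sketch parses the sample response from this page with Python's standard `json` module and builds one readable line per category. The function name `describe` and the output layout are hypothetical choices; the only structural assumption is that `tier2` may be missing, as noted above.

```python
import json
from typing import List

# The sample response shown at the top of this page.
SAMPLE = """
{
    "categories": [
        {
            "tier1": {"confidence": 0.6479678859489982,
                      "id": "IAB-379", "name": "News and Politics"},
            "tier2": {"confidence": 0.9644738361093003,
                      "id": "IAB-390", "name": "Weather"}
        }
    ],
    "domainName": "cnn.com",
    "websiteResponded": true
}
"""


def describe(response_text: str) -> List[str]:
    """Return one 'domain: Tier 1 > Tier 2 [ids]' line per category."""
    data = json.loads(response_text)
    domain = data["domainName"]
    lines = []
    for entry in data.get("categories", []):
        tiers = [entry["tier1"]]
        if entry.get("tier2"):  # the second tier is optional
            tiers.append(entry["tier2"])
        names = " > ".join(t["name"] for t in tiers)
        ids = ", ".join(t["id"] for t in tiers)
        lines.append(f"{domain}: {names} [{ids}]")
    return lines


for line in describe(SAMPLE):
    print(line)  # cnn.com: News and Politics > Weather [IAB-379, IAB-390]
```
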