2.1 Deconstructing the Flaw in Keyword Density
The existing analysis engine relies heavily on “keyword density,” a metric that measures the frequency of a keyword relative to the total word count. While once a cornerstone of SEO, this metric is now dangerously outdated and can lead to poor optimization strategies. Its primary flaw is its lack of context: it treats all words equally and fails to distinguish between a term that is genuinely important to a topic and one that is simply common across the language as a whole.17
This encourages “keyword stuffing,” the practice of unnaturally repeating keywords to inflate density, which modern search engines like Google can easily detect and penalize.18 For example, a page about “apple pie recipes” that mentions the word “apple” 20 times has the same keyword density for “apple” as an equally long page about “Apple computers” that does the same. The metric alone provides no signal as to which page is more relevant for a given query, rendering it ineffective for sophisticated content analysis. To evolve, the analysis engine must move from this simplistic counting method to a model that can mathematically evaluate a term’s true relevance.
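The flaw is easy to demonstrate. The sketch below computes density exactly as defined (keyword count divided by total words, times 100); the two page texts are invented for illustration.

```javascript
// Keyword density: (keyword count / total words) * 100.
function keywordDensity(keyword, text) {
  const words = text.toLowerCase().split(/\s+/).filter(Boolean);
  const hits = words.filter((w) => w === keyword.toLowerCase()).length;
  return (hits / words.length) * 100;
}

// Two unrelated pages score identically for "apple":
const pieRecipe = 'slice apple wedges layer apple filling bake apple pie';
const laptopPage = 'apple makes laptops apple designs phones apple sells tablets';
keywordDensity('apple', pieRecipe);  // ~33.3% on both pages
keywordDensity('apple', laptopPage); // the metric cannot tell them apart
```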
2.2 Implementing TF-IDF for Meaningful Term Weighting
The industry-standard replacement for keyword density is Term Frequency-Inverse Document Frequency (TF-IDF). This is a statistical measure from the field of information retrieval used to evaluate how important a word is to a document within a collection of documents, or “corpus”.20 TF-IDF succeeds where keyword density fails because it evaluates a term’s importance on two axes: its frequency within a single document and its rarity across all other documents.
The calculation is a two-part formula 18:
- Term Frequency (TF): This is similar to keyword density and measures how often a term (t) appears in a specific document (d).
$$ TF(t, d) = \frac{\text{Number of times term } t \text{ appears in document } d}{\text{Total number of terms in document } d} $$
- Inverse Document Frequency (IDF): This is the transformative component. It measures the rarity of a term across the entire corpus (D). Common words that appear in many documents (like “the” or “is”) receive a very low IDF score, while rare, specific terms receive a high score. The logarithm is used to dampen the effect of the scale.
$$ IDF(t, D) = \log\left(\frac{\text{Total number of documents in corpus } D}{\text{Number of documents containing term } t}\right) $$
The final TF-IDF score is the product of these two values:
$$ \text{TF-IDF}(t, d, D) = TF(t, d) \times IDF(t, D) $$
A term achieves a high TF-IDF score if it is frequent in a specific document but rare across the overall corpus, which is the mathematical definition of a strong, descriptive keyword.
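For intuition, the two formulas can be sketched without any library (token handling is simplified; a real implementation would also normalize and stem terms):

```javascript
// TF: relative frequency of a term within one document's tokens.
function tf(term, docTokens) {
  return docTokens.filter((t) => t === term).length / docTokens.length;
}

// IDF: log of (corpus size / documents containing the term).
// Assumes the term occurs in at least one document, so the divisor is non-zero.
function idf(term, corpus) {
  const docsWithTerm = corpus.filter((doc) => doc.includes(term)).length;
  return Math.log(corpus.length / docsWithTerm);
}

function tfIdf(term, docTokens, corpus) {
  return tf(term, docTokens) * idf(term, corpus);
}

// A term frequent in one document but rare elsewhere outranks a
// corpus-wide common term, even at equal within-document frequency.
const corpus = [
  ['node', 'js', 'examples'],
  ['node', 'js', 'ruby'],
  ['ruby', 'rails'],
];
tfIdf('examples', corpus[0], corpus); // higher than tfIdf('node', ...)
```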
A critical architectural consideration arises here: to calculate IDF, the analysis tool requires a corpus. For SEO purposes, analyzing a single URL in a vacuum is not useful. The most relevant and powerful corpus is the content of the top-ranking competitor pages for the target keyword. Therefore, the tool’s workflow must be updated: before analyzing the user’s page, it must first crawl and extract the text from the top 10-20 search results for the keyword in question. This collection of competitor content forms the corpus against which the user’s page is compared, providing immensely more actionable insights.
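One way to express that workflow as code. The crawler itself is out of scope here; `fetchPageText` is a hypothetical helper standing in for a real fetch-and-extract step (HTTP request plus HTML-to-text conversion):

```javascript
// Build the IDF corpus from the top-ranking competitor URLs before the
// user's page is analyzed. fetchPageText is a placeholder for a real
// crawler, injected so the corpus-building logic stays testable.
async function buildCompetitorCorpus(competitorUrls, fetchPageText) {
  const corpus = [];
  for (const url of competitorUrls) {
    corpus.push(await fetchPageText(url)); // one document per competitor page
  }
  return corpus;
}
```

The user's page is then appended as the final document, so every TF-IDF score is computed relative to the competitor content.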
The natural library in JavaScript provides a straightforward module for implementing TF-IDF.22
```javascript
const natural = require('natural');
const TfIdf = natural.TfIdf;

const tfidf = new TfIdf();

// 1. Build the corpus from competitor pages (scraped text)
tfidf.addDocument('Competitor A text about node js and ruby.');
tfidf.addDocument('Competitor B text about ruby on rails.');
tfidf.addDocument('Competitor C text about python and node js.');

// 2. Analyze the user's page against this corpus
const userPageText = "The user's page is about node js examples.";
tfidf.addDocument(userPageText);

// 3. Calculate TF-IDF scores for terms on the user's page
console.log('--- TF-IDF Scores for User Page ---');
tfidf.listTerms(3 /* index of the user's document */).forEach(function (item) {
  console.log(`${item.term}: ${item.tfidf}`);
});

// Example output might show 'examples' scoring highly because it is
// prominent on the user's page but absent from the rest of the corpus.
```
This approach requires re-architecting the websiteData.keywordAnalysis.topKeywords object to store keywords along with their TF-IDF scores, enabling a more intelligent ranking of term importance.
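One possible shape for the re-architected object. The field names below are illustrative, not a prescribed schema:

```javascript
// topKeywords as an array ranked by TF-IDF rather than by raw count.
const keywordAnalysis = {
  topKeywords: [
    { term: 'node', tfidf: 0.14, count: 2 },
    { term: 'examples', tfidf: 0.37, count: 1 },
  ].sort((a, b) => b.tfidf - a.tfidf), // descending by importance
};
```

Sorting by `tfidf` rather than `count` is the key change: a term that appears once but is rare in the corpus can outrank a term that appears often everywhere.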
2.3 Beyond Single Words: N-gram Analysis for Phrase-Based Relevance
Modern search queries and topics are often expressed as phrases, not single words. A user searches for “best laptop for programming,” not just “laptop.” An analysis that only considers unigrams (single words) will miss the crucial context embedded in these multi-word phrases.24
N-gram analysis solves this by examining contiguous sequences of ‘n’ items from a text.
The most common n-grams in SEO analysis are 26:
- Unigrams (1-grams): Single words (“seo”, “content”).
- Bigrams (2-grams): Two-word phrases (“content marketing”, “link building”).
- Trigrams (3-grams): Three-word phrases (“best seo tools”, “on-page optimization”).
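Generating these requires no special machinery; a sliding window over the token list is enough. A minimal sketch:

```javascript
// Generate all contiguous n-token phrases from a text.
function ngrams(text, n) {
  const tokens = text.toLowerCase().split(/\s+/).filter(Boolean);
  const phrases = [];
  for (let i = 0; i + n <= tokens.length; i++) {
    phrases.push(tokens.slice(i, i + n).join(' '));
  }
  return phrases;
}

ngrams('best seo tools', 2); // ['best seo', 'seo tools']
ngrams('best seo tools', 3); // ['best seo tools']
```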
By generating and analyzing n-grams from the competitor corpus, the tool can identify the key phrases and sub-topics that Google has already deemed relevant for a given query. For example, an n-gram analysis of top-ranking articles for “how to build a deck” might reveal high-frequency bigrams like “pressure-treated lumber” and “building permit.” These are not just keywords; they are core concepts that signal comprehensive, authoritative content.
Comparing the high-scoring n-grams from the competitor corpus against the n-grams present on the user’s page allows for automated content gap analysis. The recommendation engine can then deliver highly specific advice, such as: “Your content is missing the key phrase ‘joist spacing,’ which appears in 8 of the top 10 competing articles.”
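That comparison reduces to a set difference over the two phrase collections. A sketch, assuming the competitor pages have already been reduced to per-page phrase lists:

```javascript
// Phrases that appear in at least minDocs competitor pages but are
// missing from the user's page.
function contentGaps(competitorPhraseLists, userPhrases, minDocs) {
  const docCounts = new Map();
  for (const phrases of competitorPhraseLists) {
    for (const phrase of new Set(phrases)) { // count each page at most once
      docCounts.set(phrase, (docCounts.get(phrase) || 0) + 1);
    }
  }
  const userSet = new Set(userPhrases);
  return [...docCounts.entries()]
    .filter(([phrase, docs]) => docs >= minDocs && !userSet.has(phrase))
    .map(([phrase, docs]) => ({ phrase, docs }));
}
```

The `docs` count feeds directly into recommendation copy of the form “appears in N of the top M competing articles.”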
N-grams can be generated with simple JavaScript functions or libraries like ngram-js.27 Once generated, these phrases can be treated as single tokens and scored using the same TF-IDF methodology, allowing the tool to identify not just important words, but important phrases as well.
The following table contrasts the outdated metric with the proposed modern approaches.
| Metric | How It’s Calculated | Pros | Cons | Recommendation for Your Tool |
| --- | --- | --- | --- | --- |
| Keyword Density | (Keyword Count / Total Words) * 100 | Simple to calculate. | Easily manipulated (keyword stuffing); ignores context and term rarity; poor indicator of relevance. | Deprecate. Replace with TF-IDF and N-gram analysis for all scoring and recommendations. |
| TF-IDF Score | Term Frequency * Inverse Document Frequency | Measures term importance relative to a competitor corpus; rewards topical specificity; resistant to keyword stuffing. | Requires a relevant corpus (competitor pages); computationally more complex than density. | Implement as the primary score for unigram (single-word) importance. |
| N-gram Phrase Score | Treat a phrase (e.g., “link building”) as a single token and calculate its frequency or TF-IDF score. | Identifies important multi-word concepts; reveals user intent and sub-topics; excellent for content gap analysis. | Can generate a large number of phrases, requiring filtering and ranking. | Implement to analyze bigrams and trigrams. Use TF-IDF to score them and identify high-value phrases and content gaps. |