MovableType Plugin: MTKeywords

MTKeywords is a Movable Type plugin that compiles a list of fairly relevant keywords from the aggregate of the body of an entry, its title, and its comments. While the default purpose of the entry’s Keywords field is to populate the page’s Keywords meta tag, I found that extra data entry field to be most advantageously used for other purposes. Plus, there hasn’t been a way to auto-generate a list of relevant keywords. Thus, MTKeywords was born!

If you’re curious how the plugin works, it simply gathers the text from the three basic sources, removes any HTML, filters out more than 100 short common words, calculates the unigram and bigram counts for every word and word pair, and reports back the most common n-grams. Both the input and the results are case-sensitive when gathering unigrams. Obviously, this version is designed for English-language blogs, but if you modify the code and substitute 100 or so of the most common words in another language (including cardinal numbers, prepositions, articles, contractions, and to-be verb conjugations), it should still work pretty well. I doubt it will work for non-Western languages without major modifications, though. Sorry!
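
For the curious, the steps above can be roughly sketched in a few lines. This is not the plugin’s code (the plugin is written in Perl); it is a minimal Python illustration under several assumptions: the stop list is a tiny hypothetical stand-in for the real 100+ word list, the `max_bigrams` and `max_keywords` limits are placeholder values, word pairs are lowercased and only counted as “common” when seen at least twice, and pairs are formed naively across sentence boundaries.

```python
import re
from collections import Counter

# Hypothetical, heavily abbreviated stop list; the plugin filters 100+ words.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "is", "was", "to", "it",
              "on", "at", "for", "by", "with", "i", "my", "this", "that"}


def extract_keywords(text, max_bigrams=5, max_keywords=25):
    """Return the most common word pairs, followed by the most common words."""
    plain = re.sub(r"<[^>]+>", " ", text)            # crude HTML removal
    words = re.findall(r"[A-Za-z][A-Za-z']*", plain)
    kept = [w for w in words if w.lower() not in STOP_WORDS]

    unigrams = Counter(kept)                          # case-sensitive counts
    lowered = [w.lower() for w in kept]
    bigrams = Counter(" ".join(pair) for pair in zip(lowered, lowered[1:]))

    # Assumption: a word pair must appear at least twice to be reported.
    result = [pair for pair, n in bigrams.most_common(max_bigrams) if n > 1]
    for word, _count in unigrams.most_common():
        if len(result) >= max_keywords:
            break
        result.append(word)
    return result


sample = "<p>King Arthur met the Yankee. King Arthur liked the Yankee.</p>"
print(", ".join(extract_keywords(sample)))
# prints: king arthur, King, Arthur, Yankee, met, liked
```

Because the unigram counts are case-sensitive, ‘King’ and ‘king’ would be tallied separately here, which is the same behavior visible in the sample results further down.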

Impact on rebuilding seems negligible; I rebuilt the individual archives of this entire blog (consisting at the time of publication of just over 400 entries) in 46 seconds — the same elapsed time whether I used the MTKeywords plugin or not.

To see an example of the plugin in action, my brief synopsis of A Connecticut Yankee in King Arthur’s Court yielded keywords of “king arthur, connecticut yankee, yankee, Connecticut, Yankee, long, connecticut, King, Arthur, only, myself, court, king, fun, classic, Twain, century, england, complete, twain, disbelief, gun, England, took, arthur”. Not perfect, but fairly accurate.

And my history of Wake Island produced “wake island, pan american, peale island, history wake, united states, island, wake, Wake, Island, japanese, Japanese, american, history, American, great, pan, Pan, years, water, war, prisoners, atoll, civilian, been, islands”. The longer the blog entry is (and the more on-topic the comments are), the more representative the results will be.

By the way, I fully recognize that few search engines use the Keywords meta tag anymore, but hope springs eternal for a comeback once the algorithms for weeding out keyword spammers are improved.

Download
You can get the latest version of MTKeywords by downloading Keywords.txt.

Installation
Save the file as Keywords.pl in your Movable Type plugins directory and set its permissions to 755.

Usage
Add the <$MTKeywords$> tag into the HEAD section of your Individual Archive Template.
<meta name="keywords" content="<$MTKeywords$>" />

<$MTKeywords delimiter="|"$>: specifies your own delimiter.
<$MTKeywords caseSensitive="false"$>: consolidates ‘Richard’ and ‘richard’ into just ‘richard’.
<$MTKeywords includeBigrams="false"$>: omits common word pairs from the output.
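
To make the effect of the three parameters concrete, here is a small Python sketch. It is only an illustration, not the plugin’s Perl code: the function name `format_keywords` and its arguments are hypothetical, and it assumes the words have already been filtered and the word pairs already ranked.

```python
from collections import Counter


def format_keywords(words, bigrams, delimiter=", ",
                    case_sensitive=True, include_bigrams=True):
    """Rank words by frequency and join them, mimicking the tag parameters."""
    if not case_sensitive:
        # Consolidate 'Richard' and 'richard' into a single 'richard' entry.
        words = [w.lower() for w in words]
    ranked = [word for word, _count in Counter(words).most_common()]
    parts = (list(bigrams) if include_bigrams else []) + ranked
    return delimiter.join(parts)


words = ["Richard", "richard", "blog"]
pairs = ["richard blog"]
print(format_keywords(words, pairs))
# prints: richard blog, Richard, richard, blog
print(format_keywords(words, pairs, delimiter="|",
                      case_sensitive=False, include_bigrams=False))
# prints: richard|blog
```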

Version History
1.01: 04/08/2006; add parameters for delimiter, case sensitivity, and bigram usage
0.99b: 01/10/2005; fixed problem with Perl v5.8x
0.99a: 11/08/2004; included more basic file and web extensions to exclude
0.99: 10/19/2004; initial release

Responses

44 Responses to “MovableType Plugin: MTKeywords”

  1. Response #21
    Ravensky’s Blog on March 23rd, 2005 at 9:31 pm

    Getting things done

    Alrighty, so now I have a nice template, made by Neil Turner. He’s made some very nice themes and I even used to use this once a long time ago when I was still new with Movable Type. I’ve also…

  2. Response #22
    bopuc/weblog on April 7th, 2005 at 3:11 am

    Yahoo! Term extraction for MT

    So Jonas has gotten this working for WordPress, but I have some ideas of how to use it, somewhat differently, with Movable Type. I couldn’t code Perl (or anything else really) to save my life so here is just the…

  3. Response #23
    Dan Wolfgang on April 18th, 2005 at 10:04 am

    This is a cool tag, but it doesn’t do exactly what I want. I like using the Keywords field for tags. As I “tag” more and more entries, I think more and more about the backlog of untagged entries I have. With that in mind, it’d be cool if this could somehow be used to populate any existing empty Keywords field to sort of play catch-up. Any ideas on the possibility of that?

  4. Response #24
    Chris Short on May 21st, 2005 at 6:50 pm

    The question is, can I change the separator and the number of keywords presented?

  5. Response #25
    richard on May 21st, 2005 at 8:13 pm

    The last two code sections gather the results into the $result variable. The first section includes up to the first 4 common bigrams (determined by $keywordcount < 5) and the second section pads the results to the 24th instance of a keyword (determined by $keywordcount < 25). Change the desired keywordcounts to be whatever you wish. Both sections stitch the results together with $result .= ", ". You can change the comma/space into whatever separator you desire. - RDL

  6. Response #26
    Chris Short on May 22nd, 2005 at 9:10 am

    Thanks. Next question, is there a way to keep the same keyword from repeating more than three times?

  7. Response #27
    richard on May 22nd, 2005 at 11:52 am

    No, not at this time. The plugin is intended to be case sensitive. - RDL

  8. Response #28
    Chris Short on May 22nd, 2005 at 12:41 pm

    Also, my Main Index page isn’t generating keywords.

  9. Response #29
    richard on May 23rd, 2005 at 10:29 am

    As designed. The MTKeywords plugin is for use within individual archive templates. - RDL

  10. Response #30
    Chris Short on May 23rd, 2005 at 2:29 pm

    Um… my question was about keyword repetition. Not case sensitivity.