Movable Type Plugin: MTKeywords
MTKeywords is a Movable Type plugin that compiles a list of fairly relevant keywords from the aggregate of the body of an entry, its title, and its comments. While the default purpose of the entry’s Keywords field is to populate the page’s Keywords meta tag, I found that extra data entry field to be most advantageously used for other purposes. Plus, there hasn’t been a way to auto-generate a list of relevant keywords. Thus, MTKeywords was born!
If you’re curious how the plugin works, it simply gathers the text from the three basic sources, removes any HTML, filters out more than 100 short common words, calculates the unigram and bigram counts for every word and word pair, and reports back the most common n-grams. Unigrams are counted case-sensitively, so ‘Wake’ and ‘wake’ are tallied as separate keywords. Obviously, this version is designed for English-language blogs, but if you modify the code and substitute 100 or so of the most common words in your own language (including cardinal numbers, prepositions, articles, contractions, and to-be verb conjugations) it should still work pretty well. I doubt it will work for non-Western languages without major modifications, though. Sorry!
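For readers who think better in code, here is a minimal Python sketch of the steps described above. It is not the plugin's Perl source: the function names, the tiny stop list, the regex-based HTML stripping, and the ordering (repeated word pairs first, then single words) are all assumptions made for illustration, loosely guided by the sample output further down the page.

```python
import re
from collections import Counter
from html import unescape

# Hypothetical, heavily abbreviated stop list; the real plugin filters 100+ words.
STOP_WORDS = {"a", "an", "and", "are", "as", "at", "be", "in", "is", "it",
              "of", "on", "or", "the", "to", "was", "with"}

def strip_html(text):
    """Replace tags with spaces and decode entities (crude, but enough for a sketch)."""
    return unescape(re.sub(r"<[^>]+>", " ", text))

def extract_keywords(title, body, comments, limit=25):
    """Return up to `limit` keywords from an entry's title, body, and comments."""
    text = strip_html(" ".join([title, body, *comments]))
    tokens = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    words = [w for w in tokens if w.lower() not in STOP_WORDS]

    unigrams = Counter(words)  # case-sensitive: 'Wake' and 'wake' stay separate
    # Assumption: pairs are formed after stop-word removal and lowercased.
    bigrams = Counter(f"{a} {b}".lower() for a, b in zip(words, words[1:]))

    repeated_pairs = [pair for pair, n in bigrams.most_common() if n > 1]
    singles = [word for word, _ in unigrams.most_common()]
    return (repeated_pairs + singles)[:limit]

keywords = extract_keywords(
    "Wake Island",
    "<p>The history of Wake Island begins with Wake Island itself.</p>",
    ["Wake Island is an atoll."],
    limit=3,
)
print(keywords)  # ['wake island', 'Wake', 'Island']
```

Forming pairs after the stop words are removed is a guess, but it would explain how a pair like “history wake” can surface from a phrase such as “history of Wake”.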
Impact on rebuilding seems negligible; I rebuilt the individual archives of this entire blog (consisting at the time of publication of just over 400 entries) in 46 seconds — the same elapsed time whether I used the MTKeywords plugin or not.
To see an example of the plugin in action, my brief synopsis of A Connecticut Yankee in King Arthur’s Court yielded keywords of “king arthur, connecticut yankee, yankee, Connecticut, Yankee, long, connecticut, King, Arthur, only, myself, court, king, fun, classic, Twain, century, england, complete, twain, disbelief, gun, England, took, arthur”. Not perfect, but fairly accurate.
And my history of Wake Island produced “wake island, pan american, peale island, history wake, united states, island, wake, Wake, Island, japanese, Japanese, american, history, American, great, pan, Pan, years, water, war, prisoners, atoll, civilian, been, islands”. The longer the blog entry is (and the more on-topic the comments are), the more representative the results will be.
By the way, I fully recognize that few search engines use the Keywords meta tag anymore, but hope springs eternal for a comeback once the algorithms to reduce keyword spammers are improved.
Download
You can get the latest version of MTKeywords by downloading Keywords.txt.
Installation
Save the file as Keywords.pl in your Movable Type plugins directory and set its permissions to 755.
Usage
Add the <$MTKeywords$> tag to the HEAD section of your Individual Archive Template, inside a keywords meta tag:
<meta name="keywords" content="<$MTKeywords$>" />
<$MTKeywords delimiter="|"$>: separates the keywords with a delimiter of your choice.
<$MTKeywords caseSensitive="false"$>: consolidates ‘Richard’ and ‘richard’ into just ‘richard’.
<$MTKeywords includeBigrams="false"$>: leaves common word pairs (bigrams) out of the list.
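To make the effect of the three attributes concrete, here is a small Python sketch of how they might act on a table of keyword counts. This is an illustration only, not the plugin's code: the function name `render_keywords`, its Python-style parameter names, and the defaults shown (comma-and-space delimiter, case-sensitive, bigrams included) are assumptions inferred from the sample output on this page.

```python
from collections import Counter

def render_keywords(counts, delimiter=", ", case_sensitive=True, include_bigrams=True):
    """Join keywords in descending order of count, honoring the three options."""
    merged = Counter()
    for term, n in counts.items():
        if not include_bigrams and " " in term:
            continue  # a term containing a space is treated as a word pair
        # With case sensitivity off, 'Richard' and 'richard' share one tally.
        merged[term if case_sensitive else term.lower()] += n
    return delimiter.join(term for term, _ in merged.most_common())

counts = Counter({"Richard": 3, "richard": 2, "king arthur": 4, "court": 1})
print(render_keywords(counts))
# king arthur, Richard, richard, court
print(render_keywords(counts, delimiter="|", case_sensitive=False, include_bigrams=False))
# richard|court
```

Merging the counts before ranking, rather than deduplicating the finished list, means a name that appears in both capitalizations can rise in the ranking once its tallies are combined.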
Version History
1.01: 04/08/2006; add parameters for delimiter, case sensitivity, and bigram usage
0.99b: 01/10/2005; fixed problem with Perl v5.8x
0.99a: 11/08/2004; included more basic file and web extensions to exclude
0.99: 10/19/2004; initial release
Slightly better form for the meta tag would have a / before the closing > like the rest of the template has. Also, the meta tag should go in the “HEAD” section…
Well, you caught an OOPS! and mentioned one of those very obvious instructions that it didn’t even occur to me to point out to the usually clueless masses. And here I am, a veritable preacher for “well-formed-ness”! Points taken. Text updated. Thanks! - RDL
nice, does it work on 3.12?
I know of no reason why it shouldn’t work on 3.X, but it has not been tested. - RDL
Is there any reason why examining the source code of *some* of my MT pages, after installing Keywords.pl, would show this, an empty set of comments: content="" (from http://thedavidlawrenceshow.com/002336.html, a page filled with words about the newscaster in Cleveland that got naked for a sweeps-month stunt) and other pages *would* have the results of your plugin? Does it take time to spread through a large blog? All pages have plenty of text to examine. Thanks for this plug-in!
[We narrowed the "problem" down to the fact that on his errant entries he was only including real-time feeds from other sources using a JavaScript RSS reader, and not writing his own body of text. MTKeywords cannot index keywords from text from sources other than MT. - RDL]
Useless use of a variable in void context at /www/sites/mt.lockergnome.com/mt/plugins/Keywords.pl line 33. What’s that?
I am getting the following when I run it: Use of uninitialized value in concatenation (.) or string at plugins/Keywords.pl line 38. It repeats three times. It still generates the keywords, but I’m wondering if the error can be fixed. FYI I’m using MT 3.12 or whatever the current release is.
I just upgraded to MT3.14 and have run the upgrade script and it pointed out the following: **** WARNING: Parentheses missing around “my” list at plugins/Keywords.pl line 33. **** WARNING: Useless use of a variable in void context at plugins/Keywords.pl line 33. Thought you would want to know!
Nice plugin - thanks - but I’m receiving the same “Useless use” error as others. I have MT3.14 and Perl 5.8.0. I hope this helps.
The errors above should be corrected with the new version 0.99b. - RDL