Using Word Clouds (Wordles) for Competitive Intelligence

[An updated version of this post can be found here]

It’s been a while since I updated this blog on my efforts to build my on-line competitive intelligence service (codenamed ScrewTinny). As I’m now not too far away from releasing an alpha version, I thought I’d start to share some of the output. At the same time I’m starting to see some competitive intelligence value in the humble wordle.

ScrewTinny is a competitive intelligence tool that I’m building to examine and visualise high-tech vendors’ marketing copy in an attempt to expose the strategies within. The principle is simple. As we market any product or service, the words we use to cunningly ensnare our prospects also betray the strategy we are following to beat our competition. If I describe something as “the fastest” (superlative) or “faster than” (comparative), then performance is very likely a key feature, and one that I see as a differentiator.
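To make the principle concrete, here is a minimal sketch in Python of how superlative and comparative cues might be spotted in a piece of copy. This is not ScrewTinny's code: the function name `find_claims` and both patterns are assumptions made up for illustration, and the patterns are deliberately crude (they would wrongly flag phrases such as "the request" or "other than"). A real tool would more likely rely on a part-of-speech tagger.

```python
import re

# Hypothetical surface patterns, not taken from ScrewTinny.
# Superlative: "the most <word>" or "the <word ending in -est>".
SUPERLATIVE = re.compile(r"\bthe\s+(?:most\s+\w+|\w{3,}est)\b", re.IGNORECASE)
# Comparative: "more <word> than" or "<word ending in -er> than".
COMPARATIVE = re.compile(r"\b(?:more\s+\w+|\w{3,}er)\s+than\b", re.IGNORECASE)


def find_claims(text):
    """Return the superlative and comparative phrases found in text."""
    return {
        "superlative": [m.group(0).lower() for m in SUPERLATIVE.finditer(text)],
        "comparative": [m.group(0).lower() for m in COMPARATIVE.finditer(text)],
    }


# Invented sample sentence, purely for illustration.
sample = "Our ESB is the fastest on the market and faster than any rival."
claims = find_claims(sample)
print(claims)
```

Counting how often such phrases appear per vendor would give a rough signal of which features each vendor is trying to differentiate on.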

I’ve blogged before about MITICOR and PIPESCOM. These are REPAMA studies that I’ve automated to infer a vendor’s value proposition approach and product feature focus respectively. A by-product of this analysis is that I’m also collecting a lot of textual statistics from vendors’ marketing copy. So I’ve decided that it would be interesting to allow users of my service to visualise these text statistics as Word Clouds (Wordles). I’m also starting to think that there may be some competitive intelligence value in examining and comparing wordles from different vendors.

Wordles are not the most precise way to show the frequency of words in a text, but if precision isn’t important then they do provide an excellent at-a-glance guide to the words that are repeated most frequently. If the text we use to produce the Wordle is the marketing copy that a vendor uses to communicate with the market, then what we are seeing is the priority (subliminal or conscious) that a vendor attaches to the terms it uses to communicate with the market.
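The counting that sits behind a word cloud can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the way ScrewTinny works: the tokenising rule, the tiny `STOPWORDS` set and the `top_tokens` function are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical stop list; a real one would be far longer.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "for", "with", "our"}


def top_tokens(text, n=10):
    """Return the n most frequent tokens in text as (token, count) pairs."""
    # Lower-case the text and keep hyphenated terms such as "drag-and-drop" whole.
    tokens = re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower())
    counts = Counter(t for t in tokens if t not in STOPWORDS)
    return counts.most_common(n)


# Invented sample copy, purely for illustration.
sample = "Mobile integration for the mobile enterprise. Drag-and-drop integration."
print(top_tokens(sample, n=2))
```

A word cloud renderer would then scale each token's font size by its count, which is why the picture gives an at-a-glance impression rather than a precise reading.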

As regular readers of this blog will know, while I’m building and testing ScrewTinny I’m focussing on the Enterprise Service Bus (ESB) market before I add further market categories. The ESB market is simply one I know very well, and as such it makes testing much easier. The vendors that I’m looking at initially in the ESB market segment are the following: Axway, Talend, Mule, Progress, TIBCO, Seeburger, Intersystems, Tmaxsoft, iWay, WS02, Fiorano, JBoss and FuseSource. I recognise that some of these products / vendors have merged, been acquired or divested since I set up my test data, but they are still shown here separately.

I pointed ScrewTinny at a bunch of WS02 content (web site pages and PDFs), which it analysed to produce a Wordle of the 420 most frequently occurring tokens (you might think of these as words) in that content.

The result is shown below.

 WS02 ESB Marketing Copy Wordle


One term that jumps out for me is mobile. To my cynical eye it appears to be disproportionately prominent for a product in the ESB market category. If I were a product marketing manager for a competing ESB product, this would make me ask a few questions. I also see SAP given quite a high priority. Again, I would not be surprised to see SAP in the marketing copy for an ESB, but perhaps not quite so prominently unless this represented a key strategy for the vendor.

So let’s have a look at the marketing copy wordle for Talend.

Talend ESB Marketing Copy Wordle


Looking at the Talend wordle I see neither SAP nor mobile appearing prominently. Most of the terms that look prominent to my eye relate to the development process: drag-and-drop, graphical, components, studio, etc. This would suggest that Talend believes it has a strength, and perhaps a differentiator, here. One term that does jump out is democratizing. This is such an uncommon word that I wouldn’t expect it to be repeated in a marketing text unless it is a key marketing term, and one that I would expect to be used as a differentiating philosophy.

Scanning manually between the two wordles suggests that tokens such as esb, enterprise, integration and service all appear frequently in both vendors’ marketing copy. No real surprise there. But I’ve designed ScrewTinny to be able to combine and compare the results of the different studies from vendor to vendor, or from vendor to product segment average. So in the diagram below we can see the most common tokens when combining the marketing copy from both WS02 and Talend.

Combined WS02 and Talend ESB Marketing Copy Wordle
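Combining two vendors' token counts, and finding the tokens they share, can be sketched as follows. This is an illustration only: the functions `combine_counts` and `shared_tokens` are hypothetical names, and the counts are invented rather than taken from the WS02 or Talend copy.

```python
from collections import Counter


def combine_counts(*vendor_counts):
    """Add together the raw token counts from any number of vendors."""
    total = Counter()
    for counts in vendor_counts:
        total.update(counts)  # Counter.update adds counts rather than replacing them
    return total


def shared_tokens(a, b):
    """Return the set of tokens that appear in both vendors' copy."""
    return set(a) & set(b)


# Invented counts, purely for illustration.
vendor_one = Counter({"esb": 9, "mobile": 7, "sap": 4})
vendor_two = Counter({"esb": 8, "studio": 6, "graphical": 3})

combined = combine_counts(vendor_one, vendor_two)
print(combined.most_common(3))
print(shared_tokens(vendor_one, vendor_two))
```

Note that this simple version adds raw counts, so a vendor with more text contributes more weight to the combined picture.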


And to take this to its natural conclusion, here I’ve asked ScrewTinny to look at all marketing content in the entire ESB product category (13 vendors). The result is shown below.

ESB Segment (13 vendors) Marketing Copy Wordle

I’ve modelled each of the individual vendors in the ESB product category and their wordles are shown in the gallery below.

Whilst this might look simply like an interesting and aesthetically pleasing exercise, I can see real value in looking for unusual terms that stand out, just as mobile, SAP and democratizing do above. But the real value starts to emerge when, instead of looking at all words in the marketing copy, you examine the frequency of the different parts of speech that a vendor or group of vendors uses to describe its products. For example, which adjectives are most commonly used by vendors, including comparatives and superlatives? Which verbs feature most prominently? I’ll follow up soon with another post that looks at the parts of speech.

*Caveat lector. The software used to produce the Wordle images above is in alpha at the moment. As such, erroneous tokens and words are currently not being filtered out. In addition, when the marketing copy of multiple vendors is being compared, the frequency of tokens is not yet normalised. This means that if twice as much text was ‘read’ for vendor a than for vendor b, the wordle would be skewed in favour of vendor a. This will be addressed in a future version.
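One common way to remove the skew described in the caveat is to convert each vendor's counts to relative frequencies before combining them, so that every vendor carries the same total weight however much text was read. The sketch below shows that idea only; it is an assumption about how normalisation could be done, not a description of what a future version of ScrewTinny will do, and the function names and counts are invented.

```python
from collections import Counter


def relative_frequencies(counts):
    """Convert raw token counts to shares of that vendor's total token count."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {token: n / total for token, n in counts.items()}


def combine_normalised(*vendor_counts):
    """Sum per-vendor relative frequencies so each vendor has equal weight."""
    combined = Counter()
    for counts in vendor_counts:
        for token, share in relative_frequencies(counts).items():
            combined[token] += share
    return combined


# Invented counts: four times as much text was 'read' for vendor a as for vendor b.
vendor_a = Counter({"mobile": 20, "esb": 20})
vendor_b = Counter({"esb": 5, "studio": 5})

print(combine_normalised(vendor_a, vendor_b).most_common())
```

With raw counts, mobile (20) would dwarf studio (5); after normalisation the two carry the same weight, because each accounts for half of its own vendor's copy.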

Danny Goodall

Posted in Competition and Competitive Intelligence, Marketing Strategy, natural language processing, REPAMATron, REPAMATron ESB Study.