How To Mine Data In Web Applications

Gathering business intelligence from e-commerce sites

By Matt Cutler

12:23 PM EDT Mon. Oct. 25, 1999
From the October 25, 1999 issue of VARBusiness
When you have completed this seminar, which is adapted from a presentation made at the Xplor Marketspace '99 conference, sponsored by Xplor International, you will know:
* The issues in mining business intelligence data for e-business clients.
* How cookies and user profiles help build customer profiles clients can use for analysis.
* How business intelligence reports can feed clients' understanding and demand for Web site customization and changes.

_______________

Copyright 1999 Matt Cutler. All rights reserved.

Today, managers coping with the new e-business model face a very different infrastructure than they did with a traditional client-server decision support system. The Internet has changed how rapidly they need information about their online customers, how rapidly they can implement changes and how rapidly they need to react to competitive threats.

Using software to make better business decisions is inherently different in the online realm. The systems are more dynamic than they were in a client-server environment. Today you have a large number of what we call federated databases, which are loosely linked, and they come with very high data volumes and immediate cycle times. An e-commerce client needs to know what's happening as it's happening and needs to be able to react to it very quickly.

Then And Now
For a long time, most people asked questions like "how many hits per day did I get? Tell me about bytes transferred." That really didn't tell them very much. Lately, we've begun to see more e-businesses asking sophisticated questions like "who are the best customers? How effective are our advertising and marketing campaigns? Who are the best partners? How much is our site design either encouraging or discouraging use of the information?" While these may be new in the online world, they are really the same questions business managers have been asking of their off-line business all along. Online, the answer depends on their e-business model. If they have an advertising supported site, for instance, they're going to be concerned about page views. If it's a content site they're going to be concerned about page views and retention as well. On a commerce site they'll be concerned about the number of transactions. In an intranet environment, they're going to be more interested in understanding how internal users are finding different types of information and how they are communicating and collaborating.

What issues do you have to overcome before you can gather the level of data you need to begin mining for an e-business client? Ninety percent of the battle is effective data collection and data assembly. After that comes understanding what is there so that you can ask sophisticated questions.

Data Collection Challenges

Back in 1994, '95 and '96, you had a browser that talked to a Web server that was serving HTML files out of a single directory. As we all know, Web applications are much more complex today. Many different specialized software systems have to interact and interrelate to create a seamless end-user experience. The modern infrastructure has a lot of complexity associated with it, and the idea is to collect data from each one of those components so you can understand the actual user experience and figure out what worked and what didn't.

But this is just one type of complexity and I think it's the easiest complexity to get your mind around. There are a lot of other challenges inherent here. A different and more typical type of complexity is your client's business complexity--both site and organizational.

For instance, Bell Atlantic Corp. is a customer of ours. Bell Atlantic, as an organization, is the largest RBOC. They're No. 27 on the Fortune 500. They're a regulated business divided into roughly 30 different business units, all of which have separate decision-makers, content areas and P&L statements, but which ultimately have to be part of Bell Atlantic as an organization. The Web site had to reflect this, and today bellatlantic.com actually has content from their 30 different business unit areas, all served out of the same content base. But someone different within the organization owns each one of those content areas. They have different goals and they use different technology to produce the content.

So the challenge there was primarily one of scale. A lot of different people come into the Bell Atlantic home page looking for a lot of different things. And there are lots of different users inside Bell Atlantic who need access to information that is relevant to their area of content. Not only do you have scaling issues here, you also have the issue of getting the right information to the right person internally. That was a challenging project but one which actually ended up being very satisfying for us to do. And it is the sort of thing we're seeing more of today.

Crisis Or Opportunity?

We really feel modern e-businesses are facing a crisis right now because they're making big decisions, multi-million dollar decisions that involve multi-year efforts with incomplete information about their online customers. That's really the problem they need to solve.

A quote attributed to John F. Kennedy says: "The Chinese word for crisis is composed of two characters--one character means danger; one character means opportunity." And that's exactly what your e-business clients face here, because all this complexity and all this challenge is exactly the same complexity and challenge faced by their competition. So if you have the ability and wherewithal to work through that complexity and overcome it, you'll help them gain a competitive advantage. They'll get a much deeper and more holistic understanding of what users are doing on the site and which applications are working, so that they can serve their customers better and achieve customer preference and differentiation.

Key Questions

One thing we often get asked is "what are effective ways to model our audience? How do we go about understanding what's good and what's bad about our user set?" One thing we put forward is a typical audience measurement system called quintile analysis, which is not new. It's been used for off-line audiences, by radio stations for example, to understand what their constituencies are listening to. It's very applicable to the online world.

Basically you take a set of users who have come onto a Web site, say 100,000, and order them according to a quality metric. The metric could be the average number of page views per user or how long someone typically stays on the site. You order them from one to 100,000 and then divide them into quintiles, equal groups of 20,000 each. Then you plot the average success factor in each one of these quintiles. So in your top quintile, users 1 to 20,000 might have an average of roughly five page views per user, whereas your bottom quintile, users 80,001 to 100,000, may have less than one page view per user. Then you plot a graph. If the line, which shows how your best customers differ from your worst customers, has a steep slope in the middle, it says your top group is behaving differently from your bottom group. That's healthy. But if the line is very flat, it means there's very little difference in behavior between your best customers and your worst customers and no one is really expressing preferences on the site.
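The quintile steps above can be sketched in a few lines of Python. This is only an illustration, assuming page views per user is the quality metric; the function name `quintile_averages` and the ten sample users are hypothetical and stand in for a real user set of 100,000 or so.

```python
from statistics import mean


def quintile_averages(metric_by_user):
    """Return the average metric for five equal-sized groups, best group first."""
    # Order users from best to worst on the chosen quality metric.
    ranked = sorted(metric_by_user.values(), reverse=True)
    n = len(ranked)
    if n < 5:
        raise ValueError("need at least five users to form quintiles")
    averages = []
    for q in range(5):
        # Integer boundaries keep the five groups as equal in size as possible.
        start, end = q * n // 5, (q + 1) * n // 5
        averages.append(mean(ranked[start:end]))
    return averages


# Hypothetical page-view counts for ten users (illustrative data only).
page_views = {"u1": 9, "u2": 7, "u3": 5, "u4": 5, "u5": 3,
              "u6": 3, "u7": 2, "u8": 1, "u9": 1, "u10": 0}

averages = quintile_averages(page_views)
# Gap between the top and bottom quintile: a rough stand-in for the slope.
spread = averages[0] - averages[-1]
print(averages, spread)
```

A large spread between the first and last values corresponds to the steep line described above; values that are nearly equal correspond to the flat line.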

It gets more interesting when your clients ask "now that I'm looking at my best customers, tell me more about them. What are they looking at? Are my best users looking at different things than my bottom users?" Or, if you have an application that is linked from a lot of different places inside your intranet, you can understand how people find your application and what they do from there. This allows you to segment your customers, determine the differentiating factors between good customers and bad customers, and determine how you can push your bad customers to behave more like your good customers. For instance, if you've found that people who come in through the home page tend to be pretty bad customers, perhaps the really interesting content is not linked on the home page. If they have to click two to three levels down to find it, it would be pretty straightforward for you to say "based on this information, let's increase the number of links on the home page to the content people are interested in, so we can drive people to the effective content."
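One way to picture this comparison is to look at where each segment's sessions begin. The sketch below assumes you already have session records with a user ID and an entry page, and that the best and worst user groups come from an earlier quintile ranking; the field names, page paths and user IDs are all hypothetical.

```python
from collections import Counter


def entry_page_share(sessions, user_ids):
    """Fraction of the given users' sessions that started on each page."""
    counts = Counter(s["entry_page"] for s in sessions if s["user"] in user_ids)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {page: n / total for page, n in counts.items()}


# Hypothetical session records (illustrative data only).
sessions = [
    {"user": "u1", "entry_page": "/products"},
    {"user": "u1", "entry_page": "/products"},
    {"user": "u2", "entry_page": "/products"},
    {"user": "u2", "entry_page": "/"},
    {"user": "u9", "entry_page": "/"},
    {"user": "u10", "entry_page": "/"},
]

best_users = {"u1", "u2"}    # assumed to be the top quintile
worst_users = {"u9", "u10"}  # assumed to be the bottom quintile

best_share = entry_page_share(sessions, best_users)
worst_share = entry_page_share(sessions, worst_users)
print(best_share, worst_share)
```

If the worst segment enters almost entirely through the home page while the best segment lands deeper in the site, that is the kind of pattern that would support adding home page links to the content people actually want.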

Categories Of Understanding

A lot of really technical terms are thrown around about how you leverage the data effectively and the process you go through to gather and use it. We believe most customers and companies today have a Category One understanding, based on something like page views, which means they know nothing about their customers as individuals. The users coming to their sites are still ghosts. If you add a persistent user identifier, like a cookie, it gives a little bit more solidity and moves the understanding to Category Two. There, they don't know anything more about their customers as individuals, but at least they know when they come back. And they can start to get retention, frequency and recency reports out of it. The ghosts become stick figures.
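To make the Category Two reports concrete, here is a minimal sketch of frequency, recency and a simple returning-visitor flag computed from visits keyed by a persistent cookie ID. The cookie IDs, dates and the `visit_report` function are hypothetical, and counting one visit per distinct day is an assumption made for the example, not a standard definition.

```python
from datetime import date


def visit_report(visits, as_of):
    """Summarise visits per cookie ID: frequency, recency in days, returning flag."""
    days_by_cookie = {}
    for cookie_id, day in visits:
        days_by_cookie.setdefault(cookie_id, set()).add(day)

    report = {}
    for cookie_id, days in days_by_cookie.items():
        report[cookie_id] = {
            "frequency": len(days),                    # distinct days with a visit
            "recency_days": (as_of - max(days)).days,  # days since the latest visit
            "returning": len(days) > 1,                # came back at least once
        }
    return report


# Hypothetical (cookie ID, visit date) pairs (illustrative data only).
visits = [
    ("c-001", date(1999, 10, 1)),
    ("c-001", date(1999, 10, 8)),
    ("c-001", date(1999, 10, 20)),
    ("c-002", date(1999, 10, 5)),
]

report = visit_report(visits, as_of=date(1999, 10, 25))
print(report)
```

Nothing here says who the visitor is. The cookie only ties visits together, which is why the users at this level are stick figures rather than people.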

You get a quantum leap when you apply an anonymous user profile, using either an internal registration system or a third-party profiling system that attaches an anonymous demographic profile to each user. Now the stick figure has been transformed into a woman, for instance, who is 30 to 35 years old and computer literate. Knowing you have 100,000 users of that profile makes it much easier to design your applications and select your content. It becomes much easier to promote to that group, and it allows the client to do segment targeting based on demographics and psychographics.

Finally, when you have a discrete user identity, such as when someone gives you their name and a password and an explicitly declared demographic and psychographic profile, cartoon characters get transformed into actual people. If you know that the woman, 30 to 35 years old, speaks on a cell phone and looks at financial information, you now have discrete information to know exactly who that person is. The advantage is that you now have the magic key that allows your e-business clients to correlate the online user with their existing legacy systems. Although this woman is visiting the Web site, there may be entries about her in their sales transaction system. She may be logged in their support system. She may have participated in a number of their marketing programs, some of which she responded to and some of which she didn't. Now the online users have become real and the client recognizes them as individuals. They can correlate them with existing legacy data and really start to build a holistic understanding of the online customer in the context of their business.
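The correlation step can be pictured as a simple join on a shared customer key. The sketch below assumes, hypothetically, that the registration system, the sales transaction system and the support system can all be keyed on the same `customer_id`; in practice matching identities across legacy systems is usually the hard part, and every record shown is invented for illustration.

```python
def build_customer_view(web_profiles, sales, support_tickets):
    """Combine each registered user's web profile with matching legacy records."""
    view = {}
    for customer_id, profile in web_profiles.items():
        view[customer_id] = {
            "profile": profile,
            # Legacy records are matched on the shared customer_id (an assumption).
            "orders": [s for s in sales if s["customer_id"] == customer_id],
            "tickets": [t for t in support_tickets if t["customer_id"] == customer_id],
        }
    return view


# Hypothetical records from three separate systems (illustrative data only).
web_profiles = {
    "cust-42": {"age_range": "30-35", "interests": ["financial information"]},
}
sales = [
    {"customer_id": "cust-42", "item": "cell phone plan"},
    {"customer_id": "cust-77", "item": "second line"},
]
support_tickets = [
    {"customer_id": "cust-42", "subject": "billing question"},
]

customer_view = build_customer_view(web_profiles, sales, support_tickets)
print(customer_view["cust-42"])
```

The combined record is the holistic view described above: one individual seen across the Web site, the sales system and the support system.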


--------------------------

This tutorial is excerpted from a session given at the Xplor Marketspace '99 conference and exhibition, sponsored by Xplor International in Atlanta earlier this year. The Marketspace conference focuses on the management of documents that support the transactions of business, primarily the generation and collection of money in cyberspace.

Xplor International is a worldwide, not-for-profit professional association representing 2,900 organizations that develop and use the technology of the U.S. $124-billion document systems industry. For more information, please visit the Xplor International web site or call (310) 791-9521.

 