hwsesstat

A Tool Analyzing the Trails Users Frequently Take Through a HyperWave Database


Introduction

This is the first release of a tool that reports how users move through a HyperWave server. I wrote it to get some hints about the typical HyperWave users' behaviour: where their favourite entry points are, how they proceed from there, how long their sessions last, and where they usually leave the system.
The information comes from the logfiles produced by wavemaster. The script follows the sessions within a specified period of time and records every navigational step taken by the users. From this information it builds a tree-like structure that contains the trails of all sessions.
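
The tree-building step described above can be sketched as follows. This is only an illustration in Python, not the script's actual code: the TrailNode class, the build_trail_tree function and the input format (one list of document names per session, already extracted from the logfiles) are assumptions made for this example.

```python
class TrailNode:
    """One step on a trail (hypothetical structure for this sketch)."""

    def __init__(self):
        self.users = 0       # number of sessions that reached this step
        self.children = {}   # document name -> TrailNode of the next step


def build_trail_tree(sessions, depth=5):
    """Merge all sessions into one tree, following at most `depth` steps.

    `sessions` is assumed to be a list of sessions, each given as the
    ordered list of document names the user visited.
    """
    root = TrailNode()
    for steps in sessions:
        root.users += 1
        node = root
        for doc in steps[:depth]:
            # Sessions sharing the same prefix share the same nodes.
            node = node.children.setdefault(doc, TrailNode())
            node.users += 1
    return root


sessions = [
    ["home", "a", "b"],
    ["home", "a"],
    ["news"],
]
tree = build_trail_tree(sessions)
print(tree.users, tree.children["home"].users)
```

In this sketch the children of the root are the entry points, and the counter on each node says how many sessions followed the trail up to that document.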

To make this information visible I decided to use hypertext. The script outputs a set of hyperlinked documents, which are put into a single collection. When the collection is visited, only a document of general statistics is presented to the user; all others are accessible from there.
For each document in the resulting lists, the number of users, the percentage of users who came there from the previous document, and the overall percentage are shown. Moreover, the name of each document in a list links to the list of documents that were accessed from it.
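
As a worked example of the two percentages, the following Python sketch shows one plausible way to compute them. The function name and the exact definitions (share of the users at the previous document, share of all sessions) are assumptions for illustration; the script's own formulas may differ.

```python
def step_percentages(users_here, users_at_previous, total_users):
    """Return (percent of previous document's users, percent of all users).

    Assumed definitions for this sketch, not taken from the script itself.
    """
    if users_at_previous <= 0 or total_users <= 0:
        raise ValueError("user counts must be positive")
    from_previous = 100.0 * users_here / users_at_previous
    overall = 100.0 * users_here / total_users
    return from_previous, overall


# 200 sessions in total, 50 reached the previous document, 40 went on to this one.
print(step_percentages(40, 50, 200))  # (80.0, 20.0)
```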

Since this is the first release, there are surely many things that could be improved or added, and maybe they will be, if you send me your comments.


Parameters

hwsesstat is based on hwlogstat, so most parameters have remained the same and usage is quite similar. If you get along with hwlogstat, you should have no problems with hwsesstat.

-dir <directory>
Defines the directory the logfiles are stored in (e.g. ~hgsystem/logs). Only the logfiles in this directory will be examined.

-file <filename>
Specifies the name of the current logfile (as defined in .db.contr.rc). The default name is wave.log, which also assumes the files to be in wavemaster's format.

-old
If present, this parameter indicates that the logfiles to be analyzed are in wwwmaster's format.
Note that in this case the filename (-file) should be changed, too.

-hghost <name>
Name of HyperWave host (DNS)

-hname <string>
Hostname that shall appear in the summary's title

-pname <coll>
Name of the parent collection to put the result into.

-cname <coll>
The name of the collection that will contain the hyperlinked documents. By default, the name is pname/waythrough.<timestamp>.

-ctitle <string>
The collection's title.

-from <yy/mm/dd>
First day to analyze. Should be in the form yy/mm/dd.

-to <yy/mm/dd>
Last day to analyze. By default, yesterday's date is assumed. Format as above.

-lastseven
Analyzes the last seven days (may be used instead of -from and -to).

-lastmonth
Analyzes the last month (may be used instead of -from and -to).

-top <number>
Specifies the top n items to be listed (20 by default).

-depth <number>
Number of steps to follow the trail (5 by default).

-percent <number>
Neglect trails followed by fewer than the given percentage of users.
Note that increasing -top and -depth and decreasing -percent increases the number of documents produced.

-v
Verbose mode
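
To illustrate how -top, -depth and -percent together limit the output, here is a small Python sketch. It is not the script's code: the nested-dictionary tree, the select_trails function and the choice to measure -percent against the users at the previous step are all assumptions made for this example.

```python
def select_trails(node, top=20, depth=5, percent=0.0):
    """Return the trails worth reporting as nested (doc, users, subtrails) tuples.

    `node` is a hypothetical dict: {"users": int, "children": {doc: node}}.
    """
    if depth == 0 or node["users"] == 0:
        return []
    # Most frequently followed documents first; ties broken by name.
    ranked = sorted(node["children"].items(),
                    key=lambda item: (-item[1]["users"], item[0]))
    selected = []
    for doc, child in ranked[:top]:
        share = 100.0 * child["users"] / node["users"]
        if share < percent:
            continue  # too few users followed this trail
        selected.append(
            (doc, child["users"], select_trails(child, top, depth - 1, percent))
        )
    return selected


tree = {"users": 10, "children": {
    "home": {"users": 8, "children": {
        "a": {"users": 6, "children": {}},
        "b": {"users": 1, "children": {}},
    }},
    "news": {"users": 2, "children": {}},
}}
print(select_trails(tree, top=20, depth=2, percent=25))
```

Each selected entry would correspond to one line in a list, and each non-empty list to one generated document, which is why larger -top and -depth values and a smaller -percent value produce more documents.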


Example

hwsesstat -hghost hgiicm.iicm.edu -dir '~hgsystem/logs' -pname statistics/sessions -ctitle "Some Info about Sessions" -lastmonth -top 20 -v
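
If the script is started from another program, an explicit period can be given with -from and -to instead of -lastmonth. The following Python sketch only builds and prints such a command line. The helper name hwsesstat_command, the choice of the seven days ending yesterday, and the values for -depth and -percent are assumptions for illustration; the host and collection names are taken from the example above.

```python
import shlex
from datetime import date, timedelta


def hwsesstat_command(today):
    """Build an argument list covering the seven days that end yesterday."""
    last_day = today - timedelta(days=1)
    first_day = last_day - timedelta(days=6)
    return [
        "hwsesstat",
        "-hghost", "hgiicm.iicm.edu",
        "-dir", "~hgsystem/logs",
        "-pname", "statistics/sessions",
        "-ctitle", "Some Info about Sessions",
        "-from", first_day.strftime("%y/%m/%d"),  # yy/mm/dd, as required
        "-to", last_day.strftime("%y/%m/%d"),
        "-top", "20",
        "-depth", "3",     # hypothetical value
        "-percent", "5",   # hypothetical value
        "-v",
    ]


cmd = hwsesstat_command(date.today())
# Print the command quoted for a shell; nothing is executed here.
print(shlex.join(cmd))
```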


Alfons Schmid
aschmid@iicm.edu
Oct. 9th, 1996