hginsmail 1.12
----------------------------------------------------------------------------
Name

hginsmail - a Perl script for archiving e-mail messages on a Hyper-G server

Synopsis

hginsmail [parameters] [file(s)]

For a short description of the parameters try hginsmail -h; for more details see below.
----------------------------------------------------------------------------
Description

hginsmail takes e-mail messages from a list of files, matches them against user-defined rules and inserts the messages into the collections associated with the rules. Depending on the user's parameters, the script either inserts each message into one collection only (the one corresponding to the first matching rule), or it evaluates the whole list of rules, which may yield several destination collections, and inserts the message into each of these.

Before actually inserting a message into a collection, hginsmail tests whether it is already there (the messages' Ids and authors are stored as keywords). Thus, it does not insert a message into a collection twice, and even if the script processes a file of e-mail messages more than once, it won't produce duplicate messages in the archive (unless the user overrides the script's suggestions).

Upon insertion, the script also tests whether a message is a reply to some other message. If it is, a link to the referenced document is generated. (Thanks to Hyper-G's link management, this works even when the referenced object is inserted at some later time.) Thus, the thread of a discussion can easily be reconstructed, and, moreover, it may be graphically represented with Harmony's Local Map!

In order to get the full thread of a discussion, however, the messages a user sends should appear in the archive, too. This may easily be accomplished by adding a 'Bcc: $USER' header field to each message sent, which copies the message into the user's mailbox.
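
The duplicate check and the reply test described above can be illustrated with a small sketch. It is written in Python for illustration only and is not taken from the Perl script. It assumes that the message Id comes from the Message-ID header, the author from the From header, and the referenced message from the In-Reply-To header; a plain dictionary stands in for a Hyper-G collection. The function names archive_key, insert_once and reply_target are hypothetical.

```python
from email import message_from_string


def archive_key(msg):
    """Return the (message Id, author) pair used here as the duplicate key."""
    return (msg.get("Message-ID", "").strip(), msg.get("From", "").strip())


def insert_once(collection, msg):
    """Add msg to collection unless a message with the same key is present.

    `collection` is a plain dict standing in for a Hyper-G collection
    (an assumption of this sketch). Returns True if the message was added.
    """
    key = archive_key(msg)
    if key in collection:
        return False
    collection[key] = msg
    return True


def reply_target(msg):
    """Return the Id of the message this one replies to, or None."""
    target = msg.get("In-Reply-To")
    return target.strip() if target else None


raw_original = (
    "From: joe@somehost\n"
    "Message-ID: <1@somehost>\n"
    "Subject: Landscape\n"
    "\n"
    "First post.\n"
)
raw_reply = (
    "From: will@somehost\n"
    "Message-ID: <2@somehost>\n"
    "In-Reply-To: <1@somehost>\n"
    "Subject: Re: Landscape\n"
    "\n"
    "Answer.\n"
)

original = message_from_string(raw_original)
reply = message_from_string(raw_reply)

collection = {}
# The original message is offered twice; the second attempt is rejected.
results = [insert_once(collection, m) for m in (original, reply, original)]
print(results)
print(reply_target(reply))
```

Processing the same mail file a second time corresponds to offering the same key again, which is why the third insertion attempt in the sketch is rejected.
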

The files that are read from may contain e-mail messages either in the standard e-mail format (as in the standard mail-file, /usr/spool/mail/$USER) or in emacs' RMAIL format. (Implicitly, newsgroup articles may be archived with hginsmail, too.)

The rules for categorization, along with some general information, should be stored in a file. More details about that follow below.

hginsmail does not insert the messages into the database directly; instead, it writes them to a HIF-file (which is stored in $HOME/.hgmail), and when all messages have been read, it hands that file over to hifimport, a tool which then inserts all messages into the database in one go. The HIF-file is deleted if hifimport terminates successfully.

Sometimes, though, something goes wrong in the course of insertion: a collection may not exist, the HG host may be down, and so on. In this case, the user is notified of the failure and the HIF-file is left in the directory. Before terminating, hginsmail always checks $HOME/.hgmail for the presence of any HIF-files, and if there are any, it tries to insert them again. On success the files are deleted; otherwise the user is notified. (The user can also insert the HIF-files manually by calling hifimport.)
----------------------------------------------------------------------------
Parameters (only the first four letters are essential)

-test
    Test Mode. The script evaluates the user-defined rules for each message in the given files and echoes the designated destination collections. This mode allows the user to check the correctness of the rules.

-query
    Query Mode. The script echoes the destination collections for each message and gives the user the chance to change them before writing the messages to the HIF-file.

-archive
    Archive Mode. Usually the script inserts the messages into their categories' collections and additionally copies them to a so-called 'incoming mail' collection.
    The purpose of this feature is that only a single collection has to be searched for newly inserted objects (hginsmail may be executed regularly to insert new mail into the database, and in this case the 'incoming mail' collection is especially useful). The situation more frequently encountered, however, is that e-mail which has been presorted using some other mail tool just needs to be archived. In this case it would be bothersome to have a copy of each message lying around. Thus, the option -archive keeps the script from copying the messages to the 'incoming mail' collection.

-selective
    In this mode, the script skips those messages which do not match any query.

-multiple
    Insert into multiple collections. In this mode, the whole list of rules is evaluated and a message is inserted into each collection associated with a matching rule. Without this parameter, a message is inserted only into the collection associated with the first matching rule.

-force
    Forces all messages to be put into the collection named after this argument. This feature is especially useful when testing the script and its settings. (Note that -force disables the Archive Mode!)

-empty
    Empties the files after inserting their contents.

-rc ...
    Define the name of the rc-file that contains the general settings and rules for this run. (See below for more information about the rc-file.)

-cname ...
    Select a different collection into which messages are put when they do not satisfy any query. The default collection is defined in hginsmail.rc. (See below.)

-hghost ...
    Select a Hyper-G host other than the one defined in hginsmail.rc.

-identify
    The user must enter username and password before the messages are inserted into the database. By default, the user is identified automatically.

-v
    Activates verbose mode.

file(s)
    This is the list of files containing the mail messages that are to be archived. When no files are given, the script chooses the standard mail-file, the directory of which is defined in hginsmail.rc.
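
The way -multiple, -selective and the default collection interact can be sketched as follows. This is an illustration in Python, not the script's Perl code. It assumes that a message matching no rule goes to the default collection unless -selective is given, and that nothing is inserted when no default collection is defined; the function name destinations and the collection name unsorted_asj are hypothetical.

```python
import re


def destinations(msg, rules, default=None, multiple=False, selective=False):
    """Return the list of collections a message would be inserted into.

    `rules` is an ordered list of (predicate, collection) pairs.
    """
    # Collect matching collections in rule order, dropping repeats so that
    # two rules naming the same collection yield it only once.
    matched = list(dict.fromkeys(
        collection for predicate, collection in rules if predicate(msg)
    ))
    if matched:
        # Without -multiple only the first matching rule counts.
        return matched if multiple else matched[:1]
    if selective or default is None:
        # -selective skips unmatched messages; with no default collection
        # this sketch assumes there is nowhere to put them.
        return []
    return [default]


rules = [
    (lambda m: re.search(r"Hyper-G", m["Subject"]) is not None, "hgstuff_asj"),
    (lambda m: re.search(r"aschmid", m["To"]) is not None, "to_asj_pers"),
]
msg = {"Subject": "Hyper-G release", "To": "aschmid@iicm.edu"}
other = {"Subject": "Lunch", "To": "someone@somehost"}

print(destinations(msg, rules))
print(destinations(msg, rules, multiple=True))
print(destinations(other, rules, default="unsorted_asj"))
print(destinations(other, rules, default="unsorted_asj", selective=True))
```
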
----------------------------------------------------------------------------
hginsmail.rc

This file holds all information that is specific to each user and/or each execution of the script. To be exact, it defines the default Hyper-G host, the default collection for messages that cannot be categorized, the name of the 'incoming mail' collection, the default access rights for the documents in the database, the directory of the standard mail-file, and, of course, the rules for categorizing the messages.

Since this information may differ from case to case, the commandline parameter -rc allows the user to pass the name of the rc-file currently desired. If this argument is missing, the script first looks for a file named hginsmail.rc in the current directory, and if it cannot find it there, it checks the directory ~/.hgmail/.

An example rc-file can be found at the same location as hginsmail. It is probably best to copy it and adapt it to one's own needs.

Rules

While the definitions of host, default collection etc. do not need much explanation, the definition of the rules probably does. Each rule consists of two parts: a description and the name of a destination collection (which may optionally be followed by the access rights that a document matching this rule should be assigned). The two parts are separated by '->'.

Description
    This is a pattern matching command in Perl syntax. Expressions may be arbitrarily complex as long as Perl is able to execute them.

        ($Subject =~ /Landscape/)
        ($From =~ /will|summer|joe@somehost/i)
        ($To =~ /aschmid/)

    A comment in the example rc-file lists which variables may legally be used within the expressions.

Destination
    This is simply the name of a collection (on the selected HG-server) into which a message is put when it matches the corresponding description. Access rights may follow the collection name, separated by commas; they are assigned to the document when it is inserted into the database.
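
The rule layout described above (a description, then '->', then a collection name optionally followed by comma-separated access rights) can be split apart as in the following sketch. It is written in Python for illustration and is not the parser used by the script; the function name parse_rule is hypothetical, and the sketch assumes one rule per line.

```python
def parse_rule(line):
    """Split one rule line into (description, collection, rights)."""
    # Split at the last '->' so that an arrow inside the Perl expression
    # would not be mistaken for the separator.
    description, separator, target = line.rpartition("->")
    if not separator:
        raise ValueError("rule has no '->' separator: %r" % line)
    fields = [field.strip() for field in target.split(",")]
    fields = [field for field in fields if field]
    if not fields:
        raise ValueError("rule has no destination collection: %r" % line)
    # The first field is the collection, any further fields are rights.
    return description.strip(), fields[0], fields[1:]


print(parse_rule("($To =~ /aschmid/) -> to_asj_pers, aschmid"))
print(parse_rule("($Subject =~ /Landscape/) -> _SKIP_"))
```
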
    The reserved name _SKIP_ tells the script to skip the matching messages instead of inserting them.

Having introduced the definitions, here are some examples:

    ($Subject =~ /Hyper-G/) -> hgstuff_asj, hgteam, hgintern
    ($Subject =~ /\[holiday\]/) -> _SKIP_
    ($To =~ /aschmid/) -> to_asj_pers, aschmid
    (($Cc =~ /hgteam|hgintern/)) -> hgstuff_asj
    ($Gnus =~ /alt.prose/) -> gnus_alt_prose_asj
----------------------------------------------------------------------------
What may come in the Future

* Selective Insertion
  This concept describes the feature that messages matching a query are inserted into the database, while all others are left in the mail-file. The current version of selective insertion does not take care of the skipped messages.
----------------------------------------------------------------------------
Changes since hginsmail 1.1

* I've added a commandline parameter -selective which tells the script to insert just those messages which match a query; all others are ignored;
* The Rights attribute is no longer ignored ...
* The produced HTML documents had two entries; that is removed now (thanks michael-k!)

Thanks to Michael Klemme <michael-k@cs.auckland.ac.nz> for the ideas behind the following changes:

* It is now possible to define rights along with each rule in the .rc-file;
* The header field Sender is now accepted, too;
* The script will execute even without a default collection defined;

Thanks to Tasos Koutoumanos <tkout@softlab.ece.ntua.gr> for the ideas behind the following changes:

* It is now possible to define the files to be read from in the rc-file; additional ones may still be added via the commandline;
* The user may now define a custom Header and Footer part, which will be prepended / appended to each message;
* The name of the rc-file may be defined as a commandline argument (-rc); if this argument is missing, the script looks for an rc-file in the current directory, and then in ~/.hgmail;
* When an 'http://...'
  URL appears somewhere in the text, a hyperlink is automatically inserted;
* Similarly, there are mailto links to a message's author and to the address given in the Reply-To field;
----------------------------------------------------------------------------
See Also

hgsendmail
    Some of you have probably wondered about Harmony's Mail functions, which usually do not produce any visible effect when selected. However, when hgsendmail is installed on your system and the file hgsendmail.rc is in $HOME/.hgmail, they will work just as you'd expect them to. For more information please see the documentation of hgsendmail.
----------------------------------------------------------------------------
Author

Alfons Schmid (aschmid@iicm.edu) - July 22, 1996