http://listserver.info/mercury/filtering.htm is best if you want to tag or reject the email after it is received. My only issue with that approach is that it does not let the spammer know that your email address is not valid.
Rejecting at TRANSFLT.MER allows you to "clean" your email address off the spammer's list by sending a 550 result, which tells him the address is invalid. The fact that you have already accepted the address probably doesn't matter, since the spammer's log will only show the reason the entire delivery failed.
Unfortunately, it appears that the subject line cannot be filtered based on character set. I have tried the following in my TRANSFLT.MER:
S, "*koi8-r*",RS, "550 spam: character set. Use plain ascii or call 1-xxx-xxx-xxxx."
and it fails to match email with:
subject: =?koi8-r?B?gobblygook
On the other hand
Subject: test koi8-r test
triggers the filter.
So Mercury is apparently decoding the character set encoding /first/ and then applying the TRANSFLT rules. I guess there must be a good reason for this, but it does prevent us from filtering out all the Cyrillic spam, for example.
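To illustrate why a "*koi8-r*" pattern would miss once the subject has been decoded, here is a rough Python sketch. It is not Mercury's code. The sample Cyrillic word and the helper contains_charset_name are made up for the example, and the substring test is only a stand-in for the wildcard rule.

```python
import base64
from email.header import decode_header

# Build an RFC 2047 encoded-word subject; the Cyrillic sample text is made up.
payload = base64.b64encode("тест".encode("koi8-r")).decode("ascii")
raw_subject = f"=?koi8-r?B?{payload}?="

# Decode it the way a MIME-aware program might before applying its rules.
decoded_subject = "".join(
    part.decode(charset or "ascii") if isinstance(part, bytes) else part
    for part, charset in decode_header(raw_subject)
)

def contains_charset_name(subject):
    """Hypothetical stand-in for the "*koi8-r*" wildcard: a substring test."""
    return "koi8-r" in subject.lower()

print(contains_charset_name(raw_subject))      # checks the raw header text
print(contains_charset_name(decoded_subject))  # checks the text after decoding
```

The charset name only exists in the raw header, so the first check succeeds and the second does not, which is consistent with what I saw.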
A better alternative is
S, "*[¡-ÿ]*", RS, "550 invalid address: Weird Characters"
Where the first character in the [] is 0xA1 (I wish it were 0x80, the first value above the 7-bit ASCII range) and the second character is 0xFF, the maximum value of any single byte. These were entered via the Mercury TRANSFLT editor by pressing alt+173 for the first and alt+152 for the last. Interestingly enough, entering alt+255 does not put in an 0xFF, but alt+152 DOES cause the correct value, 0xFF, to appear in the TRANSFLT.MER file. The closest value to 0x80 that I could enter with the alt+xxx keystroke was alt+173, which for some strange reason ends up being an 0xA1.
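One possible explanation for the odd alt+xxx values, sketched in Python: Windows normally looks up Alt codes typed without a leading zero in the OEM code page, and I am ASSUMING here that this is CP437 and that the editor then saves the character as a Windows-1252 byte. The function name alt_code_to_stored_byte is made up for the example.

```python
def alt_code_to_stored_byte(alt_code):
    """Map an Alt+nnn keypad code to the byte that would land in the file,
    ASSUMING the code is looked up in CP437 and saved as Windows-1252."""
    character = bytes([alt_code]).decode("cp437")
    # Raises UnicodeEncodeError if Windows-1252 has no such character.
    return character.encode("cp1252")[0]

for code in (173, 152, 255):
    print(f"alt+{code} -> 0x{alt_code_to_stored_byte(code):02X}")
```

Under those assumptions alt+173 gives 0xA1, alt+152 gives 0xFF and alt+255 gives 0xA0, which matches what ended up in my TRANSFLT.MER.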
This catches most, but not all: 0x80 to 0xA0 will not be filtered out. The unprintable characters below space also cannot be entered with the editor included in Mercury, as far as I can tell. The rule should still trigger in most cases, as it is unlikely that the spam will happen to use only the characters that are not detected.
As Michael advises, an external editor with binary capability could be used to enter the characters better, but the above will work with the editor included in Mercury and so is a quick way to instantly reject weird characters.
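A small Python sketch of the coverage, assuming the [] range simply compares raw byte values (I have not confirmed how Mercury evaluates it). The function range_rule_matches and the sample subjects are made up for illustration.

```python
def range_rule_matches(subject_bytes, low=0xA1, high=0xFF):
    """Hypothetical model of the rule: true if any byte falls in low..high."""
    return any(low <= b <= high for b in subject_bytes)

cyrillic = "тест".encode("koi8-r")     # KOI8-R letters sit well above 0xA1
gap_only = bytes([0x80, 0x93, 0xA0])   # only bytes from the uncovered gap
plain = b"test koi8-r test"            # ordinary 7-bit ASCII

for name, sample in (("cyrillic", cyrillic), ("gap_only", gap_only), ("plain", plain)):
    print(name, range_rule_matches(sample))
```

Only a subject made up entirely of 0x80 to 0xA0 bytes (plus ASCII) would slip past, which is why the gap rarely matters in practice.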
Again:
S, "*[¡-ÿ]*", RS, "550 invalid address: Weird Characters"
Where the first weird character is entered with alt+173 (use the numeric keypad for the 173) and the ending weird character is alt+152.