[lug] Procmail Recipe Help Needed

Bill Thoen bthoen at gisnet.com
Wed Dec 29 07:51:37 MST 2004


Never mind... I resolved this one. The solution is to add parentheses and
escape the angle brackets, e.g.:
 :0
 * -5^0
 * 1^1 B ?? ()\<mit\>
 * 1^1 B ?? ()\<[Dd]u\>
etc.

I don't know what the parentheses are for; I found this notation on a
web site where it was not explained. Anyway, this recipe now traps
German-language messages and files them in the spam folder. Now I'm going
to try it on Spanish and Korean, and get rid of those.
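
For reference, here is a sketch of how the whole recipe would presumably
look with the same fix applied to every word from the original list. Only
the first two conditions were actually shown above, so the remaining lines
are an assumption. A likely explanation for the parentheses, going by the
procmailrc man page: a backslash at the very start of a condition is
consumed as a quote for the special leading characters (!, $, ?, <, >), so
the empty () keeps the \< word-boundary token intact. Treat that reading
as unverified.

```
# Start at -5; each whole-word hit in the body adds 1 (weight 1, exponent 1),
# so the recipe only fires once there are at least six hits.
:0
* -5^0
* 1^1 B ?? ()\<mit\>
* 1^1 B ?? ()\<[Dd]u\>
* 1^1 B ?? ()\<[Aa]us\>
* 1^1 B ?? ()\<das\>
* 1^1 B ?? ()\<auf\>
* 1^1 B ?? ()\<sich\>
* 1^1 B ?? ()\<und\>
* 1^1 B ?? ()\<macht\>
* 1^1 B ?? ()\<der\>
{
  # Rewrite the Subject header; SUBJ_ and SPAMFOLDER are the variables
  # defined in the original message below.
  :0 fwh
  | formail -I"Subject: GERMAN SCHPAM: ${SUBJ_}"

  :0
  $SPAMFOLDER
}
```

With the test message described below (two "mit"s, a "das", a "macht", a
"sich" and a "Du") that is six hits, so the score works out to -5 + 6 = 1,
which is positive and triggers the block.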

- Bill Thoen



On Tue, 28 Dec 2004, Bill Thoen wrote:

> I'm trying out the "scoring" technique of procmail to weed out German 
> language schpam, and I tried this to score messages based on common German 
> whole words, but it doesn't work. What am I doing wrong?
> 
> :0
> * -5^0
> * 1^1 B ?? <mit>
> * 1^1 B ?? <[Dd]u>
> * 1^1 B ?? <[Aa]us>
> * 1^1 B ?? <das>
> * 1^1 B ?? <auf>
> * 1^1 B ?? <sich>
> * 1^1 B ?? <und>
> * 1^1 B ?? <macht>
> * 1^1 B ?? <der>
> {
>   :0 fwh
>   | formail -I"Subject: GERMAN SCHPAM: ${SUBJ_}"
>   :0
>   $SPAMFOLDER
> }
> 
> The variable SUBJ_ is defined as:
> SUBJ_=`formail -xSubject: \
>   | expand | sed -e 's/^[ ]*//g' -e 's/[ ]*$//g'`
> 
> and SPAMFOLDER is correct (these work on other messages.)
> 
> The test message had two "mit"s, a "das", a "macht",  a "sich" and a 
> "Du", which should have made the score positive. Am I wrong in assuming 
> that whole words are identified by text in angle brackets? I don't want 
> words like "wonderful" to count as a "der."
> 
> - Bill Thoen
> 
> 
> 
> _______________________________________________
> Web Page:  http://lug.boulder.co.us
> Mailing List: http://lists.lug.boulder.co.us/mailman/listinfo/lug
> Join us on IRC: lug.boulder.co.us port=6667 channel=#colug
> 



