Discussion:
Harvest netblocks of good MTAs from SPF for whitelisting from greylisting
Constantine A. Murenin
2013-02-12 21:12:15 UTC
Permalink
Hi,

I'm configuring greylisting with pf(4) and OpenBSD spamd, and one of
the things I would like to do is explicitly whitelist good MTAs of
e.g. google.com, apple.com, ebay.com, schwab.com, freebsd.org,
uwaterloo.ca etc.

It seems like with the proliferation of SPF, this might be relatively
easy to do: the netblocks of many valid email servers are individually
published by each respective domain through various SPF-compliant
records. (And even when there are no explicit "ip4" or "ip6" records
with the IP addresses or netblocks, an explicit or implied "v=spf1 a
mx" could still do the trick.)

What's left is to have a list of "good" domains, and a script that
will go through all such domains once a week to compile the list of
good netblocks. Such netblocks could then be exempt from any kind of
greylisting, such as to never delay the mail from trustworthy domains
whatsoever.
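The harvesting step could be sketched as a small parser. This is illustrative only (the function name is mine) and handles just the static "ip4"/"ip6" mechanisms; "a", "mx", and "include" would need live DNS lookups:

```python
import re

def extract_networks(spf_record):
    """Pull the static ip4:/ip6: mechanisms out of a raw SPF TXT record.

    Mechanisms needing DNS resolution (a, mx, include, exists) are skipped.
    """
    nets = []
    for term in spf_record.split():
        m = re.match(r'^[+]?(ip4|ip6):(.+)$', term)
        if m:
            nets.append(m.group(2))
    return nets

record = ("v=spf1 ip4:216.239.32.0/19 ip4:64.233.160.0/19 "
          "include:_netblocks2.google.com ?all")
print(extract_networks(record))  # ['216.239.32.0/19', '64.233.160.0/19']
```

A weekly cron job could run this over the "good" domain list and write the result to a file loaded as a pf table.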

Cheers,
Constantine.
Stuart D Gathman
2013-02-12 22:16:03 UTC
Permalink
Post by Constantine A. Murenin
Hi,
I'm configuring greylisting with pf(4) and OpenBSD spamd, and one of
the things I would like to do is explicitly whitelist good MTAs of
e.g. google.com, apple.com, ebay.com, schwab.com, freebsd.org,
uwaterloo.ca etc.
It seems like with the proliferation of SPF, this might be relatively
easy to do: the netblocks of many valid email servers are individually
published by each respective domain through various SPF-compliant
records. (And even when there are no explicit "ip4" or "ip6" records
with the IP addresses or netblocks, an explicit or implied "v=spf1 a
mx" could still do the trick.)
What's left is to have a list of "good" domains, and a script that
will go through all such domains once a week to compile the list of
good netblocks. Such netblocks could then be exempt from any kind of
greylisting, such as to never delay the mail from trustworthy domains
whatsoever.
You are overthinking it. With SPF, you don't need to mess with
netblocks at all (and that would not be practical at all with IP6). If
SPF passes, and the domain is trusted, you are good. That is the whole
point of SPF.

I recommend keeping a policy database for *every* SPF result, not just
Pass. I use the sendmail access file with a python milter, for example:

# for sysadmins who still refuse to use the correct hostname, and also
# publish an SPF policy prohibiting the bogus name they do use:
HELO-Fail:owa.johnsjames.com OK
# for "good" domains like you describe
SPF-Pass:smashwords.com WHITELIST
# reject yahoo messages that don't pass DKIM
DKIM-Fail:yahoo.com REJECT
# reject "evil" domains, even if they do understand SPF (unlike most
# "good" domains)
SPF-Pass:sintys.gov.ar REJECT
# Send DSN nagging about their stupid syntax error, then use pyspf
# heuristics to guess what they really meant
SPF-PermError:volvocars.com DSN
# Nag them about their unauthorized MTA, but accept the mail anyway
SPF-Softfail:patriot.net DSN
# Reject Neutral mail, even though their SPF policy is too timid to use -all
SPF-Neutral:financial.ca REJECT
# Reject Neutral mail, and add the connect IP to a DNS blacklist!
SPF-Neutral:coca-cola.com BAN
# Even though the MTA has invalid HELO, invalid PTR, and no SPF record,
# accept mail after verifying the localpart via CallBackValidation
SPF-None:hbham.com CBV
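As a rough illustration of the policy-database idea, a lookup keyed on (result, domain) with a default action might look like this. The table, keys, and fallback here are hypothetical, not pymilter's actual API or file format:

```python
# Hypothetical policy table mirroring a few of the access-file entries
# above; the action names and 'CONTINUE' fallback are illustrative.
POLICY = {
    ('SPF-Pass', 'smashwords.com'): 'WHITELIST',
    ('SPF-Pass', 'sintys.gov.ar'): 'REJECT',
    ('SPF-Neutral', 'financial.ca'): 'REJECT',
    ('SPF-None', 'hbham.com'): 'CBV',
}

def action_for(result, domain, default='CONTINUE'):
    """Look up the configured action for (SPF result, sender domain)."""
    return POLICY.get((result, domain), default)

print(action_for('SPF-Pass', 'smashwords.com'))  # WHITELIST
print(action_for('SPF-Pass', 'example.org'))     # CONTINUE (no entry)
```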
Constantine A. Murenin
2013-02-12 22:40:16 UTC
Permalink
Post by Constantine A. Murenin
Hi,
I'm configuring greylisting with pf(4) and OpenBSD spamd, and one of
the things I would like to do is explicitly whitelist good MTAs of
e.g. google.com, apple.com, ebay.com, schwab.com, freebsd.org,
uwaterloo.ca etc.
It seems like with the proliferation of SPF, this might be relatively
easy to do: the netblocks of many valid email servers are individually
published by each respective domain through various SPF-compliant
records. (And even when there are no explicit "ip4" or "ip6" records
with the IP addresses or netblocks, an explicit or implied "v=spf1 a
mx" could still do the trick.)
What's left is to have a list of "good" domains, and a script that
will go through all such domains once a week to compile the list of
good netblocks. Such netblocks could then be exempt from any kind of
greylisting, such as to never delay the mail from trustworthy domains
whatsoever.
You are overthinking it. With SPF, you don't need to mess with netblocks at
all (and that would not be practical at all with IP6). If SPF passes, and
the domain is trusted, you are good. That is the whole point of SPF.
But I am not using SPF in my MTA, and I do not plan to.

I'd guesstimate that a setup with pf(4) whitelisting of common MTAs
through the SPF harvesting approach described, together with
greylisting at the firewall level, for my domains would be much more
effective in combating spam than any kind of SPF or DKIM
implementations at my MTA level, and without the false positives.

Also, there's no reason why IPv6 would be different here than IPv4: a
/48 IPv6 netblock with 2^80 addresses can still be represented through
a single record.

C.
Stuart D Gathman
2013-02-12 22:55:56 UTC
Permalink
Post by Constantine A. Murenin
I'd guesstimate that a setup with pf(4) whitelisting of common MTAs
through the SPF harvesting approach described, together with
greylisting at the firewall level, for my domains would be much more
effective in combating spam than any kind of SPF or DKIM
implementations at my MTA level, and without the false positives.
But you are doing the same thing as SPF - except guessing the valid MTAs
instead of using the official list conveniently provided in SPF records.
If you are worried about efficiency, SPF records that don't involve
localpart or PTR macros can be resolved to a set of IPs (more general
than a netblock) with a TTL, and cached. I believe libspf2 has this
feature already. Take the union of the IP sets of all your "good"
domains if you are going to treat them all the same.
Constantine A. Murenin
2013-02-13 00:19:42 UTC
Permalink
Post by Stuart D Gathman
Post by Constantine A. Murenin
I'd guesstimate that a setup with pf(4) whitelisting of common MTAs
through the SPF harvesting approach described, together with
greylisting at the firewall level, for my domains would be much more
effective in combating spam than any kind of SPF or DKIM
implementations at my MTA level, and without the false positives.
But you are doing the same thing as SPF - except guessing the valid MTAs
instead of using the official list conveniently provided in SPF records.
If you are worried about efficiency, SPF records that don't involve
localpart or PTR macros can be resolved to a set of IPs (more general
than a netblock) with a TTL, and cached. I believe libspf2 has this
feature already. Take the union of the IP sets of all your "good"
domains if you are going to treat them all the same.
No. I cannot have my firewall do SPF evaluations, and I'm not
attempting to do the same thing as SPF. I am also not guessing valid
MTAs, I'm getting their list deterministically based on static SPF
information that is published by relevant entities whose mail I might
care to never delay.

Let's do a quick reality check here: SPF cannot block spam. It can
only block forgeries, and if you configure it as such, it is also very
much so likely to have many false positives, too.

My approach, as described above, involves the following:

* collect SPF information statically (every week or so) and compile a
list of netblocks
* pass connections from such whitelist directly to sendmail/qmail/mta of choice
* pass other connections to spamd, to do greylisting
* if a host passes greylisting, spamd updates the firewall, and the
next connection attempt will go to sendmail/qmail/etc
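A minimal pf.conf sketch of those rules, assuming the stock OpenBSD spamd setup; table names, the whitelist file path, and the interface are illustrative:

```
# harvested SPF/MX netblocks, regenerated weekly by the cron script
table <spf_whitelist> persist file "/etc/mail/spf-whitelist.txt"
# hosts that have passed greylisting; maintained by spamd/spamlogd
table <spamd-white> persist

# Last matching rule wins: connections hit the spamd redirect first,
# then the two whitelist tables override it with a direct pass to the MTA.
pass in on egress proto tcp from any to any port smtp \
    rdr-to 127.0.0.1 port spamd
pass in on egress proto tcp from <spf_whitelist> to any port smtp
pass in on egress proto tcp from <spamd-white> to any port smtp
```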

The novelty of my approach is collecting a whitelist through an
SPF-based record harvesting.

SPF-wise, this approach has limitations of, for example, obviously not
supporting SPF's "ptr" and "exists" declarations, and also not being
real-time; however, even for such domains that rely on such features,
they'd still be able to deliver mail after passing greylisting, and
with no false positives.

Non-SPF-wise, this approach might make SPF actually useful for those
who think that it is not, and would complement any greylisting policy
with very little overhead and a 0.000% false positive rate.

Yes, I'm not trying to use SPF as it was designed, but who cares?
It's not like we use IPv4 for what it was originally designed either,
is it?

Right now I'm trying to see if someone has already done this before,
as it seems simple enough. Else, some ideas on a possible
implementation, or perhaps just stirring some interest amongst people
who are thinking of better ways to use the SPF records that remain
largely unused.

Best regards,
Constantine.
Tim Draegen
2013-02-13 01:20:26 UTC
Permalink
Post by Constantine A. Murenin
Post by Stuart D Gathman
Post by Constantine A. Murenin
I'd guesstimate that a setup with pf(4) whitelisting of common MTAs
through the SPF harvesting approach described, together with
greylisting at the firewall level, for my domains would be much more
effective in combating spam than any kind of SPF or DKIM
implementations at my MTA level, and without the false positives.
But you are doing the same thing as SPF - except guessing the valid MTAs
instead of using the official list conveniently provided in SPF records.
If you are worried about efficiency, SPF records that don't involve
localpart or PTR macros can be resolved to a set of IPs (more general
than a netblock) with a TTL, and cached. I believe libspf2 has this
feature already. Take the union of the IP sets of all your "good"
domains if you are going to treat them all the same.
No. I cannot have my firewall do SPF evaluations, and I'm not
attempting to do the same thing as SPF. I am also not guessing valid
MTAs, I'm getting their list deterministically based on static SPF
information that is published by relevant entities whose mail I might
care to never delay.
Constantine, I think your approach is novel and interesting. I'd like to see what the actual results are, as nothing beats real life data.

FWIW, my understanding is that there is a gap between the stuff that is easy to determine as "legit" and the stuff that is easy to flag as "unwanted" (when one throws the kitchen sink of tools available to make such a determination). The trouble lies in the gap -- for any single piece of email in "the gap", is this piece of email:

- legit but routed in a weird way?
- a handcrafted phish?
- spam/malware from a compromised (but otherwise trusted) source?

Given the above, I'm curious to know how much email "in the gap" would be pulled out if you generate a bunch of netblocks from "good" SPF records and use that to manage your grey-listing.

I would guess that you'd only be reinforcing stuff that has no problem being passed through as "legit". But if you have data that shows otherwise, that is definitely something fun to poke at.

$0.02,
-= Tim
Stuart D Gathman
2013-02-13 01:50:01 UTC
Permalink
Post by Constantine A. Murenin
No. I cannot have my firewall do SPF evaluations, and I'm not
attempting to do the same thing as SPF. I am also not guessing valid
MTAs, I'm getting their list deterministically based on static SPF
information that is published by relevant entities whose mail I might
care to never delay.
You misunderstood. My only suggestion was to calculate the *actual* IP
set, not the "netblock" set. SPF is much more fine grained than
netblocks. The same colocation facility might have both "good" and
"evil" domains. The idea of statically compiling the IP set for your
type of application is a good one, and has already been implemented.
Constantine A. Murenin
2013-02-13 02:36:33 UTC
Permalink
Post by Constantine A. Murenin
No. I cannot have my firewall do SPF evaluations, and I'm not
attempting to do the same thing as SPF. I am also not guessing valid
MTAs, I'm getting their list deterministically based on static SPF
information that is published by relevant entities whose mail I might
care to never delay.
You misunderstood. My only suggestion was to calculate the *actual* IP set,
not the "netblock" set. SPF is much more fine grained than netblocks. The
I think you're putting distinct meanings into the terms "IP set" and
"netblock", or ignoring the fact that bigger providers actually
publish their IP set as netblocks within SPF.

Else, why would someone specify a netblock in their SPF that's not
their actual IP set? If they do, then their SPF is broken, and it'll
still be broken regardless whether you do the harvesting like I
contemplate, or implement SPF on your MTA.
same colocation facility might have both "good" and "evil" domains. The
idea of statically compiling the IP set for your type of application is a
good one, and has already been implemented.
Where has it been implemented? Please provide me a link, I'm very
interested in applying it to my setup.

C.
Stuart D Gathman
2013-02-13 03:47:48 UTC
Permalink
Post by Constantine A. Murenin
I think you're putting distinct meanings into the terms "IP set" and
"netblock", or ignoring the fact that bigger providers actually
publish their IP set as netblocks within SPF.
I may have misunderstood you also. I thought you were broadening
narrower IPs to include the entire netblock.
Else, why would someone specify a netblock in their SPF that's not
their actual IP set? If they do, then their SPF is broken, and it'll
still be broken regardless whether you do the harvesting like I
contemplate, or implement SPF on your MTA.
same colocation facility might have both "good" and "evil" domains. The
idea of statically compiling the IP set for your type of application is a
good one, and has already been implemented.
Where has it been implemented? Please provide me a link, I'm very
interested in applying it to my setup.
http://www.libspf2.org/docs/html/spf__compile_8c.html

I'm not sure if that was intended to be used by itself - libspf2 uses it
to cache SPF records in a very fast form.

Also, I have a similar feature I added recently to pyspf - it isn't
checked in to CVS yet, and not thoroughly tested. I added it because I've
often just wanted a list of MTA IPs for debugging mail problems...
Stuart D Gathman
2013-02-13 03:59:45 UTC
Permalink
Post by Constantine A. Murenin
same colocation facility might have both "good" and "evil" domains. The
idea of statically compiling the IP set for your type of application is a
good one, and has already been implemented.
Where has it been implemented? Please provide me a link, I'm very
interested in applying it to my setup.
I did a quick and dirty for pyspf (just add ips to a set while evaluating):

http://spidey2.bmsi.com/pyspf/spf.py

$ python spf.py 0.0.0.0 ***@google.com google.com
(('softfail', 250, 'domain owner discourages use of this host'), '~all')
216.239.32.0/19
64.233.160.0/19
66.249.80.0/20
72.14.192.0/18
209.85.128.0/17
66.102.0.0/20
74.125.0.0/16
64.18.0.0/20
207.126.144.0/20
173.194.0.0/16
216.73.93.70/31
216.73.93.72/31

This needs to ensure that all paths are followed, and combine adjacent
blocks (e.g. the last two above). It also needs to handle IP6.
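For the block-combining step, Python's stdlib can coalesce adjacent CIDRs. As a caveat, the two /31s above happen not to align on a /30 boundary, so no single valid CIDR covers both and they stay separate:

```python
import ipaddress

def coalesce(cidrs):
    """Merge adjacent/overlapping CIDR blocks into the minimal set."""
    nets = [ipaddress.ip_network(c) for c in cidrs]
    return [str(n) for n in ipaddress.collapse_addresses(nets)]

# Two aligned /25s collapse into one /24:
print(coalesce(['192.0.2.0/25', '192.0.2.128/25']))  # ['192.0.2.0/24']

# The pair from the output above does NOT collapse: a /30 containing
# .70-.73 would have to start at .68 or .72, so they remain two /31s.
print(coalesce(['216.73.93.70/31', '216.73.93.72/31']))
```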
Stuart D Gathman
2013-02-13 04:10:07 UTC
Permalink
Post by Stuart D Gathman
Post by Constantine A. Murenin
Where has it been implemented? Please provide me a link, I'm very
interested in applying it to my setup.
http://spidey2.bmsi.com/pyspf/spf.py
(('softfail', 250, 'domain owner discourages use of this host'), '~all')
216.239.32.0/19
64.233.160.0/19
66.249.80.0/20
72.14.192.0/18
209.85.128.0/17
66.102.0.0/20
74.125.0.0/16
64.18.0.0/20
207.126.144.0/20
173.194.0.0/16
216.73.93.70/31
216.73.93.72/31
This needs to ensure that all paths are followed, and combine adjacent
blocks (e.g. the last two above). It also needs to handle IP6.
Hmm, using 0.0.0.0 to "never match" is a good hack, but fails with
fancy includes - so I'll need to handle those.

It needs to keep 4 separate ipsets for Pass, Fail, SoftFail, and Neutral.
I'll make a separate method for computing ipsets.
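The per-result split could be sketched like this for the static mechanisms (illustrative only; pyspf's real evaluation handles a/mx/include recursively and macros besides):

```python
import re

# SPF qualifier prefixes and the results they map to; '+' is the default.
QUALIFIERS = {'+': 'Pass', '-': 'Fail', '~': 'SoftFail', '?': 'Neutral'}

def ipsets_by_result(spf_record):
    """Bucket static ip4:/ip6: mechanisms into one set per SPF result."""
    sets = {r: set() for r in QUALIFIERS.values()}
    for term in spf_record.split()[1:]:          # skip "v=spf1"
        m = re.match(r'^([-+~?]?)(ip4|ip6):(.+)$', term)
        if m:
            qual = m.group(1) or '+'             # no qualifier means Pass
            sets[QUALIFIERS[qual]].add(m.group(3))
    return sets

r = ipsets_by_result("v=spf1 ip4:192.0.2.0/24 ~ip4:198.51.100.0/24 -all")
print(r['Pass'], r['SoftFail'])
```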
Constantine A. Murenin
2013-02-13 05:49:17 UTC
Permalink
Post by Stuart D Gathman
Post by Stuart D Gathman
Post by Constantine A. Murenin
Where has it been implemented? Please provide me a link, I'm very
interested in applying it to my setup.
http://spidey2.bmsi.com/pyspf/spf.py
(('softfail', 250, 'domain owner discourages use of this host'), '~all')
216.239.32.0/19
64.233.160.0/19
66.249.80.0/20
72.14.192.0/18
209.85.128.0/17
66.102.0.0/20
74.125.0.0/16
64.18.0.0/20
207.126.144.0/20
173.194.0.0/16
216.73.93.70/31
216.73.93.72/31
This needs to ensure that all paths are followed, and combine adjacent
blocks (e.g. the last two above). It also needs to handle IP6.
Hmm, using 0.0.0.0 to "never match" is a good hack, but fails with
fancy includes - so I'll need to handle those.
It needs to keep 4 separate ipsets for Pass, Fail, SoftFail, and Neutral.
I'll make a separate method for computing ipsets.
Seems great!

Also, make sure not to use "exists", either -- don't want it to have
undefined effects.

% dig +short txt rambler.ru
"v=spf1 ip4:81.19.66.0/23 ip4:81.19.88.0/24 ip4:81.19.92.32/27
-exists:%{ir}.spf.rambler.ru -exists:%{l}.u.spf.rambler.ru ~all"
%

Do you have it in a git repo I could fork?

C.
Stuart D Gathman
2013-02-13 15:33:09 UTC
Permalink
Post by Constantine A. Murenin
Also, make sure not to use "exists", either -- don't want it to have
undefined effects.
% dig +short txt rambler.ru
"v=spf1 ip4:81.19.66.0/23 ip4:81.19.88.0/24 ip4:81.19.92.32/27
-exists:%{ir}.spf.rambler.ru -exists:%{l}.u.spf.rambler.ru ~all"
%
Do you have it in a git repo I could fork?
pyspf is CVS at sourceforge:

http://pymilter.cvs.sourceforge.net/viewvc/pymilter/pyspf/

When I modernize, it will be to bzr. Sorry :-)

I haven't checked in my ipset hack yet.
Scott Kitterman
2013-02-13 02:11:30 UTC
Permalink
Post by Constantine A. Murenin
Post by Stuart D Gathman
Post by Constantine A. Murenin
I'd guesstimate that a setup with pf(4) whitelisting of common MTAs
through the SPF harvesting approach described, together with
greylisting at the firewall level, for my domains would be much more
effective in combating spam than any kind of SPF or DKIM
implementations at my MTA level, and without the false positives.
But you are doing the same thing as SPF - except guessing the valid MTAs
instead of using the official list conveniently provided in SPF records.
If you are worried about efficiency, SPF records that don't involve
localpart or PTR macros can be resolved to a set of IPs (more general
than a netblock) with a TTL, and cached. I believe libspf2 has this
feature already. Take the union of the IP sets of all your "good"
domains if you are going to treat them all the same.
No. I cannot have my firewall do SPF evaluations, and I'm not
attempting to do the same thing as SPF. I am also not guessing valid
MTAs, I'm getting their list deterministically based on static SPF
information that is published by relevant entities whose mail I might
care to never delay.
Let's do a quick reality check here: SPF cannot block spam. It can
only block forgeries, and if you configure it as such, it is also very
much so likely to have many false positives, too.
* collect SPF information statically (every week or so) and compile a
list of netblocks
* pass connections from such whitelist directly to sendmail/qmail/mta of choice
* pass other connections to spamd, to do greylisting
* if a host passes greylisting, spamd updates the firewall, and the
next connection attempt will go to sendmail/qmail/etc
The novelty of my approach is collecting a whitelist through an
SPF-based record harvesting.
SPF-wise, this approach has limitations of, for example, obviously not
supporting SPF's "ptr" and "exists" declarations, and also not being
real-time; however, even for such domains that rely on such features,
they'd still be able to deliver mail after passing greylisting, and
with no false positives.
Non-SPF-wise, this approach might make SPF actually useful for those
who think that it is not, and would complement any greylisting policy
with very little overhead and a 0,000% false positive rate.
Yes, I'm not trying to use SPF as it was designed, but who cares?
It's not like we use IPv4 for what it was originally designed either,
is it?
Right now I'm trying to see if someone has already done this before,
as it seems simple enough. Else, some ideas on a possible
implementation, or perhaps just stirring some interest amongst people
who are thinking of better ways to use the SPF records that remain
largely unused.
This seems way harder than just whitelisting mail from the domains on your list that pass SPF. You are assuming that all mail from those hosts is as trustworthy as the mail from your list of good domains. I don't think that is a safe assumption.

Scott K
Stuart D Gathman
2013-02-13 02:41:15 UTC
Permalink
Post by Scott Kitterman
This seems way harder than just whitelisting mail from the domains on
your list that pass SPF. You are assuming that all mail from those
hosts is as trustworthy as the mail from your list of good domains. I
don't think that is a safe assumption.
He is only whitelisting mail from the domains on his list - but by
statically compiling the IPs like libspf2 does, except that he is
coarsening the resolution of the set to netblocks. I was suggesting
that he might not want to do the netblocks thing when starting with
SPF records for the "good" domains.
Constantine A. Murenin
2013-02-13 04:02:15 UTC
Permalink
Post by Stuart D Gathman
Post by Scott Kitterman
This seems way harder than just whitelisting mail from the domains on
your list that pass SPF. You are assuming that all mail from those
hosts is as trustworthy as the mail from your list of good domains. I
don't think that is a safe assumption.
He is only whitelisting mail from the domains on his list - but by
statically compiling the IPs like libspf2 does, except that he is
coarsening the resolution of the set to netblocks. I was suggesting
that he might not want to do the netblocks thing when starting with
SPF records for the "good" domains.
I now see your misunderstanding.

I never said that I wanted to coarsen the SPF data into the resolution
of netblocks. That just makes very little sense on so many levels.

What I meant is, when Google.com provides this:

% dig +short txt gmail.com; dig +short txt _spf.google.com; dig +short txt _netblocks.google.com
"v=spf1 redirect=_spf.google.com"
"v=spf1 include:_netblocks.google.com include:_netblocks2.google.com include:_netblocks3.google.com ?all"
"v=spf1 ip4:216.239.32.0/19 ip4:64.233.160.0/19 ip4:66.249.80.0/20 ip4:72.14.192.0/18 ip4:209.85.128.0/17 ip4:66.102.0.0/20 ip4:74.125.0.0/16 ip4:64.18.0.0/20 ip4:207.126.144.0/20 ip4:173.194.0.0/16 ?all"
%
Etc.

And NetBSD provides this:

% dig +short txt netbsd.org ; dig +short mx netbsd.org ; dig +short mail.netbsd.org ; dig +short mail.netbsd.org aaaa
10 mail.netbsd.org.
149.20.53.66
2001:4f8:3:7::25
%

And so and so trusted site provides etc.

Then I want to take all of those exact netblocks, IPv4 and IPv6
specifications from SPF, and MX resolutions, and harvest them into
a whitelist of netblocks (again, without any kind of coarsening of
gathered data), and use such list as a whitelist in my greylisting
setup, to avoid an unnecessary delay of mail from hosts that are
very much so unlikely to be spammers, or from whom all mail would
be accepted anyways, since they would be extremely unlikely not to
pass the greylisting, so, why delay the inevitable?

But instead of compiling such a list manually, and having it
become outdated or cumbersome within mere days, or base it on
he-said-she-said style of information, I'd like to have a script
that could automatically generate such a list, and automatically
update it weekly etc.
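The final normalization step, folding SPF netblocks and bare MX addresses into one file of pf-table entries, might look like this (a sketch; the function name and file layout are mine):

```python
import ipaddress

def to_table_lines(entries):
    """Normalize harvested entries (CIDRs and bare IPs, v4 and v6) into
    sorted, deduplicated one-per-line entries for a pf table file."""
    lines = []
    for e in entries:
        # bare addresses become host networks (/32 or /128)
        net = ipaddress.ip_network(e, strict=False)
        lines.append(str(net))
    return sorted(set(lines))

# e.g. a Google SPF netblock plus the NetBSD MX's A and AAAA records
harvested = ['216.239.32.0/19', '149.20.53.66', '2001:4f8:3:7::25']
print('\n'.join(to_table_lines(harvested)))
```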

I hope this clarifies my intentions.

Cheers,
Constantine.
Scott Kitterman
2013-02-13 04:45:18 UTC
Permalink
Post by Constantine A. Murenin
Post by Stuart D Gathman
Post by Scott Kitterman
This seems way harder than just whitelisting mail from the domains on
your list that pass SPF. You are assuming that all mail from those
hosts is as trustworthy as the mail from your list of good domains. I
don't think that is a safe assumption.
He is only whitelisting mail from the domains on his list - but by
statically compiling the IPs like libspf2 does, except that he is
coarsening the resolution of the set to netblocks. I was suggesting
that he might not want to do the netblocks thing when starting with
SPF records for the "good" domains.
I now see your misunderstanding.
I never said that I wanted to coarsen the SPF data into the resolution
of netblocks. That just makes very little sense on so many levels.
% dig +short txt gmail.com; dig +short txt _spf.google.com; dig +short txt _netblocks.google.com
"v=spf1 redirect=_spf.google.com"
"v=spf1 include:_netblocks.google.com include:_netblocks2.google.com include:_netblocks3.google.com ?all"
"v=spf1 ip4:216.239.32.0/19 ip4:64.233.160.0/19 ip4:66.249.80.0/20 ip4:72.14.192.0/18 ip4:209.85.128.0/17 ip4:66.102.0.0/20 ip4:74.125.0.0/16 ip4:64.18.0.0/20 ip4:207.126.144.0/20 ip4:173.194.0.0/16 ?all"
%
Etc.
% dig +short txt netbsd.org ; dig +short mx netbsd.org ; dig +short mail.netbsd.org ; dig +short mail.netbsd.org aaaa
10 mail.netbsd.org.
149.20.53.66
2001:4f8:3:7::25
%
And so and so trusted site provides etc.
And so and so trusted site provides etc.
Then I want to take all of those exact netblocks, IPv4 and IPv6
specifications from SPF, and MX resolutions, and harvest them into
a whitelist of netblocks (again, without any kind of coarsening of
gathered data), and use such list as a whitelist in my greylisting
setup, to avoid an unnecessary delay of mail from hosts that are
very much so unlikely to be spammers, or from whom all mail would
be accepted anyways, since they would be extremely unlikely not to
pass the greylisting, so, why delay the inevitable?
But instead of compiling such a list manually, and having it
become outdated or cumbersome within mere days, or base it on
he-said-she-said style of information, I'd like to have a script
that could automatically generate such a list, and automatically
update it weekly etc.
I hope this clarifies my intentions.
I understand your intentions. My point is that just because an MTA sends mail
from one domain that you think is a safe one doesn't mean it doesn't also
send mail for other, less savory, domains. Skipping greylisting on SPF pass
for one of your 'good' domains would accomplish what you're after without also
giving a free pass to other domains that may not be so friendly. Tumgreyspf
does something similar to this.

Scott K
alan
2013-02-13 00:57:46 UTC
Permalink
First, I would say yes: for the sort of greylisting you are talking about, pre-whitelisting IPs from SPF records is not a bad way to cut down on initial delays.

Secondly, if the IPs are not available via an SPF record, do not assume "a mx": the "a" host is almost never a legit MTA (it's usually a website), and of the MX hosts usually at most one actually sends mail; most of the others are receive-only, and often all of them are.

Thirdly, yes, this will save your greylisting some work on the initial DB build, but greylisting itself will not reduce spam much, as most modern spambots also retry later and so pass greylisting anyway (which is why they do this, and TLS, nowadays).

Fourthly, you should still use SPF/DKIM/SpamAssassin-at-DATA/etc. on your MTA for connections that have made it past greylisting, because a lot of spam still comes via greylist-passing bots and legit mailservers with compromised users.

Either way, is this use of SPF really relevant to this list?

Greylisting done properly is not a concern for legit senders, as it only delays their first email via each server for each new greylisted user, so less than any % of use.
Post by Constantine A. Murenin
Post by Stuart D Gathman
Post by Constantine A. Murenin
I'd guesstimate that a setup with pf(4) whitelisting of common MTAs
through the SPF harvesting approach described, together with
greylisting at the firewall level, for my domains would be much more
effective in combating spam than any kind of SPF or DKIM
implementations at my MTA level, and without the false positives.
But you are doing the same thing as SPF - except guessing the valid MTAs
instead of using the official list conveniently provided in SPF records.
If you are worried about efficiency, SPF records that don't involve
localpart or PTR macros can be resolved to a set of IPs (more general
than a netblock) with a TTL, and cached. I believe libspf2 has this
feature already. Take the union of the IP sets of all your "good"
domains if you are going to treat them all the same.
No. I cannot have my firewall do SPF evaluations, and I'm not
attempting to do the same thing as SPF. I am also not guessing valid
MTAs, I'm getting their list deterministically based on static SPF
information that is published by relevant entities whose mail I might
care to never delay.
Let's do a quick reality check here: SPF cannot block spam. It can
only block forgeries, and if you configure it as such, it is also very
much so likely to have many false positives, too.
* collect SPF information statically (every week or so) and compile a
list of netblocks
* pass connections from such whitelist directly to sendmail/qmail/mta of choice
* pass other connections to spamd, to do greylisting
* if a host passes greylisting, spamd updates the firewall, and the
next connection attempt will go to sendmail/qmail/etc
The novelty of my approach is collecting a whitelist through an
SPF-based record harvesting.
SPF-wise, this approach has limitations of, for example, obviously not
supporting SPF's "ptr" and "exists" declarations, and also not being
real-time; however, even for such domains that rely on such features,
they'd still be able to deliver mail after passing greylisting, and
with no false positives.
Non-SPF-wise, this approach might make SPF actually useful for those
who think that it is not, and would complement any greylisting policy
with very little overhead and a 0,000% false positive rate.
Yes, I'm not trying to use SPF as it was designed, but who cares?
It's not like we use IPv4 for what it was originally designed either,
is it?
Right now I'm trying to see if someone has already done this before,
as it seems simple enough. Else, some ideas on a possible
implementation, or perhaps just stirring some interest amongst people
who are thinking of better ways to use the SPF records that remain
largely unused.
Best regards,
Constantine.
Nicolai
2013-02-13 19:05:06 UTC
Permalink
Post by alan
either way this use of SPF is not relevant to this list really?
Well, seeing as there hasn't been any discussion on the list since
September, it can't hurt, right? Besides, there was some discussion of
pyspf, which I didn't know about.

To the OP, here's what I'd do in your shoes: grab a good domain
whitelist and generate IPs & netblocks using some SPF tool. If a
whitelist is not available to you, just generate your own based on
normal traffic, using the top domains list from Alexa as a guide.

http://s3.amazonaws.com/alexa-static/top-1m.csv.zip
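If it helps, that CSV is just rank,domain pairs, so taking the top N domains as whitelist candidates is a few lines (an illustrative helper, name mine):

```python
import csv
import io

def top_domains(csv_text, n):
    """Return the first n domains from rank,domain CSV rows."""
    reader = csv.reader(io.StringIO(csv_text))
    return [row[1] for _, row in zip(range(n), reader)]

sample = "1,google.com\n2,facebook.com\n3,youtube.com\n"
print(top_domains(sample, 2))  # ['google.com', 'facebook.com']
```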

Nicolai
