porn filtering, blacklists, and squid log file analysis


porn filtering, blacklists, and squid log file analysis

dave-5
Hello,
    I'm trying to implement porn filtering, and I'm testing a variety of
setups to see which gives the best results. In every setup I'm using squid
(the 2.6 port on FreeBSD) as a transparent proxy. Setup 1 uses squidGuard
with the MESD blacklist. Adding the MESD list improved things: a lot of
previously accessible sites are now blocked. However, my volunteer, who has
a test machine for this, was still able to use Google to pull up
pornographic images whose names give no hint of their content, as well as
sites that aren't on the list. I update the blacklist every night, but I
need to write a script that goes through access.log, finds the accesses from
a given machine and where they went, and builds a list of sites. It would
then go through that list, eliminate duplicate entries, and check which
domains still work; those that do would be added automatically to a custom
squidGuard blacklist, after which squidGuard is reconfigured and squid
reloaded.
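
The last part of that workflow (keep only the domains that still work, then
add them to a custom list) could be sketched roughly as follows. This is a
minimal sketch under stated assumptions, not a tested production script:
"still works" is taken to mean "still resolves in DNS", the helper names
resolves and merge_into_blacklist are hypothetical, and the custom list is
treated as a plain text file with one domain per line.

```python
import socket


def resolves(domain, lookup=socket.gethostbyname):
    """Return True if the domain currently resolves in DNS.

    Assumption: "still works" means "has a DNS record", which is cheaper
    than fetching a page but is only an approximation.
    """
    try:
        lookup(domain)
        return True
    except (OSError, UnicodeError):
        # socket.gaierror is a subclass of OSError; UnicodeError covers
        # hostnames that cannot be encoded for the lookup.
        return False


def merge_into_blacklist(candidates, existing, is_live=resolves):
    """Return the existing entries plus the live, not-yet-listed candidates.

    candidates: iterable of domain names pulled from the log
    existing:   iterable of lines already in the custom domains file
    is_live:    callable deciding whether a domain still works
    """
    merged = {line.strip().lower() for line in existing if line.strip()}
    for domain in candidates:
        name = domain.strip().lower()
        # Skip blanks and known entries before doing a (slow) lookup.
        if name and name not in merged and is_live(name):
            merged.add(name)
    return sorted(merged)
```

The returned list would be written back to the custom domains file, one entry
per line, before rebuilding the squidGuard database (squidGuard -C all) and
telling squid to reload (squid -k reconfigure). Note that the candidate list
contains every host the test box visited, including legitimate ones, so it
still needs some filtering before it is blocked wholesale.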
    So far I use grep on access.log to pick out only the accesses from the
machine I want (my test box) and put them in another file. I then use cut to
take out the URL of the page (I think it's field 10 or 11) and drop that into
a third file. The problem is that the resulting file contains 9500 entries,
and going through it manually isn't an option. If anyone can help with this,
I can put the file somewhere it can be downloaded.
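
The grep and cut steps, plus the duplicate removal, can be folded into one
pass. The sketch below assumes squid's default (native) access.log format, in
which the client address is the 3rd whitespace-separated field, the request
method the 6th, and the URL the 7th. Because cut with a single-space
delimiter also counts the padding spaces in each line, the field number it
needs can come out higher than 7. The function name hosts_for_client is
hypothetical.

```python
from urllib.parse import urlsplit


def hosts_for_client(lines, client_ip):
    """Return the sorted, de-duplicated hostnames requested by client_ip.

    Assumes native squid access.log lines:
    time elapsed client code/status bytes method URL ident hierarchy type
    """
    hosts = set()
    for line in lines:
        fields = line.split()  # split() collapses runs of padding spaces
        if len(fields) < 7 or fields[2] != client_ip:
            continue
        method, url = fields[5], fields[6]
        if method == "CONNECT":
            # CONNECT requests are logged as host:port rather than a URL.
            host = url.rsplit(":", 1)[0]
        else:
            host = urlsplit(url).hostname  # None if the field is not a URL
        if host:
            hosts.add(host.lower())
    return sorted(hosts)
```

Called as hosts_for_client(open("access.log"), "192.168.1.50"), where both the
path and the address are placeholders, this reduces thousands of URL lines to
one entry per hostname, which is usually a much shorter list to review.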
    On the subject of blacklists: aside from the MESD list, are there any
other lists for squid/squidGuard that are free, or free for noncommercial
purposes?
    My second setup involves dansguardian. I have two issues with it. First,
the last time I tried it, it worked, but I never stress-tested it to the
extent I'm planning now. Second, it seemed to slow the internet down very
noticeably, to the point where everyone was telling me about it. I've got
squid running as a transparent proxy using pf and I'd like to keep that
arrangement; last time I had to change it. If there's an alternative, I'm
open to suggestions.
Thanks.
Dave.


Re: porn filtering, blacklists, and squid log file analysis

Adrian Chadd
Look at urlblacklist.com, and don't be afraid to pay their monthly
subscription fee. It feeds right into dansguardian.


Adrian
