Introducing Charcoal - Centralised URL Filter for squid

Nishant Sharma
Hi,

We are excited to invite early users to test drive Charcoal
(http://charcoal.io) - a Squid URL Rewriter for distributed proxies.

Charcoal is designed to help administrators manage access rules for
their proxies in one place with a GUI, instead of editing the
configuration of individual proxy servers.

It grew out of our need to manage ACLs for 100+ proxy servers on
embedded devices (OpenWRT/LEDE) running at our customer offices across
India. We are releasing it in the hope that it will be useful for Squid
users who have to manage multiple proxy servers every day.

The architecture is an API-key-driven client-server design: a Squid
url-rewrite helper contacts the server to query access controls for
incoming requests.
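
The request flow described above can be sketched as a minimal
url-rewrite helper loop. This is an illustration in Python, not the
actual Perl helper. It assumes the Squid 3.4 or later reply syntax with
concurrency enabled (so each line starts with a channel ID), and
`is_blocked` and `BLOCK_PAGE` are hypothetical stand-ins for the query
to the Charcoal server and for the page a denied request is sent to.

```python
from urllib.parse import urlsplit

# Hypothetical page that denied requests are redirected to.
BLOCK_PAGE = "http://proxy.example/blocked.html"


def extract_host(url):
    """Return the lower-cased host from a request URL or a CONNECT host:port."""
    if "://" in url:
        return (urlsplit(url).hostname or "").lower()
    return url.rsplit(":", 1)[0].lower()


def handle_line(line, is_blocked):
    """Turn one Squid request line into one reply line.

    With concurrency enabled Squid sends: channel-ID URL client-ip/fqdn ...
    is_blocked(host, client_ip) is a placeholder for the server lookup.
    """
    fields = line.split()
    if len(fields) < 3:
        # Malformed input: report a helper failure (BH) on that channel.
        return f"{fields[0] if fields else 0} BH"
    channel, url, client = fields[0], fields[1], fields[2]
    client_ip = client.split("/", 1)[0]
    if is_blocked(extract_host(url), client_ip):
        return f'{channel} OK status=302 url="{BLOCK_PAGE}"'
    return f"{channel} ERR"  # ERR means "leave the URL unchanged"


def serve(stdin, stdout, is_blocked):
    """Read requests until Squid closes stdin, answering each one in turn."""
    for line in stdin:
        stdout.write(handle_line(line, is_blocked) + "\n")
        stdout.flush()  # Squid waits for the reply, so never buffer it
```

A real helper would call something like `serve(sys.stdin, sys.stdout,
is_blocked)` as its entry point, with `is_blocked` doing the network
query.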

Current features:
-----------------
- Supports Squid 2.x and 3.x
- 70+ pre-existing domains blacklist
- Custom destination groups/categories
- Custom source groups for IPs and Networks (usernames in the pipeline)
- As of now only domain filter support (no full url filtering)
- API key driven
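
The domain filter and source groups listed above could work roughly as
follows. This is only a sketch: the suffix-matching rule, the group
layout and all names here are assumptions for illustration, not
Charcoal's actual data model.

```python
import ipaddress


def domain_in_list(host, blocked_domains):
    """True if host or any parent domain of it is in blocked_domains."""
    labels = host.lower().rstrip(".").split(".")
    # Check "a.b.c", then "b.c", then "c" so that subdomains also match.
    return any(".".join(labels[i:]) in blocked_domains for i in range(len(labels)))


def source_group(client_ip, groups):
    """Return the name of the first group whose networks contain client_ip."""
    addr = ipaddress.ip_address(client_ip)
    for name, networks in groups.items():
        if any(addr in ipaddress.ip_network(net) for net in networks):
            return name
    return None
```

Matching on whole labels means that blocking `ads.example` also blocks
`tracker.ads.example` but not the unrelated `badads.example`.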

Configuration:
--------------
- Download the helper from
https://raw.githubusercontent.com/Hopbox/charcoal-helper/master/squid/charcoal-helper.pl
- Make sure the IO::Socket module for Perl is installed.
- Add the following lines to squid.conf after downloading the helper:

url_rewrite_program /path/to/charcoal-helper.pl YOUR_API_KEY
url_rewrite_children X startup=Y idle=Z concurrency=1

YOUR_API_KEY for our hosted Charcoal service can be requested by filling
in the form at http://charcoal.io or writing in to [hidden email].
The credentials for login to https://active.charcoal.io to manage the
ACL will be emailed along with YOUR_API_KEY.

License:
--------
URL Rewrite helper for squid is licensed under GPLv2.0 while Charcoal
Server is licensed under AGPLv3.0.

GIT Repo:
---------
Squid URL Rewrite helper can be downloaded from
https://github.com/Hopbox/charcoal-helper

Git repository for Charcoal Server is at https://github.com/Hopbox/charcoal

Regards,
Nishant
_______________________________________________
squid-users mailing list
[hidden email]
http://lists.squid-cache.org/listinfo/squid-users

Re: Introducing Charcoal - Centralised URL Filter for squid

Benjamin E. Nichols
This sounds great, and would you mind specifying the source of the
blacklist data at the core of your services?

In other words, what I dare ask you is this, and I'm sure others might
want to know: are you using the blacklists from shalla, UT1, or
urlblacklist? Or have you developed your own domain management technology?


--
Signed,

Benjamin E. Nichols

http://www.squidblacklist.org



Re: Introducing Charcoal - Centralised URL Filter for squid

Nishant Sharma
Hi Benjamin,

On Wednesday 14 June 2017 08:22 PM, Benjamin E. Nichols wrote:
> This sounds great, and would you mind specifying the source of the
> blacklist data at the core of your services?
>
> In other words, what I dare ask you is this, and I'm sure others might
> want to know, are you using the blacklists from shalla, UT1, or
> urlblacklist? Or have you developed your own domain management technology?
>

Thanks for the kind words.

For the test run, we are using Shalla.

I understand that the quality of blacklists matters. It is also possible
to mix and match multiple blacklists, which would be the ideal scenario
with most of the bases covered. Whether we do that depends on the user
base and the cost of sourcing the blacklists.

Right now, our first priority is to fix a handful of bugs reported just
after the announcement.

Thanks & Regards,
Nishant
|

Re: Introducing Charcoal - Centralised URL Filter for squid

Eliezer Croitoru
In reply to this post by Nishant Sharma
Hey Nishant,

I want to offer you a more advanced helper that supports actual concurrency, compared to the current Perl helper on GitHub,
which understands the protocol but does not use threads or any other method of concurrency.

Let me know if it's of any interest for you.
The skeleton is at:
http://wiki.squid-cache.org/EliezerCroitoru/GolangFakeHelper

I am willing to take my time and write the code for you. So..

Eliezer

----
Eliezer Croitoru
Linux System Administrator
Mobile: +972-5-28704261
Email: [hidden email]



Re: Introducing Charcoal - Centralised URL Filter for squid

Nishant Sharma


Hi Eliezer,

On 14 June 2017 11:07:16 PM IST, Eliezer  Croitoru <[hidden email]> wrote:

>I want to offer you a more advanced helper that supports actual
>concurrency compared to the current perl helper on github,
>which understands the protocol but do not use threads or any other
>method of concurrency.
>
>Let me know if it's of any interest for you.
>The skeleton is at:
>http://wiki.squid-cache.org/EliezerCroitoru/GolangFakeHelper

Thanks a lot for the offer. It surely is interesting.

The current state of the helper is due to the fact that it was written for embedded/low-powered devices running Linux. OpenWrt doesn't cross-compile Go as of now, so we had to go for Perl. It is good enough for low-traffic proxies at small offices.

We are modifying it as per the recommendations from Amos and will check in the updated code soon.

>I am willing to take my time and write the code for you. So..

Glad to know about your willingness to write it in Go. It will help the community at large to run it on more powerful machines that serve a lot of requests.

Another version of the helper that we are writing will use memcached on the local proxy to cache access decisions returned by the cloud server, which should greatly increase speed.
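
The caching idea can be illustrated with a small time-limited cache. To
keep the sketch self-contained it uses an in-process dictionary in place
of memcached, so it is not the planned implementation; the class name,
the TTL and the `fetch` callback (standing in for the query to the cloud
server) are all assumptions.

```python
import time


class DecisionCache:
    """Remember access decisions for `ttl` seconds to avoid repeat lookups."""

    def __init__(self, ttl, clock=time.monotonic):
        self._ttl = ttl
        self._clock = clock
        self._entries = {}  # key -> (expiry time, decision)

    def get_or_fetch(self, key, fetch):
        """Return the cached decision for key, calling fetch(key) on a miss."""
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]
        decision = fetch(key)  # placeholder for the remote lookup
        self._entries[key] = (now + self._ttl, decision)
        return decision
```

A short TTL keeps rule changes made in the GUI from being hidden by
stale entries for long, at the cost of more lookups.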

Regards,
Nishant
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
|

Re: Introducing Charcoal - Centralised URL Filter for squid

Eliezer Croitoru
I wanted to be sure I am not day-dreaming, but from the code it seems that every request is given its own TCP connection.
Am I right?
If so, there is much to improve.
You can use the same TCP connection for more than a single request, and also have a reconnect option for the very unlikely case of a closed connection.
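
A rough sketch of that suggestion, in Python for brevity: one connection
is kept open across requests and reopened once if it turns out to be
closed. The line-based request/reply exchange and the class name are
assumptions for illustration; the real Charcoal wire protocol is not
shown in this thread.

```python
import socket


class PersistentClient:
    """Send line-based queries over one TCP connection, reconnecting on failure."""

    def __init__(self, host, port, connect=socket.create_connection, timeout=5.0):
        self._addr = (host, port)
        self._connect = connect  # injectable so the class can be tested offline
        self._timeout = timeout
        self._sock = None
        self._reader = None

    def _open(self):
        self._sock = self._connect(self._addr, self._timeout)
        self._reader = self._sock.makefile("r", encoding="utf-8", newline="\n")

    def close(self):
        if self._reader is not None:
            self._reader.close()
        if self._sock is not None:
            self._sock.close()
        self._sock = self._reader = None

    def query(self, line):
        """Send one line and return the one-line reply, retrying once."""
        for attempt in (1, 2):
            try:
                if self._sock is None:
                    self._open()
                self._sock.sendall((line + "\n").encode("utf-8"))
                reply = self._reader.readline()
                if reply == "":
                    # An empty read means the server closed the connection.
                    raise ConnectionError("server closed the connection")
                return reply.rstrip("\n")
            except OSError:
                self.close()
                if attempt == 2:
                    raise
```

The second attempt covers the rare case where the server dropped an idle
connection between two requests.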

All The Bests,
Eliezer

----
Eliezer Croitoru
Linux System Administrator
Mobile: +972-5-28704261
Email: [hidden email]


|

Re: Introducing Charcoal - Centralised URL Filter for squid

Amos Jeffries
Administrator
On 17/06/17 19:07, Eliezer Croitoru wrote:
> I wanted to be sure I am not day-dreaming but from the code it seems that every request is given a single TCP connection.
> Am I right?
> If so there is much to improve.

You are seeing it correctly. That is one of the things I brought up and
it is already being worked on. See issue #3 in their tracker.

Amos

|

Re: Introducing Charcoal - Centralised URL Filter for squid

Nishant Sharma
In reply to this post by Eliezer Croitoru
Hi Eliezer,

On Saturday 17 June 2017 12:37 PM, Eliezer  Croitoru wrote:
> I wanted to be sure I am not day-dreaming but from the code it seems that every request is given a single TCP connection.
> Am I right?
> If so there is much to improve.
> You can use the same tcp connection for more than a single request and also have a reconnect option for the very far from reality case of a closed connection.

Your observation is correct.

We have updated the helper with the latest commit
https://github.com/Hopbox/charcoal-helper/commit/2cd3a0f985c2083046267eee82f6c7df16113113

It tries to address the issues you mentioned, but is not yet ideal.
Since the helper is invoked by Squid, the number of children started
depends on Squid. The total number of sockets in the ESTABLISHED and
CLOSE_WAIT states equals the number of helper children started by Squid.

Maybe the helper architecture could be changed so that a parent process
creates a pool of network connections that its children use, thus
limiting the number of sockets in use at any moment. Squid would then
control the number of those parent processes.
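
To make the pooling idea concrete, here is a small sketch of a bounded
connection pool. It is a simplification: it shares connections between
threads inside one process rather than between a parent and child
processes, and `factory` is a hypothetical callable that opens one
connection to the server.

```python
import queue
from contextlib import contextmanager


class ConnectionPool:
    """Hand out at most `size` connections; extra borrowers wait their turn."""

    def __init__(self, factory, size):
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(factory())  # open every connection up front

    @contextmanager
    def borrow(self, timeout=None):
        """Yield an idle connection and return it to the pool afterwards."""
        conn = self._idle.get(timeout=timeout)  # blocks while all are in use
        try:
            yield conn
        finally:
            self._idle.put(conn)
```

A worker would wrap each lookup in `with pool.borrow() as conn:`, so the
number of open sockets never exceeds the pool size no matter how many
workers are running.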

Regards,
Nishant
|

Re: Introducing Charcoal - Centralised URL Filter for squid

Amos Jeffries
Administrator
On 17/06/17 21:59, Nishant Sharma wrote:
> May be, the helper architecture could be changed such that a parent
> process creates a pool of network connections that children use. Thus,
> limiting the number of sockets being used at any moment. And squid
> controls the number of those parent processes.

That would mean making Squid aware of the internal workings of the
helper. Namely that it uses connections to a specific server, port and
which transport. One of the major points of flexibility with helpers is
that this kind of thing is kept completely separate from Squid.

The URL-rewrite API being used by charcoal has the purpose of altering
the URI from which Squid fetches content for a client. Doing access
control through it instead of the access control API (external ACL
helper) is kind of borked from the start.

Amos

|

Re: Introducing Charcoal - Centralised URL Filter for squid

Nishant Sharma


On 17 June 2017 11:17:38 PM IST, Amos Jeffries <[hidden email]> wrote:

>That would mean making Squid aware of the internal workings of the
>helper. Namely that it uses connections to a specific server, port and
>which transport. One of the major points of flexibility with helpers is
>
>that this kind of thing is kept completely separate from Squid.

Re-reading my mail, I realise it suggested that the helper architecture of Squid be modified. What I meant was that we can modify the architecture of our own helper so that it internally manages its own children, which may speed up the URL rewrite process.

>The URL-rewrite API being used by charcoal has the purpose of altering
>the URI which Squid fetches content for a client from. Doing access
>control through it instead of the access control API (external ACL
>helper) is kind of borked from the start.

I agree. An external ACL helper would also give access to additional information such as user-agent, reply content-type etc., allowing more granular control.
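
For comparison with the url-rewrite approach, the reply logic of an
external ACL helper might look like the sketch below. It assumes the
helper is configured with concurrency enabled and a `%SRC %DST` format,
so every input line is "channel-ID client-ip destination-host";
`is_allowed` is a hypothetical callback standing in for the lookup on
the Charcoal server.

```python
def acl_reply(line, is_allowed):
    """Answer one external ACL lookup with OK (match) or ERR (no match)."""
    parts = line.split()
    if len(parts) != 3:
        # Unexpected input: BH tells Squid the helper could not decide.
        channel = parts[0] if parts else "0"
        return f"{channel} BH"
    channel, src, dst = parts
    verdict = "OK" if is_allowed(src, dst) else "ERR"
    return f"{channel} {verdict}"
```

Here the helper only answers whether the ACL matches; allowing or
denying the request is then decided by the http_access rules in
squid.conf that refer to that ACL.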

Regards,
Nishant
|

Re: Introducing Charcoal - Centralised URL Filter for squid

Eliezer Croitoru
Hey Nishant,

Responding to your idea and the whole concept of the helper, and also comparing it to GoLang binaries.

About designing the software to run on top of embedded hardware, and a GoLang binary:
A GoLang helper can be compiled for almost any modern embedded device (other than MIPS-based ones).
It is also more efficient than any software you will write in Perl.
The only "limits" are CPU compatibility and binary size versus free space on the device (in any case a Perl helper would use more than a GoLang one).

The idea of writing software that implements concurrency in Perl or Python is nice and noble, but I believe that
for small embedded devices you probably won't need a "robust" helper that supports concurrency or any other more complex solution.

I believe that you should aim for the more standard hardware platforms on which Squid can be built, such as:
- x86
- x86_64
- arm64
- arm5
- arm8

The above will benefit from a good and robust helper which supports concurrency.
Now that it's clear that your socket can handle more than one request, I will write a helper in GoLang that works with:
- concurrency
- better connection handling (being able to handle responses whenever they are received)
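
The two points above can be sketched as follows. This is an illustration
in Python, not the GoLang helper being offered: requests are handed to a
thread pool and each reply is written as soon as it is ready, tagged
with the channel ID so that Squid can match replies arriving out of
order. `decide` is a hypothetical callback that returns the reply body
for one request, for example "ERR".

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def serve_concurrently(stdin, stdout, decide, workers=8):
    """Answer helper requests in parallel, replying in completion order."""
    write_lock = threading.Lock()

    def work(line):
        channel, _, request = line.rstrip("\n").partition(" ")
        try:
            reply = decide(request)
        except Exception:
            reply = "BH"  # tell Squid the helper failed on this request
        with write_lock:  # keep each reply line intact on stdout
            stdout.write(f"{channel} {reply}\n")
            stdout.flush()

    # Leaving the with-block waits for all outstanding requests to finish.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for line in stdin:
            if line.strip():
                pool.submit(work, line)
```

For this to pay off, the concurrency value configured in squid.conf
needs to be greater than 1, otherwise Squid sends only one request at a
time to each helper.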

I already wrote most of the code, so I believe it's a matter of days for the helper to be ready.
Would I be able to receive a testing API key/token once the helper is ready?

Thanks,
Eliezer

----
Eliezer Croitoru
Linux System Administrator
Mobile: +972-5-28704261
Email: [hidden email]


|

Re: Introducing Charcoal - Centralised URL Filter for squid

Nishant Sharma
Hi Eliezer,

On Sunday 18 June 2017 02:12 AM, Eliezer  Croitoru wrote:
> I believe that you should aim for the more standard hardware devices which squid can be built on-top such as:
> - x86
> - x86_64
> - arm64
> - arm5
> - arm8

In order to improve response time on capable hardware, we have just
pushed the helper version with support for memcached to github:

https://github.com/Hopbox/charcoal-helper

This is one step closer to supporting standard hardware platforms, until
we have a more capable helper in place.

Regards,
Nishant