URL encoding in squid


Anton
Good day.

I have a squid+squidGuard configuration. I need to filter a lot of URLs with national
symbols in them. My URL list consists mostly of percent-encoded URLs. But when squid
passes such a URL to squidGuard for checking, it sends the URL as-is, with no percent-encoding.

squidGuard then fails to match the URL, because the list holds its percent-encoded form.

The URL list is made from "zapret-info", if anyone knows it :-). It can contain inconsistent data:
%-encoded URLs can be in cp1251 or utf-8 after decoding, and some URLs are not encoded at all.
I cannot reliably decode the %-encoded URLs.

IMHO the better way is to %-encode the unencoded URLs and to use the others as-is.
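
A minimal sketch of that idea in Python, for preprocessing the list before it is given to the filter. It is only an illustration: the function name `encode_unencoded` and the example.com URLs are hypothetical, and it assumes that raw national characters should be encoded as UTF-8. Existing %-escapes are kept, only their hex digits are upper-cased so that `%d1` and `%D1` compare equal. Note that it treats the whole string the same way, including the host part.

```python
import re

# Matches an existing percent-escape such as %d1 or %D1.
_ESCAPE = re.compile(r"%[0-9A-Fa-f]{2}")


def encode_unencoded(url: str, charset: str = "utf-8") -> str:
    """Percent-encode raw non-ASCII characters and keep existing escapes.

    Assumption: unencoded national characters are meant as `charset`
    (UTF-8 by default). ASCII characters, including '%', are not touched.
    """
    pieces = []
    for ch in url:
        if ord(ch) < 128:
            # ASCII, possibly part of an existing %XX escape: keep as-is.
            pieces.append(ch)
        else:
            # Raw national character: emit one %XX per encoded byte.
            pieces.extend("%{:02X}".format(byte) for byte in ch.encode(charset))
    encoded = "".join(pieces)
    # Upper-case the hex digits of every escape for consistent comparison.
    return _ESCAPE.sub(lambda m: m.group(0).upper(), encoded)
```

With this, an unencoded entry and its UTF-8 %-encoded twin end up as the same string. An entry that was %-encoded from cp1251 stays in cp1251, so it would only match a request carrying the same bytes.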


So can squid+squidGuard deal with percent-encoded URLs?

Is it possible to pass a %-encoded URL to squidGuard?


P.S. I use debian 8.7 and squid from debian repo (3.4.8-6+deb8u4).
_______________________________________________
squid-users mailing list
[hidden email]
http://lists.squid-cache.org/listinfo/squid-users
Re: URL encoding in squid

Amos Jeffries
Administrator
On 21/02/2017 11:43 p.m., Anton wrote:

> Good day.
>
> I have a squid+squidGuard configuration. I need to filter a lot of URLs with national
> symbols in them. My URL list consists mostly of percent-encoded URLs. But when squid
> passes such a URL to squidGuard for checking, it sends the URL as-is, with no percent-encoding.
>
> squidGuard then fails to match the URL, because the list holds its percent-encoded form.
>
> The URL list is made from "zapret-info", if anyone knows it :-). It can contain inconsistent data:
> %-encoded URLs can be in cp1251 or utf-8 after decoding, and some URLs are not encoded at all.
> I cannot reliably decode the %-encoded URLs.

Ew.

>
> IMHO the better way is to %-encode the unencoded URLs and to use the others as-is.
>
>
> So can squid+squidGuard deal with percent-encoded URLs?
>

Squid should be normalizing the %-encoding on the URLs as they arrive,
but I'm not seeing where it does that in the code so maybe not. What
SquidGuard does with them or its input data file is not under Squid control.

Also SG is very outdated and no longer maintained; you might find
ufdbGuard better able to handle this nasty input. I've bcc'd Marcus in
case there is something he can do (or has done) about this type of mess in
that helper.

> Is it possible to pass a %-encoded URL to squidGuard?

Not with Squid-3.4. The 3.5 releases have a url_rewrite_extras directive
which takes logformat codes. You could use that to send an extra
%-encoded copy of the URL to the helper in addition to the normal URL
input. (sorry there is no package yet in Debian 8 for 3.5).

Amos

Re: URL encoding in squid

Marcus Kool


On 21/02/17 17:17, Amos Jeffries wrote:

>> Is it possible to pass a %-encoded URL to squidGuard?
>
> Not with Squid-3.4. The 3.5 releases have a url_rewrite_extras directive
> which takes logformat codes. You could use that to send an extra
> %-encoded copy of the URL to the helper in addition to the normal URL
> input. (sorry there is no package yet in Debian 8 for 3.5).
>
> Amos

ufdbGuard has a database format that supports UTF8 characters but
only the latest beta (ufdbguard 1.32.5beta9) fully supports it.
I can send you a link to the beta software if you are interested.

How it works:
ufdbGuard has a utility to convert domains+urls files into a
database file, and it converts all %-encoded characters.
The URLs that Squid sends to ufdbGuard are also all converted,
which means that URLs with %-encoded characters and URLs without
%-encoding match.
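
The general principle (bring both the list entries and the incoming URLs to one canonical form before comparing) can be sketched in Python as follows. This is not ufdbGuard's code, only an illustration: the function `canonical` and the example.com URLs are hypothetical, and the rule "try UTF-8 first, fall back to cp1251" is an assumed heuristic for the mixed list described earlier in the thread.

```python
from urllib.parse import unquote_to_bytes


def canonical(url: str) -> str:
    """Return a decoded, comparable form of a URL.

    %XX escapes become raw bytes; unescaped non-ASCII characters are
    taken as UTF-8 (this is what unquote_to_bytes does for str input).
    Assumed heuristic: bytes that are not valid UTF-8 are read as cp1251.
    """
    raw = unquote_to_bytes(url)
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8, so guess the legacy Cyrillic code page.
        return raw.decode("cp1251", errors="replace")


# Build the lookup set once from the (inconsistent) list ...
blocklist = {
    canonical(entry)
    for entry in (
        "http://example.com/%F2%E5%F1%F2",  # "тест", %-encoded from cp1251
        "http://example.org/страница",      # not encoded at all
    )
}


def is_blocked(request_url: str) -> bool:
    # ... and convert every incoming URL the same way before the lookup.
    return canonical(request_url) in blocklist
```

Because both sides go through the same conversion, a request for `http://example.com/%D1%82%D0%B5%D1%81%D1%82` (UTF-8 escapes) matches the cp1251-escaped list entry. The guess can be wrong in the rare case where cp1251 bytes also happen to form valid UTF-8.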

Marcus


Re: URL encoding in squid

Anton
Yes, I am interested in ufdbguard 1.32.5beta9.
Marcus, please send me a link.

I will try to install the latest 3.5.* squid and ufdbguard 1.32.5beta9 by
hand (./configure --a-lots-off-stuf... ; make ; make install to /usr/local/squid)


On Tue, 21 Feb 2017 17:38:36 -0300
Marcus Kool <[hidden email]> wrote:

> On 21/02/17 17:17, Amos Jeffries wrote:
>
> >> Is it possible to pass a %-encoded URL to squidGuard?
> >
> > Not with Squid-3.4. The 3.5 releases have a url_rewrite_extras directive
> > which takes logformat codes. You could use that to send an extra
> > %-encoded copy of the URL to the helper in addition to the normal URL
> > input. (sorry there is no package yet in Debian 8 for 3.5).
> >
> > Amos  
>
> ufdbGuard has a database format that supports UTF8 characters but
> only the latest beta (ufdbguard 1.32.5beta9) fully supports it.
> I can send you a link to the beta software if you are interested.
>
> How it works:
> ufdbGuard has a utility to convert domains+urls files into a
> database file, and it converts all %-encoded characters.
> The URLs that Squid sends to ufdbGuard are also all converted,
> which means that URLs with %-encoded characters and URLs without
> %-encoding match.
>
> Marcus




--
Anton,
network management department engineer,
ООО "ИКА" (Томика), 634050 Tomsk,
pr. Lenina 55, office 101
Tel: 701-855