Header order in squid proxy

Header order in squid proxy

Sonya Roy
Hi,

I noticed that Squid changes the order of the headers received from the client before sending them to the origin server.

I assume this is because Squid parses the header data, adds some headers depending on the config file, and then recreates the header data.

Is there any way to prevent this, i.e. to keep the header order received from the client and only remove headers like Proxy-Connection and Proxy-Authorization, which Squid does anyway?

I am asking because some sites detect bots using the header order and drop any such connection. So they unintentionally block Squid proxies even when the proxy is not being used by a bot.

With regards,
Sonya Roy.

_______________________________________________
squid-users mailing list
[hidden email]
http://lists.squid-cache.org/listinfo/squid-users

Re: Header order in squid proxy

Alex Rousskov
On 06/22/2017 11:49 AM, Sonya Roy wrote:

> I noticed that squid changes the header order received from the client
> before sending it to the origin server.
>
> I assume this is because squid parses the header data and adds some
> headers depending on the config file and then recreates the header data.

IIRC, modern Squids change a header field position when the received
field is deleted and then added back. This is typical for hop-by-hop
headers such as Connection, but there are other reasons for Squid to
delete and add a header field. When the value of the added field is the
same as the value of the removed field, such pointless "editing" looks
like mindless "reordering" to the outside observer.
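
To illustrate the effect described above, here is a minimal Python sketch. It is a toy model of a header list, not Squid code; the helper names delete_by_name and add_entry are made up for this example and only loosely mirror the idea of deleting a field and appending it again.

```python
# Toy model of a header list as (name, value) pairs. This is NOT Squid code.

def delete_by_name(headers, name):
    """Return the list without any field called `name` (case-insensitive)."""
    return [(n, v) for (n, v) in headers if n.lower() != name.lower()]


def add_entry(headers, name, value):
    """Append a field at the end, as a simple 'add' operation would."""
    return headers + [(name, value)]


received = [
    ("Host", "example.com"),
    ("Connection", "keep-alive"),
    ("Accept", "*/*"),
]

# Delete the field, then add it back with the very same value.
sent = add_entry(delete_by_name(received, "Connection"), "Connection", "keep-alive")

print([n for n, _ in received])  # ['Host', 'Connection', 'Accept']
print([n for n, _ in sent])      # ['Host', 'Accept', 'Connection']
```

The set of fields and their values is unchanged, yet the Connection field has moved to the end, which is what an outside observer sees as reordering.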

The two actions (field deletion and addition) may happen in a single
piece of code or may be separated by lots of code and even time.
Preventing pointless editing in the former cases is straightforward, but
the latter cases are difficult to handle. Correct avoidance of pointless
editing may improve performance and, if it does, can be considered a
useful optimization on its own, regardless of your use case.


> Is there any way to prevent this?

Not without changing Squid code (or adding more proxies). However,
before we even talk about code changes, we should clarify the problem we
are dealing with. The questions below will guide you.

It is probably much easier to ensure some fixed field send order
(regardless of the received order) than to preserve the received order.
Will a fixed order (e.g., always alphabetical) address your use case?
This feature will hurt performance, but you might be able to convince
others to accept it if you have a very compelling/specific/detailed use
case because it can be disabled by default.
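
As a rough sketch of what a fixed send order could look like, assuming headers are modelled as (name, value) pairs, the Python below sorts fields alphabetically by name. The function name fixed_order is hypothetical and this is not how Squid stores or sends headers.

```python
def fixed_order(headers):
    """Sort fields by lowercased name.

    sorted() is stable, so repeated fields with the same name keep
    their relative order.
    """
    return sorted(headers, key=lambda field: field[0].lower())


# Two requests with the same fields in different received orders.
request_a = [("Host", "example.com"), ("Accept", "*/*"), ("User-Agent", "demo")]
request_b = [("User-Agent", "demo"), ("Host", "example.com"), ("Accept", "*/*")]

print([name for name, _ in fixed_order(request_a)])  # ['Accept', 'Host', 'User-Agent']
print(fixed_order(request_a) == fixed_order(request_b))  # True
```

Whatever order arrives, the same order goes out. The extra sort on every request is the kind of overhead the performance concern above refers to.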


> I am asking because some sites detect bots using the header order and
> they drop any such connection. So they unintentionally block squid
> proxies even if its not being used by a bot.

Are you implying that bots often change header field order between their
requests? Or that bots often use a different (fixed) header field order
than the (fixed) field order used by non-bots? Preserving received order
may help in the former case but not in the latter case.

Also, do those blocking sites pay attention to all headers or just
end-to-end headers?

Please note that there are many other ways to detect a proxy so if a
site wants to block proxies rather than bots, then it is probably
pointless to fight it (or, at least, the Squid Project should not).


HTH,

Alex.

Re: Header order in squid proxy

Sonya Roy
The sites I am talking about check the User-Agent header and make sure the user agent is a well-known browser, i.e. a browser that they support. Any browser, for example Firefox, Chrome, Safari, or Edge, sends its headers in a certain order, and that order depends on the browser. This applies to well-known headers like Accept, Accept-Language, Accept-Encoding, Content-Length, Host, Connection, Referer, Cookie, etc. The sites match the header order of the received request against the standard header order for the browser named in the User-Agent.

This detects poorly written bots (i.e. ones that don't consider this header order) that use Python requests, or any other language for that matter, where the requests are handled by a low-level HTTP library.
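
The kind of check described above might look roughly like the following Python sketch. The table EXPECTED_ORDER and the key "browser-a" are invented for illustration; the actual orders used by real browsers, and the way any particular site implements the check, are not given in this thread.

```python
# Hypothetical expected order for one made-up browser family.
EXPECTED_ORDER = {
    "browser-a": ["host", "user-agent", "accept",
                  "accept-language", "accept-encoding", "cookie"],
}


def order_matches(header_names, expected):
    """True if the headers listed in `expected` appear in the same
    relative order in the request; unknown headers are ignored."""
    rank = {name: i for i, name in enumerate(expected)}
    seen = [rank[n.lower()] for n in header_names if n.lower() in rank]
    return seen == sorted(seen)


browser_like = ["Host", "User-Agent", "Accept", "X-Custom", "Cookie"]
reordered = ["Accept", "Host", "User-Agent", "Cookie"]

print(order_matches(browser_like, EXPECTED_ORDER["browser-a"]))  # True
print(order_matches(reordered, EXPECTED_ORDER["browser-a"]))     # False
```

A request whose fields were moved by an intermediary fails such a check even though the client itself sent them in the expected order.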

So, keeping the header order sent by the client intact would prevent them from dropping proxied requests (ones that go through Squid). I know for a fact that they don't intend to block proxies.

Could you point me to where I should look in the Squid source code, i.e. the part that handles the header data sent from the client?

With regards,
Sonya Roy.


Re: Header order in squid proxy

Alex Rousskov
On 06/22/2017 12:54 PM, Sonya Roy wrote:
> The sites I am talking about check the User-Agent header and makes sure
> the user-agent is for a well-known browser, i.e. a browser that they
> support. And any browser like Firefox, Chrome, Safari, Edge for example,
> sends the headers in a certain order and the order depends on the
> browser. And this header order for well-known headers like Accept,
> Accept-Language, Accept-Encoding, Content-Length, Host, Connection,
> Referer, Cookie, etc. And they match the order of the received request
> with the standard header order for the browser for that user-agent.

FWIW, Connection and possibly some "etc." headers are hop-by-hop headers
so if a blocking site really pays attention to them, it should be told
to exclude them.
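
For illustration only, a site-side order check could drop hop-by-hop fields before comparing anything, along these lines. The function name end_to_end_names is hypothetical; the set below lists the standard HTTP/1.1 hop-by-hop fields plus the nonstandard Proxy-Connection mentioned earlier in this thread.

```python
# Standard HTTP/1.1 hop-by-hop fields, plus the nonstandard Proxy-Connection.
HOP_BY_HOP = {
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailer", "transfer-encoding", "upgrade", "proxy-connection",
}


def end_to_end_names(headers):
    """Return the names of end-to-end fields, in received order."""
    # Any field named inside a Connection header is hop-by-hop as well.
    listed = set()
    for name, value in headers:
        if name.lower() == "connection":
            listed.update(tok.strip().lower() for tok in value.split(",") if tok.strip())
    skip = HOP_BY_HOP | listed
    return [name for name, _ in headers if name.lower() not in skip]


request = [
    ("Host", "example.com"),
    ("Connection", "keep-alive, X-Foo"),
    ("X-Foo", "1"),
    ("Accept", "*/*"),
]

print(end_to_end_names(request))  # ['Host', 'Accept']
```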


> Could you point me in the direction to where I should look for in the
> source code of squid?

The answer depends on whether you want to:

A) prevent pointless edits (difficult and less effective but has a
fighting chance of official acceptance because it is a useful
performance optimization) or

B) simply reorder all the fields just before sending them, based on a
User-Agent field-driven order table (easy to hack in and effective but
less likely to be officially accepted due to performance overheads and
configuration/support complexities).
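
To make option B more concrete, here is a small Python sketch of reordering driven by a User-Agent order table. It is a toy model, not Squid code: ORDER_TABLE, the "ExampleBrowser" marker, and reorder_for_user_agent are all assumptions made up for this example.

```python
# Hypothetical table: substring of the User-Agent value -> preferred field order.
ORDER_TABLE = {
    "ExampleBrowser": ["host", "user-agent", "accept", "accept-encoding"],
}


def reorder_for_user_agent(headers):
    """Reorder (name, value) pairs just before sending.

    Fields found in the matching table entry come first, in table order;
    all other fields follow in their original relative order.
    """
    user_agent = next((v for n, v in headers if n.lower() == "user-agent"), "")
    for marker, order in ORDER_TABLE.items():
        if marker in user_agent:
            rank = {name: i for i, name in enumerate(order)}
            # sorted() is stable, so unknown fields keep their relative order.
            return sorted(headers, key=lambda f: rank.get(f[0].lower(), len(order)))
    return list(headers)  # no table entry: leave the order untouched


request = [
    ("Accept", "*/*"),
    ("X-Custom", "1"),
    ("User-Agent", "ExampleBrowser/1.0"),
    ("Host", "example.com"),
]

print([n for n, _ in reorder_for_user_agent(request)])
# ['Host', 'User-Agent', 'Accept', 'X-Custom']
```

The table lookup and the sort run on every request, and the table itself has to be maintained per browser, which matches the performance and configuration concerns raised for this option.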

If you want a general vague answer, search for calls to non-const
HttpHeader methods like HttpHeader::delByName() and
HttpHeader::insertEntry(). There are about 20-30 potentially relevant
methods AFAICT. And examine the sending code in
HttpStateData::httpBuildRequestHeader().


Please note that the discussion about Squid code belongs to squid-dev,
not squid-users.


HTH,

Alex.



Re: Header order in squid proxy

Eliezer Croitoru
If I may add a word or two:
If sites are securing their systems based on header order, then I believe they are aiming at the wrong target.
It's a "nice to have" but not an actual deep application-level defense (based on my limited knowledge of the subject).
One example I have seen of a DoS/DDoS issue is:
"Hey, we are having high CPU usage, what should we do?"
- The bot was hammering the service from an AWS instance... so block it.
- How many requests per second from a single IP are considered normal?
- Then, how many *new* cookie requests per second are considered normal?
- What about NAT? Would a Chinese client be considered legitimate despite being behind one big NAT?
- Would you be able to differentiate a specific single IP or subnet that is considered legitimate?
- What about RBLs?

The above are things I have heard here and there which I think are more important than header order.
Take my words as coming from a person who is not an expert in the security area.

All the best,
Eliezer

----
http://ngtech.co.il/lmgtfy/
Linux System Administrator
Mobile: +972-5-28704261
Email: [hidden email]



Re: Header order in squid proxy

Sonya Roy
The sites I was talking about don't just target the header order. That's just one of the things they check. Of course, they have their own systems to protect themselves against DDoS attacks, or they use services like Akamai or Cloudflare. The header order is just one of the common bot-detection techniques that they use to filter out unwanted traffic.

For example, Akamai's bot detection system checks header order, among a lot of other things.

Anyway, after Alex pointed me in the right direction, I managed to edit a couple of lines in Squid to prevent the change in header order.

With regards,


Re: Header order in squid proxy

AndreiZeeGiant
Hello,

I'm running into similar issues due to "reordering". Could you provide the
mentioned code changes, or suggest what I should be looking for? I recently
upgraded from 3.1 (had no such issues) to 3.5, and retained the same
configuration file.


