Squid4 '%<la' empty sometimes

classic Classic list List threaded Threaded
7 messages Options
Reply | Threaded
Open this post in threaded view
|

Squid4 '%<la' empty sometimes

Peter Viskup
On Linux Virtual Server balanced cluster the %<la is empty when Squid
errors ERR_ACCESS_DENIED or TCP_MEM_HIT are logged.

This is causing issue when processing logs on remote machine (e.g.
ELK). The "proxy node" cannot be differentiated.

This string used in logformat to differentiate Squid instance and LVS node:
proxyPort=%lp proxyIP=%la proxyNode=%<la

Example:
for ERR_ACCESS_DENIED and TCP_MEM_HIT entries:
proxyPort=3128 proxyIP=10.x.y.z proxyNode=-

for ERR_DIR_LISTING entry:
proxyPort=3128 proxyIP=10.x.y.z proxyNode=10.x.y.a

How to differentiate the nodes in such cases?
Should the bug being considered?

Peter
_______________________________________________
squid-users mailing list
[hidden email]
http://lists.squid-cache.org/listinfo/squid-users
Reply | Threaded
Open this post in threaded view
|

Re: Squid4 '%<la' empty sometimes

Amos Jeffries
Administrator
On 04/04/18 21:15, Peter Viskup wrote:
> On Linux Virtual Server balanced cluster the %<la is empty when Squid
> errors ERR_ACCESS_DENIED or TCP_MEM_HIT are logged.
>
> This is causing issue when processing logs on remote machine (e.g.
> ELK). The "proxy node" cannot be differentiated.

It sounds to me like you are misinterpreting what is happening and/or
using the wrong logformat tags/codes.

The denied or HIT transactions you mention *do not* involve any upstream
server. Thus no upstream server connection. No logformat codes will log
details from a non-existing thing.

NP: the set of correct/valid values includes "-" for non-existing data
on most logformat codes, only a few use "0" for legacy reasons.


>
> This string used in logformat to differentiate Squid instance and LVS node:
> proxyPort=%lp proxyIP=%la proxyNode=%<la

All of those are displaying Squid endpoint IP:port details. They just
apply to various of the multiple different connections Squid used (or
not) for the particular HTTP transaction.

"Local" in the docs there means Squid end of the TCP connections.


If you want to log the LVS IP:port details you need to log the IP:port
details of the *other* end of the connection between Squid and the LVS.

The '<' and '>' respectively denote the data being from the
client<-Squid connection or Squid->server connection(s).

NP: There many be many Squid->server connections actually used in the
event of failures. Only the final one is recorded.


>
> Example:
> for ERR_ACCESS_DENIED and TCP_MEM_HIT entries:
> proxyPort=3128 proxyIP=10.x.y.z proxyNode=-
>

A client (unknown) connected to Squid 10.x.y.z:3128.
No server involvement.


> for ERR_DIR_LISTING entry:
> proxyPort=3128 proxyIP=10.x.y.z proxyNode=10.x.y.a
>

A client (unknown) connected to Squid 10.x.y.z:3128.
Squid contacted a server (unknown) from IP 10.x.y.a:(port unknown).


> How to differentiate the nodes in such cases?


So can you clarify what you mean by "Linux Virtual Server balanced
cluster" ?

Is that a cluster of LB devices/VM in front (or behind) a Squid proxy?
 or some Squid running on each node of the cluster?
 or something else?


Amos
_______________________________________________
squid-users mailing list
[hidden email]
http://lists.squid-cache.org/listinfo/squid-users
Reply | Threaded
Open this post in threaded view
|

Re: Squid4 '%<la' empty sometimes

Peter Viskup
On Wed, Apr 4, 2018 at 5:42 PM, Amos Jeffries <[hidden email]> wrote:

> On 04/04/18 21:15, Peter Viskup wrote:
>> On Linux Virtual Server balanced cluster the %<la is empty when Squid
>> errors ERR_ACCESS_DENIED or TCP_MEM_HIT are logged.
>>
>> This is causing issue when processing logs on remote machine (e.g.
>> ELK). The "proxy node" cannot be differentiated.
>
> It sounds to me like you are misinterpreting what is happening and/or
> using the wrong logformat tags/codes.
>
> The denied or HIT transactions you mention *do not* involve any upstream
> server. Thus no upstream server connection. No logformat codes will log
> details from a non-existing thing.
>
> So can you clarify what you mean by "Linux Virtual Server balanced
> cluster" ?
>
> Is that a cluster of LB devices/VM in front (or behind) a Squid proxy?
>  or some Squid running on each node of the cluster?
>  or something else?

Two VMs with heartbeat+ldirectord+LVS. Every server running the Squid
proxy. The service IP with LVS configuration (managed by ldirectord)
is made highly available.
_______________________________________________
squid-users mailing list
[hidden email]
http://lists.squid-cache.org/listinfo/squid-users
Reply | Threaded
Open this post in threaded view
|

Re: Squid4 '%<la' empty sometimes

Amos Jeffries
Administrator
On 05/04/18 21:39, Peter Viskup wrote:

> On Wed, Apr 4, 2018 at 5:42 PM, Amos Jeffries wrote:
>> On 04/04/18 21:15, Peter Viskup wrote:
>>> On Linux Virtual Server balanced cluster the %<la is empty when Squid
>>> errors ERR_ACCESS_DENIED or TCP_MEM_HIT are logged.
>>>
>>> This is causing issue when processing logs on remote machine (e.g.
>>> ELK). The "proxy node" cannot be differentiated.
>>
>> It sounds to me like you are misinterpreting what is happening and/or
>> using the wrong logformat tags/codes.
>>
>> The denied or HIT transactions you mention *do not* involve any upstream
>> server. Thus no upstream server connection. No logformat codes will log
>> details from a non-existing thing.
>>
>> So can you clarify what you mean by "Linux Virtual Server balanced
>> cluster" ?
>>
>> Is that a cluster of LB devices/VM in front (or behind) a Squid proxy?
>>  or some Squid running on each node of the cluster?
>>  or something else?
>
> Two VMs with heartbeat+ldirectord+LVS. Every server running the Squid
> proxy. The service IP with LVS configuration (managed by ldirectord)
> is made highly available.
>


Okay. So..

0) you could also configure a specific unique value in each Squid's
logformat definition. So it is explicitly present in all log lines
generated by that proxy / node.

... or if you need something more dynamic ...

1) you could use the "note" directive to algorithmically determine (in
its ACL matching) a value for %note to record in a shared logformat
definition on a per-request, per-node basis.

2) If you have OpenFlows or similar recording of the inbound TCP
connections you might use the %>a:%>p:%>la tuplet (client IP:port +
Squid-IP) to match up with other info about the client TCP connection.
That is the most reliable relationship (as the server connection range
from none to many).

 [ note that %>la and %la may be different in the presence of NAT,
TPROXY, or PROXY protocol. ]

3) If the LVS is managing packet routing using TOS or packet markings
you could use the %>qos or %>nfmark to explicitly display those in the
log file.

4) If the VMs all have unique interface MAC/EUI addresses you could use
that uniqueness as node IDs with the %>eui code.

5) you could use all of the codes in #2, #3, and #4 for a 6-tuplet
identifier.


Amos
_______________________________________________
squid-users mailing list
[hidden email]
http://lists.squid-cache.org/listinfo/squid-users
Reply | Threaded
Open this post in threaded view
|

Re: Squid4 '%<la' empty sometimes

Peter Viskup
Amos,
thank you for your helpful reply.

Setup with use of %note is working. Both servers have same
configuration from GIT. Added one include for the proxyNode note
configuration with "GIT ignore" flag on that file.
Realized the note can be used to track the ACL the transaction passed to.

Proxy node ID:
note proxyNode 10.x.y.a
logformat ... proxyNode=%{proxyNode}note ...

Passed via ACL:
acl sslstep1 at_step SslBump1
acl sslstep3 at_step SslBump3

acl test_s src <some_ip>
acl test_d dstdomain <some_domain>
acl test_ssl ssl::server_name <some_domain>
acl SSH_port port 22

# splice at step1 for SSH connections
note acl test_splice_ssh test_s test_d SSH_port
ssl_bump splice sslstep1 test_s test_d SSH_port

# splice at step3 for other TLS connections
note acl test_splice_http test_s test_ssl
ssl_bump splice sslstep3 test_s test_ssl

logformat ... acl=%{acl}note ...


On Thu, Apr 5, 2018 at 12:31 PM, Amos Jeffries <[hidden email]> wrote:

> On 05/04/18 21:39, Peter Viskup wrote:
>> On Wed, Apr 4, 2018 at 5:42 PM, Amos Jeffries wrote:
>>> On 04/04/18 21:15, Peter Viskup wrote:
>>>> On Linux Virtual Server balanced cluster the %<la is empty when Squid
>>>> errors ERR_ACCESS_DENIED or TCP_MEM_HIT are logged.
>>>>
>>>> This is causing issue when processing logs on remote machine (e.g.
>>>> ELK). The "proxy node" cannot be differentiated.
>>>
>>> It sounds to me like you are misinterpreting what is happening and/or
>>> using the wrong logformat tags/codes.
>>>
>>> The denied or HIT transactions you mention *do not* involve any upstream
>>> server. Thus no upstream server connection. No logformat codes will log
>>> details from a non-existing thing.
>>>
>>> So can you clarify what you mean by "Linux Virtual Server balanced
>>> cluster" ?
>>>
>>> Is that a cluster of LB devices/VM in front (or behind) a Squid proxy?
>>>  or some Squid running on each node of the cluster?
>>>  or something else?
>>
>> Two VMs with heartbeat+ldirectord+LVS. Every server running the Squid
>> proxy. The service IP with LVS configuration (managed by ldirectord)
>> is made highly available.
>>
>
>
> Okay. So..
>
> 0) you could also configure a specific unique value in each Squid's
> logformat definition. So it is explicitly present in all log lines
> generated by that proxy / node.
>
> ... or if you need something more dynamic ...
>
> 1) you could use the "note" directive to algorithmically determine (in
> its ACL matching) a value for %note to record in a shared logformat
> definition on a per-request, per-node basis.
>
> 2) If you have OpenFlows or similar recording of the inbound TCP
> connections you might use the %>a:%>p:%>la tuplet (client IP:port +
> Squid-IP) to match up with other info about the client TCP connection.
> That is the most reliable relationship (as the server connection range
> from none to many).
>
>  [ note that %>la and %la may be different in the presence of NAT,
> TPROXY, or PROXY protocol. ]
>
> 3) If the LVS is managing packet routing using TOS or packet markings
> you could use the %>qos or %>nfmark to explicitly display those in the
> log file.
>
> 4) If the VMs all have unique interface MAC/EUI addresses you could use
> that uniqueness as node IDs with the %>eui code.
>
> 5) you could use all of the codes in #2, #3, and #4 for a 6-tuplet
> identifier.
>
>
> Amos
_______________________________________________
squid-users mailing list
[hidden email]
http://lists.squid-cache.org/listinfo/squid-users
Reply | Threaded
Open this post in threaded view
|

Re: Squid4 '%<la' empty sometimes

Amos Jeffries
Administrator
On 07/04/18 00:48, Peter Viskup wrote:

> Amos,
> thank you for your helpful reply.
>
> Setup with use of %note is working. Both servers have same
> configuration from GIT. Added one include for the proxyNode note
> configuration with "GIT ignore" flag on that file.
> Realized the note can be used to track the ACL the transaction passed to.
>
> Proxy node ID:
> note proxyNode 10.x.y.a
> logformat ... proxyNode=%{proxyNode}note ...
>
> Passed via ACL:
> acl sslstep1 at_step SslBump1
> acl sslstep3 at_step SslBump3
>
> acl test_s src <some_ip>
> acl test_d dstdomain <some_domain>
> acl test_ssl ssl::server_name <some_domain>
> acl SSH_port port 22
>
> # splice at step1 for SSH connections
> note acl test_splice_ssh test_s test_d SSH_port
> ssl_bump splice sslstep1 test_s test_d SSH_port

dstdomain ACL type (test_d is not valid at ssl_bump processing time. The
HTTP request the domain comes from has not (yet or ever) been decrypted
from the crypted bytes.


>
> # splice at step3 for other TLS connections
> note acl test_splice_http test_s test_ssl
> ssl_bump splice sslstep3 test_s test_ssl
>
> logformat ... acl=%{acl}note ...

Well, yes for this - but take care. Since the 'note ...' directive is
ONLY evaluated right at logging time it may show the wrong thing if the
ACL state changes between the initial directive (ssl_bump) evaluation
and the logging at the end of the transaction.
For example;
* the ssl_server_name ACL state changes with each SSL-Bump step that
completes, and
* dstdomain uses DNS lookups when only a raw-IP exists (eg in ssl_bump
step 1). The DNS records may expire and be changed in the time it takes
to transfer all the data in a spliced tunnel, before note re-runs the
dstdomain test_d check for logging.

Amos
_______________________________________________
squid-users mailing list
[hidden email]
http://lists.squid-cache.org/listinfo/squid-users
Reply | Threaded
Open this post in threaded view
|

Re: Squid4 '%<la' empty sometimes

Alex Rousskov
On 04/06/2018 07:59 AM, Amos Jeffries wrote:
> On 07/04/18 00:48, Peter Viskup wrote:
>> # splice at step3 for other TLS connections
>> note acl test_splice_http test_s test_ssl
>> ssl_bump splice sslstep3 test_s test_ssl
>>
>> logformat ... acl=%{acl}note ...

> Since the 'note ...' directive is
> ONLY evaluated right at logging time it may show the wrong thing if the
> ACL state changes between the initial directive (ssl_bump) evaluation
> and the logging at the end of the transaction.

Yes, and the correct solution to those problems is the new
annotate_transaction and/or annotate_client ACLs. They can be used to
reliably remember and log ACL-based decisions.

Alex.
_______________________________________________
squid-users mailing list
[hidden email]
http://lists.squid-cache.org/listinfo/squid-users