On 5/04/2017 6:00 p.m., daveh wrote:
> Hi squid users
> Is there any way to change the request url log format for HTTPS messages?
> I am using %ru to pull out the URL. When we get https connections, we see
> the url logged as www.microsoft.com:443
You are assuming that URI means HTTPS. It may seem reasonable, but is
The CONNECT request is a _tunnel_ request. It is an opaque *TCP* tunnel.
There is no guarantee that any given port-443 tunnel request is actually
HTTPS these days. There is WebSockets, SPDY, HTTP/2, and a number of
custom protocols inside TLS, and non-TLS protocols as well all using the
When HTTPS does go through a port-443 tunnel, there is often more than
one HTTPS request. So writing https://blah/ to the log would be a lie,
and a deceptive one at that.
> is there any way to reformat the log message to remove the appended port?
Well, the log %ru code is intended to record the *actual* details being
received. What you are seeing is what actually exists in the traffic.
However, you will need to do that for a separate log to other traffic
and as mentioned above keep in mind that port-443 does not necessarily
To actually log https:// URLs requires either passing Squid https:// URLs
instead of CONNECT requests, or decrypting the traffic (with the SSL-Bump
feature) and seeing what is inside the TLS (if it is TLS; it may not be).
Squid will then log the appropriate https:// URL for each received or
decrypted HTTPS request, no changes necessary.
PS: If you are asking this because of some tool that is doing broken
things when passed real URIs (not URL ... *URI*) that tool needs to be
I'm parsing squid logs to send to a SIEM to identify IOCs. The SIEM agent requires a URL to be formatted as http|https://<URI>
It then knows that it can break the string out into various components such as request URL authority, host etc.
Your comment on logging https connections is not what I have found. I would expect that typing https://something.net will return that exact string in the log. Every https connection is logged as a CONNECT with the FQDN and :443 appended. Is there something in the config to force this to happen? There doesn't seem to be a way of doing it with log formatting.
I'm simply rewriting to strip the :443 port and prepend https://. It doesn't matter to me if CONNECT != HTTPS; I simply need my URL to be properly formed in the logs.
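
A minimal sketch of that rewrite in Python, for illustration only. The function name connect_target_to_url is hypothetical, and the input is assumed to be the host:port string that %ru logs for a CONNECT request. The result is a guess that the tunnel carried HTTPS, not something Squid saw on the wire.

```python
def connect_target_to_url(target: str) -> str:
    """Turn a logged CONNECT target such as 'host:443' into an https:// URL.

    Hypothetical helper: it assumes the tunnel carried HTTPS, which the
    proxy itself cannot confirm.
    """
    host, sep, port = target.rpartition(":")
    if not sep or not port.isdigit():
        # No usable port in the log field; treat the whole string as the host.
        host, port = target, "443"
    if port == "443":
        # 443 is the default for https, so it can be dropped from the URL.
        return f"https://{host}/"
    return f"https://{host}:{port}/"


if __name__ == "__main__":
    print(connect_target_to_url("www.microsoft.com:443"))
```

Non-default ports are kept in the output so that the normalised event does not hide where the tunnel actually went.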
On 10/04/2017 1:36 p.m., daveh wrote:
> Thanks for the reply.
> Im parsing squid logs to send to a SIEM to identify IOCs. The SIEM agent
> requires a URL to be formatted with http|https://<URI>
> It knows then that it can break the string out into various components such
> as request URL authority, host etc
So it can understand *URL* format. But that is not what is being logged.
Squid technically logs a URI, and this log processing is one of the
cases where the difference between URI and URL matters.
> Your comment on logging https connections is not what I have found. I would
I think you misread what I wrote. There are only two ways to get Squid
to know what the https:// URL was - neither of them is normal proxy usage.
> expect that typing https://something.net will return that extact string in
> the log. Every https connection is logged as a CONNECT with the FQDN
> appended the :443.
You expect wrong.
The URL you entered into some client software starts with the scheme
"https://" ... which requires that the fetching of that URL is done
securely. The last thing you should expect is that URL being sent over
plain-text / "in the clear" to some external software.
To do HTTPS the client software has to set up multiple layers of
protocols and security.
1) First it has to open a TCP connection to the proxy.
2) Then it has to send a CONNECT request asking the proxy to open a
second TCP connection to the server.
3) Then it has to set up TLS/SSL encryption over those two TCP
connections. So the crypto happens directly between the client and the
server (as if the proxy were not there).
4) Then, and only then, after all that has been successful, does it start
to send the first of potentially many (hundreds, thousands...) HTTP
requests over the connection:
GET /index.html HTTP/1.1
If you look closely at that #4 layer request there is no "https://"
there. Nor any way to reconstruct it.
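
To make the layering concrete, here is a small Python sketch that builds the two messages as text. The function names are hypothetical and the header set is trimmed to the minimum. The first message is the plain-text CONNECT tunnel request that the proxy receives and logs; the second is the kind of request sent at step 4, which travels inside the encryption and which the proxy never sees in readable form.

```python
def proxy_visible_request(host: str, port: int = 443) -> str:
    """The tunnel request: the only plain-text HTTP message the proxy gets."""
    return (
        f"CONNECT {host}:{port} HTTP/1.1\r\n"
        f"Host: {host}:{port}\r\n"
        "\r\n"
    )


def tunnelled_request(host: str, path: str = "/index.html") -> str:
    """A step 4 request: sent inside TLS, so the proxy only relays ciphertext."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "\r\n"
    )


if __name__ == "__main__":
    print(proxy_visible_request("www.microsoft.com"))
    print(tunnelled_request("www.microsoft.com"))
```

Neither message contains the string "https://". The proxy's log line is built from the first one, which is why it shows a host and a port and nothing else.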
It might even be another CONNECT (thought TOR invented onion routing?
HTTPS beat it by decades).
That meme from The Matrix "there is no spoon" has never been more apt.
There is no "https://" - at least, not once the client interprets its
input URL. It vanishes right there and then.
> Is there something in the config to force this to happen?
There is no simple config option. In fact we go out of our way to ensure
data accuracy. So the log contains reality and log interpreters can make
whatever assumptions you want them to about what they read there.
p-PS. I find it particularly odd that you would be trying to feed false
information into a SIEM system - security event detection depends on
accuracy of inputs. But it's your neck.
> DOesnt seem to be a way of doing it with log formatting
I'm not changing the raw squid log, only the normalised event. I'm simply pulling out the URL host (the FQDN) from the URL as my SIEM agent doesn't natively understand how to parse these CONNECT messages. It doesn't matter to me if CONNECT requests are not always https requests. For my purposes I need to compare the FQDN to a list of IOCs.
If I have a use case specific to the use of CONNECT requests in the future, I still have all of that information as is, from the proxy.
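
A rough Python sketch of that normalisation step, under stated assumptions: the function names host_from_log_url and matches_ioc are hypothetical, the logged field is either a host:port CONNECT target or a full URL, and the IOC list is a set of lowercase domain names where a listed domain also covers its subdomains. The raw log line is left untouched; only the extracted host is compared.

```python
from urllib.parse import urlsplit


def host_from_log_url(logged: str) -> str:
    """Return the lowercase host from a logged URL field.

    Accepts either a CONNECT target ('host:443') or a full URL.
    """
    if "://" not in logged:
        # Prefix '//' so urlsplit treats 'host:port' as the authority part.
        logged = "//" + logged
    return (urlsplit(logged).hostname or "").lower()


def matches_ioc(host: str, iocs: set) -> bool:
    """True if the host, or any parent domain of it, is in the IOC set."""
    labels = host.rstrip(".").split(".")
    return any(".".join(labels[i:]) in iocs for i in range(len(labels)))


if __name__ == "__main__":
    iocs = {"bad.example"}  # placeholder indicator list
    for field in ("cdn.bad.example:443", "http://good.example/x"):
        host = host_from_log_url(field)
        print(field, host, matches_ioc(host, iocs))
```

Matching on the host alone avoids claiming a scheme the proxy never saw, while still giving the SIEM a field it can compare.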