SSL bump memory leak

SSL bump memory leak

Steve Hill

I'm looking into (what appears to be) a memory leak in the Squid 3.5
series.  I'm testing this in 3.5.13, but this problem has been observed
in earlier releases too.  Unfortunately I haven't been able to reproduce
the problem in a test environment yet, so my debugging has been limited
to what I can do on production systems (so no valgrind, etc).

These systems are configured to do SSL peek/bump/splice and I see the
Squid workers grow to hundreds or thousands of megabytes in size over a
few hours.  A configuration reload does not reduce the memory
consumption.  For debugging purposes, I have set
"dynamic_cert_mem_cache_size=0KB" to disable the certificate cache,
which should eliminate bug 4005.  I've taken a core dump to analyse and
have found:

Running "strings" on the core, I can see that there are vast numbers of
strings that look like certificate subject/issuer identifiers.  e.g.:
        /C=GB/ST=Greater Manchester/L=Salford/O=Comodo CA Limited/CN=Secure Certificate Services

The vast majority of these seem to refer to root and intermediate
certificates.  There are a few that include a host name and are probably
server certificates, such as:
        /OU=Domain Control Validated/CN=*.soundcloud.com
But these are very much in the minority.

Also, notably they are mostly duplicates.  Compare the total number:
$ strings -n 10 -t x core.21693|egrep '^ *[^ ]+ /.{1,3}='|wc -l
131599
with the number of unique strings:
$ strings -n 10 -t x core.21693|egrep '^ *[^ ]+ /.{1,3}='|sort -u -k 2|wc -l
658
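
For anyone wanting to repeat this on their own core, here is a rough Python sketch of the same total-versus-unique count. It assumes the input is the output of "strings -n 10 -t x" piped in on stdin, and it mirrors the egrep pattern above; the script name and the function name count_dn_strings are made up for illustration. Listing the most duplicated strings is an addition that the shell pipeline does not do.

```python
import re
import sys
from collections import Counter

# Same shape as the egrep pattern above: an offset column, a space, then a
# string starting with "/" followed by a 1-3 character attribute name and "=".
DN_LINE = re.compile(r"^ *[^ ]+ (/.{1,3}=.*)$")


def count_dn_strings(lines):
    """Return (total, unique, Counter) for DN-like strings in `strings -t x` output."""
    counts = Counter()
    for line in lines:
        match = DN_LINE.match(line.rstrip("\n"))
        if match:
            counts[match.group(1)] += 1
    return sum(counts.values()), len(counts), counts


if __name__ == "__main__":
    total, unique, counts = count_dn_strings(sys.stdin)
    print(f"total={total} unique={unique}")
    # Show which subject/issuer strings are duplicated the most.
    for dn, n in counts.most_common(10):
        print(f"{n:8d}  {dn}")
```

Usage would be something like: strings -n 10 -t x core.21693 | python3 count_dn.py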

There are also a very small number of lines that look something like:
        /C=US/ST=California/L=San Francisco/O=Wikimedia Foundation, Inc./CN=*.wikipedia.org+Sign=signTrusted+SignHash=SHA256
I think the "+Sign=signTrusted+SignHash=SHA256" part would indicate that
this is a Squid database key, which is very confusing since with the
certificate cache disabled I wouldn't expect to see these at all.

--
  - Steve Hill
    Technical Director
    Opendium Limited     http://www.opendium.com

Direct contacts:
    Instant messager: xmpp:[hidden email]
    Email:            [hidden email]
    Phone:            sip:[hidden email]

Sales / enquiries contacts:
    Email:            [hidden email]
    Phone:            +44-1792-824568 / sip:[hidden email]

Support contacts:
    Email:            [hidden email]
    Phone:            +44-1792-825748 / sip:[hidden email]
_______________________________________________
squid-users mailing list
[hidden email]
http://lists.squid-cache.org/listinfo/squid-users

Re: SSL bump memory leak

Amos Jeffries
Administrator
On 24/02/2016 4:31 a.m., Steve Hill wrote:
>
> There are also a very small number of lines that look something like:
>     /C=US/ST=California/L=San Francisco/O=Wikimedia Foundation, Inc./CN=*.wikipedia.org+Sign=signTrusted+SignHash=SHA256
> I think the "+Sign=signTrusted+SignHash=SHA256" part would indicate that
> this is a Squid database key, which is very confusing since with the
> certificate cache disabled I wouldn't expect to see these at all.
>

NP: That's just the caching for re-use being disabled. If they are being
used at all then they should still be generated.

And a leak (real or pseudo) means they are still hanging around in
memory for some reason other than cert-cache references (being in the
cache by definition is not-leaking). For example as part of active TLS
sessions when the core was produced.

Amos


Re: SSL bump memory leak

Steve Hill
On 23/02/16 17:30, Amos Jeffries wrote:

> And a leak (real or pseudo) means they are still hanging around in
> memory for some reason other than cert-cache references (being in the
> cache by definition is not-leaking). For example as part of active TLS
> sessions when the core was produced.

Seems pretty unlikely that there were over 130 thousand active TLS
sessions in just one of 2 worker threads at the time the core was generated.

I'm seeing Squid processes continually increase to many gigabytes in
size before I have to restart them to avoid the servers ending up deep
in swap.  If this was just things held during "active sessions" I would
expect to see the memory freed up again over night when there isn't much
traffic - I see no such reduction in memory usage.
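
One way to check whether a worker's memory ever falls during quiet periods is to log its resident set size at intervals. The sketch below is an assumption-laden example rather than anything used in this thread: it assumes a Linux /proc filesystem, and the names parse_rss_kb and sample_rss are invented for illustration.

```python
import re
import time

# /proc/<pid>/status reports resident memory on a line like "VmRSS:  1234 kB".
VMRSS = re.compile(r"^VmRSS:\s+(\d+)\s+kB$", re.MULTILINE)


def parse_rss_kb(status_text):
    """Extract the resident set size in kB from /proc/<pid>/status text."""
    match = VMRSS.search(status_text)
    return int(match.group(1)) if match else None


def sample_rss(pid, interval_s=300, samples=12):
    """Yield (unix_time, rss_kb) pairs for one process at a fixed interval."""
    for i in range(samples):
        with open(f"/proc/{pid}/status") as fh:
            yield time.time(), parse_rss_kb(fh.read())
        if i < samples - 1:
            time.sleep(interval_s)
```

If the logged values only ever rise across a night of low traffic, that would fit a leak better than memory tied to active sessions.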


Re: SSL bump memory leak

Amos Jeffries
Administrator
On 24/02/2016 10:08 a.m., Steve Hill wrote:

> On 23/02/16 17:30, Amos Jeffries wrote:
>
>> And a leak (real or pseudo) means they are still hanging around in
>> memory for some reason other than cert-cache references (being in the
>> cache by definition is not-leaking). For example as part of active TLS
>> sessions when the core was produced.
>
> Seems pretty unlikely that there were over 130 thousand active TLS
> sessions in just one of 2 worker threads at the time the core was generated.
>
> I'm seeing Squid processes continually increase to many gigabytes in
> size before I have to restart them to avoid the servers ending up deep
> in swap.  If this was just things held during "active sessions" I would
> expect to see the memory freed up again over night when there isn't much
> traffic - I see no such reduction in memory usage.
>

Ah, you said "a small number" of wiki cert strings with those details. I
took that as meaning a small number of definitely squid generated ones
amidst the 130K indeterminate ones leaking.

Amos


Re: SSL bump memory leak

Steve Hill
On 23/02/16 21:28, Amos Jeffries wrote:

> Ah, you said "a small number" of wiki cert strings with those details. I
> took that as meaning a small number of definitely squid generated ones
> amidst the 130K indeterminate ones leaking.

Ah, a misunderstanding on my part - sorry.  Yes, there were 302 strings
containing "signTrusted" (77 of them unique), all of them appear to be
server certificates (i.e. with a CN containing a domain name), so it is
possibly reasonable to assume that they were for in-progress sessions
and would therefore be cleaned up.

This leaves around 131297 other subject/issuer strings (581 unique)
which, to my mind, can't be explained by anything other than a leak
(whether that be a "real" leak where the pointers have been discarded
without freeing the data, or a "pseudo" leak caused by references to
them being held forever).
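
To illustrate what a "pseudo" leak means here, the toy Python sketch below shows an object staying alive only because a long-lived container still references it. Cert, run_session and held_forever are invented names and have nothing to do with Squid's or OpenSSL's actual types. A "real" leak (a pointer discarded without freeing) cannot be shown this way, since Python reclaims unreachable objects itself.

```python
import gc
import weakref


class Cert:
    """Stand-in for a parsed certificate; not a Squid or OpenSSL type."""

    def __init__(self, subject):
        self.subject = subject


def run_session(subject, registry=None):
    """Create a certificate for one 'session' and return a weak reference to it.

    If a registry is given, the session also stores a strong reference
    there and never removes it, which models the pseudo leak.
    """
    cert = Cert(subject)
    if registry is not None:
        registry.append(cert)
    return weakref.ref(cert)


# Normal case: nothing outlives the session, so the object is reclaimed.
freed = run_session("/CN=example.com")
gc.collect()

# Pseudo leak: a long-lived container keeps every certificate reachable.
held_forever = []
stuck = run_session("/CN=example.com", held_forever)
gc.collect()

print(freed() is None, stuck() is not None)
```

In both kinds of leak the process size keeps growing; the difference only matters when hunting for the cause, since a pseudo leak leaves a reference somewhere that can in principle be found.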

The SslBump wiki page (http://wiki.squid-cache.org/Features/SslBump)
says that the SSL context used for talking to servers is wiped on
reconfigure, and from what I've seen in the code it looks like this
should still be true.  However, a reconfigure doesn't seem to help in
this case, so my assumption is that this data is not part of that SSL
context.  I'm not sure where else all of this data could be from though.

As much of the data seem to be intermediate and root CA certificates, it
is presumably being collected from web servers, rather than being
generated locally.  Of the 131K strings not containing "signTrusted",
only 2760 of them appear to be server certificates (86 unique), so it
seems to me that the rest of the data are probably the intermediate
certificate chains from web servers that Squid has connected to.
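
The split described above (Squid-generated, server, and CA strings) could be approximated with a sketch like the following. The host name test is a heuristic assumed for illustration, not the method actually used for the figures in this post, and classify and tally are made-up names.

```python
import re
from collections import Counter

# Assumption: a CN that looks like a host name (optionally a wildcard)
# marks a server certificate; anything else is treated as a CA name.
HOSTNAME_CN = re.compile(r"/CN=(\*\.)?[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+(\+|/|$)")


def classify(dn):
    """Label one subject/issuer string as 'generated', 'server' or 'ca'."""
    if "+Sign=" in dn:
        return "generated"  # carries the signing parameters seen in the thread
    if HOSTNAME_CN.search(dn):
        return "server"
    return "ca"


def tally(dns):
    """Count how many strings fall into each class."""
    return Counter(classify(dn) for dn in dns)
```

A CA whose CN happens to look like a host name would be misfiled as "server", so the counts from a heuristic like this are only approximate.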

It looks like there were also over 400K bumped requests split across 2
workers, so although 131K certificates is a massive amount of "leaked"
data, I don't think we are leaking on every connection.  Coupled with
the fact that I can't seem to reproduce this in a test environment, this
suggests that there is something a little abnormal going on to trigger
the leak.  Also bear in mind that a single certificate will show up as 2
separate strings, since it has both a subject and an issuer, so we're
probably actually talking about around 65K certificates.
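
The arithmetic behind these figures, written out as a small Python check. All inputs are the numbers quoted in this thread; the average-copies figure at the end is an extra derived value, not something stated in the posts.

```python
# Figures quoted in the thread (one worker's core dump).
total_strings = 131599       # all subject/issuer-like strings
unique_strings = 658
sign_trusted = 302           # strings containing "signTrusted"
sign_trusted_unique = 77
server_cert_strings = 2760   # non-signTrusted strings with a host name CN

other_strings = total_strings - sign_trusted            # 131297
other_unique = unique_strings - sign_trusted_unique     # 581
chain_strings = other_strings - server_cert_strings     # 128537

# Each certificate contributes a subject and an issuer string.
approx_certs = other_strings // 2                       # 65648

# Derived here, not quoted in the thread: average copies per unique string.
avg_copies = other_strings / other_unique

print(other_strings, other_unique, chain_strings, approx_certs, round(avg_copies))
```

On these numbers each distinct string appears a couple of hundred times on average, which is consistent with the same CA chains being retained over and over.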


Re: SSL bump memory leak

Amos Jeffries
Administrator
On 24/02/2016 11:17 p.m., Steve Hill wrote:

> On 23/02/16 21:28, Amos Jeffries wrote:
>
>> Ah, you said "a small number" of wiki cert strings with those details. I
>> took that as meaning a small number of definitely squid generated ones
>> amidst the 130K indeterminate ones leaking.
>
> Ah, a misunderstanding on my part - sorry.  Yes, there were 302 strings
> containing "signTrusted" (77 of them unique), all of them appear to be
> server certificates (i.e. with a CN containing a domain name), so it is
> possibly reasonable to assume that they were for in-progress sessions
> and would therefore be cleaned up.
>
> This leaves around 131297 other subject/issuer strings (581 unique)
> which, to my mind, can't be explained by anything other than a leak
> (whether that be a "real" leak where the pointers have been discarded
> without freeing the data, or a "pseudo" leak caused by references to
> them being held forever).
>

I agree it's almost certainly a leak.

Christos and William L. have been fixing some leaks in the Squid-4 cert
generator non-caching configs recently. I'm not sure yet if that is
applicable to 3.5 or not, but from the sounds of this it very well could
be the same thing.
Unfortunately the code is quite a bit different in this area now, so the
patches won't directly port. I think you had best get in touch with
Christos about this.

Amos


Re: SSL bump memory leak

djch
I’m just catching up with this one, but we’ve observed some memory leaks on a small percentage of our boxes, which we migrated to Peek & Splice late last year.

We’re on 3.5.13, about to move to 3.5.15.

What’s the least disruptive way to keep this under control, if there is one?

Is there anything I can do to help get it patched?


Re: SSL bump memory leak

Amos Jeffries
Administrator
On 25/02/2016 11:44 a.m., Dan Charlesworth wrote:
> I’m just catching up with this one, but we’ve observed some memory leaks on a small percentage of our boxes, which we migrated to Peek & Splice late last year.
>
> We’re on 3.5.13, about to move to 3.5.15.
>
> What’s the least disruptive way to keep this under control, if there is one?

I suspect using ttl=1 on the helper cache options, instead of disabling
the cache with =0 or 0MB (I'm not clear on the exact config for this
myself). The issue seemed to be worst when the add-to-cache action did
not add to the cache and just dropped things into the ether. Adding and
then almost immediately replacing an entry should go through the delete
code AFAIK.

>
> Is there anything I can do to help get it patched?
>

Christos would be the one to ask about that. I suspect sponsorship for
the backporting time and/or testing is what will be needed.

Amos
