tuning squid memory (aka avoiding the reaper)

Aaron Turner
So I'm testing squid 3.5.26 on an m3.xlarge w/ 14GB of RAM.  Squid is
the only "real" service running (besides sshd and the like).  I'm running 4
workers and 2 rock caches.  The workers seem to be growing unbounded
and, given ~30min or so, will cause the kernel to start killing off
processes until memory is freed.  Yes, my clients (32 of them) are
hitting this at about 250 URLs/min, which doesn't seem that crazy, but
¯\_(ツ)_/¯

cache_mem 1 GB resulted in workers exceeding 4GB resident.  So I tried
500 MB, same problem.  Now I'm down to 250 MB and I'm still seeing
workers using 3-4GB of RAM after a few minutes and still growing which
is surprising since the docs indicate I should expect total memory to
be roughly 3x cache_mem.

mgr:info reports:

Squid Object Cache: Version 3.5.26
Build Info:
Service Name: squid
Start Time: Mon, 25 Sep 2017 22:53:22 GMT
Current Time: Mon, 25 Sep 2017 23:15:21 GMT
Connection information for squid:
Number of clients accessing cache: 12
Number of HTTP requests received: 568290
Number of ICP messages received: 0
Number of ICP messages sent: 0
Number of queued ICP replies: 0
Number of HTCP messages received: 0
Number of HTCP messages sent: 0
Request failure ratio: 0.00
Average HTTP requests per minute since start: 25851.6
Average ICP messages per minute since start: 0.0
Select loop called: 10686802 times, 2.009 ms avg
Cache information for squid:
Hits as % of all requests: 5min: 7.5%, 60min: 9.8%
Hits as % of bytes sent: 5min: 12.3%, 60min: 17.3%
Memory hits as % of hit requests: 5min: 36.2%, 60min: 40.0%
Disk hits as % of hit requests: 5min: 27.7%, 60min: 25.5%
Storage Swap size: 4481632 KB
Storage Swap capacity: 4.3% used, 95.7% free
Storage Mem size: 254656 KB
Storage Mem capacity: 99.5% used,  0.5% free
Mean Object Size: 26.30 KB
Requests given to unlinkd: 0
Median Service Times (seconds)  5 min    60 min:
HTTP Requests (All):   0.00800  0.00569
Cache Misses:          0.03766  0.04489
Cache Hits:            0.00030  0.00000
Near Hits:             0.05364  0.07135
Not-Modified Replies:  0.00236  0.00168
DNS Lookups:           0.04438  0.04540
ICP Queries:           0.00000  0.00000
Resource usage for squid:
UP Time: 1319.019 seconds
CPU Time: 1617.476 seconds
CPU Usage: 122.63%
CPU Usage, 5 minute avg: 107.50%
CPU Usage, 60 minute avg: 124.63%
Maximum Resident Size: 60715904 KB
Page faults with physical i/o: 8
Memory accounted for:
Total accounted:        44968 KB
memPoolAlloc calls: 158302074
memPoolFree calls:  159053228
File descriptor usage for squid:
Maximum number of file descriptors:   98304
Largest file desc currently in use:   1891
Number of file desc currently in use: 3423
Files queued for open:                   0
Available number of file descriptors: 94881
Reserved number of file descriptors:   600
Store Disk files open:                   8
Internal Data Structures:
 1702 StoreEntries
  454 StoreEntries with MemObjects
 4475 Hot Object Cache Items
170394 on-disk objects

top when things go bad:
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
21450 squid     20   0 5379004 3.984g 619260 D  28.5 27.9   8:54.37 (squid-4) -f /etc/squid/squid.conf
21451 squid     20   0 5253164 3.850g 601248 D  26.6 27.0   7:46.71 (squid-3) -f /etc/squid/squid.conf
21453 squid     20   0 4750376 3.299g 504192 D  17.3 23.1   5:36.19 (squid-1) -f /etc/squid/squid.conf
21452 squid     20   0 4448280 2.994g 482136 D  16.6 21.0   5:43.58 (squid-2) -f /etc/squid/squid.conf
21449 squid     20   0 1292376 358544 346416 D   1.4  2.4   0:45.83 (squid-disk-5) -f /etc/squid/squid.conf
21448 squid     20   0 1292376 356924 344788 D   1.2  2.4   0:46.51 (squid-disk-6) -f /etc/squid/squid.conf
21447 squid     20   0  943584  11624     20 S   0.0  0.1   0:00.19 (squid-coord-7) -f /etc/squid/squid.conf
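
One way to read the top output above is to subtract SHR from RES for each worker, since the shared segments are counted in every process that maps them. The following is a rough Python sketch, not a definitive accounting: it assumes that top prints bare numbers in KiB and `g`-suffixed values in GiB, and that RES minus SHR approximates each worker's private memory. The helper names `to_kib` and `private_gib` are made up for illustration.

```python
def to_kib(field: str) -> float:
    """Convert a top memory field to KiB.

    Assumption: bare numbers are KiB, 'm' means MiB, 'g' means GiB.
    """
    units = {"m": 1024, "g": 1024 ** 2}
    suffix = field[-1].lower()
    if suffix in units:
        return float(field[:-1]) * units[suffix]
    return float(field)


def private_gib(res: str, shr: str) -> float:
    """Approximate private resident memory (RES minus SHR) in GiB."""
    return (to_kib(res) - to_kib(shr)) / 1024 ** 2


# (process name, RES, SHR) copied from the worker rows of the top output.
WORKERS = [
    ("squid-4", "3.984g", "619260"),
    ("squid-3", "3.850g", "601248"),
    ("squid-1", "3.299g", "504192"),
    ("squid-2", "2.994g", "482136"),
]

for name, res, shr in WORKERS:
    print(f"{name}: ~{private_gib(res, shr):.2f} GiB private")

total_private_gib = sum(private_gib(res, shr) for _, res, shr in WORKERS)
# The "roughly 3x cache_mem" expectation from the docs, with cache_mem 250 MB.
expected_gib = 3 * 250 / 1024

print(f"workers, private total: ~{total_private_gib:.2f} GiB")
print(f"3x cache_mem estimate:  ~{expected_gib:.2f} GiB")
```

Under these assumptions the private total of the four workers comes out far above the 3x cache_mem figure, which matches the observation that cache_mem alone does not explain the growth.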


I'm trying to figure out why and how to fix.  One thing I've read
about the cache_mem knob is:

"If circumstances require, this limit will be exceeded.

Specifically, if your incoming request rate requires more than
'cache_mem' of memory to hold in-transit objects, Squid will
exceed this limit to satisfy the new requests.  When the load
decreases, blocks will be freed until the high-water mark is
reached.  Thereafter, blocks will be used to store hot
objects."

Not sure if this is the cause of my problem?  Maybe something else?
The FAQ says to try a different malloc, so I tried recompiling with
--enable-dlmalloc, but that had no impact.


--
Aaron Turner
https://synfin.net/         Twitter: @synfinatic
My father once told me that respect for the truth comes close to being
the basis for all morality.  "Something cannot emerge from nothing,"
he said.  This is profound thinking if you understand how unstable
"the truth" can be.  -- Frank Herbert, Dune
_______________________________________________
squid-users mailing list
[hidden email]
http://lists.squid-cache.org/listinfo/squid-users

Re: tuning squid memory (aka avoiding the reaper)

Alex Rousskov
On 09/25/2017 05:23 PM, Aaron Turner wrote:

> So I'm testing squid 3.5.26 on an m3.xlarge w/ 14GB of RAM.  Squid is
> the only "real" service running (sshd and the like).  I'm running 4
> workers, and 2 rock cache.  The workers seem to be growing unbounded
> and given ~30min or so will cause the kernel to start killing off
> processes until memory is freed.  Yes, my clients (32 of them) are
> hitting this at about 250 URL's/min which doesn't seem that crazy, but
> ¯\_(ツ)_/¯
>
> cache_mem 1 GB resulted in workers exceeding 4GB resident.  So I tried
> 500 MB, same problem.  Now I'm down to 250 MB and I'm still seeing
> workers using 3-4GB of RAM after a few minutes and still growing

It is not the Squid memory cache that consumes your RAM, apparently.


> the docs indicate I should expect total memory to be roughly 3x cache_mem.

... which is an absurd formula for those using disk caches: Roughly
speaking, most large busy Squids spend most of their RAM on

* memory cache,
* disk cache indexes,
* SSL-related caches, and
* in-flight transactions.

Only one of those 4 components is proportional to cache_mem, with a
coefficient closer to 1 than to 3.


> mgr:info reports:

Thank you for posting this useful info. When you are using disk caching,
please also include the mgr:storedir report.


> I'm trying to figure out why and how to fix.

I recommend disabling all caching (memory and disk) and SslBump (if any)
to establish a baseline first. If everything looks stable and peachy for
a few hours, record/store the baseline measurements, and add one new
memory consumer (e.g., the memory cache). Ideally, this testing should
be done in a lab rather than on real users, but YMMV.
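
To make "stable" measurable during such a baseline run, the resident size of each worker can be sampled at a fixed interval and checked for a trend. A minimal Python sketch follows, assuming a Linux host where `/proc/<pid>/status` exposes a `VmRSS` line; the function names and the idea of fitting a least-squares slope are illustrative choices, not something Squid provides.

```python
import re
from pathlib import Path


def parse_vm_rss_kb(status_text: str) -> int:
    """Extract the VmRSS value (in kB, as the kernel reports it) from /proc/<pid>/status text."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    if match is None:
        raise ValueError("no VmRSS line found")
    return int(match.group(1))


def read_rss_kb(pid: int) -> int:
    """Read the current resident size of a process (Linux only)."""
    return parse_vm_rss_kb(Path(f"/proc/{pid}/status").read_text())


def slope_per_sample(samples: list) -> float:
    """Least-squares slope of evenly spaced samples (units per sampling interval)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(samples))
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    return numerator / denominator
```

A slope that stays near zero over a few hours of samples suggests a usable baseline; a clearly positive slope after adding one memory consumer points at that consumer.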


> One thing I've read about the cache_mem knob is:
>
> "If circumstances require, this limit will be exceeded.
>
> Specifically, if your incoming request rate requires more than
> 'cache_mem' of memory to hold in-transit objects, Squid will
> exceed this limit to satisfy the new requests.  When the load
> decreases, blocks will be freed until the high-water mark is
> reached.  Thereafter, blocks will be used to store hot
> objects."

The above is more-or-less accurate, but please note that in-transit
objects do not usually eat memory cache RAM in SMP mode. It is usually
best to think of in-flight transactions as a distinct SMP memory
consumer IMO.


> Not sure if this is the cause of my problem?

It could be -- it is difficult for me to say by looking at one random
mgr:info snapshot. If I have to guess based on that snapshot alone, then
my answer would be "no" because you have less than 4K concurrent
transactions and transaction response times are low. Hopefully somebody
else on the list can tell you more.



> The FAQ says try a different malloc, so tried recompiling with
> --enable-dlmalloc, but that had no impact.

Do not bother unless your deployment environment is very unusual. This
hint was helpful 20 years ago, but is rarely relevant these days AFAIK.
See above for a different attack plan.


HTH,

Alex.

Re: tuning squid memory (aka avoiding the reaper)

Aaron Turner
Ok, so did some research and what I'm finding is that:

If I set sslflags=NO_DEFAULT_CA for http_port and disable both mem and
disk cache then memory is very stable.  It goes up for a little bit
and then pretty much stabilizes (it actually goes up and down a
little, but doesn't seem to be growing or trending up).

I then enabled memory cache (10GB worth) and ran that for a while.  As
the cache filled, memory usage obviously went up.  Once the cache
filled, memory usage continued to increase, but at a slower rate.
Unlike before, it doesn't seem to stabilize.  I'm seeing memory usage
increase in top (virtual, resident & shared) as well as in mgr:info's
"Total accounted" line.  It's not growing as fast as before, when I didn't
have the sslflags option, but it is growing.

What other information would be useful to debug this?

Re: tuning squid memory (aka avoiding the reaper)

Amos Jeffries
Administrator
On 29/09/17 09:19, Aaron Turner wrote:

> Ok, so did some research and what I'm finding is that:
>
> If I set sslflags=NO_DEFAULT_CA for http_port and disable both mem and
> disk cache then memory is very stable.  It goes up for a little bit
> and then pretty much stabilizes (it actually goes up and down a
> little, but doesn't seem to be growing or trending up).
>
> I then enabled memory cache (10GB worth) and ran that for a while.  As
> the cache filled, memory usage obviously went up.  Once the cache
> filled, memory usage continued to increase, but at a slower rate.
> Unlike before, it doesn't seem to stabilize.  I'm seeing memory usage
> increase in top (virtual, resident & shared) as well as in mgr:info's
> "Total accounted" line.  It's not growing as fast before when I didn't
> have the sslflags option, but it is growing.
>
> What other information would be useful to debug this?
>

Since the accounted total is growing, the mgr:mem report should contain some
clues. It is a TSV table of memory allocations; you may need a few
snapshots of it to see trends.
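
Collecting those snapshots can be scripted. The Python sketch below assumes that `squidclient` is installed and can reach the cache manager with its default settings; the file naming scheme and the function names are made up for illustration.

```python
import subprocess
import time
from pathlib import Path


def fetch_mgr_mem() -> str:
    """Run `squidclient mgr:mem` and return its output.

    Assumption: squidclient can reach the cache manager without extra options.
    """
    result = subprocess.run(
        ["squidclient", "mgr:mem"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def save_snapshot(out_dir, fetch=fetch_mgr_mem, timestamp=None) -> Path:
    """Write one mgr:mem report to a timestamped file and return its path."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S", time.localtime(timestamp))
    path = out_dir / f"mgr_mem_{stamp}.tsv"
    path.write_text(fetch())
    return path


def collect(out_dir, count: int, interval_seconds: float) -> list:
    """Take `count` snapshots, sleeping `interval_seconds` between them."""
    paths = []
    for index in range(count):
        paths.append(save_snapshot(out_dir))
        if index < count - 1:
            time.sleep(interval_seconds)
    return paths
```

For example, `collect("/tmp/mgr_mem", count=8, interval_seconds=1800)` would take eight snapshots half an hour apart (the directory is a placeholder).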

Amos

Re: tuning squid memory (aka avoiding the reaper)

Aaron Turner
OK, I'll work on that.  One other thing is that if I let it run long
enough, squid will crash with errors like the following:

FATAL: Received Bus Error...dying.
2017/09/28 23:28:09 kid4| Closing HTTP port 10.93.3.4:3128
2017/09/28 23:28:09 kid4| Closing HTTP port 127.0.0.1:3128
2017/09/28 23:28:09 kid4| storeDirWriteCleanLogs: Starting...
2017/09/28 23:28:09 kid4|   Finished.  Wrote 0 entries.
2017/09/28 23:28:09 kid4|   Took 0.00 seconds (  0.00 entries/sec).
CPU Usage: 0.541 seconds = 0.466 user + 0.075 sys
Maximum Resident Size: 121440 KB
Page faults with physical i/o: 0

At first I thought the bus error was hardware, but it's happened on
two different EC2 instances now.

Re: tuning squid memory (aka avoiding the reaper)

Amos Jeffries
Administrator
On 29/09/17 12:35, Aaron Turner wrote:

> Ok, i'll work on that.  One other thing, is that if I let it run long
> enough, squid will crash with errors like the following:
>
> FATAL: Received Bus Error...dying.
> 2017/09/28 23:28:09 kid4| Closing HTTP port 10.93.3.4:3128
> 2017/09/28 23:28:09 kid4| Closing HTTP port 127.0.0.1:3128
> 2017/09/28 23:28:09 kid4| storeDirWriteCleanLogs: Starting...
> 2017/09/28 23:28:09 kid4|   Finished.  Wrote 0 entries.
> 2017/09/28 23:28:09 kid4|   Took 0.00 seconds (  0.00 entries/sec).
> CPU Usage: 0.541 seconds = 0.466 user + 0.075 sys
> Maximum Resident Size: 121440 KB
> Page faults with physical i/o: 0
>
> At first I thought the bus error was hardware, but it's happened on
> two different EC2 instances now.
>

Yes, "Bus Error" is definitely hardware. The OS kernel had an error
loading some data from RAM into the CPU or something along those lines.

The only thing we can do about it is check that your Squid is up to
date with the system environment - e.g. a fairly recent Squid version built
with the OS's latest compiler version against its current libc, or whatever
the equivalents of those are for your system. If there are binary-level
issues with the libc interfaces, weird things can happen.

Amos

Re: tuning squid memory (aka avoiding the reaper)

Aaron Turner
So this is smelling like a mem leak to me.  First after running for a few hours:

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 3188 squid     20   0 3586264 3.175g 1.007g R  63.4 22.2 162:59.38 squid
 3187 squid     20   0 2941332 2.585g 1.005g S  45.5 18.1 129:36.40 squid
 3190 squid     20   0 2641828 2.304g 1.001g R  41.4 16.1 109:49.08 squid
 3189 squid     20   0 2524892 2.182g 0.987g S  42.1 15.3 110:30.96 squid

I configured squid w/ 1GB mem cache, no disk cache, ssl bumping and 4
workers.  Looks like they've all pretty much mapped the 1GB which is
what I'd expect.  However, resident memory is clearly quite high
considering my config.

While this was running I was capturing the output of mgr:mem and
started looking at the numbers.  Now I'm not 100% sure I understand the
meanings of all the columns, but I also don't see any indication of
what is using all that resident memory.

I've grabbed a few of the mgr:mem outputs spanning the test and
uploaded them here since I hate sending attachments to lists:

https://synfin.net/misc/watch_share.tar.gz

Re: tuning squid memory (aka avoiding the reaper)

Aaron Turner
One more update before I restart squid:

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 3188 squid     20   0 4821844 4.337g 1.008g R  53.0 30.4 349:31.24 squid
 3187 squid     20   0 3539696 3.153g 1.008g R  31.9 22.1 259:15.31 squid
 3190 squid     20   0 3198228 2.834g 1.008g S  29.2 19.8 230:23.30 squid
 3189 squid     20   0 3033460 2.680g 1.008g R  27.0 18.8 226:17.63 squid

https://synfin.net/misc/mgr_mem_1000.txt
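
The two top snapshots in this thread give a rough growth rate per worker. The Python sketch below assumes that the posting times of the two messages (9:45 AM and 4:57 PM on the same day, per the list archive) are close to when the snapshots were taken, and that top's `g` suffix means GiB; both are assumptions, and the variable names are illustrative.

```python
# RES per PID, in GiB, from the earlier and the later top output.
EARLIER = {3188: 3.175, 3187: 2.585, 3190: 2.304, 3189: 2.182}
LATER = {3188: 4.337, 3187: 3.153, 3190: 2.834, 3189: 2.680}

# Assumption: elapsed time taken from the message posting times,
# 9:45 AM to 4:57 PM, i.e. 7 hours 12 minutes.
ELAPSED_HOURS = 7 + 12 / 60

growth_gib = {pid: LATER[pid] - EARLIER[pid] for pid in EARLIER}
rate_gib_per_hour = {pid: delta / ELAPSED_HOURS for pid, delta in growth_gib.items()}

for pid in sorted(growth_gib):
    print(
        f"PID {pid}: +{growth_gib[pid]:.3f} GiB, "
        f"~{rate_gib_per_hour[pid]:.3f} GiB/hour"
    )

total_growth_gib = sum(growth_gib.values())
total_rate = total_growth_gib / ELAPSED_HOURS
print(f"all workers: +{total_growth_gib:.3f} GiB, ~{total_rate:.3f} GiB/hour")
```

SHR stays at about 1 GiB in both snapshots, so under these assumptions the growth is in the private part of each worker rather than in the shared memory cache.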

Re: tuning squid memory (aka avoiding the reaper)

Aaron Turner
Anyone see anything useful?

Re: tuning squid memory (aka avoiding the reaper)

Amos Jeffries
Administrator
On 03/10/17 04:39, Aaron Turner wrote:
> Anyone see anything useful?

The numbers in those reports all seem reasonable to me. Nothing is
showing up with GB of RAM used.

Amos

Re: tuning squid memory (aka avoiding the reaper)

Aaron Turner
So it's leaking memory and not tracking it?  Clearly 'top' is showing
it is using a lot of memory and growing over time.  I'm happy to do
more tests/etc, but right now I can't go into production with this
memory leak.  Should I try squid4?

Re: tuning squid memory (aka avoiding the reaper)

Alex Rousskov
On 10/02/2017 09:37 PM, Aaron Turner wrote:
> So it's leaking memory and not tracking it?


That combination (or, to be more precise, its implication) is possible
but relatively unlikely in your specific case -- when GBs are leaked,
there is usually something tracked related to those GBs. Please note
that what Squid _tracks_ may not amount to GBs of RAM! For example,
Squid can track a tiny object that is included in every large untracked
leaked object.

A frequent leak often manifests itself in mgr:mem snapshots as a nearly
always increasing counter of alive associated objects. If you take one
snapshot every 30 minutes or so, then you may be able to identify
suspects by comparing same-object alive counters across 5-10 snapshots.
Sorry, I do not have the time to do that for the snapshots you have
shared (and you probably need a different collection of snapshots to
make this search more productive).
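
The comparison described above could be sketched in Python roughly as follows. This is only an illustration, not an existing Squid tool: it assumes each mgr:mem snapshot is tab-separated text with the pool name in the first field, and that you tell it which column holds the in-use (alive) object count. That column index is an assumption you need to check against the header of your own mgr:mem output, since the layout can differ between Squid versions. The function names are made up for this example.

```python
from typing import Dict, List


def parse_snapshot(text: str, count_column: int) -> Dict[str, int]:
    """Map pool name -> integer counter read from one tab-separated column.

    count_column is an assumption supplied by the caller: look at the
    header of your own mgr:mem report to find the in-use count column.
    """
    counts: Dict[str, int] = {}
    for line in text.splitlines():
        fields = [field.strip() for field in line.split("\t")]
        if len(fields) <= count_column or not fields[0]:
            continue
        try:
            counts[fields[0]] = int(fields[count_column])
        except ValueError:
            continue  # header rows and non-numeric cells are skipped
    return counts


def growing_pools(
    snapshots: List[Dict[str, int]],
    allowed_drops: int = 0,
    min_growth: int = 1,
) -> List[str]:
    """Return pools whose counter (nearly) always increases.

    A pool is a suspect if it appears in every snapshot, its counter
    decreases at most allowed_drops times between consecutive snapshots,
    and it grew by at least min_growth from the first to the last one.
    """
    if len(snapshots) < 2:
        return []
    common = set(snapshots[0]).intersection(*snapshots[1:])
    suspects = []
    for pool in sorted(common):
        series = [snapshot[pool] for snapshot in snapshots]
        drops = sum(1 for a, b in zip(series, series[1:]) if b < a)
        if drops <= allowed_drops and series[-1] - series[0] >= min_growth:
            suspects.append(pool)
    return suspects
```

Feeding it 5-10 parsed snapshots in chronological order gives a short list of pool names to look at more closely. Setting allowed_drops to 1 or 2 matches the "nearly always increasing" wording, at the cost of more false positives.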

Alex.



Re: tuning squid memory (aka avoiding the reaper)

Aaron Turner
So more testing.  I haven't found the line in the mgr:info and mgr:mem
snapshots that is the red flag, but additional testing shows that the
memory leak has something to do with SSL bumping.  Once I turn that
off, the leak stops.

These were the SSL-related config options:

http_port 10.0.0.1:3128 ssl-bump generate-host-certificates=on dynamic_cert_mem_cache_size=400MB cert=/etc/squid/ssl_cert/myCA.pem sslflags=NO_DEFAULT_CA
http_port localhost:3128
ssl_bump bump all

sslcrtd_program /usr/lib64/squid/ssl_crtd -s /var/lib/squid/ssl_db -M 4MB
sslcrtd_children 32 startup=2 idle=2
sslproxy_session_cache_size 100 MB
sslproxy_cert_error allow all
sslproxy_flags DONT_VERIFY_PEER


Re: tuning squid memory (aka avoiding the reaper)

Alex Rousskov
On 10/25/2017 10:41 AM, Aaron Turner wrote:
> More testing.  This time with 4.0.21.  Disabled all caching, only
> enabled ssl bumping.  Same config as last time.  Still leaking memory.
> I took two snapshots of info & mem usage and honestly I don't see a
> smoking gun pointing to why my squid processes were getting as large
> as 1.4GB.
>
> I've attached the two files in case someone with more experience can
> find something useful.

I do not have the time to study your snapshots, unfortunately, but I do
continue to recommend that you take a lot more than two snapshots (e.g.
one snapshot every 10 minutes over a few hours of steady load) and then
either find lines with an increasing counter of alive objects OR confirm
that those lines do not exist.
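
Collecting the snapshots on a fixed schedule could look something like the Python sketch below. It assumes the squidclient utility is installed and can reach the proxy with its default host and port; the file naming scheme and the function names are invented for this example, and the fetch and sleep arguments exist only so the loop can be exercised without a running Squid.

```python
import subprocess
import time
from pathlib import Path
from typing import Callable, List


def fetch_mgr_mem() -> str:
    """Return the current mgr:mem report by running squidclient.

    Assumes squidclient is on PATH and its default host/port reach the
    proxy; adjust the command for your own setup.
    """
    result = subprocess.run(
        ["squidclient", "mgr:mem"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def collect_snapshots(
    out_dir: Path,
    count: int,
    interval_seconds: float,
    fetch: Callable[[], str] = fetch_mgr_mem,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Path]:
    """Save count reports into out_dir, pausing between consecutive ones."""
    out_dir.mkdir(parents=True, exist_ok=True)
    paths: List[Path] = []
    for index in range(count):
        if index > 0:
            sleep(interval_seconds)  # no pause before the first snapshot
        path = out_dir / f"mem-{index:03d}.txt"
        path.write_text(fetch())
        paths.append(path)
    return paths
```

For example, collect_snapshots(Path("snapshots"), count=18, interval_seconds=600) would take one snapshot every 10 minutes for roughly three hours. The zero-padded file names keep the snapshots in chronological order when sorted.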

A Perl script at [1] implements the above analysis, but it has not been
updated for a few years, may not work "as is" with the current mgr:mem
output format, and may need additional tuning to work well in your
specific case, so you may be better off starting from scratch.

  [1] http://www.measurement-factory.com/tmp/attachments/memdiff.pl


Cheers,

Alex.


