SQUID memory error after vm.swappiness changed from 60 to 10

Re: SQUID memory error after vm.swappiness changed from 60 to 10

Bike dernikov1
On Mon, Nov 13, 2017 at 4:36 PM, Alex Rousskov
<[hidden email]> wrote:

> On 11/13/2017 02:34 AM, Bike dernikov1 wrote:
>> On Fri, Nov 10, 2017 at 4:43 PM, Alex Rousskov wrote:
>>> Squid swapping in production is an arguably worse disaster, as you have
>>> learned. In many cases, it is better to deal with a lack of swap than to
>>> rely on swap's magical effects that most humans poorly understand. YMMV.
>
>> In this scenario, swap is backup cache (as I understand)?
>
> In this scenario, swap is not a cache! In fact it is pretty much the
> opposite:
>
> * A cache is, by definition, an optional unreliable "fast" storage meant
> to reduce the need to go to some "slow" storage.
>
> * When in active use, swap is required reliable slow storage meant to
> extend fast storage (RAM) capacity.
>
> Do you see how almost every adjective in the first bullet is replaced
> with an antonym in the second one?

Yes, I definitely mixed that up. I meant the right thing but wrote it wrong.

> Some services, including many databases, over allocate RAM to store
> rarely used (computed and/or preloaded) data. When that data is swapped
> out, the service often continues to operate normally because the data is
> rarely accessed (and/or because swapping it in is still cheaper than
> computing it from scratch).
>
> With Squid, it is very difficult for the OS to correctly identify the
> rarely used RAM areas to swap out. When the OS swaps out the wrong area,
> Squid slows down (to access that area), which only increases the number
> of concurrent transactions and, hence, the amount of RAM Squid needs to
> operate, which triggers more wrong swap outs, creating a vicious cycle.

That is why the best solution would be to turn swap off. We will definitely
test swapoff next week, if no new problems come up.
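
Before testing swapoff, it may help to check whether the data currently in swap would fit back into RAM, since turning swap off forces the kernel to bring swapped pages back into memory. The following is a minimal sketch in Python, assuming a Linux host where /proc/meminfo reports MemAvailable, SwapTotal and SwapFree. The function names are hypothetical, and comparing MemAvailable with used swap is only a rough heuristic, not a guarantee.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a {field_name: value} dict (values in KiB)."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Keep only well-formed "Name:   12345 kB" style lines.
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def swapoff_headroom_kib(meminfo):
    """Return MemAvailable minus swap in use, in KiB.

    A negative result suggests the swapped-out data would not fit in RAM.
    This is a heuristic: MemAvailable is itself a kernel estimate.
    """
    swap_used = meminfo["SwapTotal"] - meminfo["SwapFree"]
    return meminfo["MemAvailable"] - swap_used


def read_headroom_kib(path="/proc/meminfo"):
    """Convenience wrapper that reads the live file (Linux only)."""
    with open(path) as f:
        return swapoff_headroom_kib(parse_meminfo(f.read()))
```

Calling read_headroom_kib() on the proxy host shortly before the test gives a quick sanity check; a small or negative value would be a reason to postpone swapoff or to restart Squid first.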

>> Swap could be used to move data back to memory if it is used, but it
>> stays on disk and is purged after some time if not used?
>
> The purging bit is wrong. Think of swap as very very very slow RAM.
>
> Alex.

So, when Squid needs something from swap, it will load that data back into RAM.
As for purging, does the data then stay in swap forever?

Swap as very slow RAM is the best explanation. I will use that metaphor when
mentoring younger coworkers :).
Thanks for explaining the swap memory problems.
_______________________________________________
squid-users mailing list
[hidden email]
http://lists.squid-cache.org/listinfo/squid-users

Re: SQUID memory error after vm.swappiness changed from 60 to 10

Alex Rousskov
On 11/14/2017 08:41 AM, Bike dernikov1 wrote:
> On Mon, Nov 13, 2017 at 4:36 PM, Alex Rousskov wrote:
>> On 11/13/2017 02:34 AM, Bike dernikov1 wrote:
>>> Swap could be used to move data back to memory if it is used, but it
>>> stays on disk and is purged after some time if not used?

>> The purging bit is wrong. Think of swap as very very very slow RAM.

> So, when Squid needs something from swap, it will load that data back into RAM.

In this context, swap is an OS-level concept. Squid does not know that
the OS memory manager has swapped some of Squid data from RAM to disk.
OS does not know what swapped out data means to Squid. When Squid tries
to access data at a swapped out address, the OS blocks the Squid process
and loads the missing data from disk into RAM (usually after swapping
out some RAM bytes to free RAM space for those requested bytes).

Here, "data" essentially means any sequence of bytes allocated by Squid.
For example, some of those swapped out bytes may have nothing to do with
Squid memory cache. Swapped out bytes can even be Squid binary code.


> As for purging, does the data then stay in swap forever?

Swapped out process data stays swapped out until it is either accessed
by the process (and is swapped in by the OS) or the process terminates
(without accessing those swapped out bytes). The latter is unlikely for
Squid data unless the Squid process dies prematurely (i.e., without
doing internal cleanup).
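
Because Squid itself cannot tell which of its bytes are swapped out, the only place to observe this is the OS. As a rough illustration, assuming a Linux kernel that exposes the VmSwap field in /proc/<pid>/status, the Python sketch below reports how much of a given process is currently in swap. The function names and the idea of polling a Squid PID are assumptions made for the example.

```python
def swapped_kib(status_text):
    """Return the VmSwap value in KiB from /proc/<pid>/status text, or None if absent."""
    for line in status_text.splitlines():
        if line.startswith("VmSwap:"):
            # Line looks like: "VmSwap:      51200 kB"
            return int(line.split()[1])
    return None


def read_swapped_kib(pid):
    """Read VmSwap for a live process (Linux only); pid is the Squid process ID."""
    with open(f"/proc/{pid}/status") as f:
        return swapped_kib(f.read())
```

A value that stays above zero while Squid is busy means some Squid pages are on disk, and any access to them will block the process until the OS swaps them back in.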


HTH,

Alex.

Re: SQUID memory error after vm.swappiness changed from 60 to 10

Bike dernikov1
On Tue, Nov 14, 2017 at 6:12 PM, Alex Rousskov
<[hidden email]> wrote:

> On 11/14/2017 08:41 AM, Bike dernikov1 wrote:
>> On Mon, Nov 13, 2017 at 4:36 PM, Alex Rousskov wrote:
>>> On 11/13/2017 02:34 AM, Bike dernikov1 wrote:
>>>> Swap could be used to move data back to memory if it is used, but it
>>>> stays on disk and is purged after some time if not used?
>
>>> The purging bit is wrong. Think of swap as very very very slow RAM.
>
>> So, when Squid needs something from swap, it will load that data back into RAM.
>
> In this context, swap is an OS-level concept. Squid does not know that
> the OS memory manager has swapped some of Squid data from RAM to disk.
> OS does not know what swapped out data means to Squid. When Squid tries
> to access data at a swapped out address, the OS blocks the Squid process
> and loads the missing data from disk into RAM (usually after swapping
> out some RAM bytes to free RAM space for those requested bytes).
>
> Here, "data" essentially means any sequence of bytes allocated by Squid.
> For example, some of those swapped out bytes may have nothing to do with
> Squid memory cache. Swapped out bytes can even be Squid binary code.
>
>
>> As for purging, does the data then stay in swap forever?
>
> Swapped out process data stays swapped out until it is either accessed
> by the process (and is swapped in by the OS) or the process terminates
> (without accessing those swapped out bytes). The latter is unlikely for
> Squid data unless the Squid process dies prematurely (i.e., without
> doing internal cleanup).
>
> HTH,
>
> Alex.

Superb explanation. I think I now understand the process much better.

If I may ask under the same title:
Yesterday we had an error in the logs (syslog, cache.log, dmesg, access.log):

segfault at 8 ip ....... sp ..... error 4 is squid
process pid exited due to signal 11 with status 0

Squid restarted. That was at the end of the workday, and I didn't notice
any change while surfing.
I noticed a change in used memory, then went through the logs and found the segfault.

Can you point me to how to analyze what happened?
Could it be a problem with the kernel?

Thanks for the help.

Re: SQUID memory error after vm.swappiness changed from 60 to 10

Amos Jeffries
Administrator
On 16/11/17 01:32, Bike dernikov1 wrote:

>
> If I may ask under the same title:
> Yesterday we had an error in the logs (syslog, cache.log, dmesg, access.log):
>
> segfault at 8 ip ....... sp ..... error 4 is squid
> process pid exited due to signal 11 with status 0
>
> Squid restarted. That was at the end of the workday, and I didn't notice
> any change while surfing.
> I noticed a change in used memory, then went through the logs and found the segfault.
>
> Can you point me to how to analyze what happened?
> Could it be a problem with the kernel?
>

How to retrieve info about these types of things is detailed at
<https://wiki.squid-cache.org/SquidFaq/BugReporting>.

NP: If you do not have core files enabled, then the data from that
segfault is probably gone irretrievably. You may need to use the script
to capture segfault details from a running proxy (the 'minimal downtime'
section).
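
One quick way to see whether a running process could produce a core file at all is to look at its core-size limit. The sketch below is a hedged Python example, assuming a Linux host where /proc/<pid>/limits contains a "Max core file size" row. The function names are hypothetical, and a non-zero limit is necessary but not sufficient: the dump directory permissions and the kernel's core_pattern setting can still prevent a core file from being written.

```python
CORE_ROW_PREFIX = "Max core file size"


def core_soft_limit(limits_text):
    """Return the soft core-file limit from /proc/<pid>/limits text.

    The result is an int (bytes), the string "unlimited", or None if the
    row is missing.
    """
    for line in limits_text.splitlines():
        if line.startswith(CORE_ROW_PREFIX):
            # Columns after the label are: soft limit, hard limit, units.
            soft = line[len(CORE_ROW_PREFIX):].split()[0]
            return soft if soft == "unlimited" else int(soft)
    return None


def core_dumps_possible(limits_text):
    """True if the soft limit allows a core file of any size."""
    return core_soft_limit(limits_text) not in (0, None)


def read_core_dumps_possible(pid):
    """Check a live process (Linux only); pid is the Squid process ID."""
    with open(f"/proc/{pid}/limits") as f:
        return core_dumps_possible(f.read())
```

If this returns False for the Squid process, a crash like the one reported leaves no core file behind, which matches the "gone irretrievably" case described above.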


Amos

Re: SQUID memory error after vm.swappiness changed from 60 to 10

Bike dernikov1
On Thu, Nov 16, 2017 at 8:58 AM, Amos Jeffries <[hidden email]> wrote:

> On 16/11/17 01:32, Bike dernikov1 wrote:
>>
>>
>> If I may ask under the same title:
>> Yesterday we had an error in the logs (syslog, cache.log, dmesg, access.log):
>>
>> segfault at 8 ip ....... sp ..... error 4 is squid
>> process pid exited due to signal 11 with status 0
>>
>> Squid restarted. That was at the end of the workday, and I didn't notice
>> any change while surfing.
>> I noticed a change in used memory, then went through the logs and found
>> the segfault.
>>
>> Can you point me to how to analyze what happened?
>> Could it be a problem with the kernel?
>>
>
> How to retrieve info about these types of things is detailed at
> <https://wiki.squid-cache.org/SquidFaq/BugReporting>.

I wasn't sure it was a bug, so I didn't want to report it as one.
Now that you confirm it can be a bug, I will prepare to retrieve the
information.
I just hope the bug won't happen under high load in the middle of the working day.


> NP: If you do not have core files enabled, then the data from that segfault
> is probably gone irretrievably. You may need to use the script to capture
> segfault details from a running proxy (the 'minimal downtime' section).

I am sure that I didn't enable it.

>
> Amos
>

Thanks for the help.

Re: SQUID memory error after vm.swappiness changed from 60 to 10

Amos Jeffries
Administrator
On 17/11/17 03:49, Bike dernikov1 wrote:

> On Thu, Nov 16, 2017 at 8:58 AM, Amos Jeffries wrote:
>> On 16/11/17 01:32, Bike dernikov1 wrote:
>>>
>>>
>>> If I may ask under the same title:
>>> Yesterday we had an error in the logs (syslog, cache.log, dmesg, access.log):
>>>
>>> segfault at 8 ip ....... sp ..... error 4 is squid
>>> process pid exited due to signal 11 with status 0
>>>
>>> Squid restarted. That was at the end of the workday, and I didn't notice
>>> any change while surfing.
>>> I noticed a change in used memory, then went through the logs and found
>>> the segfault.
>>>
>>> Can you point me to how to analyze what happened?
>>> Could it be a problem with the kernel?
>>>
>>
>> How to retrieve info about these types of things is detailed at
>> <https://wiki.squid-cache.org/SquidFaq/BugReporting>.
>
> I wasn't sure it was a bug, so I didn't want to report it as one.
> Now that you confirm it can be a bug, I will prepare to retrieve the
> information.
> I just hope the bug won't happen under high load in the middle of the working day.
>

The how-tos are on that page only because those details are mandatory if
you are reporting that kind of bug. You don't have to be reporting a bug
to use the techniques.

That said, a segfault is almost always a bug, though it could be a bug in
the system environment or hardware rather than in Squid. The details you
get from looking at the traces should indicate whether that is actually
the case.
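
Even without a core file, the kernel log line itself carries a little information. On x86 and x86-64 Linux kernels, the number after "error" in a segfault line is the hardware page-fault error code, a small bit field. The Python sketch below decodes it under that assumption; the function and constant names are hypothetical and only the low bits are handled.

```python
# Low bits of the x86 page-fault error code as printed by the kernel.
PF_PROT = 0x1    # set: protection violation; clear: page not present
PF_WRITE = 0x2   # set: write access; clear: read access
PF_USER = 0x4    # set: fault happened in user mode
PF_INSTR = 0x10  # set: fault was an instruction fetch


def decode_segfault_error(code):
    """Translate a page-fault error code into a small descriptive dict."""
    return {
        "cause": "protection violation" if code & PF_PROT else "page not present",
        "access": "write" if code & PF_WRITE else "read",
        "mode": "user" if code & PF_USER else "kernel",
        "instruction_fetch": bool(code & PF_INSTR),
    }


def looks_like_null_deref(address, limit=4096):
    """True if the faulting address falls inside the first page of memory."""
    return 0 <= address < limit
```

Applied to the values reported in this thread (address 8, error 4), this reads as a user-mode read of a page that is not mapped, at an address just above zero. That pattern is commonly seen when code dereferences a null pointer plus a small offset, but it does not by itself say whether Squid, a library, or the environment is at fault; the stack trace is still needed for that.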


>
>> NP: If you do not have core files enabled, then the data from that segfault
>> is probably gone irretrievably. You may need to use the script to capture
>> segfault details from a running proxy (the 'minimal downtime' section).
>
> I am sure that I didn't enable it.
>

Okay, then you will need to enable it for further diagnosis.

Amos

Re: SQUID memory error after vm.swappiness changed from 60 to 10

Bike dernikov1
On Fri, Nov 17, 2017 at 3:53 AM, Amos Jeffries <[hidden email]> wrote:

> On 17/11/17 03:49, Bike dernikov1 wrote:
>>
>> On Thu, Nov 16, 2017 at 8:58 AM, Amos Jeffries wrote:
>>>
>>> On 16/11/17 01:32, Bike dernikov1 wrote:
>>>>
>>>>
>>>>
>>>> If I may ask under the same title:
>>>> Yesterday we had an error in the logs (syslog, cache.log, dmesg, access.log):
>>>>
>>>> segfault at 8 ip ....... sp ..... error 4 is squid
>>>> process pid exited due to signal 11 with status 0
>>>>
>>>> Squid restarted. That was at the end of the workday, and I didn't notice
>>>> any change while surfing.
>>>> I noticed a change in used memory, then went through the logs and found
>>>> the segfault.
>>>>
>>>> Can you point me to how to analyze what happened?
>>>> Could it be a problem with the kernel?
>>>>
>>>
>>> How to retrieve info about these types of things is detailed at
>>> <https://wiki.squid-cache.org/SquidFaq/BugReporting>.
>>
>>
>> I wasn't sure it was a bug, so I didn't want to report it as one.
>> Now that you confirm it can be a bug, I will prepare to retrieve the
>> information.
>> I just hope the bug won't happen under high load in the middle of the working day.
>>
>
> The how-tos are on that page only because those details are mandatory if you
> are reporting that kind of bug. You don't have to be reporting a bug to use
> the techniques.
>
> That said, a segfault is almost always a bug, though it could be a bug in the
> system environment or hardware rather than in Squid. The details you get from
> looking at the traces should indicate whether that is actually the case.

In the beginning we had many crashes, and we thought we had a hardware bug.
We had two different servers, a Fujitsu RX600 and an IBM X3550M3. We were
testing Squid on CentOS and Debian.
Debian won because of its newer squidGuard version, in which
authorization with LDAP works.
The first upgrade to Debian 9 (stable) broke the installation on the Fujitsu; it
couldn't boot with the new kernel.
The same Debian worked on the IBM X3550M3, so testing was a nightmare.
We returned to the stable kernel, and the problems disappeared until now.
Still, there has been only one segfault so far in 3 days.

>>
>>> NP: If you do not have core files enabled, then the data from that
>>> segfault
>>> is probably gone irretrievably. You may need to use the script to capture
>>> segfault details from a running proxy (the 'minimal downtime' section).
>>
>>
>> I am sure that I didn't enable it.
>>
>
> Okay, then you will need to enable it for further diagnosis.

Starting Monday we will begin the reconfiguration. Each day brings a new problem.
The migration has slowed to a stop :(

Today we had a different problem (exhausted inodes). The logs exploded
with "no space on disk" errors (while the disk was 60% free).
Luckily, we found the cause (sarg and script-generated
reports) in under 5 minutes.
We lost half a day rewriting the scripts. I hope we have solved that
problem for good :).
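
A "no space" error on a disk with plenty of free blocks is the typical signature of inode exhaustion, so it can be worth watching inode usage alongside block usage. The following Python sketch uses os.statvfs for that; the function names and any threshold you might alert on are assumptions for illustration.

```python
import os


def usage_percent(total, free):
    """Return the used share in percent, or None if the filesystem reports no total."""
    if total == 0:
        return None
    return 100.0 * (total - free) / total


def disk_report(path):
    """Return block and inode usage for the filesystem containing path."""
    st = os.statvfs(path)
    return {
        "blocks_used_pct": usage_percent(st.f_blocks, st.f_bfree),
        "inodes_used_pct": usage_percent(st.f_files, st.f_ffree),
    }
```

Running disk_report() on the directory where the reports are generated would show a high inodes_used_pct while blocks_used_pct stays moderate, which is the situation described above.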

> Amos

We couldn't have done it without your help.
Thanks a lot.