Cache digest vs ICP


Cache digest vs ICP

Veiko Kukk-2
Hi,

We have a cluster of squids in reverse proxy mode. Each one of those is a sibling to the others, and they all have the same origin servers as parents. Siblings are configured with the no-proxy keyword to ensure they don't cache what other siblings already have in their caches. This is to minimize data usage costs from origin servers. What is in our cluster should never be fetched again from origin, because it never changes. It's not a typical web cache; it's a CDN system for content that we create and control, and squid is just one of the internal parts, not exposed directly to the clients.

So far, digest_generation has been set to off and only ICP has been used between siblings, mostly because digest stats had shown many rejects (the digest did not contain 100% of cache objects) and the documentation about digests is confusing, up to statements that squid will stop serving requests while rebuilding the digest.

Since we need to have more siblings, located farther away from each other, ICP overhead (time spent on each query) becomes an issue. Having a proper digest that includes all of the objects in cache_dir could be a better solution, since it avoids the delay of the initial ICP request. The digest documentation states that inclusion is based on refresh_pattern. That's a problem, because to get squid working as we want, we had to set offline_mode on.

Questions:
* What is the relationship between cache digests and ICP? If they are active together, how are they used together?
* How are objects added to the digest when rebuilding? Does this involve a lot of disk I/O, like scanning all cache_dir files, or is it based on swap.state contents?
* How can I see which objects are listed in a cache digest?
* Why does a sibling false positive result in sending the client a 504 instead of trying the next sibling or parent? CD_SIBLING_HIT/192.168.1.52  TCP_MISS/504. How can I make squid proceed with the next cache_peer?

Best regards,
Veiko





_______________________________________________
squid-users mailing list
[hidden email]
http://lists.squid-cache.org/listinfo/squid-users

Re: Cache digest vs ICP

Alex Rousskov
On 09/27/2017 03:46 AM, Veiko Kukk wrote:

> Siblings are configured with no-proxy keyword to achieve that they don't
> cache what other siblings already have in their cache.

I assume that by "no-proxy" you meant "proxy-only".


> This is to minimize data usage costs from origin servers.

The proxy-only option does not minimize the amount of data transmitted
between a proxy and the origin server. It reduces cache duplication
among cache peers.
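For concreteness, a minimal sketch of the sibling arrangement under discussion (hostnames, ports, and option choices are illustrative, not taken from this thread):

```
# Hypothetical reverse-proxy sibling setup: proxy-only stops this squid
# from caching objects it fetched from a sibling, reducing duplication
# across the cluster (it does not reduce origin traffic by itself).
cache_peer origin.example parent 80 0 no-query originserver
cache_peer sibling2.example sibling 3128 3130 proxy-only
cache_peer sibling3.example sibling 3128 3130 proxy-only
```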


> So far digest_generation has been set to off and only ICP has been used
> between siblings. Mostly because digest stats had shown many rejects
> (not containing 100% of cache objects) and documentation about digests
> is confusing up to statements that while rebuilding digest, squid will
> stop serving requests.

Please point me to the location of that statement. IMHO, it is not
confusing but incorrect. Non-SMP Squid stops servicing requests while
rebuilding a cache digest _chunk_, not the entire digest (unless the
digest is configured to have only one chunk, of course). The size of the
chunk is controlled by digest_rebuild_chunk_percentage.

Please note that non-SMP Squid stops servicing other requests when doing
virtually anything -- Squid is not threaded. The reason cache digests
are somewhat "special" in this context is that rebuilding the entire
digest may take a long time for large caches. Squid combats that by
splitting the digest rebuild process into chunks (a misleading term!),
digesting at most digest_rebuild_chunk_percentage of cached objects at a
time.
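In configuration terms, the chunk size Alex mentions is tuned with a single directive; the value shown is the documented default, for illustration only:

```
# Each blocking rebuild step digests at most this percentage of the
# cache index before Squid resumes serving requests.
digest_rebuild_chunk_percentage 10
```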

Cache Digests are not SMP aware (but should be). You may be able to work
around that limitation using SMP macros, but I have not tested that. I
do not remember whether a worker that is not configured to generate a
digest will still look it up in the cache when a peer asks for it.
Hopefully, the worker will do that lookup.


> Digest
> documentation states that it's including based on refresh_pattern. It's
> a problem because to get squid working as we want, we had to use
> offline_mode on.

If Cache Digests do not honor offline_mode, it is a (staleness
estimation code) bug that should be reported and fixed.

Meanwhile, does refresh_pattern stop working when offline_mode is on? If
not, then can you use refresh_pattern to emulate offline_mode effects
while still using offline_mode?
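One hedged guess at such a refresh_pattern emulation, untested for this use case (the regex, values, and option choices are illustrative): mark every cached object fresh for a very long time so the digest includes it all.

```
# Illustrative only: treat every cached object as fresh for ~1 year,
# overriding origin hints that would otherwise mark it stale.
refresh_pattern . 525600 100% 525600 override-expire override-lastmod ignore-reload
```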


> * What is the relationship between cache digests and ICP?

IIRC, none, except the former is checked before the latter.


> If they are active together, how are they used together?

I have not tested this, but Cache Digests ought to be checked first, and
if they miss, then Squid should proceed to ICP/HTCP/etc. AFAICT, a Cache
Digest miss has no effect on other peer selection algorithms.


> * How are objects added to the digest when rebuilding? Does this involve a
> lot of disk I/O, like scanning all cache_dir files, or is it based on
> swap.state contents?

Objects are digested based on the in-RAM cache index. There is no disk
I/O involved until the built digest needs to be stored on disk.


> * How can i see which objects are listed in cache digest?

A Cache Digest does not list/store object URLs -- it cannot produce a
list of previously digested objects. The only way to find out whether
object X was digested (with some degree of certainty) is to query the
digest for that object X.

I am not aware of any command-line interface for interrogating digests,
but it is certainly possible to build one. Please note that Squid
includes both the URL and the method into the object cache key (which is
what ends up being hashed by the Cache Digests code).
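Alex's point that a digest cannot list its contents follows from the data structure: a Cache Digest is essentially a Bloom filter over hashed cache keys. A minimal sketch in Python (the class, hash choice, and sizes are illustrative, not Squid's actual code):

```python
import hashlib

BITS = 1 << 14          # bit capacity of the toy digest
BITS_PER_ENTRY = 5      # Squid's default bits-per-entry

class ToyDigest:
    """Toy model of a Cache Digest: a Bloom filter over cache keys."""

    def __init__(self):
        self.bits = bytearray(BITS // 8)

    def _positions(self, method, url):
        # Squid hashes the cache key, which combines method and URL;
        # salted MD5 stands in for its hashing scheme here.
        for i in range(BITS_PER_ENTRY):
            h = hashlib.md5(f"{i}:{method} {url}".encode()).digest()
            yield int.from_bytes(h[:4], "big") % BITS

    def add(self, method, url):
        for b in self._positions(method, url):
            self.bits[b // 8] |= 1 << (b % 8)

    def contains(self, method, url):
        # All bits set -> probable hit (false positives possible);
        # any bit clear -> definite miss. URLs cannot be listed back.
        return all(self.bits[b // 8] & (1 << (b % 8))
                   for b in self._positions(method, url))

d = ToyDigest()
d.add("GET", "http://origin.example/video1.mp4")
print(d.contains("GET", "http://origin.example/video1.mp4"))  # True
print(d.contains("GET", "http://origin.example/other.mp4"))   # almost certainly False
```

Because only hashed bit positions are stored, the only supported query is membership with a small false-positive rate, which is exactly how a false positive can send a request to a sibling that does not actually have the object.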


> * Why does sibling false positive result in sending client 504 and not
> trying next sibling or parent? CD_SIBLING_HIT/192.168.1.52
> TCP_MISS/504. How to achieve proceeding with next cache_peer?

Sounds like bug #4223 to me:
http://bugs.squid-cache.org/show_bug.cgi?id=4223


HTH,

Alex.

Re: Cache digest vs ICP

Veiko Kukk-2
Alex, thank you for your response!

2017-09-27 18:06 GMT+03:00 Alex Rousskov <[hidden email]>:
On 09/27/2017 03:46 AM, Veiko Kukk wrote:

> Siblings are configured with no-proxy keyword to achieve that they don't
> cache what other siblings already have in their cache.

I assume that by "no-proxy" you meant "proxy-only".

True, that was my mistake. 

> This is to minimize data usage costs from origin servers.

The proxy-only option does not minimize the amount of data transmitted
between a proxy and the origin server. It reduces cache duplication
among cache peers.
 
Exactly. 

> So far digest_generation has been set to off and only ICP has been used
> between siblings. Mostly because digest stats had shown many rejects
> (not containing 100% of cache objects) and documentation about digests
> is confusing up to statements that while rebuilding digest, squid will
> stop serving requests.

Please point me to the location of that statement. IMHO, it is not
confusing but incorrect.

I found it in the book by Duane Wessels http://etutorials.org/Server+Administration/Squid.+The+definitive+guide/Chapter+10.+Talking+to+Other+Squids/10.7+Cache+Digests/
Quoting: During each invocation of the rebuild function, Squid adds some percentage of the cache to the digest. Squid doesn't process user requests while this function runs.
 

Cache Digests are not SMP aware (but should be). You may be able to work
around that limitation using SMP macros, but I have not tested that. I
do not remember whether a worker that is not configured to generate a
digest will still look it up in the cache when a peer asks for it.
Hopefully, the worker will do that lookup.

That sounds very interesting. Could you point me to a sample configuration?
 

> Digest
> documentation states that it's including based on refresh_pattern. It's
> a problem because to get squid working as we want, we had to use
> offline_mode on.

If Cache Digests do not honor offline_mode, it is a (staleness
estimation code) bug that should be reported and fixed.


> * Why does sibling false positive result in sending client 504 and not
> trying next sibling or parent? CD_SIBLING_HIT/192.168.1.52
> TCP_MISS/504. How to achieve proceeding with next cache_peer?

Sounds like bug #4223 to me:
http://bugs.squid-cache.org/show_bug.cgi?id=4223

I've patched 3.5.27 with the patch found under that bug and built an rpm for testing, and so far I have not encountered that error anymore.

I have another issue. How frequently are cache digests refreshed from siblings? It seems to me that it takes quite a lot of time, and I have not found anything in the documentation that could help enforce digest refreshing. In the test system, I've set 'digest_rebuild_period 60 second'. With a clean cache and running test downloads, sibling1 very quickly updates its cache digest:

Local Digest:
store digest: size: 10492 bytes
entries: count: 415 capacity: 16787 util: 2%
deletion attempts: 0
bits: per entry: 5 on: 1648 capacity: 83936 util: 2%
bit-seq: count: 3224 avg.len: 26.03
added: 415 rejected: 0 ( 0.00 %) del-ed: 0
collisions: on add: 0.00 % on rej: -1.00 %

I've waited at least 20 minutes, several times ran downloads against sibling2 (clean cache too), and sibling2 (192.168.1.52) still shows an old, almost empty cache digest for sibling1 (192.168.1.51):

Peer Digests:
no guess stats for all peers available

Per-peer statistics:

peer digest from 192.168.1.51
no guess stats for 192.168.1.51 available

event timestamp secs from now secs from init
initialized 1506952649 -1602 +0
needed 1506953341 -910 +692
requested 1506953341 -910 +692
received 1506953341 -910 +692
next_check 1506956584 +2333 +3935
peer digest state:
needed: yes, usable: yes, requested:  no

last retry delay: 0 secs
last request response time: 0 secs
last request result: success

peer digest traffic:
requests sent: 1, volume: 0 KB
replies recv:  1, volume: 0 KB

peer digest structure:
192.168.1.51 digest: size: 32 bytes
entries: count: 51 capacity: 51 util: 100%
deletion attempts: 0
bits: per entry: 5 on: 142 capacity: 256 util: 55%
bit-seq: count: 131 avg.len: 1.95


Algorithm usage:
Cache Digest:       0 ( -1%)
Icp:                0 ( -1%)
Total:              0 ( -1%)

Local Digest:
store digest: size: 1461 bytes
entries: count: 75 capacity: 2337 util: 3%
deletion attempts: 0
bits: per entry: 5 on: 296 capacity: 11688 util: 3%
bit-seq: count: 572 avg.len: 20.43
added: 75 rejected: 0 ( 0.00 %) del-ed: 0
collisions: on add: 0.00 % on rej: -1.00 %

Why is it so?
How is the cache digest from siblings refreshed?
How can the sibling cache digest be refreshed more frequently?

Best regards,
Veiko


Re: Cache digest vs ICP

Alex Rousskov
On 10/02/2017 08:28 AM, Veiko Kukk wrote:

> I found it in the book by Duane Wessels
> Quoting: During each invocation of the rebuild function, Squid adds some
> percentage of the cache to the digest. Squid doesn't process user
> requests while this function runs.

The quoted statement is correct: Digesting a (configurable) percentage
of cache index is a blocking action -- Squid does not process anything
else while that action runs. As we discussed earlier, digesting the
whole cache index is not blocking. This is similar to how one network
read is blocking but receiving the entire response body is not blocking.



>     Cache Digests are not SMP aware (but should be). You may be able to work
>     around that limitation using SMP macros, but I have not tested that. I
>     do not remember whether a worker that is not configured to generate a
>     digest will still look it up in the cache when a peer asks for it.
>     Hopefully, the worker will do that lookup.
>
> That sounds very interesting. Could you point me to sample configuration?

I am not aware of any sample configurations that restrict digest
generation to a single worker, but that does not mean they do not exist.
SMP macros in general are described in the beginning of
squid.conf.documented.
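An untested sketch of what restricting digest generation to a single worker could look like, using the if/else/endif conditionals and the ${process_number} macro described in squid.conf.documented (as Alex notes, whether the other workers will then serve the digest is unverified):

```
workers 2
# Only worker 1 builds the local digest; the directive and conditional
# syntax are real, but this combination is an untested workaround.
if ${process_number} = 1
digest_generation on
else
digest_generation off
endif
```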



> How frequently are cache digests refreshed from
> siblings?

The short answer is "a new peer digest is fetched PeerDigestReqMinGap
seconds after its earlier cached version has expired". I believe the
details are covered by the discussion below and the following FAQ entry:
https://wiki.squid-cache.org/SquidFaq/CacheDigests#How_are_Cache_Digests_transferred_between_peers.3F



> It seems to me that it takes quite a lot of time and I have not
> found anything in the documentation that could help enforce digest
> refreshing.

digest_rebuild_period controls how often the local digest is refreshed.
Bugs notwithstanding, the local digest expiration (and the Expires field
in digest HTTP response) should be set accordingly.


> In test system, i've set 'digest_rebuild_period 60 second'.

Squid has several hard-coded rate limits for digest fetches:

* refresh a given peer digest no more than once in 5 minutes:

  /* min interval for requesting digests from a given peer */
  static const time_t PeerDigestReqMinGap = 5 * 60;   /* seconds */

* and request a digest no more frequently than once per minute:

  /* min interval for requesting digests (cumulative request stream) */
  static const time_t GlobDigestReqMinGap = 1 * 60;   /* seconds */


Notes for your future tests, if any:

* If you are running an SMP Squid, then please repeat the test without
SMP. Make sure non-SMP configuration works before you try to configure
SMP Squid (which will probably require lowering digest_rewrite_period as
well so that all workers can see the newly generated digest on disk).

* A 60 second refresh feels too aggressive to me. Any
digest_rebuild_period longer than digest generation should work in
theory, but I would be worried about various hard-coded hacks interfering
with such a small value as 60 seconds. I recommend starting with 5-minute
or longer periods. A longer regeneration period would also go
nicely with PeerDigestReqMinGap discussed above.
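Expressed as a config sketch (values illustrative, matching the 5-minute PeerDigestReqMinGap floor quoted above):

```
# Regenerate the local digest every 5 minutes; write it to disk on the
# same cadence so peers (and SMP workers) can pick up each new version.
digest_rebuild_period 5 minutes
digest_rewrite_period 5 minutes
```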


> With clean cache and running test downloads sibling1 very quickly
> updates it's cache digest:
>
> Local Digest:
> store digest: size: 10492 bytes
> entries: count: 415 capacity: 16787 util: 2%
> deletion attempts: 0
> bits: per entry: 5 on: 1648 capacity: 83936 util: 2%
> bit-seq: count: 3224 avg.len: 26.03
> added: 415 rejected: 0 ( 0.00 %) del-ed: 0
> collisions: on add: 0.00 % on rej: -1.00 %
>
> I've waited at least 20 minutes, several times ran downloads against
> sibling2 (clean cache too), and sibling2 (192.168.1.52) still shows an old,
> almost empty cache digest for sibling1 (192.168.1.51):

Please note that if the old digest1 was generated before you changed
digest_rebuild_period for sibling1, then its old cached version will
still have that old expiration date. I am _not_ saying that is what
happens in your specific test, but please keep this caveat in mind.

Also, the 192.168.1.51 digest shown below is not "almost empty" -- the
stats below show that it has 55% of its bits turned on, with all 51
expected entries digested. AFAICT, that digest is full.


> Peer Digests:
> no guess stats for all peers available
>
> Per-peer statistics:
>
> peer digest from 192.168.1.51
> no guess stats for 192.168.1.51 available
>
> event          timestamp    secs from now    secs from init
> initialized    1506952649    -1602              +0
> needed         1506953341     -910            +692
> requested      1506953341     -910            +692
> received       1506953341     -910            +692
> next_check     1506956584    +2333           +3935

> peer digest state: needed: yes, usable: yes, requested:  no

> last retry delay: 0 secs
> last request response time: 0 secs
> last request result: success

> peer digest traffic:
> requests sent: 1, volume: 0 KB
> replies recv:  1, volume: 0 KB
>
> peer digest structure:
> 192.168.1.51 digest: size: 32 bytes
> entries: count: 51 capacity: 51 util: 100%
> deletion attempts: 0
> bits: per entry: 5 on: 142 capacity: 256 util: 55%
> bit-seq: count: 131 avg.len: 1.95

If I am reading the above sibling2 stats correctly, then sibling2
downloaded and cached a tiny 32-byte digest1 910 seconds ago and will
refresh the cached copy in 2333 seconds. That next check will come
(3935-692)/60 = (2333+910)/60 = 54 minutes after digest1 birth. You
should be able to correlate that with digest1 generation stats reported
by 192.168.1.51 at the time when this digest was generated.
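The timeline arithmetic above can be checked directly against the reported event table (timestamps are Unix seconds, copied from the stats):

```python
# Event timestamps reported by sibling2 for the 192.168.1.51 digest.
initialized = 1506952649
received    = 1506953341   # fetched 910 s before the report
next_check  = 1506956584   # refresh due 2333 s after the report

print(received - initialized)    # +692, matching "secs from init"
print(next_check - initialized)  # +3935

# Minutes between fetching the digest and re-checking it.
print(round((next_check - received) / 60))  # 54
```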


HTH,

Alex.