Squid transparent not caching apt requests from deb.debian.org

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
14 messages Options
zrm
Reply | Threaded
Open this post in threaded view
|

Squid transparent not caching apt requests from deb.debian.org

zrm
Greetings! Today I bring you a Squid cache mystery.

I configured a simple transparent proxy to cache package downloads for
Debian, using Squid on Debian 10. When apt clients download packages
from deb.debian.org, Squid says TCP_MISS, downloads the package, and
then doesn't cache it. The mystery is, why not.

It caches other requests by other applications. It caches other requests
by apt -- packages from security.debian.org get cached. It caches
requests for deb.debian.org when the client is not apt -- using wget or
curl on the same URL causes it to be cached, as does pasting the HTTP
request into netcat. The same HTTP request that apt sends. I checked the
packets with WireShark and the TCP payload for the request is byte for
byte identical, but when apt does it, it isn't cached. I'm not sure how
it's even distinguishing the requests in order to behave differently.

These are the changes from the default squid.conf packaged with Debian 10:
http_access allow localnet
http_port 3130
http_port 3128 intercept
maximum_object_size_in_memory 4 MB

Here are repeated requests using "apt remove -y vim;apt install vim",
access.log:

1585891724.223    560 192.168.111.55 TCP_MISS/200 1281195 GET
http://deb.debian.org/debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb -
ORIGINAL_DST/199.232.64.204 application/x-debian-package
1585891726.697    277 192.168.111.55 TCP_MISS/200 1281195 GET
http://deb.debian.org/debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb -
ORIGINAL_DST/199.232.64.204 application/x-debian-package

store.log:

1585891535.154 RELEASE -1 FFFFFFFF 04000000000000001712000001000000  200
1585891535 1560620795 1588483535 application/x-debian-package
1280476/1280476 GET
http://deb.debian.org/debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb
1585891724.223 RELEASE -1 FFFFFFFF 05000000000000001712000001000000  200
1585891723 1560620795 1588483723 application/x-debian-package
1280476/1280476 GET
http://deb.debian.org/debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb

But here are repeated requests for the same url using wget, access.log:

1585891729.700    433 192.168.111.55 TCP_MISS/200 1281195 GET
http://deb.debian.org/debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb -
ORIGINAL_DST/199.232.66.133 application/x-debian-package
1585891731.089     70 192.168.111.55 TCP_REFRESH_UNMODIFIED/200 1281232
GET
http://deb.debian.org/debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb -
ORIGINAL_DST/199.232.66.133 application/x-debian-package

store.log:

1585891726.697 RELEASE -1 FFFFFFFF 06000000000000001712000001000000  200
1585891726 1560620795 1588483726 application/x-debian-package
1280476/1280476 GET
http://deb.debian.org/debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb
1585891731.047 RELEASE -1 FFFFFFFF 08000000000000001712000001000000  304
1585891731        -1 1588483731 unknown -1/-293 GET
http://deb.debian.org/debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb

In the first case we get TCP_MISS every time because it isn't caching
the data, in the second case it's only the first time and after that we
get TCP_REFRESH_UNMODIFIED. But how and why is this happening?
_______________________________________________
squid-users mailing list
[hidden email]
http://lists.squid-cache.org/listinfo/squid-users
Reply | Threaded
Open this post in threaded view
|

Re: Squid transparent not caching apt requests from deb.debian.org

Antony Stone
On Friday 03 April 2020 at 22:26:13, zrm wrote:

> Greetings! Today I bring you a Squid cache mystery.

> In the first case we get TCP_MISS every time because it isn't caching
> the data, in the second case it's only the first time and after that we
> get TCP_REFRESH_UNMODIFIED. But how and why is this happening?

Given that this is an intercepting proxy and you're using HTTP (not HTTPS),
can you do a packet capture with tshark or similar on the internal interface,
to show the full details of the HTTP request which comes in to Squid from apt,
and the same for wget, to see what difference there is?


Antony.

--
Normal people think "If it ain't broke, don't fix it".
Engineers think "If it ain't broke, it doesn't have enough features yet".

                                                   Please reply to the list;
                                                         please *don't* CC me.
_______________________________________________
squid-users mailing list
[hidden email]
http://lists.squid-cache.org/listinfo/squid-users
Reply | Threaded
Open this post in threaded view
|

Re: Squid transparent not caching apt requests from deb.debian.org

Alex Rousskov
In reply to this post by zrm
On 4/3/20 4:26 PM, zrm wrote:
> In the first case we get TCP_MISS every time because it isn't caching
> the data, in the second case it's only the first time and after that we
> get TCP_REFRESH_UNMODIFIED. But how and why is this happening?

Those questions can often be answered by looking at HTTP headers
received and sent by Squid when talking to clients and origin servers.
You can see those headers in basic packet captures or if you set
debug_options to ALL,2.

Alex.
_______________________________________________
squid-users mailing list
[hidden email]
http://lists.squid-cache.org/listinfo/squid-users
zrm
Reply | Threaded
Open this post in threaded view
|

Re: Squid transparent not caching apt requests from deb.debian.org

zrm
On 4/3/20 16:34, Alex Rousskov wrote:

> On 4/3/20 4:26 PM, zrm wrote:
>> In the first case we get TCP_MISS every time because it isn't caching
>> the data, in the second case it's only the first time and after that we
>> get TCP_REFRESH_UNMODIFIED. But how and why is this happening?
>
> Those questions can often be answered by looking at HTTP headers
> received and sent by Squid when talking to clients and origin servers.
> You can see those headers in basic packet captures or if you set
> debug_options to ALL,2.
>
> Alex.
>

apt to squid:
---------
GET /debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb HTTP/1.1
Host: deb.debian.org
User-Agent: Debian APT-HTTP/1.3 (1.8.2)


----------
squid to deb.debian.org:
---------
GET /debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb HTTP/1.1
User-Agent: Debian APT-HTTP/1.3 (1.8.2)
Host: deb.debian.org
Via: 1.1 tproxy (squid/4.6)
X-Forwarded-For: 192.168.111.55
Cache-Control: max-age=259200
Connection: keep-alive


----------
deb.debian.org to squid:
---------
HTTP/1.1 200 OK
Server: Apache
X-Content-Type-Options: nosniff
X-Frame-Options: sameorigin
Referrer-Policy: no-referrer
X-Xss-Protection: 1
Last-Modified: Sat, 15 Jun 2019 17:46:35 GMT
ETag: "1389dc-58b605823fa6e"
X-Clacks-Overhead: GNU Terry Pratchett
Cache-Control: public, max-age=2592000
Content-Type: application/x-debian-package
Via: 1.1 varnish
Content-Length: 1280476
Accept-Ranges: bytes
Date: Fri, 03 Apr 2020 05:28:46 GMT
Via: 1.1 varnish
Age: 4248100
Connection: keep-alive
X-Served-By: cache-ams21028-AMS, cache-wdc5559-WDC
X-Cache: HIT, HIT
X-Cache-Hits: 1, 2
X-Timer: S1585891726.434375,VS0,VE0

----------
squid to apt:
---------
HTTP/1.1 200 OK
Server: Apache
X-Content-Type-Options: nosniff
X-Frame-Options: sameorigin
Referrer-Policy: no-referrer
X-Xss-Protection: 1
Last-Modified: Sat, 15 Jun 2019 17:46:35 GMT
ETag: "1389dc-58b605823fa6e"
X-Clacks-Overhead: GNU Terry Pratchett
Cache-Control: public, max-age=2592000
Content-Type: application/x-debian-package
Content-Length: 1280476
Accept-Ranges: bytes
Date: Fri, 03 Apr 2020 05:28:46 GMT
Age: 4248100
X-Served-By: cache-ams21028-AMS, cache-wdc5559-WDC
X-Cache: HIT, HIT
X-Cache-Hits: 1, 2
X-Timer: S1585891726.434375,VS0,VE0
X-Cache: MISS from tproxy
X-Cache-Lookup: MISS from tproxy:3130
Via: 1.1 varnish, 1.1 varnish, 1.1 tproxy (squid/4.6)
Connection: keep-alive


----------



wget to squid:
---------
GET /debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb HTTP/1.1
User-Agent: Wget/1.20.1 (linux-gnu)
Accept: */*
Accept-Encoding: identity
Host: deb.debian.org
Connection: Keep-Alive


----------
squid to deb.debian.org:
----------
GET /debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb HTTP/1.1
User-Agent: Wget/1.20.1 (linux-gnu)
Accept: */*
Accept-Encoding: identity
Host: deb.debian.org
Via: 1.1 tproxy (squid/4.6)
X-Forwarded-For: 192.168.111.55
Cache-Control: max-age=259200
Connection: keep-alive


----------
deb.debian.org to squid:
---------
HTTP/1.1 200 OK
Server: Apache
X-Content-Type-Options: nosniff
X-Frame-Options: sameorigin
Referrer-Policy: no-referrer
X-Xss-Protection: 1
Last-Modified: Sat, 15 Jun 2019 17:46:35 GMT
ETag: "1389dc-58b605823fa6e"
X-Clacks-Overhead: GNU Terry Pratchett
Cache-Control: public, max-age=2592000
Content-Type: application/x-debian-package
Via: 1.1 varnish
Content-Length: 1280476
Accept-Ranges: bytes
Date: Fri, 03 Apr 2020 05:28:49 GMT
Via: 1.1 varnish
Age: 4248102
Connection: keep-alive
X-Served-By: cache-ams21028-AMS, cache-wdc5583-WDC
X-Cache: HIT, HIT
X-Cache-Hits: 1, 1
X-Timer: S1585891729.316362,VS0,VE0

----------
squid to wget:
---------
HTTP/1.1 200 OK
Server: Apache
X-Content-Type-Options: nosniff
X-Frame-Options: sameorigin
Referrer-Policy: no-referrer
X-Xss-Protection: 1
Last-Modified: Sat, 15 Jun 2019 17:46:35 GMT
ETag: "1389dc-58b605823fa6e"
X-Clacks-Overhead: GNU Terry Pratchett
Cache-Control: public, max-age=2592000
Content-Type: application/x-debian-package
Content-Length: 1280476
Accept-Ranges: bytes
Date: Fri, 03 Apr 2020 05:28:49 GMT
Age: 4248102
X-Served-By: cache-ams21028-AMS, cache-wdc5583-WDC
X-Cache: HIT, HIT
X-Cache-Hits: 1, 1
X-Timer: S1585891729.316362,VS0,VE0
X-Cache: MISS from tproxy
X-Cache-Lookup: MISS from tproxy:3130
Via: 1.1 varnish, 1.1 varnish, 1.1 tproxy (squid/4.6)
Connection: keep-alive


----------

_______________________________________________
squid-users mailing list
[hidden email]
http://lists.squid-cache.org/listinfo/squid-users
Reply | Threaded
Open this post in threaded view
|

Re: Squid transparent not caching apt requests from deb.debian.org

Alex Rousskov
On 4/3/20 4:55 PM, zrm wrote:
> On 4/3/20 16:34, Alex Rousskov wrote:
>> On 4/3/20 4:26 PM, zrm wrote:
>>> In the first case we get TCP_MISS every time because it isn't caching
>>> the data, in the second case it's only the first time and after that we
>>> get TCP_REFRESH_UNMODIFIED. But how and why is this happening?

> squid to apt:
> ---------
> X-Cache: MISS from tproxy
> X-Cache-Lookup: MISS from tproxy:3130

> squid to wget:
> ---------
> X-Cache: MISS from tproxy
> X-Cache-Lookup: MISS from tproxy:3130

The headers you have posted tell us that the object was not in Squid
cache when apt and wget transactions started. Since wget was started
after apt, we can speculate that apt transaction did not cache the
object. This is consistent with your observations. I saw nothing in the
posted headers that would explain the difference between apt and wget
outcomes.

Unfortunately, you did not show the headers for the case where the
object actually got cached by Squid. You can probably do that by
repeating the wget transaction twice. I would also repeat the apt
transaction twice after that. It would also be interesting to see the
wget-apt and apt-wget sequences. In summary, I would do
wget-wget-apt-apt-wget-wget. Sleep for 10+ seconds between each
transaction to reduce chances of overlapping cache operations.

BTW, you probably do not need to make ALL,2 logs pretty -- we can figure
out what happens based on Squid messages if you submit one transaction
at a time and disclose transaction sequence. You can just post (a link
to) raw (or sanitized) logs. Compress them if they are too big.

Before sharing the logs, please double check that the problem you want
to address was reproduced during the test.


> Last-Modified: Sat, 15 Jun 2019 17:46:35 GMT
> ETag: "1389dc-58b605823fa6e"
> Cache-Control: public, max-age=2592000
> Content-Length: 1280476
> Age: 4248100

FWIW: The object is 4'248'100 seconds old. The object max-age is
2'592'000 seconds. Your Squid is probably using an internal max-age of
259'200 seconds, so Squid will require cache hit revalidation during
subsequent transactions after Squid caches the object (if it caches it).


HTH,

Alex.


> apt to squid:
> ---------
> GET /debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb HTTP/1.1
> Host: deb.debian.org
> User-Agent: Debian APT-HTTP/1.3 (1.8.2)
>
>
> ----------
> squid to deb.debian.org:
> ---------
> GET /debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb HTTP/1.1
> User-Agent: Debian APT-HTTP/1.3 (1.8.2)
> Host: deb.debian.org
> Via: 1.1 tproxy (squid/4.6)
> X-Forwarded-For: 192.168.111.55
> Cache-Control: max-age=259200
> Connection: keep-alive
>
>
> ----------
> deb.debian.org to squid:
> ---------
> HTTP/1.1 200 OK
> Server: Apache
> X-Content-Type-Options: nosniff
> X-Frame-Options: sameorigin
> Referrer-Policy: no-referrer
> X-Xss-Protection: 1
> Last-Modified: Sat, 15 Jun 2019 17:46:35 GMT
> ETag: "1389dc-58b605823fa6e"
> X-Clacks-Overhead: GNU Terry Pratchett
> Cache-Control: public, max-age=2592000
> Content-Type: application/x-debian-package
> Via: 1.1 varnish
> Content-Length: 1280476
> Accept-Ranges: bytes
> Date: Fri, 03 Apr 2020 05:28:46 GMT
> Via: 1.1 varnish
> Age: 4248100
> Connection: keep-alive
> X-Served-By: cache-ams21028-AMS, cache-wdc5559-WDC
> X-Cache: HIT, HIT
> X-Cache-Hits: 1, 2
> X-Timer: S1585891726.434375,VS0,VE0
>
> ----------
> squid to apt:
> ---------
> HTTP/1.1 200 OK
> Server: Apache
> X-Content-Type-Options: nosniff
> X-Frame-Options: sameorigin
> Referrer-Policy: no-referrer
> X-Xss-Protection: 1
> Last-Modified: Sat, 15 Jun 2019 17:46:35 GMT
> ETag: "1389dc-58b605823fa6e"
> X-Clacks-Overhead: GNU Terry Pratchett
> Cache-Control: public, max-age=2592000
> Content-Type: application/x-debian-package
> Content-Length: 1280476
> Accept-Ranges: bytes
> Date: Fri, 03 Apr 2020 05:28:46 GMT
> Age: 4248100
> X-Served-By: cache-ams21028-AMS, cache-wdc5559-WDC
> X-Cache: HIT, HIT
> X-Cache-Hits: 1, 2
> X-Timer: S1585891726.434375,VS0,VE0
> X-Cache: MISS from tproxy
> X-Cache-Lookup: MISS from tproxy:3130
> Via: 1.1 varnish, 1.1 varnish, 1.1 tproxy (squid/4.6)
> Connection: keep-alive
>
>
> ----------
>
>
>
> wget to squid:
> ---------
> GET /debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb HTTP/1.1
> User-Agent: Wget/1.20.1 (linux-gnu)
> Accept: */*
> Accept-Encoding: identity
> Host: deb.debian.org
> Connection: Keep-Alive
>
>
> ----------
> squid to deb.debian.org:
> ----------
> GET /debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb HTTP/1.1
> User-Agent: Wget/1.20.1 (linux-gnu)
> Accept: */*
> Accept-Encoding: identity
> Host: deb.debian.org
> Via: 1.1 tproxy (squid/4.6)
> X-Forwarded-For: 192.168.111.55
> Cache-Control: max-age=259200
> Connection: keep-alive
>
>
> ----------
> deb.debian.org to squid:
> ---------
> HTTP/1.1 200 OK
> Server: Apache
> X-Content-Type-Options: nosniff
> X-Frame-Options: sameorigin
> Referrer-Policy: no-referrer
> X-Xss-Protection: 1
> Last-Modified: Sat, 15 Jun 2019 17:46:35 GMT
> ETag: "1389dc-58b605823fa6e"
> X-Clacks-Overhead: GNU Terry Pratchett
> Cache-Control: public, max-age=2592000
> Content-Type: application/x-debian-package
> Via: 1.1 varnish
> Content-Length: 1280476
> Accept-Ranges: bytes
> Date: Fri, 03 Apr 2020 05:28:49 GMT
> Via: 1.1 varnish
> Age: 4248102
> Connection: keep-alive
> X-Served-By: cache-ams21028-AMS, cache-wdc5583-WDC
> X-Cache: HIT, HIT
> X-Cache-Hits: 1, 1
> X-Timer: S1585891729.316362,VS0,VE0
>
> ----------
> squid to wget:
> ---------
> HTTP/1.1 200 OK
> Server: Apache
> X-Content-Type-Options: nosniff
> X-Frame-Options: sameorigin
> Referrer-Policy: no-referrer
> X-Xss-Protection: 1
> Last-Modified: Sat, 15 Jun 2019 17:46:35 GMT
> ETag: "1389dc-58b605823fa6e"
> X-Clacks-Overhead: GNU Terry Pratchett
> Cache-Control: public, max-age=2592000
> Content-Type: application/x-debian-package
> Content-Length: 1280476
> Accept-Ranges: bytes
> Date: Fri, 03 Apr 2020 05:28:49 GMT
> Age: 4248102
> X-Served-By: cache-ams21028-AMS, cache-wdc5583-WDC
> X-Cache: HIT, HIT
> X-Cache-Hits: 1, 1
> X-Timer: S1585891729.316362,VS0,VE0
> X-Cache: MISS from tproxy
> X-Cache-Lookup: MISS from tproxy:3130
> Via: 1.1 varnish, 1.1 varnish, 1.1 tproxy (squid/4.6)
> Connection: keep-alive
>
>
> ----------

_______________________________________________
squid-users mailing list
[hidden email]
http://lists.squid-cache.org/listinfo/squid-users
Reply | Threaded
Open this post in threaded view
|

Re: Squid transparent not caching apt requests from deb.debian.org

Amos Jeffries
Administrator

On 5/04/20 2:53 am, Alex Rousskov wrote:

> On 4/3/20 4:55 PM, zrm wrote:
>> On 4/3/20 16:34, Alex Rousskov wrote:
>>> On 4/3/20 4:26 PM, zrm wrote:
>>>> In the first case we get TCP_MISS every time because it isn't caching
>>>> the data, in the second case it's only the first time and after that we
>>>> get TCP_REFRESH_UNMODIFIED. But how and why is this happening?
>
>> squid to apt:
>> ---------
>> X-Cache: MISS from tproxy
>> X-Cache-Lookup: MISS from tproxy:3130
>
>> squid to wget:
>> ---------
>> X-Cache: MISS from tproxy
>> X-Cache-Lookup: MISS from tproxy:3130
>
> The headers you have posted tell us that the object was not in Squid
> cache when apt and wget transactions started. Since wget was started
> after apt, we can speculate that apt transaction did not cache the
> object. This is consistent with your observations. I saw nothing in the
> posted headers that would explain the difference between apt and wget
> outcomes.
>
> Unfortunately, you did not show the headers for the case where the
> object actually got cached by Squid. You can probably do that by
> repeating the wget transaction twice. I would also repeat the apt
> transaction twice after that. It would also be interesting to see the
> wget-apt and apt-wget sequences. In summary, I would do
> wget-wget-apt-apt-wget-wget. Sleep for 10+ seconds between each
> transaction to reduce chances of overlapping cache operations.
>
> BTW, you probably do not need to make ALL,2 logs pretty -- we can figure
> out what happens based on Squid messages if you submit one transaction
> at a time and disclose transaction sequence. You can just post (a link
> to) raw (or sanitized) logs. Compress them if they are too big.
>
> Before sharing the logs, please double check that the problem you want
> to address was reproduced during the test.
>
>
>> Last-Modified: Sat, 15 Jun 2019 17:46:35 GMT
>> ETag: "1389dc-58b605823fa6e"
>> Cache-Control: public, max-age=2592000
>> Content-Length: 1280476
>> Age: 4248100
>
> FWIW: The object is 4'248'100 seconds old. The object max-age is
> 2'592'000 seconds. Your Squid is probably using an internal max-age of
> 259'200 seconds, so Squid will require cache hit revalidation during
> subsequent transactions after Squid caches the object (if it caches it).
>
>

One thing to notice as well is that the object delivered by the upstream
caches expired over 19 days before these tests took place:

> Cache-Control: public, max-age=2592000
> Age: 4248100

The request from Squid in both transactions was to receive content no
longer than 3 days old. The other caches ignored that requirement and
served old content from their storage, apparently without even checking
an origin whether it was stale.

Amos
_______________________________________________
squid-users mailing list
[hidden email]
http://lists.squid-cache.org/listinfo/squid-users
zrm
Reply | Threaded
Open this post in threaded view
|

Re: Squid transparent not caching apt requests from deb.debian.org

zrm
In reply to this post by Alex Rousskov
On 4/4/20 10:53, Alex Rousskov wrote:

> The headers you have posted tell us that the object was not in Squid
> cache when apt and wget transactions started. Since wget was started
> after apt, we can speculate that apt transaction did not cache the
> object. This is consistent with your observations. I saw nothing in the
> posted headers that would explain the difference between apt and wget
> outcomes.
>
> Unfortunately, you did not show the headers for the case where the
> object actually got cached by Squid. You can probably do that by
> repeating the wget transaction twice. I would also repeat the apt
> transaction twice after that. It would also be interesting to see the
> wget-apt and apt-wget sequences. In summary, I would do
> wget-wget-apt-apt-wget-wget. Sleep for 10+ seconds between each
> transaction to reduce chances of overlapping cache operations.
Attached cache.log excerpt for wget-wget-apt-apt-wget-wget. It answers
the apt requests from the cache once it's in there, it just won't cache
it to begin with when apt makes the request:

[wget] 1586041686.600    725 192.168.111.55 TCP_MISS/200 1281195 GET
http://deb.debian.org/debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb -
ORIGINAL_DST/199.232.66.133 application/x-debian-package
[wget] 1586041710.518    107 192.168.111.55 TCP_REFRESH_UNMODIFIED/200
1281232 GET
http://deb.debian.org/debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb -
ORIGINAL_DST/199.232.66.133 application/x-debian-package
[apt] 1586041733.058     69 192.168.111.55 TCP_REFRESH_UNMODIFIED/200
1281234 GET
http://deb.debian.org/debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb -
ORIGINAL_DST/151.101.200.204 application/x-debian-package
[apt] 1586041753.971    101 192.168.111.55 TCP_REFRESH_UNMODIFIED/200
1281234 GET
http://deb.debian.org/debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb -
ORIGINAL_DST/151.101.200.204 application/x-debian-package
[wget] 1586041769.162    160 192.168.111.55 TCP_REFRESH_UNMODIFIED/200
1281232 GET
http://deb.debian.org/debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb -
ORIGINAL_DST/199.232.66.133 application/x-debian-package
[wget] 1586041786.916     71 192.168.111.55 TCP_REFRESH_UNMODIFIED/200
1281232 GET
http://deb.debian.org/debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb -
ORIGINAL_DST/151.101.250.133 application/x-debian-package

> BTW, you probably do not need to make ALL,2 logs pretty -- we can figure
> out what happens based on Squid messages if you submit one transaction
> at a time and disclose transaction sequence. You can just post (a link
> to) raw (or sanitized) logs. Compress them if they are too big.

> Before sharing the logs, please double check that the problem you want
> to address was reproduced during the test.

In this case we start with wget and then it is already in the cache for
the requests made by apt. The problem is the data not being cached when
apt makes the request and it isn't already there. The apt requests do
get answered from the cache if it is already there.

The headers from the previous email show what happens when apt makes the
request and it's not already in the cache.

>> Last-Modified: Sat, 15 Jun 2019 17:46:35 GMT
>> ETag: "1389dc-58b605823fa6e"
>> Cache-Control: public, max-age=2592000
>> Content-Length: 1280476
>> Age: 4248100
>
> FWIW: The object is 4'248'100 seconds old. The object max-age is
> 2'592'000 seconds. Your Squid is probably using an internal max-age of
> 259'200 seconds, so Squid will require cache hit revalidation during
> subsequent transactions after Squid caches the object (if it caches it).
That makes sense. These packages never really change at all -- the
package version is part of the URI so if it's updated the package URI
changes rather than the data at the old URI. I might set a longer max
age in the config once this is worked out.

> HTH,
>
> Alex.
>
>

Thanks.

_______________________________________________
squid-users mailing list
[hidden email]
http://lists.squid-cache.org/listinfo/squid-users

cache.log (30K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Squid transparent not caching apt requests from deb.debian.org

Alex Rousskov
On 4/4/20 8:02 PM, zrm wrote:
> Attached cache.log excerpt for wget-wget-apt-apt-wget-wget. It answers
> the apt requests from the cache once it's in there, it just won't cache
> it to begin with when apt makes the request

Thank you for sharing this log. I agree with your conclusion. The apt
query results in cache revalidation and does not purge the already
cached copy. This conclusion eliminates a few suspects.

There is probably something special about the combination of an apt
request and a 200 OK miss response that prevents Squid from caching that
response. I do not see anything wrong in the logs you have already
already posted. Perhaps others will spot something.

If you get no better responses, please post a link to a compressed
apt-apt-wget-wget log, starting from a cache that does not contain the
response in question and after enabling elevated Squid debugging with
"squid -k debug" or similar. You can find more instructions about
debugging individual transactions at
https://wiki.squid-cache.org/SquidFaq/BugReporting#Debugging_a_single_transaction

A detailed apt-apt-wget-wget log should tell us why Squid is refusing to
cache a 200 OK response to the apt query but caches a very similar
response to a very similar wget query.


Thank you,

Alex.



> [wget] 1586041686.600    725 192.168.111.55 TCP_MISS/200 1281195 GET
> http://deb.debian.org/debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb -
> ORIGINAL_DST/199.232.66.133 application/x-debian-package
> [wget] 1586041710.518    107 192.168.111.55 TCP_REFRESH_UNMODIFIED/200
> 1281232 GET
> http://deb.debian.org/debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb -
> ORIGINAL_DST/199.232.66.133 application/x-debian-package
> [apt] 1586041733.058     69 192.168.111.55 TCP_REFRESH_UNMODIFIED/200
> 1281234 GET
> http://deb.debian.org/debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb -
> ORIGINAL_DST/151.101.200.204 application/x-debian-package
> [apt] 1586041753.971    101 192.168.111.55 TCP_REFRESH_UNMODIFIED/200
> 1281234 GET
> http://deb.debian.org/debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb -
> ORIGINAL_DST/151.101.200.204 application/x-debian-package
> [wget] 1586041769.162    160 192.168.111.55 TCP_REFRESH_UNMODIFIED/200
> 1281232 GET
> http://deb.debian.org/debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb -
> ORIGINAL_DST/199.232.66.133 application/x-debian-package
> [wget] 1586041786.916     71 192.168.111.55 TCP_REFRESH_UNMODIFIED/200
> 1281232 GET
> http://deb.debian.org/debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb -
> ORIGINAL_DST/151.101.250.133 application/x-debian-package
>
>> BTW, you probably do not need to make ALL,2 logs pretty -- we can figure
>> out what happens based on Squid messages if you submit one transaction
>> at a time and disclose transaction sequence. You can just post (a link
>> to) raw (or sanitized) logs. Compress them if they are too big.
>
>> Before sharing the logs, please double check that the problem you want
>> to address was reproduced during the test.
>
> In this case we start with wget and then it is already in the cache for
> the requests made by apt. The problem is the data not being cached when
> apt makes the request and it isn't already there. The apt requests do
> get answered from the cache if it is already there.
>
> The headers from the previous email show what happens when apt makes the
> request and it's not already in the cache.
>
>>> Last-Modified: Sat, 15 Jun 2019 17:46:35 GMT
>>> ETag: "1389dc-58b605823fa6e"
>>> Cache-Control: public, max-age=2592000
>>> Content-Length: 1280476
>>> Age: 4248100
>>
>> FWIW: The object is 4'248'100 seconds old. The object max-age is
>> 2'592'000 seconds. Your Squid is probably using an internal max-age of
>> 259'200 seconds, so Squid will require cache hit revalidation during
>> subsequent transactions after Squid caches the object (if it caches it).
>
> That makes sense. These packages never really change at all -- the
> package version is part of the URI so if it's updated the package URI
> changes rather than the data at the old URI. I might set a longer max
> age in the config once this is worked out.
>
>> HTH,
>>
>> Alex.
>>
>>
>
> Thanks.

_______________________________________________
squid-users mailing list
[hidden email]
http://lists.squid-cache.org/listinfo/squid-users
Reply | Threaded
Open this post in threaded view
|

Re: Squid transparent not caching apt requests from deb.debian.org

Amos Jeffries
Administrator
On 7/04/20 3:49 am, Alex Rousskov wrote:

> On 4/4/20 8:02 PM, zrm wrote:
>> Attached cache.log excerpt for wget-wget-apt-apt-wget-wget. It answers
>> the apt requests from the cache once it's in there, it just won't cache
>> it to begin with when apt makes the request
>
> Thank you for sharing this log. I agree with your conclusion. The apt
> query results in cache revalidation and does not purge the already
> cached copy. This conclusion eliminates a few suspects.
>
> There is probably something special about the combination of an apt
> request and a 200 OK miss response that prevents Squid from caching that
> response. I do not see anything wrong in the logs you have already
> already posted. Perhaps others will spot something.


zrm, can you please do a clean test for the apt sequence exactly as Alex
requested rather than just referring to the earlier one done. A swapout
timing overlap by as much as a few nanoseconds in the original test
could mean the different between HIT and MISS. So anyone able to find
this will need those details to work with.

Amos
_______________________________________________
squid-users mailing list
[hidden email]
http://lists.squid-cache.org/listinfo/squid-users
zrm
Reply | Threaded
Open this post in threaded view
|

Re: Squid transparent not caching apt requests from deb.debian.org

zrm
In reply to this post by Alex Rousskov
cache.log for apt-apt-wget-wget with squid -k debug:

https://www.trustiosity.com/squid/cache-debug.log.xz

On 4/6/20 11:49, Alex Rousskov wrote:

> On 4/4/20 8:02 PM, zrm wrote:
>> Attached cache.log excerpt for wget-wget-apt-apt-wget-wget. It answers
>> the apt requests from the cache once it's in there, it just won't cache
>> it to begin with when apt makes the request
>
> Thank you for sharing this log. I agree with your conclusion. The apt
> query results in cache revalidation and does not purge the already
> cached copy. This conclusion eliminates a few suspects.
>
> There is probably something special about the combination of an apt
> request and a 200 OK miss response that prevents Squid from caching that
> response. I do not see anything wrong in the logs you have already
> already posted. Perhaps others will spot something.
>
> If you get no better responses, please post a link to a compressed
> apt-apt-wget-wget log, starting from a cache that does not contain the
> response in question and after enabling elevated Squid debugging with
> "squid -k debug" or similar. You can find more instructions about
> debugging individual transactions at
> https://wiki.squid-cache.org/SquidFaq/BugReporting#Debugging_a_single_transaction
>
> A detailed apt-apt-wget-wget log should tell us why Squid is refusing to
> cache a 200 OK response to the apt query but caches a very similar
> response to a very similar wget query.
>
>
> Thank you,
>
> Alex.
>
>
>
>> [wget] 1586041686.600    725 192.168.111.55 TCP_MISS/200 1281195 GET
>> http://deb.debian.org/debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb -
>> ORIGINAL_DST/199.232.66.133 application/x-debian-package
>> [wget] 1586041710.518    107 192.168.111.55 TCP_REFRESH_UNMODIFIED/200
>> 1281232 GET
>> http://deb.debian.org/debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb -
>> ORIGINAL_DST/199.232.66.133 application/x-debian-package
>> [apt] 1586041733.058     69 192.168.111.55 TCP_REFRESH_UNMODIFIED/200
>> 1281234 GET
>> http://deb.debian.org/debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb -
>> ORIGINAL_DST/151.101.200.204 application/x-debian-package
>> [apt] 1586041753.971    101 192.168.111.55 TCP_REFRESH_UNMODIFIED/200
>> 1281234 GET
>> http://deb.debian.org/debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb -
>> ORIGINAL_DST/151.101.200.204 application/x-debian-package
>> [wget] 1586041769.162    160 192.168.111.55 TCP_REFRESH_UNMODIFIED/200
>> 1281232 GET
>> http://deb.debian.org/debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb -
>> ORIGINAL_DST/199.232.66.133 application/x-debian-package
>> [wget] 1586041786.916     71 192.168.111.55 TCP_REFRESH_UNMODIFIED/200
>> 1281232 GET
>> http://deb.debian.org/debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb -
>> ORIGINAL_DST/151.101.250.133 application/x-debian-package
>>
>>> BTW, you probably do not need to make ALL,2 logs pretty -- we can figure
>>> out what happens based on Squid messages if you submit one transaction
>>> at a time and disclose transaction sequence. You can just post (a link
>>> to) raw (or sanitized) logs. Compress them if they are too big.
>>
>>> Before sharing the logs, please double check that the problem you want
>>> to address was reproduced during the test.
>>
>> In this case we start with wget and then it is already in the cache for
>> the requests made by apt. The problem is the data not being cached when
>> apt makes the request and it isn't already there. The apt requests do
>> get answered from the cache if it is already there.
>>
>> The headers from the previous email show what happens when apt makes the
>> request and it's not already in the cache.
>>
>>>> Last-Modified: Sat, 15 Jun 2019 17:46:35 GMT
>>>> ETag: "1389dc-58b605823fa6e"
>>>> Cache-Control: public, max-age=2592000
>>>> Content-Length: 1280476
>>>> Age: 4248100
>>>
>>> FWIW: The object is 4'248'100 seconds old. The object max-age is
>>> 2'592'000 seconds. Your Squid is probably using an internal max-age of
>>> 259'200 seconds, so Squid will require cache hit revalidation during
>>> subsequent transactions after Squid caches the object (if it caches it).
>>
>> That makes sense. These packages never really change at all -- the
>> package version is part of the URI so if it's updated the package URI
>> changes rather than the data at the old URI. I might set a longer max
>> age in the config once this is worked out.
>>
>>> HTH,
>>>
>>> Alex.
>>>
>>>
>>
>> Thanks.
>
_______________________________________________
squid-users mailing list
[hidden email]
http://lists.squid-cache.org/listinfo/squid-users
Reply | Threaded
Open this post in threaded view
|

Re: Squid transparent not caching apt requests from deb.debian.org

Alex Rousskov
On 4/7/20 8:48 PM, zrm wrote:

> https://www.trustiosity.com/squid/cache-debug.log.xz

I found the reason for the difference.

After the destination IP address of your apt requests fails Host header
validation, Squid marks the request as "not cachable":

> hostHeaderIpVerify: IP 151.101.248.204:80 does not match from-Host IP 151.101.202.133
> hostHeaderIpVerify: FAILED to validate IP 151.101.248.204:80
> clientInterpretRequestHeaders: REQ_CACHABLE = NOT SET


After the destination IP address of your wget requests passes Host
header validation, Squid marks the request as "cachable":

> hostHeaderIpVerify: validated IP 151.101.202.133:80
> clientInterpretRequestHeaders: REQ_CACHABLE = SET


N.B. The log lines above have been slightly adjusted for readability
(this particular raw output is rather difficult to interpret correctly
IMO), but you can easily find raw lines if you look for the preserved
function names.


I hope others on the list will guide you towards a resolution of this
problem.


HTH,

Alex.

> On 4/6/20 11:49, Alex Rousskov wrote:
>> On 4/4/20 8:02 PM, zrm wrote:
>>> Attached cache.log excerpt for wget-wget-apt-apt-wget-wget. It answers
>>> the apt requests from the cache once it's in there, it just won't cache
>>> it to begin with when apt makes the request
>>
>> Thank you for sharing this log. I agree with your conclusion. The apt
>> query results in cache revalidation and does not purge the already
>> cached copy. This conclusion eliminates a few suspects.
>>
>> There is probably something special about the combination of an apt
>> request and a 200 OK miss response that prevents Squid from caching that
>> response. I do not see anything wrong in the logs you have already
>> already posted. Perhaps others will spot something.
>>
>> If you get no better responses, please post a link to a compressed
>> apt-apt-wget-wget log, starting from a cache that does not contain the
>> response in question and after enabling elevated Squid debugging with
>> "squid -k debug" or similar. You can find more instructions about
>> debugging individual transactions at
>> https://wiki.squid-cache.org/SquidFaq/BugReporting#Debugging_a_single_transaction
>>
>>
>> A detailed apt-apt-wget-wget log should tell us why Squid is refusing to
>> cache a 200 OK response to the apt query but caches a very similar
>> response to a very similar wget query.
>>
>>
>> Thank you,
>>
>> Alex.
>>
>>
>>
>>> [wget] 1586041686.600    725 192.168.111.55 TCP_MISS/200 1281195 GET
>>> http://deb.debian.org/debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb -
>>> ORIGINAL_DST/199.232.66.133 application/x-debian-package
>>> [wget] 1586041710.518    107 192.168.111.55 TCP_REFRESH_UNMODIFIED/200
>>> 1281232 GET
>>> http://deb.debian.org/debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb -
>>> ORIGINAL_DST/199.232.66.133 application/x-debian-package
>>> [apt] 1586041733.058     69 192.168.111.55 TCP_REFRESH_UNMODIFIED/200
>>> 1281234 GET
>>> http://deb.debian.org/debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb -
>>> ORIGINAL_DST/151.101.200.204 application/x-debian-package
>>> [apt] 1586041753.971    101 192.168.111.55 TCP_REFRESH_UNMODIFIED/200
>>> 1281234 GET
>>> http://deb.debian.org/debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb -
>>> ORIGINAL_DST/151.101.200.204 application/x-debian-package
>>> [wget] 1586041769.162    160 192.168.111.55 TCP_REFRESH_UNMODIFIED/200
>>> 1281232 GET
>>> http://deb.debian.org/debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb -
>>> ORIGINAL_DST/199.232.66.133 application/x-debian-package
>>> [wget] 1586041786.916     71 192.168.111.55 TCP_REFRESH_UNMODIFIED/200
>>> 1281232 GET
>>> http://deb.debian.org/debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb -
>>> ORIGINAL_DST/151.101.250.133 application/x-debian-package
>>>
>>>> BTW, you probably do not need to make ALL,2 logs pretty -- we can
>>>> figure
>>>> out what happens based on Squid messages if you submit one transaction
>>>> at a time and disclose transaction sequence. You can just post (a link
>>>> to) raw (or sanitized) logs. Compress them if they are too big.
>>>
>>>> Before sharing the logs, please double check that the problem you want
>>>> to address was reproduced during the test.
>>>
>>> In this case we start with wget and then it is already in the cache for
>>> the requests made by apt. The problem is the data not being cached when
>>> apt makes the request and it isn't already there. The apt requests do
>>> get answered from the cache if it is already there.
>>>
>>> The headers from the previous email show what happens when apt makes the
>>> request and it's not already in the cache.
>>>
>>>>> Last-Modified: Sat, 15 Jun 2019 17:46:35 GMT
>>>>> ETag: "1389dc-58b605823fa6e"
>>>>> Cache-Control: public, max-age=2592000
>>>>> Content-Length: 1280476
>>>>> Age: 4248100
>>>>
>>>> FWIW: The object is 4'248'100 seconds old. The object max-age is
>>>> 2'592'000 seconds. Your Squid is probably using an internal max-age of
>>>> 259'200 seconds, so Squid will require cache hit revalidation during
>>>> subsequent transactions after Squid caches the object (if it caches
>>>> it).
>>>
>>> That makes sense. These packages never really change at all -- the
>>> package version is part of the URI so if it's updated the package URI
>>> changes rather than the data at the old URI. I might set a longer max
>>> age in the config once this is worked out.
>>>
>>>> HTH,
>>>>
>>>> Alex.
>>>>
>>>>
>>>
>>> Thanks.
>>

_______________________________________________
squid-users mailing list
[hidden email]
http://lists.squid-cache.org/listinfo/squid-users
zrm
Reply | Threaded
Open this post in threaded view
|

Re: Squid transparent not caching apt requests from deb.debian.org

zrm
On 4/8/20 10:46, Alex Rousskov wrote:
> On 4/7/20 8:48 PM, zrm wrote:
>
>> https://www.trustiosity.com/squid/cache-debug.log.xz
>
> I found the reason for the difference.
>
> After the destination IP address of your apt requests fails Host header
> validation, Squid marks the request as "not cachable":

I checked the DNS query apt is making to see why it's different. It's
making a SRV query for _http._tcp.deb.debian.org and then using the IP
address of the name (prod.debian.map.fastly.net) returned in the SRV
query. By contrast, squid does the A record query for deb.debian.org and
gets a CNAME for debian.map.fastly.net. Almost the same, but since it's
a CDN with many IP addresses, enough different that they happen to not
both return the same address and then validation fails.

Meanwhile wget does the same A record query as squid and gets the same
address.

The question then becomes what to do about it. Maybe if squid fails the
validation for the A query, it should try the SRV query and accept the
address as valid if it matches that. Another possibility would be a
config option to have squid completely ignore the address the client
used and always use the address it gets by doing its own DNS query for
the host, in which case the result would be safe to cache.

But these are obviously changes requiring a new version of squid. Is
there any way to make it work without that?
_______________________________________________
squid-users mailing list
[hidden email]
http://lists.squid-cache.org/listinfo/squid-users
Reply | Threaded
Open this post in threaded view
|

Re: Squid transparent not caching apt requests from deb.debian.org

Matus UHLAR - fantomas
>>On 4/7/20 8:48 PM, zrm wrote:
>>>https://www.trustiosity.com/squid/cache-debug.log.xz

>On 4/8/20 10:46, Alex Rousskov wrote:
>>I found the reason for the difference.
>>
>>After the destination IP address of your apt requests fails Host header
>>validation, Squid marks the request as "not cachable":

On 08.04.20 13:01, zrm wrote:

>I checked the DNS query apt is making to see why it's different. It's
>making a SRV query for _http._tcp.deb.debian.org and then using the IP
>address of the name (prod.debian.map.fastly.net) returned in the SRV
>query. By contrast, squid does the A record query for deb.debian.org
>and gets a CNAME for debian.map.fastly.net. Almost the same, but since
>it's a CDN with many IP addresses, enough different that they happen
>to not both return the same address and then validation fails.
>
>Meanwhile wget does the same A record query as squid and gets the same
>address.
>
>The question then becomes what to do about it. Maybe if squid fails
>the validation for the A query, it should try the SRV query and accept
>the address as valid if it matches that. Another possibility would be
>a config option to have squid completely ignore the address the client
>used and always use the address it gets by doing its own DNS query for
>the host, in which case the result would be safe to cache.
>
>But these are obviously changes requiring a new version of squid. Is
>there any way to make it work without that?

I'd contact debian.org DNS masters. I believe CDN wasn't designedto cause this
kind of issues.

--
Matus UHLAR - fantomas, [hidden email] ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
Despite the cost of living, have you noticed how popular it remains?
_______________________________________________
squid-users mailing list
[hidden email]
http://lists.squid-cache.org/listinfo/squid-users
zrm
Reply | Threaded
Open this post in threaded view
|

Re: Squid transparent not caching apt requests from deb.debian.org

zrm
On 4/8/20 13:13, Matus UHLAR - fantomas wrote:

> On 08.04.20 13:01, zrm wrote:
>> I checked the DNS query apt is making to see why it's different. It's
>> making a SRV query for _http._tcp.deb.debian.org and then using the IP
>> address of the name (prod.debian.map.fastly.net) returned in the SRV
>> query. By contrast, squid does the A record query for deb.debian.org
>> and gets a CNAME for debian.map.fastly.net. Almost the same, but since
>> it's a CDN with many IP addresses, enough different that they happen
>> to not both return the same address and then validation fails.
>>
>> Meanwhile wget does the same A record query as squid and gets the same
>> address.
>>
>> The question then becomes what to do about it. Maybe if squid fails
>> the validation for the A query, it should try the SRV query and accept
>> the address as valid if it matches that. Another possibility would be
>> a config option to have squid completely ignore the address the client
>> used and always use the address it gets by doing its own DNS query for
>> the host, in which case the result would be safe to cache.
>>
>> But these are obviously changes requiring a new version of squid. Is
>> there any way to make it work without that?
>
> I'd contact debian.org DNS masters. I believe CDN wasn't designedto
> cause this
> kind of issues.
>

It seems to be like this on purpose:

https://deb.debian.org/

I'm not sure there is actually anything wrong with their DNS
configuration, it's not required for the SRV record to point to the same
name/address as the A record (which would in fact defeat the point of
the SRV record).

It *might* be a bug in apt, since if the SRV record says to use
prod.debian.map.fastly.net then maybe it ought to be using that as
"Host: " in the HTTP headers, but I'm not sure about that. If it's
supposed to use the original domain in the HTTP headers when using a SRV
record then this could still be a failing of squid for not checking and
considering valid the address from the HTTP SRV record.

I'll send them an email and see what they have to say about it.
_______________________________________________
squid-users mailing list
[hidden email]
http://lists.squid-cache.org/listinfo/squid-users